WEBVTT 00:00.000 --> 00:10.600 All right, so next up is Mark. 00:10.600 --> 00:14.160 He'll be telling you about Python packaging. 00:14.160 --> 00:15.160 Indeed. 00:15.160 --> 00:16.160 Thank you, Bjorn. 00:16.160 --> 00:17.660 Yes, my name is Mark Ryan. 00:17.660 --> 00:21.060 I work for Rivos and we're going to be spending the next 30 minutes talking about 00:21.060 --> 00:22.560 my favorite subject: 00:22.560 --> 00:25.600 Python package installation on RISC-V. 00:25.600 --> 00:28.920 I don't know how many of you have actually tried using Python on RISC-V. 00:28.920 --> 00:32.160 Those of you who have have probably enjoyed a bit of a mixed experience. 00:32.160 --> 00:35.920 For those of you who haven't, I'm going to demonstrate just how frustrating an experience 00:35.920 --> 00:36.920 it can be. 00:36.920 --> 00:39.320 I'm going to do this using a little video I pre-prepared. 00:39.320 --> 00:43.920 This is a video taken of me typing away on RISC-V. 00:43.920 --> 00:45.920 You can see that it is RISC-V. 00:45.920 --> 00:47.720 The first thing I'm going to do is start Python. 00:47.720 --> 00:49.520 As you can see, Python works nicely. 00:49.520 --> 00:54.040 You can start it up, you can run Python programs, and better still, you can install Python 00:54.040 --> 00:55.040 packages. 00:55.040 --> 00:57.920 In this case, I'm installing something called requests, which is a very popular Python 00:57.920 --> 01:01.920 package for sending HTTP requests, as you might imagine. 01:01.920 --> 01:05.480 It is installing without problem. It's also installing a lot of extra dependencies, which 01:05.480 --> 01:07.160 install without problem. 01:07.160 --> 01:11.120 To prove it works, I'm going to start the interpreter, import the package, and try and send 01:11.120 --> 01:12.120 a request.
01:12.120 --> 01:16.280 I apologize about the slow typing here, but we're sending a request to the FOSDEM 01:16.280 --> 01:26.280 website, and we should get some text back soon. 01:26.280 --> 01:28.680 There, it works, okay. 01:28.680 --> 01:33.880 Python sort of works, well, it does work, but let's try and install a different package. 01:33.880 --> 01:36.280 This time we're going to install something called sentencepiece. 01:36.280 --> 01:39.280 Now, sentencepiece is a word tokenizer. 01:39.280 --> 01:43.280 It's used in a lot of AI and machine learning workloads. 01:43.280 --> 01:46.680 You're going to notice it's going to take a little bit longer to install. The installation 01:46.680 --> 01:50.680 script looks a little bit different: it's downloading a .tar.gz file, it's saying something 01:50.680 --> 01:52.880 scary, "Installing build dependencies". 01:52.880 --> 01:56.240 I'm going to let you into a little secret: this is not going to work, and it's going 01:56.240 --> 02:05.480 to fail rather spectacularly in maybe five seconds, and there we go, you've got a lovely 02:05.480 --> 02:11.360 error message. That is not a great user experience, I think you can all agree. 02:11.360 --> 02:15.040 So why is it that we can install some Python packages on RISC-V without problem 02:15.040 --> 02:16.520 and others we cannot install? 02:16.520 --> 02:19.360 Well, it turns out there are two different types of Python packages: there are pure Python 02:19.360 --> 02:25.080 packages, those written purely in Python, and there are binary packages, which are written 02:25.080 --> 02:29.600 in a mixture of Python and C, sometimes C++, and increasingly Rust.
02:29.600 --> 02:33.680 You can download pure Python packages from PyPI, which is a central repository that stores 02:33.680 --> 02:37.800 Python packages, on RISC-V without problems, but you cannot download binary packages, 02:37.800 --> 02:42.360 and the reason is that PyPI will not allow you to upload binary wheels for riscv64, and 02:42.360 --> 02:48.760 a wheel is the file format for distributing Python packages. And it's not just sentencepiece 02:48.760 --> 02:53.640 that's affected. A lot of the very popular Python packages, the important Python 02:53.640 --> 02:58.600 packages, are binary packages. If you think about things like NumPy, SciPy, scikit-learn, 02:58.600 --> 03:03.880 pandas, Pillow, TensorFlow, PyTorch, PyArrow, all of these things are binary packages, 03:03.880 --> 03:08.960 so they cannot be easily installed on riscv64 devices at the moment. 03:08.960 --> 03:12.880 So just stop and think for a little bit about what a serious problem that is for the vendors 03:12.880 --> 03:17.520 of riscv64 devices. They're going to have customers, their customers are likely to have 03:17.520 --> 03:22.440 Python workloads, and those workloads are going to depend probably on one or 03:22.440 --> 03:25.320 more of these packages, either directly or indirectly. 03:25.320 --> 03:29.560 If you look at the bottom of any sufficiently complicated Python stack, you're going to find 03:29.560 --> 03:33.600 NumPy at the bottom, and it's not possible to easily install NumPy at the moment 03:33.960 --> 03:35.960 on riscv64 devices. 03:35.960 --> 03:39.960 Now, you might be thinking that I'm over-dramatizing this, and there is a solution: 03:39.960 --> 03:43.280 you could just install from source, and pip allows you to do this.
03:43.280 --> 03:47.680 So if you remember, when we were trying to install sentencepiece, what pip did is it 03:47.680 --> 03:51.800 looked to see if there was a RISC-V binary. There wasn't, and so it downloaded a source distribution 03:51.800 --> 03:56.600 and tried to install that. The installation failed, because I didn't have the right build dependencies 03:56.600 --> 04:01.680 on my VM, but you might think that is a possible solution to your problems. But allow me 04:01.680 --> 04:03.600 to just disabuse you of that notion. 04:03.600 --> 04:06.800 Building binary packages is very, very difficult. 04:06.800 --> 04:11.720 It's a complicated process and it's very, very difficult to get right, and I'm going to demonstrate 04:11.720 --> 04:16.680 how difficult it is using some VMs that I have pre-created. 04:16.680 --> 04:20.440 So it's time for the live demo, and I really hope this works. 04:20.440 --> 04:22.360 What I have here is I have two VMs. 04:22.360 --> 04:26.200 On the right-hand side, yes, it says noble. 04:26.240 --> 04:32.080 So that is Ubuntu 24.04, and on the left-hand side, I have jammy, that is Ubuntu 22.04. 04:32.080 --> 04:36.440 I hope I said that right: noble is 24.04, jammy is 22.04. 04:36.440 --> 04:40.880 I'm going to refer to them as noble and jammy from now on, because it's really difficult 04:40.880 --> 04:42.600 to say those version names. 04:42.600 --> 04:49.560 These are running directly on my Mac, so they are AArch64 VMs. I'm using QEMU to run these. 04:49.560 --> 04:54.480 And they both have a shared folder, which is called FOSDEM, well, you can see it there. 04:54.480 --> 05:00.760 It's FOSDEM 2025, so I'm going to use that shared folder to copy files between the two VMs. 05:00.760 --> 05:03.880 Now the first thing I'm going to do is I'm just going to try and build sentencepiece from source.
05:03.880 --> 05:07.040 I'm going to build it on noble. I want to build it on noble, because that's the more recent 05:07.040 --> 05:10.080 distribution, so it has a more recent toolchain. 05:10.080 --> 05:13.120 So we'll get all the latest optimizations. 05:13.120 --> 05:14.920 So I've got a little script to build it, actually. 05:14.920 --> 05:17.920 Let me just show you what that script is doing. It's not doing anything magic, it's just 05:17.920 --> 05:21.920 to save me from messing up on the typing. It's activating a virtual environment in which 05:21.920 --> 05:25.480 I've preinstalled all the dependencies you need to build sentencepiece, and then 05:25.480 --> 05:27.640 it just builds it. 05:27.640 --> 05:35.800 So let's build it. This should take about 10 seconds, I think, so we'll pause for a minute 05:35.800 --> 05:48.160 while it builds. I should have got a faster laptop. Yes, it's finished, okay, and if you see, it's 05:48.160 --> 05:52.120 created this file here, it's created this wheel file here. 05:52.120 --> 05:59.480 Now what I'm going to do is I'm going to copy this file, and I'm going to copy it into my shared 05:59.480 --> 06:09.480 folder, oops, and we're going to switch over to our jammy. I should have done this while 06:09.480 --> 06:14.160 it was building. 06:14.160 --> 06:17.520 We're going to switch over to our jammy VM and we're going to try and install it, and 06:17.520 --> 06:23.600 actually I'm going to just create a virtual environment in which to install it. 06:23.600 --> 06:32.040 So let's install our wheel, and it fails, and it fails straight away with a rather confusing 06:32.040 --> 06:33.040 error message. 06:33.040 --> 06:36.360 It says the wheel is not a supported wheel on this platform. 06:36.360 --> 06:40.320 Now to understand what this means, you can look at the file name. The file name actually 06:40.320 --> 06:42.480 gives you a clue.
06:42.480 --> 06:46.200 The file name is composed of a number of different parts. We have the name of the package, 06:46.240 --> 06:50.680 we have the version number, and then we have these two tags here, and these indicate the minimum 06:50.680 --> 06:54.960 version of Python required to run this package, and also the Python ABI against which 06:54.960 --> 06:58.880 the package was compiled. They both say Python 3.12, and that's because on Ubuntu 06:58.880 --> 07:04.520 noble we have Python 3.12, but on Ubuntu jammy we have Python 3.10, and so 07:04.520 --> 07:10.600 this wheel is not compatible with Ubuntu jammy in its default state. 07:10.600 --> 07:12.880 So that's the first kind of complication you're going to meet. 07:12.880 --> 07:16.480 When you build a Python wheel for distribution, you don't just build one wheel, you have 07:16.480 --> 07:18.040 to build multiple wheels, 07:18.040 --> 07:21.840 one, at least, for each version of Python your customers are going to use, and it's 07:21.840 --> 07:25.120 actually a lot more complicated than that, as we shall see in a minute. 07:25.120 --> 07:32.120 So we can try and get around this problem by, well, actually I anticipated this problem, 07:32.120 --> 07:37.760 and I pre-installed Python 3.12 on jammy, so we should be able to create another virtual 07:37.760 --> 07:47.480 environment that uses Python 3.12. Let me delete the old one, let's create, and then 07:47.480 --> 07:52.480 we should be able to install it. 07:52.480 --> 07:59.920 Okay, let's try that, and this time it works. Let's see, does it actually work? 07:59.920 --> 08:07.280 You can see it says Python 3.12 now. We're going to try and import it, and we can't 08:07.280 --> 08:11.520 import it. It fails, and it fails with another obscure error message. 08:11.520 --> 08:14.840 And if you read the error message, at least it's not two pages long this time, it's complaining 08:14.840 --> 08:16.200 about glibc.
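The file name structure described above can be sketched in a few lines of Python. This is only an illustration: the file name used below is an assumed example (the exact name shown in the demo is not in the transcript), and `parse_wheel_filename` is a hypothetical helper written for this sketch, not pip's own implementation.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]   # optional build tag, usually absent
    python_tag: str        # e.g. cp312 = CPython 3.12
    abi_tag: str           # Python ABI the extension was compiled against
    platform_tag: str      # e.g. linux_aarch64 or manylinux_2_35_riscv64


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        name, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return WheelName(name, version, build, py, abi, plat)


# Assumed example name for a wheel built on noble (Python 3.12, AArch64).
example = parse_wheel_filename("sentencepiece-0.2.0-cp312-cp312-linux_aarch64.whl")
print(example.python_tag, example.abi_tag, example.platform_tag)
```

An installer running under Python 3.10, as on jammy by default, has no `cp312` among its supported tags, which is why such a wheel is rejected as unsupported on that platform.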
08:16.200 --> 08:20.360 And the problem here is that sentencepiece, we compiled it on noble, and noble has 08:20.360 --> 08:27.880 glibc 2.39, and jammy only has glibc 2.35, and when we compiled this on noble, it 08:27.880 --> 08:32.200 took advantage of a symbol that was introduced in glibc 2.38, and that symbol obviously 08:32.200 --> 08:35.640 isn't present on jammy, and so the package won't work. 08:35.640 --> 08:40.640 So our goal of trying to compile on the latest distribution to get the latest version of GCC 08:40.640 --> 08:47.400 sort of failed, but ultimately that's really what you want to do. But at the same time, as 08:47.400 --> 08:52.720 we've just seen, you also need to compile on an old distribution with an old glibc, so the 08:52.720 --> 08:56.800 wheel that you create will work on a wide range of distributions, and that's sort 08:56.800 --> 09:01.440 of a contradiction that we shall return to later on to see how it's solved upstream. 09:01.440 --> 09:05.920 And for now what I'm going to do is I'm just going to build sentencepiece directly on jammy 09:05.920 --> 09:09.680 and try and install it on noble, just to prove that that is the problem. 09:09.680 --> 09:14.400 So again, this is going to take a few seconds to build, and while it's building, I am 09:14.400 --> 09:16.960 going to set up my virtual environment. 09:16.960 --> 09:21.960 I'll just build it wrong. 09:21.960 --> 09:39.360 Okay, it is built. Let me delete the old wheel that's here, and copy in our new wheel. 09:39.360 --> 09:47.400 Let's try and install it. 09:47.400 --> 09:59.120 And it works, so we built on jammy, and we can take that wheel and install it on noble. 09:59.120 --> 10:00.120 That's good. 10:00.120 --> 10:04.520 So the second thing you need to understand is that when building wheels, you need to try and build 10:04.520 --> 10:09.520 on as old a distribution as possible, against the oldest glibc possible.
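The glibc rule from the demo can be written down as a simple version comparison. The sketch below uses the version numbers quoted in the talk; `parse_version` and `glibc_is_new_enough` are hypothetical helper names, and the check is a simplification (real compatibility depends on which versioned symbols the binary actually references).

```python
import platform


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '2.35' into (2, 35)."""
    return tuple(int(part) for part in text.split("."))


def glibc_is_new_enough(required: str, available: str) -> bool:
    """True if the target's glibc is at least the version the wheel needs."""
    return parse_version(available) >= parse_version(required)


# Built on noble, using a symbol from glibc 2.38; jammy ships glibc 2.35.
print(glibc_is_new_enough(required="2.38", available="2.35"))  # False

# Built on jammy against glibc 2.35; noble ships glibc 2.39.
print(glibc_is_new_enough(required="2.35", available="2.39"))  # True

# The C library of the running interpreter; empty strings if it is not glibc.
print(platform.libc_ver())
```

This is why building on the older distribution produced a wheel that installed and ran on both VMs.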
10:09.520 --> 10:13.680 But sentencepiece is actually a pretty simple package. It doesn't 10:13.680 --> 10:16.600 contain a huge amount of code, and it doesn't have any dependencies. 10:16.600 --> 10:19.360 So let's try building something a bit more complicated. 10:19.360 --> 10:22.480 We're going to try and build NumPy. 10:22.480 --> 10:25.360 Here's the script I'm using to build NumPy. This is pretty much how NumPy is built 10:25.360 --> 10:26.360 upstream. 10:26.360 --> 10:32.160 Again, I'm building it inside a virtual environment with all the dependencies that are required. 10:32.160 --> 10:38.360 Okay, yeah, this is actually going to take about 50 seconds on this VM. 10:38.360 --> 10:43.800 So while it's building, allow me to just demonstrate that sentencepiece actually does work. 10:43.800 --> 10:55.480 I have a small test here for sentencepiece, so let's just run that. 10:55.480 --> 10:59.680 This is just tokenizing a sentence using sentencepiece. It says "hello FOSDEM", and so 10:59.680 --> 11:03.640 we've tokenized this, and you can see that it is working fine. 11:03.640 --> 11:06.080 So this should be ready in a minute. 11:06.080 --> 11:09.760 We are building on jammy, because we've learned our lesson that we want to build using 11:09.760 --> 11:17.800 an old version of Ubuntu, and we're going to try and install it on noble when it's finished. 11:17.800 --> 11:24.040 Okay, it's finished, so let's copy the package that we built into our folder, and let's 11:24.040 --> 11:32.320 try pip installing it. 11:32.320 --> 11:35.760 It has installed, but does it work? 11:35.920 --> 11:38.920 Anybody guess? 11:38.920 --> 11:43.080 No, it doesn't work, and we get another huge page of error message.
11:43.080 --> 11:46.640 If you read through the error message, you eventually will come to this line here where it says 11:46.640 --> 11:51.480 "the original error was", and it's complaining about a missing shared library. And so the problem 11:51.480 --> 11:57.480 here is that NumPy, like a lot of other Python packages, has dependencies on shared libraries 11:57.480 --> 12:02.920 in addition to glibc, and that shared library, OpenBLAS, is not installed locally 12:02.920 --> 12:08.960 on this noble VM, so when I try to use NumPy, try to import NumPy, it can't find the library 12:08.960 --> 12:10.760 and it fails. 12:10.760 --> 12:16.880 Now what you could do is you could just try installing OpenBLAS on this noble VM, and 12:16.880 --> 12:21.960 then re-importing NumPy, and it will probably work, but it might not work exactly correctly. 12:21.960 --> 12:28.840 And the reason for this is that when upstream packages like NumPy use dependencies, they often 12:28.880 --> 12:33.760 rely on a very specific version of those dependencies, and they often expect them to be built 12:33.760 --> 12:36.560 in a very specific way, and sometimes to be patched. 12:36.560 --> 12:40.600 So if you just use the distro version of NumPy, you're not necessarily going to get the same 12:40.600 --> 12:44.800 behavior as people who download NumPy from PyPI.
12:44.800 --> 12:55.240 So the way that this is achieved upstream is that when NumPy builds their packages, they use 12:55.280 --> 13:02.240 a tool called auditwheel, and what auditwheel does is that it analyzes all of the binary 13:02.240 --> 13:07.920 components of a wheel, and it finds all of their dependencies, and it checks those dependencies 13:07.920 --> 13:11.880 to see if they're on a whitelist, and if they're not on the whitelist, it copies those dependencies 13:11.880 --> 13:16.160 directly into the wheel. And so when the wheel is distributed on PyPI, the wheel doesn't 13:16.160 --> 13:21.560 just contain NumPy, but it also contains all of NumPy's binary dependencies, so things 13:21.560 --> 13:25.560 like OpenBLAS and libgfortran are packaged inside the wheel. 13:25.560 --> 13:31.560 And so we can just look at that working here. So what I'm going to do is I'm going to use 13:31.560 --> 13:39.080 auditwheel to repair my NumPy wheel with this little script here, and it's written 13:39.080 --> 13:42.360 the wheel to a different directory, so let's have a look. 13:42.360 --> 13:47.560 You can see also that the name of the wheel has changed. Previously, the tag here was linux, 13:47.560 --> 13:50.480 and now it says manylinux_2_35. 13:50.480 --> 13:55.040 And what that means, this manylinux_2_35, is a sort of a guarantee that this particular 13:55.040 --> 14:00.320 wheel will work on any Linux distribution that has glibc 2.35 or greater. And the way 14:00.320 --> 14:09.320 that's achieved is that it's linked against glibc 2.35, any dependencies that it uses 14:09.320 --> 14:14.000 that aren't on this whitelist are copied into the wheel itself, and for any dependencies 14:14.000 --> 14:18.320 that are on the whitelist, it makes sure that the package doesn't use any symbols 14:18.320 --> 14:24.240 in those dependencies that aren't present in a Linux distribution that ships glibc 2.35.
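The repair step described above can be sketched as a classification of a wheel's shared-library dependencies. This is a simplified illustration of the idea, not auditwheel's code: the allowlist below is a small assumed subset chosen for the example, the dependency list is made up, and `split_dependencies` and `manylinux_tag` are hypothetical helpers.

```python
# Illustrative subset only; the real manylinux policy lists more libraries
# and also constrains which symbol versions may be used.
ALLOWED_SYSTEM_LIBS = frozenset(
    {"libc.so.6", "libm.so.6", "libpthread.so.0", "libdl.so.2"}
)


def split_dependencies(needed, allowed=ALLOWED_SYSTEM_LIBS):
    """Separate libraries expected on the target system from those to bundle."""
    system = sorted(lib for lib in needed if lib in allowed)
    bundled = sorted(lib for lib in needed if lib not in allowed)
    return system, bundled


def manylinux_tag(glibc_major: int, glibc_minor: int, arch: str) -> str:
    """Build a platform tag of the form manylinux_<major>_<minor>_<arch>."""
    return f"manylinux_{glibc_major}_{glibc_minor}_{arch}"


# Assumed dependency list for a NumPy-like extension module.
needed = ["libc.so.6", "libm.so.6", "libopenblas.so.0", "libgfortran.so.5"]
system, bundled = split_dependencies(needed)

print("left to the system:", system)
print("copied into the wheel:", bundled)
print("new platform tag:", manylinux_tag(2, 35, "aarch64"))
```

In this sketch OpenBLAS and libgfortran end up in the bundled list, matching what the talk says is packaged inside the NumPy wheel.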
14:24.240 --> 14:38.000 So let us copy our new wheel to our shared folder, and yeah, oh, let's uninstall the 14:38.000 --> 15:03.720 version of NumPy that's installed, the old version, and it's working, import numpy, numpy 15:03.720 --> 15:09.280 one, and it works, okay. 15:09.280 --> 15:13.680 So that is the third thing you have to consider. When you're distributing wheels, you 15:13.680 --> 15:19.160 need to repair them to make sure that they contain all the dependencies packaged directly 15:19.160 --> 15:23.880 within the wheel, to guarantee that they're going to work on your end users' machines. 15:23.880 --> 15:28.200 Okay, so that's the end of the demo. Wasn't too bad. 15:28.200 --> 15:33.440 Let's return to the presentation, can I do that? 15:33.440 --> 15:35.600 And we'll just recap what we just learned. 15:35.600 --> 15:39.560 So when you're distributing binary wheels in Python, you need to build multiple wheels. 15:39.560 --> 15:43.400 You need to build one wheel for each version of Python your customers are going to use. 15:43.400 --> 15:48.240 But actually, if you want to support multiple versions of glibc and 15:48.240 --> 15:52.880 multiple types of C standard library, if you want to support glibc Linux and musl Linux, then 15:52.880 --> 15:57.440 for musl you need to build another set of wheels. If you want to support macOS and Windows, 15:57.440 --> 15:59.040 you need to build another set of wheels. 15:59.120 --> 16:04.640 If you want to support x86, AArch64, RISC-V, PowerPC, you need to build another set of wheels. 16:04.640 --> 16:09.360 And if you multiply all these together, you can end up having to build like 30 or 40 or 50 wheels. 16:09.360 --> 16:15.840 I looked on the PyPI repository for NumPy, and they actually build 54 different wheels for 16:15.840 --> 16:17.440 NumPy. 16:17.440 --> 16:20.280 And also, there are different Python interpreters, right?
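The multiplication described above can be made concrete with a small build matrix. The Python versions and platform tags below are assumed examples picked for illustration, not the set that any particular project actually publishes.

```python
from itertools import product

# Assumed examples: one entry per supported CPython version.
python_tags = ["cp310", "cp311", "cp312", "cp313"]

# Assumed examples: C library, operating system and architecture combinations.
platform_tags = [
    "manylinux_2_28_x86_64",
    "manylinux_2_28_aarch64",
    "musllinux_1_2_x86_64",
    "musllinux_1_2_aarch64",
    "macosx_11_0_arm64",
    "win_amd64",
]

# Every combination needs its own wheel.
build_matrix = list(product(python_tags, platform_tags))

print(len(build_matrix), "wheels to build")
for python_tag, platform_tag in build_matrix[:3]:
    print(f"example-1.0-{python_tag}-{python_tag}-{platform_tag}.whl")
```

Even this small matrix already gives 24 wheels; adding more architectures or alternative interpreters grows it further.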
16:20.280 --> 16:24.480 So there's PyPy, and that needs its own special wheel. 16:24.480 --> 16:27.920 So that's the first tip for building Python packages for distribution: 16:27.920 --> 16:31.200 you need to build lots and lots of different wheels. 16:31.200 --> 16:35.680 The second issue that we have come across is that ideally you want to build with a brand new glibc. 16:35.680 --> 16:41.200 Sorry, GCC, and you can imagine that's important for RISC-V, where we want to build with GCC 14, 16:41.200 --> 16:42.560 so we get vector support. 16:42.560 --> 16:47.360 But you also want to build with an old glibc, so you can widely distribute your wheels. 16:47.360 --> 16:50.480 And finally, we have this issue of dependencies. 16:50.480 --> 16:54.000 The issue of dependencies is actually a lot trickier than I've made it appear. 16:54.000 --> 16:57.520 It's not just a matter of running auditwheel, and the reason it's not just a matter of running 16:57.520 --> 17:02.000 auditwheel is that when you're distributing your package, you have to find out what 17:02.000 --> 17:06.080 version of the dependency that package wants, how it wants it built, how it wants it patched, 17:07.120 --> 17:10.160 and then you have to build it, and then you can repair your wheel. 17:10.160 --> 17:14.720 And often building the dependencies for a binary Python package is much more difficult 17:14.720 --> 17:16.080 than building the package itself. 17:16.160 --> 17:24.160 OK, so this is all pretty tricky, and this is tricky on all platforms and all architectures. 17:24.160 --> 17:30.400 And so the open source community has come up with a lot of projects to make this 17:30.400 --> 17:35.680 easier, to make the creation and distribution of Python packages easier. 17:35.680 --> 17:38.880 And so we talked a little bit about distribution before. There is a central repository 17:39.680 --> 17:41.040 for distributing Python packages. 17:41.040 --> 17:41.920 This is called PyPI.
17:42.400 --> 17:48.000 And when you just do pip install numpy, that's where pip goes by default to find 17:48.000 --> 17:49.040 its packages. 17:49.040 --> 17:56.720 This problem of wanting to build with a new GCC and an old glibc is solved by a project called 17:56.720 --> 18:01.360 manylinux. So manylinux is a project that publishes container images which are 18:01.360 --> 18:04.720 specifically designed to build Python packages. 18:04.720 --> 18:11.120 They ship with an oldish glibc, so the newest manylinux container image is manylinux_2_34, 18:11.200 --> 18:16.800 which ships with glibc 2.34, but it takes advantage of a Red Hat project called GCC Toolset, 18:16.800 --> 18:21.360 which allows you to install the latest version of GCC without updating your glibc. 18:22.160 --> 18:28.960 And so manylinux_2_34 has glibc 2.34, but it also has GCC 14, which is a really nice combination. 18:28.960 --> 18:33.200 The manylinux images also contain the latest versions of the build tools, 18:33.200 --> 18:36.960 stuff like Git and CMake, so you don't have an old version of these tools, which is good, 18:37.520 --> 18:42.480 as that might prevent you building things. And they also come with about 10 different pre-built versions 18:42.480 --> 18:46.080 of Python. And that's important because when you're building multiple wheels, you need separate 18:46.080 --> 18:49.280 versions of Python for each different wheel you want to build. 18:50.960 --> 18:55.280 We've talked a little bit about binary dependencies, and we also looked at auditwheel, 18:55.280 --> 18:59.280 so you need to sort of vendor these binary dependencies into your wheel, and that is done by 18:59.280 --> 19:03.840 auditwheel, and it can also be done by a tool called maturin. Maturin is a tool designed to build 19:03.920 --> 19:09.440 hybrid Python/Rust packages, and it has this repair facility.
19:10.880 --> 19:15.200 We also talked about how you might have to build about 50 of these different wheels, 19:15.200 --> 19:19.120 and for each of these wheels you have to build, you'll have to identify the correct Docker 19:19.120 --> 19:22.720 image, the manylinux image, to build them in. You've actually got to build the thing, 19:22.720 --> 19:26.240 you've got to repair the wheel, and then you've got to run all your tests, and you've got to do it 19:26.240 --> 19:30.720 50 times. So to prevent people from having to do this in every single upstream project, there's a 19:30.800 --> 19:34.080 program called cibuildwheel, which automates this process for you. 19:36.160 --> 19:41.120 Once the wheel is built, you're going to need to test it in your CI, so you're going to need, 19:41.120 --> 19:44.960 and this is, yeah, you're not going to want to do this locally, you're going to want to do this in CI, 19:44.960 --> 19:50.640 so you're going to need some runners to do this. You're going to run a lot of tests to make 19:50.640 --> 19:56.560 sure your wheel works, and then finally, once it's built and tested, you can upload it to the 19:56.560 --> 20:00.320 package registry, and there's a bunch of tools you can use to upload and install here. 20:00.320 --> 20:04.880 So twine and maturin can be used to upload packages to registries like PyPI, and pip 20:04.880 --> 20:12.640 and uv can be used to install packages. Okay, so to summarize, building Python packages 20:12.640 --> 20:21.520 is difficult, but on the more popular architectures like Arm and x86, there's a whole pile of 20:21.520 --> 20:26.560 projects that aim to make the problem easier. 20:27.360 --> 20:32.240 But if you look at riscv64, you will see that most of these projects 20:32.240 --> 20:40.240 don't actually support riscv64.
The only ones that do are auditwheel and maturin, and the upload and install 20:40.240 --> 20:49.680 tools, so twine, maturin, pip and uv all support riscv64. So the problem is, on riscv64, 20:49.680 --> 20:53.840 not only do we have this really difficult problem to solve, but a lot of the tools that are 20:53.920 --> 21:01.280 normally used to solve these problems don't currently support riscv64. Okay, but don't get too 21:01.280 --> 21:08.160 disheartened. If we were to have looked at this slide this time last year, 21:08.160 --> 21:13.280 everything would be red in the right-hand column. So progress has been made in the last year 21:13.280 --> 21:18.560 in getting riscv64 support into some of these projects, but it's still not there, and so for the 21:18.560 --> 21:23.840 time being the problem still exists on riscv64, and it's a big problem, because it's difficult 21:23.840 --> 21:30.880 to install these very useful Python packages. So recognizing this, RISE, which is an industry 21:30.880 --> 21:37.200 consortium designed to improve the RISC-V ecosystem, has created a project, an open source project 21:37.200 --> 21:45.120 called Wheel Builder. And the Wheel Builder project is, well, what it does is it builds 21:45.280 --> 21:52.720 and distributes a small set of binary Python packages that we think will be useful. 21:54.080 --> 22:00.080 We're currently building about 30 packages, including things like NumPy, SciPy, pandas, Matplotlib, 22:00.080 --> 22:05.120 maturin, and we're planning to build more packages in the future. Not only do we build them, 22:05.120 --> 22:10.320 but we also make sure that they're kept up to date. So, you know, we don't just build, like, NumPy 22:11.280 --> 22:16.000 1.26.3 and then leave it there for two years. You know, we've been building newer versions 22:16.000 --> 22:25.040 of NumPy as well.
All the things we're building are manylinux_2_35 riscv64 images, and they're 22:25.040 --> 22:31.200 built with RV64GC, so they should run on pretty much any device. And we go to great lengths to 22:31.200 --> 22:37.280 build these wheels in the same way that they're built upstream. So the behavior of the wheels we 22:37.280 --> 22:41.840 build on Wheel Builder should match the behavior of the wheels you download for other architectures 22:41.840 --> 22:47.920 on PyPI. So, for example, if we go back to the NumPy dependency problem, when we build and package 22:47.920 --> 22:53.360 OpenBLAS inside our NumPy wheels, we make sure we're building and packaging the exact same 22:53.360 --> 22:59.520 version of OpenBLAS that upstream NumPy does. So you should see the same behavior. And the wheels 22:59.680 --> 23:06.320 are tested, mostly, I say here, because sometimes it's not possible to run all the tests. 23:06.320 --> 23:11.040 It can be because there's a RISC-V-specific bug, but generally it's because the test will 23:11.040 --> 23:16.560 require a dependency that doesn't exist on riscv64. And so in those cases, we disable the tests. 23:16.560 --> 23:21.840 But in most cases, the wheels are tested in the CI before they're uploaded, as well as we 23:21.840 --> 23:26.560 can test them, by running the project's normal tests. So when you download wheels from 23:26.640 --> 23:33.120 Wheel Builder, the wheels should work. So you're wondering, how do I download the wheels from 23:33.120 --> 23:39.120 Wheel Builder? Well, it's pretty simple. The way this actually works is that the Wheel Builder 23:39.120 --> 23:44.320 project is in GitLab, and GitLab actually allows projects to create their own Python package 23:44.320 --> 23:49.120 registries. So we build the wheels in GitLab and we upload them to the package registry associated 23:49.120 --> 23:55.840 with that project. It has quite a catchy URL here.
But to install the packages, all you need to 23:56.400 --> 24:04.880 do is upgrade pip. You need pip 24.1 or greater for riscv64 support. So if you have an earlier 24:04.880 --> 24:10.080 version, you'll have to upgrade pip first. Then you need to tell pip to use the GitLab registry 24:10.080 --> 24:14.240 instead of PyPI. And you do that by setting this environment variable, or you can actually 24:14.240 --> 24:19.920 specify this on the command line using the --index-url option. And then you just install the 24:19.920 --> 24:24.240 package. And so I've got a little video here just to demonstrate this working. 24:25.840 --> 24:37.600 So, as you can see, I'm back on a RISC-V VM. I only have pip 24.0 here. So I'm going to upgrade pip 24:37.600 --> 24:41.440 first. Otherwise, I will get an error when I try and install packages from Wheel Builder. 24:41.520 --> 24:56.400 So, the first thing we're doing, we're updating pip. Now we're 24:56.400 --> 25:06.160 going to export our index URL, so pip knows where to find the packages. We're going to 25:06.160 --> 25:10.400 try and install sentencepiece. Now, remember, this failed spectacularly right at the start of the 25:10.400 --> 25:16.160 talk. But now we've specified the right index URL. It downloads the package and installs it, 25:16.160 --> 25:21.920 and it installs it really quickly. Let's try and install NumPy, just to give you another example. 25:22.960 --> 25:26.640 And NumPy is a bit of a bigger package. It's got all those dependencies. So it takes a little longer 25:26.720 --> 25:41.920 to install, but it should be installed pretty soon. Wow, that's slow. Yes, it's installed. 25:41.920 --> 25:45.520 And now let's just start Python and import NumPy, just to demonstrate it works. 25:45.840 --> 25:57.200 And indeed, it does. Now, there's just one more thing that I want to show you. I'm going to 25:57.200 --> 26:01.040 try and install another package.
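The installation steps just described can be sketched from Python as follows. The index URL here is a placeholder, since the real registry URL is only shown on the slide and is not in the transcript; `pip_install_command` and `pip_is_new_enough` are hypothetical helpers, and the 24.1 minimum is the figure quoted in the talk. The sketch assumes plain numeric version strings such as "24.0".

```python
import os
import sys

# Placeholder only: substitute the registry URL shown on the project page.
INDEX_URL = "https://example.invalid/simple"


def pip_is_new_enough(pip_version: str, minimum=(24, 1)) -> bool:
    """Check a pip version string such as '24.0' against the quoted minimum."""
    parts = pip_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) >= minimum


def pip_install_command(package: str, index_url: str = "") -> list[str]:
    """Build a pip invocation, optionally pointing at an alternative index."""
    command = [sys.executable, "-m", "pip", "install"]
    if index_url:
        command += ["--index-url", index_url]
    command.append(package)
    return command


print(pip_is_new_enough("24.0"))   # the VM in the video needed an upgrade first
print(pip_is_new_enough("24.1"))
print(pip_install_command("sentencepiece", INDEX_URL))

# Equivalent to exporting the variable in the shell before running pip.
environment = dict(os.environ, PIP_INDEX_URL=INDEX_URL)
```

The command list and environment could be passed to `subprocess.run`; the sketch stops short of running pip so that it has no side effects.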
This time, I'm going to install something called twine. We mentioned 26:01.040 --> 26:09.760 twine earlier on. Twine is a package for uploading, oh, it's a package for uploading 26:09.760 --> 26:16.640 other wheels to package registries. So the reason twine is interesting is because it's a 26:16.640 --> 26:22.320 pure Python package, but it has dependencies that are binary dependencies. And so I'm able to 26:22.320 --> 26:29.680 install twine here, even though I've told pip to go and look at the Wheel Builder GitLab 26:29.680 --> 26:36.880 package registry, and not to go to PyPI. And we don't have twine in the Wheel Builder project, 26:36.880 --> 26:41.760 because it's a pure Python package. There's no point in us building it. And yet, this still works. 26:41.760 --> 26:46.880 And the reason it works is that when you send a request to GitLab, the GitLab 26:46.880 --> 26:51.360 package registry will first check to see if it knows anything about that package. 26:51.360 --> 26:55.520 And if it doesn't have any wheels for that package, and it doesn't know anything about that package, 26:55.520 --> 27:01.120 it forwards the request to PyPI itself. And so this allows us to just use one index 27:01.120 --> 27:09.280 URL to install a mixture of pure Python and binary Python packages, which is quite nice. 27:10.320 --> 27:16.960 So, that is pretty much it for the talk. What I might just do is pop up the Wheel Builder project, 27:16.960 --> 27:23.760 so you guys can see it. So it's here on GitLab. I guess I have the link in the talk at the end, 27:23.760 --> 27:29.520 so let me just show you quickly the project.
Here's the list of the packages we're building, 27:29.600 --> 27:33.040 and you can see we're building multiple versions of some packages, so there's CMake, 27:35.040 --> 27:43.440 MarkupSafe, matplotlib, OB tree, lots of different packages, pandas, tornado, TL parts, 27:43.440 --> 27:49.360 and you can just download those using that link and install them directly. And let me see, yes. 27:51.440 --> 27:53.680 Yep, and that is the end of the talk. 27:59.920 --> 28:04.960 Right, we have a couple of minutes for questions. 28:15.840 --> 28:21.840 Thank you, great project. What would it take to adapt the wheel builder for other architectures, 28:21.840 --> 28:26.000 like LoongArch? Sorry, could you repeat the question please? 28:26.000 --> 28:30.000 Like, what would it take to adapt the wheel builder for other architectures, like LoongArch? 28:30.800 --> 28:42.320 Oh, I see. Ah, yes, that's an excellent question, and you reminded me of one or two things I meant to say 28:44.080 --> 28:49.600 that I didn't say. So, the way we're building these wheels: you remember when I was talking earlier on about 28:50.320 --> 28:55.600 how all these infrastructure projects are needed to build wheels for Python and how a lot of those 28:55.600 --> 29:00.320 are missing for riscv64. And the two important ones were manylinux and cibuildwheel. 29:01.360 --> 29:04.640 Because they don't exist for riscv64, we have had to patch them ourselves. 29:05.440 --> 29:09.600 So we have created our own manylinux container based on Ubuntu 22.04 29:10.480 --> 29:14.560 and our own patched version of cibuildwheel. And we use those internally to build the wheels. 29:14.560 --> 29:19.040 So to do this for another architecture that is not supported by manylinux or cibuildwheel, 29:19.040 --> 29:22.720 you would need to do the same thing. And you would also need to make sure auditwheel 29:23.120 --> 29:28.480 supports LoongArch. And I'm not sure it does. 
I would have to check the policy files for auditwheel. 29:28.480 --> 29:34.240 And that is the tool you need to repair the wheels. So, if auditwheel doesn't do that, 29:34.240 --> 29:40.080 then you would need a patched version of auditwheel as well. So it's definitely doable, 29:40.080 --> 29:43.520 but you would need to patch the projects that we have had to patch ourselves. 29:48.880 --> 29:49.360 Any more? 29:53.200 --> 29:58.960 Is that going to mean to be many dials? Or let's say those are 13 per unit, or the 29:58.960 --> 30:03.440 twice sunshine, or something like this, which means that we are going to be many off-back edges. 30:04.400 --> 30:11.360 So that's a very good question. At the moment, we're building everything with RV64GC, okay. 30:14.240 --> 30:21.520 It should be possible now to build NumPy. If we were to build it with GCC 14, 30:21.520 --> 30:24.960 so we'd need to create a new manylinux container, a manylinux_2_39 container, 30:24.960 --> 30:30.800 we should be able to build a NumPy that has some vector support in it, right? And that vector 30:30.800 --> 30:35.440 support would come through OpenBLAS. And because OpenBLAS does runtime detection of what 30:35.440 --> 30:41.680 extensions are available, that will work on all platforms. So it will work on, you know, 30:41.680 --> 30:46.240 something that doesn't have vector and something that does. But more generally speaking, there is no 30:46.240 --> 30:50.320 real way in Python at the moment, and this isn't just a RISC-V problem, there's no way to 30:50.320 --> 30:57.680 tag your wheels with information about what extensions those wheels expect. And this is a problem 30:57.680 --> 31:04.160 across all the architectures, but it hasn't been noticed until very recently, because the x86 wheels 31:04.160 --> 31:11.040 have been built on distros that target x86-64-v1. 
But that has just changed recently with AlmaLinux 31:11.120 --> 31:20.240 in manylinux_2_34, now targeting x86-64-v2. And so some wheels built with manylinux_2_34 have been 31:20.240 --> 31:26.720 crashing on really, really old machines, and there's no way really at the moment to label 31:27.920 --> 31:33.600 those wheels. And so, yes, that is sort of an open issue in the Python packaging 31:33.600 --> 31:37.840 ecosystem, and it doesn't just apply to RISC-V, it applies to all the architectures, but it's 31:37.920 --> 31:41.920 really good that it's just come up in x86 because they're going to have to try and solve it, 31:41.920 --> 31:48.240 and we can get in on the act. I should also mention, actually, sorry, there's one other thing I 31:48.240 --> 31:52.800 want to mention: we don't want to do this forever. So our goal is just to build these wheels until 31:52.800 --> 31:57.600 such a time as all those infrastructure projects start supporting RISC-V, and all the 31:57.600 --> 32:02.160 upstream projects start uploading RISC-V wheels to PyPI. And when that happens, we'll 32:02.240 --> 32:06.240 start to wind the project down. So it's not something we're going to do forever. 32:14.160 --> 32:19.760 All right, thank you, Mark.
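The wheel-labelling limitation raised in the last answer can be illustrated by taking a wheel filename apart. The sketch below assumes the standard wheel naming convention (distribution, version, optional build tag, Python tag, ABI tag, platform tag) and the `manylinux_X_Y_arch` platform tag form; the helper names and the example filename are hypothetical. The point is that the platform tag carries a glibc version and an architecture, and nothing that says which ISA extensions the compiled code expects.

```python
import re


def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of components in {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def split_manylinux_tag(platform_tag):
    """Return (glibc_major, glibc_minor, arch) for a manylinux_X_Y_arch tag."""
    match = re.fullmatch(r"manylinux_(\d+)_(\d+)_(.+)", platform_tag)
    if match is None:
        raise ValueError(f"not a manylinux_X_Y tag: {platform_tag!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)


# Hypothetical filename, for illustration only.
EXAMPLE = "numpy-0.0.0-cp312-cp312-manylinux_2_35_riscv64.whl"
```

Parsing `EXAMPLE` yields a platform tag of `manylinux_2_35_riscv64`, which splits into glibc 2.35 and `riscv64`. A wheel built for plain RV64GC and one built assuming vector support would carry the same filename under this scheme.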