WEBVTT 00:00.000 --> 00:16.800 Hi everyone, and welcome to Season 2 of the smolBSD series. 00:16.800 --> 00:23.320 So I am Emile, I work in a bank. 00:23.320 --> 00:29.800 I have a flying phobia, this didn't go away since last year. 00:29.800 --> 00:41.000 I've been using NetBSD since 1998, yes, I am that old, and I have been a NetBSD committer since 2009. 00:41.000 --> 00:50.280 For those of you who use NetBSD, I'm the initial pkgin author, now handled by Jonathan, 00:50.280 --> 00:58.200 who is doing a marvelous job with it, and I am the author of the microvm version of 00:58.200 --> 01:01.800 NetBSD, which I will talk about. 01:01.800 --> 01:06.960 Hi everyone as well, I'm khorben, Pierre Pronchery, also a NetBSD user since 2005, committer 01:06.960 --> 01:12.000 since 2012, and one change since last year is that now I'm also a FreeBSD committer, through 01:12.000 --> 01:15.400 my work at the FreeBSD Foundation. 01:15.400 --> 01:33.880 Okay, so previously on smolBSD: last year I presented, at this conference, the first 01:33.880 --> 01:40.480 work to have a NetBSD kernel booting in a few milliseconds. 01:40.480 --> 01:48.200 The first job was to add the PVH capability to the kernel, so it can 01:48.200 --> 01:53.640 be called by QEMU with the -kernel flag. 01:53.640 --> 02:00.240 So that's a bit of assembly work, and some bits. 02:00.240 --> 02:06.680 The second piece of work was implementing the MMIO handling. 02:06.680 --> 02:16.440 When you call a virtual machine on NetBSD or FreeBSD, or Linux, with MMIO, the 02:16.440 --> 02:23.640 parameters, the devices, the virtual devices are declared on the command line, so you just 02:23.640 --> 02:29.920 have to parse the command line and declare them as devices. 
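As a rough illustration of the command-line parsing step just described, here is a minimal C sketch. It assumes the Linux-style parameter `virtio_mmio.device=<size>@<base>:<irq>` that QEMU's microvm machine appends to the kernel command line. The struct and function names are hypothetical, size suffixes such as `K` are not handled, and this is not the NetBSD implementation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One virtio-mmio device as described on the kernel command line. */
struct mmio_dev {
	uint64_t size;	/* length of the register window */
	uint64_t base;	/* physical base address */
	unsigned irq;	/* interrupt line */
};

#define MMIO_KEY "virtio_mmio.device="

/*
 * Scan a command line for "virtio_mmio.device=<size>@<base>:<irq>"
 * entries and fill out[]. Malformed entries are skipped.
 * Returns the number of devices found (at most max).
 * (Hypothetical helper, for illustration only.)
 */
static int
parse_mmio_cmdline(const char *cmdline, struct mmio_dev *out, int max)
{
	const char *p = cmdline;
	int n = 0;

	while (n < max && (p = strstr(p, MMIO_KEY)) != NULL) {
		char *end;

		p += strlen(MMIO_KEY);
		out[n].size = strtoull(p, &end, 0);
		if (end == p || *end != '@')
			continue;
		p = end + 1;
		out[n].base = strtoull(p, &end, 0);
		if (end == p || *end != ':')
			continue;
		p = end + 1;
		out[n].irq = (unsigned)strtoul(p, &end, 0);
		if (end == p)
			continue;
		p = end;
		n++;
	}
	return n;
}
```

Each entry found this way gives the kernel enough information (window size, base address, interrupt) to attach one virtio device without any bus probing.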
02:29.920 --> 02:39.080 Then the delays: you know, the actual delay functions that were in the NetBSD source code, well, 02:39.080 --> 02:48.120 kernel code, and in our case, the virtual machine case, those were not useful at all, so 02:48.120 --> 02:56.680 I killed them, and all this to have the machine type, it's called like that, 02:56.680 --> 03:04.440 the QEMU machine type microvm, working with NetBSD fully, and that's actually working 03:04.440 --> 03:16.800 and integrated in the upcoming NetBSD 11 kernel, which should be out real soon, we guess, and 03:16.800 --> 03:24.760 yes, it has the new MICROVM configuration available for everybody. 03:25.720 --> 03:33.120 Oh, yeah, it also happens to work with another virtual machine manager, which is called 03:33.120 --> 03:44.000 Firecracker, which is pretty good, but misses some features and makes the kernel load in a very, 03:44.000 --> 03:50.000 very long time, like 15 milliseconds, and that's not acceptable. 03:50.080 --> 04:01.960 Okay, so, this is where season two starts. So NetBSD 11, and the version I presented last 04:01.960 --> 04:16.920 year, had all the needed devices to use a machine, okay, network, disk, well, everything you 04:17.000 --> 04:27.160 basically need to use a virtual machine, but there was something missing in that code. 04:27.160 --> 04:36.040 When you are using a virtual machine, even if you are using virtio to virtualize disk, 04:36.040 --> 04:42.800 network and so on, if you don't do anything about it, you are still emulating 04:42.880 --> 04:51.120 the serial console, for example, okay. What that means is that for every character that's written 04:51.120 --> 05:01.560 on the screen, you do what we call a VM exit, meaning you get out of the virtual machine 05:01.560 --> 05:13.760 to execute a machine instruction, well, basically a command, for the character to be printed 05:13.760 --> 05:27.280 on the serial console, and there is actually a way to use virtio devices to do that, and 
05:27.280 --> 05:37.520 it's viocon. So, the first thing I implemented is the early viocon device, 05:37.520 --> 05:46.880 meaning you get to see what's happening in the very first moments of the boot. The basic driver was already 05:46.960 --> 06:01.040 implemented, and I put things together to have multiple virtio consoles. This has benefits 06:01.040 --> 06:13.000 that I didn't think it would have. First, the first gain is performance, okay, the first 06:13.080 --> 06:19.880 gain is, as you are using a virtio device, you don't do VM exits anymore, okay, you stay 06:19.880 --> 06:27.480 in the virtual machine to print characters. The method is pretty neat, actually: virtio 06:27.480 --> 06:35.240 declares, well, you declare, a register, just called the emergency write register, 06:35.240 --> 06:41.640 where you just put your characters and they are printed on the screen, that's neat. The 06:41.800 --> 06:50.120 multiple serial lines feature, what it provides is, well, obviously multiple serial lines, which 06:50.120 --> 07:00.200 can be, for example, used to control the VM, or maybe you can use it to redirect stderr to another 07:00.200 --> 07:09.480 console. You can think about many, many, many use cases. I did it, firstly, because, I don't 07:10.440 --> 07:17.960 know, those of you who know Sergio Lopez, he works at Red Hat, and he's the author of the 07:17.960 --> 07:24.840 microvm machine type for QEMU, and he's working on a project which is absolutely amazing, 07:24.840 --> 07:34.760 it's called krun. krun, basically, is a project that allows you to run commands inside VMs, 07:34.840 --> 07:43.880 and you obviously have to have a fast-booting VM to use that. And he told me, well, krun uses 07:45.160 --> 07:52.440 multiple virtio consoles to run the commands, so do you have in mind to do the work, and yes. 
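To make the emergency write register mentioned above more concrete, here is a minimal C sketch under stated assumptions. The config layout and the feature bit come from the virtio specification's console device section, and 0x100 is the offset of device-specific configuration in the virtio-mmio transport. The function names are hypothetical, a little-endian guest is assumed, and feature negotiation is left out. This is not the NetBSD viocon code.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of device-specific config space in a virtio-mmio window. */
#define VIRTIO_MMIO_CONFIG		0x100
/* Feature bit announcing that emerg_wr is usable. */
#define VIRTIO_CONSOLE_F_EMERG_WRITE	2

/* Console device configuration layout (little-endian fields). */
struct virtio_console_config {
	uint16_t cols;
	uint16_t rows;
	uint32_t max_nr_ports;
	uint32_t emerg_wr;	/* write a character here to print it */
};

/*
 * Print one character by storing it into the emergency write register.
 * No virtqueue has to be set up, which is why this suits early boot.
 * (Hypothetical helper; mmio_base is the mapped device window.)
 */
static void
early_putc(volatile uint8_t *mmio_base, char c)
{
	volatile uint32_t *emerg = (volatile uint32_t *)(mmio_base +
	    VIRTIO_MMIO_CONFIG +
	    offsetof(struct virtio_console_config, emerg_wr));

	*emerg = (uint32_t)(unsigned char)c;
}

/* Print a NUL-terminated string one character at a time. */
static void
early_puts(volatile uint8_t *mmio_base, const char *s)
{
	while (*s != '\0')
		early_putc(mmio_base, *s++);
}
```

A real driver would first check that the device offers `VIRTIO_CONSOLE_F_EMERG_WRITE` before relying on the register.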
07:53.880 --> 07:59.800 And there's another feature that's pretty cool, that I wasn't expecting, which is that QEMU can use 08:00.760 --> 08:09.240 those multiple serial consoles as sockets. So think about it, you start your 08:09.240 --> 08:18.200 VM, and suddenly you can communicate from the host to the VM and vice versa using the socket. 08:19.640 --> 08:23.640 Let's see, let's see a quick demo. 08:24.280 --> 08:41.800 Okay, so, I'll start simple, yeah, I'll do this. I can hold it, don't worry. 08:42.760 --> 08:54.120 I'll start a simple VM, and I will declare a supplementary port, okay, so that's a VM booting, okay. 08:56.200 --> 09:07.480 And, oh, there we go. So, on the left, I'm in the VM. 09:07.480 --> 09:16.840 You can see the host socket right here, and here I'm on the right, and I'm going to do 09:17.800 --> 09:32.920 socat, for example, no, nc stuff. Yeah, so host, guest, and I can do this kind of thing. 09:37.000 --> 09:44.600 This is the name of the device, of the viocon device, and that's, I mean, 09:45.560 --> 09:48.360 that doesn't look like much, but it's really cool. 09:57.400 --> 10:05.400 Think of it, like you can control your VM from the host and vice versa. I know there are a lot 10:05.400 --> 10:14.680 of security issues with that. I know it, so use it with extra care, but, all right. 10:17.000 --> 10:24.280 Now, so that was one way to speed up the boot, which was [unclear], and now. 10:26.200 --> 10:31.080 So that's my part. Credit where credit is due, Emile did maybe 99% of what's in this talk. 10:31.960 --> 10:37.560 And I waited a long time before doing the remaining 1%, because if you gain one millisecond from 10, 10:38.120 --> 10:42.200 it's a 10% improvement, but if I had done it at the beginning, it would have been just 1%. 10:42.200 --> 10:48.600 So now I can claim a 10% increase. Thanks to the support of memory disks. 
So when you load the 10:48.600 --> 10:53.960 kernel from QEMU, you use what's called generic PVH to load the kernel. And unfortunately, 10:53.960 --> 11:00.520 in NetBSD, we didn't implement, until recently, support for memory disks when loading the kernel 11:00.520 --> 11:08.520 directly in GENPVH mode. The reason why you may want to have memory disk support for NetBSD is, 11:08.520 --> 11:13.480 for instance, if your root file system is on ZFS, it's one way to actually implement it. 11:13.480 --> 11:19.000 You initialize the system from your RAM disk and then you chroot into your ZFS volume. 11:19.400 --> 11:26.280 Works great also for cryptography. Something I also did many years ago with cgd, the NetBSD driver for that. 11:26.840 --> 11:30.600 Also, one thing you may not know is that the installer of NetBSD runs in a RAM disk, 11:30.600 --> 11:37.560 so you can actually take the generic kernel, use the RAM disk to boot on it and it can install the 11:37.560 --> 11:43.000 system. But basically with that, you also have zero driver code, because the access to the 11:43.000 --> 11:48.200 file system is in memory. You only have to copy it once so you can use it safely. So it's 11:48.200 --> 11:52.920 literally just fire and forget, which is where we gain a bit of performance when booting the system. 11:52.920 --> 11:58.680 There's no driver initialization code or anything. So the way this is implemented in GENPVH mode, 11:58.680 --> 12:03.720 there is a structure, which I think was standardized by Xen, called hvm_start_info. 12:05.720 --> 12:11.400 So in NetBSD, the way this works is that you have a file called genassym, assembly symbols, 12:11.480 --> 12:17.080 genassym.cf, which basically helps you access the content of C structures from assembly code, 12:17.640 --> 12:21.880 then in locore.S, which is one of the first files that's actually booted by the kernel, 12:22.600 --> 12:29.160 therefore, in assembly. 
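For reference, the structure named above can be sketched in C as follows. The field layout and the magic value follow the Xen PVH boot ABI header; the helper function is hypothetical and assumes physical addresses can be dereferenced directly (identity mapping), which is only a simplification for illustration and not how the NetBSD code is written.

```c
#include <stdint.h>

/* Magic value found in hvm_start_info.magic ("xEn3"). */
#define XEN_HVM_START_MAGIC_VALUE 0x336ec578U

/* Boot information handed to a PVH guest by the loader. */
struct hvm_start_info {
	uint32_t magic;		/* XEN_HVM_START_MAGIC_VALUE */
	uint32_t version;
	uint32_t flags;
	uint32_t nr_modules;	/* number of entries in the module list */
	uint64_t modlist_paddr;	/* physical address of the module list */
	uint64_t cmdline_paddr;	/* physical address of the command line */
	uint64_t rsdp_paddr;	/* physical address of the ACPI RSDP */
	uint64_t memmap_paddr;	/* version 1 and later */
	uint32_t memmap_entries;
	uint32_t reserved;
};

/* One loaded module, for example a RAM disk image. */
struct hvm_modlist_entry {
	uint64_t paddr;		/* where the loader placed the module */
	uint64_t size;		/* module length in bytes */
	uint64_t cmdline_paddr;	/* optional per-module command line */
	uint64_t reserved;
};

/*
 * Sum the sizes of all modules, i.e. how much room they need once
 * copied after the kernel. Returns 0 on a bad magic.
 * (Hypothetical helper; assumes paddr == vaddr.)
 */
static uint64_t
modules_total_size(const struct hvm_start_info *si)
{
	const struct hvm_modlist_entry *mod;
	uint64_t total = 0;

	if (si->magic != XEN_HVM_START_MAGIC_VALUE)
		return 0;
	mod = (const struct hvm_modlist_entry *)(uintptr_t)si->modlist_paddr;
	for (uint32_t i = 0; i < si->nr_modules; i++)
		total += mod[i].size;
	return total;
}
```

In the kernel itself the equivalent walk happens in assembly, which is where the generated structure offsets from genassym come in.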
I implemented more of the members of this struct hvm_start_info, 12:29.160 --> 12:36.360 namely the modules. So instead of simply copying the command line in memory, 12:37.080 --> 12:41.560 at the end of the zone where the kernel can start, I also implemented the modules, 12:41.560 --> 12:46.280 so just getting the address of the modules, putting them in memory after the kernel, 12:47.160 --> 12:53.400 then following up with the command line as usual. And basically, when you do that, you can also 12:53.400 --> 13:00.760 patch x86_machdep. It's in this part of the kernel, because it's still in 32-bit 13:00.840 --> 13:05.480 mode at this point, which is why the code is shared between i386 and amd64; in 13:05.480 --> 13:13.720 NetBSD that's in the generic x86 folder. And basically, because this is just a bunch of data in 13:13.720 --> 13:22.520 memory areas, the code actually checks the magic for whatever is being loaded. There is a check 13:22.520 --> 13:27.560 for the ELF magic, so loading kernel modules, there is a check for PNG and JPEG files, 13:27.560 --> 13:32.200 because in NetBSD we also support, or supported, splash images. So actually, right now, 13:32.200 --> 13:38.120 I couldn't really test that it actually works. And then the code falls back to RAM disks, 13:38.760 --> 13:45.880 to file system images. So basically, just telling the kernel, okay, this is your root file system, 13:45.880 --> 13:50.840 just boot it. And with Emile over lunch, we were just thinking, oh, just change everything. 13:50.840 --> 13:55.480 No, we were just thinking, let's maybe add support for device trees, and maybe then we can 13:55.560 --> 14:03.000 have more flexibility in the way all of this is implemented. So with this, I can pass the mic 14:03.000 --> 14:17.640 back to the boss. Keep in mind that, for example, with what Pierre did, you can 14:18.040 --> 14:24.360 boot like a trashable virtual machine that boots in milliseconds, okay. 
So think about it, 14:24.360 --> 14:33.640 for example, as a temporary SSH server, or proxy, rebound, whatever. And the second thing I want to 14:33.720 --> 14:44.040 add to what Pierre said is that we support, I mean, this code supports amd64, obviously, 14:44.040 --> 14:58.760 but also 32 bits. We didn't trash 32 bits, we didn't. So you are seeing that smolBSD and the 14:58.840 --> 15:06.760 work that we are doing on the NetBSD kernel is going towards using the VM just like an app, 15:06.760 --> 15:17.400 just like an executable. But what we need to actually make use of an executable 15:17.960 --> 15:29.720 VM is at least variables. Right now, if you want to give a piece of information to a virtual machine, 15:31.480 --> 15:40.600 a variable, a file, well, you have to either share a file system, put something online or have some 15:40.680 --> 15:48.360 kind of trick that will fetch something from somewhere. And actually, I discovered that 15:48.360 --> 16:02.120 QEMU, in the one million, 232 pages of documentation, has a feature called, well, you know, I love 16:02.760 --> 16:12.120 it, just to be clear. They just have a marketing problem. And for example, I discovered a 16:12.120 --> 16:24.920 feature that's amazing, that's called fw_cfg. This means transferring to the guest either variables or files. 16:24.920 --> 16:38.600 I mean, that rocks, okay, but the name, fw_cfg. So the cool thing is that you can just throw 16:38.600 --> 16:47.720 variables or files at the VM. And into smolBSD, I put this little snippet that actually permits 16:47.800 --> 16:51.800 the following. 16:58.840 --> 17:07.000 And now, let's see, same [unclear], I'm going to export a variable 17:07.080 --> 17:19.800 [unclear], okay. And now, oh, so, the variable went from the host to the guest. 
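The snippet shown on screen is not captured in this transcript. As a hedged illustration of what a guest has to do to find such an item, here is a minimal C sketch that searches a fw_cfg file directory. The directory layout (big-endian count, then 64-byte entries with size, selector, reserved field and a 56-byte name) and the selector 0x19 follow QEMU's fw_cfg specification. On the host side, items are added with `-fw_cfg name=opt/...,string=...` or `-fw_cfg name=opt/...,file=...`. The function name and the `opt/org.example/var` item name are hypothetical, and reading the directory from the device is left out.

```c
#include <stdint.h>
#include <string.h>

#define FW_CFG_FILE_DIR		0x19	/* selector of the file directory */
#define FW_CFG_NAME_LEN		56	/* size of the name field */
#define FW_CFG_ENTRY_LEN	64	/* 4 + 2 + 2 + 56 bytes */

/* fw_cfg directory fields are stored big-endian. */
static uint32_t
be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint16_t
be16(const uint8_t *p)
{
	return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/*
 * Look up a named item in a raw copy of the fw_cfg file directory.
 * On success, store its selector and size and return 0; else -1.
 * (Hypothetical helper; dir/len describe a buffer already read
 * from selector FW_CFG_FILE_DIR.)
 */
static int
fwcfg_lookup(const uint8_t *dir, size_t len, const char *name,
    uint16_t *select, uint32_t *size)
{
	uint32_t count;
	const uint8_t *e;

	if (len < 4)
		return -1;
	count = be32(dir);
	e = dir + 4;
	for (uint32_t i = 0; i < count &&
	    (size_t)(e - dir) + FW_CFG_ENTRY_LEN <= len;
	    i++, e += FW_CFG_ENTRY_LEN) {
		if (strncmp((const char *)(e + 8), name,
		    FW_CFG_NAME_LEN) == 0) {
			*size = be32(e);
			*select = be16(e + 4);
			return 0;
		}
	}
	return -1;
}
```

Once the selector is known, the guest selects it and reads `size` bytes from the fw_cfg data register to obtain the variable or file content.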
Now, 17:19.800 --> 17:48.840 same thing with, oh, yeah, I suppose, and here, you will have your file, I mean, that's, 17:50.200 --> 17:57.480 the reason why the file is named like this is, that's how QEMU asks you to name the files. 17:57.480 --> 18:02.360 I mean, you could just name it like you want, but they want it like that. So yeah, 18:02.360 --> 18:14.360 here, I have my /etc/hosts shared on the guest. Now, think about it. You can pass files 18:14.440 --> 18:20.280 and variables. So this could be a shell script, okay, that would act differently 18:21.080 --> 18:31.160 given a variable, okay. You can think about many, many ways to use this kind of stuff 18:31.160 --> 18:37.560 and put it together with the virtio console thing. Okay, I'm good. 18:37.640 --> 18:52.360 Okay, so that was the demo. And now, okay, I hope you are not going to hate me for that, 18:52.360 --> 19:01.960 but so I recently implemented for smolBSD, I mean, you know that people nowadays only 19:01.960 --> 19:14.600 know how to edit YAML and create Dockerfiles. So I thought maybe if you are able to create 19:14.600 --> 19:22.760 a smolBSD VM with a Dockerfile, that will trigger many people. And so you are now able to create 19:23.720 --> 19:31.560 a smolBSD image by creating a Dockerfile, pretty straightforward, okay, I just 19:32.280 --> 19:41.560 tried a bunch of well-known Dockerfiles. I obviously do not support multi-stage 19:42.200 --> 19:46.840 Dockerfiles yet. I'm thinking about how to do it, but basically you can take 19:48.520 --> 19:55.560 a plain Dockerfile, a classical one, and replace apt, or the other one, 19:56.520 --> 20:07.400 the Alpine one, apk, with pkgin and you're set. Let's do a little demo on this one and 20:07.400 --> 20:14.920 I'll stop or yeah, okay. Trust me on this one, it works. 20:16.920 --> 20:24.760 No, it's just I see the timing and okay, so basically what I want to show here, 20:25.240 --> 20:35.160 yeah, I don't know. 
What I want to show here is the difference, okay, one thing is the model. 20:37.560 --> 20:41.400 This is a laptop, by the way, don't buy Framework, it sucks. 20:41.640 --> 20:54.360 Yeah, I'm sorry. Yeah, I have one, that's why I'm saying it. 20:56.360 --> 21:03.560 Wait a second. Yeah, what I want to show, okay, so this is a laptop and obviously it does a lot of things. 21:03.640 --> 21:11.960 And the performance is not as good as a server or even a machine that doesn't have 21:13.160 --> 21:21.000 graphical stuff and so on. Normally what I'm going to show is that there's a huge difference 21:21.720 --> 21:31.000 booting, huge. A difference between booting a microVM reading a file as a disk and using a 21:31.000 --> 21:48.360 [unclear], they work. So let's see, so I have a tiny 21:48.360 --> 21:54.120 virtual machine which is called rescue and it's about 20 megabytes, which is already pretty big. 22:01.960 --> 22:09.000 Okay, so this is the disk one, the disk version, okay, 20 milliseconds. 22:11.160 --> 22:16.920 And now I'm going to go with the RAM disk version. 22:16.920 --> 22:44.200 Yep, that fast. Take that, Linus. Okay, I think we are good, and questions. 22:46.920 --> 22:57.400 Question is, if it's based on NetBSD 11, what? Question, I'm repeating the question. 23:01.400 --> 23:10.760 Okay, the features we did are merged into our own branch and in NetBSD 11, there is only the, 23:10.760 --> 23:19.880 say, the season one. Those features are not yet included. We are working on reviewing them and so on, 23:19.880 --> 23:26.440 but we provide a smolBSD kernel, which is basically a NetBSD kernel with our work and you can 23:26.440 --> 23:28.440 download it online. 23:44.440 --> 23:49.240 Because probably I didn't reduce it. I didn't reduce it, yeah, but you can. 23:49.320 --> 23:56.040 I mean, honestly, this talk could be like two hours long because we have plenty of things to show, 23:56.040 --> 24:04.600 but I have shown only the one. 
But there is a minimize option, which can minimize first, 24:04.600 --> 24:11.000 only using what's on the disk and second, using another tool I wrote, called 24:11.000 --> 24:17.000 sailor, which will only provide the binaries that you need. 24:41.000 --> 24:47.000 Okay, could you repeat that? I'm not sure I got you. 25:01.720 --> 25:10.680 I have it, it sucks. I mean, there's a flag of the smolBSD start, 25:10.680 --> 25:19.000 starting script, that has the -w flag, which is actually 9p, it works, but it's really unstable. 25:20.280 --> 25:25.640 It works. I mean, smolBSD uses it to create the images, but.