WEBVTT

00:00.000 --> 00:11.000 So our next speaker is Claudio Bley, and he's going to talk about remote execution with
00:11.000 --> 00:13.000 Nix and Buck2. A round of applause, please.
00:13.000 --> 00:19.000 Yeah, hi everybody.
00:19.000 --> 00:21.000 I'm Claudio.
00:21.000 --> 00:24.000 I work at Tweag.
00:24.000 --> 00:40.000 I'm going to talk about how we can integrate Nix and Buck2 in the realm of remote execution.
00:40.000 --> 00:44.000 A bit of background first.
00:44.000 --> 00:51.000 We are working on a large monorepo, which is a really large Haskell code base.
00:52.000 --> 00:57.000 About 10,000 Haskell modules, for Mercury.
00:57.000 --> 01:05.000 So they're using Nix already, and we are currently migrating this code base to Buck2.
01:05.000 --> 01:11.000 So what is Buck2? Buck2 is a new polyglot build system from Meta.
01:11.000 --> 01:16.000 I think they released it roughly two years ago now.
01:16.000 --> 01:25.000 It supports distributed builds and also distributed caching, which is quite nice.
01:25.000 --> 01:34.000 It's still being actively worked on, by the way; they release twice a month, and they did so today.
01:34.000 --> 01:42.000 So if you want to try it out, just head over to their GitHub repository for Buck2 and download it.
01:42.000 --> 01:49.000 It's just a single binary, a static binary.
01:49.000 --> 01:53.000 Very easy to use.
01:53.000 --> 01:59.000 Okay, why would you even think about remote execution in this case?
01:59.000 --> 02:03.000 Well, you want to have faster builds, obviously.
02:03.000 --> 02:09.000 With a large code base, you're limited by just your local computer.
02:09.000 --> 02:18.000 You're going to feel quite a bit of pain if you have to recompile lots of your modules.
02:18.000 --> 02:35.000 Remote execution gives you faster builds and also faster test execution through scaling out to nodes available somewhere in your cloud, or on premises, whatever.
02:35.000 --> 02:40.000 It also provides you with a consistent development environment.
02:40.000 --> 02:50.000 Think about reproducibility, and you can reuse all the artifacts that are built by your remote executor.
02:50.000 --> 03:02.000 In this case, the remote executor, or rather the remote cache, is populated by our CI, so those artifacts can be reused by all the developers.
03:02.000 --> 03:07.000 So they can really benefit from the build results that are produced by your CI.
03:07.000 --> 03:12.000 And you don't have to rebuild stuff that has already been built.
03:12.000 --> 03:22.000 Either way, whether in your pull requests or on your master builds, it doesn't matter.
03:22.000 --> 03:30.000 And yeah, there's just one catch, sort of, because
03:30.000 --> 03:37.000 Buck2 and Nix together do not lend themselves naturally to remote execution.
03:37.000 --> 03:47.000 This is because, yeah, Buck likes to have control of the outputs.
03:47.000 --> 03:52.000 And all Nix is giving to Buck2 is a symlink into the Nix store, right?
03:52.000 --> 03:57.000 So you can send that symlink to the remote executor, where the action gets executed.
03:57.000 --> 04:12.000 But of course, it's just dangling if you don't have some preparation in place that makes the Nix store path appear on your remote runner.
04:12.000 --> 04:18.000 So in this case, the easy way out is to use a custom Docker image.
04:18.000 --> 04:21.000 That's what we do for the Linux use case.
04:21.000 --> 04:30.000 Since we are building this with Buck2, we build this pre-canned image upfront, before we even start the remote execution.
04:30.000 --> 04:41.000 The remote executor has access to that image and uses it, so we make sure that every tool we are using for the remote actions is actually
04:41.000 --> 04:43.000 already there, right?
04:43.000 --> 05:06.000 But we also have to support macOS, since a lot of developers at Mercury, the company we are working for, are using Macs, ARM64 Macs, M2/M3 machines, and for macOS this whole thing is more challenging.
05:07.000 --> 05:20.000 For instance, you would need to provision the system upfront before doing the remote execution, or running the remote actions on your runners.
05:20.000 --> 05:32.000 And as far as I know, this would be a bit cumbersome to deal with; I don't know, there's probably no macOS image that you can just build upfront.
05:32.000 --> 05:43.000 And also it's quite expensive, having to run those machines, keeping them running, yeah.
05:43.000 --> 06:01.000 So when we talked about this with the remote execution service that we are using, they recommended that we first look into how you can build your actions on macOS, but targeting Linux, right?
06:01.000 --> 06:14.000 So, essentially, to just use remote cross-builds in order to leverage the existing infrastructure that is already in place for Linux.
06:14.000 --> 06:19.000 And that's what this talk is basically about.
06:19.000 --> 06:30.000 So in a moment I will show you how we set this up and how everything fits together, by doing a quick demo.
06:30.000 --> 06:40.000 Yeah, I just wanted to mention there is some prior work from my colleagues, and a blog post
06:40.000 --> 06:55.000 I've linked here, by Konstantinos and Eugen, who managed to get this implemented in a better way: they're using a shared file system for the runners.
06:55.000 --> 07:08.000 As soon as the Nix packages are built by some remote Nix machine, the other runners immediately have transparent access to the packages inside the store.
07:08.000 --> 07:13.000 So this is very cool stuff, but it only works for Linux, of course.
07:13.000 --> 07:22.000 Still, we are looking into doing that also for our project.
07:22.000 --> 07:29.000 When it comes to remote execution, the story for Linux is actually quite neat.
07:29.000 --> 07:42.000 All you need to do is configure Nix to be able to forward its build jobs to another machine of the other architecture, so when you're on macOS,
07:42.000 --> 07:49.000 ARM64, and you've got some Linux x86_64 server.
07:49.000 --> 07:57.000 You just need to configure this machine; all you need is to be able to access it via SSH,
07:57.000 --> 08:06.000 and when you try to build some package, some derivation, for a different architecture,
08:06.000 --> 08:15.000 Nix will just query this machine and see if it fits the required features and the requested architecture,
08:15.000 --> 08:29.000 and then it will send off the build job and get back the build results, and they will appear in your local store, and away you go.
08:29.000 --> 08:39.000 So this, just to note, is basically the reason we only build Nix packages locally.
08:39.000 --> 08:44.000 We will see this in the second part.
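As an illustration of such a pre-canned worker image, one way to bake the needed Nix store paths into a container is nixpkgs' dockerTools; the following is only a sketch, with a placeholder image name and package set rather than the image actually used in the project.

```nix
# image.nix -- sketch only; image name and package list are placeholders.
{ pkgs ? import <nixpkgs> { system = "x86_64-linux"; } }:

pkgs.dockerTools.buildLayeredImage {
  name = "buck2-nix-worker";
  tag = "latest";
  # Everything the remote actions need ends up in the image's /nix/store,
  # so the store-path symlinks that Buck2 sends along are not dangling.
  contents = with pkgs; [ bashInteractive coreutils gnugrep ghc ];
}
```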
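The remote build machinery described here is Nix's standard distributed-builds feature; configuring such a Linux builder in nix.conf looks roughly like this, where the host name, user, and key path are placeholders:

```
# nix.conf -- sketch; host, user and SSH key path are placeholders.
# Builder entry: URI SYSTEM SSH-KEY MAX-JOBS SPEED-FACTOR SUPPORTED-FEATURES MANDATORY-FEATURES
builders = ssh-ng://builder@linux-builder.example.com x86_64-linux /home/me/.ssh/id_ed25519 8 1 kvm,big-parallel -
builders-use-substitutes = true
```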
08:44.000 --> 08:54.000 In this setup, we want to have remote cross-builds, but we only build Nix packages locally,
08:54.000 --> 09:09.000 so as to have them transparently sent off to some other machine and come back to us, so we can actually benefit from that machinery.
09:09.000 --> 09:16.000 For the Buck2 use case, remote execution is a bit different.
09:16.000 --> 09:26.000 For one, there are two different variants of Buck2: there's the internal Meta variant, which uses different protocols, different systems,
09:26.000 --> 09:33.000 and there's the OSS variant, which uses the Bazel remote execution APIs, so it's able to work with any
09:33.000 --> 09:40.000 service that also implements these APIs.
09:40.000 --> 09:48.000 There are some commercial ones, like BuildBuddy or EngFlow, and there are also some open-source,
09:48.000 --> 10:04.000 build-your-own servers, services like BuildBarn, and, just to mention it, there are also a few clients that speak these protocols.
10:04.000 --> 10:17.000 There are some more, so it's quite a thriving ecosystem in that sense.
10:17.000 --> 10:34.000 So how does Buck2 remote execution work? There's basically a service, and the service is able to spin up some workers, depending on the kind of cloud infrastructure you have, basically.
10:34.000 --> 10:43.000 It works on the action level, so this is in contrast to Nix, which has a much coarser granularity, right?
10:43.000 --> 10:49.000 There you just work on the derivation level, basically.
10:49.000 --> 11:00.000 And it usually supports caching, so the remote server will cache your results, and if it happens that you, or somebody in your team, builds the same thing,
11:00.000 --> 11:09.000 the same action, it doesn't need to be rebuilt; it's already cached, and you can download the cached results.
11:09.000 --> 11:26.000 Also, many of the remote execution services provide interesting telemetry, nice UIs, or graphs, where you can see how much time each step took, how much of the cache was used;
11:26.000 --> 11:38.000 you can even inspect the logs and so on, and some of them also give suggestions that you can apply to your configuration, say,
11:38.000 --> 11:44.000 toggle this flag and it's going to be faster, things like that.
11:45.000 --> 12:11.000 Okay, now it's time for the demo. I had, yeah, I had a faithful Raspberry Pi with me for this purpose, but unfortunately it didn't survive the train ride somehow.
12:12.000 --> 12:18.000 Just have to... yeah, that's better.
12:26.000 --> 12:37.000 Okay, okay, that's the, okay.
12:37.000 --> 12:51.000 So I basically created a virtual machine last night, and this is the machine, right?
12:51.000 --> 12:54.000 I don't, I'm not sure.
12:54.000 --> 12:57.000 Yes, okay, that's a good machine.
12:57.000 --> 13:07.000 So, quickly, you could just, like me...
13:07.000 --> 13:19.000 Okay, it's working; it's very slow, the Raspberry Pi was much faster, but you can see, yeah.
13:19.000 --> 13:30.000 And it's building the stuff, you know; the Nix action is run locally, as I said, we only want to run this locally at any point in time.
13:30.000 --> 13:39.000 And then it's going to build, okay, it's going to build the executable, and since I
13:39.000 --> 13:50.000 said it should run the resulting binary, it failed, because, of course, we were building for Linux x86_64 on an ARM64 machine, right?
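To give an idea of how the OSS variant of Buck2 gets pointed at such a service, a minimal .buckconfig sketch follows; the gRPC addresses are placeholders for whichever REAPI-compatible endpoint you run (NativeLink in the demo below), and the exact key names may differ between Buck2 releases, so check your service's documentation.

```ini
# .buckconfig -- sketch; addresses are placeholders and key names may vary by release.
[buck2_re_client]
engine_address       = grpc://localhost:50051
action_cache_address = grpc://localhost:50051
cas_address          = grpc://localhost:50051
tls                  = false

[build]
# Points at the execution platform registration described in the next section.
execution_platforms = root//platforms:platforms
```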
13:50.000 --> 14:00.000 So I just wanted to show you how this is basically set up.
14:00.000 --> 14:14.000 What this does is: we have two platforms, actually, two defined execution platforms; one is the local platform, which always corresponds to your local system.
14:14.000 --> 14:29.000 And the other is the Linux x86_64 platform, which is only locally enabled, or which can only run local jobs, if the host system is the same architecture, of course.
14:29.000 --> 14:39.000 Remote is always enabled, and we are setting some configuration, some constraints, basically.
14:39.000 --> 14:50.000 And also the remote execution properties that we set in order to inform the remote executor what kind of machine we are expecting here.
14:50.000 --> 15:05.000 So I'm running NativeLink here, that's the remote execution service, which is running on my laptop, on the host machine.
15:05.000 --> 15:19.000 And this one is executing the remote jobs. If we change something in the hello... sorry.
15:19.000 --> 15:29.000 Yeah, I don't even have a decent editor here on this machine.
15:29.000 --> 15:36.000 Okay. Ah, no, that doesn't look right, yeah, I cannot edit the command line.
15:36.000 --> 15:44.000 All right, it just built it, I hope, no, but of course, it couldn't run it.
15:44.000 --> 15:59.000 Let's remove the target platform, and now it should build locally, right, and now it should run.
15:59.000 --> 16:08.000 There, okay, yes, that worked.
16:08.000 --> 16:28.000 We could even have a look; Buck2 is a nice system, you can inspect it, and you can even have a look at the execution platform resolution, so that's a buck2 audit command, and let's see what it says.
16:28.000 --> 16:37.000 You can always debug the configuration that you did when you registered your execution platforms.
16:37.000 --> 16:42.000 In this case, you can see the original platform.
16:42.000 --> 16:55.000 This was configured with the Linux x86_64 platform, and it also selected the same one as the execution platform, and also for the toolchain deps that are handled with Nix
16:55.000 --> 17:09.000 it also selected the same platform. So just one important thing to note is that you have to set,
17:09.000 --> 17:24.000 in order to manage this correctly, you have to set constraints on your toolchains; that's the last thing I wanted to show.
17:24.000 --> 17:31.000 These are the toolchains.
17:31.000 --> 17:36.000 Okay.
17:36.000 --> 17:49.000 That's the most important part, okay, you don't see my cursor, okay, yes.
17:49.000 --> 18:05.000 This thing which looks a bit funny here, the select statement, is basically looking at the current configuration of the target, and then it is given a mapping, right, and we are basically giving it an identity map.
18:05.000 --> 18:26.000 So what this is saying is: okay, when the target system that you build for is x86 Linux, then when you use this as an executable, the execution platform is also just the same platform, right, so it's a one-to-one mapping, an identity map as I said.
18:26.000 --> 18:34.000 So this makes sure that the right platform is selected from the execution platforms that are registered.
18:34.000 --> 18:49.000 This is kind of important, because Buck2 tends to just select the first one that is available when you do not add these constraints.
18:50.000 --> 18:55.000 Let me go back to...
18:55.000 --> 19:02.000 Okay, yeah, that's my time.
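To make the setup above concrete, here is a rough sketch of the two execution platforms and of the identity-map select on a toolchain. It assumes `execution_platform` / `execution_platforms` rules defined along the lines of the Buck2 remote execution examples (wrapping ExecutionPlatformInfo and CommandExecutorConfig); the rule names, the hypothetical `nix_haskell_toolchain` rule, the constraint labels, and the container image are placeholders rather than the exact definitions shown in the talk.

```python
# platforms/BUCK -- sketch; `execution_platform` / `execution_platforms` are
# assumed custom rules (as in the Buck2 remote execution examples) wrapping
# ExecutionPlatformInfo / CommandExecutorConfig.
load("//platforms:defs.bzl", "execution_platform", "execution_platforms")

execution_platform(
    name = "local",
    local_enabled = True,
    remote_enabled = False,
)

execution_platform(
    name = "linux-x86_64",
    constraint_values = [
        "prelude//os/constraints:linux",
        "prelude//cpu/constraints:x86_64",
    ],
    # Local jobs only make sense when the host actually is Linux x86_64.
    local_enabled = host_info().os.is_linux and host_info().arch.is_x86_64,
    remote_enabled = True,
    # Forwarded to the remote execution service so it knows what kind of
    # worker / container image we expect (placeholder image name).
    remote_execution_properties = {
        "container-image": "docker://example.com/buck2-nix-worker:latest",
    },
)

execution_platforms(
    name = "platforms",
    platforms = [":linux-x86_64", ":local"],
)
```

```python
# toolchains/BUCK -- the "funny looking" select: an identity map from the
# *target* platform to the *execution* constraints, so that something built
# for Linux x86_64 must also be executed on a Linux x86_64 worker instead of
# whatever execution platform happens to be registered first.
nix_haskell_toolchain(  # hypothetical toolchain rule wrapping a Nix-provided GHC
    name = "ghc",
    exec_compatible_with = select({
        "prelude//os/constraints:linux": [
            "prelude//os/constraints:linux",
            "prelude//cpu/constraints:x86_64",
        ],
        "prelude//os/constraints:macos": [
            "prelude//os/constraints:macos",
            "prelude//cpu/constraints:arm64",
        ],
    }),
    visibility = ["PUBLIC"],
)
```

The buck2 audit command for execution platform resolution mentioned in the demo is how you would check that this mapping actually resolves to the intended platform.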
Thank you, thank you so much.
19:02.000 --> 19:09.000 Thanks for the talk, great talk. Are there any questions?
19:09.000 --> 19:24.000 The question is, why would you not choose one or the other?
19:24.000 --> 19:50.000 Yeah, the main thing is the developer experience, basically. You want to have really fine-grained actions, and we have put great effort into introducing that; you can really build single objects of your build graph, and with Nix, you cannot do that, right?
19:50.000 --> 20:00.000 You can even pick some HTML file that you want to build from the API docs, right, if you just want to build one of those.
20:00.000 --> 20:23.000 Also, it has caching for normal build artifacts, but also for tests, right, so that is, at least, what we are after. So we are not done with the project yet, but we'll get there.
20:30.000 --> 20:37.000 Thank you very much.
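The "pick some HTML file" part of the answer refers to Buck2 sub-targets: a single named output of a target can be requested directly, roughly like this, with a made-up target and sub-target name for illustration:

```
buck2 build //docs:api-docs[index.html]
```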