WEBVTT

00:00.000 --> 00:11.000 So our next speaker is Claudio Bley, and he's going to talk about remote execution with
00:11.000 --> 00:13.000 Nix and Buck2. A round of applause, please.
00:13.000 --> 00:19.000 Yeah, hi everybody.
00:19.000 --> 00:21.000 I'm Claudio.
00:21.000 --> 00:24.000 I work at Tweag.
00:24.000 --> 00:40.000 I'm going to talk about how we can integrate Nix and Buck2 in the realm of remote execution.
00:40.000 --> 00:44.000 A bit of background first.
00:44.000 --> 00:51.000 We are working on a large monorepo, which is a really large Haskell code base.
00:52.000 --> 00:57.000 About 10,000 Haskell modules, for Mercury.
00:57.000 --> 01:05.000 So they're using Nix already, and we are currently migrating this code base to Buck2.
01:05.000 --> 01:11.000 So what is Buck2? Buck2 is a new polyglot build system from Meta.
01:11.000 --> 01:16.000 I think they released it roughly two years ago now.
01:16.000 --> 01:25.000 It supports distributed builds and also distributed caching, which is quite nice.
01:25.000 --> 01:34.000 It's still being actively worked on, by the way; they release twice a month, and they did so today.
01:34.000 --> 01:42.000 So if you want to try it out, just head over to their GitHub repository for Buck2 and download it.
01:42.000 --> 01:49.000 It's just a single binary, a static binary.
01:49.000 --> 01:53.000 Very easy to use.
01:53.000 --> 01:59.000 Okay, why would you even think about remote execution in this case?
01:59.000 --> 02:03.000 Well, you want to have faster builds, obviously.
02:03.000 --> 02:09.000 With a large code base, you're limited by just your local computer.
02:09.000 --> 02:18.000 You're going to feel quite a bit of pain if you have to recompile lots of your modules.
02:18.000 --> 02:35.000 Remote execution gives you faster builds and also faster test execution through scaling out to nodes available somewhere in your cloud, or on premises, whatever.
02:35.000 --> 02:40.000 It also provides you with a consistent development environment.
02:40.000 --> 02:50.000 Think about reproducibility, and you can reuse all the artifacts that are built by your remote executor.
02:50.000 --> 03:02.000 In this case, the remote executor, or rather the remote cache, is populated by our CI, so those artifacts can be reused by all the developers.
03:02.000 --> 03:07.000 So they can really benefit from the build results that are produced by your CI.
03:07.000 --> 03:12.000 And you don't have to rebuild stuff that has already been built.
03:12.000 --> 03:22.000 Either way, whether in your pull requests or on your master builds, it doesn't matter.
03:22.000 --> 03:30.000 And yeah, there's just one catch, sort of, because
03:30.000 --> 03:37.000 Buck2 and Nix together do not lend themselves naturally to remote execution.
03:37.000 --> 03:47.000 This is because, yeah, Buck likes to have control of the outputs.
03:47.000 --> 03:52.000 And all Nix is giving to Buck2 is a symlink into the Nix store, right?
03:52.000 --> 03:57.000 So you can send that symlink to the remote executor, where the action gets executed.
03:57.000 --> 04:12.000 But of course, it's just dangling if you don't have some preparation in place that makes the Nix store path appear on your remote runner.
04:12.000 --> 04:18.000 So in this case, the easy way out is to use a custom Docker image.
04:18.000 --> 04:21.000 That's what we do for the Linux use case.
04:21.000 --> 04:30.000 Since we are building this with Buck2, we build this pre-canned image upfront, before we even start the remote execution.
04:30.000 --> 04:41.000 The remote executor has access to that image and uses it, so we make sure that every tool we are using for the remote actions is actually
04:41.000 --> 04:43.000 already there, right?
04:43.000 --> 05:06.000 But we also have to support macOS, since a lot of developers at Mercury, the company we are working for, are using Macs, ARM64 Macs, M2/M3 machines, and for macOS this whole thing is more challenging.
05:07.000 --> 05:20.000 For instance, you would need to provision the system upfront before doing the remote execution, or running the remote actions on your runners.
05:20.000 --> 05:32.000 And as far as I know, this would be a bit cumbersome to deal with; I don't know, there's probably no macOS image that you can just build upfront.
05:32.000 --> 05:43.000 And also it's quite expensive, having to run those machines, keeping them running, yeah.
05:43.000 --> 06:01.000 So when we talked about this with the remote execution service that we are using, they recommended that we first look into how you can build your actions on macOS, but targeting Linux, right?
06:01.000 --> 06:14.000 So, essentially, to just use remote cross-builds in order to leverage the existing infrastructure that is already in place for Linux.
06:14.000 --> 06:19.000 And that's what this talk is basically about.
06:19.000 --> 06:30.000 So in a moment I will show you how we set this up and how everything fits together, by doing a quick demo.
06:30.000 --> 06:40.000 Yeah, I just wanted to mention there is some prior work from my colleagues, and a blog post
06:40.000 --> 06:55.000 I've linked here, by Konstantinos and Eugen, who managed to get this implemented in a better way: they're using a shared file system for the runners.
06:55.000 --> 07:08.000 As soon as the Nix packages are built by some remote Nix machine, the other runners immediately have transparent access to the packages inside the store.
07:08.000 --> 07:13.000 So this is very cool stuff, but it only works for Linux, of course.
07:13.000 --> 07:22.000 Still, we are looking into doing that also for our project.
07:22.000 --> 07:29.000 When it comes to remote execution, the story for Linux is actually quite neat.
07:29.000 --> 07:42.000 All you need to do is configure Nix to be able to forward its build jobs to another machine of the other architecture, so when you're on macOS,
07:42.000 --> 07:49.000 ARM64, and you've got some Linux x86_64 server.
07:49.000 --> 07:57.000 You just need to configure this machine; all you need is to be able to access it via SSH,
07:57.000 --> 08:06.000 and when you try to build some package, some derivation, for a different architecture,
08:06.000 --> 08:15.000 Nix will just query this machine and see if it fits the required features and the requested architecture,
08:15.000 --> 08:29.000 and then it will send off the build job and get back the build results, and they will appear in your local store, and away you go.
08:29.000 --> 08:39.000 So this, just to note, is basically the reason we only build Nix packages locally.
08:39.000 --> 08:44.000 We will see this in the second part.
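As an illustration of such a pre-canned worker image, one way to bake the needed Nix store paths into a container is nixpkgs' dockerTools; the following is only a sketch, with a placeholder image name and package set rather than the image actually used in the project.

```nix
# image.nix -- sketch only; image name and package list are placeholders.
{ pkgs ? import <nixpkgs> { system = "x86_64-linux"; } }:

pkgs.dockerTools.buildLayeredImage {
  name = "buck2-nix-worker";
  tag = "latest";
  # Everything the remote actions need ends up in the image's /nix/store,
  # so the store-path symlinks that Buck2 sends along are not dangling.
  contents = with pkgs; [ bashInteractive coreutils gnugrep ghc ];
}
```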
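The remote build machinery described here is Nix's standard distributed-builds feature; configuring such a Linux builder in nix.conf looks roughly like this, where the host name, user, and key path are placeholders:

```
# nix.conf -- sketch; host, user and SSH key path are placeholders.
# Builder entry: URI SYSTEM SSH-KEY MAX-JOBS SPEED-FACTOR SUPPORTED-FEATURES MANDATORY-FEATURES
builders = ssh-ng://builder@linux-builder.example.com x86_64-linux /home/me/.ssh/id_ed25519 8 1 kvm,big-parallel -
builders-use-substitutes = true
```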
08:44.000 --> 08:54.000 In this setup, we want to have remote cross-builds, but we only build Nix packages locally,
08:54.000 --> 09:09.000 so as to have them transparently sent off to some other machine and come back to us, so we can actually benefit from that machinery.
09:09.000 --> 09:16.000 For the Buck2 use case, remote execution is a bit different.
09:16.000 --> 09:26.000 For one, there are two different variants of Buck2: there's the internal Meta variant, which uses different protocols, different systems,
09:26.000 --> 09:33.000 and there's the OSS variant, which uses the Bazel remote execution APIs, so it's able to work with any
09:33.000 --> 09:40.000 service that also implements these APIs.
09:40.000 --> 09:48.000 There are some commercial ones, like BuildBuddy or EngFlow, and there are also some open-source,
09:48.000 --> 10:04.000 build-your-own servers, services like BuildBarn, and, just to mention it, there are also a few clients that speak these protocols.
10:04.000 --> 10:17.000 There are some more, so it's quite a thriving ecosystem in that sense.
10:17.000 --> 10:34.000 So how does Buck2 remote execution work? There's basically a service, and the service is able to spin up some workers, depending on the kind of cloud infrastructure you have, basically.
10:34.000 --> 10:43.000 It works on the action level, so this is in contrast to Nix, which has a much coarser granularity, right?
10:43.000 --> 10:49.000 There you just work on the derivation level, basically.
10:49.000 --> 11:00.000 And it usually supports caching, so the remote server will cache your results, and if it happens that you, or somebody in your team, builds the same thing,
11:00.000 --> 11:09.000 the same action, it doesn't need to be rebuilt; it's already cached, and you can download the cached results.
11:09.000 --> 11:26.000 Also, many of the remote execution services provide interesting telemetry, nice UIs, or graphs, where you can see how much time each step took, how much of the cache was used;
11:26.000 --> 11:38.000 you can even inspect the logs and so on, and some of them also give suggestions that you can apply to your configuration, say,
11:38.000 --> 11:44.000 toggle this flag and it's going to be faster, things like that.
11:45.000 --> 12:11.000 Okay, now it's time for the demo. I had, yeah, I had a faithful Raspberry Pi with me for this purpose, but unfortunately it didn't survive the train ride somehow.
12:12.000 --> 12:18.000 Just have to... yeah, that's better.
12:26.000 --> 12:37.000 Okay, okay, that's the, okay.
12:37.000 --> 12:51.000 So I basically created a virtual machine last night, and this is the machine, right?
12:51.000 --> 12:54.000 I don't, I'm not sure.
12:54.000 --> 12:57.000 Yes, okay, that's a good machine.
12:57.000 --> 13:07.000 So, quickly, you could just, like me...
13:07.000 --> 13:19.000 Okay, it's working; it's very slow, the Raspberry Pi was much faster, but you can see, yeah.
13:19.000 --> 13:30.000 And it's building the stuff, you know; the Nix action is run locally, as I said, we only want to run this locally at any point in time.
13:30.000 --> 13:39.000 And then it's going to build, okay, it's going to build the executable, and since I
13:39.000 --> 13:50.000 said it should run the resulting binary, it failed, because, of course, we were building for Linux x86_64 on an ARM64 machine, right?
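To give an idea of how the OSS variant of Buck2 gets pointed at such a service, a minimal .buckconfig sketch follows; the gRPC addresses are placeholders for whichever REAPI-compatible endpoint you run (NativeLink in the demo below), and the exact key names may differ between Buck2 releases, so check your service's documentation.

```ini
# .buckconfig -- sketch; addresses are placeholders and key names may vary by release.
[buck2_re_client]
engine_address       = grpc://localhost:50051
action_cache_address = grpc://localhost:50051
cas_address          = grpc://localhost:50051
tls                  = false

[build]
# Points at the execution platform registration described in the next section.
execution_platforms = root//platforms:platforms
```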
13:50.000 --> 14:00.000 So I just wanted to show you how this is basically set up.
14:00.000 --> 14:14.000 What this does is: we have two platforms, actually, two defined execution platforms; one is the local platform, which always corresponds to your local system.
14:14.000 --> 14:29.000 And the other is the Linux x86_64 platform, which is only locally enabled, or which can only run local jobs, if the host system is the same architecture, of course.
14:29.000 --> 14:39.000 Remote is always enabled, and we are setting some configuration, some constraints, basically.
14:39.000 --> 14:50.000 And also the remote execution properties that we set in order to inform the remote executor what kind of machine we are expecting here.
14:50.000 --> 15:05.000 So I'm running NativeLink here, that's the remote execution service, which is running on my laptop, on the host machine.
15:05.000 --> 15:19.000 And this one is executing the remote jobs. If we change something in the hello... sorry.
15:19.000 --> 15:29.000 Yeah, I don't even have a decent editor here on this machine.
15:29.000 --> 15:36.000 Okay. Ah, no, that doesn't look right, yeah, I cannot edit the command line.
15:36.000 --> 15:44.000 All right, it just built it, I hope, no, but of course, it couldn't run it.
15:44.000 --> 15:59.000 Let's remove the target platform, and now it should build locally, right, and now it should run.
15:59.000 --> 16:08.000 There, okay, yes, that worked.
16:08.000 --> 16:28.000 We could even have a look; Buck2 is a nice system, you can inspect it, and you can even have a look at the execution platform resolution, so that's a buck2 audit command, and let's see what it says.
16:28.000 --> 16:37.000 You can always debug the configuration that you did when you registered your execution platforms.
16:37.000 --> 16:42.000 In this case, you can see the original platform.
16:42.000 --> 16:55.000 This was configured with the Linux x86_64 platform, and it also selected the same one as the execution platform, and also for the toolchain deps that are handled with Nix
16:55.000 --> 17:09.000 it also selected the same platform. So just one important thing to note is that you have to set,
17:09.000 --> 17:24.000 in order to manage this correctly, you have to set constraints on your toolchains; that's the last thing I wanted to show.
17:24.000 --> 17:31.000 These are the toolchains.
17:31.000 --> 17:36.000 Okay.
17:36.000 --> 17:49.000 That's the most important part, okay, you don't see my cursor, okay, yes.
17:49.000 --> 18:05.000 This thing which looks a bit funny here, the select statement, is basically looking at the current configuration of the target, and then it is given a mapping, right, and we are basically giving it an identity map.
18:05.000 --> 18:26.000 So what this is saying is: okay, when the target system that you build for is x86 Linux, then when you use this as an executable, the execution platform is also just the same platform, right, so it's a one-to-one mapping, an identity map as I said.
18:26.000 --> 18:34.000 So this makes sure that the right platform is selected from the execution platforms that are registered.
18:34.000 --> 18:49.000 This is kind of important, because Buck2 tends to just select the first one that is available when you do not add these constraints.
18:50.000 --> 18:55.000 Let me go back to...
18:55.000 --> 19:02.000 Okay, yeah, that's my time.
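To make the setup above concrete, here is a rough sketch of the two execution platforms and of the identity-map select on a toolchain. It assumes `execution_platform` / `execution_platforms` rules defined along the lines of the Buck2 remote execution examples (wrapping ExecutionPlatformInfo and CommandExecutorConfig); the rule names, the hypothetical `nix_haskell_toolchain` rule, the constraint labels, and the container image are placeholders rather than the exact definitions shown in the talk.

```python
# platforms/BUCK -- sketch; `execution_platform` / `execution_platforms` are
# assumed custom rules (as in the Buck2 remote execution examples) wrapping
# ExecutionPlatformInfo / CommandExecutorConfig.
load("//platforms:defs.bzl", "execution_platform", "execution_platforms")

execution_platform(
    name = "local",
    local_enabled = True,
    remote_enabled = False,
)

execution_platform(
    name = "linux-x86_64",
    constraint_values = [
        "prelude//os/constraints:linux",
        "prelude//cpu/constraints:x86_64",
    ],
    # Local jobs only make sense when the host actually is Linux x86_64.
    local_enabled = host_info().os.is_linux and host_info().arch.is_x86_64,
    remote_enabled = True,
    # Forwarded to the remote execution service so it knows what kind of
    # worker / container image we expect (placeholder image name).
    remote_execution_properties = {
        "container-image": "docker://example.com/buck2-nix-worker:latest",
    },
)

execution_platforms(
    name = "platforms",
    platforms = [":linux-x86_64", ":local"],
)
```

```python
# toolchains/BUCK -- the "funny looking" select: an identity map from the
# *target* platform to the *execution* constraints, so that something built
# for Linux x86_64 must also be executed on a Linux x86_64 worker instead of
# whatever execution platform happens to be registered first.
nix_haskell_toolchain(  # hypothetical toolchain rule wrapping a Nix-provided GHC
    name = "ghc",
    exec_compatible_with = select({
        "prelude//os/constraints:linux": [
            "prelude//os/constraints:linux",
            "prelude//cpu/constraints:x86_64",
        ],
        "prelude//os/constraints:macos": [
            "prelude//os/constraints:macos",
            "prelude//cpu/constraints:arm64",
        ],
    }),
    visibility = ["PUBLIC"],
)
```

The buck2 audit command for execution platform resolution mentioned in the demo is how you would check that this mapping actually resolves to the intended platform.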
Thank you, thank you so much.
19:02.000 --> 19:09.000 Thanks for the talk, great talk. Are there any questions?
19:09.000 --> 19:24.000 The question is, why would you not choose one or the other?
19:24.000 --> 19:50.000 Yeah, the main thing is the developer experience, basically. You want to have really fine-grained actions, and we have put great effort into introducing that; you can really build single objects of your build graph, and with Nix, you cannot do that, right?
19:50.000 --> 20:00.000 You can even pick some HTML file that you want to build from the API docs, right, if you just want to build one of those.
20:00.000 --> 20:23.000 Also, it has caching for normal build artifacts, but also for tests, right, so that is, at least, what we are after. So we are not done with the project yet, but we'll get there.
20:30.000 --> 20:37.000 Thank you very much.
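The "pick some HTML file" part of the answer refers to Buck2 sub-targets: a single named output of a target can be requested directly, roughly like this, with a made-up target and sub-target name for illustration:

```
buck2 build //docs:api-docs[index.html]
```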