WEBVTT 00:00.000 --> 00:10.600 Hello everybody here at FOSDEM for the lightning talks. I want to introduce to you 00:10.600 --> 00:16.920 Klaus Aehlig, and he's talking about treating build definitions independent of 00:16.920 --> 00:23.720 their origin. Give him a warm welcome and enjoy the talk. 00:23.720 --> 00:33.560 Okay, thank you very much. As the title suggests, I want to look into how build 00:33.560 --> 00:37.680 definitions are treated, especially internally in a build system, and why I believe that it's 00:37.680 --> 00:42.560 important to have that independent of where the definition is coming from. And of 00:42.560 --> 00:46.520 course, this is all implemented in free and open software. And since it's a lightning 00:46.520 --> 00:51.640 talk, let's jump right into an example. I brought you a Makefile, not because I believe 00:51.640 --> 00:56.360 Makefiles are particularly close to being independent of where the definition is, but 00:56.360 --> 00:59.840 because a lot of people can read Makefiles, so I hope you understand what this example 00:59.840 --> 01:07.120 is all about. In fact, Makefiles are pretty much the opposite of what I have in mind. So in 01:07.120 --> 01:12.240 Makefiles you have targets, usually files, and they're identified by a path in the file 01:12.240 --> 01:16.960 system. And then you say, well, this file better be newer than some of those other 01:16.960 --> 01:21.720 files. And if not, then there's a command run and hopefully it will have the right side 01:21.720 --> 01:27.640 effect. And that it's all about side effects is something you notice if you look at all 01:27.640 --> 01:33.920 those phony targets that don't even create a file, that are just there for side effects. Still, 01:33.920 --> 01:38.840 Makefiles are used to build software. So there is a computational content to that, and 01:38.840 --> 01:44.520 let's have a look at what that might look like. It actually looks just like this in the build 01:44.520 --> 01:49.640 tool I'm working on. 
So we have this first object file. It's a file, and we say, well, 01:49.640 --> 01:54.080 it is the output of running some command. It's actually the output at location 01:54.080 --> 01:59.880 foo.o. The interesting thing is, obviously, what is behind that hash there, and it's an 01:59.880 --> 02:04.760 action. So what is an action given by? Well, you run some command. It's a Makefile, 02:04.760 --> 02:11.160 so you probably have a single string and then pass it to a shell. You specify 02:11.160 --> 02:16.120 some environment variables, definitely PATH. Makefiles just use the ambient 02:16.120 --> 02:20.280 PATH of the invocation, but to get it well defined, you specify something, either something 02:20.280 --> 02:26.440 very generic, as here in that example, or something very specific to the installation, like 02:26.440 --> 02:30.440 a Nix path or so. The interesting thing is now, how do we talk about files? 02:30.440 --> 02:34.280 I don't want to say, well, take the file from the ambient file system. And there 02:34.280 --> 02:38.840 I use that it is a committed source file, and nowadays Git is what pretty much everyone uses, 02:38.840 --> 02:43.240 so we can use the Git blob identifier to identify a file, and then we define it 02:43.240 --> 02:49.480 by its content rather than, oh, it happens to be there on my disk. Okay, the next thing is, 02:49.480 --> 02:53.800 well, it gets somewhat more interesting if you look at this library. 
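The idea of naming files by content and actions by their description can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `git_blob_id` follows Git's documented blob hashing, but `action_id`, the field names (`command`, `env`, `inputs`, `type`) and the JSON layout are hypothetical and are not the format of the tool shown in the talk.

```python
import hashlib
import json


def git_blob_id(content: bytes) -> str:
    """Git's blob identifier: SHA-1 over a 'blob <size>' header, a NUL byte, and the content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def action_id(command, env, inputs) -> str:
    """Hypothetical action identifier: hash of a canonical JSON serialization."""
    description = {"command": command, "env": env, "inputs": inputs}
    canonical = json.dumps(description, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


# A source file is named by its content, not by where it sits on disk.
source = {"type": "known", "id": git_blob_id(b"int foo() { return 42; }\n")}

# The action: a command for the shell, an explicit PATH, and explicit inputs.
compile_foo = action_id(
    command=["sh", "-c", "c++ -c foo.cpp -o foo.o"],
    env={"PATH": "/bin:/usr/bin"},
    inputs={"foo.cpp": source},
)

# The object file is "the output at location foo.o of that action".
foo_o = {"type": "action-output", "action": compile_foo, "path": "foo.o"}
```

Because the identifier is computed from the description alone, two people who write down the same action on different machines get the same name for its output, which is what makes sharing a cache possible.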
Well, of course, it's also 02:53.800 --> 02:59.320 some output of some action, but if you look at that action, well, we also have the command, 02:59.320 --> 03:05.480 we expect one file to be created, and as an input, well, we have the output of the previous 03:06.040 --> 03:12.520 action, and that is the hash you've seen. So that way we can encode the whole action graph 03:12.520 --> 03:19.800 into these identifiers, and have a description that is, first of all, 03:19.800 --> 03:26.680 independent of where on my disk the source files were. But also, with that, I get an action 03:26.680 --> 03:33.640 graph, and having all this input and output very explicit, I'm independent of running the command 03:33.640 --> 03:38.920 in the same working directory. I can have a separate directory, put all the files 03:38.920 --> 03:45.320 where they belong, and get out the expected outputs. And that way I can also use remote execution 03:45.320 --> 03:50.840 or whatever, distributed over many machines, and maybe share a cache with others working on the same 03:50.840 --> 03:57.000 thing. But the more interesting thing is if you look at targets. In a Makefile, we have files, 03:57.000 --> 03:59.960 but if you look at that Makefile, you see, well, there are two libraries, foo and bar, 04:00.840 --> 04:06.680 and if you look at bar, one of the inputs also seems to be that foo.hpp, so there 04:06.680 --> 04:13.720 is a dependency, well, hidden in the terms of that Makefile, and I guess that's why 04:13.720 --> 04:20.040 plain Makefiles got out of fashion, and you want to make that structure explicit. So you write 04:20.040 --> 04:24.920 something like that, saying, well, I have foo, it is a library, it has headers, it has sources; 04:25.880 --> 04:31.880 the same for bar, but bar depends on foo; and I have an application binary 04:31.880 --> 04:40.520 that depends on bar, and only implicitly then depends on foo, so I 
should mention a bit 04:40.520 --> 04:45.800 about the syntax here, this reference to the build rule, saying it's a library. I'm referencing 04:45.800 --> 04:50.120 a build rule because, as a build tool, I can't know every programming language on earth, 04:50.120 --> 04:55.240 so I want that delegated to a separate rule set. And since this toy example 04:55.240 --> 05:02.760 in my talk is not the only C++ project that should be built by the tool, the C++-specific knowledge 05:02.760 --> 05:08.920 is, of course, in a separate logical repository. And that syntax is kind of a 05:08.920 --> 05:13.560 Lisp-like way of writing a sum type: the first entry says which kind of addressing of a rule 05:13.560 --> 05:19.880 it is, in this case a rule in a separate repository. That repository is called rules; in that project 05:19.960 --> 05:24.040 you find a directory called CC, and there you find a rule description file, 05:24.040 --> 05:31.000 and from that you take the entry library. Okay, so what does a rule do? 
It takes all the inputs, 05:31.000 --> 05:36.040 and then gives a description of the output, and you've seen already how you can talk about files 05:36.040 --> 05:41.400 without having to build them. And if you look at foo, we get something: well, we get 05:43.000 --> 05:47.720 a primary artifact, which is a library; it is created by some action. And here you see that the rules 05:47.800 --> 05:51.480 are a bit more elaborate than what I've shown in the introductory examples, 05:52.440 --> 05:58.040 because you may want the compiler as an input as well, implicit through the toolchain, so you 05:58.040 --> 06:02.440 have to move everything to a subdirectory to make space for the compiler being an input of 06:02.440 --> 06:09.320 that action. That's why the output path is in a subdirectory, and now it doesn't look as ridiculous 06:09.320 --> 06:15.720 as in the past that, well, we take the output called foo and put it at position foo. And you also have 06:15.800 --> 06:21.000 the header file, because to use the library you need the header file. And the only other useful information 06:21.000 --> 06:25.080 for a library that doesn't depend on anything else is, well, the linking: if you want to link 06:25.080 --> 06:29.880 against this library, that is what you should add to your linker command line. 
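What a library rule hands to its consumers can be sketched as a small data structure. This is a minimal sketch under stated assumptions: the class `LibraryResult`, the function `library`, the field names and the `work/` subdirectory are all hypothetical names chosen for illustration, not the rule language of the tool from the talk. The sketch also lets a library take dependencies, in which case their headers and linker arguments are passed along.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryResult:
    artifacts: dict   # primary artifact: the library itself
    runfiles: dict    # own headers, needed to use the library
    compile_deps: dict = field(default_factory=dict)  # headers of dependencies
    link_args: list = field(default_factory=list)     # for the consumer's linker command line


def library(name, headers, deps=()):
    """Hypothetical library rule: describe the outputs without building anything."""
    lib = f"lib{name}.a"
    compile_deps = {}
    link_args = [lib]
    for dep in deps:
        compile_deps.update(dep.runfiles)
        compile_deps.update(dep.compile_deps)
        link_args.extend(dep.link_args)
    # Keep only the last occurrence of each library, so that every library
    # still comes before the libraries it depends on.
    ordered = []
    for arg in reversed(link_args):
        if arg not in ordered:
            ordered.append(arg)
    ordered.reverse()
    return LibraryResult(
        # The action writes into a subdirectory; the result is staged at the top level.
        artifacts={lib: {"type": "action-output", "path": f"work/{lib}"}},
        runfiles={h: {"type": "source", "path": h} for h in headers},
        compile_deps=compile_deps,
        link_args=ordered,
    )


foo = library("foo", ["foo.hpp"])
```

A consuming target only ever looks at such a result, so it does not need to know how the library was defined.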
It becomes more interesting 06:29.880 --> 06:35.080 if you now take the library bar that depends on foo. We still get a primary artifact, which is the 06:35.080 --> 06:40.600 library, we still get a header file, but we get the additional information that, well, if you want to 06:40.600 --> 06:47.720 compile against that library, you better include that header file of foo as well. And the same 06:47.720 --> 06:52.920 if you want to link: well, you don't only need our primary artifact bar, you also need that other 06:52.920 --> 06:58.600 library which we depend upon, and the linker command line says, well, first bar, then foo, 06:58.600 --> 07:05.640 so that the open symbols from bar are found in foo. And that is something where we get 07:05.720 --> 07:12.920 away from the way we defined the library: that is all the information we need, so we no longer 07:12.920 --> 07:17.080 have to reflect that it's coming from a library that actually depends on another library. 07:17.080 --> 07:21.800 We can depend on that description; that is the only thing that a consuming target gets, 07:22.360 --> 07:26.680 and by including it here we basically can forget about the build description. We have all the 07:26.680 --> 07:32.760 information we need. And there's a second thing we can do: here we have the description of files 07:32.760 --> 07:38.920 still as an instruction on how to build them. We can actually do that, and then we get this. 07:39.480 --> 07:45.320 See how the hashes have changed: now it's more concrete, or actually, what we have replaced is all 07:45.320 --> 07:51.800 the description of an action by the result of running that action. So that is now basically a package. 07:52.360 --> 07:56.040 You can use that; we have some content-addressable store where you can store the files, and that's 07:56.040 --> 08:01.480 also why it's no harm having the files duplicated in everyone depending on it, because the only 08:02.440 --> 08:11.720 copying is the hash saying, 
well, that is what we refer to. And yeah, so that's why it's 08:11.720 --> 08:19.560 basically a package, and it's just a small projection. So when can we replace such a library by 08:19.560 --> 08:24.360 its kind-of package version? Well, if the repository hasn't changed. So we have to have a look 08:24.360 --> 08:29.080 at repositories. So first, this is again the build description. If you say, well, this bar is 08:29.080 --> 08:35.240 something that should be reusable, then I just rename that entry, but I didn't change the definition 08:35.240 --> 08:39.720 though; it's still the same thing. And then as the new bar I say, well, basically take the old bar 08:39.720 --> 08:45.080 and make it reusable. And I have to provide some information which I haven't really talked about: 08:45.080 --> 08:50.280 there's a configuration, like whether you compile with debug, et cetera, and I have to specify which part of 08:50.280 --> 08:55.240 the configuration should be taken into account; the rest is thrown away. That way I can immediately 08:55.400 --> 09:00.520 know what is relevant. And now I have to talk about repositories. This is literally what I 09:00.520 --> 09:05.240 added to this example. It's saying, well, take the current working directory, 09:05.240 --> 09:11.080 but actually it's a Git repository; take the head commit. And the same is for the rules: 09:11.080 --> 09:16.920 when I talk about rules, I mean that repository in some parallel path, and again it's a Git repository; 09:16.920 --> 09:25.160 take the head commit of that repository. I could have also specified some branch or whatever 09:26.040 --> 09:31.320 in some external repository instead of copying it here. In any case, what I get is this: 09:32.120 --> 09:39.320 I'm saying, okay, you've got a repository, and the root is just that Git tree identifier, and in 09:39.320 --> 09:46.120 case you don't know what the tree with that identifier looks like, there's some repository where you can look it up, 
09:46.760 --> 09:51.080 and that's actually what this already gives me: I don't need a checkout. I'm building against 09:51.160 --> 09:56.360 the repository with these tree identifiers, because I create a fresh directory for each action anyway. 09:57.800 --> 10:01.480 And now, the first time I'm showing the tool I'm working on, which is called justbuild: 10:01.480 --> 10:07.640 so you call the repository tool and ask it to analyze the target. Well, you get 10:07.640 --> 10:11.960 precisely what I've shown in the earlier slides; the file mentioned there is just the 10:11.960 --> 10:15.800 repository description I've been showing you. The interesting thing is, it says, well, I came across a 10:15.800 --> 10:20.200 target that is meant for caching, and here's the cache key; that's why I increased the log limit. 10:21.400 --> 10:26.120 What does it contain? Well, the target inside that repository, and the configuration, 10:27.000 --> 10:31.080 and a description of the repository. Again, the interesting thing is what comes from 10:31.080 --> 10:35.480 that Git blob identifier, and now you see the graph. So the basic message is: 10:38.120 --> 10:43.160 you only have the trees, because the tree of a source repository determines the source 10:43.160 --> 10:48.120 tree completely, no matter where you stored that thing or where you got that repository from. 10:48.840 --> 10:54.120 So that is a key. And the other thing is, I only looked at the definition, not the 10:54.120 --> 11:00.840 naming, so I could also, in order to make it a canonical key, rename the repository names 11:00.840 --> 11:05.880 by giving the entry point zero and then enumerating in some canonical traversal; 11:05.880 --> 11:11.160 that's why the bindings have changed. And now if I build something that uses 11:11.160 --> 11:17.720 that library, well, you also compute some build actions, but as a side effect 11:18.200 --> 11:24.520 you evaluate the package. So if you now look at the same library again, 
you see the other 11:24.520 --> 11:30.040 definition of the package: all the action graph is gone, and by gone I really mean gone. 11:30.040 --> 11:37.320 If I build it, it says no actions at all. And in fact, it's not only that they are all cached, 11:37.320 --> 11:43.160 but the analysis now says, well, you have a package definition of that, you can use it from there. 11:44.120 --> 11:52.200 Of course, using it requires that this repository, the graph I've shown, doesn't change, 11:52.200 --> 11:56.360 so it's mainly useful when someone from an external repository, a logical one, which 11:56.360 --> 12:01.240 could be several subdirectories, consumes that library. And there is one more thing. 12:02.120 --> 12:06.680 I've shown you in detail all the keys, so it's basically the Git tree. 12:06.680 --> 12:11.400 Of course, you would never specify a dependency as, oh, I want that Git tree, but the way you specify 12:11.560 --> 12:18.600 it is something like, oh, I want that archive, or that release, or that commit, or that 12:19.320 --> 12:24.440 tag of some Git project, or something like that. And there's, of course, a functional dependency 12:24.440 --> 12:30.200 from the commit ID to the tree ID, or from "this is the hash of an archive" to what the tree is 12:30.200 --> 12:36.040 if you unpack that archive. So I can build a service that answers those questions. And I mean, that 12:36.040 --> 12:40.600 service of course needs to know all the sources, but whenever you use a project you should have 12:40.600 --> 12:47.000 a backup copy of the upstream sources anyway, even for the third-party projects. And from that 12:47.000 --> 12:54.520 information I can compute the tree and the cache key that I've just shown you. And the next 12:54.520 --> 12:58.360 thing is, that contains all the information to build it, so I can have a service that works with 12:58.360 --> 13:03.320 the remote execution service I might use anyway to share computation, not only to boost 13:03.320 --> 13:12.600 your build, 
but also to share it with others working on the same project. And in that way 13:12.600 --> 13:18.440 I'm really in the situation that I can take the project I'm working on, check it out, 13:18.440 --> 13:24.840 do my changes, build, and all the dependencies I don't even have to fetch; they stay on my service. 13:24.840 --> 13:29.240 And I have all the benefits as if I were bootstrapping all the dependencies, without having to 13:29.240 --> 13:34.440 carry these things around, because even at the end, things like bootstrapping 13:34.440 --> 13:39.160 a particular compiler can as well stay on the remote execution, together with the service backing 13:39.160 --> 13:47.640 up the sources. Yeah, so [unintelligible], so that is basically the 13:47.640 --> 13:54.440 message I want to convey. So what is the take-home message? First of all, definitions 13:54.440 --> 13:59.080 are an interesting object in their own right: you can take them, you can serialize them, 13:59.080 --> 14:02.200 you can send them over the wire, you can store them in some content-addressable store. 14:03.080 --> 14:06.920 And if you focus on how things are defined rather than at which place they're defined, 14:07.720 --> 14:12.200 that has some benefits, for example the seamless transition between building and package building. 14:12.200 --> 14:17.400 You've seen that: oh, I just replaced that one hash by another hash and the meaning of that hash. 14:18.280 --> 14:22.280 And of course, as I said, it's all free and open-source software. You can download it 14:22.280 --> 14:27.080 from GitHub, and depending on what you use, you might also find it in your favorite package repository. 14:27.800 --> 14:29.080 Thank you. 14:36.280 --> 14:38.280 Okay, thank you for the talk.