WEBVTT 00:00.000 --> 00:14.720 All right, everyone, I'm Matt Suozzo, I work on the Google Open Source Security Team. 00:14.720 --> 00:23.280 We're speaking about build auditing as part of the OSS Rebuild project. 00:23.280 --> 00:28.840 So OSS Rebuild aims to reproduce open source packages. 00:28.840 --> 00:32.560 The focus is on language package ecosystems. 00:32.560 --> 00:40.120 We currently support PyPI, npm, crates, RubyGems, 00:40.120 --> 00:43.120 thanks to one of our organizers. 00:43.120 --> 00:46.720 And Debian as well. 00:46.720 --> 00:49.880 So we provide evidence of our reproduction, 00:49.880 --> 00:58.320 in the form of a SLSA attestation, and include our build setup in the attestation itself. 00:58.320 --> 01:02.760 So public users can run that themselves. 01:06.760 --> 01:12.280 This process happens out of band from the actual maintainer upload. 01:12.280 --> 01:18.200 And the sort of first-party provenance publication pathway. 01:18.200 --> 01:27.320 It's another P word, which is nice, because maintainers then don't need to modify their build release process at all. 01:27.320 --> 01:30.960 They can even release from their laptop; if we can rebuild it, 01:30.960 --> 01:36.520 they can have an attestation from a trusted builder. 01:36.520 --> 01:43.040 And I guess first, while I'm up here now, OSS Rebuild is not just me. 01:43.080 --> 01:46.760 I have a whole team of engineers, both 01:46.760 --> 01:51.600 on our team and external collaborators, and I would not be up here, 01:51.600 --> 01:53.600 the project would not exist without them. 01:53.600 --> 01:58.960 So to get more into sort of what it looks like. 01:58.960 --> 02:03.920 And I have made this as hard as possible to see. 02:03.920 --> 02:07.480 Imagine you can see the boxes and arrows. 02:07.480 --> 02:12.800 But the architecture is uncomplicated. 02:12.880 --> 02:16.000 So we have our upstream source going into our build. 02:16.000 --> 02:18.400 We get the upstream package.
02:18.400 --> 02:19.600 We compare them. 02:19.600 --> 02:21.280 The stabilized comparison. 02:21.280 --> 02:26.320 And publish the result as an attestation. 02:26.320 --> 02:28.080 The whole slide show will be like this. 02:28.080 --> 02:32.080 So I guess, yes, let's get started. 02:36.080 --> 02:39.520 That would be incredibly helpful, I think. 02:40.480 --> 02:43.680 Could have checked this somehow. 02:48.400 --> 02:49.680 Oh, geez. 02:55.680 --> 02:57.680 Oh, boy. 02:57.680 --> 02:58.680 OK. 03:10.000 --> 03:15.440 Yeah, I mean, yeah, hopefully you can see the boxes. 03:15.440 --> 03:18.640 Just want to look at exactly. 03:18.640 --> 03:20.800 Please use your imagination. 03:20.800 --> 03:25.360 So with the sort of attestations we're publishing, again, 03:25.360 --> 03:31.280 this can be in addition to or instead of a maintainer publishing 03:31.280 --> 03:33.200 their own first-party attestation. 03:33.200 --> 03:38.480 So when an existing one is present, when the publisher has 03:38.560 --> 03:42.000 provided one, we can corroborate it. 03:42.000 --> 03:44.320 And as will be the subject of the talk, 03:44.320 --> 03:46.400 we can also augment it. 03:46.400 --> 03:53.440 So adding detail that maybe was not present during the original build. 03:53.440 --> 03:57.040 Maybe going to an even higher level. 03:57.040 --> 04:02.800 So what is the trust sort of relationship we're talking about here? 04:02.800 --> 04:07.040 So we have this source code here, hopefully understandable, 04:08.000 --> 04:12.320 we have our build environment that we trust, 04:12.320 --> 04:14.560 we think is trustworthy. 04:14.560 --> 04:19.280 And with some effort, obviously, the equal sign is doing a lot of work. 04:19.280 --> 04:24.640 But we can get a reproducible artifact out the other side. 04:24.640 --> 04:30.960 But like all models, it's not accurate. 04:30.960 --> 04:34.160 It's just convenient, maybe.
04:34.160 --> 04:38.880 Neither the upstream source nor our build environments are inherently 04:38.880 --> 04:40.240 trustworthy. 04:40.240 --> 04:48.400 So while a convenient assumption, it's certainly not a safe one. 04:48.400 --> 04:55.520 And actually, show of hands, who has seen a presentation in the last 04:55.520 --> 04:59.760 two years with XZ, the XZ attack in it? 04:59.840 --> 05:04.160 Who has seen three? Five? 05:04.160 --> 05:09.600 OK, so I'm going to similar about, but for all of you who 05:09.600 --> 05:17.760 have only heard it once, let this be another opportunity. 05:17.760 --> 05:22.400 So suffice it to say, three years of social engineering 05:22.400 --> 05:26.480 sort of culminates with this nice, elegant, effective, 05:26.480 --> 05:34.320 scary attack on ultimately Debian's SSH package, offering sort of pre-auth 05:34.320 --> 05:39.120 remote code execution. The maintainer gained control of the project, 05:39.120 --> 05:44.640 and was able to backdoor XZ Utils. 05:44.640 --> 05:49.840 So the backdoor itself is split between the published source code, 05:49.840 --> 05:54.480 underhanded, again, a lot of elements of it were well-obfuscated, 05:54.480 --> 05:58.160 and the additional trigger, ultimately, 05:58.160 --> 06:03.600 was published out of tree, so in the source tarball, 06:03.600 --> 06:12.000 malicious, that enables the build in the Debian build environment, 06:12.000 --> 06:16.640 Debian, and Ubuntu, maybe, probably quite similar. 06:16.640 --> 06:21.840 And that results in the artifact being compromised, 06:21.840 --> 06:25.840 and fully reproducible. A project reproduces it: 06:25.840 --> 06:29.120 great, looks good. Obviously not. 06:29.120 --> 06:35.280 So that was maybe an animating moment for our project, 06:35.280 --> 06:37.440 and I assume many others in the space.
06:41.280 --> 06:44.800 And so we sort of restructured, maybe, 06:44.800 --> 06:49.600 OSS Rebuild's approach to not just verification by repetition, 06:49.600 --> 06:56.000 but verification by observation, in addition. 06:56.000 --> 07:00.160 And sort of what we found has been, we think, 07:00.160 --> 07:08.640 generalizable across, you know, package ecosystems, not just Debian. 07:08.640 --> 07:17.200 So our instrumented environment sort of comprises maybe three pieces: 07:17.280 --> 07:21.680 our execution sandbox, which is distrusted by the rest of the system, 07:21.680 --> 07:26.400 just does the build; the network proxy, which records 07:26.400 --> 07:31.200 fetched resources without needing to modify the internal build, 07:31.200 --> 07:34.000 which is important for us, because given the scope, 07:34.000 --> 07:40.560 we're not really in the business of being able to run a distro, 07:40.560 --> 07:47.840 basically. And finally, our syscall analyzer or graph construction, 07:47.840 --> 07:52.400 which comprises the eBPF-based monitor, 07:52.400 --> 07:56.880 and a data flow graph builder library. 07:56.880 --> 08:01.680 We'll talk about it right now. 08:01.680 --> 08:04.880 No, well, network proxy first. 08:04.880 --> 08:09.200 So we terminate TLS between the build and 08:09.200 --> 08:11.840 this proxy component, record the trace log. 08:11.840 --> 08:17.600 We have the ability to modify, in line, the results we get back from upstream. 08:21.040 --> 08:25.920 Yeah, but more importantly, we are able to, like, say, 08:25.920 --> 08:29.360 record everything coming out of, you know, npm install.
08:30.320 --> 08:39.440 And another, I guess, nice bit is that even if the build is launching 08:40.640 --> 08:45.440 separate containers, we actually hook the container creation process 08:46.400 --> 08:51.440 in the proxy itself, and we can add in the trust root 08:51.440 --> 08:54.800 and the network configuration to that container launch, 08:54.880 --> 09:00.000 so that it passively sort of adopts the path we have built through the proxy. 09:00.880 --> 09:06.000 Just, you know, nice and necessary when you need it, 09:06.000 --> 09:09.760 but for Debian, for instance, you don't. 09:11.360 --> 09:18.240 The sysgraph SDK, recently open sourced, is our library for turning 09:18.240 --> 09:24.160 the events generated by our eBPF monitor into the process and data flow 09:24.160 --> 09:30.480 graph we'll be talking about. Yeah, pretty straightforward. 09:31.920 --> 09:36.160 The graph compresses efficiently, which is very important, so we're, 09:36.960 --> 09:41.040 instead of talking maybe gigabytes, talking megabytes still, 09:41.040 --> 09:47.760 but what we get for it is a process and file system trace that's 09:47.760 --> 09:51.760 tremendously useful as a basis for analysis. 09:52.720 --> 09:57.680 With this sort of exhaustive view of the build, like, who cares? 10:00.080 --> 10:08.480 But even beyond, say, detecting XZ, what we found was that it has proven really 10:08.480 --> 10:13.840 valuable for actually working towards reproducibility and achieving reproducibility. 10:14.800 --> 10:17.680 So, start there.
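The process and data flow graph described above can be sketched in a few lines. This is a minimal illustration, not the project's SDK: the `Event` schema, the `build_graph` and `inputs_of` names, and the node naming scheme are all hypothetical, and event ordering is ignored for simplicity (a read that happens before a write is still treated as a possible flow).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical monitor event; the real event schema may differ."""
    pid: int
    kind: str    # "exec", "read", or "write"
    target: str  # executable path for "exec", file path otherwise


def build_graph(events):
    """Return (exe_by_pid, edges); edges are (source, destination) flows."""
    exe_by_pid = {}
    edges = set()
    for ev in events:
        if ev.kind == "exec":
            exe_by_pid[ev.pid] = ev.target
            continue
        proc = f"proc:{ev.pid}"
        node = f"file:{ev.target}"
        if ev.kind == "read":
            edges.add((node, proc))   # data flows from file into process
        elif ev.kind == "write":
            edges.add((proc, node))   # data flows from process into file
    return exe_by_pid, edges


def inputs_of(edges, artifact):
    """Walk edges backwards to list every file that may have fed an artifact."""
    preds = defaultdict(set)
    for src, dst in edges:
        preds[dst].add(src)
    seen, stack = set(), [f"file:{artifact}"]
    while stack:
        node = stack.pop()
        for pred in preds[node]:
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)
    return sorted(n[len("file:"):] for n in seen if n.startswith("file:"))
```

Calling `inputs_of(edges, "app")` on a trace of a compile step followed by a link step would list both the source file and the intermediate object file, which is the kind of question such a graph makes cheap to answer.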
10:20.480 --> 10:25.520 Much ink has been spilled on dependency management and lock files, 10:25.520 --> 10:31.760 as I'm sure everyone in this room is acquainted with, and our experiences have been 10:31.760 --> 10:38.000 no different. Dependency manifests are convenient 10:38.080 --> 10:44.320 fictions, and really, when we want to know what's going on in a build, 10:44.320 --> 10:49.680 especially portably across ecosystems, there's really no better place than 10:49.680 --> 10:51.280 the network trace itself. 10:52.800 --> 10:56.960 Base, not best; they're imprecise at worst. 10:58.960 --> 10:59.520 Let's take a look. 11:00.320 --> 11:07.600 Okay, what version of XZ do you think that is going to fetch? 11:12.480 --> 11:14.720 Well, it really looks like that, doesn't it? 11:16.720 --> 11:20.320 It's a trick. It is 0.8.8. 11:21.120 --> 11:25.600 Not only does it upgrade the patch version, it also upgrades the minor version, 11:25.680 --> 11:31.120 just for fun. Let's pick on someone else. This is great. 11:32.240 --> 11:34.800 What version of Guava do you think we're going to be using? 11:42.480 --> 11:45.920 31.1, obviously. Just like we asked. 11:48.640 --> 11:52.080 Okay, so what version of tools are we going to use? 11:55.840 --> 12:03.040 Yeah, trick question. It's the same trick as all the rest, but it is still a trick question. 12:03.920 --> 12:08.080 Yeah, it is. Just the record of the last person to have built it. 12:11.200 --> 12:12.320 So 0.3.8. 12:18.960 --> 12:25.440 Another fun tidbit we can get from this sort of privileged position we have in the build is 12:26.560 --> 12:33.680 that many package ecosystems will do creative things, say, during installation time, 12:35.520 --> 12:43.040 to varying degrees, and one end of that spectrum is npm, maybe the most 12:43.040 --> 12:49.840 permissive ecosystem. So our network instrumentation will tell you that on this call, 12:50.080 --> 12:56.240 we'll see all of these. Okay, how many people can read it?
All right, whatever. It's more than just 12:57.280 --> 13:05.680 registry.npmjs.org. There's this big GitHub release and this long URL asset 13:07.040 --> 13:15.040 that is dynamically accessed based on the local installation environment at runtime. 13:16.000 --> 13:19.680 Which is nice because we get, you know, platform-specific 13:20.640 --> 13:29.280 binaries, but this is out of sort of the normal npm model, right? If I have an 13:29.280 --> 13:37.600 attestation, say, for my npm package, that doesn't include these, right? What are they? Who knows? 13:38.480 --> 13:44.880 Are they guaranteed to resolve, continue to resolve to the same artifact? No hands. They don't 13:44.880 --> 13:54.320 count. Yeah, and this is the snippet in install.js in the npm pre-built that does the 13:55.040 --> 14:05.360 dirty work. So shifting gears to the sysgraph for reproducibility, there's a lot of tribal 14:05.360 --> 14:15.680 knowledge in debugging reproducibility issues. And detailed execution tracing, we found, allows us to 14:15.680 --> 14:21.920 encode more of that knowledge than we could normally into something that maybe resembles a 14:21.920 --> 14:27.440 linter or a set of rules that might be able to guide you through debugging an issue like this. 14:27.520 --> 14:39.920 This is generally easier than source code auditing because the problematic cause of a reproducibility 14:39.920 --> 14:45.840 issue is often far removed from its effect. So you may be configuring the environment variables 14:45.840 --> 14:52.720 sort of centrally and exporting them through your call stack, maybe not obvious, so, at the site, 14:52.800 --> 15:02.880 what's happening?
Maybe to go through a couple of these: reading cpuinfo is not wrong, 15:02.880 --> 15:11.600 it's not bad, but it is often an indication that your build is doing dynamic things based off of 15:11.600 --> 15:18.800 the core count or scaling parallelism of the build in response to the machine that it's running on. 15:18.880 --> 15:25.760 If you run it on a different machine, that's where you get into problems. So this is a frequent 15:25.760 --> 15:31.920 thing where you can often just set these parameters, and rarely builds will just try to do it 15:31.920 --> 15:46.960 themselves. And there be dragons. Going to this next one, locales, huge sources of issues, especially 15:46.960 --> 15:55.120 with make, where things like the sort order of arguments into a linker depend on the sorting 15:55.840 --> 16:04.800 your locale supports or permits. So seeing things like LC_ALL=C should give you the warm fuzzies. 16:06.720 --> 16:12.400 The file-prefix-map similarly, that's just saying don't include my local build path in 16:12.400 --> 16:24.960 the debug symbols or in the __FILE__ macro. So you'd like to see that on these command lines; 16:24.960 --> 16:35.200 those without it, that could be a cause of issues. The bottom one, I guess, on the flip side 16:35.280 --> 16:43.840 of the parallelism question, -j being set anywhere could be an issue. I didn't put it red or yellow, 16:43.840 --> 16:58.320 but it's just something to consider. And for instability, graphs work great for this. You are 16:58.320 --> 17:03.600 recording what's happening in the build, comparing them across multiple runs of the same thing, 17:04.480 --> 17:12.880 very easy to identify non-determinism, either in file accesses, argument order, that sort of thing. 17:12.960 --> 17:26.720 This is, yeah, it's been tremendously useful. Okay, and now tracing for detection, so 17:28.000 --> 17:36.400 consider the XZ case.
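The reproducibility hints walked through a moment ago (CPU info reads, locale, prefix mapping, parallelism) lend themselves to simple rules over traced processes. The sketch below is illustrative only: the `ProcessRecord` shape and `lint` function are hypothetical, and it assumes the locale setting meant is `LC_ALL=C` and the compiler option is the GCC/Clang `-ffile-prefix-map=` flag.

```python
from dataclasses import dataclass, field

# Illustrative tool lists; a real rule set would need a broader inventory.
COMPILERS = {"gcc", "g++", "cc", "c++", "clang", "clang++"}
BUILD_DRIVERS = {"make", "ninja"}


@dataclass
class ProcessRecord:
    """Hypothetical view of one traced process."""
    argv: list
    env: dict = field(default_factory=dict)
    files_read: list = field(default_factory=list)


def lint(rec):
    """Return reproducibility hints for one process; hints are not verdicts."""
    findings = []
    tool = rec.argv[0].rsplit("/", 1)[-1]
    if "/proc/cpuinfo" in rec.files_read:
        findings.append("reads /proc/cpuinfo: behavior may depend on core count")
    if tool in BUILD_DRIVERS and rec.env.get("LC_ALL") != "C":
        findings.append("LC_ALL is not C: ordering may depend on the locale")
    if tool in COMPILERS and not any(
        arg.startswith("-ffile-prefix-map=") for arg in rec.argv[1:]
    ):
        findings.append("no -ffile-prefix-map: build path may leak into output")
    if tool in BUILD_DRIVERS and any(
        arg.startswith(("-j", "--jobs")) for arg in rec.argv[1:]
    ):
        findings.append("explicit parallelism: output order may vary")
    return findings
```

Running `lint` over every process in two traces of the same build and comparing the findings is one cheap way to narrow down where nondeterminism enters.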
Now that we're sort of instrumenting the build itself: test, reproduce it, 17:36.800 --> 17:54.000 looks good, but how do we prevent XZ? Why is XZ's build reading a test file at all? 17:54.800 --> 18:01.440 It never did this before. Why has it started doing that during the build process, even 18:01.440 --> 18:11.120 with tests disabled? So this now really sticks out like a sore thumb. We can pretty effectively 18:11.120 --> 18:19.040 find these test files. And again, you find a really sharp increase. Well, from 0 to 3, 18:19.040 --> 18:24.560 but that's as good an increase. You love to see it. Very obvious. Similarly, 18:25.200 --> 18:38.000 I can't really see this, but why is head writing out this .o file? That is also sore thumb territory. 18:39.600 --> 18:43.280 Glaringly apparent, maybe with the benefit of hindsight, but an easy rule: 18:44.160 --> 18:50.400 .o files come from compilers. Let's keep it that way. Declaring invariants like this for build behavior 18:51.280 --> 18:58.720 can really point us towards this anomalous or buggy, but in any case, suspect build behavior. 19:04.320 --> 19:12.880 Right, so what I was alluding to before, we can generalize to sort of comparing the two adjacent 19:12.880 --> 19:21.360 graphs of a build and trying to figure out sort of what is interesting about it. This is one of the 19:21.360 --> 19:30.960 more complex changes in version over version, not major version over version, changes in 19:30.960 --> 19:38.960 Debian packages that I could find. It does change the build process, but in terms of summarization, 19:39.600 --> 19:47.360 mostly path and version changes that are easily sort of structured. There's no fanciness here, 19:47.360 --> 19:54.400 really, that's going on. It is just comparing the two graphs. Yeah, so the key insight here 19:54.400 --> 20:02.480 being that the flux between versions is often low enough that we can manageably identify 20:02.720 --> 20:15.440 variance when it happens.
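The two checks described above, that object files should only be written by compiler tooling and that a new version should not suddenly start reading test files, can be sketched as follows. This is a minimal illustration with hypothetical names and inputs; the allowlist is an assumption (it includes `as` and `ld` because in a syscall-level trace the assembler or linker is often the process that actually writes the `.o` file).

```python
import posixpath

# Assumed allowlist of tools permitted to write object files.
OBJECT_WRITERS = {"gcc", "g++", "cc", "clang", "clang++", "as", "ld"}


def object_file_violations(writes, allowed=OBJECT_WRITERS):
    """writes: iterable of (executable_path, written_path) pairs.

    Returns the pairs where a non-allowlisted tool wrote a .o file.
    """
    return [
        (exe, path)
        for exe, path in writes
        if path.endswith(".o") and posixpath.basename(exe) not in allowed
    ]


def new_test_reads(old_reads, new_reads, marker="tests/"):
    """Files read only by the newer build whose path contains the marker."""
    added = set(new_reads) - set(old_reads)
    return sorted(path for path in added if marker in path)
```

Both functions only need the file-access edges from two adjacent build graphs, so the same comparison applies to any pair of consecutive versions.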
It's not shown here, but file accesses can similarly be 20:15.440 --> 20:21.840 diffed between builds and, well, if you add a file, you expect that file to show up in the build. 20:21.840 --> 20:27.040 That's not surprising, but you're dealing with a similar ability to sort of model and 20:27.040 --> 20:33.360 establish expectations for how some change in the repo affects the change in the build. 20:35.760 --> 20:44.960 Unnecessary behavior. So, I may have been radicalized on this point, but we just don't do anything 20:44.960 --> 20:54.480 in a build, except build. We don't need tests. So, the PhantomJS dependency, downloaded, you know, 20:54.480 --> 21:00.000 thing from the internet, is never going to run in a build, but npm doesn't, you know, differentiate 21:01.040 --> 21:08.320 dev and test dependencies. So, if we want the build, you know, which requires dev dependencies, 21:08.320 --> 21:17.360 we're going to pull the test dependencies down. With sysgraph, it's kind of a double-edged 21:17.520 --> 21:26.640 sword: it is easier to identify these cases of linters, tests triggering, but the noise causes 21:26.640 --> 21:34.800 more difficulty for analysis. Imagine running a headless browser and seeing how the graph 21:34.800 --> 21:44.560 lights up and how to reason about that change over version changes. So, please leave CI to CI, it's 21:48.320 --> 21:57.040 hard enough already. And thankfully, Debian has recently, in the last six months, made a push for packages 21:57.040 --> 22:02.960 to actually observe the please-don't-run-tests flag. A lot of packages just chose not to. So, 22:03.920 --> 22:16.080 thank you if you're doing that. And looking forward, runtime interpreters are a big challenge. 22:17.680 --> 22:27.520 What you saw was actually with a bash, a shell hook that we added into our eBPF monitor. So, 22:27.520 --> 22:39.200 it's not impossible to do this.
We are sort of trying to scale down our, you know, the granularity 22:39.200 --> 22:46.080 of our ability to get at the behavior of the build. Runtime interpreters like Python, the JVM, 22:46.080 --> 22:52.960 make that challenging or opaque. We've had some initial look, but this is a, I think, a very interesting 22:52.960 --> 22:57.680 future direction for build instrumentation. I think there's a lot that can be done there. 22:59.120 --> 23:09.760 Build complexity: builds are software that have to do a lot of things, right? You have to build 23:09.760 --> 23:15.840 portably, allow configuration, parameterization, and it needs to be usable to the end user. 23:16.720 --> 23:23.200 But really, we don't need that. We just need the build to do like the, you know, hundred things 23:23.200 --> 23:31.120 that we tell it to. So, if we maybe project that down onto a concrete build, think of it maybe 23:31.120 --> 23:37.760 as like unrolling or materializing a specific build, I think that's an interesting area to go 23:37.760 --> 23:43.520 where we might not have the constraints of it needing to be approachable or easily manipulable 23:43.520 --> 23:57.360 by humans. And maybe a last direction that we find interesting is having a sort of golden 23:59.360 --> 24:07.440 graph, a golden image of the build that we can use to guide sort of porting or packaging, 24:07.440 --> 24:16.400 say, into a distro, a piece of software. So, we have this re-implementation, right, or, you know, 24:16.400 --> 24:24.240 a wrapper at best that we need to do to get a build into a distro or into some new context. 24:24.240 --> 24:30.400 Having a detailed description of what, you know, a good build looks like for that thing 24:31.120 --> 24:38.240 is really useful as a sort of mechanical way of evaluating maybe your progress towards that 24:38.240 --> 24:43.280 endpoint. And with execution traces, we really have that ability. 24:46.000 --> 24:55.840 Wrap up.
Excited by these, you know, enhancements across all the ecosystems we support. 24:55.840 --> 25:03.760 You know, OSS Rebuild largely runs without any involvement of maintainers, they don't need to, 25:03.760 --> 25:10.720 they can just sort of get these detailed data about their build without, you know, 25:10.720 --> 25:17.120 them needing to modify their CI, their release process, anything. The prospect of this sort of 25:17.120 --> 25:23.840 zero-cost improvement is really what interested me in starting the project, and we'd love for 25:23.920 --> 25:35.040 anyone who shares this to get involved: repo, docs. We don't and can't make all of our data public, 25:35.040 --> 25:40.720 but if you are interested, we are very willing to collaborate and share that, 25:41.760 --> 25:46.400 much easier at a more human scale to share it than just to stick it on the internet as we do 25:47.040 --> 25:53.520 attestations. We're also interested in how you all can apply these tools and methods to your 25:53.520 --> 25:59.200 own areas of interest. Everything is open source on OSS Rebuild, please check it out, and 25:59.200 --> 26:03.520 I believe there's potential here and we're eager to see what you can build and rebuild. 26:04.480 --> 26:06.080 Questions? 26:07.040 --> 26:09.040 Yeah, sorry. 26:28.480 --> 26:32.960 It's difficult at sort of a large scale to differentiate between things that are 26:32.960 --> 26:37.600 objectionable and things that are like bad. We've definitely found lots of objectionable things. 26:38.720 --> 26:45.760 The last sort of allusion there to future directions, and it would be really interesting to try to 26:46.720 --> 26:56.240 remediate that badness with what we know. Please stop having this effect happen and 26:56.240 --> 27:02.560 try to set that as an endpoint, but yeah.