WEBVTT 00:00.000 --> 00:11.760 All right, hello everyone, my name is Jonas, and I work on debugging at Apple, and today 00:11.760 --> 00:15.000 I want to talk about WebAssembly debugging in LLDB. 00:15.000 --> 00:22.560 So I'll do my best, the speaker also kind of echoes. 00:22.560 --> 00:28.840 So this work was primarily motivated by Swift, and targeting WebAssembly from Swift has 00:28.840 --> 00:29.960 come a long way. 00:29.960 --> 00:36.280 It started originally as a community project. Akio created WAKit as an open source project 00:36.280 --> 00:42.200 in 2018, and it was the first WebAssembly runtime written in Swift targeting Swift. 00:42.200 --> 00:47.320 And then a few years later, Max, who may or may not be in the room, there he is, he took 00:47.320 --> 00:52.240 over the ownership of the project, and he renamed it to WasmKit. 00:52.240 --> 00:58.560 After that, Yuta interned at Apple, and together with Max, they got 100% spec test coverage 00:58.680 --> 01:01.680 for WebAssembly with WasmKit. 01:01.680 --> 01:09.280 And then with the release of Swift 6.0 in 2024, targeting WebAssembly from Swift became an experimental 01:09.280 --> 01:16.680 feature, and also in Swift CI, we adopted WasmKit to test that target, so the compiler support. 01:16.680 --> 01:23.640 And then last year in 2025, all their hard work culminated in WebAssembly being officially 01:23.640 --> 01:29.440 supported as of Swift 6.2, and WasmKit now ships as part of the toolchain. 01:29.440 --> 01:33.760 And of course, all of this is open source. I know the kit at the end might be kind of confusing. 01:33.760 --> 01:38.520 The runtime is developed under the SwiftWasm organization on GitHub, and of course, the compiler 01:38.520 --> 01:40.800 support is just in upstream Swift. 01:40.800 --> 01:44.880 And if you want to learn more, check out SwiftWasm.org. 
01:44.880 --> 01:50.320 So our goal for Swift compiled to WebAssembly is to provide a first-class debugging experience, 01:50.320 --> 01:54.800 and that means full source-level debugging, where you can set breakpoints, step into and 01:54.800 --> 01:58.320 over Swift code, and of course, show your variables. 01:58.320 --> 02:03.200 And so in order to achieve that, the debugger needs to know about the Swift programming language, 02:03.200 --> 02:04.920 which means that there are two approaches. 02:04.920 --> 02:10.920 One, we could take existing tools that debug WebAssembly and teach them about Swift, or we could 02:10.920 --> 02:15.720 teach LLDB, which already knows about Swift, about WebAssembly. 02:15.720 --> 02:20.200 So maybe the first approach sounds simpler on the surface, but it really isn't. 02:20.200 --> 02:24.800 Debugging Swift is far from trivial, and LLDB already has years and years of investment 02:24.800 --> 02:26.200 in that space. 02:26.200 --> 02:30.520 On top of that, if we add WebAssembly support to LLDB, that doesn't just benefit Swift, 02:30.520 --> 02:34.080 but pretty much every language that is supported by LLDB. 02:34.080 --> 02:37.080 And that's why we thought that was the better approach. 02:37.080 --> 02:42.080 So before we commit to any approach, I do want to compare a few options that exist for 02:42.080 --> 02:44.760 debugging, or existed before I started all this. 02:44.760 --> 02:50.120 And we identified three main approaches, each with their own pros and cons, and an interesting 02:50.120 --> 02:55.520 observation here is that they all use LLDB one way or another. 02:55.520 --> 03:00.200 If you've played around with WebAssembly, you're probably already familiar with Wasmtime. 03:00.200 --> 03:04.080 It is the reference implementation of the WebAssembly runtime, and it's developed by the 03:04.080 --> 03:05.880 Bytecode Alliance. 
03:05.880 --> 03:10.640 It uses just-in-time compilation to generate native machine code with debugging 03:10.640 --> 03:11.640 info. 03:11.640 --> 03:16.160 And so debugging in Wasmtime means debugging the runtime, and the JITed code just runs 03:16.160 --> 03:18.680 as part of that same process. 03:18.680 --> 03:23.280 And so you can debug Wasmtime this way with either GDB or LLDB. 03:23.280 --> 03:26.720 And to the debugger, it just looks like you're debugging native code. 03:26.720 --> 03:29.720 And so what's neat about this approach is that there are no changes necessary, 03:29.720 --> 03:32.640 LLDB doesn't need to know about WebAssembly at all. 03:32.640 --> 03:36.840 The trade-off on the other hand is that you are debugging the runtime, and so it can make 03:36.840 --> 03:40.800 it kind of hard to distinguish between the runtime code and the code you're trying 03:40.800 --> 03:41.800 to debug. 03:41.800 --> 03:44.440 For example, if you're looking at a backtrace, you're going to get all these frames coming 03:44.440 --> 03:48.600 from the runtime that you have to mentally filter out. 03:48.600 --> 03:50.720 Chrome takes a different approach. 03:50.720 --> 03:55.600 They have a DevTools extension, which adds support for debugging C and C++ code, 03:55.600 --> 03:58.960 right from within the browser's built-in developer tools. 03:58.960 --> 04:03.720 And this extension also uses LLDB under the hood, and it uses that to parse DWARF and create 04:03.720 --> 04:08.680 types, but pretty much everything else is done by the browser itself. 04:08.680 --> 04:12.400 And so as a developer, that's nice because you can use the same tools for debugging JavaScript 04:12.400 --> 04:14.520 as you do for WebAssembly. 04:14.520 --> 04:18.680 The downside is that for Swift support, or any other language, you need to extend the browser 04:18.680 --> 04:22.840 and teach it about it, which is one of the things we wanted to avoid. 
04:22.840 --> 04:28.480 The third approach is the WebAssembly Micro Runtime, sometimes abbreviated as WAMR, and 04:28.480 --> 04:33.360 it's also developed by the Bytecode Alliance, and it's a lightweight standalone runtime 04:33.360 --> 04:36.720 targeting embedded and IoT applications. 04:36.720 --> 04:42.000 And this has a small debug server that talks the GDB remote protocol, which is an industry 04:42.000 --> 04:47.440 standard that's supported by both GDB and LLDB. 04:47.440 --> 04:51.240 It's also the approach we started pursuing, and the debug stub in the micro runtime meant 04:51.240 --> 04:55.600 that we had something that already existed and we could work with. 04:55.600 --> 04:58.080 So at this point, you're probably curious what this looks like. 04:58.080 --> 05:01.680 So I'm going to jump straight in with a demo and show you what it looks like before 05:01.680 --> 05:05.040 I talk about boring implementation details. 05:05.040 --> 05:08.960 So hopefully, that's not too small, but I realize that I might have underestimated the 05:08.960 --> 05:10.440 size of the screen. 05:10.440 --> 05:14.840 So this is a short recording of Swift code, which is compiled to WebAssembly, and we're using 05:14.840 --> 05:17.680 LLDB in Visual Studio Code to debug it. 05:17.680 --> 05:20.760 The code is fairly trivial, especially for the people in the back. 05:20.760 --> 05:26.920 We've got a dictionary mapping some fruits to prices, and then we have a kind of contrived function 05:26.920 --> 05:28.920 that just adds entries to the map. 05:28.920 --> 05:30.920 So I'm going to start playing the demo. 05:30.920 --> 05:36.080 So I'm going to set a breakpoint here, and then I'm going to run. I already compiled this, 05:36.080 --> 05:40.600 and you can see we got a backtrace, exactly as you would expect. 05:40.600 --> 05:44.000 Then we have local variables. 05:44.000 --> 05:47.760 To show you, this is the input to the function, the function arguments. 
05:47.760 --> 05:50.800 Then I continue, so you're going to see the values changing, because we're in the next 05:50.800 --> 05:53.000 iteration of this function getting called. 05:53.000 --> 05:55.240 And I'm going to continue stepping out of it. 05:55.360 --> 05:59.600 Here I'm just proving that those values look the way you expect, so I continue, or 05:59.600 --> 06:04.960 step out of it, and then here's the fruit prices variable that we can also inspect. 06:04.960 --> 06:08.800 So far, everything just looks like native debugging, so to prove to you that this is 06:08.800 --> 06:15.760 actually debugging WebAssembly, I'm doing a disassembly here to show you all the instructions. 06:15.760 --> 06:19.360 But in order to get here, we had to build several key pieces. 06:19.360 --> 06:22.160 So let me talk about that next. 06:22.160 --> 06:27.160 One of the key design decisions was using the GDB remote protocol, so I want to start out with 06:27.160 --> 06:29.280 an overview of that architecture. 06:29.280 --> 06:34.160 So in a native debug session, LLDB uses a client-server architecture, where the client 06:34.160 --> 06:39.680 is LLDB itself, and it runs on the host, and then it communicates with a debug stub, which 06:39.680 --> 06:44.160 runs next to the program you're debugging, which we call the inferior. 06:44.160 --> 06:49.360 The stub is a lightweight, entitled binary that directly controls the inferior, and usually 06:49.360 --> 06:53.040 does so with help from the operating system. For example, on macOS and Linux that's 06:53.040 --> 06:54.840 going to be ptrace. 06:54.840 --> 06:58.360 And also I want to call out that the same client-server architecture is used regardless 06:58.360 --> 07:02.160 of whether you're debugging remotely or locally. In the local case, the host 07:02.160 --> 07:05.000 and the target are just the same machine. 
07:05.000 --> 07:10.360 The GDB remote protocol is simple, it's well documented, it's widely adopted, and it's 07:10.360 --> 07:15.840 supported by a bunch of tools, including LLDB and GDB, and I realized that this year, it 07:15.840 --> 07:19.320 has existed for 40 years, so it has really proven itself. 07:19.320 --> 07:22.240 Which is the reason we wanted to build on top of that. 07:22.240 --> 07:26.080 So here's what that architecture looks like for WebAssembly. 07:26.080 --> 07:30.560 Here the runtime would implement the GDB remote stub, and how it does that is entirely implementation 07:30.560 --> 07:31.560 defined. 07:31.560 --> 07:37.080 Different runtimes can choose to optimize for their own constraints. 07:37.080 --> 07:42.600 And then, just the normal way, LLDB will talk using the GDB remote protocol to the debug stub, 07:42.600 --> 07:45.920 and that way control the WebAssembly inferior. 07:45.920 --> 07:50.960 And the key takeaway here is that this approach provides a standardized way for any debugger 07:50.960 --> 07:56.480 to talk to any WebAssembly runtime that exposes the GDB remote protocol, and the nice thing here 07:56.480 --> 08:00.040 is that LLDB doesn't need to know about the implementation of the runtime and the runtime 08:00.040 --> 08:04.440 doesn't need to be aware of which debugger is debugging it. 08:04.440 --> 08:09.280 And also WebAssembly is modeled around a stack-based virtual machine, and as we'll see later 08:09.280 --> 08:15.840 that poses some concepts that can't directly be translated into concepts in the GDB remote protocol, 08:15.840 --> 08:21.120 and so this requires a handful of extensions that both the runtime and the debugger need to support. 08:21.120 --> 08:28.080 Luckily the protocol was always designed to be extensible, and the majority of the packets are all standard. 
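To make the protocol a bit more concrete, here is a minimal sketch of how a GDB remote packet is framed on the wire: a `$`, the payload, a `#`, and a two-digit hex checksum that is the sum of the payload bytes modulo 256. The helper names are made up for this illustration, the `qWasmCallStack:1` payload is only an example argument, and the sketch ignores acknowledgements, escaping and run-length encoding.

```python
def checksum(payload: str) -> int:
    """Sum of the payload bytes, modulo 256."""
    return sum(payload.encode("ascii")) % 256


def frame_packet(payload: str) -> str:
    """Wrap a payload as $<payload>#<two hex digits>."""
    return f"${payload}#{checksum(payload):02x}"


def parse_packet(packet: str) -> str:
    """Validate the framing and checksum, then return the payload."""
    if len(packet) < 4 or packet[0] != "$" or packet[-3] != "#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    if checksum(payload) != int(packet[-2:], 16):
        raise ValueError("checksum mismatch")
    return payload


if __name__ == "__main__":
    # Hypothetical payload, used only to show the framing.
    wire = frame_packet("qWasmCallStack:1")
    print(wire, "->", parse_packet(wire))
```

Because the framing is this small, a runtime can implement a stub without pulling in a debugger library; the Wasm-specific parts live entirely in the payloads.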
08:28.080 --> 08:33.080 So let's take a look at the implementation. Previous to this work, LLDB already had some 08:33.080 --> 08:38.760 Wasm support, in particular for the Chrome DevTools integration that I mentioned earlier. 08:38.760 --> 08:45.200 It supports reading Wasm binaries, loading them into memory, and then using the DWARF to create types. 08:45.200 --> 08:50.320 And the micro runtime's debug stub was also designed to work with LLDB. 08:50.320 --> 08:55.040 The repository contained a patch, which was based on work from Paolo Severini. 08:55.040 --> 09:00.320 He was the driving force behind some of the earlier existing WebAssembly support in LLDB, 09:00.320 --> 09:04.960 and some of my work is a continuation of some of the PRs he put up there. 09:04.960 --> 09:10.040 And we also made it a goal to remain compatible with the debug stub in the micro runtime. 09:10.040 --> 09:15.160 During this process we encountered some places where we might have made some slightly different design decisions, 09:15.160 --> 09:19.160 but nothing that warrants breaking compatibility. 09:19.160 --> 09:21.240 All right, object files. 09:21.240 --> 09:28.280 So the first challenge here was that LLDB's existing object file Wasm plugin was rather rudimentary. 09:28.280 --> 09:33.960 To keep things simple, we were looking for a handful of known sections and ignoring everything else. 09:33.960 --> 09:39.720 And we needed to expand this to read arbitrary sections, in particular Swift's custom metadata, 09:39.720 --> 09:43.000 or language-specific metadata sections that we needed to read. 09:43.080 --> 09:48.680 And most of the work was reading the spec, knowing how to parse those data sections. 09:48.680 --> 09:53.480 WebAssembly generally follows the ELF model where sections contain segments. 09:53.480 --> 09:57.560 The one interesting thing is that those segments can be active or passive. 
09:57.560 --> 10:02.680 And active segments are automatically loaded into memory during module initialization, 10:02.680 --> 10:06.280 and they use what's called an init expression to specify where. 10:06.280 --> 10:11.320 And an init expression consists of a series of Wasm operations, or a subset of them, 10:11.400 --> 10:16.200 which meant that we had to implement a tiny Wasm interpreter in our object file plugin, 10:16.200 --> 10:19.800 which was a fun exercise, as you can imagine. 10:20.680 --> 10:24.920 The next challenge was supporting symbols, and for that we needed a symbol table. 10:24.920 --> 10:29.480 But WebAssembly's concept of a symbol table is exclusively used for linking, 10:29.480 --> 10:35.560 and it isn't preserved in the final binary, but Wasm does provide something called a name section, 10:35.560 --> 10:38.360 which is one of those new sections we had to parse. 10:38.360 --> 10:42.520 And the name section contains the function names, and an offset, or actually an index 10:42.520 --> 10:47.720 into the function section, and the function section contains an offset into the code section, 10:47.720 --> 10:50.920 together with a size, and I think that's it. 10:50.920 --> 10:54.920 And so we had to combine all of this together, the name, the size, the offset, 10:54.920 --> 10:57.960 and then populate LLDB's concept of a symbol table. 10:57.960 --> 11:00.600 And so with that you can now set breakpoints by symbol name, 11:00.600 --> 11:03.000 even if you do not have DWARF debugging information. 11:03.000 --> 11:09.000 So now we have a symbol table, we can set breakpoints, 11:09.000 --> 11:10.200 we can run to them. 11:10.200 --> 11:13.400 The next thing you want to do is be able to see where you are. 11:13.400 --> 11:15.400 So for that we use the bt command. 11:15.400 --> 11:20.760 And so normally LLDB would examine the frame pointer register, and then walk the stack. 
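To give a feel for the "tiny Wasm interpreter" behind init expressions, here is a rough sketch, not LLDB's actual code. It assumes only the two constant-expression opcodes most commonly seen in data segment offsets, `i32.const` (0x41, followed by a signed LEB128 immediate) and `global.get` (0x23, followed by an unsigned LEB128 index), terminated by `end` (0x0B). The function names and the `global_values` parameter are hypothetical.

```python
I32_CONST = 0x41   # i32.const <sleb128>
GLOBAL_GET = 0x23  # global.get <uleb128 index>
END = 0x0B         # end of expression


def read_uleb128(data: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_sleb128(data: bytes, pos: int) -> tuple[int, int]:
    """Decode a signed LEB128 value; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the final group is set
                result -= 1 << shift
            return result, pos


def eval_init_expr(expr: bytes, global_values=()) -> int:
    """Evaluate a constant init expression to the offset it describes."""
    pos, value = 0, None
    while pos < len(expr):
        opcode = expr[pos]
        pos += 1
        if opcode == I32_CONST:
            value, pos = read_sleb128(expr, pos)
        elif opcode == GLOBAL_GET:
            index, pos = read_uleb128(expr, pos)
            value = global_values[index]
        elif opcode == END:
            if value is None:
                raise ValueError("empty init expression")
            return value
        else:
            raise ValueError(f"unsupported opcode 0x{opcode:02x}")
    raise ValueError("missing end opcode")


if __name__ == "__main__":
    # i32.const 1024; end
    print(eval_init_expr(bytes([0x41, 0x80, 0x08, 0x0B])))
```

A real object file parser would handle more opcodes and value types; the point is only that the load address of an active segment has to be computed rather than read directly.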
11:21.800 --> 11:27.800 But doing unwinding depends on concepts such as registers and an ABI-defined stack layout, 11:27.800 --> 11:29.800 and these things just don't exist in WebAssembly. 11:30.760 --> 11:34.440 So for native code, LLDB has to do the unwinding itself, 11:34.440 --> 11:37.960 but we realized that for WebAssembly, we can rely on the runtime, 11:37.960 --> 11:41.080 which already needs to know this information. 11:41.080 --> 11:43.480 And this brings us to our first extension packet 11:43.480 --> 11:45.480 to the GDB remote protocol. 11:45.480 --> 11:50.680 It's the qWasmCallStack packet, and this queries the runtime for a list of program counters, 11:50.680 --> 11:53.240 representing the call stack for the current thread. 11:53.240 --> 11:57.160 LLDB then symbolicates them using the information from the symbol table, 11:57.160 --> 12:00.760 or from the DWARF debug info, and then you get the backtrace, as you would expect. 12:02.680 --> 12:03.880 Next up is variables. 12:05.000 --> 12:08.200 For that, we need a little bit of background about how debuggers work 12:08.840 --> 12:10.520 to parse debug information. 12:10.520 --> 12:14.920 If you were in the talk this morning about DWARF 6, you might have already been prepared. 12:16.120 --> 12:21.000 So the debugger uses DWARF location descriptions to find and recover variable values 12:22.360 --> 12:25.560 at runtime, and these things can take several forms. 12:25.640 --> 12:29.240 So they can be empty, in case the value is unavailable, 12:29.240 --> 12:32.040 because the variable has been optimized out. 12:32.040 --> 12:35.880 They can be implicit when the value is known, but there's no runtime representation, 12:35.880 --> 12:37.400 something like a constant. 12:37.400 --> 12:40.840 If a variable is in memory, you have a memory location, and we get an address, 12:40.840 --> 12:44.760 and if it lives in a register, we have a register location, and we get a register name. 
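Going back to the backtrace for a moment: once the runtime has returned a list of program counters, turning them into frames is a lookup in the symbol table built from the name, offset and size of each function. The sketch below shows that idea under simplifying assumptions; the `Symbol` type, the helper names and the sample offsets are all hypothetical and do not mirror LLDB's data structures.

```python
from bisect import bisect_right
from typing import NamedTuple


class Symbol(NamedTuple):
    offset: int  # start of the function body in the code section
    size: int    # size of the function body in bytes
    name: str


def build_symbol_table(names: dict[int, str],
                       bodies: dict[int, tuple[int, int]]) -> list[Symbol]:
    """Combine function names with (offset, size) pairs, sorted by offset."""
    return sorted(
        Symbol(offset, size, names.get(index, f"func{index}"))
        for index, (offset, size) in bodies.items()
    )


def symbolicate(table: list[Symbol], pc: int) -> str:
    """Map a program counter to 'name + delta', or a raw address if unknown."""
    i = bisect_right([sym.offset for sym in table], pc) - 1
    if i >= 0 and pc < table[i].offset + table[i].size:
        return f"{table[i].name} + {pc - table[i].offset}"
    return f"0x{pc:x}"


if __name__ == "__main__":
    # Made-up function indices, names and code offsets.
    table = build_symbol_table(
        names={0: "main", 1: "addEntry"},
        bodies={0: (0x10, 0x20), 1: (0x30, 0x18)},
    )
    for frame, pc in enumerate([0x34, 0x12]):
        print(f"frame #{frame}: {symbolicate(table, pc)}")
```

With DWARF available, the same program counter would additionally resolve to a file and line; the symbol table is the fallback when it is not.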
12:45.640 --> 12:50.200 Empty and implicit locations work exactly the same way in WebAssembly as they do for native code, 12:50.840 --> 12:53.960 but memory and registers behave slightly differently. 12:54.920 --> 13:00.440 WebAssembly doesn't have a concept of registers, and so therefore no register location descriptions. 13:01.000 --> 13:05.640 However, WebAssembly has a few other places where it can store values, namely locals, 13:05.640 --> 13:07.480 globals, and an operand stack. 13:07.480 --> 13:11.720 And so to handle those cases, WebAssembly uses something called virtual registers, 13:11.720 --> 13:15.160 which is entirely a debugging concept, to describe this in DWARF, 13:15.160 --> 13:17.240 and have the debugger query the runtime. 13:17.960 --> 13:22.200 So when a value is stored in a WebAssembly local, global, or on the operand stack, 13:22.600 --> 13:28.280 we use a DWARF vendor extension called DW_OP_WASM_location. It takes two arguments, 13:28.280 --> 13:32.120 the first one specifying which one of these three it is, and the second one specifying the 13:32.120 --> 13:34.440 index, like the first or the second or the third global. 13:36.520 --> 13:41.320 Each register also has its own corresponding GDB remote packet, these are also extensions, 13:41.320 --> 13:45.640 that the debugger then uses when it encounters it to query the runtime and get the value back out. 13:46.440 --> 13:51.000 So let's look at an example for a function argument. 13:51.160 --> 13:56.520 So in native code, depending on the ABI, you might expect that to be passed in a register and get a register name. 13:57.080 --> 14:03.400 So in WebAssembly, we will get a DWARF location description that uses DW_OP_WASM_location. 14:03.400 --> 14:08.200 And so in this particular case, the value is stored as a function local at index two, 14:08.200 --> 14:09.560 so the second function local. 
14:09.560 --> 14:11.160 So when LLDB 14:11.160 --> 14:16.200 parses the DWARF expression and encounters this, it's going to query the runtime with a 14:16.920 --> 14:21.320 qWasmLocal packet and index two, get the value back, and show it to you. 14:23.320 --> 14:28.520 Variables located in memory behave pretty much the same and can use DWARF's standard memory locations, 14:29.160 --> 14:32.120 but WebAssembly's architecture requires a little bit of care. 14:32.120 --> 14:38.360 Specifically, WebAssembly follows a segmented memory model, where code and data 14:39.000 --> 14:43.640 live in separate address spaces. Also, different modules each have their own address space. 14:44.360 --> 14:47.240 And LLDB doesn't currently have address space support, 14:47.240 --> 14:51.240 which is something that's come up a few times today, so we needed a creative solution, 14:51.240 --> 14:56.680 and what we did is we used the top 32 bits of a 64-bit address to encode the address space. 14:56.680 --> 15:02.920 We used the first two bits for the type, so like code or memory, and then the remaining 15:02.920 --> 15:07.320 30 bits we used to encode the module. And so this approach works well for 32-bit 15:07.320 --> 15:12.760 Wasm, which is still very much the default, but obviously for WebAssembly 64, we'll need all 64 bits 15:12.760 --> 15:17.160 for the address, and we will need to come up with something different. Luckily, address space 15:17.160 --> 15:21.560 support is something that we've been discussing on the forums. I need to hurry up. 15:22.840 --> 15:27.240 Although Swift uses DWARF as its debugging format, for anything but trivial types 15:27.240 --> 15:32.440 we also need the Swift reflection metadata, for example, to resolve the concrete type of 15:32.440 --> 15:37.880 a generic at runtime. 
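The address encoding described above can be sketched in a few lines. This assumes that "the first two bits" means the two most significant bits of the 64-bit value, followed by a 30-bit module ID, with the 32-bit Wasm offset in the low half. The numeric values chosen for the type field and the function names are hypothetical, picked only for this illustration.

```python
OFFSET_BITS = 32
MODULE_BITS = 30
TYPE_BITS = 2

# Hypothetical type values; the real numbering is an implementation detail.
TYPE_MEMORY = 0
TYPE_CODE = 1


def encode_address(addr_type: int, module_id: int, offset: int) -> int:
    """Pack (type, module, 32-bit offset) into one 64-bit address."""
    if not 0 <= addr_type < (1 << TYPE_BITS):
        raise ValueError("type does not fit in 2 bits")
    if not 0 <= module_id < (1 << MODULE_BITS):
        raise ValueError("module id does not fit in 30 bits")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset does not fit in 32 bits")
    return (
        (addr_type << (OFFSET_BITS + MODULE_BITS))
        | (module_id << OFFSET_BITS)
        | offset
    )


def decode_address(address: int) -> tuple[int, int, int]:
    """Unpack a 64-bit address into (type, module, offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    module_id = (address >> OFFSET_BITS) & ((1 << MODULE_BITS) - 1)
    addr_type = (address >> (OFFSET_BITS + MODULE_BITS)) & ((1 << TYPE_BITS) - 1)
    return addr_type, module_id, offset


if __name__ == "__main__":
    address = encode_address(TYPE_CODE, module_id=2, offset=0x1000)
    print(hex(address), decode_address(address))
```

The range check on `offset` is where the Wasm64 limitation shows up: once an offset needs more than 32 bits, there is no room left for the type and module fields.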
And so Swift uses a common library, libSwiftReflection, 15:37.880 --> 15:44.200 and it's used at runtime for performing reflection, and it's used by the debugger to generate types. 15:44.920 --> 15:49.640 And so it's nice to have one implementation, so we don't have to duplicate the code. 15:49.640 --> 15:53.000 What that means is that this library also needs to be able to parse 15:53.000 --> 15:58.520 those sections I mentioned earlier, and so we had to redo some of that work in libSwiftReflection. 15:58.600 --> 16:04.040 But the thing is that that was pretty much all it took to support Swift, because all the other things 16:04.040 --> 16:09.160 are built on top of the primitives provided by the GDB remote protocol, and we also didn't 16:09.160 --> 16:15.240 need to modify the runtime or anything to make this all work. Finally, we introduced a 16:15.240 --> 16:20.440 WebAssembly platform to LLDB, and so when you create a target in LLDB with a WebAssembly 16:20.440 --> 16:25.240 triple, this platform will automatically get selected, and so one of the responsibilities of the 16:25.240 --> 16:30.200 platform is launching binaries. And this platform can be configured with a specific runtime, 16:30.200 --> 16:34.200 so that when you load it in LLDB, and you type run, it's automatically going to launch that 16:34.200 --> 16:39.800 under that runtime, connect to the GDB stub, and make it look like you're just debugging something natively. 16:39.800 --> 16:44.600 And once configured, this thing disappears, and you get the user experience that you're used to. 16:46.200 --> 16:50.840 So with all that, I'm happy to say that I think we delivered on our original goal, 16:50.840 --> 16:54.440 and LLDB now has first-class debugging support for WebAssembly, 16:54.440 --> 16:59.880 and not just for Swift, but for any language that is supported. 
We also succeeded in building a 16:59.880 --> 17:05.240 solution that's not tied to a particular runtime. We already have three runtimes today that support 17:05.240 --> 17:10.600 debugging with LLDB this way, so besides the micro runtime and WasmKit, there's also support 17:10.600 --> 17:16.360 in WebKit's JavaScriptCore engine. And all these runtimes support the protocol extensions 17:16.360 --> 17:20.920 that I discussed earlier. They are formally documented on the LLDB website, and I hope to see 17:20.920 --> 17:28.120 more debuggers and runtimes adopt them. So what's next? The immediate priority for us is to 17:28.120 --> 17:33.080 extend our test suite to build all our test binaries for WebAssembly, and then run them under a runtime. 17:33.080 --> 17:38.600 That's going to allow us to reuse our thousands of existing tests to test our WebAssembly support, 17:38.600 --> 17:43.880 and hopefully this will also help uncover bugs in the different runtimes' GDB stubs. 17:44.680 --> 17:48.680 And beyond that, we'll want to support more Swift features as they make their way over to WebAssembly, 17:49.320 --> 17:54.040 and then address spaces are something we'll need to do in order to get Wasm64 support going. 17:55.080 --> 18:00.520 I want to thank everyone that made it possible to make this happen: Paolo for his work on WebAssembly 18:00.520 --> 18:06.280 in LLDB, Adrian Prantl for his work on the Swift parts, David Spickett for reviewing my PRs, 18:06.280 --> 18:11.480 Max and Yuta for their work on WasmKit, and Yijia Huang and Mark on the WebKit team for 18:11.480 --> 18:16.520 adding this to JavaScriptCore. Thank you. I think there should be a little bit of time for questions.