WEBVTT 00:00.000 --> 00:11.240 Hello everyone, my name is Mathieu Desnoyers and I am CEO of EfficiOS. 00:11.240 --> 00:14.800 I'm here with Olivier Dion, who works with me at EfficiOS. 00:14.800 --> 00:20.520 We are here to introduce the ExaTracer, so it's a project we've been working on for 00:20.520 --> 00:27.240 the past two years, mainly with AMD and Lawrence Livermore National Laboratory, and we're 00:27.240 --> 00:33.040 basically adapting the LTTng tracer, which has been really focusing on the embedded 00:33.040 --> 00:37.200 and telecom world, to HPC. 00:37.200 --> 00:43.400 So I'm just going to backtrack a little bit and give some context so that we're on the 00:43.400 --> 00:45.400 same page in the room. 00:45.400 --> 00:51.000 So, a quick introduction about what tracing is and how it differs from other strategies 00:51.000 --> 00:54.000 for observation of systems, like profiling. 00:54.000 --> 00:59.440 So it's based on instrumentation of code: you can either statically insert or dynamically 00:59.440 --> 01:04.160 enable instrumentation, or it can be done fully dynamically. 01:04.160 --> 01:11.440 You then collect or react to a sequence of events that are emitted during the code execution. 01:11.440 --> 01:16.960 So one important aspect is to minimize the disturbance introduced on the workload, 01:16.960 --> 01:21.440 so that the issues you face still reproduce under observation. 01:21.440 --> 01:27.040 So for that, low overhead is really key. It can be compared to logging, however it needs 01:27.040 --> 01:33.640 to handle very high event throughput, in the order of millions of events per second. 01:33.640 --> 01:34.920 So why tracing? 01:34.920 --> 01:37.040 So why would you want to use tracing? 01:37.040 --> 01:42.520 So one of the main reasons for using it is to identify the root cause of issues. 01:42.520 --> 01:47.000 So Heisenbugs: bugs that are hard to reproduce, that only reproduce once in a blue 01:47.000 --> 01:53.120 moon, issues that disappear under observation; identifying the root cause of performance 01:53.120 --> 02:01.280 bottlenecks; protocol or API specification violations; and monitoring program behavior 02:01.280 --> 02:02.480 and then reacting. 02:02.480 --> 02:10.680 So one example is hooking on application code and gathering a detailed snapshot of the events 02:10.680 --> 02:12.680 that led to that problem. 02:12.680 --> 02:18.560 So you can then identify performance outliers and the root cause of errors. 02:18.560 --> 02:20.000 So, tracing and profiling. 02:20.000 --> 02:26.640 So both have their own strengths, I would say. 02:26.640 --> 02:32.600 So profiling is more lightweight because it gathers samples; tracing needs to be inherently 02:32.600 --> 02:39.480 lightweight because it extracts a lot of information. So where does each strategy fit? 02:39.480 --> 02:46.440 So they are both complementary. Profiling is very good at identifying active usage of resources. 02:46.440 --> 02:52.360 Tracing can do that as well, but profiling has the benefit of just taking samples to do 02:52.360 --> 02:54.040 that, so it's more lightweight. 02:54.040 --> 02:58.440 However, tracing really excels at identifying resource misuse. 02:58.440 --> 03:04.880 So when the CPU is not actually used, so you're not using 100% of your CPU, you don't 03:04.880 --> 03:12.200 know why, and you're waiting for another system, for IO, for some other thread to complete 03:12.200 --> 03:13.200 something.
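For readers who want to see what the statically inserted instrumentation mentioned above looks like in practice, here is a minimal sketch of a user-space tracepoint with LTTng-UST. It assumes the classic TRACEPOINT_EVENT macro names (newer LTTng-UST releases also provide LTTNG_UST_-prefixed variants); the provider name, event name, and fields are invented for the example.

```c
/* tp.h: hypothetical LTTng-UST tracepoint provider (classic macro names). */
#undef TRACEPOINT_PROVIDER
#define TRACEPOINT_PROVIDER my_app

#undef TRACEPOINT_INCLUDE
#define TRACEPOINT_INCLUDE "./tp.h"

#if !defined(_TP_H) || defined(TRACEPOINT_HEADER_MULTI_READ)
#define _TP_H

#include <lttng/tracepoint.h>

TRACEPOINT_EVENT(
    my_app,                       /* provider name */
    iteration_done,               /* event name */
    TP_ARGS(int, iter, const char *, phase),
    TP_FIELDS(
        ctf_integer(int, iter, iter)   /* integer payload field */
        ctf_string(phase, phase)       /* string payload field */
    )
)

#endif /* _TP_H */

#include <lttng/tracepoint-event.h>
```

The application then emits events through the generated probe:

```c
/* app.c: single-file setup for brevity; link with -llttng-ust. */
#define TRACEPOINT_CREATE_PROBES
#define TRACEPOINT_DEFINE
#include "tp.h"

int main(void)
{
    for (int i = 0; i < 1000; i++)
        tracepoint(my_app, iteration_done, i, "compute");
    return 0;
}
```

When no tracing session has enabled the event, the tracepoint call reduces to a cheap enabled-check, which is what keeps the disturbance on the workload low.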
03:13.200 --> 03:19.560 So identifying those more complex patterns where your resources are not used as they should 03:19.560 --> 03:25.000 be, tracing is excellent at that, so this is where tracing really shines. 03:25.000 --> 03:33.600 So, a little bit of history on the LTTng evolution. 03:33.600 --> 03:41.560 So I started the project in 2005. The main focus since then has been mainly telecom, embedded 03:41.560 --> 03:46.840 and real-time systems, large multi-core systems, scaling to many, many cores. 03:46.840 --> 03:54.520 Linux kernel and application tracing; we've created the Common Trace Format, versions 1 and 2. 03:54.520 --> 04:01.280 It's been developed in collaboration with the people developing trace analyzers, 04:01.280 --> 04:05.240 so Babeltrace, Trace Compass; we'll come back to that. 04:05.240 --> 04:11.520 So a few words about the Common Trace Format. I created that format with the Tool Infrastructure 04:11.520 --> 04:17.600 Working Group around 2010, and version 1 was released in 2012. 04:17.600 --> 04:21.680 So it was based on a domain-specific language to describe the metadata. 04:21.680 --> 04:24.000 It provides a structured type system. 04:24.000 --> 04:28.920 It can be defined statically or dynamically. The static definition would be more for embedded 04:28.960 --> 04:34.760 systems, where you can ship the description of the trace with your SDK and then you just gather 04:34.760 --> 04:38.960 the trace from the hardware in binary. 04:38.960 --> 04:43.160 Clock descriptions allow correlating between traces. 04:43.160 --> 04:48.040 It's a binary trace format, so it's really compact and it's fast to produce. 04:48.040 --> 04:52.760 So it's a trade-off between compactness and really not using a lot of CPU when you 04:52.760 --> 04:53.760 write it out. 04:53.760 --> 04:55.680 So we're not talking about compression there. 04:55.680 --> 05:01.040 We're talking about writing binary data that can be either aligned or not aligned depending 05:01.040 --> 05:07.840 on the architecture, so we can get the most of the performance of the various architectures 05:07.840 --> 05:10.160 that are targeted. 05:10.160 --> 05:13.440 So it can be easily generated by hardware and software components. 05:13.440 --> 05:17.440 So CTF 2 was released last year, in 2024. 05:17.440 --> 05:22.680 So the main difference is that we switched from a custom DSL to JSON. 05:22.680 --> 05:26.920 So it's a superset of the CTF 1.8 type system. 05:26.920 --> 05:32.400 It has new built-in types, and its main strength is to allow user-defined metadata and 05:32.400 --> 05:33.560 extensions. 05:33.560 --> 05:39.000 So you can tag specific fields saying, well, this field, this array of four bytes, is actually, 05:39.000 --> 05:45.560 in my own namespace, an IPv4 address, and I know that in my tooling I need to interpret and 05:45.560 --> 05:50.280 print that IPv4 as a network address. 05:50.280 --> 05:52.440 So it's a binary trace format. 05:52.440 --> 05:56.800 So the binary trace format is the same as CTF 1.8 plus additions. 05:56.800 --> 06:03.280 So the new types that are supported in CTF 2 are being added, and that's pretty much it. 06:03.280 --> 06:04.280 Babeltrace. 06:04.280 --> 06:11.600 So that's the reference implementation of CTF, as trace reader and producer. 06:11.600 --> 06:14.320 So it can convert, filter, seek, and analyze traces. 06:14.320 --> 06:22.280 It's both a plug-in system and also a C/C++ library API with Python bindings and a command-line 06:22.280 --> 06:23.280 interface.
06:23.280 --> 06:27.920 So you can basically support arbitrary trace formats and convert them to CTF or analyze 06:27.920 --> 06:33.440 them on the fly just by implementing plugins to read those other trace formats. 06:33.440 --> 06:37.400 So it's really meant to bring all the data together. 06:37.400 --> 06:39.720 It also allows adding custom analysis phases. 06:39.720 --> 06:43.000 It also does the correlation across traces by timestamp. 06:43.000 --> 06:49.920 So you have various information sources and you correlate them in its model. 06:49.920 --> 06:57.240 And it has built-in plugins to both read and write CTF 1.8 and CTF 2. 06:57.240 --> 06:59.080 So, Trace Compass. 06:59.080 --> 07:03.680 It's a graphical interface to view and analyze traces. 07:03.680 --> 07:06.360 We'll show some more in the coming slides. 07:06.360 --> 07:10.800 It supports the Common Trace Format as well as other trace formats and logs and custom 07:10.800 --> 07:11.800 logs. 07:11.800 --> 07:16.080 It has a plug-in system for views and analysis phases. 07:16.080 --> 07:20.400 So you can create your own and add them into that project. 07:20.400 --> 07:22.600 There are many built-in views and analyses. 07:22.600 --> 07:28.160 For the Linux kernel, for instance, there are views for disk I/O, hardware resources, CPU and IRQ usage, and many 07:28.160 --> 07:29.160 more. 07:29.160 --> 07:34.400 There's a critical path analysis, log correlation with traces, network packet correlation, 07:34.400 --> 07:35.400 and so on. 07:36.400 --> 07:41.760 Here's a visual example of what Trace Compass looks like. At the top, 07:41.760 --> 07:48.480 we have the resources view showing each of the CPUs, their state, the frequency they run 07:48.480 --> 07:49.480 at. 07:49.480 --> 07:52.560 So that's useful to know what the CPUs are doing over time. 07:52.560 --> 07:55.160 So the x-axis is time. 07:55.160 --> 07:57.000 Under that, we have the control flow view. 07:57.000 --> 07:59.240 I would call that also maybe the thread view. 07:59.240 --> 08:00.800 So it's on a per-thread basis. 08:00.800 --> 08:03.240 We show what each thread is doing over time. 08:03.240 --> 08:07.960 And then we can see the waiting and the unblocking between threads. 08:07.960 --> 08:13.320 We have a detailed event list and then some histograms at the bottom. 08:13.320 --> 08:18.040 So the main features of LTTng: being low overhead, it has a really fast per-CPU ring 08:18.040 --> 08:22.080 buffer, in the order of 100 nanoseconds per event. 08:22.080 --> 08:26.200 It has user-configurable dynamic runtime filters; predicate evaluation, 08:26.200 --> 08:30.480 let's say a comparison of two integers, would be around 15 nanoseconds, so that's for when you want to filter 08:30.480 --> 08:34.040 out a lot of the events that you don't need. 08:34.040 --> 08:38.080 Event streaming to disk or network; there's a snapshot mode where you do flight recorder 08:38.080 --> 08:42.400 tracing in memory, and when something interesting happens, then you capture from memory. 08:42.400 --> 08:48.320 So you remove all the IO that you otherwise need to do to stream out the data, and you just 08:48.320 --> 08:49.760 capture when it's needed. 08:49.760 --> 08:51.440 So that's really useful. 08:51.440 --> 08:56.480 It has a trigger mechanism with event notifications and payload capture, to react to what 08:56.480 --> 08:59.840 is happening on the system, rather than buffering things.
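As a sketch of how the flight-recorder and filtering features just described can be driven programmatically, here is a hypothetical use of the liblttng-ctl C API. It assumes the LTTng 2.x function names (lttng_create_session_snapshot, lttng_enable_event_with_exclusions, and so on) and that a NULL snapshot URL falls back to a default output; the session name, event name, and filter expression are made up and reuse the earlier example.

```c
/* flight_recorder.c: hypothetical sketch of snapshot-mode tracing with a
 * runtime filter, using the liblttng-ctl API. Link with -llttng-ctl. */
#include <lttng/lttng.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    struct lttng_domain dom;
    struct lttng_handle *handle;
    struct lttng_event ev;

    /* Snapshot mode: events stay in in-memory per-CPU ring buffers
     * (flight recorder); nothing is written out until a snapshot is taken. */
    if (lttng_create_session_snapshot("flight-recorder", NULL) < 0) {
        fprintf(stderr, "failed to create snapshot session\n");
        return 1;
    }

    memset(&dom, 0, sizeof(dom));
    dom.type = LTTNG_DOMAIN_UST;          /* user-space tracing domain */
    handle = lttng_handle_create("flight-recorder", &dom);

    memset(&ev, 0, sizeof(ev));
    ev.type = LTTNG_EVENT_TRACEPOINT;
    strcpy(ev.name, "my_app:iteration_done");   /* event from the earlier sketch */

    /* Runtime filter evaluated per event in the traced process:
     * discard everything except the iterations we care about. */
    lttng_enable_event_with_exclusions(handle, &ev, NULL, "iter > 100", 0, NULL);

    lttng_start_tracing("flight-recorder");

    /* ... when something interesting happens, capture the ring buffers,
     * for example with the `lttng snapshot record` command ... */

    lttng_stop_tracing("flight-recorder");
    lttng_destroy_session("flight-recorder");
    lttng_handle_destroy(handle);
    return 0;
}
```

The same setup is usually done with the lttng command-line client; the C API is shown here only to make the session, event, and filter concepts explicit.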
08:59.840 --> 09:07.440 There's a session rotation mode; that's mainly meant for basically splitting traces into chunks 09:07.440 --> 09:13.000 so that each can be sent and analyzed faster than just waiting for the entire experiment 09:13.000 --> 09:14.800 to complete. 09:14.800 --> 09:16.280 Trace correlation. 09:16.280 --> 09:21.280 We do correlation across tracing domains, kernel and user space, and across hosts. 09:21.280 --> 09:25.440 Across hosts, we rely on the time synchronization of the machines, whether you use 09:25.440 --> 09:27.320 NTP or PTP. 09:27.320 --> 09:31.480 There are also algorithms in Trace Compass that re-align traces at post-processing based 09:31.480 --> 09:37.920 on network communication, so it basically takes the TCP sequence numbers and re-aligns traces 09:37.920 --> 09:40.680 based on that communication. 09:40.680 --> 09:43.320 So we can do that as well. 09:43.320 --> 09:44.320 Yeah. 09:44.320 --> 09:45.320 HPC collaboration. 09:45.320 --> 09:50.320 So in the past two years, we've started working with Argonne National Laboratory. 09:50.320 --> 09:52.320 They've developed THAPI. 09:52.320 --> 09:54.920 It's based on LTTng and Babeltrace. 09:54.920 --> 10:02.480 They provide instrumentation of OpenCL, Level Zero, the CUDA runtime, HIP, OMPT. 10:02.480 --> 10:09.760 It's developed for tracing Aurora, the Aurora cluster, with 10,000 nodes, 9 million cores. 10:09.760 --> 10:15.000 It captures millions of events per second per node, and it runs for hours when tracing 10:15.000 --> 10:16.000 is enabled. 10:16.000 --> 10:17.000 So that's the scale. 10:17.000 --> 10:22.080 THAPI provides nice summaries about what is happening on the system as it runs. 10:22.080 --> 10:27.320 So we are also collaborating with the Lawrence Livermore National Laboratory on the ExaTracer 10:27.320 --> 10:31.760 project, so we're developing it in close collaboration with AMD. 10:31.760 --> 10:39.760 So we've created the instrumentation of ROCm for ExaTracer, so it connects LTTng to ROCm 10:39.760 --> 10:41.560 and MPI. 10:41.560 --> 10:48.760 So the HIP, HSA, rocTX, GPU kernel dispatch layers, Open MPI, Cray MPI: all of that goes 10:48.760 --> 10:50.160 into LTTng. 10:50.160 --> 10:57.640 It's been developed to trace El Capitan, with 11K nodes and 11 million cores. 10:57.640 --> 11:03.320 So we are collaborating on a research project with Polytechnique Montréal as well, on Trace 11:03.320 --> 11:04.320 Compass. 11:04.320 --> 11:11.520 It's a trace analyzer and visualizer, and we're working with them on improving scalability 11:11.520 --> 11:18.480 to large traces, so more than 100 gigabytes, and that's really pertinent in the context of 11:18.480 --> 11:20.160 HPC. 11:20.160 --> 11:25.920 And some of the goals there are to reduce the reaction time, so basically interactivity 11:25.920 --> 11:32.880 when navigating in a very large trace, and also to reduce the analysis delay. 11:32.880 --> 11:37.960 So this is the delay between, let's say you take a 12-hour run, that's a huge trace 11:37.960 --> 11:38.960 that you gathered. 11:38.960 --> 11:44.560 You don't want to wait another 12 to 24 hours of pre-computation before you can start 11:44.560 --> 11:46.760 navigating in your data. 11:46.760 --> 11:50.760 So ideally, we want to have some kind of pipeline in there, so that the analysis can 11:50.760 --> 11:54.600 be done on other resources in parallel with your workload. 11:54.600 --> 11:57.080 We'll come back to that.
11:57.080 --> 12:02.880 So here's a stack diagram of the HPC software stack that we target. 12:02.880 --> 12:06.920 So you can see that on the target, at the application level, we instrument in green, that's 12:06.920 --> 12:11.960 where tracing is available, so C/C++ applications, GPU kernel dispatch. 12:11.960 --> 12:18.440 We have tracing partially available at the Python and Java levels with logger integration. 12:18.440 --> 12:23.960 At the runtime level, it's partially available in the libc; some of it is instrumented. 12:23.960 --> 12:30.120 HSA, HIP, OpenMP, and the MPIs are instrumented. 12:30.120 --> 12:33.200 And then at the system level, we have instrumentation of the Linux kernel. 12:33.200 --> 12:38.780 And that can all be funneled into the same tools on an aligned timeline, and you can really 12:38.780 --> 12:42.100 see correlated information from across all those layers. 12:42.100 --> 12:45.820 So that's one of the strengths of the LTTng tooling. 12:45.820 --> 12:53.780 And then we have, on the side, the ExaTracer that connects to each of those levels of instrumentation. 12:53.780 --> 13:01.500 We have the SDK; barectf will maybe become useful in the picture to trace on the GPU itself, 13:01.500 --> 13:03.340 but that's not there yet. 13:03.340 --> 13:08.780 Argonne's THAPI is also there, with Babeltrace and Trace Compass. 13:08.780 --> 13:09.980 All right. 13:09.980 --> 13:23.300 So now I will leave it to Olivier. 13:23.300 --> 13:29.620 So the ExaTracer is targeting many of the layers of ROCm, basically, and MPI. 13:29.620 --> 13:34.980 So we have MPI, with Open MPI and Cray MPI, those are the two implementations that we have tested. 13:34.980 --> 13:40.260 We have HSA, HIP, rocTX, and GPU kernel dispatch. 13:40.260 --> 13:47.300 So how this is done, basically, for non-programmers here: these, for example HSA and HIP, are 13:47.300 --> 13:52.620 libraries in C++, and there are header files that describe the API. 13:52.620 --> 13:57.100 And we parse this API with Clang, which is a compiler for C and C++. 13:57.100 --> 14:03.100 And from there, we derive a definition of tracepoints, and we generate wrappers automatically 14:03.100 --> 14:08.620 based on this definition of the API, and it's the same for MPI also. 14:08.620 --> 14:13.420 And so with these wrappers generated, we basically do interception. 14:13.420 --> 14:19.780 So for HIP and HSA, they have mechanisms inside of the runtime to basically allow interception 14:19.780 --> 14:25.060 of every call, and for MPI and rocTX, we rely on symbol overriding. 14:25.060 --> 14:31.380 So symbol overriding is a software mechanism that allows you to basically hijack symbols at 14:31.380 --> 14:36.220 runtime, so you can call your function instead of the function defined by the 14:36.220 --> 14:40.820 programmers, and then you can call the next function, which is the original one. 14:40.820 --> 14:45.220 And so the advantage is that you don't have to recompile your software. 14:45.220 --> 14:52.460 LD_PRELOAD symbol overriding, for example, is just an environment variable, 14:52.460 --> 14:54.020 and you don't have to recompile. 14:54.020 --> 15:00.900 And the GPU kernel is basically instrumented during dispatch, using the rocprofiler SDK. 15:00.900 --> 15:07.660 This is an example of a trace printed by Babeltrace, so Babeltrace, the command-line 15:07.660 --> 15:13.300 interface, is basically just a simple converter, so we can read a trace and put it on the terminal.
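Before looking at the trace output, here is a hand-written sketch of the LD_PRELOAD symbol overriding just described, for a single MPI function. The real ExaTracer wrappers are generated automatically from the headers and emit LTTng tracepoints; this hypothetical version just prints to stderr to keep it self-contained.

```c
/* mpi_send_wrap.c: hand-written sketch of LD_PRELOAD symbol overriding for MPI_Send.
 *
 * Build: mpicc -shared -fPIC mpi_send_wrap.c -o libmpisendwrap.so -ldl
 * Run:   LD_PRELOAD=./libmpisendwrap.so mpirun ./my_app
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <mpi.h>

typedef int (*mpi_send_fn)(const void *, int, MPI_Datatype, int, int, MPI_Comm);

int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    static mpi_send_fn real_send;
    int ret;

    /* Resolve the next (original) definition of the symbol once. */
    if (!real_send)
        real_send = (mpi_send_fn) dlsym(RTLD_NEXT, "MPI_Send");

    /* "Entry" event; ExaTracer would emit an LTTng tracepoint here. */
    fprintf(stderr, "MPI_Send entry: count=%d dest=%d tag=%d\n", count, dest, tag);

    ret = real_send(buf, count, datatype, dest, tag, comm);

    /* "Exit" event with the return value. */
    fprintf(stderr, "MPI_Send exit: ret=%d\n", ret);
    return ret;
}
```

Because the override is resolved by the dynamic linker, neither the application nor the MPI library has to be recompiled, which is exactly the property highlighted above.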
15:13.300 --> 15:20.700 So as you can see, it's quite verbose; there's a lot in here. I have two events here. 15:20.700 --> 15:25.860 So we have a timestamp, and we have a time relative to the event before it. 15:25.860 --> 15:30.540 We have the host name, and we have the provider and the tracepoint, so in this case it's a HIP event. 15:30.540 --> 15:34.980 It's a malloc of a memory location, and it's the entry of that function. 15:34.980 --> 15:40.180 And then we have context fields, so for example, this is CPU ID 17, so we know which 15:40.180 --> 15:46.980 CPU has been running this event, and then we have application context also; for example, 15:46.980 --> 15:53.580 we have the MPI rank, so we know that the event happened on rank 1, instead of rank 0 for 15:53.580 --> 15:55.060 the second event. 15:55.060 --> 15:59.420 And then we have the payload, so the payload is basically what's interesting in terms 15:59.420 --> 16:06.580 of arguments of the function, so we have the correlation ID to determine which thread and 16:06.580 --> 16:10.260 which function call happened, because we also have the exit of the function, so we can do 16:10.260 --> 16:16.740 a correlation between entry and exit, and we have the pointer that was allocated 16:16.740 --> 16:22.660 and the size of it. And the next event is similar, it's basically an MPI send, and we 16:22.660 --> 16:28.260 can see, for example, well, it's a return, so we don't see that much in it, but we know 16:28.260 --> 16:32.260 that the function has returned, basically. 16:32.260 --> 16:36.780 This is a view in Trace Compass, and I must mention that the previous slide here, this 16:36.780 --> 16:42.580 is the default output of Babeltrace, but Babeltrace is really meant to be programmable, 16:42.580 --> 16:48.420 so this is really just to see the trace, but you can, like THAPI has done, do many 16:48.420 --> 16:53.620 more things than that. This is a view of Trace Compass, it's just a really small view 16:53.620 --> 17:04.500 of it, I don't know if we can see, yeah, so at the top we have HIP, so 17:04.500 --> 17:11.820 the two events of HIP: we have the malloc, and we have a memcpy of the memory, and yeah, 17:11.820 --> 17:18.500 and we have MPI under it, and the MPI events are sorted by rank, so we have MPI send, followed 17:18.500 --> 17:25.460 by MPI finalize, that's for rank 0, and for rank 1 we have an event whose name we cannot 17:25.460 --> 17:31.340 see, and we have MPI finalize. And so we can see, in this timeline view, where the 17:31.340 --> 17:40.300 x-axis is actually the time of day, we can kind of correlate between the nodes, 17:40.300 --> 17:46.060 the communication between the nodes, in that way. So this is just a trivial view, but we 17:46.060 --> 17:51.860 can have much more detail if we want. So the challenges of tracing an HPC cluster, 17:51.860 --> 17:59.260 basically, are that there's a really large volume of data, and this imposes execution 17:59.260 --> 18:03.620 overhead when you want to do tracing; there's also a lot of memory footprint and bandwidth 18:03.620 --> 18:09.420 associated with that, and also high throughput demands, for example on the networking 18:09.420 --> 18:14.620 and also on the storage of the data itself. At the end, like Mathieu mentioned, there's 18:14.620 --> 18:19.260 a waiting time between the trace generation and its visualization for analysis results, 18:19.260 --> 18:26.140 so if you have a 12-hour run of something, you don't want
to spend another day to wait 18:26.140 --> 18:34.140 for visualizing the results, so pipelining is a way to circumvent that. And then there's 18:34.140 --> 18:40.060 also the interactivity of trace visualization at scale, so when you have a lot of data, 18:40.060 --> 18:46.660 if you just zoom in on the trace to see a more detailed view and the view is lagging, it's not 18:46.660 --> 18:51.540 very interesting for the users, so that's another challenge. And also there's the precision 18:51.540 --> 18:56.500 of trace correlation across hosts; well, Mathieu mentioned that we already support that, 18:56.500 --> 19:04.740 but maybe the correlation could be better in some cases. And so for the future work that we plan, 19:04.740 --> 19:12.660 we want to improve the granularity of coverage. So for example, we use Clang to 19:12.660 --> 19:17.300 basically parse the header files in C, but C semantics are not really good for that: they don't 19:17.380 --> 19:22.180 express what is an input and what is an output parameter, for example, so we cannot assume that we can 19:22.180 --> 19:29.460 access this memory on the input or on the output, so basically we're blind in some cases. So using 19:29.460 --> 19:36.980 API annotations instead, we could, for example, yeah, we could annotate the function arguments as 19:36.980 --> 19:43.300 input/output or tagged unions. And we also have the libside project; it's an API defining an 19:43.300 --> 19:48.660 extensible type system for instrumentation, with compound types, and it's 19:48.660 --> 19:58.660 runtime- and language-agnostic. Yeah, I'll skip that. That's all. Questions? 20:14.180 --> 20:22.020 [Audience question, partly inaudible] Do you have some kind of API so that you could implement 20:22.020 --> 20:28.180 tracing for an arbitrary programming language, or add custom processing? Back in 20:28.180 --> 20:34.260 the day I did something like that in Python, and it turned out that it was 20:34.260 --> 20:40.580 really crucial; you can do a lot of interesting things if you have this kind of API, 20:40.580 --> 20:45.620 for example tracing only in a part of the program that you're really interested in 20:46.660 --> 20:52.580 while not tracing everywhere else. So 20:52.580 --> 21:01.300 is there one? Yeah, so there are C APIs to control tracing, yes. So we provide APIs 21:01.300 --> 21:07.540 to control tracing, but as well as for instrumentation, right, of various languages. So the answer is 21:08.500 --> 21:17.620 we have some, but I would say the main interfaces for trace control are either the command line or the C 21:17.620 --> 21:24.180 and C++ APIs, but then we can do bindings, and I think we have Python bindings, so you could 21:24.180 --> 21:29.700 create your own. For the instrumentation, the direction I'm going is the libside project, 21:29.700 --> 21:37.380 and I'm basically defining an API across runtimes and languages for that, so that's coming. 21:37.380 --> 21:42.740 So currently, the tracepoints of LTTng-UST are really more C-oriented, so you need to do bindings. 21:42.740 --> 21:45.780 Yeah, thank you very much. You're welcome. Another question? Yeah. 21:57.460 --> 22:05.380 So the question is, in the context of GPU profiling, with profiling and tracing, do you mean tracing 22:05.380 --> 22:11.700 what is happening on the GPU, or the dispatch?
So what we instrument at the moment, the main focus 22:11.700 --> 22:16.980 initially was really what is happening on the CPU controlling the dispatch of stuff to the GPU. 22:17.700 --> 22:23.060 That includes, however, the kernel dispatch, so then you know which kernels have completed running 22:23.060 --> 22:27.780 and when they've started, so that provides extremely useful information. So that's the level we 22:27.780 --> 22:33.220 currently operate at, but there's currently research at Polytechnique trying to add instrumentation directly 22:33.300 --> 22:39.780 in the GPU to extract traces from a GPU ring buffer consumed by the CPU, but that's further work. 22:41.220 --> 22:42.020 Thanks. 22:45.620 --> 22:54.420 [Audience question, partly inaudible] I've used the Chrome tracing setup with annotations; how does 22:54.500 --> 23:07.940 your trace format compare with that? So your question is, how does our trace format, CTF, compare 23:07.940 --> 23:16.980 to Chrome tracing? As I recall, and I hope I'm not wrong, it's based on protobuf, the Chrome tracing format, 23:16.980 --> 23:24.260 as I recall, or is it JSON-based? Yeah. Okay, so in that case, there's a limitation 23:24.500 --> 23:31.380 on how big the traces can be; that was one of the core limitations. CTF, we can scale 23:31.380 --> 23:37.940 to huge, huge traces, and they're binary traces, self-described with the metadata. So basically, 23:37.940 --> 23:42.660 we get the best of both worlds. We have structured data types, but it's binary, so it's much faster 23:42.660 --> 23:48.100 and much more compact. And I think that's it. Thank you. Yeah, that's all the time we have, thank 23:54.420 --> 23:56.420 you.