WEBVTT 00:00.000 --> 00:15.120 So I'm here because, well, there are too many people out there saying, hey, you should rewrite 00:15.120 --> 00:17.480 your project in Rust. 00:17.480 --> 00:24.720 And I am maintaining a project in C, which is like 25 years old, and I am saying no, and 00:24.720 --> 00:30.720 now I'm going to say why, or maybe why I'm not going to rewrite my project in Rust. 00:30.720 --> 00:38.920 I am not going to be as funny as the one before me, it's impossible. 00:38.920 --> 00:47.280 So if you are expecting something better, you can just go away, there is nothing to see here. 00:47.280 --> 00:52.160 Anyway, if you are not scared, yeah, the quest is simple. 00:52.160 --> 01:04.160 We have a many-years-old C code base, which was written in like 1998, 99 in ISO C, and so 01:04.160 --> 01:09.640 we are expecting to keep this because it runs, and rewriting this completely into 01:09.640 --> 01:15.120 something else means like introducing more bugs than fixing. 01:15.120 --> 01:22.120 Yes, we do have memory safety concerns, there are lots of people actually saying, 01:22.120 --> 01:32.280 anything written in C is bad because it's memory unsafe, and I am saying, well, it depends. 01:32.280 --> 01:41.680 If it hasn't crashed for like the last 10 years, I am asking you where it is unsafe. 01:41.680 --> 01:47.520 On the other hand, we have been improving it ever since, and in the new code there are some problems, 01:47.520 --> 01:55.360 and we are fixing them continuously, but it's still a piece of software which is rather stable. 01:55.360 --> 02:03.200 So I am asking why should I rewrite something in Rust when it's not broken? 02:03.200 --> 02:07.600 So the requirements are, I want to keep what works. 02:07.600 --> 02:15.720 I don't want to rewrite a whole parser written in C, which has been working for the last 25 years. 02:15.720 --> 02:24.360 I want to automate most of the refactoring I am going to do to make the future updates safe.
02:24.360 --> 02:30.880 I want to lead the developers to the way they should code; it's obvious that you can shoot 02:30.880 --> 02:41.920 your own leg in the most creative ways in C, and I am not going to make you not do it. 02:41.960 --> 02:46.160 I am just going to make it harder for you. 02:46.160 --> 02:53.800 I want to allow all the footguns you want to have, but it will be obvious that you are shooting your leg. 02:53.800 --> 03:02.800 And then on the code review, I'm going to look at it and say, no, that's not going there. 03:02.800 --> 03:10.120 So as I am saying here, it should be hard to write bad code, but I don't like making it impossible 03:10.120 --> 03:18.160 because making the bad code impossible throws out even some good code. 03:18.160 --> 03:24.840 So what I'm expecting to happen with unsafe code is, ideally, obviously, a build error. 03:24.840 --> 03:27.200 It's what Rust has. 03:27.200 --> 03:32.280 Or there can be a static analyzer error, which is something you can just put into your CI, 03:32.280 --> 03:37.440 and then you say, well, it's killed by the static analyzer, so I'm not going to, 03:37.480 --> 03:40.480 I'm not going to include your updates. 03:40.480 --> 03:46.320 It can fail on some unit test error, or it can just stick out. 03:46.320 --> 03:56.960 If I see a typecast in plain sight in your changes, I say no, because you should not typecast, 03:56.960 --> 04:00.560 but more on that later. 04:00.640 --> 04:08.240 And one of those requirements is, if some code is intended to be innocent, 04:08.240 --> 04:16.400 it must actually be innocent, which is a reason why I'm not going to rewrite it in C++. 04:16.400 --> 04:26.160 Because I don't want to study what the plus is doing, whether it's overloaded and where. 04:26.240 --> 04:34.960 So innocent code, the code which looks innocent must also be innocent. 04:34.960 --> 04:39.120 Now the controversial thing, how to do it?
04:39.120 --> 04:43.280 The easy part is what should be the default. 04:43.280 --> 04:46.960 Basically, if you do something, do it locally. 04:46.960 --> 04:52.760 If you can do it locally, don't rely on anything which is global. 04:52.840 --> 04:55.160 You should put const everywhere. 04:55.160 --> 04:59.880 This is something which Rust has done, and it has done it well. 04:59.880 --> 05:07.240 Doing it the opposite way, making everything mutable marked as mutable, 05:07.240 --> 05:12.040 and everything else is just const by default. 05:12.040 --> 05:15.880 There are other things like functions that are not pure. 05:15.880 --> 05:20.840 I am not against things like side effects. 05:20.840 --> 05:26.680 These are legitimate, but they should have a reason. 05:26.680 --> 05:32.520 What I hate actually is like returning things in the arguments. 05:32.520 --> 05:37.800 And so, because it also obscures what you are doing, 05:37.800 --> 05:43.400 but what I am trying to say, the code should say what it's doing. 05:43.400 --> 05:51.560 And if you try to write code which shows what you are doing, 05:51.560 --> 05:54.200 then you should be safe. 05:54.200 --> 06:02.600 If you are writing code which is adhering to specific rules, 06:02.600 --> 06:06.280 it's just adhering to specific rules. 06:06.280 --> 06:11.880 And the last point here, no void pointers anywhere. 06:11.880 --> 06:17.640 If you see any void pointer, just replace it by something else. 06:17.640 --> 06:22.200 There is no single reason to use void pointers. 06:22.200 --> 06:26.360 Apart from things like the return value from the memory allocator. 06:26.360 --> 06:31.240 If you are allocating memory, yes, you get a void pointer, that's okay. 06:31.240 --> 06:35.000 But you should not put a void pointer as an argument 06:35.000 --> 06:41.400 for some object of some size, not at all, never. 06:42.040 --> 06:45.080 But first we have to get rid of globals.
06:45.080 --> 06:47.160 It's typically a context. 06:47.160 --> 06:50.920 You don't have a reason to look at a global variable, 06:50.920 --> 06:55.000 which says how you should, for example, format time. 06:55.000 --> 06:57.800 It's typically a context. 06:57.800 --> 07:00.120 Or it's global information. 07:00.120 --> 07:04.760 It's something like what the page size was. 07:04.760 --> 07:08.520 Well, yes, then it's read-only global information, 07:08.520 --> 07:10.680 and you are not writing it. 07:10.680 --> 07:13.640 Well, yes, then there is really shared data, 07:13.640 --> 07:16.360 and you probably have to do some locked access. 07:16.360 --> 07:21.640 It should be explicit, more on that later. 07:21.640 --> 07:25.480 This is my favorite part. 07:25.480 --> 07:27.560 You should not use void pointers. 07:27.560 --> 07:30.600 And I will say it several times again. 07:30.600 --> 07:32.120 You can use unions. 07:32.120 --> 07:33.720 Unions are good. 07:33.720 --> 07:35.560 Unions are fine. 07:35.560 --> 07:41.720 And you can, now, with C11 and C18 and C23 or C24, 07:41.720 --> 07:46.680 or whatever that is now, you can use anonymous structures 07:46.680 --> 07:49.000 inside unions, inside structures. 07:49.000 --> 07:52.840 Well, yes, it looks ugly in the beginning. 07:52.840 --> 07:59.880 But you can use a structure, put there the type saying which piece 07:59.880 --> 08:03.960 of the union is there, and you can use a macro. 08:03.960 --> 08:10.840 And if you are against macros, don't try it in C. 08:10.840 --> 08:14.360 C is a language with macros. 08:14.360 --> 08:17.240 And if the macro preprocessor, if the C preprocessor 08:17.240 --> 08:21.400 is not enough for you, hello M4. 08:21.400 --> 08:27.800 Is there anybody who has written some M4 code in like the last 10 years? 08:27.800 --> 08:32.280 Yes, I have written like a thousand lines of M4, 08:32.280 --> 08:35.720 and it has a big thick warning at the beginning.
08:35.720 --> 08:37.480 Do not read this code. 08:37.480 --> 08:40.440 Do not continue after this line. 08:40.440 --> 08:41.960 Unless you want to get brain damage. 08:45.240 --> 08:46.920 And I am not kidding. 08:46.920 --> 08:48.760 It actually is in the code. 08:48.760 --> 08:49.480 You can Google it. 08:49.480 --> 08:52.520 You can find it. 08:52.520 --> 08:54.520 You can have generated code. 08:54.520 --> 08:59.400 You can have a linked list type for every single thing 08:59.400 --> 09:02.120 you want to put into the linked list. 09:02.120 --> 09:04.760 So if you have one structure and these structures 09:04.760 --> 09:08.200 are put into the linked list, you have a linked list 09:08.200 --> 09:10.040 of these structures as a type. 09:10.040 --> 09:12.840 And then you have another structure and you put 09:12.840 --> 09:14.360 that structure into a linked list. 09:14.360 --> 09:18.440 You have a linked list of that structure as another type. 09:18.440 --> 09:20.920 You have a complete set of functions 09:20.920 --> 09:25.960 manipulating these data structures for every single data type. 09:25.960 --> 09:28.680 Yes, it is greedy. 09:28.680 --> 09:34.120 But we are in the times where the compiler's time 09:34.120 --> 09:39.880 is actually much cheaper than the time you spend 09:39.880 --> 09:42.040 debugging these problems. 09:42.040 --> 09:47.320 So please make the compiler do what the compiler should do. 09:47.320 --> 09:52.120 And yes, this is kind of abusing the type system. 09:52.120 --> 09:53.640 Please do it. 09:53.640 --> 09:57.800 This is one of those things C can do. 09:57.800 --> 10:04.120 And please do not typecast plainly anywhere for any reason. 10:04.120 --> 10:08.600 If you want to typecast, you have to write a macro. 10:08.600 --> 10:13.320 And the macro says what you are intending by doing it, 10:13.320 --> 10:17.880 and not only that, the macro should actually check 10:17.880 --> 10:22.200 that what you put in is what was intended.
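The two ideas above, a generated list type per element type and a cast wrapped in a checking macro, might look roughly like the following sketch. Everything here is an assumption made for illustration: the names `DEFINE_LIST`, `OUTER`, `route` and `iface` are hypothetical and not taken from the speaker's code, and the type check relies only on the rule that both branches of a conditional expression must have compatible pointer types (a `void *` argument would still slip through, which is one more reason to avoid it).

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical and only illustrate the idea. */

/* Generate a dedicated list type plus its functions for one element type. */
#define DEFINE_LIST(name, type)                                          \
  struct name##_node { type value; struct name##_node *next; };          \
  struct name { struct name##_node *head; };                             \
  static inline void name##_push(struct name *l, struct name##_node *n)  \
  { n->next = l->head; l->head = n; }                                    \
  static inline size_t name##_length(const struct name *l)               \
  {                                                                      \
    size_t count = 0;                                                    \
    for (const struct name##_node *n = l->head; n; n = n->next)          \
      count++;                                                           \
    return count;                                                        \
  }

struct route { int prefix_len; };
struct iface { int index; };

/* Two unrelated list types: mixing them up is a compiler diagnostic. */
DEFINE_LIST(route_list, struct route)
DEFINE_LIST(iface_list, struct iface)

/* Checked cast from a pointer to `member` back to the enclosing `type`.
 * The third operand of ?: is never evaluated; it is only there so the
 * compiler complains when `ptr` has the wrong pointer type. */
#define OUTER(type, member, ptr)                                         \
  ((type *) ((char *) (1 ? (ptr) : &((type *) 0)->member)                \
             - offsetof(type, member)))

/* Usage: returns 1 when the list and the checked cast behave as expected. */
static int list_demo(void)
{
  struct route_list routes = { NULL };
  struct route_list_node a = { { 24 }, NULL }, b = { { 32 }, NULL };

  route_list_push(&routes, &a);
  route_list_push(&routes, &b);
  /* route_list_push(&routes, &some_iface_node) would be diagnosed. */

  /* Recover the node from a pointer to its payload, with a type check. */
  struct route *payload = &routes.head->value;
  struct route_list_node *node = OUTER(struct route_list_node, value, payload);

  return route_list_length(&routes) == 2 && node == &b;
}
```

The cost is one generated set of functions per element type, which is the compile-time price the talk argues is worth paying.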
10:22.200 --> 10:23.240 It is possible. 10:23.240 --> 10:27.720 And I don't have it in this presentation, but I have it in the code. 10:28.120 --> 10:32.760 It is possible to check that the specific pointer you are putting in 10:32.760 --> 10:37.000 is actually of the type you are expecting to be put in. 10:37.000 --> 10:40.840 Yes, it is possible to write type checking macros. 10:40.840 --> 10:46.440 And if it makes sense, you should do it. 10:46.440 --> 10:52.680 There are other things one should do, like 10:52.680 --> 10:55.800 managing the locally acquired resources. 10:55.800 --> 11:00.360 Well, you acquire anything and then you return. 11:00.360 --> 11:02.120 Well, no, please. 11:02.120 --> 11:05.160 Relying on explicit releasing 11:05.160 --> 11:06.280 makes some problems. 11:06.280 --> 11:09.400 So you can have some cleanup hooks. 11:09.400 --> 11:11.560 It is not a new feature. 11:11.560 --> 11:15.400 It is a thing not yet standardized in C because 11:15.400 --> 11:21.640 it got into some very nasty problems inside the committee. 11:21.640 --> 11:27.160 But every compiler is okay with cleanup hooks, 11:27.160 --> 11:33.960 which are executed when a variable gets out of scope. 11:33.960 --> 11:35.800 You can have end-of-task hooks. 11:35.800 --> 11:40.600 Hooks that are run just before you enter the poll again. 11:40.600 --> 11:43.400 You can have different lifetimes for different allocation 11:43.400 --> 11:47.720 scopes and you just have to move the data from the local 11:47.720 --> 11:50.520 allocation source to the global allocation source, 11:50.520 --> 11:54.040 basically by explicitly copying, 11:54.040 --> 12:02.280 to make it explicit that now you are writing something global. 12:02.280 --> 12:05.720 And yes, you should mark static and stack-allocated 12:05.720 --> 12:08.520 data, that is the worst one. 12:08.520 --> 12:10.200 At least please mark them. 12:10.200 --> 12:14.680 I don't see many better things.
12:14.760 --> 12:16.680 Yeah, this is some example. 12:16.680 --> 12:21.720 This is how one can use a lock macro. 12:21.720 --> 12:24.680 This basically is a function, it's a dummy function. 12:24.680 --> 12:28.360 It gets a, it gets a size from some table. 12:28.360 --> 12:30.520 And the table has its lock inside. 12:30.520 --> 12:34.200 So to get the information you have to lock. 12:34.200 --> 12:39.560 So there is a macro doing some weird things 12:39.560 --> 12:47.720 with the locking; the thing to look at is the return 12:47.720 --> 12:51.160 inside the locking, and the return is safe. 12:51.160 --> 12:56.760 You can return from the locked context because the lock 12:56.760 --> 12:59.080 is released automatically. 12:59.080 --> 13:05.560 So this means you can just let it go. 13:05.560 --> 13:09.560 The implementation is quite scary. 13:09.560 --> 13:12.520 One has to look through it and read through it. 13:12.520 --> 13:14.440 It takes some time in the beginning. 13:14.440 --> 13:21.640 When one actually gets the grasp of what it is doing, 13:21.640 --> 13:25.640 yes, then you can use it and you can be sure that it works. 13:25.640 --> 13:30.520 And there is, yes, this is one part, which basically 13:30.520 --> 13:35.080 abuses a for cycle to do the block, 13:35.080 --> 13:38.040 to do the block semantics. 13:38.040 --> 13:41.720 And there is also the cleanup function which is called 13:41.720 --> 13:44.760 when the block is left. 13:44.760 --> 13:50.680 And this is basically, you can see, here it is simply doing the 13:50.680 --> 13:53.080 object lock. 13:53.080 --> 13:59.400 And the unlocking is actually done in the cleanup function. 13:59.400 --> 14:03.800 And this is typically put in one place together in some 14:03.800 --> 14:07.560 header file. 14:07.560 --> 14:10.360 Yes, then there are some memory allocation strategies. 14:10.360 --> 14:13.400 You should use what fits your project.
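The slide itself is not part of the transcript, so the lock macro described above can only be sketched under assumptions. The version below combines a `for` statement that runs its body exactly once with the GCC and Clang `cleanup` variable attribute, which is a compiler extension and not ISO C. The names `TABLE_LOCKED`, `table_get_size` and `demo_lock` are hypothetical, and the lock is a plain flag standing in for a real mutex so that the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a real mutex; a real code base would use its own lock. */
struct demo_lock { int held; };

static void demo_lock_acquire(struct demo_lock *l)
{ assert(!l->held); l->held = 1; }

static void demo_lock_release(struct demo_lock *l)
{ assert(l->held); l->held = 0; }

/* Hypothetical table that carries its own lock. */
struct table {
  struct demo_lock lock;
  size_t size;
};

/* Cleanup hook: the compiler calls it with a pointer to the variable
 * when that variable goes out of scope, on every exit path. */
static void table_unlock_hook(struct table **t)
{
  demo_lock_release(&(*t)->lock);
}

static struct table *table_lock(struct table *t)
{
  demo_lock_acquire(&t->lock);
  return t;
}

/* The for statement is abused as a block: the first variable holds the
 * locked table and carries the cleanup hook, the second one makes the
 * body run exactly once. */
#define TABLE_LOCKED(tab, var)                                        \
  for (struct table *var __attribute__((cleanup(table_unlock_hook))) \
           = table_lock(tab), *var##_run = var;                       \
       var##_run; var##_run = NULL)

static size_t table_get_size(struct table *tab)
{
  TABLE_LOCKED(tab, t)
    return t->size;   /* safe: the hook unlocks on the way out */

  return 0;           /* not reached; keeps the compiler quiet */
}

/* Usage: returns 1 when the size was read and the lock was released. */
static int table_demo(void)
{
  struct table tab = { { 0 }, 42 };
  size_t size = table_get_size(&tab);
  return size == 42 && tab.lock.held == 0;
}
```

Keeping the macro, the lock function and the hook together in one header means a reviewer only has to understand the scary part once.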
14:13.400 --> 14:14.200 Do not be mad at me. 14:19.080 --> 14:21.800 In BIRD, what we are using, yeah, I have not 14:21.800 --> 14:23.640 said that, but I'll return to that. 14:23.640 --> 14:27.080 In BIRD, we are using hierarchical pools. 14:27.080 --> 14:29.000 And the pools are keeping track of everything. 14:29.000 --> 14:35.720 So if you walk from a root place, you can traverse all the 14:35.720 --> 14:39.400 allocated memory and show what's allocated and what's 14:39.400 --> 14:40.040 where. 14:40.040 --> 14:41.560 There is temporary allocation. 14:41.560 --> 14:43.400 It just gets freed at the end of the task. 14:43.400 --> 14:46.120 You do the tmp_alloc and then it gets freed. 14:46.120 --> 14:48.920 You don't have to worry. 14:48.920 --> 14:52.920 You can also get some global resources temporarily. 14:52.920 --> 14:54.520 And it's the same principle. 14:54.520 --> 14:58.440 You reference it and immediately schedule a release task 14:58.520 --> 15:01.240 to be done at the end of the task. 15:01.240 --> 15:03.720 So you know, it's safe to store. 15:03.720 --> 15:09.400 It's not safe to, yeah, I will skip, skip, skip. 15:09.400 --> 15:11.240 Yes, you can see it in BIRD. 15:11.240 --> 15:13.480 You can see it in libucw. 15:13.480 --> 15:14.760 This is myself. 15:14.760 --> 15:15.320 Thank you. 15:15.320 --> 15:29.880 Thank you.
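The temporary allocation that is freed at the end of the task could be sketched as follows. This is a minimal illustration of the principle under stated assumptions, not BIRD's actual allocator: the names `task_ctx`, `task_tmp_alloc` and `task_end` are hypothetical, each chunk comes straight from `malloc`, and the context is passed explicitly instead of living in a global.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names; a sketch of the principle, not BIRD's real API. */

/* Header in front of every temporary chunk; the union keeps the payload
 * that follows suitably aligned for any object type. */
union tmp_chunk {
  union tmp_chunk *next;
  max_align_t align;
};

/* Per-task context passed around explicitly instead of a global. */
struct task_ctx {
  union tmp_chunk *tmp;   /* chunks to free when the task ends */
  size_t tmp_count;       /* number of chunks currently pending */
};

/* Allocate scratch memory that lives until the end of the current task.
 * Returning void * is the one place where a void pointer is expected. */
static void *task_tmp_alloc(struct task_ctx *ctx, size_t size)
{
  if (size > SIZE_MAX - sizeof(union tmp_chunk))
    abort();
  union tmp_chunk *c = malloc(sizeof *c + size);
  if (!c)
    abort();
  c->next = ctx->tmp;
  ctx->tmp = c;
  ctx->tmp_count++;
  return c + 1;
}

/* End-of-task hook: the event loop calls this before polling again. */
static void task_end(struct task_ctx *ctx)
{
  while (ctx->tmp) {
    union tmp_chunk *c = ctx->tmp;
    ctx->tmp = c->next;
    free(c);
    ctx->tmp_count--;
  }
}

/* A task body: uses scratch memory and never frees it by hand. */
static size_t task_body(struct task_ctx *ctx, const char *msg)
{
  size_t len = strlen(msg);
  char *copy = task_tmp_alloc(ctx, len + 1);
  memcpy(copy, msg, len + 1);
  return len;
}

/* Usage: returns 1 when one chunk was pending and then released. */
static int task_demo(void)
{
  struct task_ctx ctx = { NULL, 0 };
  size_t len = task_body(&ctx, "hello");
  int pending = (ctx.tmp_count == 1);
  task_end(&ctx);
  return len == 5 && pending && ctx.tmp_count == 0 && ctx.tmp == NULL;
}
```

Anything that must outlive the task has to be copied explicitly into longer-lived storage before `task_end` runs, which keeps the move from local to global data visible in the code.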