WEBVTT
00:00.000 --> 00:11.000 Alright, our next talk is "Porting a game engine renderer to Vulkan as an absolute beginner"
00:11.000 --> 00:15.000 by Dr. Karol Suprynowicz.
00:15.000 --> 00:30.000 Thank you. So hello everyone, and in this talk I wanted to show, but maybe
00:30.000 --> 00:37.000 [unclear] can we try that one?
00:38.000 --> 00:45.000 So I wanted to show my work on porting a game engine to Vulkan.
00:45.000 --> 00:52.000 This will be a very interesting perspective, because I didn't have any previous experience in Vulkan at all
00:52.000 --> 00:56.000 and also no experience with professional game engines.
00:56.000 --> 01:03.000 So I did some amateur programming in OpenGL earlier, but nothing in Vulkan.
01:03.000 --> 01:09.000 And the purpose of this talk is that probably right now there are a lot of
01:09.000 --> 01:14.000 open source projects that are using OpenGL.
01:14.000 --> 01:21.000 And there are some difficulties due to that; I will describe them in a moment.
01:21.000 --> 01:27.000 [unclear]
01:27.000 --> 01:34.000 And also they would benefit from porting to Vulkan.
01:34.000 --> 01:41.000 There isn't really that much about it.
01:41.000 --> 01:44.000 So I will talk about my experience.
01:44.000 --> 01:51.000 So first, the game engine I was porting to Vulkan is for a game called Overte.
01:51.000 --> 01:58.000 And that's an open source social VR game, or social virtual worlds game,
01:58.000 --> 02:03.000 because it has both desktop mode and VR mode.
02:03.000 --> 02:07.000 And the game uses a completely custom engine, which is very nicely designed,
02:07.000 --> 02:13.000 but it was supporting only OpenGL and OpenGL ES.
02:14.000 --> 02:19.000 And from the start, it looks like the engine was designed to have multiple backends.
02:19.000 --> 02:25.000 It was done in a very clever way; I will show that in a moment.
02:25.000 --> 02:31.000 So first of all, OpenGL has a very long history.
02:31.000 --> 02:36.000 It was first designed in the 1980s.
02:36.000 --> 02:38.000 It was called IRIS GL at first.
02:38.000 --> 02:43.000 And then in the 1990s, I think in 1993, OpenGL was released.
02:43.000 --> 02:49.000 And of course it evolved over the years.
02:49.000 --> 02:55.000 But the problem was that a lot of cruft remained from the previous,
02:55.000 --> 03:00.000 completely different GPU programming paradigms.
03:00.000 --> 03:06.000 So the biggest problem with OpenGL was that it's single-threaded.
03:06.000 --> 03:08.000 And it's also state-based.
03:08.000 --> 03:12.000 So for example, you've probably seen commands like glColor,
03:12.000 --> 03:14.000 where you set a state.
03:14.000 --> 03:17.000 And then later commands reuse that.
03:17.000 --> 03:21.000 And it's also very high level, which, surprisingly,
03:21.000 --> 03:23.000 is also a problem.
03:23.000 --> 03:27.000 So it's simple from the programmer's perspective.
03:27.000 --> 03:32.000 You can write a program that will render, for example, a spinning triangle
03:32.000 --> 03:36.000 in a few lines of code.
03:36.000 --> 03:41.000 But the problem is that all of that complexity is hidden in drivers.
03:41.000 --> 03:46.000 And that causes problems when, for example, drivers
03:46.000 --> 03:50.000 behave in different ways, or underperform.
03:50.000 --> 03:57.000 Vulkan, on the other hand, is a modern API for programming graphics cards.
03:57.000 --> 04:01.000 And it's based on a previous API by AMD
04:01.000 --> 04:04.000 that was called Mantle.
04:04.000 --> 04:10.000 And it's designed very much in the way you would expect today.
04:10.000 --> 04:15.000 So you can program in a multithreaded way.
04:15.000 --> 04:21.000 There is no internal state.
04:21.000 --> 04:24.000 For example, you can build command buffers on separate threads.
04:24.000 --> 04:30.000 You can utilize multi-core CPUs very efficiently thanks to that.
04:30.000 --> 04:33.000 And it's also very low level.
04:33.000 --> 04:40.000 So it's very close to how the GPU actually works.
04:40.000 --> 04:42.000 And now the benefits.
04:42.000 --> 04:46.000 So for us, probably a big problem with OpenGL
04:46.000 --> 04:49.000 was that we had trouble with compatibility.
04:49.000 --> 04:53.000 So on Linux, Mesa drivers are amazing.
04:53.000 --> 04:58.000 So for example, on AMD on Linux, and on Intel, it was very good.
04:58.000 --> 05:03.000 But on Windows, there were times that the game was totally
05:03.000 --> 05:05.000 unusable
05:05.000 --> 05:08.000 on AMD, due to the drivers.
05:08.000 --> 05:13.000 And on Nvidia, there were also often performance issues.
05:13.000 --> 05:19.000 And also on macOS, where OpenGL got deprecated already in 2018,
05:19.000 --> 05:24.000 it was also impossible to run the game due to broken OpenGL drivers.
05:24.000 --> 05:27.000 So that was one problem.
05:27.000 --> 05:31.000 Another thing is that this is a social VR game.
05:31.000 --> 05:37.000 It's a very high performance application.
05:37.000 --> 05:43.000 So there we had trouble with,
05:43.000 --> 05:49.000 for example, OpenGL not utilizing the GPU correctly,
05:49.000 --> 05:51.000 especially if there were lots of draw calls.
05:51.000 --> 05:54.000 Then there was a serious performance loss.
05:54.000 --> 05:59.000 And also in VR applications, it's very important to avoid
05:59.000 --> 06:00.000 stutter.
06:00.000 --> 06:02.000 [unclear]
06:02.000 --> 06:05.000 So things like uploading textures to the GPU
06:05.000 --> 06:06.000 cause stutter.
06:06.000 --> 06:09.000 So that can be eliminated with Vulkan,
06:09.000 --> 06:13.000 due to multithreaded programming.
06:13.000 --> 06:17.000 Then another thing was, it's a pretty big renderer.
06:17.000 --> 06:21.000 We had trouble with bugs in the renderer.
06:21.000 --> 06:25.000 And that's where Vulkan validation layers help
06:25.000 --> 06:26.000 a lot.
06:26.000 --> 06:29.000 I will talk more about them later.
06:29.000 --> 06:32.000 And one more thing was, in OpenGL,
06:32.000 --> 06:36.000 you always provide shaders in GLSL format.
06:36.000 --> 06:43.000 And then they get compiled to a GPU-specific format at runtime.
06:43.000 --> 06:49.000 And the problem was that the game was taking a lot of time to start.
06:49.000 --> 06:53.000 It was sometimes up to two minutes on OpenGL.
06:53.000 --> 06:55.000 On Vulkan, it starts pretty much instantly,
06:55.000 --> 07:00.000 because on Vulkan, you have a kind of binary representation
07:00.000 --> 07:02.000 that is much closer to,
07:02.000 --> 07:07.000 and easier to translate to, the GPU native format.
07:07.000 --> 07:10.000 So now the challenges.
07:10.000 --> 07:16.000 So probably the biggest problem with Vulkan is that it's totally different
07:17.000 --> 07:19.000 from, of course, OpenGL.
07:19.000 --> 07:22.000 And there is a lot of complexity.
07:22.000 --> 07:27.000 So there are a lot of concepts such as render passes, pipelines,
07:27.000 --> 07:31.000 descriptors; there are a lot of things happening there.
07:31.000 --> 07:34.000 And part of that complexity
07:34.000 --> 07:38.000 is also due to how low level Vulkan is.
07:38.000 --> 07:41.000 So memory management
07:41.000 --> 07:44.000 also needs to be done completely manually.
07:44.000 --> 07:46.000 And so you need to track all the resources.
07:46.000 --> 07:49.000 And you cannot even
07:49.000 --> 07:51.000 allocate memory just for a single resource;
07:51.000 --> 07:53.000 you have to allocate bigger buffers,
07:53.000 --> 07:57.000 because there is a maximum number of separate allocations you can make.
07:57.000 --> 07:59.000 So that's also a big problem.
07:59.000 --> 08:02.000 And the last one is supporting custom shaders.
08:02.000 --> 08:05.000 So the game supports custom shaders.
08:05.000 --> 08:07.000 And that's a bit tricky
08:07.000 --> 08:10.000 in Vulkan, since then you would have to bundle
08:10.000 --> 08:15.000 a shader compiler with the game to be able to download
08:15.000 --> 08:18.000 shaders at runtime.
08:18.000 --> 08:26.000 So to learn Vulkan, there are a lot of different resources.
08:26.000 --> 08:28.000 And the ones I can recommend are, of course,
08:28.000 --> 08:31.000 first, the official resources for Vulkan.
08:31.000 --> 08:35.000 There are tutorials, samples, and so on.
08:35.000 --> 08:39.000 And then another interesting resource is Vulkan Guide.
08:39.000 --> 08:41.000 And that one is very interesting.
08:41.000 --> 08:44.000 I think it's written by a game developer.
08:44.000 --> 08:47.000 And it's very focused on game development.
08:47.000 --> 08:51.000 And this one was useful because it had a lot of
08:51.000 --> 08:52.000 practical advice.
08:52.000 --> 08:54.000 For example, how to store per-frame
08:54.000 --> 08:58.000 data, how to manage objects for the renderer.
08:58.000 --> 09:00.000 So it's amazingly useful.
09:00.000 --> 09:03.000 And the last one is also amazing.
09:03.000 --> 09:06.000 So there are a whole lot of examples by
09:06.000 --> 09:07.000 Sascha Willems.
09:07.000 --> 09:11.000 And with those, it's very nice to start learning,
09:11.000 --> 09:13.000 because normally in Vulkan
09:13.000 --> 09:15.000 you have to spend a lot of time on
09:15.000 --> 09:18.000 boilerplate code for initializing things,
09:18.000 --> 09:20.000 so for choosing the GPU, and so on,
09:20.000 --> 09:23.000 and for enabling the extensions you need.
09:23.000 --> 09:26.000 And with this, you can just download the samples
09:26.000 --> 09:29.000 and you can immediately start experimenting with them.
09:29.000 --> 09:32.000 And even better, since the license for these is MIT,
09:32.000 --> 09:34.000 you can reuse them in your code.
09:34.000 --> 09:37.000 And when I started working on
09:37.000 --> 09:39.000 the Vulkan backend for Overte,
09:39.000 --> 09:44.000 I just used the boilerplate code
09:44.000 --> 09:45.000 from here.
09:45.000 --> 09:51.000 Okay, so another tool is RenderDoc.
09:51.000 --> 09:54.000 So you can capture frames and step through them,
09:54.000 --> 09:57.000 and it's awesome, amazing.
09:57.000 --> 10:01.000 And when you have some sort of rendering issues,
10:02.000 --> 10:05.000 you can even edit shaders
10:05.000 --> 10:08.000 interactively in the captured frame, inside RenderDoc,
10:08.000 --> 10:12.000 and see, for example, if you can fix the issue by editing them.
10:12.000 --> 10:16.000 And here there is a screenshot from RenderDoc
10:16.000 --> 10:18.000 running here;
10:18.000 --> 10:21.000 it's a capture of a frame from our
10:21.000 --> 10:23.000 Vulkan renderer.
10:25.000 --> 10:28.000 And it's also possible to analyze performance.
10:28.000 --> 10:31.000 So you can see here a duration in
10:31.000 --> 10:34.000 microseconds for each call.
10:37.000 --> 10:39.000 And you can also analyze pipeline state.
10:39.000 --> 10:42.000 And the way I was using it here,
10:42.000 --> 10:45.000 I was taking frame captures from Vulkan and from OpenGL
10:45.000 --> 10:48.000 and then comparing them side by side
10:48.000 --> 10:52.000 to see if the Vulkan one is working correctly.
10:52.000 --> 10:55.000 And the next part: what is amazing about Vulkan
10:55.000 --> 10:57.000 is validation layers.
10:57.000 --> 11:00.000 And the way they work is, you run the program
11:00.000 --> 11:02.000 normally, but you enable validation layers.
11:02.000 --> 11:06.000 And then they will point out pretty much all the issues
11:06.000 --> 11:08.000 with the program.
11:08.000 --> 11:12.000 Here is an example of one such
11:12.000 --> 11:15.000 validation layer message. And of course, with the debugger,
11:15.000 --> 11:17.000 you can put breakpoints on them.
11:17.000 --> 11:20.000 So then it's even easier to locate where in the program
11:20.000 --> 11:22.000 a given issue is happening.
11:22.000 --> 11:24.000 And that's amazing.
11:24.000 --> 11:27.000 It helped a lot; maybe half of the work
11:27.000 --> 11:30.000 was done just by fixing validation layer errors.
11:30.000 --> 11:32.000 And then, when those were fixed, in general
11:32.000 --> 11:35.000 things immediately worked.
11:35.000 --> 11:38.000 Vulkan Memory Allocator.
11:38.000 --> 11:41.000 So that's a very helpful library.
11:41.000 --> 11:43.000 It's a single-header library
11:43.000 --> 11:46.000 that manages memory allocation for you.
11:46.000 --> 11:50.000 And it's used by very big game studios.
11:50.000 --> 11:53.000 For example, I know that it's used at Ubisoft.
11:53.000 --> 11:57.000 And it's also used by open source projects,
11:57.000 --> 12:00.000 for example, by the Dolphin emulator.
12:00.000 --> 12:04.000 And it also has very nice features for debugging memory
12:04.000 --> 12:06.000 usage.
12:06.000 --> 12:10.000 And then, the next part is, like most renderers,
12:10.000 --> 12:13.000 our renderer also
12:13.000 --> 12:17.000 had a so-called render hardware interface.
12:17.000 --> 12:21.000 And that means that it had a
12:21.000 --> 12:24.000 set of intermediate instructions.
12:24.000 --> 12:27.000 And it first builds the frame in that intermediate
12:27.000 --> 12:28.000 set of instructions
12:28.000 --> 12:31.000 that can be interpreted by different rendering backends.
12:31.000 --> 12:35.000 And then different backends can interpret it.
12:35.000 --> 12:39.000 And the cool part about it is that it can be serialized to a file.
12:39.000 --> 12:42.000 So then you have that intermediate format
12:42.000 --> 12:44.000 serialized to a file,
12:44.000 --> 12:46.000 and you can replay it in different backends.
12:46.000 --> 12:49.000 So, for example, we could run a game in OpenGL,
12:49.000 --> 12:53.000 save that intermediate frame to a file,
12:53.000 --> 12:56.000 and then replay it in the Vulkan renderer.
12:56.000 --> 13:04.000 And that, for example, helps a lot with porting to new graphics APIs.
13:04.000 --> 13:08.000 And here there's the set of instructions
13:08.000 --> 13:10.000 for the render hardware interface.
13:10.000 --> 13:13.000 So, as you see, the interface is simple;
13:13.000 --> 13:15.000 there are just 60 instructions in total.
13:15.000 --> 13:17.000 And a lot of them are kind of repetitive.
13:17.000 --> 13:21.000 You can see, for example, those glUniform ones at the end.
13:21.000 --> 13:25.000 So the porting work is mostly
13:25.000 --> 13:26.000 porting those instructions.
13:26.000 --> 13:30.000 And then also
13:30.000 --> 13:34.000 adding backend-side representations of all the objects,
13:34.000 --> 13:37.000 like meshes, framebuffers, and so on.
13:37.000 --> 13:39.000 And that's what the rendering backend does.
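
NOTE A minimal sketch of the render hardware interface idea described above: a frame is recorded as a list of backend-neutral instructions, serialized to text that could be written to a file, and replayed by more than one backend. The instruction names (`clear`, `bind_mesh`, `draw`), the `Command` and `LoggingBackend` classes, and the JSON encoding are all assumptions for illustration, not the engine's actual instruction set or file format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Command:
    """One backend-neutral instruction (hypothetical names, not the engine's)."""
    op: str
    args: dict


class LoggingBackend:
    """Stand-in for a real backend: records what it would ask the GPU to do."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def execute(self, cmd):
        # Dispatch on the instruction name; unknown instructions are an error.
        handler = getattr(self, "op_" + cmd.op, None)
        if handler is None:
            raise ValueError(f"{self.name}: unsupported instruction {cmd.op!r}")
        handler(**cmd.args)

    def op_clear(self, color):
        self.calls.append(("clear", tuple(color)))

    def op_bind_mesh(self, mesh):
        self.calls.append(("bind_mesh", mesh))

    def op_draw(self, vertex_count):
        self.calls.append(("draw", vertex_count))


def serialize(frame):
    """Turn a recorded frame into text that can be written to a file."""
    return json.dumps([asdict(cmd) for cmd in frame])


def deserialize(text):
    """Rebuild the list of instructions from its serialized form."""
    return [Command(**item) for item in json.loads(text)]


def replay(frame, backend):
    """Feed every recorded instruction to the given backend, in order."""
    for cmd in frame:
        backend.execute(cmd)


# Record one frame, as if captured while running on the old backend.
frame = [
    Command("clear", {"color": [0.0, 0.0, 0.0, 1.0]}),
    Command("bind_mesh", {"mesh": "avatar"}),
    Command("draw", {"vertex_count": 36}),
]
saved = serialize(frame)

# Replay the same saved frame in two different backends.
old_backend = LoggingBackend("opengl")
new_backend = LoggingBackend("vulkan")
replay(deserialize(saved), old_backend)
replay(deserialize(saved), new_backend)
```

In this sketch a new backend only has to provide one handler per instruction, which mirrors the point that the porting work is mostly porting the instructions.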
13:39.000 --> 13:42.000 So the rendering backend interprets the
13:42.000 --> 13:44.000 render hardware interface commands.
13:44.000 --> 13:51.000 It also creates backend-side representations
13:51.000 --> 13:57.000 of render objects, like framebuffers, textures, and so on.
13:57.000 --> 14:01.000 And it also manages [unclear] loading,
14:01.000 --> 14:04.000 and it generates Vulkan command buffers,
14:04.000 --> 14:08.000 which are later submitted to the GPU to render.
14:08.000 --> 14:12.000 Then another thing that is very different from OpenGL
14:12.000 --> 14:15.000 is that on OpenGL, you just bind shaders
14:15.000 --> 14:18.000 and the rest of the state, and then render.
14:18.000 --> 14:20.000 And Vulkan is more complicated,
14:20.000 --> 14:22.000 because that would be very inefficient.
14:22.000 --> 14:24.000 And so on Vulkan, you have pipelines,
14:24.000 --> 14:27.000 and a pipeline describes
14:27.000 --> 14:31.000 the whole setup for a given draw call.
14:31.000 --> 14:34.000 And then you can reuse pipelines from frame to frame,
14:34.000 --> 14:38.000 and you can even save them to a file and then recover them later.
14:38.000 --> 14:45.000 So the last part of that was making that pipeline cache.
14:45.000 --> 14:49.000 And other than that, there is also object lifetime management.
14:49.000 --> 14:53.000 So whereas in OpenGL that is done automatically,
14:53.000 --> 14:58.000 here you have to do that manually on your side.
14:58.000 --> 15:03.000 And you have to have special objects
15:03.000 --> 15:06.000 that hold shared pointers
15:06.000 --> 15:08.000 to the objects used by a given frame.
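
NOTE A minimal sketch of the lifetime idea described above, assuming an explicit reference count in place of C++ shared pointers: each in-flight frame holds a reference to every object it uses, so an object released by the scene is only destroyed once the last frame using it has finished on the GPU. `GpuObject`, `FrameInFlight`, and `finish` are hypothetical names, not the engine's classes.

```python
class GpuObject:
    """A texture, buffer, etc. with an explicit reference count."""

    def __init__(self, name):
        self.name = name
        self.refs = 0
        self.destroyed = False

    def acquire(self):
        self.refs += 1
        return self

    def release(self):
        self.refs -= 1
        if self.refs == 0:
            # Last owner is gone: only now is it safe to free the GPU resource.
            self.destroyed = True


class FrameInFlight:
    """Keeps every object used by one frame alive until the GPU is done with it."""

    def __init__(self):
        self.used = []

    def use(self, obj):
        self.used.append(obj.acquire())

    def finish(self):
        # Called once the GPU has finished executing this frame's commands.
        for obj in self.used:
            obj.release()
        self.used.clear()


texture = GpuObject("skybox").acquire()  # reference held by the scene
frame = FrameInFlight()
frame.use(texture)                       # the frame's commands use the texture

texture.release()                        # the scene deletes the texture...
alive_while_in_flight = not texture.destroyed  # ...but the frame still needs it

frame.finish()                           # the GPU has finished the frame
destroyed_after_frame = texture.destroyed
```

The point of the sketch is the ordering: dropping the scene's reference alone does not destroy the texture, and destruction happens only when the frame that used it retires.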
15:08.000 --> 15:10.000 And then there needs to be a recycler object,
15:10.000 --> 15:13.000 because if the objects are deleted on a different thread,
15:13.000 --> 15:16.000 then they need to go to the recycler, and then
15:16.000 --> 15:19.000 get properly [unclear]; this is what it looks like
15:19.000 --> 15:21.000 inside [unclear].
15:21.000 --> 15:24.000 Oh, yes.
15:24.000 --> 15:27.000 Okay, and one more thing: of course,
15:27.000 --> 15:30.000 it's good to avoid writing shaders several times.
15:30.000 --> 15:33.000 But the problem is that, while you can, of course,
15:33.000 --> 15:36.000 write GLSL for Vulkan
15:36.000 --> 15:39.000 separately from OpenGL, it would be kind of problematic
15:39.000 --> 15:42.000 to maintain several sets of shaders.
15:42.000 --> 15:45.000 And so the way we did that is,
15:45.000 --> 15:48.000 there are separate headers, and there are also
15:48.000 --> 15:51.000 preprocessor macros for those.
15:51.000 --> 15:55.000 And the way it works is, on Vulkan,
15:55.000 --> 15:58.000 you have descriptor sets for different things.
15:58.000 --> 16:01.000 So, for example, we used separate descriptor sets
16:01.000 --> 16:04.000 for textures, uniform buffers,
16:04.000 --> 16:06.000 and for storage buffers.
16:06.000 --> 16:08.000 And on OpenGL, you don't have those.
16:08.000 --> 16:11.000 And so, just by preprocessing the files,
16:11.000 --> 16:15.000 it is possible to
16:15.000 --> 16:18.000 prepare them, if they are given the right
16:18.000 --> 16:20.000 definitions.
16:20.000 --> 16:23.000 Okay, and other than that, there is still a lot of
16:23.000 --> 16:26.000 porting remaining, but right now [unclear],
16:26.000 --> 16:28.000 and it's also working quite nicely.
16:29.000 --> 16:31.000 In something like [unclear] scenes,
16:31.000 --> 16:33.000 we are getting two and a half times
16:33.000 --> 16:34.000 the performance, and that's
16:34.000 --> 16:37.000 with a very simple approach right now.
16:37.000 --> 16:40.000 And yet there is a lot of optimization
16:40.000 --> 16:42.000 remaining. So first, we still have
16:42.000 --> 16:44.000 a lot of stutter when textures load.
16:44.000 --> 16:46.000 So the texture upload needs to be done
16:46.000 --> 16:48.000 on a separate transfer queue
16:48.000 --> 16:49.000 and a separate thread.
16:49.000 --> 16:53.000 And then there is more work on memory
16:53.000 --> 16:54.000 management needed for
16:54.000 --> 16:57.000 GPUs with a limited amount of video RAM,
16:57.000 --> 16:59.000 and then loading of mipmaps,
16:59.000 --> 17:01.000 so starting from, for example,
17:01.000 --> 17:03.000 low resolution textures,
17:03.000 --> 17:05.000 so it's possible to render immediately.
17:05.000 --> 17:07.000 And then
17:07.000 --> 17:10.000 we have to stream the rest of them also.
17:10.000 --> 17:12.000 There is still [unclear] to port,
17:12.000 --> 17:15.000 but that should probably be pretty simple.
17:15.000 --> 17:18.000 And then later, for standalone devices,
17:18.000 --> 17:20.000 especially headsets
17:20.000 --> 17:23.000 with eye tracking, variable rate
17:23.000 --> 17:25.000 shading will also
17:25.000 --> 17:27.000 help performance a lot.
17:27.000 --> 17:29.000 And that's specifically a Vulkan
17:29.000 --> 17:31.000 thing, so that's another benefit.
17:31.000 --> 17:35.000 So, do you have questions?
17:35.000 --> 17:37.000 Thanks.
17:37.000 --> 17:39.000 Thank you.
17:54.000 --> 17:56.000 I can't hear you well.
17:56.000 --> 17:58.000 Uh-huh.
18:02.000 --> 18:03.000 So, thank you.
18:03.000 --> 18:06.000 I guess I'm wondering, when doing something
18:06.000 --> 18:08.000 like porting a game engine renderer,
18:08.000 --> 18:10.000 how do you approach testing
18:10.000 --> 18:12.000 to make sure that rewriting things
18:12.000 --> 18:14.000 doesn't introduce regressions?
18:14.000 --> 18:16.000 Ah, so for that, a
18:16.000 --> 18:19.000 very useful part was that serialization,
18:19.000 --> 18:22.000 serializing the frame to a file,
18:22.000 --> 18:24.000 because then you can have a special tool.
18:24.000 --> 18:26.000 We have a tool called GPU frame player,
18:26.000 --> 18:28.000 and then you can replay the
18:28.000 --> 18:30.000 same file,
18:30.000 --> 18:32.000 the same intermediate format file
18:32.000 --> 18:34.000 for the frame, in,
18:34.000 --> 18:35.000 for example,
18:35.000 --> 18:36.000 Vulkan or in different backends,
18:36.000 --> 18:38.000 and then you can compare it,
18:38.000 --> 18:39.000 for example, to see if something
18:39.000 --> 18:40.000 is missing or something
18:40.000 --> 18:41.000 isn't rendering correctly.
18:41.000 --> 18:43.000 So there's a special tool for that.
18:43.000 --> 18:45.000 And for that, it's very important
18:45.000 --> 18:46.000 to have that
18:46.000 --> 18:49.000 serialization of
18:49.000 --> 18:50.200 RHI
18:50.200 --> 18:52.000 events and the frame.
18:52.000 --> 18:54.000 Thank you.
18:54.000 --> 18:54.960 Thank you.
18:55.960 --> 18:56.640 Thank you.
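
NOTE A minimal sketch of how two replays of the same serialized frame could be compared automatically, assuming each backend's output is available as rows of RGB tuples. The function name `count_mismatched_pixels`, the `tolerance` parameter, and the tiny images are hypothetical; the talk does not say how the comparison is actually performed.

```python
def count_mismatched_pixels(image_a, image_b, tolerance=2):
    """Count pixels where any channel differs by more than `tolerance`."""
    if len(image_a) != len(image_b):
        raise ValueError("images have different heights")
    mismatched = 0
    for row_a, row_b in zip(image_a, image_b):
        if len(row_a) != len(row_b):
            raise ValueError("images have different widths")
        for pixel_a, pixel_b in zip(row_a, row_b):
            if any(abs(a - b) > tolerance for a, b in zip(pixel_a, pixel_b)):
                mismatched += 1
    return mismatched


# Two 2x2 frames: the reference backend and the backend being ported.
reference = [[(255, 0, 0), (0, 255, 0)],
             [(0, 0, 255), (10, 10, 10)]]
ported = [[(254, 0, 0), (0, 255, 0)],   # off by one: within tolerance
          [(0, 0, 255), (0, 0, 0)]]     # missing object: a real difference

mismatches = count_mismatched_pixels(reference, ported)
```

A small tolerance is used here because two backends can legitimately round colors slightly differently; a non-zero mismatch count then points at something missing or rendered incorrectly.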