WEBVTT 00:00.000 --> 00:11.640 OK, OK, well, let's get started. 00:11.640 --> 00:16.600 I think you guys already heard from a number of folks involved 00:16.600 --> 00:20.760 in the design and implementation of distributed databases. 00:20.760 --> 00:25.880 And I am going to talk to you about whether 00:25.880 --> 00:30.280 you actually need those distributed databases 00:30.280 --> 00:32.280 in the first place, right? 00:32.280 --> 00:34.840 Or if some bad search come to us. 00:34.840 --> 00:40.040 But let me first maybe define what I mean by distributed 00:40.040 --> 00:41.600 databases, right? 00:41.600 --> 00:44.360 I think that if you look at the database landscape right now, 00:44.360 --> 00:44.520 right? 00:44.520 --> 00:47.640 We can clearly see two different database categories. 00:47.640 --> 00:51.280 There are some databases which were originally 00:51.280 --> 00:52.880 designed for a single node. 00:52.880 --> 00:54.480 MySQL, Postgres, right? 00:54.480 --> 00:58.800 And then there is a different generation of databases 00:58.800 --> 01:03.720 that are designed to be distributed databases from the ground up, right? 01:03.720 --> 01:07.720 And a lot of databases designed for the cloud-native age 01:07.720 --> 01:14.720 are typically distributed databases. 01:14.720 --> 01:18.960 Now, what is the key difference in the approach 01:18.960 --> 01:22.520 to high availability and scalability? 01:22.520 --> 01:25.880 Now, if you have something like a single-node database, 01:25.880 --> 01:30.400 like MySQL, well, if you need high availability, 01:30.400 --> 01:33.080 you have replication, right? 01:33.080 --> 01:35.760 You can scale that by using bigger boxes, 01:35.760 --> 01:38.600 maybe read/write splitting, and that is essentially 01:38.600 --> 01:42.840 what you have, with nodes having complete copies of data. 01:42.840 --> 01:45.720 And then you execute the query on a single node. 
01:45.720 --> 01:48.120 And that's kind of a relatively simple problem. 01:48.120 --> 01:52.400 Now, with distributed databases, we also have a lot of nodes, 01:52.400 --> 01:56.280 but typically we have only partial copies of data 01:56.280 --> 02:00.080 because data can be much, much larger than fits on a single node. 02:00.080 --> 02:02.840 And then you will also have distributed execution. 02:02.840 --> 02:07.040 That means a distributed query is going to touch many nodes 02:07.040 --> 02:10.080 one way or another. 02:10.080 --> 02:13.840 Now, if you really look at the high end, of course, 02:13.840 --> 02:18.000 no single node can run a Facebook, right? 02:18.000 --> 02:20.920 I mean, it's just, you know, too freaking big. 02:20.960 --> 02:23.240 Lots of data, lots of queries. 02:23.240 --> 02:28.880 So if you look at that large scale or extreme scale, right? 02:28.880 --> 02:32.200 Or if you happen to write, you know, 02:32.200 --> 02:35.000 a mid-size application, but you happen to be in China, 02:35.000 --> 02:37.880 in all those cases, right? 02:37.880 --> 02:41.600 You really need distributed systems, right? 02:41.600 --> 02:42.680 One way or another. 02:42.680 --> 02:44.200 How can you approach that? 02:44.200 --> 02:47.960 Well, you can either do that at the application level, 02:48.920 --> 02:53.960 and that is what a lot of folks did in, you know, 02:53.960 --> 02:56.360 the early to mid-2000s, right? 02:56.360 --> 02:58.000 That's when Facebook was started. 02:58.000 --> 03:00.840 Some may remember LiveJournal, right? 03:00.840 --> 03:03.200 Who popularized the term sharding, right, 03:03.200 --> 03:05.560 and some of the early approaches to that. 03:05.560 --> 03:09.120 Because actually, we did not have, at least in the open source world, 03:09.120 --> 03:12.120 any good distributed databases. 03:12.120 --> 03:17.880 Then there is also an approach, right, which I will call, 03:17.880 --> 03:19.520 you know, proxy. 
03:19.520 --> 03:20.480 If no, it is just back. 03:20.480 --> 03:24.080 That is a pretty complicated approach. 03:24.080 --> 03:25.480 Something like Vitess. 03:25.480 --> 03:29.200 Where you have a full-blown database, right? 03:29.200 --> 03:30.680 And then you have some, you know, 03:30.680 --> 03:33.440 proxy and all of that, which deals with all that 03:33.440 --> 03:37.800 complicated stuff, so you as a user don't have to. 03:40.240 --> 03:44.440 So sharding and these proxies are complicated, 03:44.440 --> 03:50.360 especially if you think about a really kind of complete solution. 03:50.360 --> 03:55.240 In my day, right, when you had a lot of those applications 03:55.240 --> 03:57.520 with sharding, right, there were so many people 03:57.520 --> 04:00.320 implementing some, you know, solution, which would kind of work 04:00.320 --> 04:03.880 in 95%, I mean, maybe 99% of cases, right? 04:03.880 --> 04:07.200 But that all would be very, very fragile. 04:07.200 --> 04:10.920 And I think we have now come to understand that 04:10.920 --> 04:13.840 having the application developers, right, 04:13.840 --> 04:19.560 try to dabble in writing those distributed data 04:19.560 --> 04:22.840 processing algorithms and distributed databases 04:22.840 --> 04:24.480 is not a good idea, right? 04:24.480 --> 04:27.720 That is just, you know, some people still use it, right? 04:27.720 --> 04:29.880 Like, if you think of Facebook 04:29.880 --> 04:32.760 and a bunch of other companies started in that era, 04:32.760 --> 04:37.280 they sort of have their own Vitess-like proxy, right? 04:37.280 --> 04:39.800 Which does a lot of that kind of magic. 04:39.800 --> 04:43.000 So application developers don't have to think 04:43.000 --> 04:45.720 about a distributed database, right? 04:45.720 --> 04:48.160 So, I think that is a very important thing. 
04:48.160 --> 04:51.160 Hey, you know what, in this day and age, 04:51.160 --> 04:55.240 if you actually need that distributed database, 04:55.240 --> 05:00.720 then probably you should not do the manual sharding 05:00.720 --> 05:02.320 in the application. 05:02.320 --> 05:06.040 Now, as in a fairytale, that comes with a sort of "but", right? 05:06.040 --> 05:08.680 Because if you think about distributed systems, 05:08.720 --> 05:10.560 they are complicated. 05:10.560 --> 05:12.200 And I think it was kind of interesting 05:12.200 --> 05:14.880 to watch the previous presentations, like, 05:14.880 --> 05:17.640 well, how do you even think about time 05:17.640 --> 05:20.040 in those systems when you have many nodes, right? 05:20.040 --> 05:22.080 All those kinds of complicated things, 05:22.080 --> 05:25.320 how you handle visibility, consistency, 05:25.320 --> 05:28.040 isolation modes, in those kinds of distributed systems. 05:28.040 --> 05:31.560 That is quite hard, right? 05:31.560 --> 05:34.160 And even if it's like a fully managed solution, 05:34.160 --> 05:37.000 that doesn't completely isolate you as an application 05:37.000 --> 05:41.040 developer, because you have to understand the systems. 05:41.040 --> 05:43.400 You have to understand how they can fail. 05:43.400 --> 05:45.960 You have to understand what kind of bugs, 05:45.960 --> 05:48.560 and hey, believe it or not, database engineers 05:48.560 --> 05:51.200 are not perfect, there are bugs in a database, right? 05:51.200 --> 05:54.720 And if you are dealing with some very complicated systems, 05:54.720 --> 05:58.040 you run into them, when a database is not behaving as it should, 05:58.040 --> 05:58.880 right? 05:58.880 --> 06:02.200 And understanding how it should behave in a distributed database, 06:02.200 --> 06:04.160 that is complicated, right? 
06:04.160 --> 06:07.320 So if you are dealing with a complicated 06:07.320 --> 06:09.760 distributed system needlessly, right, 06:09.760 --> 06:15.000 then that can be a bad idea. 06:15.000 --> 06:16.800 So that brings us to the question, 06:16.800 --> 06:18.520 which is the premise of my presentation. 06:18.520 --> 06:22.960 Okay, when do you actually need that distributed database? 06:22.960 --> 06:25.440 And when can you just pick that nice little 06:25.440 --> 06:27.440 MySQL or PostgreSQL instance, 06:27.440 --> 06:29.920 kind of maybe, you know, replicate that for high 06:29.920 --> 06:32.640 availability, to scale reads, and this kind of stuff, right? 06:32.640 --> 06:34.840 How high can you go? 06:34.840 --> 06:38.440 And actually, a while ago, I put this kind of little post 06:38.440 --> 06:40.760 on Twitter, you know, just to check where people 06:40.760 --> 06:42.480 feel things are. 06:42.480 --> 06:46.600 And I was actually surprised, especially by the 06:46.600 --> 06:48.800 PostgreSQL community folks, who would say, well, 06:48.800 --> 06:52.840 actually, we are running some, you know, 06:52.840 --> 06:57.840 100-terabyte-plus instance on beefy hardware. 07:00.520 --> 07:02.320 And then I talked to some PostgreSQL guys 07:02.320 --> 07:03.560 and say, yeah, I would just, you know, 07:03.560 --> 07:05.600 giving the sheet and pools right to use that real 07:05.600 --> 07:08.400 and say, well, you know what, that is not the most 07:08.400 --> 07:10.840 pleasurable thing to do, but you can do that 07:10.840 --> 07:12.720 in certain cases, right? 07:12.720 --> 07:17.280 Well, so that I think gives us some idea. 07:17.280 --> 07:20.440 Now, I think what is also interesting, in this case, 07:20.440 --> 07:25.160 is that the hardware available these days 07:25.160 --> 07:26.760 is actually pretty big. 
07:26.760 --> 07:31.760 Well, how big? How many cores do you think 07:33.320 --> 07:35.360 you can get in an instance, let's say, on Amazon? 07:35.360 --> 07:39.120 How many cores can you get? 07:39.120 --> 07:40.920 What's that? 07:40.920 --> 07:41.920 200. 07:41.920 --> 07:46.400 Well, I looked today, right, and you can actually get 07:46.400 --> 07:50.120 almost 2,000 cores, right? 07:50.120 --> 07:51.840 And 32 terabytes of memory. 07:51.840 --> 07:55.240 And that requires a budget with, maybe the same, also 07:55.240 --> 07:56.920 2,000 zeros, right? 07:56.920 --> 07:59.840 But, you know, if money is no object, you can get, 07:59.840 --> 08:03.640 like, a huge, huge, huge instance out there. 08:03.640 --> 08:05.480 Now, if you look at, you know, hey, 08:05.480 --> 08:07.720 we are all reasonable people, you know, 08:07.720 --> 08:11.440 we can't buy a Mercedes every minute, right? 08:11.440 --> 08:14.360 Then, you know what, you can look at kind of more, 08:14.360 --> 08:16.920 sort of, commodity hardware, right? 08:16.920 --> 08:20.320 And that would be something like this, right? 08:20.320 --> 08:22.240 Which is also pretty big. 08:22.240 --> 08:26.600 So, if you look at, in this case, it's interesting 08:26.600 --> 08:28.760 to think about what kind of performance 08:28.760 --> 08:30.520 and scalability we can get, right? 08:30.520 --> 08:34.120 And you can actually get like a number of millions of queries 08:34.120 --> 08:38.920 from a single-node, let's say, MySQL, right? 08:38.920 --> 08:41.440 And in reality, where the queries are more complicated, 08:41.440 --> 08:44.760 it's still going to be hundreds of thousands in many cases. 08:44.760 --> 08:48.680 Note, though, that this is something you had better check, 08:48.680 --> 08:53.360 because both your scalability, as well as exact performance, 08:53.360 --> 08:56.720 is not going to scale linearly with the number of CPU cores, right? 
08:56.720 --> 09:00.920 And that's also going to be very workload dependent. 09:00.920 --> 09:06.400 I would also say that you need to mind maintenance. 09:06.400 --> 09:09.200 In many cases, what really bites you in the butt 09:09.200 --> 09:11.840 is not that kind of performance for normal queries. 09:11.840 --> 09:14.480 You run small queries, you know, a couple of reads, 09:14.480 --> 09:16.320 mostly in memory, who cares. 09:16.320 --> 09:18.920 But then imagine if you have that, for example, 09:18.920 --> 09:22.800 90-terabyte table in your 100-terabyte database 09:22.800 --> 09:25.800 and you need to add a column to it, right? 09:25.800 --> 09:28.600 Or build a new index. 09:28.600 --> 09:30.760 That can be very unpleasant. 09:30.760 --> 09:33.200 It can take a lot of time, right? 09:33.200 --> 09:34.960 And in this case, you're bound to a single node, 09:34.960 --> 09:38.040 especially if you look at some solutions like MySQL, 09:38.040 --> 09:40.360 for example, they don't even implement 09:40.360 --> 09:43.720 parallel execution for many of those, right? 09:43.720 --> 09:49.720 So that becomes very important if you look at those systems. 09:49.720 --> 09:52.320 Especially if you need to hold this door. 09:52.320 --> 09:57.480 Yeah, double the storage, but that is, of course, another thing. 09:57.480 --> 09:59.680 Now, if you think about database use, right? 09:59.680 --> 10:06.320 I would say there are a bunch of different uses 10:06.320 --> 10:09.360 you typically have in the system, right? 10:09.360 --> 10:12.360 That ranges from your core production database, like, oh my gosh, 10:12.360 --> 10:15.160 you know, I am e-commerce, no matter what happens, 10:15.160 --> 10:18.200 I need people to be able to keep buying, paying us money, right? 
10:18.200 --> 10:19.680 Then there are some secondary things, like, well, 10:19.680 --> 10:21.840 maybe I also want them to see an ad, 10:21.840 --> 10:24.440 so they buy something else, but if it doesn't work, 10:24.440 --> 10:25.720 it's not such a big deal, right? 10:25.720 --> 10:28.800 And then there's like telemetry, analytics, so on and so forth. 10:28.800 --> 10:31.640 And all of them have different parameters, right? 10:31.640 --> 10:35.240 Like, for example, in terms of data, we can find that humans, 10:35.240 --> 10:37.400 even if you're posting a lot of, you know, like, 10:37.400 --> 10:39.760 you know, messages in the chat, right? 10:39.760 --> 10:42.880 don't generate as much data as machines, 10:42.880 --> 10:47.040 which can generate, like, often, tens and thousands of times more. 10:47.040 --> 10:50.160 And that is very often where a lot of those massive data scales 10:50.160 --> 10:55.160 come from, even in a small environment. 10:56.160 --> 10:57.520 Now, I think it's also good to think 10:57.520 --> 10:59.560 about exactly what applications you have, right? 10:59.560 --> 11:01.760 And you can think about the types in different ways, 11:01.760 --> 11:03.200 but basically, on one end, you can say, 11:03.200 --> 11:05.960 hey, there is this kind of little application 11:05.960 --> 11:07.960 we run in our company, it's hosted in Toronto, 11:07.960 --> 11:09.840 now, in our server, wherever it is, right? 11:09.840 --> 11:14.240 And then all the way to something like, like, a Facebook, 11:14.240 --> 11:16.120 you know, multiple users, very kind of 11:16.120 --> 11:18.160 intermingled data, right? 11:18.160 --> 11:20.040 Which requires a lot of connections. 11:20.040 --> 11:23.520 You want to maybe know what your friends post, right? 11:23.520 --> 11:24.680 All this kind of stuff, right? 11:24.680 --> 11:28.080 That is the landscape, I would say. 
11:28.080 --> 11:31.320 And that, I think, plays into whether a distributed database 11:31.320 --> 11:34.200 may be needed or not, right? 11:34.200 --> 11:38.200 If you're having, like, a self-hosted application 11:38.200 --> 11:41.200 in the enterprise, even in a pretty large one, 11:41.200 --> 11:45.680 chances are the amount of data, the amount of traffic, right, 11:45.680 --> 11:49.640 can be handled by your kind of conventional old-style 11:49.640 --> 11:52.440 single-node database, with some replication for everything, 11:52.440 --> 11:53.280 and stuff. 11:53.280 --> 11:57.360 But then, if you are going to these massive, you know, 11:57.360 --> 12:00.240 web-scale public applications, well, 12:00.240 --> 12:02.360 that's a different story altogether. 12:02.360 --> 12:06.400 Now, here is the challenge I would see. 12:06.400 --> 12:08.640 It kind of cuts both ways. 12:08.640 --> 12:11.240 In some cases, I see kind of developers, you know, 12:11.240 --> 12:12.720 attending talks like this, right? 12:12.720 --> 12:15.600 Or maybe hearing from Facebook at a Meta 12:15.600 --> 12:18.240 meetup or something, and thinking, like, wow, 12:18.240 --> 12:20.080 I need a web-scale database. 12:20.080 --> 12:22.320 I need to future-proof myself, right? 12:22.320 --> 12:26.600 And they install something which is way, way more scalable 12:26.600 --> 12:30.480 and too complicated compared to what they need, right? 12:30.480 --> 12:33.600 Oh, I'm just going to have, like, a WordPress website, 12:33.600 --> 12:35.800 which two people a day are going to visit. 12:35.800 --> 12:39.840 Well, yeah, that's if my mom is not on vacation, right? 12:39.840 --> 12:42.040 No, no, only distributed databases for that, right? 12:42.040 --> 12:45.360 And on the other hand, you also don't want to be in a situation 12:45.360 --> 12:48.280 where you pick a database which is not distributed, 12:48.280 --> 12:49.600 kind of do this way, right? 
12:49.600 --> 12:54.400 And then you actually have, you know, 12:54.400 --> 12:57.720 massive scalability needs, you have to scale massively 12:57.720 --> 13:00.480 because the application became kind of super successful, 13:00.480 --> 13:03.600 you know, like, think about, like, OpenAI, right? 13:03.600 --> 13:06.600 Like, going from zero to 100 million users 13:06.600 --> 13:11.600 right in this, then you have to make different choices. 13:12.440 --> 13:15.800 And I think what is also important to highlight here 13:15.800 --> 13:21.440 is that there is often not really a single answer. 13:21.440 --> 13:25.200 You would find that many organizations run multiple database 13:25.200 --> 13:27.920 technologies for a number of different needs, right? 13:27.920 --> 13:29.640 Because, you know, operational databases 13:29.640 --> 13:31.760 and analytical databases are often different, right? 13:31.760 --> 13:34.520 And maybe you need to have some relational database 13:34.520 --> 13:36.560 and databases which are good at, you know, 13:36.560 --> 13:39.960 storing documents, right, or vectors, or search, right? 13:39.960 --> 13:45.360 And so often there is going to be a portfolio of databases. 13:45.360 --> 13:48.360 And this kind of trade-off between, you know, 13:48.360 --> 13:51.360 like a simple single-node database 13:51.360 --> 13:54.760 and a distributed scalable database is one of those choices, 13:54.760 --> 13:57.360 right, which I would say you use to build 13:57.360 --> 14:01.440 the portfolio of databases your organization uses. 14:01.440 --> 14:04.920 And that's all I had to say. 14:04.920 --> 14:07.760 Yeah, with a minute to spare, huh? 14:07.760 --> 14:08.760 Thank you. 14:08.760 --> 14:17.040 Maybe we have time for one question, if it's a quick one? 14:17.040 --> 14:18.040 No? 14:18.040 --> 14:19.840 It's all clear. 14:19.840 --> 14:20.840 Yes. 14:20.840 --> 14:21.840 You've already left. 
14:21.840 --> 14:23.840 But, yeah, you've already left. 14:23.840 --> 14:24.840 Thank you. 14:24.840 --> 14:26.560 A three-hour talk with no breaks, right? 14:26.560 --> 14:27.560 Okay. 14:27.560 --> 14:28.560 Thank you.