I typically ask this question in front of a bunch of network engineers: how many of you know Kubernetes, by show of hands? I imagined that in this room this would be a lot more. So how many of you know Kubernetes, and how many of you have played around with it? That's the answer I expected, quite a lot more than in front of more traditional networking folks.

So what's cool about Kubernetes? The API server is really about making sure that humans and machines, controllers in this case, can happily coexist and work together. The API server exposes a number of resources, be it the Pod, be it the Service, be it the Deployment, and there's some storage behind it.

Another part of it is the controller, or the reconciler, or the operator, or whatever other name is available for it, that basically executes the control loop. It gets a certain input, it runs a certain task, it actuates a certain system, and it provides status back into the API. So we thought this controller, together with the custom resource definition and that framework inside of Kubernetes, is a pretty awesome thing to drive network automation. This closed loop: the most practical thing you can compare it with is your thermostat. You monitor a certain temperature, and if it goes below or above a certain threshold, you actuate a certain system to drive something.

Another thing that we liked about the KRM, the Kubernetes Resource Model, is that you get a lot for free. You get the enormous ecosystem that's out there for free.
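The thermostat analogy can be sketched as one pass of a reconcile loop. This is a minimal illustration, not Kubenet code; all names here are invented:

```python
# Minimal closed-loop controller sketch: observe the actual state, compare it
# with the declared intent, actuate on drift, and report status back.
# This is the same loop shape a Kubernetes controller runs on its resources.

def reconcile(desired_temp: float, observed_temp: float, tolerance: float = 0.5) -> str:
    """Return the action that moves the observed state toward the desired state."""
    if observed_temp < desired_temp - tolerance:
        return "heat"   # actuate: temperature below threshold
    if observed_temp > desired_temp + tolerance:
        return "cool"   # actuate: temperature above threshold
    return "ok"         # in sync: nothing to do, just report status

# Each controller tick observes and decides anew, so the system converges
# even if an earlier actuation failed.
print(reconcile(21.0, 18.0))  # -> heat
print(reconcile(21.0, 21.2))  # -> ok
```

The key property is that the loop is level-triggered: it acts on the current gap between intent and reality, not on a stream of one-off commands.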
There are other controllers, cert-manager to name one, that can help you issue certificates. Certificates that you might need on a traditional networking box to expose API interfaces like NETCONF or gNMI or whatnot. And the API model is really cool in the sense that if you know how to drive one resource, you know how to drive them all, basically. So it's a pretty consistent thing. And as a developer inside of that framework, you basically don't need to reinvent the wheel on a lot of things.

So how do we leverage this framework for network automation? One of the things that the folks from Kubernetes did really well is abstract a lot of things. They segmented the problem space and the domain, they used reusable components, and they made it declarative. Declarativeness is really important in the sense that I say: I want my network to look like this. You fire that off to a controller, and the controller figures out what needs to be done, given a certain state, to get the network from A to B. And the closed-loop system was something that we really liked.

Comparing what Kubernetes does, containers and orchestrating containers, to networking, or the control-plane side of networking, it's pretty much the same. We have a network, which is essentially a bunch of devices and a bunch of interfaces, which are all interconnected. And the differences are really in certain use cases: the protocols that we use in networking to interconnect a bunch of devices, using OSPF or IS-IS or BGP or whatever.
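That "given a certain state, get the network from A to B" step can be sketched as a diff between declared and observed resources. This is a hedged illustration with invented resource names, not how any particular controller is implemented:

```python
# Sketch of "declare state B, let the controller work out the steps from A":
# diff the declared interface set against the observed one and emit actions.
# Interface names and fields are invented for illustration.

def plan(declared: dict, observed: dict) -> list:
    """Compute create/update/delete actions that take `observed` to `declared`."""
    actions = []
    for name, cfg in declared.items():
        if name not in observed:
            actions.append(("create", name, cfg))
        elif observed[name] != cfg:
            actions.append(("update", name, cfg))
    for name in observed:
        if name not in declared:
            actions.append(("delete", name, None))
    return actions

declared = {"eth1": {"mtu": 9000}, "eth2": {"mtu": 1500}}
observed = {"eth1": {"mtu": 1500}, "eth3": {"mtu": 1500}}
for action in plan(declared, observed):
    print(action)  # update eth1, create eth2, delete eth3
```

The user only ever states the target; the imperative steps are derived, which is what makes the model consistent across resource kinds.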
The APIs that are required, and so on. So the Kubenet initiative is really there to help network engineers build Kubernetes controllers themselves, mainly to drive traditional networking gear, or to allow you to expose your traditional network in a cloud-native-friendly way. The idea is that we build a number of abstractions on top of each other, where you have a bunch of controllers that take traditional networking config for vendor gear. Another abstraction layer would be a normalized model, where you can create your own abstraction of what a network should look like. Because whichever vendor you use, an interface is still an interface, a port is still a port, a physical port; the terminologies are quite the same.

So essentially, what you need to define is a VPC, basically a high-level abstraction. Then the network design that you want to apply: do I want my network to run BGP, does it need to run OSPF, does it need to run IS-IS? Which encapsulation is required, VXLAN or MPLS? Which addressing do I need to use: do I want to use single stack, do I want to use dual stack, do I want to go v6 only? And then the resources on the other side of the pane are really about which nodes are in the topology, which links are there, how they are interconnected, which IP addresses are in my stateful database, like my IPAM system, which I maintain, and so on.

So the mental model is really about creating these layers of abstraction on top of each other. And I'll go through this a bit faster to get into the first layer.
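As a sketch of that layering, a minimal made-up "VPC" intent plus a design choice can be rendered down to a vendor-ish config snippet. The fields, the template, and the output syntax here are all invented for illustration; they are not Kubenet's actual resources:

```python
# A high-level intent (what the user declares) plus network-design choices
# (applied underneath it) rendered into a config snippet via a template.
# All names and the pseudo-CLI syntax are invented for illustration.
from string import Template

vpc = {"name": "blue", "vni": 10010, "asn": 65001, "stack": "dual"}
design = {"routing": "bgp", "encap": "vxlan"}  # could be ospf/isis, mpls, ...

# The addressing choice expands into address families.
families = {"single": ["ipv4"], "dual": ["ipv4", "ipv6"], "v6": ["ipv6"]}[vpc["stack"]]

snippet = Template(
    "vrf $name\n"
    " encapsulation $encap vni $vni\n"
    " router $routing autonomous-system $asn\n"
).substitute(**vpc, **design) + "".join(f" address-family {f}\n" for f in families)

print(snippet)
```

The point of the layering is that the user supplies only the top dict; everything below it is derived by controllers, one abstraction at a time.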
So you have a network with a number of devices. On top of it you would have a provider, which in the Kubenet story is the schema-driven configuration component. On top of that you would have a number of other controllers that expose abstract networking config as you would want it. The Kubenet initiative is really there to help network engineers define their own abstraction models, because person A might want to abstract a network in a certain way, but person B might want to do it in another way.

So coming to SDC, our schema-driven configuration and state: this is really the component that talks to the network. It bridges the gap between Kubernetes, the controllers that integrate all of the CRDs and those kinds of things, and the actual devices on the southbound side. What we do in SDC is basically ingest all of the YANG models. YANG models, that's the networking slang for the data modeling language that we use inside of networking. We could have gone with protobuf or the like, but at the time YANG was invented none of those were available, and in the networking industry we kind of have the tendency to reinvent the wheel sometimes a little bit too much. But okay.

So SDC handles structured config formats, different ones: we do YAML, we do JSON, we do XML, we do protobuf. We have several interfaces southbound to the device, gNMI and NETCONF. And basically what this does, and we had to learn this the hard way, is aggregate all of the intents that you define in your Kubernetes resources.
It ingests those for a certain physical network device and validates the config. So we have our schema, or data modeling language, we have actual config, and we try to validate it offline. We merge all of those declarative snippets of config, and then, when we have a valid config, the controller actually pushes it down to the device and monitors that the config stays in sync with the actual device.

We have a number of KRM resources to do that. We have a Config CR: this is basically your SONiC config or your SR Linux config or your SR OS config or your Junos config that you ingest into a CR. We also have a ConfigSet CR, for instance for when you want to replicate a piece of control-plane config that is similar across a lot of devices; basically the analogy with a ReplicaSet, a bit. We have CRs that are not humanly created but created by the controller, which is the running config: that syncs the actual config that's on the running device. And the unmanaged config: unmanaged config is snippets of config on the box that are not defined by an intent. That really allows us to onboard physical devices, making sure that users gain trust in the system, that it's working, and then they can slowly onboard the snippets of config that they define on their devices into the system. So we can really start onboarding brownfield systems, existing networks, into our system. So, I'm running short on time, I guess.
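The validate-then-merge step can be sketched roughly like this. The tiny schema, leaf names, and rules below are invented for illustration; real SDC validates against full YANG models, not a dict:

```python
# Rough sketch of offline config handling: validate each declarative snippet
# against a (toy) schema, then merge the snippets into one candidate config.
# Only a candidate with no errors would be pushed to the device.

SCHEMA = {
    "mtu":         {"type": int, "range": (1280, 9500)},
    "description": {"type": str},
    "admin-state": {"type": str, "mandatory": True},
}

def validate(snippet: dict) -> list:
    """Check one snippet's leaves for unknown-leaf, type, and range violations."""
    errors = []
    for leaf, value in snippet.items():
        rule = SCHEMA.get(leaf)
        if rule is None:
            errors.append(f"unknown leaf: {leaf}")
        elif not isinstance(value, rule["type"]):
            errors.append(f"{leaf}: wrong type")
        elif "range" in rule and not rule["range"][0] <= value <= rule["range"][1]:
            errors.append(f"{leaf}: out of range")
    return errors

def merge(*snippets: dict) -> dict:
    """Merge snippets into one candidate config; later intents win per leaf."""
    candidate = {}
    for s in snippets:
        candidate.update(s)
    # Mandatory leaves must be present somewhere across the merged intents.
    for leaf, rule in SCHEMA.items():
        if rule.get("mandatory") and leaf not in candidate:
            raise ValueError(f"mandatory leaf missing: {leaf}")
    return candidate

port = {"mtu": 9000, "description": "uplink"}
state = {"admin-state": "enable"}
assert validate(port) == [] and validate(state) == []
print(merge(port, state))     # one valid candidate config
print(validate({"mtu": 64}))  # -> ['mtu: out of range']
```

Catching these errors offline is what lets most invalid intents be rejected before anything touches the device.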
So we ingest schemas: the Schema CR is a reference to a GitHub repo where the YANG models are hosted, and we ingest those into a schema server. We have a bunch of discovery rules with which we can automatically discover network targets and network devices in the network, and then a bunch of other ones.

The YANG schema, as I mentioned, is a data modeling language that we use inside of networking, or the complex side of networking at least, where we define modules, containers, lists, leaves, types, those kinds of things. I won't bore you too much with it, but it looks a bit like this. And what we do from an SDC point of view is all of the validation of the data modeling language: we validate leafrefs, we validate ranges, lengths, mandatory statements, patterns, choice cases and those kinds of things. The important takeaway is that we know, to a high degree of certainty, that the config is going to be valid before we send it down to the device. Obviously there are always runtime things on the device that can go wrong; for instance, the FIB might be at a certain threshold, not allowing us to provision certain routes. Things can still go wrong on the device, but we try to cover as much as we can before it hits the device.

So that was the device layer, and I'll go through it quickly. We have another controller on top of that: Choreo, which is also part of the Kubenet initiative. And that's really, once you have that vendor config, how do you abstract it?
How do you make your own model available, through OpenAPI or whatever, and expose that? So you can define, say, "I want a VPC" with a minimum set of inputs, and that renders basically the device config, the vendor config, that can be pushed down with SDC.

What I forgot to mention about SDC is that currently we support anything that has a YANG model. So we support Arista, we support Cisco, we support Juniper, and obviously the network operating systems that Nokia provides, like SR Linux and SR OS; SONiC too. Anything that has a YANG model we can ingest. Sometimes there's still a bug, but okay, bugs are there to be solved, right?

And Choreo, which is also part of the Kubenet initiative, really helps network engineers define their own Kubernetes controllers. The key takeaway here is that we allow network engineers, or make it more user-friendly for them, to define a controller for networking purposes. The business logic that they apply is either Python, or Jinja templates, or custom code to generate networking snippets, basically.

With that, I think I still have a minute left. So: a bunch of QR codes. We have a Discord where we collaborate with a bunch of network engineers trying to develop this stuff, our GitHub repo, and some YouTube links where you can check out some more informative sessions about what Kubenet is and what we're trying to do. With that, I'm at the end of my presentation, and I'll take any questions if there are any.

Hey Hans, thanks for the presentation. Two quick questions.
How are you handling state from the devices, with respect to monitoring state? And secondly, is there any storage? The things SDC remembers, are they in memory, or are you writing to disk, or is it etcd?

Yeah. So inside of SDC we also fetch state from the networking devices, and we kind of cache that state in memory. The config part, like the actual config which is synced with the device, we also persist and store.