WEBVTT

00:00.000 --> 00:41.840
Hello everyone. I'm Evariste, my callsign is F5OEO. I will present and demonstrate what we have done at Open Research Institute; I'm part of the team. I'm not the specialist of this project, but I will try to explain it in some detail. It's about space and terrestrial communication, and especially the modem.

00:41.840 --> 01:13.080
So what is Opulent Voice? It's an open-source digital voice protocol. It's designed for bandwidth-constrained channels, and it can be used for terrestrial or satellite communication. The project is developed by a lot of people around the world.

01:13.080 --> 02:08.760
Why develop this kind of new modem? Well, there is no really high-quality voice communication today. Because we mainly want to use the amateur radio spectrum at VHF and UHF, existing modes have to fit in less than 25 kHz. This particular modem and modulation goes above that, and with that we can have much more quality. What's important is that it's a freely implementable protocol: it's completely open source.

02:08.760 --> 02:32.600
We already demonstrated it in summer 2025, and at the last meeting we succeeded in transmitting voice and data over the air.

02:32.600 --> 03:18.520
So who is Open Research Institute? It's a foundation based in the US, with volunteers from all around the world, and there are a lot of projects under this foundation. Two years ago I personally used its open-source DVB-S2 FPGA IP, which allowed me, for example, to broadcast on QO-100, a geostationary satellite.
So it could be DVB-S2; there are also projects like RFBitBanger, which is more of an electronics board; and Open Research Institute also does regulatory work, for example with the FCC in the US.

03:42.360 --> 03:59.720
You can find all the details at the link shown here.

04:00.680 --> 04:54.120
So what is it? It's built around a 16 kbit/s open-source voice codec, and with this open-source codec we get superior voice quality compared to the others. The voice quality is good, and it can also be mixed with data, with packets: imagine you can chat and send audio at the same time. Usually we have packet protocols on one side and audio protocols on the other; with this open protocol we can mix both.

04:55.160 --> 05:37.640
Beyond the voice quality (I don't have the samples here, but you can reach them through these links), we compared the audio quality of several codecs. There is a whole presentation about it on YouTube; please don't go there yet, because it's a spoiler, as that presentation is much the same as today's.

05:37.640 --> 06:36.360
So what is the architecture of this protocol? First, we use a fixed 40-millisecond frame length. Why? Because it's best practice for the Opus codec, in order to have low latency, good quality, and little overhead. To modulate it we use minimum shift keying (MSK). As we have a 16 kbit/s codec, the over-the-air bit rate is 54.2 kbit/s in MSK, so we have a tone separation of 27.1 kHz. MSK gives a constant-envelope signal, and the null-to-null main lobe of the spectrum is about 81 kHz.
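The MSK figures above follow from textbook relationships, and can be sanity-checked in a few lines of Python. This is a sketch under stated assumptions: the 54.2 kbit/s over-the-air rate comes from the talk, the modulation index h = 0.5 is the MSK definition, and the function name is illustrative.

```python
# Back-of-the-envelope MSK figures, assuming the 54.2 kbit/s rate
# quoted in the talk and standard MSK relationships (h = 0.5).

def msk_parameters(bit_rate_bps: float) -> dict:
    """Return basic MSK spectral figures for a given bit rate."""
    return {
        # MSK uses two tones separated by half the bit rate.
        "tone_separation_hz": bit_rate_bps / 2,
        # The null-to-null main lobe of MSK spans 1.5 x bit rate.
        "null_to_null_bw_hz": 1.5 * bit_rate_bps,
    }

params = msk_parameters(54_200)
print(params["tone_separation_hz"])   # 27100.0 -> the 27.1 kHz tone spacing
print(params["null_to_null_bw_hz"])   # 81300.0 -> roughly 81 kHz main lobe
```

The 27.1 kHz tone separation quoted in the talk is exactly half of 54.2 kbit/s, which is what makes MSK orthogonal over one bit period.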
06:40.040 --> 07:01.640
Using MSK is useful for the filters and amplifiers: we don't need a really linear amplifier like we would use for QPSK, for example. So it's easier.

07:04.920 --> 08:12.600
Now let's dig a little deeper into the protocol itself. First, we need something like a transport stream in DVB terms; here we have baseband data frames. The Interlocutor is the interface between the human and the baseband packets. The Interlocutor is written in Python and takes all the inputs, which are the audio, the chat, and even arbitrary data, and what it sends can then be received by the other Interlocutor.

08:13.480 --> 08:51.960
So there are two main cases. The first is quite easy: I go from one to one just over IP. Imagine you just have two computers or two devices, and with these baseband packets you transmit and receive. We can also do that over RF, and at that point we use the modem.

08:51.960 --> 09:20.360
Here is what the Interlocutor looks like: it's a web-based interface or a command line, so you can easily chat and send audio just with a mic.

09:21.400 --> 09:59.880
There are several ways to connect to the modem. The first one is the simplex one: we send the baseband data frames to the modem, they go out over radio frequency in MSK, get demodulated, and arrive at the other Interlocutor. For that, we have to configure the IP address of the modem on the transmitter and on the receiver.

10:02.680 --> 10:13.080
Another case is a bent pipe, that is, a satellite transponder.
10:13.080 --> 10:48.280
Instead of having the radio frequency go directly from one point to another, we use a satellite acting as a transponder. This is the same case as before, but we go through the transponder, through the satellite, to reach the receiver.

10:49.000 --> 11:49.400
There is also a more complicated but quite interesting example. Here we use another satellite, which embeds processing on board. The processor receives the Opulent Voice protocol in MSK: the uplink is MSK, but on the downlink, instead of sending it in MSK again, the same baseband is multiplexed into DVB-S2. DVB-S2 is a general satellite protocol, the standard commercial protocol used mainly for video, the one that sends you all your TV.

11:50.520 --> 12:59.160
But you can also send IP data with it. So the idea is that the satellite receives the MSK, demodulates it, then re-injects it into a multiplex, and you can have multiple clients multiplexed towards the receive side. For example, you can have a conference: several Interlocutors send Opulent Voice in MSK up to the satellite, which demodulates it and retransmits it on DVB-S2, and all the participants receive the broadcast DVB-S2 multiplex.

13:05.240 --> 13:33.000
So how does it all work? How do your voice, text and data get out of the Interlocutor, and how does the web interface work? Well, you have a microphone, speaker, keyboard, and a terminal or browser, and then either the command line or the web interface.
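The simplest case described above, two Interlocutors exchanging baseband frames directly over IP, can be pictured with plain UDP sockets. This is a minimal loopback sketch, not the project's actual wire format: the single type byte and the payload bytes are invented for the example.

```python
# Minimal sketch of the one-to-one IP case: one UDP "Interlocutor"
# sends a toy baseband frame, the other receives it unchanged.
import socket

# Receiver: bind on loopback; port 0 lets the OS pick a free port.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
port = rx.getsockname()[1]

# Transmitter: one toy frame (illustrative type byte + payload).
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
frame = b"\x01" + b"OPUS-FRAME-PAYLOAD"
tx.sendto(frame, ("127.0.0.1", port))

# The peer receives the datagram exactly as sent.
data, addr = rx.recvfrom(2048)
print(data == frame, len(data))  # True 19
tx.close()
rx.close()
```

Going through RF instead simply means pointing the same frames at the modem's IP address rather than at the peer directly, which is exactly the configuration step the talk describes.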
13:36.120 --> 14:11.160
As I said before, it's mainly based on Python scripts. This is the example of the web interface and the command line, and you can see that there are a lot of common blocks between them. The web browser uses WebSockets to reach the Python classes, and the terminal one talks to them directly.

14:17.720 --> 14:36.600
Here you can see it in more detail, and you can see that we have an option for transcription, for example with Whisper, for voice-to-text.

14:41.000 --> 15:05.160
Here is the transmit side, and here is the receive side; a lot of components are in common.

15:05.160 --> 17:00.040
So what does an Opulent Voice baseband frame look like? Here we come back to the baseband we developed in order to feed the modem. There are several packet types: audio, text, control, and a future one, data, which is arbitrary data. First of all, Opulent Voice carries a station ID, which could be your callsign, for example, or something derived from your callsign. You can have a notification token, which is not mandatory, just there in case we need it in a future release. After that, the payload uses COBS, consistent overhead byte stuffing. Why? Because when we transmit data we have variable-length payloads, and COBS can be used to keep track of all the data, reassemble packets, and also do some stuffing in order to have a constant frame.
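COBS itself is a small, well-defined algorithm: it rewrites a payload so that it contains no zero bytes, which frees the zero byte to act as an unambiguous packet delimiter and makes reassembly of variable-length payloads easy. Below is a compact sketch of the standard encode/decode pair; it illustrates the technique and is not the project's code.

```python
def cobs_encode(data: bytes) -> bytes:
    """COBS-encode so the output contains no zero bytes."""
    out, block = bytearray(), bytearray()
    for byte in data:
        if byte == 0:
            out.append(len(block) + 1)   # code byte counts block + itself
            out += block
            block = bytearray()
        else:
            block.append(byte)
            if len(block) == 254:        # longest run without an implicit zero
                out.append(255)
                out += block
                block = bytearray()
    out.append(len(block) + 1)           # final block, possibly empty
    out += block
    return bytes(out)

def cobs_decode(data: bytes) -> bytes:
    """Invert cobs_encode()."""
    out, i = bytearray(), 0
    while i < len(data):
        code = data[i]
        out += data[i + 1 : i + code]
        i += code
        if code < 255 and i < len(data):
            out.append(0)                # restore the implicit zero
    return bytes(out)

frame = b"\x12\x00\x34\x00"
encoded = cobs_encode(frame)
print(encoded.hex())                     # 0212023401 -- no zero bytes left
print(cobs_decode(encoded) == frame)     # True
```

Because the encoded stream is zero-free, a receiver that joins mid-stream can scan for the next zero delimiter and resynchronize on a packet boundary, which is exactly the reassembly property the talk relies on.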
17:00.760 --> 17:55.080
Because we have a constant frame format. So for example, here we have several packets, and we can see that there is the COBS code at the start of each packet. Not for the voice, but for the data.

17:55.080 --> 18:30.040
Now we are on the audio side. We carry 80 bytes of Opus audio each time. We use an RTP header, the Real-time Transport Protocol, which is there for synchronization, so that we can recreate a clock at the receiver side. So the audio frame is the audio payload, that is, the Opus packet, plus the RTP header.

18:30.680 --> 19:38.600
So, going deeper and deeper into the protocol: in the RTP header we have a hash of the station ID. We have a sequence number so that, if we lose some packets, we can detect it and try to resynchronize. The timestamp normally increments every 40 milliseconds, so if we see a skip in the timestamps, we know that we have lost some packets. And we have a payload type, which is for now the Opus codec, but maybe in the future we can switch to another codec.

19:38.600 --> 20:39.400
Now the text and control messages. The payload is UTF-8 data. RTP is not used here, because this is just data and we don't have to synchronize it. As the text and control messages are variable length, they could be much longer than the frame, and then we use COBS to reassemble all that.

20:39.720 --> 21:01.400
The UDP header carries the port number, which we use as a payload type. We use this to prioritize the incoming data, and we want the voice to have priority over the control, and then the text and the data.
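The RTP-style header described above (a hash of the station ID, a sequence number, a timestamp stepped once per 40 ms frame, and a payload type) can be sketched as a packed struct. The field widths and order below are RTP-flavoured assumptions, and CRC32 is only a stand-in for whatever hash the project actually uses; none of this is the real Opulent Voice layout.

```python
# Toy RTP-like header: payload type, sequence number, timestamp, and a
# hash of the station ID as the source identifier. Illustrative only.
import struct
import zlib

FRAME_MS = 40                             # one Opus frame per packet
CLOCK_HZ = 48_000                         # assumed Opus reference clock
TS_STEP = CLOCK_HZ * FRAME_MS // 1000     # 1920 clock ticks per 40 ms frame

def make_header(seq: int, frame_index: int, payload_type: int,
                station_id: str) -> bytes:
    """Pack type (1 byte), sequence (2), timestamp (4), source hash (4)."""
    source_hash = zlib.crc32(station_id.encode())  # stand-in for the ID hash
    timestamp = (frame_index * TS_STEP) & 0xFFFFFFFF
    return struct.pack(">BHII", payload_type, seq & 0xFFFF,
                       timestamp, source_hash)

hdr = make_header(seq=7, frame_index=3, payload_type=96, station_id="N0CALL")
pt, seq, ts, src = struct.unpack(">BHII", hdr)
print(pt, seq, ts)  # 96 7 5760
```

A receiver detects loss exactly as the talk describes: a gap in `seq`, or a timestamp jump larger than one `TS_STEP`, marks missing audio frames.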
So we use the UDP header for that, and that way we can prioritize all of it. The IP header carries a source and destination IP, and a protocol field.

21:27.560 --> 21:37.400
There is a first data type defined to transport any IP data; this is a future update that we are developing right now.

21:41.640 --> 22:52.680
So now that we have gone through the whole baseband protocol, how do we modulate it and get to radio frequency? Well, we use several stages before modulating in MSK. First there is a randomization process, a CCSDS-style LFSR scrambler; then there is forward error correction; then an interleaver with 32 rows; and a sync word insertion, whose detection helps the demodulator.

22:52.840 --> 23:31.560
The MSK modulator is a design that has been influenced by existing MSK modems. So in the transmit process, the frame comes in, then we have a FIFO and some clock-domain crossing, then we randomize and run the convolutional encoder (decoded with Viterbi on the other side), then the interleaver, then we add the sync word, and then the modulation itself; and the same, mirrored, on the receive path.

23:31.560 --> 24:38.360
So what are the performance and the characteristics? The coding gain is about 5 dB from the Viterbi decoder, and we have 2 dB more because we use soft decisions. The sync word performance is quite good, with a peak-to-sidelobe behaviour that seems close to optimal. The sync word detection has two thresholds. When we demodulate MSK, the radio frequency signal arrives and we first hunt globally, estimating the power and trying to synchronize.
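The sync-word hunt just described can be pictured as a sliding bit correlator with a looser acquisition threshold and a stricter verification threshold. Everything below, the 16-bit word and both threshold values, is invented for illustration; the real modem works on soft demodulated samples, not clean bits.

```python
# Sliding-correlator sketch of a two-threshold sync-word search:
# hunt with a loose threshold, then verify lock with a strict one.

SYNC = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1]
HUNT_THRESHOLD = 12      # bit matches needed while acquiring (assumption)
VERIFY_THRESHOLD = 15    # bit matches needed to confirm lock (assumption)

def correlate(window):
    """Count positions where the window agrees with the sync word."""
    return sum(w == s for w, s in zip(window, SYNC))

def find_sync(bits, threshold):
    """Return the first offset whose correlation reaches the threshold."""
    for i in range(len(bits) - len(SYNC) + 1):
        if correlate(bits[i:i + len(SYNC)]) >= threshold:
            return i
    return None

noise = [0, 1, 0, 0, 1, 0, 1, 0]
stream = noise + SYNC + [1, 0, 1, 1]
offset = find_sync(stream, HUNT_THRESHOLD)
locked = correlate(stream[offset:offset + len(SYNC)]) >= VERIFY_THRESHOLD
print(offset, locked)  # 8 True
```

The two-threshold split keeps false alarms cheap: the loose hunt threshold tolerates noise while searching, and the strict verification pass rejects accidental partial matches before declaring lock.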
So we first have a threshold for the initial hunting, and then, as soon as we have a hit, we have another threshold to verify the sync word.

25:04.600 --> 26:04.120
Now the latency, because it's important. The main latency is in the end device, which means that the modem latency is quite short; right now it's more the device, the PC for example, which adds 50 to 100 milliseconds per frame: OS delays, buffering and playback queues, and on the receive side it's the same. So it's mainly an issue with the device. We could use lower-latency devices, but right now we are around 100 milliseconds there.

26:04.680 --> 26:53.400
For the latency of the modem itself, we work with a 61.44 MHz clock, which is the Pluto SDR's maximum sample-rate clock. The transmit side is around 63 microseconds. On the receive side, we need enough data for the soft decisions of the Viterbi decoder, so it's about one millisecond, which is very short compared to the Interlocutor latency.

26:53.400 --> 27:41.640
There is an implementation of the modem in C++, but we want to do it in HDL. Why? Because as soon as you have developed something in HDL, you can create an ASIC. Hardware is also fast, efficient and compact, and it's a good way to learn FPGA, which means that a lot of volunteers can learn at the same time on this kind of project. So it can be FPGA, it can be software.

27:42.200 --> 28:11.560
And why do we use the amateur radio bands? Well, because they are coordinated for space use, they are non-commercial, and we can use sub-bands; at VHF and UHF we can experiment on that.
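The row/column interleaver mentioned in the transmit chain spreads a burst of channel errors across the frame: bits are written row by row and read out column by column, so consecutive corrupted bits land in different rows after de-interleaving. The sketch below uses the 32-row depth quoted in the talk with an arbitrary 4-column width; it illustrates the idea, not the project's actual geometry.

```python
# Row/column block interleaver sketch: write row-major, read column-major.

ROWS = 32  # depth quoted in the talk; column count below is arbitrary

def interleave(bits):
    """Read a row-major frame out column by column."""
    assert len(bits) % ROWS == 0
    cols = len(bits) // ROWS
    return [bits[r * cols + c] for c in range(cols) for r in range(ROWS)]

def deinterleave(bits):
    """Inverse permutation of interleave()."""
    cols = len(bits) // ROWS
    out = [0] * len(bits)
    for c in range(cols):
        for r in range(ROWS):
            out[r * cols + c] = bits[c * ROWS + r]
    return out

frame = list(range(128))                   # 32 rows x 4 columns
shuffled = interleave(frame)
print(shuffled != frame)                   # True: the order really changed
print(deinterleave(shuffled) == frame)     # True: perfect round trip
```

With this geometry, a burst of up to 32 consecutive channel errors corrupts at most one bit per row, which is what lets the Viterbi decoder treat the damage as scattered single-bit errors.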
28:11.720 --> 29:03.720
Why MSK, and not GMSK? Well, even on satellite we are not really constrained in bandwidth, unlike a commercial satellite, so we can afford a little more bandwidth for MSK, and we get a better SNR with that. GMSK has lower sidelobes, but we don't really need to pack a lot of channels.

29:04.200 --> 29:21.240
Why 16 kbit/s for the codec? Going up occupies more bandwidth, and, on the contrary, it's about the minimum to have good quality.

29:22.120 --> 30:33.240
Yes, we use Opus, but why not Codec 2, which is also open source, developed by an amateur radio guy? Well, Codec 2 is mainly focused on very low bit rates, so for amateurs it's more for the HF bands, where you don't have a lot of bandwidth, only about 2.5 kHz, rather than UHF. Here it's another situation, because we have plenty of bandwidth. Codec 2 is also very good; there are just several trade-offs, and you choose one or the other. The important thing, and what we want, is mainly quality on the audio.

30:36.600 --> 31:37.000
So we tried to benchmark that, to compare with the others, and it's not very easy to make an open comparison, but we tried. We use a peak-to-sidelobe ratio comparison for the sync word, an FEC comparison, and an interleaving comparison, and there are a lot of other metrics we could try to use. It will be published in future work.
31:39.000 --> 32:14.600
So here you have the comparison for each codec: the Opulent Voice protocol, M17, P25, D-STAR, DMR, NXDN, and Yaesu Fusion. You can see that, yes, we want to be open source, and only a few are good there, especially for the vocoder license, because all the others use the DVSI-licensed vocoder.

32:19.800 --> 32:54.760
And this is for the sync word; 8.1 is quite good compared to the others. I'm not a specialist of these particular metrics, so I give you the comparison as it is, sorry.

32:58.440 --> 33:32.840
Here you have a comparison of all the FECs, the code rates, et cetera, and you can see whether an FEC architecture uses soft decision, which increases the quality of the decoding.

33:33.160 --> 34:21.560
Here is the interleaver. The interleaver is there because, if you have interference on the radio frequency, if there is a burst, we don't want to lose all the bits at the same time, so we try to spread them out and interleave them. There are several mechanisms to do that; we use 32 rows here.

34:21.720 --> 34:53.320
Where can you find all these components, all these open tools? You have all the links for that here. And as soon as you are interested in this project or in another one, you can get involved with the Getting Started link here, and we have weekly meetings for all the projects.

34:56.360 --> 34:58.040
Thank you.