WEBVTT 00:00.000 --> 00:10.240 All right. Welcome, everyone. I'm Florian. I'm based in Switzerland at the Bern University 00:10.240 --> 00:14.280 of Applied Sciences, and today I'm going to present to you a project that we've been working 00:14.280 --> 00:22.880 on for, yeah, about a year in the core phase, let's say, and it's called OpenParlData.ch. 00:22.880 --> 00:29.000 So basically, if you ask yourself what has happened in Switzerland in education policy, 00:29.000 --> 00:34.000 for example, in the last, let's say, five years, as, for example, a researcher, 00:34.000 --> 00:40.760 a journalist, or a civil society organization's advocacy manager, it is really 00:40.760 --> 00:45.000 hard to answer this question. It's going to take you a lot of time or a lot of money, because 00:45.000 --> 00:50.280 you need access to an expensive lobbying or monitoring tool to 00:50.280 --> 00:56.360 acquire this data. Why is this the case? Like a lot of countries in the world, Switzerland 00:56.360 --> 01:01.000 is a federalist country, so decisions are not just made at the national level, but also 01:01.000 --> 01:05.760 at the subnational level. That means we have one national parliament, we have 26 cantonal 01:05.760 --> 01:11.440 parliaments, which is the subnational level, and then we have even 461 municipal parliaments, 01:11.440 --> 01:17.760 and that's for a country of 8 million people, so you can imagine. Yes, and if you look 01:17.760 --> 01:24.000 at legislative data in Switzerland, you don't have to be able to read that, but basically 01:24.000 --> 01:29.080 the message is, it's apples and oranges. There are a couple of parliaments that provide data 01:29.080 --> 01:33.360 through APIs, but most of them really just have websites, and there's a lot 01:33.360 --> 01:39.120 of PDFs going on. Yeah, so it's a really messy situation. 
So the data is not really accessible, 01:39.120 --> 01:46.720 it's definitely not harmonized. Yeah, and it comes in many forms and shapes. So again, 01:46.720 --> 01:52.520 as I've said, it's quite costly; you need a lot of resources to have access to these APIs. 01:52.520 --> 01:57.800 There are also asymmetries, given that, because if you have 01:57.800 --> 02:02.920 the financial means, then you're able to access the data; if you don't have the financial 02:02.920 --> 02:08.200 means, you simply can't. And there are also a lot of inefficiencies, especially in research, 02:08.200 --> 02:13.040 but also in the journalism sector. Political scientists, for example, 02:13.040 --> 02:18.800 get the data and clean it for their specific project, but they don't share the data 02:18.800 --> 02:25.520 with others. So what we decided to do is, we want to collaboratively, 02:25.520 --> 02:31.520 involving all the user groups, build an open standard, but also, obviously, an open 02:31.520 --> 02:42.560 API for harmonized Swiss legislative data. And that's OpenParlData.ch. So we give 02:42.720 --> 02:47.840 everyone at this point access to harmonized open data from currently 78 national, 02:47.840 --> 02:54.960 cantonal and municipal parliaments. So researchers, journalists, and civil society organizations 02:54.960 --> 03:00.960 can, on the one hand, analyze this data, but also monitor it. And with that, we basically 03:00.960 --> 03:08.480 want to foster transparency, participation and innovation. So, basically, we have 03:08.480 --> 03:13.200 a two-track approach. On the one hand, we're currently fighting symptoms. 03:13.200 --> 03:19.440 So we built this API as an MVP, as a kind of short-term solution. We import 03:19.440 --> 03:24.960 the data, we clean it, we harmonize it, and we publish it openly. 
Through our API, that's basically 03:24.960 --> 03:32.960 what we did in 2025, to create value ASAP and also get feedback from the community, 03:32.960 --> 03:38.560 from the users. As the next step, and that's what we're currently doing, we want to 03:38.560 --> 03:44.320 build this into a product in the medium term. So we're adding a value-based governance, 03:44.320 --> 03:48.320 but also a business model, to be able to finance it and have an actually sustainable structure. 03:49.600 --> 03:53.680 But then, on the other hand, we're also trying to fight root causes. That's why we're developing 03:53.680 --> 03:59.920 data standards within a Swiss framework for public, and in that sense 04:00.320 --> 04:05.200 legislative, data. And in the next step, as soon as we have the standard, which is going to be the 04:05.200 --> 04:13.280 case at the end of the year, basically, fingers crossed, we then want to enable and encourage, and in that sense 04:13.280 --> 04:18.880 also actively lobby, parliaments and governments to implement the standard and also publish their 04:18.880 --> 04:24.640 data through open APIs. Yes. And if that all goes right, we will be able to 04:24.640 --> 04:28.880 phase out one crawler after another that we're currently using to get most of the data. 04:30.080 --> 04:37.680 Yes. So maybe just quickly, how do we do that? As I've said, we scrape the data. 04:37.680 --> 04:43.360 It's a bit cut off, sorry, from the rendering, but we do that with Apache Hop for ETL, put that into 04:43.920 --> 04:52.880 a PostgreSQL database, and then that gets queried with a FastAPI backend, an Angular 04:52.960 --> 05:00.160 front end, and a React admin panel. So you get the JSON output from our API. 
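The import, clean, harmonize, and publish steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source names, field names, and target schema below are hypothetical and are not taken from OpenParlData.ch, whose actual pipeline runs in Apache Hop and PostgreSQL rather than in code like this.

```python
import json
from datetime import datetime

# Hypothetical raw records from two parliaments that use different field
# names and date formats (illustrative only, not real OpenParlData structures).
RAW_A = [{"Titel": "  Motion zur   Schulreform ", "Datum": "12.03.2024", "Nr": "24.101"}]
RAW_B = [{"title": "Postulat sur l'école", "submitted": "2024-05-02", "id": 77}]

# Per-source mapping from the harmonized field names to the source field names,
# plus the date format each source uses. All names here are assumptions.
MAPPINGS = {
    "parliament_a": {"title": "Titel", "date": "Datum", "external_id": "Nr",
                     "date_format": "%d.%m.%Y"},
    "parliament_b": {"title": "title", "date": "submitted", "external_id": "id",
                     "date_format": "%Y-%m-%d"},
}


def harmonize(record, source):
    """Map one raw record onto the shared (hypothetical) bill schema."""
    mapping = MAPPINGS[source]
    parsed = datetime.strptime(record[mapping["date"]], mapping["date_format"])
    return {
        "source": source,
        "external_id": str(record[mapping["external_id"]]),
        # Cleaning step: collapse repeated whitespace and trim the title.
        "title": " ".join(str(record[mapping["title"]]).split()),
        # Harmonizing step: every date becomes ISO 8601.
        "date": parsed.date().isoformat(),
    }


def harmonize_all():
    """Import both sources and return one list in the shared schema."""
    bills = [harmonize(r, "parliament_a") for r in RAW_A]
    bills += [harmonize(r, "parliament_b") for r in RAW_B]
    return bills


if __name__ == "__main__":
    # Publishing step: serialize the harmonized records as JSON.
    print(json.dumps(harmonize_all(), ensure_ascii=False, indent=2))
```

Keeping the per-source differences in a mapping table rather than in code means that adding another parliament is mostly a matter of adding one more mapping entry.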
You can test that, 05:00.160 --> 05:04.240 or you can check it out. It's currently in beta, but we're about to switch to a release version 05:04.240 --> 05:10.320 in the next weeks, and we always welcome feedback. So please open an issue if you see something 05:10.320 --> 05:19.280 that doesn't behave as it should. What data do we have? So, yeah, what do we 05:19.280 --> 05:24.480 mean by legislative data? We have data on bills, we have data on 05:24.480 --> 05:28.960 political actors, so MPs, how they vote, which committees they are part of, 05:29.680 --> 05:35.760 when they meet, agendas in parliament, meeting minutes. We have votes in parliament, we have 05:35.760 --> 05:41.360 different decrees, so laws. We have a lot of documents, and we also extracted all the information 05:41.360 --> 05:46.000 from those documents, so it's all queryable. So, a lot of text and a lot of processes, basically. 05:46.960 --> 05:51.280 Yeah, so if you query our API, this is what it could look like, 05:51.280 --> 05:57.200 just one excerpt. For example, in this case, the Grand Conseil of Vaud, of the canton of 05:57.200 --> 06:06.240 Vaud, this would be an extract from the bills. Then we also built a GUI, but for the 06:06.240 --> 06:11.440 sole purpose of letting people, also non-technical people, preview the data. 06:11.840 --> 06:15.920 It's not really supposed to be used as a monitoring tool itself, but more 06:15.920 --> 06:21.920 so people know what's going on. I think quite a cool feature is also that if you go through the GUI, 06:21.920 --> 06:31.520 you will also get the API query as the URL, in HTML and JSON. So, what can you do with our data?
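To illustrate the idea of a GUI that exposes the equivalent API query as a URL, here is a minimal sketch of how a set of selected filters could be turned into a reproducible query URL. The base URL, the parameter names, and the `format` switch are placeholders assumed for this example; they are not the real OpenParlData.ch endpoints.

```python
from urllib.parse import urlencode

# Placeholder base URL: an assumption for illustration, not the real API.
BASE_URL = "https://api.example.org/v1/bills"


def build_query_url(filters, response_format="json"):
    """Turn GUI filter selections into a shareable query URL.

    `filters` maps hypothetical parameter names to the selected values.
    """
    # Drop filters the user left empty so the URL carries only real choices.
    params = {key: value for key, value in filters.items() if value not in (None, "")}
    params["format"] = response_format
    # Sort the parameters so the same selection always yields the same URL.
    return BASE_URL + "?" + urlencode(sorted(params.items()))


if __name__ == "__main__":
    url = build_query_url({"body": "VD", "search": "école", "year": None})
    print(url)
```

Because the URL is built deterministically from the selection, someone previewing data in a GUI could copy it straight into a script or a notebook.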
06:32.320 --> 06:38.240 We basically evaluated, or are focusing on, two use cases, or two use case 06:38.240 --> 06:43.840 clusters, so to say. One is one-time or recurring data analysis, 06:44.560 --> 06:48.720 and there we focus on researchers and journalists, data journalists in that sense, 06:48.720 --> 06:54.400 and then there's the use case of more continuous and real-time, tool-assisted 06:54.400 --> 07:00.160 political monitoring by civil society organizations, by intercantonal conferences, that's a 07:00.160 --> 07:04.640 very Swiss thing, you don't have to know what that is, but then also journalists, so that they 07:04.960 --> 07:11.120 for example get alerts if something that interests them pops up. But we also want to 07:11.120 --> 07:16.160 enable exchange of data and information between different parliaments, 07:16.160 --> 07:24.480 between administration and parliaments, et cetera. Yes, so just in a nutshell, what can you do? 07:24.480 --> 07:31.600 As a researcher, for example, you can analyze bills, so how do they progress in parliaments, 07:31.680 --> 07:36.640 how do they go through the different stages, what are the success factors, like who submits them, 07:36.640 --> 07:42.640 et cetera. You can evaluate and analyze topical trends, you can analyze decrees and their 07:42.640 --> 07:49.200 legislative footprint, basically, voting behavior, for example, but also links to other 07:49.200 --> 07:55.760 data sources, like media coverage, company registry, et cetera. And the goal here is really 07:55.760 --> 08:00.320 to also build an ecosystem, a research ecosystem, around this data. 
08:00.960 --> 08:08.240 We want to encourage and enable researchers to cooperate, to share data, to share 08:08.800 --> 08:13.680 cleaned and enriched data, so if they process the data, they can share it with others, or they can share 08:13.680 --> 08:19.760 pre-trained ML classifiers, machine learning classifiers, topic modeling, but also data processing 08:19.760 --> 08:28.560 pipelines. Yes, and this is roughly what that looks like. It's not 08:28.560 --> 08:33.920 super important, but basically, again, we get the data from the parliaments, but also from 08:33.920 --> 08:39.680 third parties, via some APIs but mostly by web crawling, and then we can offer it to different 08:39.680 --> 08:45.360 researchers. They can cooperate among each other, but also go through our platform at some point. 08:47.040 --> 08:52.480 Yes, and we just finished that work, basically, and there is already the first article 08:52.480 --> 08:58.560 published a couple of weeks ago, which looks at, for example, voting behavior, in this case 08:58.560 --> 09:04.960 in Geneva, so basically comparing on a left-right axis how the different parties vote. 09:05.760 --> 09:12.000 There's also another funny prototype happening, so there's a module here, [unclear], that 09:12.000 --> 09:17.440 analyzed how animals are discussed in the Swiss parliament. So for example, here you see 09:17.440 --> 09:23.440 bunnies. Rabbits, since the 1990s, were mentioned, like, 13 times 09:23.440 --> 09:30.800 in the Swiss parliament, for example. Yeah, so where we are now is, basically we have this 09:30.800 --> 09:35.040 prototype, or the beta version, now, but the question is really, okay, how can we keep this 09:35.040 --> 09:40.400 data accessible and how can we operate this open data infrastructure sustainably 09:40.400 --> 09:45.600 and in the public interest? 
So the questions are, okay, how do we govern it 09:45.600 --> 09:50.880 democratically, how can we operate it efficiently and financially sustainably, 09:50.880 --> 09:55.520 and how can others contribute high-quality and interoperable data? And then last but not least, 09:55.520 --> 10:02.240 what can we learn from this? Yeah, so basically we're in the process of doing that. 10:02.240 --> 10:07.040 I got a grant from the Swiss government to focus on these two aspects. 10:08.480 --> 10:11.600 So we are currently evaluating and developing principles. 10:11.600 --> 10:19.520 Yeah, that's quite self-explanatory. And also objectives for the whole data 10:19.520 --> 10:24.720 infrastructure, and what we're also doing is focusing on the business model currently. 10:24.720 --> 10:32.640 So based on OSS, but also open data and digital commons project business models, 10:33.200 --> 10:38.720 we prioritized a couple of revenue streams that we will have a look at and 10:38.720 --> 10:44.880 evaluate more thoroughly. For example, usage fees, membership fees, kind of in the 10:44.880 --> 10:52.160 commons area, dual licensing, but also grants and donations, grants are how this project 10:52.160 --> 10:56.960 has been funded so far, but also selling support services and consultancy. 10:58.800 --> 11:04.160 And the next steps of this project are developing, implementing and evaluating, 11:04.320 --> 11:11.120 again, the business model, governance and organizational structure. But we also 11:11.120 --> 11:18.000 want to implement contribution and data quality mechanisms, and then last but not least check 11:18.000 --> 11:22.960 if there are synergies between what we are doing and a potential legislative database. 
11:23.360 --> 11:27.360 And if that all works out, we want to scale this, maybe also to some extent 11:27.360 --> 11:32.400 internationally, but maybe also to other data spaces in Switzerland. 11:33.520 --> 11:38.080 Yeah, and just very quickly, what have we learned so far? It's super important to continuously 11:38.080 --> 11:43.200 involve the stakeholders and the users from the very, very start. That's what we did. 11:43.920 --> 11:48.640 Coalition of the willing: we offer a hand to everybody and say, do you want to join us? 11:48.640 --> 11:51.920 But if they say no, we don't get blocked by these people. 11:52.880 --> 11:59.040 Yes, and then transparency is super important for us, so we really want to live this well, 11:59.040 --> 12:04.000 so everything is in the open, working in the open, there are no closed meeting notes, everything is on 12:04.000 --> 12:10.000 GitLab. And focus: there are various kinds of creep, feature creep, 12:10.000 --> 12:16.880 blah blah blah, data creep or whatever; try to fight those. Yes, I think that's it from my side, 12:16.880 --> 12:22.800 and I would love to hear your feedback, and if you have any ideas or people you think I should talk to. 12:22.800 --> 12:26.000 Thank you so much. 12:29.200 --> 12:31.040 A couple of minutes for questions, that's great. 12:32.560 --> 12:35.040 Yeah, so I had to run through this. Yeah. 12:46.880 --> 12:50.640 We should also see that some point to access on that. Does it mean that there are some 12:50.640 --> 12:55.760 elements that send this for people to rest of that? No. 13:00.480 --> 13:05.440 Yeah, so the question was, you asked whether there are 13:05.440 --> 13:11.440 cantons that offer data for financial compensation? No, that was a misunderstanding. 
No, 13:11.440 --> 13:16.160 it's just that you pay for a monitoring tool, for somebody to give you access to the data, 13:16.160 --> 13:22.000 but the data is all public. We currently only supply public data, but it's not really accessible. 13:22.000 --> 13:26.000 So if you want to scale, if you don't want to just have a look at one specific 13:26.000 --> 13:31.840 parliament at one point in time, then it's just, yeah, a lot of effort, basically. 13:31.840 --> 13:38.000 Yeah, and the second was about the switch from scraping to APIs. Yes, we had 13:38.000 --> 13:42.640 our own experience when the parliament started offering an API. 13:42.640 --> 13:45.920 It looked like it was less rich than what we could scrape. 13:45.920 --> 13:50.320 And so in the end, we remained on scraping, although we had to switch for some parts, 13:50.320 --> 13:54.800 because of the APIs [unclear]. Yeah, that's a very good question. 13:54.800 --> 13:59.840 So the question was, from the experience in France, if I got this correctly, 14:00.640 --> 14:05.040 the data that was provided through APIs was less than what was 14:05.040 --> 14:07.120 available through the website, so they stayed with the website. 14:08.160 --> 14:12.400 Yeah, so during the time that we did the project, there was no switch so far, 14:13.040 --> 14:17.760 but the thing is that we try to counteract this through the standard, 14:17.760 --> 14:23.280 basically. So within the standard, it's defined, okay, this has to be published 14:23.280 --> 14:28.400 in this and that way, so they can't do that. But yeah, 14:28.400 --> 14:32.640 I think it's definitely a very good point, and I think, 14:32.720 --> 14:39.200 I mean, this is a data project, but it's also a kind of lobbying project to that extent. 14:40.720 --> 14:42.000 Yes, thank you. 
14:42.000 --> 14:47.920 And very good point. I've seen other initiatives, similar ones; I just know one from Brazil, 14:47.920 --> 14:51.920 I think it's LexML, with the open standard. Yeah, 14:51.920 --> 14:59.280 have to take a look. Yeah, yeah, LexML. So for the decrees, for the text, 14:59.360 --> 15:06.560 we will include LexML and Akoma Ntoso also, which is connected to LexML. Yeah, that's, yeah. 15:08.400 --> 15:12.960 Yeah, and if you have other questions, I'm going to be outside for the next five minutes or whatever, 15:12.960 --> 15:17.120 or just until there are no more questions. Thank you.