UBITECH participates in the kick-off meeting of the CEDAR Research and Innovation Action, hosted by CERTH in Thessaloniki, Greece (January 30 & 31, 2024). The action officially started on January 1st, 2024, is funded by the European Commission (Grant Agreement No. 101135577), and spans the period January 2024 – December 2026. CEDAR will (1) identify, collect, fuse, harmonise, and protect complex data sources to generate and share 10+ high-quality, high-value datasets relevant to a more transparent and accountable public governance in Europe; (2) develop interoperable and secure connectors and APIs to utilise and enrich 6+ Common European Data Spaces (CEDS); (3) develop innovative and scalable technologies for effective big data management and Machine Learning (ML) operations; (4) deliver robust big data analytics and ML to facilitate human-centric and evidence-based decision-making in public administration; (5) validate the new datasets and technologies (TRL5) in the context of fighting corruption, thus aligning with the EU strategic priorities of digitalisation, economy, and democracy; and (6) actively promote results across Europe to ensure their adoption and longevity, and to generate positive, direct, tangible, and immediate impacts.
Within CEDAR, UBITECH coordinates the identification and definition of the open and proprietary data sources, repositories and collections to be used in the project. This work will further guide the design of CEDAR data flows within the conceptual architecture and the development of CEDAR Data- and ML-Ops, as well as the data to be shared or integrated with existing CEDS. Moreover, UBITECH will contribute significantly to the Data Modelling, Harmonisation and Alignment mechanisms, which will leverage semantic technologies to define data models that provide a common data structure for the different use cases. UBITECH will also contribute to the Data Protection and (Pseudo)Anonymisation mechanisms, which will protect the personal data identified by incorporating several pseudonymisation and anonymisation techniques, including masking, generalisation, and perturbation, together with methods for assessing the re-identification risk and mitigating it.
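To make the three techniques named above more concrete, the following is a minimal Python sketch, not CEDAR code. The record fields, function names, and parameter values are all hypothetical and chosen only for illustration. It assumes that masking hides all but the last characters of an identifier, that generalisation replaces an exact age with a ten-year band, and that perturbation adds bounded random noise to an amount. Re-identification risk is approximated here as 1/k, where k is the size of the smallest group of records sharing the same quasi-identifier values (a simple k-anonymity-style measure; the project may well use other assessment methods).

```python
import random
from collections import Counter


def mask(value, visible=2, fill="*"):
    """Masking: hide all but the last `visible` characters."""
    if visible <= 0:
        return fill * len(value)
    return fill * max(len(value) - visible, 0) + value[-visible:]


def generalise_age(age, width=10):
    """Generalisation: replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def perturb(amount, rng, scale=0.05):
    """Perturbation: multiply by a random factor within +/- `scale`."""
    return round(amount * (1 + rng.uniform(-scale, scale)), 2)


def reidentification_risk(records, quasi_identifiers):
    """Return (k, 1/k), where k is the smallest equivalence-class size."""
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    k = min(classes.values())
    return k, 1 / k


def protect(records, seed=0):
    """Apply the three techniques to every record (hypothetical schema)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        {
            "tax_id": mask(r["tax_id"]),
            "age_band": generalise_age(r["age"]),
            "city": r["city"],
            "amount": perturb(r["amount"], rng),
        }
        for r in records
    ]


# Entirely made-up sample data.
RECORDS = [
    {"tax_id": "EL123456789", "age": 34, "city": "Thessaloniki", "amount": 1200.0},
    {"tax_id": "EL987654321", "age": 37, "city": "Thessaloniki", "amount": 560.5},
    {"tax_id": "EL555000111", "age": 52, "city": "Athens", "amount": 980.0},
    {"tax_id": "EL444333222", "age": 58, "city": "Athens", "amount": 2300.0},
]

if __name__ == "__main__":
    print("before:", reidentification_risk(RECORDS, ["age", "city"]))
    protected = protect(RECORDS)
    print("after: ", reidentification_risk(protected, ["age_band", "city"]))
    for row in protected:
        print(row)
```

On this sample, every record is unique on exact age and city (k = 1), whereas after generalisation each age band and city combination is shared by two records (k = 2), which halves the estimated risk under this simple measure.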