Distributed Databases Federation

ubi:fedquery: An Infrastructure for the Federation of Distributed Databases and Data Repositories

Complex organizational structures in today’s governments and businesses, the continuously growing volume of stored data, and the many types of data sources and storage solutions make up the heterogeneous landscape in which organizations have to operate. Efficient operation therefore requires timely access to, and a unified view of, both structured and unstructured data coming from diverse systems, databases and data repositories.

To this end, ubi:fedquery promotes the real-time aggregation and federation of data coming from geographically distributed, heterogeneous data sources. By creating a federated central repository of aggregated data, ubi:fedquery provides access to distributed data sources as if they were part of a single data repository. It also deploys several mechanisms that update the aggregated data in near real time from the respective primary records held in the geographically distributed databases and repositories.

The database schema of the federated central repository will cover at least the minimum subset of obligatory data fields that exist in all types of the distributed data sources to be integrated. The schema will be extended to cover descriptive metadata about the origin and provenance of the aggregated data and the federated repositories. This minimum subset of obligatory metadata of the aggregated records constitutes the peer-schema on which the federation of the distributed databases and repositories will be based. The federation will be supported by a hybrid mechanism of ubi:fedquery that combines two capabilities. On the one hand, it aggregates the diverse data records, providing an interoperability layer with a set of Web Services that instantly update the respective record (the peer-schema related dataset) in the federated central repository once a record in a distributed repository is modified. On the other hand, it supports federated queries, based on multi-criteria searches upon the peer-schema related dataset, to all geographically distributed databases and repositories.
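The peer-schema and the federated query described above can be illustrated with a minimal Python sketch. It is not the ubi:fedquery implementation: the `PeerRecord` fields, the `federated_query` function and the two sample sources are hypothetical names chosen for illustration, and each source is modelled as an in-memory function rather than a remote database.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class PeerRecord:
    """Hypothetical peer-schema record: obligatory fields plus provenance."""
    record_id: str        # identifier within the originating source
    title: str            # example of an obligatory field shared by all sources
    category: str         # another example of an obligatory field
    source_id: str        # provenance: which repository the record came from
    source_location: str  # provenance: where that repository is located


# A source is modelled as a callable that returns its records
# already mapped to the peer-schema (an assumption of this sketch).
Source = Callable[[], Iterable[PeerRecord]]


def federated_query(sources: Dict[str, Source], **criteria: str) -> List[PeerRecord]:
    """Run a multi-criteria search against every source and merge the results."""
    results: List[PeerRecord] = []
    for name in sorted(sources):  # fixed order keeps the merged result stable
        for record in sources[name]():
            # A record matches only if it satisfies every given criterion.
            if all(getattr(record, key) == value for key, value in criteria.items()):
                results.append(record)
    return results


def registry_source() -> List[PeerRecord]:
    return [
        PeerRecord("r-1", "Land parcel 17", "property", "registry", "Athens"),
        PeerRecord("r-2", "Company filing 9", "business", "registry", "Athens"),
    ]


def archive_source() -> List[PeerRecord]:
    return [PeerRecord("a-1", "Land survey 1998", "property", "archive", "Patras")]


sources = {"registry": registry_source, "archive": archive_source}
matches = federated_query(sources, category="property")
narrowed = federated_query(sources, category="property", source_location="Athens")
print([(m.source_id, m.record_id) for m in matches])
print([(m.source_id, m.record_id) for m in narrowed])
```

Because every record carries its provenance fields, the caller can always tell which distributed repository a result came from, even though the query is issued once against the whole federation.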

Each distributed data source taking part in the federation process will invoke the implemented set of Web Services and use the designed Data Schemas in order to interoperate with the federated central repository and perform the required updates and insertions of new data coming from the distributed databases. Thus, the federated central repository will be continuously updated and kept consistent with the complete set of databases.
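The update flow described above, where a source reports a change and the central repository inserts or updates the matching record, might look roughly like the following Python sketch. The `CentralRepository` class and the `on_source_record_changed` hook are hypothetical, and a plain function call stands in for the Web Service invocation that a real deployment would make over the network.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class CentralRepository:
    """In-memory stand-in for the federated central repository (an assumption)."""
    # Records are keyed by (source_id, record_id) so that identifiers
    # from different sources cannot collide.
    records: Dict[Tuple[str, str], Dict[str, str]] = field(default_factory=dict)

    def upsert(self, source_id: str, record_id: str, fields: Dict[str, str]) -> str:
        """Insert a new record or overwrite the existing one; report which happened."""
        key = (source_id, record_id)
        status = "updated" if key in self.records else "inserted"
        # Store the peer-schema fields together with their provenance.
        self.records[key] = {**fields, "source_id": source_id}
        return status


def on_source_record_changed(
    repo: CentralRepository, source_id: str, record_id: str, fields: Dict[str, str]
) -> str:
    """Hypothetical hook a distributed source calls whenever a record changes.

    In a real deployment this would be a Web Service request; here it is
    a direct call so that the sketch stays self-contained.
    """
    return repo.upsert(source_id, record_id, fields)


repo = CentralRepository()
first = on_source_record_changed(repo, "registry", "r-1", {"title": "Land parcel 17"})
second = on_source_record_changed(
    repo, "registry", "r-1", {"title": "Land parcel 17 (amended)"}
)
print(first, second, len(repo.records))
```

Treating inserts and updates as one upsert operation means a source does not need to know whether the central repository has already seen a record, which keeps the source-side integration simple.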

By utilizing ubi:fedquery, your organization will be able to provide both the execution of federated queries against geographically distributed, heterogeneous data sources and the continuous integration of information inside the federated central repository, in a transparent way.