Data Hubs


The data hubs are located at seven major GHGA sites across Germany: the DKFZ in Heidelberg, the University of Tübingen, the Technical Universities of Munich and Dresden (TUM and TUD), the University of Cologne, the MDC in Berlin and the University of Kiel.

Once in operation, the GHGA data portal will serve as a single point of contact for uploading, downloading and analysing genomic data. Behind the scenes, this ‘central’ face of GHGA is served by seven data hubs operating as a federated network.

The data hubs are connected to well-established local omics centres (e.g. sequencing centres, but also proteomics facilities). These centres generate a significant portion of the omics data and associated technical metadata in Germany and are therefore major providers of data for GHGA. Besides storage and compute infrastructure for GHGA services, the data hubs contribute significant resources, professional operations, technical security and scalability. One great benefit of the data hub network is that it will establish replication services across data hubs, with geo-redundant storage and backup provided at each data hub.

The originally diverse setups of the data hubs are being consolidated into a joint infrastructure. The data hub at the DKFZ in Heidelberg is taking the lead role in this endeavour and will offer the central services required to realise the federated GHGA infrastructure.

Recently, all data hubs signed the legal framework that governs the details of cooperation between the partners, a big step towards a joint GHGA infrastructure across the participating hubs.