No Research or Innovative Care Without Data


The impact data sharing and GHGA can have on disease communities.

Whether in research or clinical care, data is power. Genomic data from a single individual carries only limited information. Only by combining data from larger cohorts will research gain the power needed for robust results and new scientific discoveries.

GHGA aims to bring these data treasures together and unlock the data’s full potential. Sharing data, in a protected and safe manner, enables scientific discovery and the development of diagnostic tools and therapies.



Omics is a vital part of biomedical research and increasingly part of standard health care


Rapid technological advances have dramatically decreased sequencing costs, making the technology affordable and accessible for researchers. In addition, advanced bioinformatic methods have made sequencing a standard research and diagnostic technique, resulting in vast amounts of data being produced. The rapid growth of available data is a major challenge, but also an unprecedented opportunity for research.

Genomic data is becoming an increasingly important tool in healthcare. It allows the diagnosis of (rare) diseases whose underlying genetic modifications can often only be identified by genome sequencing. Screening for disease risk genes (e.g. for cancer) can also assist in disease prevention by allowing closer medical monitoring of individuals at risk. Individually tailored therapy approaches based on genome analysis account for biological variation in patients and have tremendous potential to help patients who do not respond to standardised therapy approaches. From cancer to rare diseases, genomics can greatly help to monitor and improve health conditions and to save lives.

Genomic data is sensitive, and patients know that. Yet patients are willing to donate their data to science - hoping to help future patients with new developments in science. A study involving cancer patients found that 97 percent are generally willing to make clinical data available for biomedical research purposes. The major condition for their consent? High data security. A goal GHGA strives towards.

So far, molecular analysis of genomic data for diagnosis or personalised therapies is not part of standard health care in Germany. However, this is about to change. In 2021, a new law laid the legal groundwork for a model project aiming to integrate genomic medicine in daily clinical care over 5 years starting in 2024. 


Omics can already have a big impact on health care. Here we highlight five examples where omics data have the potential to revolutionise health care, and explain how GHGA is part of this process.


Data sharing during crisis: the corona pandemic

Without genome research, the corona pandemic would have been very different. We would not have been able to decode the SARS-CoV-2 genome, develop a vaccine in record-breaking time, or monitor emerging variants and thereby inform public health policy. Without sequencing of human host genomes, we would not have been able to predict individual risk factors. Data sharing enables and accelerates research discovery. In crisis situations, the need to share research data quickly, yet securely, is most pressing. Establishing infrastructure to facilitate secure, yet democratic data access for legitimate research questions will provide Germany with a solid foundation for the future. During the corona pandemic, GHGA was part of the CoGDat initiative and established a portal to collect and share SARS-CoV-2 raw sequencing data, enabling molecular surveillance of virus variants across Germany.

Sharing omics data for precision cancer medicine

Comprehensive genomic and transcriptional analysis can enable personalised medicine and improve patient care. The ongoing multicentre observational NCT/DKFZ/DKTK MASTER (Molecularly Aided Stratification for Tumour Eradication Research) study demonstrates that molecular analysis provides diagnostic and therapeutic benefits for patients with rare cancers. Using a standardised precision oncology workflow, the tumour genome of the patient is sequenced and bioinformatically analysed. The results are discussed in a multidisciplinary tumour board, and evidence-based clinical care recommendations are given accordingly. With new therapy suggestions for more than 85% of cases and increased progression-free survival rates for up to a third of patients, the MASTER program proves the clinical use case for genome medicine and underlines the need for reference data. The MASTER data set will be one of the first data sets available within GHGA.

Reference Genome Collection for Rare Diseases

Rare diseases are caused by variants in the genetic code that are often rare in the general population. Identifying these rare genetic changes relies on a large collection of reference genomes to which the patient genome can be compared. The rarer the disease, the larger the data sets need to be in order to find the genetic abnormality. For cases occurring with a frequency of 1 in 1 000 000, a cohort of 100 million genomes is needed to robustly call rare disease cases. Only if large reference data sets are established and shared across clinics can the diagnosis of patients be successful. A swift and correct diagnosis, followed by disease treatment or management, is an immense success for the individual patient, and it also eases the pressure on the health care system. Using sequencing techniques early in the diagnostic process can more than triple the diagnostic rate, at one-third of the cost per diagnosis. GHGA is working closely with the rare disease community to ensure that, for example, workflows tailored to the community's needs are provided.
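
The relationship between disease frequency and cohort size can be illustrated with a little arithmetic. The sketch below is only an illustration and not GHGA's method: it assumes that cases occur independently and uses a Poisson approximation, and the threshold of ten observed cases is a hypothetical stand-in for "robust", since no specific criterion is given here. The function names are likewise made up for this example.

```python
import math


def expected_cases(cohort_size: int, frequency: float) -> float:
    """Expected number of cases in a cohort (cohort size times frequency)."""
    return cohort_size * frequency


def prob_at_least(k: int, cohort_size: int, frequency: float) -> float:
    """Probability of observing at least k cases (Poisson approximation)."""
    lam = expected_cases(cohort_size, frequency)
    # P(X >= k) = 1 - P(X <= k - 1) for X ~ Poisson(lam)
    below_k = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below_k


frequency = 1 / 1_000_000  # 1 case in 1 000 000, as in the text
min_cases = 10             # hypothetical threshold, chosen for illustration

for cohort_size in (1_000_000, 10_000_000, 100_000_000):
    mean = expected_cases(cohort_size, frequency)
    p = prob_at_least(min_cases, cohort_size, frequency)
    print(f"{cohort_size:>11,} genomes: ~{mean:.0f} expected cases, "
          f"P(at least {min_cases}) = {p:.3f}")
```

Under these assumptions, a cohort of one million genomes is expected to contain only about one case, whereas a cohort of 100 million is expected to contain about one hundred, which makes it far more likely that enough cases are available for comparison.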




AI-assisted health care requires data

Artificial intelligence (AI) is already widely used in healthcare, with a growing number of applications: robot-assisted surgery, apps that help to identify skin cancers, evaluation of MRI and X-ray images, wearables for diabetes patients, fitness trackers, etc. Large data sets are needed to train the AI algorithms behind these applications. The further expansion of AI-assisted health care has the potential to save billions of US$ per year worldwide. GHGA will provide secure access to large, homogenised data sets to enable and promote AI-based big data analysis for research purposes.


Using population data for research

The German National Cohort (NAKO Gesundheitsstudie) is a long-term population study organised and conducted by a network of German research institutions. The aim is to shed light on the causes of common diseases such as cancer, diabetes, cardiovascular diseases and infectious diseases, and to uncover the role of e.g. environmental factors, nutrition, lifestyle and our genes. The comprehensive health data collection from a total of 200,000 participants, now expanding to omics data, will serve to elucidate the causes and risk factors of these widespread diseases, as well as identify opportunities for early detection and prevention. The omics data will be stored within GHGA - making this large population-based data set also available for secondary research use.