GHGA Metadata Schema Version 0.4.0 Released

In December 2021, we released version 0.4.0 of the GHGA Metadata Model, which contains key metadata information and provides a solid foundation for further refinement of the GHGA metadata schema.

The Metadata Workstream provides the model for the data stored in GHGA. It is a joint effort of the conceptual and technical departments of GHGA. The team is composed of experts in areas including database technologies, legal frameworks, community standards and FAIR data principles, and their combined knowledge feeds into the definition of the concept of GHGA Metadata.

The GHGA Metadata Model was publicly released in GHGA's GitHub repository. This core model serves as the foundation for further development of our metadata model. Capturing essential information, such as the donors of specimens, the experiments and the analysis of data, already increases the reusability of data deposited within GHGA.

Additionally, we have identified ontologies that define the data that needs to be captured, so data submitters can see upfront what information they need to provide and are aware of the semantics of the metadata they submit. Ontologies can be described as hierarchical vocabularies that capture term definitions, descriptions and URIs (Uniform Resource Identifiers). The evaluation of ontologies is an ongoing process that will support GHGA's efforts to comply with the W3C's vision of the “Web of Linked Data”, the Semantic Web.
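
To illustrate what "hierarchical vocabulary" means in practice, here is a minimal Python sketch. It is not taken from the GHGA Metadata Model: the class name OntologyTerm, the example.org URIs and the two terms are hypothetical and chosen only to show how a term can carry a URI, a label and a definition, and how it can point to a broader parent term.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OntologyTerm:
    """One entry in a hierarchical vocabulary (hypothetical, for illustration)."""
    uri: str                                  # unique identifier of the term
    label: str                                # human-readable name
    definition: str                           # short textual definition
    parent: Optional["OntologyTerm"] = None   # broader term, None for a root

    def lineage(self) -> List[str]:
        """Return the labels from this term up to the root of the hierarchy."""
        labels = []
        term: Optional["OntologyTerm"] = self
        while term is not None:
            labels.append(term.label)
            term = term.parent
        return labels


# Hypothetical terms; the URIs are placeholders, not real ontology identifiers.
disease = OntologyTerm(
    uri="http://example.org/onto/0001",
    label="disease",
    definition="A disorder of structure or function.",
)
cancer = OntologyTerm(
    uri="http://example.org/onto/0002",
    label="cancer",
    definition="A disease involving abnormal cell growth.",
    parent=disease,
)

print(cancer.lineage())
```

Because every term has a stable URI and a place in the hierarchy, a submitter who annotates a record with the narrower term also makes it findable under the broader one.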

This core model will be extended in the near future to capture domain-specific information. Currently, we are working on accommodating the use cases of the rare disease and cancer communities, while making sure we are able to serve the entire omics community in the future.