GHGA Lecture Series: Christian Fufezan (virtual)

Christian Fufezan of Heidelberg University and GlaxoSmithKline will speak on 12 November 2025 as part of the GHGA lecture series "Advances in Data-Driven Biomedicine" about "urgap – unified resource governance and data provenance". The talk will be held in English.

Register here.

Abstract:

Data-intensive research now depends on repeated processing of large file collections, yet current practice often duplicates effort and obscures lineage. We describe urgap, a cloud-native framework for file-based data engineering that makes identity and provenance intrinsic to every artifact. To achieve this, urgap relies on location-agnostic data identity, captured by the urgap canonical file signature (ucfs), and on a "Provenance as Code" (PaC) architecture. Outputs carry their own history, enabling safe reuse and automatic skipping of redundant steps across projects and platforms. Furthermore, urgap enables standardized microservices and exposes all encapsulated processes as RESTful endpoints and Model Context Protocol (MCP) servers, the executors of modern agentic AI approaches. I will present urgap as an open-source foundation for file-based data engineering that facilitates standardized data provenance, aligns with FAIR principles, and addresses the increasingly distributed nature of data generation and consumption in rapidly developing environments. It reduces operational costs and environmental impact while enhancing adaptability to emerging technologies and ensuring compatibility across cloud providers and orchestration platforms.
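The general idea of location-agnostic identity and skipping redundant steps can be illustrated with a minimal Python sketch. This is not urgap's API and does not reproduce the ucfs format, neither of which is described in the abstract. The function names (`content_signature`, `step_key`, `run_step`), the choice of SHA-256, and the in-memory `registry` dictionary are all assumptions made for illustration only.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def content_signature(path):
    """Hash only the file's bytes, so name and location do not affect identity.

    Hypothetical stand-in for a canonical file signature; not the ucfs format.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def step_key(step_name, params, input_signatures):
    """Derive a key for one processing step from its name, parameters, and inputs."""
    payload = json.dumps(
        {"step": step_name, "params": params, "inputs": input_signatures},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_step(step_name, params, inputs, func, registry):
    """Run func unless an output for the same step key is already registered.

    Returns (record, reused). The record stores the output together with the
    provenance that produced it (assumed structure, for illustration).
    """
    signatures = [content_signature(p) for p in inputs]
    key = step_key(step_name, params, signatures)
    if key in registry:
        return registry[key], True  # identical work was done before: skip it
    record = {
        "output": func(inputs, params),
        "provenance": {"step": step_name, "params": params, "inputs": signatures},
    }
    registry[key] = record
    return record, False


def count_lines(inputs, params):
    """Toy processing step: count the lines across all input files."""
    return sum(len(Path(p).read_text().splitlines()) for p in inputs)


with tempfile.TemporaryDirectory() as tmp:
    # Two copies of the same data under different names and directories.
    first = Path(tmp) / "a" / "reads.txt"
    second = Path(tmp) / "b" / "copy_of_reads.txt"
    for target in (first, second):
        target.parent.mkdir(parents=True)
        target.write_text("ACGT\nTTGA\n")

    same_identity = content_signature(first) == content_signature(second)

    registry = {}
    record_1, reused_1 = run_step("count_lines", {}, [first], count_lines, registry)
    record_2, reused_2 = run_step("count_lines", {}, [second], count_lines, registry)
```

In this sketch the second call is served from the registry because the step key depends on file content rather than file path. A real system would presumably persist the registry and attach the provenance record to the output artifact itself.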