GHGA Lecture Series: Christian Fufezan (virtual)
- 12 Nov 2025
Christian Fufezan of Heidelberg University and GlaxoSmithKline will give a talk titled "urgap - unified resource governance and data provenance" in the GHGA lecture series "Advances in Data-Driven Biomedicine" on November 12, 2025.
Register here.
Abstract:
Data-intensive research now depends on repeated processing of large file collections, yet current practice often duplicates effort and obscures lineage. We describe urgap, a cloud-native framework for file-based data engineering that makes identity and provenance intrinsic to every artifact. To achieve this, urgap relies on location-agnostic data identity, captured by the urgap canonical file signature (ucfs), and on a "Provenance as Code" (PaC) architecture. Outputs carry their own history, enabling safe reuse and automatic skipping of redundant steps across projects and platforms. Furthermore, urgap enables standardized microservices and exposes all encapsulated processes as RESTful endpoints and Model Context Protocol (MCP) servers, the executors of modern agentic AI approaches. I will present urgap as an open-source foundation for file-based data engineering that facilitates standardized data provenance, aligns with FAIR principles, and addresses the increasingly distributed nature of data generation and consumption in rapidly developing environments. It reduces operational costs and environmental impact while enhancing adaptability to emerging technologies and ensuring compatibility across cloud providers and orchestration platforms.
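
For readers unfamiliar with the idea, the general technique behind location-agnostic identity and skipping of redundant steps can be sketched in a few lines of Python. This is only an illustration of content-based hashing under stated assumptions and not urgap's implementation: the abstract does not specify how the ucfs is computed, and the names `content_signature`, `step_key`, and `run_once`, the use of SHA-256, and the JSON sidecar file are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def content_signature(path: Path) -> str:
    """Hash only the file's bytes, so the result ignores name and location.

    Assumption: SHA-256 stands in for whatever the real ucfs uses.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def step_key(step_name: str, params: dict, input_signatures: list[str]) -> str:
    """Derive a key from the step name, its parameters, and its input identities."""
    record = {"step": step_name, "params": params, "inputs": input_signatures}
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_once(step_name, params, inputs, cache_dir, func):
    """Run func(inputs, params) -> bytes unless an output with the same key exists.

    Returns (output_path, ran), where ran is False if the step was skipped.
    """
    signatures = [content_signature(p) for p in inputs]
    key = step_key(step_name, params, signatures)
    output = cache_dir / f"{key}.out"
    if output.exists():
        return output, False  # identical work was done before: reuse it
    cache_dir.mkdir(parents=True, exist_ok=True)
    output.write_bytes(func(inputs, params))
    # Hypothetical sidecar file recording how the output was produced.
    provenance = {"step": step_name, "params": params, "inputs": signatures}
    output.with_suffix(".json").write_text(json.dumps(provenance, sort_keys=True))
    return output, True
```

In this sketch, two copies of the same file stored under different names or directories yield the same signature, so a step that has already been run on one copy is skipped when it is requested for the other.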