Researchers can now analyze data from Statistics Denmark on national HPC facilities
From December 4, 2025, Danish researchers will be able to analyze sensitive data from Statistics Denmark (DST) at a number of national HPC facilities while maintaining DST's security rules and data confidentiality.
This is done through a new API (application programming interface) that connects Statistics Denmark's Data Window with the country's HPC (high-performance computing) facilities. The solution has been developed and launched by Statistics Denmark in collaboration with DeiC and the universities' HPC environments to give Danish researchers flexible and secure access to advanced data processing.
New technical bridge between Statistics Denmark and the universities
For 175 years, Statistics Denmark has collected data about Denmark and Danes, and since 1988, researchers at Danish research institutions have been able to work with this data in closed computer environments called research machines via Statistics Denmark's microdata schemes.
The new API solution makes it possible to move the analysis itself to university supercomputers - while fully maintaining data security. This gives researchers access to far greater computational capacity and modern analysis tools without compromising security.
GenomeDK and DTU Computerome are the first facilities to sign an agreement with DST on this solution, and from December 4 it will be possible to create projects via these two HPC facilities.
Secure pseudonymization and controlled data flow
Today, data is made available to certified users at institutions authorized under DST's microdata schemes, who order it via Denmark's Data Window. All data is pseudonymized before researchers gain access to it and can use it in their research.
With the new API, the pseudonymized data can be processed at an approved HPC facility where the researcher has been granted or has purchased computing time. The transfer is based on a so-called pull architecture, in which the HPC centers themselves retrieve the necessary data and instructions when they are ready to receive them. This means that Statistics Denmark does not need to establish technical connections to each individual facility, which both increases security and makes the solution easier to maintain and expand.
Data processing then takes place at the HPC facility, where the researcher has access to highly specialized hardware, complex software solutions and technical support. Once the analysis is complete, the results must be returned to DST for approval before the researcher can retrieve them for further use.
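The flow described above can be illustrated with a minimal in-memory sketch. It is not the actual DST API: the names DataWindowStub, Job, fetch_next, submit_result and poll_once are hypothetical, and no network, authentication or pseudonymization is modeled. The sketch only shows the pull principle, in which the facility initiates every exchange and the data owner merely answers requests and logs each movement.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    job_id: str
    facility: str        # which HPC facility the job is intended for
    payload: list[int]   # stands in for pseudonymized data


@dataclass
class DataWindowStub:
    """Hypothetical stand-in for the data owner's side.

    It never contacts a facility; it only answers calls made to it.
    """
    queue: deque = field(default_factory=deque)
    audit_log: list = field(default_factory=list)
    pending_approval: dict = field(default_factory=dict)

    def publish(self, job: Job) -> None:
        # Make a job available; nothing is pushed to the facility.
        self.queue.append(job)
        self.audit_log.append(("published", job.job_id))

    def fetch_next(self, facility: str):
        # Called BY the facility. Returns None if nothing is waiting for it.
        for job in self.queue:
            if job.facility == facility:
                self.queue.remove(job)
                self.audit_log.append(("pulled", job.job_id))
                return job
        return None

    def submit_result(self, job_id: str, result) -> None:
        # Results come back and wait for approval before release.
        self.pending_approval[job_id] = result
        self.audit_log.append(("returned", job_id))


def poll_once(window: DataWindowStub, facility: str, ready: bool):
    """One polling cycle run on the facility's side."""
    if not ready:
        return None                  # facility decides when to pull
    job = window.fetch_next(facility)
    if job is None:
        return None
    result = sum(job.payload)        # placeholder for the real analysis
    window.submit_result(job.job_id, result)
    return job.job_id
```

In this sketch, adding another facility requires no change on the data owner's side, since each facility simply calls fetch_next with its own identifier. That mirrors the maintenance argument made for the pull design.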
Michael Specht, Project Manager at DST, emphasizes:
"It's about moving the calculations to where the skills and resources are - without compromising on security and control. Our most important principle is that data should never get out of our control. Therefore, the entire solution has been built around the fact that all data transfers take place via the Danish Data Window, and that we keep track of every single movement."
One solution for broad use
The solution is the result of a close collaboration between Statistics Denmark, Danish e-infrastructure Consortium (DeiC) and the universities' HPC centers and was developed at the request of the Coordinating Body for Register Research (KOR). The goal from the start has been to develop one standardized solution that all Danish HPC centers and other research organizations can join - and thus avoid having to develop separate integrations for each center.
Project Manager Rune Gamborg Ørum from DeiC sees potential for other kinds of users as well:
"We look forward to the solution going live, and see opportunities for other types of organizations than HPC facilities, such as sector research institutions, to potentially benefit from the solution in the long term."
The chairperson of the Coordinating Body for Register Research is also pleased that the solution is now a reality:
"I really appreciate the fact that DeiC and Statistics Denmark have succeeded in implementing this project. It opens up more opportunities for researchers to analyze large and complex data sets."
About the project
Development of the new API started in 2023. Statistics Denmark has been responsible for project management and for developing the secure access to data, while DeiC has played a key role in the technical collaboration with the HPC centers and in developing the connection between the API and the HPC facilities. DeiC has also developed and tested proof-of-concept code that the HPC centers have been able to adapt to their local systems.
The HPC centers have contributed extensively and continuously with technical feedback and testing, and their involvement has been crucial to ensuring that the solution works in practice across different platforms. The result is a scalable and robust architecture.
The technical implementation of the API and its integration with Denmark's Data Window were carried out by Copenhagen Data.
If you want to know more about the solution, you can contact DST's research service. If you want to know more about DeiC's role in the project, please contact Sensitive Data Coordinator Jakob Bech Petersen.