
Supercomputing was a game changer for Danish Cancer Society Research Center

A group of researchers started from scratch 10 years ago using computational biology. Since then, local high performance computing (HPC), national HPC and international HPC have all been used to explore new areas of cancer research at the Danish Cancer Society Research Center.
Birgitte Vedel Thage
Fri, 12/17/2021 - 14:40
Extract from a simulation of the tumour suppressor gene p53 in complex with DNA using DeiC Throughput HPC. Photo: Matteo Lambrughi.

High Performance Computing (HPC) can elevate research and break new ground. Researcher Elena Papaleo has demonstrated this at the Danish Cancer Society Research Center over the last 10 years, during which she has carried out research on several supercomputers both in Denmark and abroad.

Started from scratch and followed the dreams

The journey started more than 10 years ago. Before Elena Papaleo and her group joined the Danish Cancer Society Research Center (DCRC), the center had no research in computational biology.

“We started from scratch. Danish Cancer Society Research Center did not even know that they needed HPC facilities. It was an interesting learning process as it became clear that local facilities were too small, and therefore after some years a collaboration was initiated with the national HPC facility called Computerome. We could not model what we dreamed of doing with only a few clusters,” says Elena Papaleo, leader of Cancer Structural Biology (formerly the Computational Biology Laboratory) at the Danish Cancer Society Research Center and Associate Professor at the Department of Health Technology, Bioinformatics, at the Technical University of Denmark.

HPC competences built on pilot grants and learning by doing

Small steps were taken to scale HPC code up when moving from local HPC (Tier-2) to larger facilities such as national HPC (Tier-1). Tier-1 facilities were explored both in Denmark (Computerome and ABACUS2.0) and abroad (e.g., Archer in the UK). The group also had good experiences using DeiC pilot grants for national HPC, and Elena Papaleo has used facilities in many parts of the world, including some of the largest machines in France and Spain (Tier-0).

“There is a learning curve when you need to interact with a new infrastructure. All the team members involved were encouraged to document what they learned and the challenges they met when working with supercomputers, specifically for our type of work. The experience we gained from interacting with HPC facilities all over the world was collected on an internal website at DCRC, where tips and tricks were kept for group members. This was important for developing our field-specific HPC competences on top of general Wiki pages with HPC information,” says Elena Papaleo.

Define HPC needs and avoid wrong HPC architecture

The team behind Cancer Structural Biology at DCRC. Photo: Lissa Churchward.

Elena Papaleo and her group have experienced the importance of accessing the right HPC architecture for solving their research questions. One of their first applications to PRACE was granted an allocation on a machine that was not designed for their type of work. Because the HPC architecture was not optimal, they had to narrow the scope of the original research purpose.

“Figuring out what to do with the machine was a challenging learning experience. The match between the HPC architecture and the research question is very important. As a beginner many years ago, there was no previous experience to rely on. A good piece of advice to new institutions onboarding supercomputing would be: Define the HPC needs for the field-specific research in a broader picture. Try to establish a workshop where people who have used supercomputing are invited, and build on their experience,” says Elena Papaleo.

Demonstrators were key for building new infrastructure

Different obstacles were encountered when using HPC, depending on the aim of the research project. Some projects conducted in collaboration with clinicians at hospitals required the HPC facility to be GDPR compliant. In other projects, it was important to ensure enough storage capacity and computation time, as large amounts of data were generated, e.g., when modelling protein structures and their dynamics. Image analysis can create terabyte-scale data, which requires expanding local storage capacity and also calls for external storage facilities.

“We needed to show what resources were needed through demonstrators. For example, if we had one year of access to Tier-1 or Tier-0 via PRACE, we could demonstrate what we could do and deliver using large-scale HPC. Then top management could decide how to invest in order to make the needed infrastructure sustainable. The demonstrator part was a 2-3 year long process. Eventually, HPC support was implemented on the core budget for the entire institute. Today, each project allocates part of its budget to HPC and storage,” says Elena Papaleo.

Bioinformatics taskforce: Cross-fertilizing the use of HPC

Group logo from Cancer Structural Biology at DCRC
Photo: DCRC

Two years ago, a new initiative was launched: “the bioinformatics taskforce at DCRC”. This taskforce brings together people doing computational work across the entire institute. They plan training sessions in supercomputing and discuss infrastructure needs such as data storage, continuous access to data, and access to parallelized computing in order to reduce the cost and time of computation.

“The taskforce is a result of the development over the past 10 years. Today around 2-3 workshops are organized every year, and around 20 people are involved across the 22 groups at DCRC. The taskforce has cross-fertilized HPC competences across internal research groups as a service, and DCRC has moved to a multidisciplinary approach for looking at core questions within cancer research,” says Elena Papaleo.

What has DCRC gained or learned as an institution?

Not only did the Cancer Structural Biology group at DCRC get access to the compute power it needed; the group also got storage and backup for its increasing volume of data (e.g. images) and strengthened its position as an interesting research partner.

“It requires collaborative efforts, tackling the same scientific problem from different points of view, and it would not have been possible without running many calculations in parallel. DCRC has been able to explore cancer science on a new level using HPC,” says Elena Papaleo.

The Danish Cancer Society Research Center also appreciates her work.

“Elena Papaleo and her group are trailblazers in a very important field where science as well as requirements develop fast. Their work in this area is key for our competitiveness in science. The Bioinformatics Taskforce was formed to meet variable needs in analyzing biological data, to share experiences and to develop together. As such it provides an important addition to our research center’s learning environment,” says Mef Christina Nilbert, Research Director at DCRC.

An HPC sandbox is available through DeiC

If you have ideas for a pilot project where HPC could be used, remember the HPC sandbox available through DeiC. Applications can be sent to: hpc-sandbox@listserv.deic.dk.

Scientific output behind this story

DeiC continuously monitors who uses the national and international HPC resources, and this article is based on the online publication list, which holds more than 1300 publications. A total of 255 publications included the use of international HPC, and DCRC researchers co-authored 17 of these (see Table 1).

Table 1. A total of 17 publications included the use of international HPC from 2016 to 2021 from Danish Cancer Society Research Center (DCRC) based on information from HPC-Publications. All publications below also included the use of local HPC at DCRC (3 GPU based clusters).

Publication | Tier-1 | Tier-0 | DeiC Pilot Grant
Florentsen, C. D. et al. (2021) Annexin A4 trimers are recruited by high membrane curvatures in giant plasma membrane vesicles. Soft Matter. DOI: 10.1039/d0sm00241k | Yes | No | No
Papaleo, E. (2021) Investigating Conformational Dynamics and Allostery in the p53 DNA-Binding Domain Using Molecular Simulations. Methods in Molecular Biology. DOI: 10.1007/978-1-0716-1154-8_13 | Yes | No | No
Sora, V. et al. (2021) Bcl-xL Dynamics under the Lens of Protein Structure Networks. Journal of Physical Chemistry B. DOI: 10.1021/acs.jpcb.0c11562 | Yes | No | No
Faienza, F. et al. (2020) S-nitrosylation affects TRAP1 structure and ATPase activity and modulates cell response to apoptotic stimuli. Biochemical Pharmacology. DOI: 10.1016/j.bcp.2020.113869 | Yes | No | No
Fas, B. A. et al. (2020) The conformational and mutational landscape of the ubiquitin-like marker for autophagosome formation in cancer. Autophagy. DOI: 10.1080/15548627.2020.1847443 | Yes | No | Yes
Kumar, M. et al. (2020) A pan-cancer assessment of alterations of the kinase domain of ULK1, an upstream regulator of autophagy. Scientific Reports. DOI: 10.1038/s41598-020-71527-4 | Yes | No | No
Brix, D. M. et al. (2019) Release of transcriptional repression via ErbB2-induced, SUMO-directed phosphorylation of myeloid zinc finger-1 serine 27 activates lysosome redistribution and invasion. Oncogene. DOI: 10.1038/s41388-018-0653-x | Yes | No | No
Holdgaard, S. G. et al. (2019) Selective autophagy maintains centrosome integrity and accurate mitosis by turnover of centriolar satellites. Nature Communications. DOI: 10.1038/s41467-019-12094-9 | Yes | Yes | No
Konig, S. M. et al. (2019) Alterations of the interactome of Bcl-2 proteins in breast cancer at the transcriptional, mutational and structural level. PLOS Computational Biology. DOI: 10.1371/journal.pcbi.1007485 | Yes | No | Yes
Lambrughi, M. et al. (2019) Analyzing Biomolecular Ensembles. Biomolecular Simulations: Methods and Protocols. DOI: 10.1007/978-1-4939-9608-7_18 | Yes | No | Yes
Bignon, E. et al. (2018) Computational Structural Biology of S-nitrosylation of Cancer Targets. Frontiers in Oncology. DOI: 10.3389/fonc.2018.00272 | Yes | No | No
Di Rita, A. et al. (2018) HUWE1 E3 ligase promotes PINK1/PARKIN-independent mitophagy by regulating AMBRA1 activation via IKK alpha. Nature Communications. DOI: 10.1038/s41467-018-05722-3 | Yes | No | Yes
Marinelli, P. et al. (2018) A single cysteine post-translational oxidation suffices to compromise globular proteins kinetic stability and promote amyloid formation. Redox Biology. DOI: 10.1016/j.redox.2017.10.022 | No | Yes | No
Papaleo, E. et al. (2018) Molecular dynamics ensemble refinement of the heterogeneous native state of NCBD using chemical shifts and NOEs. PeerJ. DOI: 10.7717/peerj.5125 | Yes | No | Yes
Lambrughi, M. et al. (2016) DNA-binding protects p53 from interactions with cofactors involved in transcription-independent functions. Nucleic Acids Research. DOI: 10.1093/nar/gkw770 | Yes | No | No
Oskarsson, K. R. et al. (2016) A single mutation Gln142Lys doubles the catalytic activity of VPR, a cold adapted subtilisin-like serine proteinase. Biochimica et Biophysica Acta-Proteins and Proteomics. DOI: 10.1016/j.bbapap.2016.07.003 | Yes | No | No
Papaleo, E. et al. (2016) The Role of Protein Loops and Linkers in Conformational Dynamics and Allostery. Chemical Reviews. DOI: 10.1021/acs.chemrev.5b00623 | Yes | No | No
