Lessons learned from urgent computing in Europe: Tackling the COVID-19 pandemic
Abstract
PRACE (Partnership for Advanced Computing in Europe), an international not-for-profit association that brings together the five largest European supercomputing centers and involves 26 European countries, has allocated more than half a billion core hours to computer simulations to fight the COVID-19 pandemic. Alongside experiments, these simulations are a pillar of research to assess the risks of different scenarios and investigate mitigation strategies. While the world deals with the subsequent waves of the pandemic, we present a reflection on the use of urgent supercomputing for global societal challenges and crisis management.
In this perspective we present an analysis of various scientific efforts undertaken in Europe to combat the COVID-19 pandemic based on advanced and computationally intensive numerical simulations that require the use of large-scale computers. The five largest European supercomputing centers supported some of these efforts with free-of-charge computational resources, through the Partnership for Advanced Computing in Europe (PRACE).
This paper presents a number of computational models and methods relevant to the fight against the COVID-19 pandemic, including biostructural studies, docking and screening for pharmaceutical applications, computational fluid dynamics to study droplet spread and transmission of the virus, organ modeling to support diagnosis and treatment of the disease, and epidemiology. We discuss the use of artificial intelligence (AI) in these areas and the readiness of the codes to run on high-performance computers.
The COVID-19 pandemic began between late 2019 and early 2020 in the city of Wuhan, Hubei, China. Identification of the virus occurred relatively quickly. The virus's genetic profile and the crystallographic blueprint of its spike proteins were published as open science to the community within weeks, providing the crucial structural information needed to develop biomodels that ultimately led to messenger RNA vaccine formulation (1). At the same time, scientific analysis of the disease began rapidly at different levels:
At the molecular level, researchers started running molecular simulations of these structures immediately after the basic structural data became available.
At the global level, epidemiological models were tested and refined, the spread of the virus was modeled, and research groups widely shared these data.
Recognizing the fundamental importance of computer-based approaches to these urgent problems, the European supercomputing centers established urgent computer access through PRACE, putting into practice for the first time a long-planned idea of a fast-track program for crisis situations. A small scientific committee, which leveraged expertise from the existing PRACE Scientific Steering Committee and PRACE Access Committee, began work on 21 March 2020. The first requests for computing time arrived only a couple of days later. The application system was simple and swift to attract the largest possible number of projects. Fig. 1 shows a schematic depiction of the process. The main requirements for an award were a convincing argument that the proposed research would help with the COVID-19 crisis and a commitment to making the resulting scientific knowledge available to the wider scientific community (thanks to the resources of Fenix ICEI, BioExcel, and EMBL-EBI). On the side of the scientific committee, PRACE's goal was to implement a rigorous and rapid peer-review process.
The scientific committee originally highlighted five research areas as targets for the urgent computing Fast Track Call:
Biomolecular research to understand the mechanisms of viral infection
Bioinformatics to understand mutations and evolution
Biosimulations to develop therapeutics and/or vaccines
Epidemiological analyses to understand and predict the spread of disease
A final broad category to understand and mitigate the impact of the pandemic
The scientific committee welcomed any academic or industrial project that included a commitment to open science and open data.
To accelerate collaborative progress based on the outcome of the Fast Track projects, PRACE and its American counterpart, XSEDE (The Extreme Science and Engineering Discovery Environment), jointly launched an international COVID-19 High Performance Computing (HPC) Knowledge Exchange, a fortnightly meeting to foster communication, collaboration, and sharing of code and data. The webinars proved very successful with an average of 50 participants and covered topics such as virtual screening, drug design, biomolecular simulations, bioinformatics, epidemiological studies, and the spread of virus via droplets.
Fast Track Call Project Assessment
A project in a biannual PRACE Project Access Call needs the positive scientific evaluation of the PRACE Access Committee, plus the complementary technical analysis of the computing centers. Minimum requirements for PRACE Project Access are at least 35 million core hours running on about 1,024 cores concurrently, with good scalability of code, typically 2 GB of memory per core, and from 9 to 100 TB of disk space (small variations occur for the different centers).
On the other hand, in the Fast Track Call, science was the main driver and so the scientific committee adapted the requirements regarding the scalability of the code on a case-by-case basis. They customized the requirements regarding the size of the allocation not only by balancing between the large centers but also by bringing in several national institutes.
For PRACE Project Access Calls, the standard response time from the PRACE Access Committee is 5 mo. In stark contrast to this, a much smaller scientific committee drove the COVID-19 Fast Track Call, which mobilized human resources in a variety of settings: Reviewers completed their tasks in a few days, PRACE staff managed the administrative process with high priority, and the supercomputer center staff that provided computing time and storage dealt with allocations in about a week on average.
That the Fast Track Call received more than 50 proposals in just 2 mo documents the determination of the scientific community to help. Over a 4-mo period, PRACE received 80 applications from multiple fields, ranging from the most predictive docking and screening runs for pharmaceutical applications to fluid dynamics studies of droplet dispersion at the physical level to epidemiology on a global scale (the societal level). For comparison, the regular 19th and 20th PRACE calls for proposals for Project Access received 57 and 59 proposals, respectively, from a wide range of scientific disciplines. Fig. 2 provides an overview of the number of proposals (accepted fraction in darker color, rejected fraction in lighter color, and the total number of proposals received inscribed in the circle), the domains, and readiness for high-performance computing. A statistical breakdown of the proposals (country of the principal investigator [PI], sex ratio of the PI, and the number of reviewers per project, including the number of projects in each category and career stage of the PI) is shown in Fig. 3.
The Fast Track Call awarded half a billion computer hours. Successful projects used those resources over a 6-mo period. A typical biannual PRACE Project Access Call awards 2.5 billion computer hours to projects that run for 12 mo. Thus, measured as a rate of use (half a billion hours over 6 mo versus 2.5 billion hours over 12 mo), the Fast Track Call is equivalent to about 40% of a single PRACE Project Access Call.
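The 40% figure is easiest to reproduce by comparing hours awarded per month rather than total hours. The following is a minimal sketch of that arithmetic, using only the figures quoted in the text; the variable and function names are illustrative and not part of any PRACE tooling.

```python
# Figures quoted in the text; all names here are illustrative assumptions.
FAST_TRACK_HOURS = 0.5e9     # hours awarded by the Fast Track Call
FAST_TRACK_MONTHS = 6        # period over which those hours were used
REGULAR_CALL_HOURS = 2.5e9   # hours awarded by a typical Project Access Call
REGULAR_CALL_MONTHS = 12     # run time of Project Access projects


def hours_per_month(total_hours: float, months: int) -> float:
    """Average allocation rate in hours per month."""
    return total_hours / months


# Comparison of raw totals, ignoring the different durations.
total_ratio = FAST_TRACK_HOURS / REGULAR_CALL_HOURS

# Comparison of monthly rates, which accounts for the shorter Fast Track period.
rate_ratio = (
    hours_per_month(FAST_TRACK_HOURS, FAST_TRACK_MONTHS)
    / hours_per_month(REGULAR_CALL_HOURS, REGULAR_CALL_MONTHS)
)

print(f"Share of total hours:  {total_ratio:.0%}")
print(f"Share of monthly rate: {rate_ratio:.0%}")
```

On raw totals the Fast Track Call amounts to one fifth of a regular call; the roughly 40% equivalence appears only once the 6-mo and 12-mo durations are taken into account.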