Lessons learned from urgent computing in Europe: Tackling the COVID-19 pandemic
Abstract
PRACE (Partnership for Advanced Computing in Europe), an international not-for-profit association that brings together the five largest European supercomputing centers and involves 26 European countries, has allocated more than half a billion core hours to computer simulations to fight the COVID-19 pandemic. Alongside experiments, these simulations are a pillar of research to assess the risks of different scenarios and investigate mitigation strategies. While the world deals with the subsequent waves of the pandemic, we present a reflection on the use of urgent supercomputing for global societal challenges and crisis management.
In this perspective, we analyze various scientific efforts undertaken in Europe to combat the COVID-19 pandemic based on advanced, computationally intensive numerical simulations that require large-scale computers. The five largest European supercomputing centers supported some of these efforts with free-of-charge computational resources through the Partnership for Advanced Computing in Europe (PRACE).
This paper presents a number of computational models and methods relevant to the fight against the COVID-19 pandemic, including biostructural studies, docking and screening for pharmaceutical applications, computational fluid dynamics to study droplet spread and transmission of the virus, organ modeling to support diagnosis and treatment of the disease, and epidemiology. We discuss the use of artificial intelligence (AI) in these areas and the readiness of the codes to run on high-performance computers.
The COVID-19 pandemic began between late 2019 and early 2020 in the city of Wuhan, Hubei, China. Identification of the virus occurred relatively quickly. The virus's genetic profile and the crystallographic blueprint of its spike proteins were published as open science to the community within weeks, providing the crucial structural information needed to develop biomodels that ultimately led to messenger RNA vaccine formulation (1). At the same time, scientific analysis of the disease began rapidly at different levels:
- At the molecular level, researchers started running molecular simulations of these structures immediately after the basic structural data became available.
- At the global level, epidemiological models were tested and refined, the spread of the virus was modeled, and research groups widely shared these data.
Recognizing the fundamental importance of computer-based approaches to these urgent problems, the European supercomputing centers established urgent computing access through PRACE, putting into practice for the first time a long-planned idea of a fast-track program for crisis situations. A small scientific committee, which leveraged expertise from the existing PRACE Scientific Steering Committee and PRACE Access Committee, began work on 21 March 2020. The first requests for computing time arrived only a couple of days later. The application system was kept simple and swift in order to attract the largest possible number of projects. Fig. 1 shows a schematic depiction of the process. To receive an award, applicants had to meet two main requirements: a convincing argument that the proposed research would help with the COVID-19 crisis, and a commitment to making the resulting scientific knowledge available to the wider scientific community (thanks to the resources of Fenix ICEI, BioExcel, and EMBL-EBI). On the side of the scientific committee, PRACE's goal was to implement a rigorous and rapid peer-review process.
Fig. 1. Scheme of the workflow for applications to the PRACE Fast Track Call for Proposals to Mitigate the Impact of the COVID-19 Pandemic.
The scientific committee originally highlighted five research areas as targets for the urgent computing Fast Track Call:
- Biomolecular research to understand the mechanisms of viral infection
- Bioinformatics to understand mutations and evolution
- Biosimulations to develop therapeutics and/or vaccines
- Epidemiological analyses to understand and predict the spread of disease
- A final broad category to understand and mitigate the impact of the pandemic
The scientific committee welcomed any academic or industrial project that included a commitment to open science and open data.
To accelerate collaborative progress based on the outcome of the Fast Track projects, PRACE and its American counterpart, XSEDE (The Extreme Science and Engineering Discovery Environment), jointly launched an international COVID-19 High Performance Computing (HPC) Knowledge Exchange, a fortnightly meeting to foster communication, collaboration, and sharing of code and data. The webinars proved very successful, with an average of 50 participants, and covered topics such as virtual screening, drug design, biomolecular simulations, bioinformatics, epidemiological studies, and the spread of the virus via droplets.
Fast Track Call Project Assessment
A project in a biannual PRACE Project Access Call needs a positive scientific evaluation from the PRACE Access Committee, plus a complementary technical analysis by the computing centers. The minimum requirements for PRACE Project Access are at least 35 million core hours, running on about 1,024 cores concurrently with good scalability of the code, memory usage of typically 2 GB per core, and from 9 to 100 TB of disk space (small variations occur between the different centers).
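To give a feel for the scale of these minimum requirements, the short sketch below turns them into aggregate figures. It is only an illustration under stated assumptions: the memory figure is read as 2 GB per core, the allocation is assumed to be consumed over a 12-mo period of continuous use (the duration quoted for Project Access projects later in the text), and all function and constant names are hypothetical.

```python
# Back-of-the-envelope view of the Project Access minimum requirements.
# Assumptions (not stated as such in the article): memory is 2 GB per core,
# and the allocation is spent over 12 months of uninterrupted running.

MIN_CORE_HOURS = 35e6        # minimum allocation, in core hours
MIN_CORES = 1024             # approximate minimum concurrent cores
MEM_PER_CORE_GB = 2          # typical memory usage per core
HOURS_PER_YEAR = 365 * 24    # wall-clock hours in a 12-mo allocation


def aggregate_memory_gb(cores: int, mem_per_core_gb: float) -> float:
    """Total memory used when every core holds its typical share."""
    return cores * mem_per_core_gb


def wall_hours_at(core_hours: float, cores: int) -> float:
    """Wall-clock hours needed to spend an allocation on a fixed core count."""
    return core_hours / cores


def average_cores(core_hours: float, wall_hours: float) -> float:
    """Average number of busy cores needed to spend an allocation in time."""
    return core_hours / wall_hours


if __name__ == "__main__":
    mem = aggregate_memory_gb(MIN_CORES, MEM_PER_CORE_GB)
    hours = wall_hours_at(MIN_CORE_HOURS, MIN_CORES)
    cores = average_cores(MIN_CORE_HOURS, HOURS_PER_YEAR)
    print(f"Aggregate memory at {MIN_CORES} cores: {mem:.0f} GB")
    print(f"Wall-clock time on {MIN_CORES} cores only: {hours / HOURS_PER_YEAR:.1f} years")
    print(f"Average cores needed to finish in 12 mo: {cores:.0f}")
```

Under these assumptions, a job running at the 1,024-core minimum alone could not spend the minimum allocation within a year, which suggests that the core count is best read as a lower bound on scalability rather than as a typical job size.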
On the other hand, in the Fast Track Call, science was the main driver and so the scientific committee adapted the requirements regarding the scalability of the code on a case-by-case basis. They customized the requirements regarding the size of the allocation not only by balancing between the large centers but also by bringing in several national institutes.
For PRACE Project Access Calls, the standard response time from the PRACE Access Committee is 5 mo. In stark contrast to this, a much smaller scientific committee drove the COVID-19 Fast Track Call, which mobilized human resources in a variety of settings: Reviewers completed their tasks in a few days, PRACE staff managed the administrative process with high priority, and the supercomputer center staff that provided computing time and storage dealt with allocations in about a week on average.
That the Fast Track Call received more than 50 proposals in just 2 mo documents the determination of the scientific community to help. Over a 4-mo period, PRACE received 80 applications from multiple fields, ranging from the most predictive docking and screening runs for pharmaceutical applications to fluid dynamics studies of droplet dispersion at the physical level to epidemiology on a global scale (the societal level). For comparison, the regular 19th and 20th PRACE calls for proposals for Project Access received 57 and 59 proposals, respectively, from a wide range of scientific disciplines. Fig. 2 provides an overview of the number of proposals (accepted fraction in darker color, rejected fraction in lighter color, with the total number of proposals received inscribed in the circle), the domains, and the readiness for high-performance computing. A statistical breakdown of the proposals (country of the principal investigator [PI], sex ratio of the PIs, career stage of the PI, and number of reviewers per project, including the number of projects in each category) is shown in Fig. 3.
Fig. 2. Overview of 78 of the 80 projects submitted to the PRACE COVID-19 Fast Track Call. Two projects have been omitted as they are beyond the scope. The area of each circle is proportional to the total number of applications received in a given area, as inscribed in the center of the circle. The proportion of projects accepted is highlighted in darker color. The x axis represents the readiness for high-performance computing, while the y axis represents the scale of the objects targeted by each model. The scientific approach is indicated next to each circle: biostructural studies, docking and screening, fluid dynamics at the level of an organ or individual (airborne) transmission, and epidemiology at the global level.
Fig. 3. A statistical overview of the proposals submitted to the Fast Track Call. The map shows the distribution of proposals per country, with an inset of the number of submissions per week since the call was opened. A large number of European countries participated in the Fast Track Call. The right part of the figure shows the distribution of proposals by sex of the PI, the distribution of proposals by career stage, and the number of reviewers per proposal. The reviewer subfigure shows both the number of reviewers, from zero to three (outside the circle), and the number of projects in each case (within the circle). In classifying the career stages, we have used the categories normally used by the European Research Council (junior: up to 7 y after PhD; mid-career: 7 to 12 y after PhD; senior: more than 12 y after PhD).
The Fast Track Call awarded half a billion core hours, which successful projects used over a 6-mo period. A typical biannual PRACE Project Access Call awards 2.5 billion core hours to projects that run for 12 mo. In absolute terms, the Fast Track Call therefore amounts to 20% of a single PRACE Project Access Call; normalized by duration (0.5 billion core hours over 6 mo versus 2.5 billion over 12 mo), it corresponds to about 40% of the rate at which a single Project Access Call delivers computing time.
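The 40% figure follows only when both allocations are normalized by their duration. The sketch below reproduces that arithmetic from the numbers quoted in the text; the function and constant names are hypothetical and chosen for illustration only.

```python
# Compare the Fast Track Call with a regular Project Access Call,
# using only the figures quoted in the text.

FAST_TRACK_CORE_HOURS = 0.5e9      # half a billion core hours
FAST_TRACK_MONTHS = 6              # period over which they were used
PROJECT_ACCESS_CORE_HOURS = 2.5e9  # typical award of one Project Access Call
PROJECT_ACCESS_MONTHS = 12         # duration of Project Access projects


def monthly_rate(core_hours: float, months: float) -> float:
    """Core hours delivered per month of allocation."""
    return core_hours / months


def absolute_share() -> float:
    """Fast Track award as a fraction of one Project Access award."""
    return FAST_TRACK_CORE_HOURS / PROJECT_ACCESS_CORE_HOURS


def rate_share() -> float:
    """Fast Track delivery rate as a fraction of the Project Access rate."""
    fast = monthly_rate(FAST_TRACK_CORE_HOURS, FAST_TRACK_MONTHS)
    regular = monthly_rate(PROJECT_ACCESS_CORE_HOURS, PROJECT_ACCESS_MONTHS)
    return fast / regular


if __name__ == "__main__":
    print(f"Share of total core hours: {absolute_share():.0%}")
    print(f"Share of monthly delivery rate: {rate_share():.0%}")
```

The two ratios differ by a factor of two because the Fast Track allocation was consumed in half the time of a regular Project Access allocation.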