van der Schaar Lab

Policy Impact Predictor for COVID-19

Answering the “What if?” questions of COVID-19

Policy Impact Predictor (PIP) is a machine learning tool developed to guide government decision-making around measures to prevent the spread of COVID-19.

In addition to accurately modeling COVID-19 mortality trends for 170 countries under their current policy sets, PIP can adaptively tailor forecasts to show the potential impact of specific policy changes, such as reopening schools or workplaces, allowing international travel, or relaxing stay-at-home requirements.

Unlike other spread models, PIP is able to tackle “What if?” policy questions looking into the future and the past. For example, PIP can estimate what would have happened if Italy’s government had waited a week before imposing lockdown measures, or predict what would happen if India were to loosen all existing spread prevention policies.

PIP was introduced in a paper authored June 4, 2020. We plan to publish PIP as an interactive projection tool over the coming days, and will continue to provide updates and add new functions on an ongoing basis.

Why the world needs PIP

The COVID-19 pandemic has led many countries to impose emergency lockdown measures that are unheard of in their scale and severity. These policies have been implemented to save lives, but cannot be maintained permanently and are not without economic and social costs.

As world leaders consider when and how to lift these measures in their respective countries, they have been presented with a number of policy levers. Going forward, they will need to decide how many of these levers to pull, when to pull them, and how far. These choices are exceptionally challenging in the absence of informed guidance regarding likely outcomes of any specific policy combination.

PIP offers a unique solution to this problem: it allows users to generate custom projections based on policy combinations of their choosing. These projections are based on learnings from policy decisions made around the world, which are then applied on a local basis while factoring in the characteristics of each individual country.

We believe that PIP could be used by governments to make the best possible decisions in a difficult situation. We hope that our model exemplifies the importance of machine learning-based decision-making for public health in the post-coronavirus world.

Not just another spread model

The academic community has produced a multitude of models for forecasting COVID-19 fatalities. These have been used by public health organizations, such as the WHO and the CDC, to advise local authorities on what policy directions to follow.

One key shortcoming of all these models is that they cannot answer “What if?” questions by predicting the impact of lockdown policies that have not yet been implemented. Such counterfactual analysis is crucial for conducting scenario analyses and policy planning.

This is, in part, because existing models have been based primarily on epidemiological and statistical approaches, with machine learning techniques being used merely for parameter optimization—despite the potential of machine learning as an analytical and predictive tool.

Unlike other models, PIP (1) forecasts mortality across all countries affected by the pandemic, (2) incorporates individual country features to learn how disease dynamics and policy effects vary within and between countries, and (3) enables evaluation of future projections under alternative policy scenarios.

PIP in action

PIP will soon be made available as an interactive projection tool on this page. Visitors will be able to select countries, input policy combinations of their choice, and view projections. We plan to add features and maintain the tool as needed over the coming months.

In the meantime, an example of a prediction made by PIP for the United Kingdom is given below for reference.

On the left-hand side, the model predicts what would have happened between late March and May 8, had the UK’s lockdown been implemented 1 week earlier (blue curve) or 1 week later (red curve). Our model predicts that 13,827 lives would have been saved with an earlier lockdown, and 22,405 more deaths would have occurred under a later one.

On the right-hand side, the model forecasts that under the UK government’s current re-opening plan (black curve), daily deaths would stabilize at around 200. Maintaining the current lockdown (blue curve) would lead daily deaths to fall under 100 in August, which would save 6,215 more lives between now and the end of August, compared to the current re-opening plan.

On the other hand, a complete relaxation of measures in June (red curve) would result in a second peak in August-September, although there is substantial uncertainty about the volume of the second peak.

Current predictions for the United Kingdom

The charts below use current data to model three scenarios for COVID-19 deaths per day in the United Kingdom. The three scenarios are:

Maintain current policies: freeze and maintain all mobility and spread prevention policies unchanged from June 1, 2020 onward.
Government plan: gradual relaxation of certain policies as outlined under the UK government’s COVID-19 recovery strategy.
End lockdown in August: follow the government plan until August, and then remove all lockdown measures entirely.

PIP predicts that, if existing social distancing and other policies were to remain in place unchanged from June 1, 2020, the number of deaths per day would decrease gradually to approximately 200 by the end of August.

Cumulative COVID-19 deaths from the start of the pandemic through August 31 are estimated at 59,693 in this scenario (high estimate: 71,894, low estimate: 47,549).

PIP predicts that the lockdown policy relaxation plan proposed by the government would cause the daily number of deaths to stabilize briefly, before rising to roughly 400 by the end of August.

Cumulative COVID-19 deaths from the start of the pandemic through August 31 are estimated at 62,848 in this scenario (high estimate: 74,736, low estimate: 50,732).

PIP predicts that the complete removal of all existing measures in August would result in daily deaths increasing dramatically to roughly 750 by the end of August.

Cumulative COVID-19 deaths from the start of the pandemic through August 31 are estimated at 65,657 in this scenario (high estimate: 77,375, low estimate: 53,924).
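
The three central estimates quoted above can be put side by side with a few lines of Python. This is only a convenience sketch: the figures are copied from the text, while the dictionary layout and the function name `excess_over_baseline` are our own illustrative choices and are not part of PIP.

```python
# Cumulative UK death estimates through August 31, as quoted in the text.
scenarios = {
    "Maintain current policies": {"central": 59_693, "low": 47_549, "high": 71_894},
    "Government plan": {"central": 62_848, "low": 50_732, "high": 74_736},
    "End lockdown in August": {"central": 65_657, "low": 53_924, "high": 77_375},
}


def excess_over_baseline(scenarios, baseline="Maintain current policies"):
    """Difference in central estimates relative to the baseline scenario."""
    base = scenarios[baseline]["central"]
    return {name: est["central"] - base for name, est in scenarios.items()}


excess = excess_over_baseline(scenarios)
for name, extra in excess.items():
    print(f"{name}: {extra:+,} deaths relative to maintaining current policies")
```

Note that the low and high estimates of the three scenarios overlap substantially, so differences between central estimates should be read with that uncertainty in mind.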

PIP specifications

Advanced machine learning model

Compartmental Gaussian Process
Accurate forecasting
Counterfactual analysis

Our model utilizes a two-layer Gaussian process (GP) prior.

The lower layer uses a compartmental SEIR (Susceptible, Exposed, Infected, Recovered) model as a prior mean function with “country-and-policy-specific” parameters that capture fatality curves under “counterfactual” policies within each country, whereas the upper layer is shared across all countries, and learns lower-layer SEIR parameters as a function of a country’s features and its policy indicators.

Our model combines the solid mechanistic foundations of SEIR models (Bayesian priors) with the flexible data-driven modeling and gradient-based optimization routines of machine learning (Bayesian posteriors): the entire model is trained end-to-end via stochastic variational inference.

Illustration of compartmental Gaussian processes.
(a) The upper-layer process maps country features and lockdown policies to a predicted R0. Here we depict a simplified binary policy indicator (lockdown or no lockdown).
(b) The lower-layer process maps time to number of COVID-19 fatalities. The mean function is an SEIR model modulated by the upper-layer GP. Projections are obtained using the process posteriors.
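
To make the role of the SEIR mean function more concrete, here is a minimal sketch of a plain SEIR simulation in which R0 changes on the day a lockdown starts. It is not PIP's implementation: there is no Gaussian process, no fatality compartment, and no fitting to data. The function names (`simulate_seir`, `lockdown_from`) and every parameter value (population size, incubation and infectious periods, the R0 values before and after lockdown) are illustrative assumptions.

```python
def simulate_seir(r0_on_day, days, population=1_000_000, initial_infectious=10,
                  incubation_days=5.0, infectious_days=7.0):
    """Forward-Euler SEIR simulation with a one-day step.

    r0_on_day: function mapping a day index to the reproduction number in force.
    Returns a list of (S, E, I, R) tuples, one per day, including day 0.
    All default values are illustrative assumptions, not PIP's parameters.
    """
    sigma = 1.0 / incubation_days   # rate of moving from Exposed to Infected
    gamma = 1.0 / infectious_days   # rate of moving from Infected to Recovered
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    history = [(s, e, i, r)]
    for day in range(days):
        beta = r0_on_day(day) * gamma          # transmission rate implied by R0
        new_exposed = beta * s * i / population
        new_infectious = sigma * e
        new_removed = gamma * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        history.append((s, e, i, r))
    return history


def lockdown_from(start_day, r0_before=2.5, r0_after=0.9):
    """Binary policy indicator: R0 drops from r0_before to r0_after on start_day."""
    def r0_on_day(day):
        return r0_before if day < start_day else r0_after
    return r0_on_day


# Counterfactual timing: lockdown one week earlier, as assumed, or one week later.
ever_infected = {}
for start_day in (23, 30, 37):
    s, e, i, r = simulate_seir(lockdown_from(start_day), days=120)[-1]
    ever_infected[start_day] = e + i + r
    print(f"Lockdown on day {start_day}: {ever_infected[start_day]:,.0f} ever infected")
```

In PIP, by contrast, the SEIR parameters are not fixed by hand but are produced by the upper-layer process from country features and policy indicators, and the lower-layer process is fitted to observed fatalities.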

We have compared the forecasting accuracy of our model with the CDC-listed baselines that offer projections for countries other than the US (IHME, Imperial and YYG). We evaluated the performance of all baselines in 10 countries that were significantly affected by the pandemic in Europe, Asia, Africa, and the Americas (note that ours is the only model that covers all countries and spans all continents).

Since these countries were at different stages of the pandemic at any given time, we evaluated the 7-day and 14-day forecasts on April 25, 2020, when all countries had a significant number of infections. As we can see in the table below, our model outperforms the baselines in almost all countries on both forecasting horizons.

Table showing accuracy of cumulative mortality predictions by different baselines in various countries around the world; most accurate model in bold and underlined.

Central to the accuracy of our model is its hierarchical GP structure, which shares data across countries based on their “feature similarity”, enabling accurate predictions even when little data is available for individual countries.

Our model learns not only the fatality curves within each country, but also the country-specific effects of lockdown. Since the same policy would naturally yield different effects in different countries, this is a major source of performance gain for our model compared to others, which assume fixed policy effects.

Because different countries with comparable features apply different policies, the upper-layer Gaussian process function will learn counterfactual mortality curves for each country under alternative policies that have been tried in other countries. For instance, we can learn the fatality curves for Scandinavian countries (such as Norway and Denmark) under a hypothetically less restrictive lockdown policy using data from Sweden, which adopted less stringent policy measures.

Similarly, the graph in the preceding section displays predictions and counterfactual inferences for a variety of policy scenarios in the UK.
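
The idea of borrowing information across countries with similar features can be illustrated with a very small Gaussian process regression. The sketch below is a deliberately simplified, single-layer stand-in for the upper layer: it uses exact GP regression with a squared-exponential kernel rather than the stochastic variational inference used to train PIP, and every number in it (the feature values, the R0 values, the prior mean, the noise level) is hypothetical. The helper names `rbf`, `solve` and `gp_posterior_mean` are ours.

```python
import math


def rbf(x, y, length_scale=1.0):
    """Squared-exponential kernel: near 1 for similar inputs, near 0 for distant ones."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(m[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def gp_posterior_mean(train_x, train_y, query, prior_mean, noise=0.05):
    """Posterior mean of a GP with a constant prior mean and an RBF kernel."""
    k = [[rbf(p, q) + (noise if i == j else 0.0) for j, q in enumerate(train_x)]
         for i, p in enumerate(train_x)]
    alpha = solve(k, [y - prior_mean for y in train_y])
    return prior_mean + sum(rbf(query, p) * w for p, w in zip(train_x, alpha))


# Hypothetical inputs: (standardized country feature, lockdown indicator) -> R0.
train_x = [(0.0, 1.0),   # country A, lockdown in place
           (0.1, 0.0),   # country B, similar to A, no lockdown
           (3.0, 1.0)]   # country C, very different from A, lockdown in place
train_y = [0.8, 1.6, 1.1]

# Counterfactual query: country A's features under the policy it never tried.
predicted_r0 = gp_posterior_mean(train_x, train_y, (0.0, 0.0), prior_mean=1.2)
print(f"Predicted R0 for country A without lockdown: {predicted_r0:.2f}")
```

Because country B is close to country A in feature space and did not lock down, its observation dominates the counterfactual prediction for A, while the dissimilar country C contributes almost nothing. This mirrors, in miniature, the Sweden example given above.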

High-quality datasets

DELVE Global COVID-19 dataset
Demographic and socio-economic data
COVID-19 policy data
COVID-19 mortality data

This openly-licensed dataset from the DELVE Initiative consolidates country-level data on non-pharmaceutical interventions, cases, deaths, tests, excess mortality, mobility statistics, weather patterns and other metadata from multiple sources into a single analysis-friendly format.

The DELVE dataset tracks:
– Non-Pharmaceutical Interventions (NPIs)
– COVID-19 cases, deaths and tests
– Excess mortality
– Mobility data
– Weather patterns
– Population-related metadata per country (based on World Bank data)

Published World Bank reports were used to extract a set of 35 economic, social, demographic, environmental and public health indicators for each country.

Economic Indicators

GDP per capita, GNI per capita, Income share held by lowest 20%

Social and Demographic Indicators

Population, Life expectancy, Birth rate, Death rate, Infant mortality rate, Land Area, % People with basic hand-washing facilities including soap and water, Smoking prevalence, Prevalence of undernourishment, Prevalence of overweight, Urban population, Population density, Population ages 65 and above, Access to electricity (% of population), UHC service coverage index, Total alcohol consumption per capita, Air transport (passengers carried)

Environmental Indicators

Forest Area, PM2.5 air pollution (mean annual exposure in micrograms per cubic meter)

Public Health Indicators

Immunization for measles, % deaths by communicable diseases, Current health expenditure, Current health expenditure per capita, Diabetes prevalence, Immunization for DPT, Immunization for HepB3, Incidence of HIV, Incidence of malaria, Incidence of tuberculosis, % deaths by CVD/cancer/diabetes/CRD, % deaths due to household and ambient air pollution, % deaths due to unsafe water/unsafe sanitation/lack of hygiene, Physicians (per 1,000 people)

The Oxford COVID-19 Government Response Tracker (OxCGRT)—curated by the Blavatnik School of Government at Oxford University—was used to extract the following 9 policy indicators for each country over time.

– School closure
– Stay-at-home requirements
– Restrictions on gathering size
– Workplace closure
– Restrictions on domestic or internal movement
– Public transport closures
– Cancellation of public events
– Restrictions on international travel
– Public information campaign

A single Stringency Index is constructed by aggregating the 9 containment and closure indicators listed above. The index reports a number between 0 and 100 that reflects the overall stringency of the government’s response over time. It measures how many of these nine indicators (mostly around social isolation) a government has acted upon, and to what degree.
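
As a rough illustration of how nine ordinal indicators can be combined into a single 0 to 100 score, the sketch below rescales each indicator to 0 to 100 and takes the average. This is a simplification and should not be read as the official OxCGRT formula, which also adjusts for whether a measure is targeted or general in scope. The indicator keys and the maximum level assumed for each indicator are our assumptions and should be checked against the OxCGRT codebook.

```python
# Assumed maximum ordinal level per indicator (verify against the OxCGRT codebook).
MAX_LEVEL = {
    "school_closure": 3,
    "stay_at_home_requirements": 3,
    "restrictions_on_gathering_size": 4,
    "workplace_closure": 3,
    "restrictions_on_internal_movement": 2,
    "public_transport_closures": 2,
    "cancellation_of_public_events": 2,
    "restrictions_on_international_travel": 4,
    "public_information_campaign": 2,
}


def simplified_stringency(levels):
    """Average of the nine indicator scores, each rescaled to the range 0-100.

    levels: dict mapping indicator name to its ordinal level.
    Indicators missing from the dict are treated as level 0 (no measure).
    """
    scores = []
    for name, max_level in MAX_LEVEL.items():
        level = levels.get(name, 0)
        if not 0 <= level <= max_level:
            raise ValueError(f"{name} must be between 0 and {max_level}, got {level}")
        scores.append(100.0 * level / max_level)
    return sum(scores) / len(scores)


# Hypothetical example: schools fully closed, partial stay-at-home order, nothing else.
example = simplified_stringency({"school_closure": 3, "stay_at_home_requirements": 2})
print(f"Simplified stringency: {example:.1f}")
```

A score of 0 means no indicator has been acted upon, and 100 means every indicator is at its strictest assumed level.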

Data on daily reported COVID-19 deaths was collected from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, which aggregates information from sources including local and national governments.

View the June 4, 2020 paper introducing PIP

When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes

Zhaozhi Qian, Ahmed M. Alaa, Mihaela van der Schaar

To find out more about the van der Schaar Lab’s work related to the COVID-19 pandemic, visit our dedicated page here.

Assumptions and limitations

Our model makes predictions about the future by learning patterns in data collected from the past. This model can only provide a “best guess” of likely COVID-19 outcomes based on current knowledge and cumulative trends observed across many countries, but it can never provide a 100% accurate prediction for any individual country. The accuracy of our model is bounded by the quality of the data and the validity of our modeling assumptions. Future changes in social behavior and policies, or the emergence of viable pharmaceutical interventions, may cause actual COVID-19 fatalities to differ significantly from our projections. Details of the model used to issue our projections can be found in this research paper.

Limitations of our model

  • Data quality: Our model is trained on data pertaining to confirmed COVID-19 deaths across many countries. Reporting quality may vary widely across countries, and actual numbers of deaths in some countries may be much higher than reported ones. Data on excess mortality was not available for all countries. In many countries, data reporting is significantly incomplete at weekends, hence our projections may appear to be overestimated on Saturdays and Sundays.
  • Seasonality: The current version of the model does not incorporate seasonal changes in the factors affecting disease spread. We will include country-level weather patterns in the next update to account for potential seasonality.
  • Future policies and social behavior: Our model does not predict future policies or the extent to which populations will continue to comply with lockdown measures. Unexpected sociopolitical events that may ignite large-scale social gatherings, such as the ongoing protests and demonstrations in the United States, are not accounted for.