van der Schaar Lab

Policy Impact Predictor for COVID-19

Answering the “What if?” questions of COVID-19

Policy Impact Predictor (PIP) is a machine learning tool developed to guide government decision-making around measures to prevent the spread of COVID-19.

In addition to accurately modeling COVID-19 mortality trends under current policy sets, PIP can adaptively tailor forecasts to show the potential impact of specific policy changes, such as reopening schools or workplaces, implementing mask mandates, or relaxing shelter-in-place requirements.

A partial implementation of the PIP model is now available as an interactive web app.

Why the world needs PIP

The COVID-19 pandemic has led many countries to impose emergency lockdown measures that are unheard of in their scale and impact. These policies have been implemented to save lives, but cannot be maintained permanently and are not without economic and social costs.

As world leaders consider when and how to lift these measures in their respective countries, they have been presented with a number of policy levers. Going forward, they will need to decide how many of these levers to pull, when to pull them, and how far. These choices are exceptionally challenging in the absence of informed guidance regarding likely outcomes of any specific policy combination.

PIP offers a unique solution to this problem: it allows users to generate custom projections based on policy combinations of their choosing. These projections are based on learnings from policy decisions made around the world, which are then applied on a local basis while factoring in the characteristics of each individual country.

We believe that PIP could be used by governments to make the best possible decisions in a difficult situation. We hope that our model demonstrates the potential of machine learning-based decision-making for public health in the post-coronavirus world.

Not just another spread model

The academic community has produced a multitude of models for forecasting COVID-19 fatalities. These have been used by public health organizations, such as the WHO and the CDC, to advise local authorities on what policy directions to follow.

One key shortcoming of many of these models is that they cannot answer “What if?” questions by predicting the impact of lockdown policies that have not yet been implemented. Such analysis is crucial for conducting scenario analyses and policy planning.

This is, in part, because existing models have been based primarily on epidemiological and statistical approaches, with machine learning techniques being used merely for parameter optimization—despite the potential of machine learning as an analytical and predictive tool.

Unlike other models, PIP (1) incorporates individual country features to learn how disease dynamics and policy effects vary within and between countries, and (2) enables evaluation of future projections under alternative policy scenarios.

PIP is also able to tackle “What if?” policy questions looking into the past. For example, PIP can estimate what would have happened if Italy’s government had waited a week before imposing lockdown measures.

Note: the “counterfactual” projections on the left-hand side of this figure (i.e. the “What could have happened?” projections) can be generated using the underlying PIP model, but aren’t available in the interactive web app.

PIP specifications

Advanced machine learning model

PIP is a two-layer model. The lower layer uses a variant of compartmental SEIR (Susceptible, Exposed, Infected, Recovered) models to capture fatalities within a geographical area over time. Our compartmental model comprises six compartments, modeling susceptible, exposed, infected, critically ill, recovered, and dead patients over time. The lower layer has parameters that are specific to each country.
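To make the six-compartment structure concrete, here is a minimal sketch in Python. It is an illustration only, not PIP's implementation: the transition structure (infected patients either recover or become critically ill, and critically ill patients either recover or die), the forward-Euler time step, and every parameter value are assumptions chosen for readability.

```python
from dataclasses import dataclass


@dataclass
class Params:
    # All values are illustrative assumptions, not PIP's fitted parameters.
    beta: float = 0.5      # contact (transmission) rate per day
    sigma: float = 1 / 5   # exposed -> infected rate
    gamma: float = 1 / 7   # infected -> recovered rate
    kappa: float = 0.01    # infected -> critically ill rate
    rho: float = 1 / 10    # critically ill -> recovered rate
    mu: float = 1 / 10     # critically ill -> dead rate


def step(state, p, dt=1.0):
    """Advance (S, E, I, C, R, D) by one forward-Euler step of dt days."""
    S, E, I, C, R, D = state
    living = S + E + I + C + R
    new_exposed = p.beta * S * I / living
    dS = -new_exposed
    dE = new_exposed - p.sigma * E
    dI = p.sigma * E - (p.gamma + p.kappa) * I
    dC = p.kappa * I - (p.rho + p.mu) * C
    dR = p.gamma * I + p.rho * C
    dD = p.mu * C
    deltas = (dS, dE, dI, dC, dR, dD)
    return tuple(x + dt * dx for x, dx in zip(state, deltas))


def simulate(state, p, days):
    """Return the list of daily states for day 0 through day `days`."""
    trajectory = [state]
    for _ in range(days):
        state = step(state, p)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    start = (999_990.0, 0.0, 10.0, 0.0, 0.0, 0.0)
    final = simulate(start, Params(), days=120)[-1]
    print(f"cumulative deaths after 120 days: {final[5]:.0f}")
```

In this sketch the flows between compartments sum to zero, so the total population (including the dead) is conserved. Making `beta` vary by day is the hook through which a policy-dependent reproduction number could enter such a model.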

The upper layer shares parameters across countries. It uses “country-and-policy-specific” variables to model the effect of different non-pharmaceutical interventions on fatalities over time. This is achieved using a Recurrent Neural Network (RNN) model that maps the observed number of cases, deaths, previous interventions and other variables pertaining to weather conditions to a time-varying reproduction number R0. Our model combines the solid mechanistic foundations of compartmental models with the flexible data-driven modeling and gradient-based optimization routines of machine learning.
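As a rough illustration of the idea of mapping a sequence of daily observations to a time-varying reproduction number, the sketch below runs a tiny Elman-style recurrent cell over daily feature vectors and passes its output through a softplus so that each value is positive. This is not PIP's architecture: the class name `TinyRNN`, the hidden size, the feature layout, and the untrained random weights are all hypothetical and serve only to show the data flow.

```python
import math
import random


def softplus(x):
    # Numerically stable log(1 + exp(x)); the result is always positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


class TinyRNN:
    """Elman-style recurrent cell with untrained random weights (illustration only)."""

    def __init__(self, n_features, n_hidden=4, seed=0):
        rng = random.Random(seed)
        w = lambda: rng.uniform(-0.5, 0.5)
        self.n_hidden = n_hidden
        self.W_in = [[w() for _ in range(n_features)] for _ in range(n_hidden)]
        self.W_rec = [[w() for _ in range(n_hidden)] for _ in range(n_hidden)]
        self.w_out = [w() for _ in range(n_hidden)]

    def reproduction_numbers(self, sequence):
        """Map a sequence of daily feature vectors to one positive value per day."""
        h = [0.0] * self.n_hidden
        outputs = []
        for x in sequence:
            # The new hidden state depends on today's features and yesterday's state.
            h = [
                math.tanh(
                    sum(wi * xi for wi, xi in zip(self.W_in[j], x))
                    + sum(wr * hk for wr, hk in zip(self.W_rec[j], h))
                )
                for j in range(self.n_hidden)
            ]
            outputs.append(softplus(sum(wo * hj for wo, hj in zip(self.w_out, h))))
        return outputs


if __name__ == "__main__":
    # Hypothetical features per day: [scaled cases, scaled deaths, school closed, scaled temperature]
    days = [[0.2, 0.1, 0.0, 0.5], [0.4, 0.1, 1.0, 0.5], [0.5, 0.2, 1.0, 0.4]]
    print(TinyRNN(n_features=4).reproduction_numbers(days))
```

Because the hidden state carries information forward, an intervention introduced on one day can influence the output on later days, which is the property that makes a recurrent model a natural fit for policy effects that unfold over time.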

In validating the performance of PIP during the initial months of the COVID-19 pandemic, we compared the forecasting accuracy of our model with the CDC-listed baselines that offer projections for countries other than the US (IHME, Imperial and YYG). We evaluated the performance of all baselines in 10 countries that were significantly affected by the pandemic in Europe, Asia, Africa, and the Americas (note that ours is the only model that covers all countries and spans all continents, even if these are not currently available in the web app).

Since these countries were at different stages of the pandemic at any given time, we evaluated the 7-day and 14-day forecasts on April 25, 2020, when all countries had a significant number of infections. As we can see in the table below, our model outperforms the baselines in almost all countries on both forecasting horizons.

Table showing accuracy of cumulative mortality predictions by different baselines in various countries around the world; most accurate model in bold and underlined.

Central to the accuracy of our model is its hierarchical structure, which shares data across countries based on their “feature similarity”, enabling accurate predictions even when little data is available for individual countries.

Our model learns not only the fatality curves within each country, but also the country-specific effects of lockdown. Since the same policy would naturally yield different effects in different countries, this is a major source of performance gain for our model compared to others, which assume fixed policy effects.

Note: this validation was conducted on a slightly earlier version of PIP in Spring 2020; the underlying model has since been improved even further.

Because different countries with comparable features apply different policies, the upper-layer function will learn mortality curves for each country under alternative policies that have been tried in other countries. For instance, we can learn the fatality curves for Scandinavian countries (such as Norway and Denmark) under a hypothetically less restrictive lockdown policy using data from Sweden, which adopted less stringent policy measures.

Similarly, the graph in the preceding section displays predictions and counterfactual inferences for a variety of policy scenarios in the UK.

High-quality datasets

This openly-licensed dataset from the DELVE Initiative consolidates country-level data on non-pharmaceutical interventions, cases, deaths, tests, excess mortality, mobility statistics, weather patterns and other metadata from multiple sources into a single analysis-friendly format.

The DELVE dataset tracks:
– Non-Pharmaceutical Interventions (NPIs)
– COVID-19 cases, deaths and tests
– Excess mortality
– Mobility data
– Weather patterns
– Population-related metadata per country (based on World Bank data)

Published World Bank reports were used to extract a set of 35 economic, social, demographic, environmental and public health indicators for each country.

Economic Indicators

Social and Demographic Indicators

Environmental Indicators

Public Health Indicators


The Oxford COVID-19 Government Response Tracker (OxCGRT)—curated by the Blavatnik School of Government at Oxford University—was used to extract the following policy indicators for each country over time:

– School closure
– Shelter-in-place requirements
– Restrictions on gathering size
– Workplace closure
– Restrictions on domestic or internal movement
– Public transport closures
– Cancellation of public events
– Restrictions on international travel


Note: mask mandate policy settings and data are derived from datasets included in DELVE, not OxCGRT.

Data on daily reported COVID-19 deaths was collected from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, which aggregates information from sources including local and national governments.


About the PIP web app

As of October 2020, PIP is available as a web app that allows users to model the impact of custom policy sets on COVID-19 mortality trends. The web app should primarily be viewed as a proof of concept showing the capabilities of PIP, but it lacks the full functionality of the underlying PIP model.

What the web app can do

What the web app can’t do (but the PIP model can)

How to read PIP’s projections

Source code for the PIP model can be found in our Lab’s Bitbucket repo.

You can use this to conduct further analysis and research, such as counterfactual analysis with previous data or prediction of new cases. You can also freely build on and improve PIP.

View our NeurIPS 2020 paper introducing PIP

When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes

Zhaozhi Qian, Ahmed M. Alaa, Mihaela van der Schaar

Note: PIP builds on our two-layer machine learning-based compartmental model, which was accepted for an oral presentation at NeurIPS 2020 and introduced in the paper above.

Since the publication of the initial version of the model in June 2020, we have upgraded PIP’s functionalities by using an RNN to better capture temporal dependencies on exogenous factors such as weather conditions and mobility patterns.

To find out more about the van der Schaar Lab’s work related to the COVID-19 pandemic, visit our dedicated page here.

Assumptions and limitations

Our model makes predictions about the future by learning patterns in data collected from the past. It can only provide a “best guess” of likely COVID-19 outcomes based on current knowledge and cumulative trends observed across many countries; it can never provide a 100% accurate prediction for any individual country. The accuracy of our model is bounded by the quality of the data and the validity of our modeling assumptions. Future changes in social behavior and policies, or the emergence of viable pharmaceutical interventions, may cause actual COVID-19 fatalities to differ significantly from our projections. Details of the model used to issue our projections can be found in this research paper.

Limitations of our model

  • Data quality: Our model is trained on data pertaining to confirmed COVID-19 deaths across many countries. The quality of data reporting may vary widely across countries, and actual numbers of deaths in some countries may be much higher than reported ones. Data on excess mortality was not available for all countries. In many countries, data reporting is significantly incomplete at weekends, so our projections may appear to be overestimates on Saturdays and Sundays.
  • Future policies and social behavior: Our model does not predict future policies or the extent to which populations will continue to comply with lockdown measures. Unexpected sociopolitical events that may ignite large-scale social gatherings, such as the protests and demonstrations ongoing in the United States at the time of writing, are not accounted for.
  • Patient-level demographic information: PIP does not incorporate patient-level demographic information (age, etc.) for new COVID-19 cases, since such data is not made available in a usable format. As a result, there may be complexities beyond simple case count figures (such as disparities in COVID-19 infection rates between age groups) that materially affect death rates in ways PIP cannot reflect in its projections. This could potentially cause overestimation of death rates.


Is the PIP model based entirely on data-driven machine learning?

No, PIP uses a combination of mechanistic epidemiological models and machine learning models to issue its projections. A rigorous compartmental model with parameters obtained from published literature on COVID-19 is used to simulate trajectories of COVID-19 fatalities for a given R0. The effects of different non-pharmaceutical interventions on the R0 parameter over time are estimated using a machine learning model.

How does the PIP training algorithm work?

There are 3 main steps involved in the PIP training algorithm:
1. Detect change points in R0
2. Optimize the SEIR model of each country based on 7-day average deaths
3. Train the Quantile LSTM to predict the optimized contact rate of each country
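Two ingredients named in these steps can be sketched in a few lines of Python: the 7-day average of daily deaths used as the fitting target in step 2, and a quantile (pinball) loss of the kind typically used to train quantile models such as the one in step 3. This is a hedged illustration rather than PIP's code: the function names are hypothetical, the trailing-window convention is an assumption, the choice of pinball loss is an assumption about how a quantile model would be trained, and change point detection (step 1) is not shown.

```python
def rolling_mean_7d(daily):
    """Trailing 7-day average; early days average over the days available so far."""
    smoothed = []
    for i in range(len(daily)):
        window = daily[max(0, i - 6): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def pinball_loss(y_true, y_pred, q):
    """Mean quantile (pinball) loss at quantile level q, with 0 < q < 1."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += max(q * diff, (q - 1) * diff)
    return total / len(y_true)


if __name__ == "__main__":
    # Weekend under-reporting (the two zeros) is smoothed out by the 7-day average.
    deaths = [7, 7, 7, 7, 7, 0, 0]
    print(rolling_mean_7d(deaths)[-1])
    # At q = 0.9, predicting too low costs more than predicting too high.
    print(pinball_loss([10.0], [8.0], 0.9), pinball_loss([10.0], [12.0], 0.9))
```

Averaging over a full week means every window contains exactly one of each weekday, which is why this kind of smoothing is a common way to dampen weekly reporting artifacts.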

These steps are outlined in more detail below.

Does PIP account for the varying age demographic of new COVID-19 cases over time?

No, PIP assumes that the age demographic of COVID-19 cases is fixed over time, since data on the breakdown of the age groups diagnosed each day is unavailable. With younger individuals being diagnosed more often in recent days, PIP is more likely to overestimate future fatalities than to underestimate them.

Does PIP account for seasonality?

PIP implicitly accounts for seasonality by assuming that the R0 in each geographical area is a function of weather conditions.

How does PIP model the adherence of populations to lockdown policies?

PIP uses a three-layer model that estimates the effect of lockdown policies on mobility patterns and links population mobility to the realized R0. The machine learning component of PIP learns to predict how social distancing measures would change mobility in each country, which reflects the extent of a population’s adherence to these measures.

Do your projections take into account the numerous policy differences between different regions within the same country?

Not at this point, but future versions of PIP could include a geographical breakdown for some countries where such data is available.