van der Schaar Lab

Policy Impact Predictor for COVID-19

Answering the “What if?” questions of COVID-19

Policy Impact Predictor (PIP) is a machine learning tool developed to guide government decision-making around measures to prevent the spread of COVID-19.

In addition to accurately modeling COVID-19 mortality trends under current policy sets, PIP can adaptively tailor forecasts to show the potential impact of specific policy changes, such as reopening schools or workplaces, implementing mask mandates, or relaxing shelter-in-place requirements.

A partial implementation of the PIP model is now available as an interactive web app.

Why the world needs PIP

The COVID-19 pandemic has led many countries to impose emergency lockdown measures that are unheard of in their scale and impact. These policies have been implemented to save lives, but cannot be maintained permanently and are not without economic and social costs.

As world leaders consider when and how to lift these measures in their respective countries, they have been presented with a number of policy levers. Going forward, they will need to decide how many of these levers to pull, when to pull them, and how far. These choices are exceptionally challenging in the absence of informed guidance regarding likely outcomes of any specific policy combination.

PIP offers a unique solution to this problem: it allows users to generate custom projections based on policy combinations of their choosing. These projections are based on learnings from policy decisions made around the world, which are then applied on a local basis while factoring in the characteristics of each individual country.

We believe that PIP could be used by governments to make the best possible decisions in a difficult situation. We hope that our model demonstrates the potential of machine learning-based decision-making for public health in the post-coronavirus world.

Not just another spread model

The academic community has produced a multitude of models for forecasting COVID-19 fatalities. These have been used by public health organizations, such as the WHO and the CDC, to advise local authorities on what policy directions to follow.

One key shortcoming of many of these models is that they cannot answer “What if?” questions by predicting the impact of lockdown policies that have not yet been implemented. Such analysis is crucial for conducting scenario analyses and policy planning.

This is, in part, because existing models have been based primarily on epidemiological and statistical approaches, with machine learning techniques being used merely for parameter optimization—despite the potential of machine learning as an analytical and predictive tool.

Unlike other models, PIP (1) incorporates individual country features to learn how disease dynamics and policy effects vary within and between countries, and (2) enables evaluation of future projections under alternative policy scenarios.

PIP is also able to tackle “What if?” policy questions looking into the past. For example, PIP can estimate what would have happened if Italy’s government had waited a week before imposing lockdown measures.

Note: the “counterfactual” projections on the left-hand side of this figure (i.e. the “What could have happened?” projections) can be generated using the underlying PIP model, but aren’t available in the interactive web app.

PIP specifications

Advanced machine learning model

PIP is a two-layer model. The lower layer uses a variant of compartmental SEIR (Susceptible, Exposed, Infected, Recovered) models to capture fatalities within a geographical area over time. Our compartmental model comprises six compartments, modeling susceptible, exposed, infected, critically ill, recovered, and dead patients over time. The lower layer has parameters that are specific to each country.
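As a rough illustration of the lower layer, the Python sketch below steps a six-compartment model forward one day at a time. It is not the lab's implementation: the transition structure, the names (State, step, simulate, beta_by_day, p_crit and so on) and every rate value are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class State:
    S: float  # susceptible
    E: float  # exposed
    I: float  # infected
    C: float  # critically ill
    R: float  # recovered
    D: float  # dead


def step(s, beta, n, sigma=1 / 5, gamma=1 / 7, p_crit=0.05,
         delta=1 / 10, p_death=0.3):
    """Advance the model by one day (all rates are hypothetical values)."""
    new_exposed = beta * s.S * s.I / n    # S -> E, driven by the contact rate
    new_infected = sigma * s.E            # E -> I after incubation
    leaving_i = gamma * s.I               # I -> C or R
    to_critical = p_crit * leaving_i
    to_recovered = leaving_i - to_critical
    leaving_c = delta * s.C               # C -> D or R
    to_dead = p_death * leaving_c
    crit_recovered = leaving_c - to_dead
    return State(
        S=s.S - new_exposed,
        E=s.E + new_exposed - new_infected,
        I=s.I + new_infected - leaving_i,
        C=s.C + to_critical - leaving_c,
        R=s.R + to_recovered + crit_recovered,
        D=s.D + to_dead,
    )


def simulate(days, beta_by_day, n=1_000_000, seed_infected=100):
    """Return the list of daily states; beta_by_day(t) gives the contact rate."""
    state = State(S=n - seed_infected, E=0.0, I=float(seed_infected),
                  C=0.0, R=0.0, D=0.0)
    trajectory = [state]
    for t in range(days):
        state = step(state, beta_by_day(t), n)
        trajectory.append(state)
    return trajectory


# Constant contact rate versus a contact rate that drops on day 30.
baseline = simulate(120, lambda t: 0.4)
restricted = simulate(120, lambda t: 0.4 if t < 30 else 0.1)
print(round(baseline[-1].D), round(restricted[-1].D))
```

Passing a beta_by_day function that changes value on a chosen day is one simple way to mimic a policy taking effect; in PIP itself, the time-varying quantity is produced by the upper layer rather than set by hand.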

The upper layer shares parameters across countries. It uses “country-and-policy-specific” variables to model the effect of different non-pharmaceutical interventions on fatalities over time. This is achieved using a Recurrent Neural Network (RNN) model that maps the observed numbers of cases and deaths, previous interventions, and other variables pertaining to weather conditions to a time-varying reproduction number R0. Our model combines the solid mechanistic foundations of compartmental models with the flexible data-driven modeling and gradient-based optimization routines of machine learning.

In validating the performance of PIP during the initial months of the COVID-19 pandemic, we compared the forecasting accuracy of our model with the CDC-listed baselines that offer projections for countries other than the US (IHME, Imperial and YYG). We evaluated the performance of all baselines in 10 countries that were significantly affected by the pandemic in Europe, Asia, Africa, and the Americas (note that ours is the only model that covers all countries and spans all continents, even if these are not currently available in the web app).

Since these countries were at different stages of the pandemic at any given time, we evaluated the 7-day and 14-day forecasts on April 25, 2020, when all countries had a significant number of infections. As we can see in the table below, our model outperforms the baselines in almost all countries on both forecasting horizons.

Table showing accuracy of cumulative mortality predictions by different baselines in various countries around the world; most accurate model in bold and underlined.

Central to the accuracy of our model is its hierarchical structure, which shares data across countries based on their “feature similarity”, enabling accurate predictions even when little data is available for individual countries.
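To make the idea of sharing information by feature similarity concrete, here is a minimal Python sketch. It is a stand-in, not PIP's method: it uses a plain RBF-kernel weighted average, and the names rbf_similarity and pooled_estimate, the length_scale parameter and the toy numbers are all hypothetical.

```python
import math


def rbf_similarity(x, y, length_scale=1.0):
    """RBF kernel between two (already standardized) country feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * length_scale ** 2))


def pooled_estimate(target_features, others, length_scale=1.0):
    """Similarity-weighted average of estimates from other countries.

    `others` is a list of (features, estimate) pairs.
    """
    weights = [rbf_similarity(target_features, f, length_scale)
               for f, _ in others]
    total = sum(weights)
    return sum(w * e for w, (_, e) in zip(weights, others)) / total


# Toy, made-up inputs: one very similar country and one dissimilar country.
others = [([0.1, 0.0], 10.0), ([3.0, 3.0], 100.0)]
print(pooled_estimate([0.0, 0.0], others))
```

The estimate for the target country is dominated by the similar country, which is the behavior a similarity-based hierarchy is meant to provide when a country has little data of its own.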

Our model learns not only the fatality curves within each country, but also the country-specific effects of lockdown. Since the same policy would naturally yield different effects in different countries, this is a major source of performance gain for our model compared to others, which assume fixed policy effects.

Note: this validation was conducted on a slightly earlier version of PIP in Spring 2020; the underlying model has since been improved even further.

Because different countries with comparable features apply different policies, the upper-layer function will learn mortality curves for each country under alternative policies that have been tried in other countries. For instance, we can learn the fatality curves for Scandinavian countries (such as Norway and Denmark) under a hypothetically less restrictive lockdown policy using data from Sweden, which adopted less stringent policy measures.

Similarly, the graph in the preceding section displays predictions and counterfactual inferences for a variety of policy scenarios in the UK.

High-quality datasets

The openly licensed dataset from the DELVE Initiative consolidates country-level data on non-pharmaceutical interventions, cases, deaths, tests, excess mortality, mobility statistics, weather patterns and other metadata from multiple sources into a single analysis-friendly format.

The DELVE dataset tracks:
– Non-Pharmaceutical Interventions (NPIs)
– COVID-19 cases, deaths and tests
– Excess mortality
– Mobility data
– Weather patterns
– Population-related metadata per country (based on World Bank data)

Published World Bank reports were used to extract a set of 35 economic, social, demographic, environmental and public health indicators for each country.

GDP per capita, GNI per capita, Income share held by lowest 20%

Population, Life expectancy, Birth rate, Death rate, Infant mortality rate, Land Area, % People with basic hand-washing facilities including soap and water, Smoking prevalence, Prevalence of undernourishment, Prevalence of overweight, Urban population, Population density, Population ages 65 and above, Access to electricity (% of population), UHC service coverage index, Total alcohol consumption per capita, Air transport (passengers carried)

Forest Area, PM2.5 air pollution (mean annual exposure in micrograms per cubic meter)

Immunization for measles, % deaths by communicable diseases, Current health expenditure, Current health expenditure per capita, Diabetes prevalence, Immunization for DPT, Immunization for HepB3, Incidence of HIV, Incidence of malaria, Incidence of tuberculosis, % deaths by CVD/cancer/diabetes/CRD, % deaths due to household and ambient air pollution, % deaths due to unsafe water/unsafe sanitation/lack of hygiene, Physicians (per 1,000 people)


The Oxford COVID-19 Government Response Tracker (OxCGRT)—curated by the Blavatnik School of Government at Oxford University—was used to extract the following policy indicators for each country over time:

– School closure
– Shelter-in-place requirements
– Restrictions on gathering size
– Workplace closure
– Restrictions on domestic or internal movement
– Public transport closures
– Cancellation of public events
– Restrictions on international travel


Note: mask mandate policy settings and data are derived from datasets included in DELVE, not OxCGRT.

Data on daily reported COVID-19 deaths was collected from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, which aggregates information from local and national government sources, among others.


About the PIP web app

As of October 2020, PIP is available as a web app that allows users to model the impact of custom policy sets on COVID-19 mortality trends. The web app should primarily be viewed as a proof of concept showing the capabilities of PIP, but it lacks the full functionality of the underlying PIP model.

The PIP web app can currently generate forward-looking projections (from the current date) for a user-specified period for a single country (the U.K. at present), based on policy scenarios defined by users.

Policy settings:
– School closures (on/off)
– Mask policies (4 settings)
– Workplace closure (on/off)
– Shelter-in-place orders (on/off)
– Public events cancellation (on/off)
– Gatherings restrictions (on/off)
– Travel restrictions (on/off)

Projection period settings:
Users can create projections starting from the current date and extending up to 90 days into the future.

Projection types:
Users can choose to forecast either daily deaths or the reproduction number (R0).

Available countries and regions:
The PIP web app currently offers projections for the U.K., though other countries may be added in the near future. Subnational regions are not currently included, due to difficulties acquiring high-quality data.

As mentioned above, the PIP app is a partial implementation of the van der Schaar Lab’s PIP model. The model offers the following features that aren’t available in the web app.

“Counterfactual” policy analysis
Whereas the web app is limited to projections starting from the current date, the PIP model enables users to make custom projections that start from any point since the novel coronavirus began to spread. This can be used to answer “What if?” questions in which the projected policy scenario differs from the policies that were actually implemented. Naturally, the model can also be used to determine the impact of implementing policies earlier or later than they were, in fact, implemented.

Projections incorporating multiple policy scenarios over time
The web app offers projections that start from the current date, and doesn’t allow users to model the impact of subsequent policy changes, but the PIP model itself can project the likely impact of the implementation of a number of different policy sets over time.

Setting up a projection for a custom policy scenario
The default view displayed by the PIP app is a one-month projection (from the current date) for the U.K. under our custom policy scenario, in which schools and workplaces are closed and masks are recommended.

To create a custom projection, you’ll need to:

1. Choose your forecast settings
– Select a country/region (U.K. currently available)
– Decide whether to show daily deaths or reproduction number (R0)
– Select a projection period (max. 90 days)

2. Create your custom policy scenario
– Select from a range of policy options related to social distancing, event restrictions, travel restrictions, mask mandates, etc.
– Click “apply policy” to update the displayed projection
– Click “reset to current trend” to return to our lab’s default policy scenario

3. Tweak display settings
– Choose whether to display confidence intervals and model fit
– Switch between logarithmic and linear y-axis display
– Alter the graph start date

Interpreting a policy scenario
The projection starts on the current date, with the dotted red line representing PIP’s predicted trend in daily deaths or reproduction number for the specified period. The predicted value is surrounded by a confidence interval, shown as upper and lower bounds.

An example of one interpretation of a policy scenario is provided below.

As of late October 2020, PIP’s projection for the U.K. suggests that, in a scenario in which the only preventative measures in place are a universal mask mandate and restrictions on events and gatherings, the daily death rate would increase fairly rapidly over a month or two, peaking at over 1,300 by the end of the year. It would then fall somewhat over the subsequent month or so. To create your own custom policy scenario, follow the link below.

Source code for the PIP model can be found in our lab’s Bitbucket repo.

You can use this to conduct further analysis and research, such as counterfactual analysis with previous data or prediction of new cases. You can also freely build on and improve PIP.

View our NeurIPS 2020 paper introducing PIP

When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes

Zhaozhi Qian, Ahmed M. Alaa, Mihaela van der Schaar

Note: PIP builds on our two-layer machine learning-based compartmental model, which was accepted for an oral presentation at NeurIPS 2020 and introduced in the paper above.

Since the publication of the initial version of the model in June 2020, we have upgraded PIP’s functionality by using an RNN to better capture temporal dependencies on exogenous factors such as weather conditions and mobility patterns.

To find out more about the van der Schaar Lab’s work related to the COVID-19 pandemic, visit our dedicated page here.

Assumptions and limitations

Our model makes predictions about the future by learning patterns in data collected from the past. This model can only provide a “best guess” of likely COVID-19 outcomes based on current knowledge and cumulative trends observed across many countries, but it can never provide a 100% accurate prediction for any individual country. The accuracy of our model is bound by the quality of the data and the validity of our modeling assumptions. Future changes in social behavior and policies, or the emergence of viable pharmaceutical interventions, may cause actual COVID-19 fatalities to diverge significantly from our projections. Details of the model used to issue our projections can be found in this research paper.

Limitations of our model

  • Data quality: Our model is trained on data pertaining to confirmed COVID-19 deaths across many countries. The data reporting quality may vary widely across countries, and actual numbers of deaths in some countries may be much higher than reported ones. Data on excess mortality was not available for all countries. In many countries, data reporting is significantly incomplete at weekends, hence our projections may appear to be overestimated on Saturdays and Sundays.
  • Future policies and social behavior: Our model does not predict future policies or the extent to which populations will continue to comply with lockdown measures. Unexpected sociopolitical events that may trigger large-scale social gatherings, such as the ongoing protests and demonstrations in the United States, are not accounted for.
  • Patient-level demographic information: PIP does not incorporate patient-level demographic information (age, etc.) for new COVID-19 cases, since such data is not made available in a usable format. As a result, there may be complexities beyond simple case count figures, such as disparities in COVID-19 infection rates between age groups, that may materially impact death rates in a manner that cannot be reflected by PIP in its projections. This could potentially cause overestimation of death rates.
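The weekend reporting effect mentioned under data quality is commonly handled by smoothing daily counts with a 7-day average before comparing them with a model. The Python sketch below shows one way to do this; the function name trailing_mean and the toy series are hypothetical and are not taken from the PIP code base.

```python
def trailing_mean(values, window=7):
    """Trailing moving average; early positions use the values seen so far."""
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Made-up daily counts with under-reporting on the last two days of each week.
week = [100, 100, 100, 100, 100, 40, 40]
series = week * 3
print(trailing_mean(series)[6:])
```

Once a full week is inside the window, the weekly dip no longer shows up in the smoothed series, so day-of-week reporting artifacts are not mistaken for a real change in trend.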


Is the PIP model based entirely on data-driven machine learning?

No, PIP uses a combination of mechanistic epidemiological models and machine learning models to issue its projections. A rigorous compartmental model with parameters obtained from published literature on COVID-19 is used to simulate trajectories of COVID-19 fatalities for a given R0. The effect of different non-pharmaceutical interventions on the R0 parameter over time is estimated using a machine learning model.

How does the PIP training algorithm work?

There are 3 main steps involved in the PIP training algorithm:
1. Detect change points in R0
2. Optimize the SEIR model of each country based on 7-day average deaths
3. Train the Quantile LSTM to predict the optimized contact rate of each country
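The source does not state which loss the Quantile LSTM in step 3 is trained with. Quantile models are typically fitted with the pinball (quantile) loss, so the Python sketch below illustrates that loss on its own, under that assumption; the names pinball_loss and best_constant are hypothetical and no LSTM is involved.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss for quantile level q (0 < q < 1)."""
    if not 0 < q < 1:
        raise ValueError("q must be strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction costs q per unit, over-prediction costs (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def best_constant(y_true, q, candidates):
    """Constant prediction with the smallest pinball loss among candidates."""
    return min(candidates,
               key=lambda c: pinball_loss(y_true, [c] * len(y_true), q))


data = list(range(1, 102))  # toy values 1..101
print(best_constant(data, 0.5, data))  # minimizing at q = 0.5 recovers the median
```

Training one output per quantile level (for example a low, middle and high level) is a common way to obtain a central forecast together with lower and upper bounds.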


Does PIP account for the varying age demographic of new COVID-19 cases over time?

No, PIP assumes that the age demographic of COVID-19 cases is fixed over time, since data on the breakdown of the age groups diagnosed each day is unavailable. With younger individuals being diagnosed more often in recent days, it is more likely that PIP would overestimate future fatalities rather than underestimate them.

Does PIP account for seasonality?

PIP implicitly accounts for seasonality by assuming that the R0 in each geographical area is a function of weather conditions.

How does PIP model the adherence of populations to lockdown policies?

PIP uses a three-layer model that estimates the effect of lockdown policies on mobility patterns and links population mobility to the realized R0. The machine learning component of PIP learns to predict how social distancing measures would change mobility in each country, which reflects the extent of a population’s adherence to these measures.

Do your projections take into account the numerous policy differences between different regions within the same country?

Not at this point, but future versions of PIP could include a geographical breakdown for some countries where such data is available.