van der Schaar Lab
Causal inference treatment effects

Individualized treatment effect inference

Machine learning is capable of enabling truly personalized healthcare; this is what our lab calls “bespoke medicine.”

More info on bespoke medicine can be found here.

Bespoke medicine entails far more than providing predictions for individual patients: we also need to understand the effect of specific treatments on specific patients at specific times. This is what we call individualized treatment effect inference. It is a substantially more complex undertaking than prediction, and every bit as important.

Our lab has built a position of leadership in this area. We have defined the research agenda by outlining and addressing key complexities and challenges, and by laying the theoretical groundwork for model development. In our development of algorithms, we have identified and targeted an extensive range of potential clinical applications using both clinical trials and observational data as inputs.

The page below provides an introduction to individualized treatment effect inference, as well as an overview of some key projects that have driven the entire research area forward.

On this page:
  1. Individualized treatment effect inference: a brief introduction
  2. Treatment effects: from the average to the individual
  3. Why is individualized treatment effect inference so complicated?
  4. Estimating response surfaces
  5. Including treatment effects in outcome models and handling bias
  6. Selecting optimal models for individualized treatment effect inference
  7. Individualized treatment effect estimation using time-series data
  8. ML-assisted clinical trials
  9. Pharmacology
  10. Learn more and get involved
  11. Our work so far

This page is one of several introductions to areas that we see as “research pillars” for our lab. It is a living document, and the content here will evolve as we continue to reach out to the machine learning and healthcare communities, building a shared vision for the future of healthcare.

Our primary means of building this shared vision is through two groups of online engagement sessions: Inspiration Exchange (for machine learning students) and Revolutionizing Healthcare (for the healthcare community). If you would like to get involved, please visit the page below.

This page is authored and maintained by Mihaela van der Schaar and Nick Maxfield.


Individualized treatment effect inference: a brief introduction

This page introduces individualized treatment effect inference — which we could also refer to as causal inference of individualized treatment effects — as one of our lab’s key research areas, and offers an overview of a range of relevant projects we have undertaken.

The broader area of “causal inference” in machine learning can be broken down into two sub-fields: (i) causal discovery and (ii) individualized treatment effect inference. While (i) is concerned with discovering which variables affect which others, and in what direction, (ii) is concerned with quantifying those relationships by estimating the effect of one (or more) variables on another. Here we focus exclusively on (ii).

In creating this page, we aim to raise and discuss issues related to both the static (cross‐sectional) setting and the longitudinal setting (where patient history and treatment timing are taken into account). We describe the challenges associated with learning from observational data, such as confounding bias, as well as the modeling choices used by machine learning methods to handle them in both settings.

Our lab is also deeply interested in what we call “causal machine learning,” a related but distinct area where the focus is on using causal graphs to improve the robustness of machine learning for prediction, domain adaptation, transfer learning, and more. To learn more about our work in this area, please take a look at our dedicated Research Pillar on Causal Deep Learning.

Treatment effects: from the average to the individual

A major challenge in the domain of healthcare is ascertaining whether a given treatment influences or determines an outcome—for instance, whether there is a survival benefit to prescribing a certain medication, such as the ability of a statin to lower the risk of cardiovascular disease.

Current treatment guidelines have been developed with the “average” patient in mind (on the basis of randomized controlled trials), but there is ample evidence that different treatments result in different effects and outcomes from one individual to another: for any given treatment, it is quite likely that only a small proportion of people will actually respond in a manner that resembles the “average” patient.

Since the advent of precision medicine and the availability of large amounts of observational data from electronic health records, the research community has started to explore more quantitative individual-level problems, such as the magnitude of the effect of a treatment on a condition for an individual (one example might be the survival benefit of weight loss for a 60-year-old cardiovascular patient with diabetes). Rather than making treatment decisions based on blanket assumptions about “average” patients, the goal of clinical decision-makers is now to determine the optimal treatment course for any given patient at any given time. Methods for doing so in a quantitative fashion based on insights from machine learning are in the formative stages of development (our lab’s work in this area will be covered below).

There are two main sources of evidence for determining whether a treatment works: observational datasets, and (post-hoc analysis of) clinical trials. Each has its own strengths and weaknesses.

Observational datasets
At the moment, doctors can learn from experience and time which treatments work for each individual, but there is no mechanism for sharing this knowledge on a population level in a way that allows the extraction of valuable insights on treatment effects. The increasing availability of observational data has, however, encouraged the development of various machine learning algorithms tailored for inferring treatment effects. It is worth noting, though, that observational datasets are prone to treatment assignment bias, as explained in more detail later on.

Clinical trials
Randomized Controlled Trials (RCTs) are the gold standard for comparing the effectiveness of a new treatment to the current one. Clinical trials may, however, not always be the most practical option for evaluating certain treatments, since they are costly and time-consuming to implement, and they do not always recruit representative patients.

This makes external validity an issue for RCTs, as findings sometimes fail to generalize beyond the study population. One reason is the narrow inclusion criteria of RCTs compared with the real world: historically, trial populations have been restricted with respect to disease severity and comorbidities, and elderly patients and ethnic minorities have been under-represented. By contrast, once drugs are US Food and Drug Administration (FDA)-approved after the clinical trials stage, they start being administered to a much larger and more varied population of patients.

Although there is increasing awareness of this issue and global regulatory authorities are encouraging wider inclusion criteria in clinical trials, it remains an issue that is unlikely to be solved by RCTs and associated integrated and model‐based analyses alone. There is scope to add an adaptive element to clinical trials through the use of machine learning.

To summarize the above: our goal is to support a shift from a focus on average treatment effects to individualized treatment effects by optimizing the use of observational datasets and clinical trial design. Estimating individualized treatment effects from EHR data represents a thriving area of research, in which machine learning methods are primed to take center stage.

Clinical example:
Breast cancer treatment outcomes

Clinical example:
LVAD implantation

Why is individualized treatment effect inference so complicated?

Our goal is to use machine learning to estimate the effect of a treatment on an individual using static or time-series observational data.

The problem of estimating individual-level causal effects is usually formulated within the classical potential outcomes framework, first introduced by Neyman in 1923 and subsequently expanded by Rubin into a broader causal model. The framework is based on observational data consisting of patient features, treatment assignment and outcome.

Note:
Binary treatment options vs. multiple treatment options

The apparent simplicity of this framework belies the true complexity of the problem of individualized treatment effect inference; we believe there are three key reasons for this:
– we must work in the absence of counterfactual outcomes;
– bias in observational datasets must be addressed; and
– there is no single preferred way to include treatment indicators in outcome models.

Furthermore, little work has been done to develop a comprehensive theory for individualized treatment effect inference, including principles for optimal model selection.

Overcoming these challenges will require not just methodological advances but also new ways of thinking. In the sections below, we will provide an explanation of each of these issues, while highlighting some of the ways in which our lab’s projects have made progress towards their resolution.

Note:
Assumptions for individualized treatment effect inference

Estimating response surfaces

In the potential outcomes framework outlined above, every subject (individual) in the observational dataset possesses a number of potential outcomes: the subject’s outcome under the application of various treatments, and the subject’s outcome when no treatment is applied. The treatment effect is the difference between the two potential outcomes, but since we only observe the “factual” outcome for a specific treatment assignment, and never observe the corresponding “counterfactual” outcome, we never observe any examples of the true treatment effect in an observational dataset. This is what makes the problem of individualized treatment effect inference fundamentally different from standard supervised learning (regression).
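
To make the fundamental problem concrete, here is a minimal, fully synthetic sketch (all variable names and numbers are illustrative): we generate both potential outcomes, but the dataset a learner would actually see retains only the factual one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

x = rng.normal(size=n)                        # patient feature
y0 = x + rng.normal(scale=0.1, size=n)        # potential outcome without treatment
y1 = x + 2.0 + rng.normal(scale=0.1, size=n)  # potential outcome under treatment
t = rng.integers(0, 2, size=n)                # treatment assignment

# The observational dataset records only the factual outcome:
y = np.where(t == 1, y1, y0)

# The true individualized treatment effect is never observed in real data:
true_ite = y1 - y0
```

In any real dataset, the columns corresponding to `y0` and `y1` simply do not both exist for the same patient, so `true_ite` can never be computed; this is exactly why treatment effect inference is not a standard regression problem.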

It is important, therefore, to understand from the outset that any method to estimate individualized treatment effects is limited to using the data available at hand, which is entirely composed of factuals, and not counterfactuals. For instance, note that the figure above shows us the factual outcome, but not the counterfactual.

The majority of existing methods for estimating individualized treatment effects from observational data focus on binary or categorical treatment settings, and very few methods consider more complex treatment scenarios. However, treatments often have an associated dosage, which requires us to estimate the causal effects of continuous-valued interventions.

Additionally, for organ transplantation, it is necessary to estimate the effect of high-dimensional, and potentially unique, organs on the patient’s survival, such that we assign each organ to the patient who would derive the highest survival benefit. Our lab has done work to handle these more complex treatment scenarios in two recent papers published at NeurIPS 2020 on individualized dose-response estimation (SCIGAN) and estimating the individualized effect of transplant organs (high-dimensional treatments) on patients’ survival (OrganITE).

Research focus:
Using GANs to compensate for the absence of counterfactual outcomes

Including treatment effects in outcome models and handling bias

When modeling individualized treatment effects, we face further issues related to handling treatment bias in observational datasets, and a multitude of choices regarding approaches to handling treatment indicators when estimating patient outcomes.

The former challenge results from the fact that, when estimating individualized treatment effects, assignment bias creates a discrepancy in the feature distributions for treated and control patient groups. Simply put: decision-making by doctors introduces bias into the data.
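As a minimal illustration of how such assignment bias can be corrected, the classical inverse propensity weighting (IPW) estimator reweights each treatment arm back to the full population. The sketch below uses synthetic data with known propensities; in practice the propensity scores must themselves be estimated, and this simple estimator targets the average (not individualized) effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

x = rng.normal(size=n)                              # confounder, e.g. disease severity
e = np.clip(1 / (1 + np.exp(-2 * x)), 0.05, 0.95)   # propensity: sicker patients treated more often
t = rng.binomial(1, e)
y = x + 1.0 * t + rng.normal(scale=0.5, size=n)     # true average treatment effect is 1.0

# Naive comparison is biased: the treated group is sicker to begin with.
naive = y[t == 1].mean() - y[t == 0].mean()

# Inverse propensity weighting reweights each arm back to the full population.
ipw = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))
```

Here `naive` substantially overestimates the true effect of 1.0 because treatment was preferentially assigned to patients with higher `x`, while `ipw` lands close to the truth.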

Modeling the treatment assignment, and its impact on the outcome, is a similarly complex proposition: several approaches exist, with the simplest being to split data into separate models (treated and untreated), or to use the assignment variable as a feature to augment the feature dimension.

A third solution, which has been adopted in a number of papers by our own lab’s researchers, is to learn shared representations, where the treatment assignment indexes these shared representations. This enables us to learn jointly across the treated and untreated populations.
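The first two of these modeling choices can be sketched in a few lines; the variable names are illustrative, and ordinary least squares stands in for whatever learner one would actually use (the shared-representation approach requires a neural architecture and is not shown).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
x = rng.normal(size=(n, 3))
t = rng.integers(0, 2, size=n)
# Illustrative outcome: treatment shifts the response surface by 1.5
y = x @ np.array([1.0, -0.5, 0.2]) + 1.5 * t + rng.normal(scale=0.1, size=n)

def fit_ols(features, targets):
    coef, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return coef

ones = np.ones((n, 1))

# Approach 1: two separate models, one per treatment arm ("T-learner")
coef0 = fit_ols(np.hstack([ones[t == 0], x[t == 0]]), y[t == 0])
coef1 = fit_ols(np.hstack([ones[t == 1], x[t == 1]]), y[t == 1])
x_all = np.hstack([ones, x])
ite_t_learner = x_all @ coef1 - x_all @ coef0

# Approach 2: treatment indicator as an extra feature ("S-learner")
xs = np.hstack([ones, x, t[:, None].astype(float)])
coef_s = fit_ols(xs, y)
ite_s_learner = np.full(n, coef_s[-1])  # in a linear model, the effect is the t-coefficient
```

Both approaches recover the true effect of 1.5 here because the example is linear and well-specified; with flexible learners and biased assignment, the two can behave very differently, which is one reason shared representations are attractive.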

Research focus:
Building shared representations with non-stationary Gaussian processes

Many methods have been developed that take one of the three approaches above to modeling treatments, and they also handle bias differently. Despite all of this research, however, a great deal of work remains to be done in developing a comprehensive theory (i.e. a principled guideline for building estimators of treatment effects using machine learning algorithms). Without such a theory, it is very difficult to know which types of algorithms we should be developing, or how to deal with the twin problems of modeling treatments and handling bias.

This is why, a few years ago, our lab developed the first theory for individualized treatment effect inference. To do this, we first tried to develop a theoretical understanding of the limits of this problem. Then, guided by this, we sought to identify unique principles that can guide the development of algorithms.

Research focus:
Under the hood of our comprehensive theory for individualized treatment effect inference

The theory we developed guides our model design in two ways. In the small-sample regime, we need methods that handle assignment bias effectively, and are hence able to share the training data effectively between the response surfaces. In the large-sample regime, we need models that can flexibly learn from the available data and tune their hyperparameters effectively.

We continue to push the boundaries of our understanding of different strategies for treatment effect estimation. More recently, we investigated the strengths and weaknesses of a number of so-called meta-learners (model-agnostic learning strategies) both theoretically and empirically, providing further guidance towards principled algorithm design. Our recent paper on this topic was accepted for publication at AISTATS 2021, and can be found here.

A firm theoretical foundation for individualized treatment effect inference will make it possible to carry out reliable estimation of individualized treatment effects. Such reliable estimation will have obvious implications for the treatment of patients, but it will also have less-obvious implications for clinical trials. The first is that it will enable more reliable post-hoc analysis (such as understanding which groups of patients benefit least or most from the trial treatment). The second is that it may better inform the process of sequentially recruiting patients into clinical trials, thereby enabling better design, both in terms of maximizing overall statistical power and in terms of maximizing the information learned for patients with specific covariates.

Research focus:
Estimating the effects of continuous interventions from observational data

Selecting optimal models for individualized treatment effect inference

In the sections above, we have introduced the challenges inherent in developing approaches to individualized treatment effect inference; we have explained how these challenges can be compensated for or handled, and have outlined a theory for building effective models.

This still leaves us, however, with a further challenge in implementation: a wide variety of models to choose from, and a potentially limitless array of application types and datasets. Choosing “one best model” is impossible, since no single method will ever outperform all others across all datasets, so the challenge becomes selecting the best-performing model for each particular task and dataset. This is further complicated by the fact that we lack access to counterfactuals, so we cannot compute ground-truth individualized treatment effect estimates against which to evaluate a model’s predictions. This is in contrast to predictive models, where one can use the mean squared error between the model’s predictions and the ground-truth label.
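On fully synthetic data, where both potential outcomes are known, individualized performance can be measured directly, for example with the PEHE metric sketched below (names and numbers are illustrative). It is precisely this quantity that cannot be computed on real observational data, which is what makes model selection hard.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
x = rng.normal(size=n)
true_ite = 1.0 + 0.5 * x              # known only because the data are synthetic

# Two hypothetical candidate models' effect estimates:
est_a = np.full(n, true_ite.mean())   # predicts a constant (average) effect
est_b = 1.0 + 0.45 * x                # captures heterogeneity, slightly miscalibrated

def pehe(est, truth):
    """Precision in Estimation of Heterogeneous Effects (root mean squared error)."""
    return np.sqrt(np.mean((est - truth) ** 2))
```

Model B has far lower PEHE than model A even though both match the average effect, showing why average-effect metrics are not enough for individualized inference.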

The answer to this problem is to use automated machine learning (AutoML) to compare models and select the best model for the task at hand. In experiments applying our own AutoML framework for individualized treatment effect inference (details of which are provided in the box below), we found that the best model selected by the framework tended to significantly outperform other commonly-used methods. This is shown below in a comparison of the performance of methods published at ICML, NeurIPS and ICLR conferences from 2016 to 2018 on 77 datasets.

Research focus:
Automated causal inference using influence functions

Using AutoML enables practitioners such as epidemiologists and applied statisticians to use our validation procedure to select the best model for the observational study at hand.

Moreover, it is often the case that the observational data used to train a treatment effect model come from a setting where the distribution of patient features is different from that in the deployment (target) environment, for example when transferring models across hospitals or countries. Because of this, it is important to be able to select models that are robust to these covariate shifts across disparate patient populations.

In a recent paper from our lab, we propose leveraging the invariance of causal structures across domains to introduce a novel model selection metric specifically designed for treatment effects models under the unsupervised domain adaptation setting. Experimentally, our method selects treatment effects models that are more robust to covariate shifts on several synthetic and real healthcare datasets, including on estimating the effect of ventilation in COVID-19 patients from different geographic locations.

Individualized treatment effect estimation using time-series data

While the majority of previous work focuses on the effects of interventions at a single point in time, observational data also capture information on complex time-dependent treatment scenarios, such as where the efficacy of treatments changes over time (for example, drug resistance in cancer patients), or where patients receive multiple interventions administered at different points in time (such as joint prescriptions of chemotherapy and radiotherapy).

Estimating the effects of treatments over time therefore presents unique opportunities, such as understanding how diseases evolve under different treatment plans, how individual patients respond to medication over time, and which timings may be optimal for assigning treatments, thus providing new tools to improve clinical decision support systems.

Electronic health records provide a rich source of data for machine learning methods to learn dynamic treatment responses over time. These records, collected over time as part of regular follow-ups, provide a more cost-effective method to gather insights on the effectiveness of past treatment regimens.

Estimating counterfactual patient outcomes over time is challenging due to the presence of time-dependent confounders in observational datasets. Time-dependent confounders are patient covariates that affect the treatment assignments and are themselves affected by past treatments.

For instance, imagine a patient is given treatment A when a certain covariate (say, white blood cell count) has been outside the normal range for a while. Now, also imagine that the white blood cell count was itself affected by the past administration of a different treatment, treatment B. If this patient is more likely to die, then without adjusting for the time-dependent confounding (i.e. the changes in the white blood cell count over time), we could incorrectly conclude that treatment A is harmful to patients.

To make this even more challenging, estimating the effect of a different sequence of treatments on the patient would require not only adjusting for the bias at the current step (in treatment A), but also for the bias introduced by the previous application of treatment B.

Using standard supervised learning methods to estimate these treatment effects will be biased by the treatment assignment policy present in the observational dataset and will not be able to generalize well to changes in the treatment policy in order to generate counterfactuals.
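A toy simulation of the white-blood-cell example above illustrates the danger (all quantities are invented): treatment A has no effect on the outcome at all, yet a naive comparison suggests it is harmful.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000

b = rng.integers(0, 2, size=n)              # earlier treatment B
wbc = 1.5 * b + rng.normal(size=n)          # white blood cell count, raised by B
high = wbc > 1.0                            # "abnormal" count
a = rng.binomial(1, 0.2 + 0.6 * high)       # treatment A far more likely when abnormal
death_risk = 0.1 + 0.15 * high              # risk driven by the covariate, NOT by A
died = rng.binomial(1, death_risk)

# Naive comparison: patients who received A appear to die more often...
naive_diff = died[a == 1].mean() - died[a == 0].mean()

# ...but within strata of the confounder, A shows no effect at all.
adj = [died[(high == h) & (a == 1)].mean() - died[(high == h) & (a == 0)].mean()
       for h in (False, True)]
```

Note that in real longitudinal data the confounder is itself affected by earlier treatment (here, `wbc` depends on B), so simple one-step stratification like this does not suffice; that is precisely what dedicated time-series treatment effect methods are designed to handle.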

Research focus:
Two approaches to handling time-dependent confounders

The ability to accurately estimate treatment effects over time using machine learning allows clinicians to determine, in a manner tailored to each individual patient, both the treatments to prescribe and the optimal time at which to administer them, given their observational history.

Both new methods and new theory are needed to harness the full potential of observational data for learning the individualized effects of complex treatment scenarios. Further work in this direction is needed to develop alternative methods for handling time-dependent confounders, to model combinations of treatments assigned over time, and to estimate the individualized effects of time-dependent treatments with associated dosages.

ML-assisted clinical trials

Understanding treatment effects can play an important role in the post-hoc analysis of clinical trials into interventions and treatments, as well as influencing the design of more effective clinical trials.

The implementation of clinical trials is a setting in which the relevant population is diverse, and different parts of the population display different reactions to treatment. In such settings, heterogeneous treatment effect (HTE) analysis, also called subgroup analysis, is used to find subgroups consisting of subjects who have similar covariates and display similar treatment responses. The identification of subgroups improves the interpretation of treatment effects across the entire population, and makes it possible to develop more effective interventions and treatments and to improve the design of further experiments. In clinical trials, HTE analysis can identify subgroups of the population for which the studied treatment is effective, even when it is found to be ineffective for the population in general.

Clinical example:
Machine learning for clinical trials in the era of COVID-19

Research focus:
Robust Recursive Partitioning (R2P): a method to support adaptive clinical trial design

We have created an additional page on next-generation clinical trials, a key research pillar for our lab; have a look here if you’d like to learn more.

Pharmacology

There are close links between our work on treatment effect estimation and pharmacology. In addition, our belief at the van der Schaar Lab is that the integration of machine learning with pharmacology will unlock new frontiers in personalized medicine and clinical trials.

Click here to learn more.

Learn more and get involved

This page has served as an introduction to individualized treatment effect inference—from the perspective of both healthcare and machine learning.

We have demonstrated the importance of estimating individualized treatment effects in enabling “bespoke medicine” and truly moving beyond one-size-fits-all approaches. In particular, there is great potential to influence and improve the design of clinical trials, and to make effective use of observational data even in the absence of clinical trials. There are further applications to explore, such as modeling individualized treatment effects for organ transplants (as most recently highlighted in a paper accepted for presentation at NeurIPS 2020).

We have also outlined the numerous intricacies and challenges that have complicated the development of machine learning methods and techniques for individualized treatment effect inference, not only due to the lack of counterfactuals, but also due to the lack of a governing theory, the ubiquity of bias in observational data, the choice between several options for modeling treatments, and the difficulty of adapting from static to dynamic datasets. We have also summarized our own lab’s projects seeking to address these challenges.

If you would like to learn more about this topic, we would recommend reading a somewhat more detailed (but still accessible) overview of our work on individualized treatment effect inference, entitled “From Real‐World Patient Data to Individualized Treatment Effects Using Machine Learning: Current and Future Methods to Address Underlying Challenges” (published in Clinical Pharmacology & Therapeutics in 2020).

We have also created a video tutorial series on individualized treatment effect inference, which we will continue to update over time.

We would also encourage you to stay abreast of ongoing developments in this and other areas of machine learning for healthcare by signing up to take part in one of our two streams of online engagement sessions.

If you are a practicing clinician, please sign up for Revolutionizing Healthcare, which is a forum for members of the clinical community to share ideas and discuss topics that will define the future of machine learning in healthcare (no machine learning experience required).

If you are a machine learning student, you can join our Inspiration Exchange engagement sessions, in which we introduce and discuss new ideas and development of new methods, approaches, and techniques in machine learning for healthcare.

Our work so far

Individualized treatment effects have been an area of significant focus for our lab’s researchers since 2016. While the above represents a comprehensive introduction and overview of individualized treatment effect inference, below we share our most recent papers, representing the unique breadth that distinguishes the van der Schaar Lab. We are continuously breaking new ground by pushing the boundaries of ITE estimation through machine learning, in both theory and practical application.

We have organized our publications around the following topics: (1) estimating heterogeneous treatment effects, (2) ITE estimation using time-series data, (3) extending the applicability of treatment effect estimation, (4) real-world impact.

Estimating heterogeneous treatment effects

Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory to Learning Algorithms

We continue to push the boundaries of our understanding of different strategies for treatment effect estimation. More recently, we therefore investigated the strengths and weaknesses of a number of so-called meta-learners (model-agnostic strategies) that have been proposed in recent years. Such learners decompose the treatment effect estimation problem into separate sub-problems, each solvable using standard supervised learning methods. Choosing between different meta-learners in a data-driven manner is difficult, as it requires access to counterfactual information. Therefore, with the ultimate goal of building better understanding of the conditions under which some learners can be expected to perform better than others a priori, we theoretically analyze four broad meta-learning strategies which rely on plug-in estimation and pseudo-outcome regression. We highlight how this theoretical reasoning can be used to guide principled algorithm design and translate our analyses into practice by considering a variety of neural network architectures as base-learners for the discussed meta-learner strategies. In a simulation study, we showcase the relative strengths of the learners under different data-generating processes.
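One concrete instance of the pseudo-outcome regression strategy analyzed in this line of work is the classical doubly robust (“DR”) pseudo-outcome. The sketch below (illustrative, not the paper’s exact algorithms) builds it from plug-in outcome regressions and known propensities, then regresses it on the covariates to estimate the CATE function; in practice the propensities would also be estimated and the regressions would be flexible learners.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000
x = rng.normal(size=n)
e = np.clip(1 / (1 + np.exp(-x)), 0.05, 0.95)  # propensity (known here; estimated in practice)
t = rng.binomial(1, e)
tau = 1.0 + 0.5 * x                            # true heterogeneous effect
y = 2.0 * x + tau * t + rng.normal(scale=0.3, size=n)

def ols(features, targets):
    coef, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return coef

X = np.column_stack([np.ones(n), x])

# Stage 1 (plug-in): fit outcome regressions per treatment arm
mu0 = X @ ols(X[t == 0], y[t == 0])
mu1 = X @ ols(X[t == 1], y[t == 1])

# Stage 2: doubly robust pseudo-outcome, then regress it on the covariates
pseudo = (mu1 - mu0
          + t * (y - mu1) / e
          - (1 - t) * (y - mu0) / (1 - e))
coef_tau = ols(X, pseudo)  # estimates the CATE function tau(x) = 1.0 + 0.5 x
```

The pseudo-outcome has the true CATE as its conditional mean, which is what lets a standard supervised regression in the second stage recover the effect function.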

Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory to Learning Algorithms

Alicia Curth, Mihaela van der Schaar

AISTATS 2021

Abstract

On Inductive Biases for Heterogeneous Treatment Effect Estimation

As an alternative to meta-learner strategies, which separate the treatment effect estimation task into separate estimation stages, we also considered end-to-end learning solutions.  We investigate how to exploit structural similarities of an individual’s potential outcomes (POs) under different treatments to obtain better estimates of conditional average treatment effects in finite samples. Especially when it is unknown whether a treatment has an effect at all, it is natural to hypothesize that the POs are similar – yet, some existing strategies for treatment effect estimation employ regularization schemes that implicitly encourage heterogeneity even when it does not exist and fail to fully make use of shared structure. In this paper, we investigate and compare three end-to-end learning strategies to overcome this problem – based on regularization, reparametrization and a flexible multi-task architecture – each encoding inductive bias favoring shared behavior across POs. To build understanding of their relative strengths, we implement all strategies using neural networks and conduct a wide range of semi-synthetic experiments. We observe that all three approaches can lead to substantial improvements upon numerous baselines — including meta-learner strategies — and gain insight into performance differences across various experimental settings.

On Inductive Biases for Heterogeneous Treatment Effect Estimation

Alicia Curth, Mihaela van der Schaar

NeurIPS 2021

Abstract

Really Doing Great at Estimating CATE? A Critical Look at ML Benchmarking Practices in Treatment Effect Estimation

As we have highlighted throughout this pillar,  the machine learning (ML) toolbox for estimation of heterogeneous treatment effects from observational data is expanding rapidly — yet many of its algorithms have been evaluated only on a very limited set of semi-synthetic benchmark datasets. In this paper, we investigate current benchmarking practices for ML-based conditional average treatment effect (CATE) estimators, with special focus on empirical evaluation based on the popular semi-synthetic IHDP benchmark. We identify problems with current practice and highlight that semi-synthetic benchmark datasets, which (unlike real-world benchmarks used elsewhere in ML) do not necessarily reflect properties of real data, can systematically favor some algorithms over others – a fact that is rarely acknowledged but of immense relevance for interpretation of empirical results. Further, we argue that current evaluation metrics evaluate performance only for a small subset of possible use cases of CATE estimators, and discuss alternative metrics relevant for applications in personalized medicine. Additionally, we discuss alternatives for current benchmark datasets, and implications of our findings for benchmarking in CATE estimation.

Really Doing Great at Estimating CATE? A Critical Look at ML Benchmarking Practices in Treatment Effect Estimation

Alicia Curth, David Svensson, Jim Weatherall, Mihaela van der Schaar

NeurIPS 2021

Abstract

Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability

Recent developments in the machine learning (ML) literature on heterogeneous treatment effect estimation — many of which we have discussed throughout this pillar — have given rise to many sophisticated, but opaque, tools: due to their flexibility, modularity and ability to learn constrained representations, neural networks in particular have become central to this literature. Unfortunately, the assets of such black boxes come at a cost: models typically involve countless nontrivial operations, making it difficult to understand what they have learned. Yet, understanding these models can be crucial – in a medical context, for example, discovered knowledge on treatment effect heterogeneity could inform treatment prescription in clinical practice. In this work, we therefore use post-hoc feature importance methods to identify features that influence the model’s predictions. This allows us to evaluate treatment effect estimators along a new and important dimension that has been overlooked in previous work: We construct a benchmarking environment to empirically investigate the ability of personalized treatment effect models to identify predictive covariates – covariates that determine differential responses to treatment. Our benchmarking environment then enables us to provide new insight into the strengths and weaknesses of different types of treatment effects models as we modulate different challenges specific to treatment effect estimation – e.g. the ratio of prognostic to predictive information, the possible nonlinearity of potential outcomes and the presence and type of confounding.   

Jonathan Crabbé, Alicia Curth, Ioana Bica, Mihaela van der Schaar

NeurIPS 2022

To Impute or not to Impute? Missing Data in Treatment Effect Estimation

In real-world scenarios, missing data poses a significant challenge – injecting noise and bias into estimates of treatment effects. This complexity makes accurately estimating treatment effects from incomplete data a daunting task. The challenge stems from the fact that traditional assumptions about missing data don’t fully account for the presence of treatment variables alongside inputs (e.g., individuals) and labels (e.g., outcomes).

In our research, we identify a novel missingness mechanism, which we call mixed confounded missingness (MCM): some missingness determines treatment selection, while other missingness is itself determined by the treatment selected. We show that naively imputing all missing values leads to poorly performing treatment effect models, because imputation effectively erases information needed for unbiased estimates. Conversely, performing no imputation at all also biases the estimates, because treatment-induced missingness distorts the covariates.

Our proposed solution is selective imputation, where we leverage insights from MCM to determine which variables should be imputed and which should not. Through empirical analysis, we demonstrate the superiority of selective imputation over other methods for handling missing data across various learning algorithms in both average and conditional average treatment effect scenarios.
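As an illustration of the idea (not the paper's exact algorithm; all names and values below are ours), selective imputation fills in only those covariates whose missingness is unrelated to treatment, while preserving treatment-related missingness as an explicit signal:

```python
import numpy as np

def selectively_impute(X, impute_cols, indicator_cols):
    """Illustrative selective imputation: mean-impute covariates whose
    missingness is unrelated to treatment; for treatment-related
    missingness, append an explicit indicator column instead of erasing
    the missingness pattern."""
    X = np.array(X, dtype=float)
    indicators = []
    for c in impute_cols:
        col = X[:, c]
        col[np.isnan(col)] = np.nanmean(col)   # safe to impute
    for c in indicator_cols:
        col = X[:, c]
        indicators.append(np.isnan(col).astype(float))
        col[np.isnan(col)] = 0.0  # placeholder; the indicator carries the signal
    return np.column_stack([X] + indicators) if indicators else X

# Toy data: missing age is benign, missing biomarker is treatment-related
X = [[50.0, 1.2], [np.nan, 3.4], [61.0, np.nan]]
Xc = selectively_impute(X, impute_cols=[0], indicator_cols=[1])
```

The resulting matrix has no NaNs, but the treatment-related missingness survives as a feature that a downstream treatment effect model can condition on.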

Jeroen Berrevoets, Fergus Imrie, Trent Kyono, James Jordon, Mihaela van der Schaar

ICLR 2023

Transfer Learning on Heterogeneous Feature Spaces for Treatment Effects Estimation

Improving the estimation of conditional average treatment effects (CATE) for one population by borrowing data from a different population with distinct features is a challenging transfer learning problem. It arises when we want to assess a treatment’s effectiveness for a new patient population with different clinical characteristics and only limited data.

Our approach involves several key components:

  1. Representation learning: This helps handle the differences in feature spaces between the source and target domains.
  2. Multi-task architecture: We employ a flexible structure with shared and private layers to transfer information between potential outcome functions across domains.

By combining these building blocks, we develop transfer learning versions of standard CATE learners.

Our results demonstrate not only performance improvements of our causal effect learners across datasets but also offer insights into their differences from a transfer perspective. This work contributes to advancing methods for leveraging related information from different domains to enhance treatment effect estimation in real-world applications.
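The two building blocks can be sketched as a single forward pass. This toy sketch is ours and only illustrates the weight-sharing pattern (private encoders over heterogeneous feature spaces, a shared trunk, and private potential-outcome heads), not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

# Dimensions and names are ours: source and target feature spaces differ,
# so each domain gets a private input layer mapping into a common
# representation; the trunk is shared so information transfers across domains.
d_src, d_tgt, d_rep, d_hid = 5, 8, 4, 16
W_in = {"source": rng.normal(size=(d_src, d_rep)),   # private encoders
        "target": rng.normal(size=(d_tgt, d_rep))}
W_shared = rng.normal(size=(d_rep, d_hid))           # shared trunk
W_head = {(dom, t): rng.normal(size=(d_hid, 1))      # private outcome heads
          for dom in ("source", "target") for t in (0, 1)}

def predict(x, domain, treatment):
    rep = relu(x @ W_in[domain])            # handle heterogeneous features
    h = relu(rep @ W_shared)                # representation shared across domains
    return h @ W_head[(domain, treatment)]  # domain/treatment-specific outcome

x_tgt = rng.normal(size=(3, d_tgt))
cate = predict(x_tgt, "target", 1) - predict(x_tgt, "target", 0)
```

In practice all weights would be trained jointly on both domains; the shared trunk is what lets the data-poor target population benefit from the data-rich source.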

Ioana Bica, Mihaela van der Schaar

NeurIPS 2022

Identifiable Energy-based Representations: An Application to Estimating Heterogeneous Causal Effects

Yao Zhang, Jeroen Berrevoets, Mihaela van der Schaar

AISTATS 2022

ITE estimation using time-series data

Continuous-Time Modeling of Counterfactual Outcomes Using Neural Controlled Differential Equations

Estimating counterfactual outcomes over time has the potential to unlock personalized healthcare by assisting decision-makers to answer “what-if” questions. Existing causal inference approaches typically consider regular, discrete-time intervals between observations and treatment decisions and hence are unable to naturally model irregularly sampled data, which is common in practice. To handle arbitrary observation patterns, we interpret the data as samples from an underlying continuous-time process and model its latent trajectory explicitly using neural controlled differential equations. This leads to a new approach, the Treatment Effect Neural Controlled Differential Equation (TE-CDE), that allows the potential outcomes to be evaluated at any time point.

Nabeel Seedat, Fergus Imrie, Alexis Bellot, Zhaozhi Qian, Mihaela van der Schaar

ICML 2022

Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time

While TE-CDE extended the capabilities of ITE estimation over time to irregularly sampled data, an important hurdle often overlooked in ML research is the impact of informative sampling in observational data.

Informative sampling means that when instances are observed irregularly over time, the timing of observations isn’t random. Instead, it depends on factors such as the patient’s characteristics, past outcomes, and treatments received. This introduces a problem known as covariate shift, which can hinder accurate estimation of treatment outcomes if not properly addressed.

To tackle this, we introduce a general framework that treats informative sampling as a covariate shift problem. We propose a new method called TESAR-CDE, which implements this framework using Neural CDEs. Through simulations based on a clinical scenario, we demonstrate the effectiveness of our approach in learning treatment outcomes under informative sampling conditions.
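One classical way to cast informative sampling as a reweighting problem is inverse-intensity weighting, which down-weights frequently observed patients in the training loss. The toy numpy sketch below is illustrative only (it is not TESAR-CDE itself, and the function name is ours):

```python
import numpy as np

def inverse_intensity_weights(obs_intensity, clip=10.0):
    """Weight each observation by the inverse of its estimated observation
    intensity, so that frequently sampled (e.g. sicker) patients do not
    dominate the training loss; clipping stabilises extreme weights."""
    w = 1.0 / np.clip(np.asarray(obs_intensity, dtype=float), 1e-8, None)
    w = np.minimum(w, clip)
    return w / w.mean()          # normalise to mean 1

# Toy intensities: the first four patients are observed 4x as often
# (e.g. because they are sicker); their samples are down-weighted.
intensity = np.array([4.0, 4.0, 4.0, 4.0, 1.0])
w = inverse_intensity_weights(intensity)
```

In a full method the intensity itself must be estimated from the data, conditional on history, which is where the learned model comes in.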

Toon Vanderschueren, Alicia Curth, Wouter Verbeke, Mihaela van der Schaar

ICML 2023

ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference

The machine learning community has focused on finding unbiased ways to infer treatment effects. Most solutions rely on neural networks to correct assignment bias, even in complex scenarios like high-dimensional or longitudinal settings.

In our study, we introduce a different approach specifically for longitudinal settings: closed-form ordinary differential equations (ODEs). Unlike neural networks, ODEs offer advantages like interpretability, handling irregular sampling, and different identification assumptions. Our main contribution is offering a new type of solution, potentially inspiring fresh innovations in treatment effects research.

We present our approach as a framework that can turn any ODE discovery method into a treatment effects method, opening doors for diverse approaches in this field.
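To make the idea concrete, here is a generic SINDy-style sparse-regression sketch of ODE discovery applied to a simulated treatment response. It is one of many discovery methods such a framework could wrap, not the paper's own algorithm, and the simulated system is invented:

```python
import numpy as np

# Recover a closed-form treatment-response ODE, dy/dt = -0.5*y + 1.0*u,
# from a trajectory with treatment input u, by sparse regression over a
# library of candidate terms.
t = np.linspace(0.0, 5.0, 500)
dt = t[1] - t[0]
u = (t > 2.0).astype(float)              # treatment switched on at t = 2
y = np.zeros_like(t)
y[0] = 2.0
for k in range(1, len(t)):               # Euler simulation of the true ODE
    y[k] = y[k - 1] + dt * (-0.5 * y[k - 1] + 1.0 * u[k - 1])

dydt = np.gradient(y, dt)                           # numerical derivative
library = np.column_stack([y, u, y * u, np.ones_like(y)])  # candidate terms
coef, *_ = np.linalg.lstsq(library, dydt, rcond=None)
coef[np.abs(coef) < 0.05] = 0.0          # sparsify: keep dominant terms only
# coef ~ [-0.5, 1.0, 0, 0]: an interpretable, closed-form expression
```

Unlike a neural network, the recovered equation can be read, checked against domain knowledge, and evaluated at arbitrary (irregular) time points.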

Krzysztof Kacprzyk, Samuel Holt, Jeroen Berrevoets, Zhaozhi Qian, Mihaela van der Schaar

ICLR 2024

SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes

Zhaozhi Qian, Yao Zhang, Ioana Bica, Angela Wood, Mihaela van der Schaar

NeurIPS 2021

Extending the applicability of treatment effect estimation

SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data

In this work, we study the problem of inferring heterogeneous treatment effects from time-to-event outcome data. While both the related problems of (i) estimating treatment effects for binary or continuous outcomes and (ii) predicting survival outcomes have been well studied in the recent machine learning literature, their combination – albeit of high practical relevance – has received considerably less attention. With the ultimate goal of reliably estimating the effects of treatments on instantaneous risk and survival probabilities, we focus on the problem of learning (discrete-time) treatment-specific conditional hazard functions. We find that unique challenges arise in this context due to a variety of covariate shift issues that go beyond a mere combination of well-studied confounding and censoring biases. We theoretically analyse their effects by adapting recent generalisation bounds from domain adaptation and treatment effect estimation to our setting and discuss implications for model design. We use the resulting insights to propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations. We investigate performance across a range of experimental settings and empirically confirm that our method outperforms baselines by addressing covariate shifts from various sources.
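Once treatment-specific discrete-time hazards have been estimated, survival probabilities and treatment effects on them follow mechanically. A toy sketch (the hazard values are invented for illustration):

```python
import numpy as np

def survival_from_hazard(hazard):
    """Discrete-time survival from a conditional hazard sequence:
    S(t) = prod_{s <= t} (1 - h(s))."""
    return np.cumprod(1.0 - np.asarray(hazard, dtype=float))

# Invented treatment-specific hazards for one covariate profile x
h_treated = np.array([0.05, 0.05, 0.10])
h_control = np.array([0.10, 0.15, 0.20])
S1 = survival_from_hazard(h_treated)
S0 = survival_from_hazard(h_control)
effect_on_survival = S1 - S0   # effect on survival probability per horizon
```

The hard part, which the paper addresses, is estimating those hazards without bias under confounding- and censoring-induced covariate shift.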

Alicia Curth, Changhee Lee, Mihaela van der Schaar

NeurIPS 2021

Understanding the Impact of Competing Events on Heterogeneous Treatment Effect Estimation from Time-to-Event Data

Estimating heterogeneous treatment effects (HTEs) from time-to-event data in the presence of competing events is an often-overlooked problem in research, despite its significant practical implications.

Our methodology revolves around outcome modelling to estimate HTEs, exploring the feasibility of utilising existing prediction models for time-to-event data as plug-in estimators for potential outcomes. We delve into the impact of competing events on HTE estimation, recognising them as a potential additional challenge alongside the standard confounding issue.

We identify multiple definitions of causal effects in this context—total, direct, and separable effects—each carrying implications for covariate shift depending on the interpretation of treatment effects and associated estimands. Through theoretical analysis and empirical illustration, we dissect when and how these challenges manifest, especially when employing generic machine learning prediction models for HTE estimation.
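To see why competing events complicate matters, consider the discrete-time cause-specific cumulative incidence, in which overall survival depends on both the main and the competing hazard. This toy sketch (with invented numbers; the function name is ours) shows how a "total" effect mixes the two pathways:

```python
import numpy as np

def cumulative_incidence(h_main, h_comp):
    """Discrete-time cause-specific cumulative incidence of the main event:
    F(t) = sum_{s <= t} h_main(s) * S(s-1), where overall survival S
    accounts for BOTH the main and the competing event."""
    h_main = np.asarray(h_main, dtype=float)
    h_comp = np.asarray(h_comp, dtype=float)
    S = np.cumprod((1.0 - h_main) * (1.0 - h_comp))
    S_prev = np.concatenate([[1.0], S[:-1]])     # survival just before s
    return np.cumsum(h_main * S_prev)

# Invented hazards: treatment halves the main-event hazard but doubles the
# competing hazard, so the total effect on incidence mixes both pathways.
F1 = cumulative_incidence([0.05, 0.05], [0.20, 0.20])  # treated
F0 = cumulative_incidence([0.10, 0.10], [0.10, 0.10])  # control
total_effect = F1 - F0
```

Direct and separable effects instead isolate the pathway through the main event, which changes both the estimand and the covariate shifts an estimator must handle.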

Alicia Curth, Mihaela van der Schaar

AISTATS 2023

Estimating Multi-cause Treatment Effects via Single-cause Perturbation

Zhaozhi Qian, Alicia Curth, Mihaela van der Schaar

NeurIPS 2021

Policy Analysis using Synthetic Controls in Continuous-Time

Alexis Bellot, Mihaela van der Schaar

PMLR 2021

Adaptive Identification of Populations with Treatment Benefit in Clinical Trials: Machine Learning Challenges and Solutions

This work takes a fresh look at adaptive clinical trials, focusing on the dynamic identification of patient subpopulations that respond positively to a specific treatment. While this problem has been extensively explored in biostatistics, existing methodologies have been somewhat constrained in their adaptivity. Our objective is to broaden the scope of these trial designs by incorporating insights from recent advancements in machine learning for adaptive and online experimentation.

One key distinction of the subpopulation selection problem is that we seek to identify groups that benefit from treatment, rather than just pinpointing the single subgroup with the largest effect, all within a limited budget. These nuances present unique challenges and criteria when crafting algorithmic solutions.

Drawing from these observations, we introduce AdaGGI and AdaGCPI, two meta-algorithms tailored for subpopulation construction. Through empirical analysis across various simulation scenarios, we evaluate their performance and gain valuable insights into their strengths and limitations across different settings. This exploration paves the way for more flexible and efficient adaptive clinical trials, ultimately enhancing patient care and treatment efficacy.

Alicia Curth, Alihan Hüyük, Mihaela van der Schaar

ICML 2023

A Neural Framework for Generalized Causal Sensitivity Analysis

A core assumption in treatment effect estimation is that there are no unobserved confounders. However, this assumption often fails to hold in real-world situations. To address this, we propose an innovative approach for estimating causal effects in the presence of unobserved confounding. Our neural framework, NeuralCSA, addresses previous limitations in causal sensitivity analysis by offering compatibility with various sensitivity models, treatment types, and causal queries. This generality is achieved by learning a latent distribution shift corresponding to a treatment intervention using two conditional normalising flows, and the approach is supported by theoretical guarantees. Advances such as NeuralCSA help extend the applicability of treatment effect estimation approaches to real-world scenarios.

Dennis Frauen, Fergus Imrie, Alicia Curth, Valentyn Melnychuk, Stefan Feuerriegel, Mihaela van der Schaar

ICLR 2024

Defining Expertise: Applications to Treatment Effect Estimation

Decision-makers, like doctors, use their expertise to guide actions, such as prescribing treatments based on predictions of outcomes. This expertise is valuable because treatments frequently prescribed by experts are likely more effective. However, in machine learning, we often overlook this expertise and don’t use it as a guiding principle, especially in treatment effect estimation research, where assumptions are typically limited to overlap.

In our study, we argue that decision-makers’ expertise can inform the design and selection of treatment effect estimation methods. We identify two types of expertise – predictive and prognostic – and show empirically that:

  1. The dominant type of expertise in a domain significantly impacts the performance of different estimation methods.
  2. We can predict the type of expertise in a dataset, providing a quantitative basis for selecting the most suitable models.

By recognising and leveraging decision-makers’ expertise, we can enhance the effectiveness of treatment effect estimation methods.

Alihan Hüyük, Qiyao Wei, Alicia Curth, Mihaela van der Schaar

ICLR 2024

Real-world impact

Causal machine learning for predicting treatment outcomes

Causal machine learning (ML) provides adaptable, data-centric techniques for forecasting treatment outcomes like effectiveness and side effects, which aids in evaluating drug safety. A significant advantage of causal ML is its capability to estimate personalised treatment effects, enabling tailored clinical decision-making based on individual patient characteristics. This method can leverage both clinical trial data and real-world data from sources like clinical registries and electronic health records. However, it’s crucial to proceed with caution to avoid biased or inaccurate predictions.

In this Perspective, we highlight the advantages of causal ML over traditional statistical or ML methods and delineate its essential components and processes. Finally, we offer recommendations for ensuring the dependable application of causal ML and its successful integration into clinical practice.

Stefan Feuerriegel, Dennis Frauen, Valentyn Melnychuk, Jonas Schweisthal, Konstantin Hess, Alicia Curth, Stefan Bauer, Niki Kilbertus, Isaac S. Kohane, Mihaela van der Schaar

Nature Medicine 2024

Machine learning for clinical trials in the era of COVID-19

The COVID-19 pandemic presented enormous challenges to clinical trials in particular, given the need for expedited development, approval, and distribution.

In a 2020 paper co-authored with some of our collaborators, published in Statistics in Biopharmaceutical Research, we identified ways in which machine learning can respond to the challenges inherent in clinical trials of COVID-19 treatments and vaccines.

We identified three key areas for support: ongoing clinical trials for non-COVID-19 drugs; clinical trials for repurposing drugs to treat COVID-19; and clinical trials for new drugs to treat COVID-19. Many of the research projects outlined above feature in the paper.

William R. Zame, Ioana Bica, Cong Shen, Alicia Curth, Hyun-Suk Lee, Stuart Bailey, James Weatherall, David Wright, Frank Bretz, Mihaela van der Schaar

Statistics in Biopharmaceutical Research 2020

Further resources and papers cited on this page, in order of first appearance

A full list of our papers on causal inference, individualized treatment effect inference, and related topics, can be found here.