The van der Schaar Lab’s seventh Inspiration Exchange engagement session took place virtually on March 30, 2021.
The session featured four presentations (given by current and former lab members) on a range of application-oriented projects in machine learning for healthcare. Topics ranged from treatment effect estimation to multi-omics data integration, organ transplantation, and clinical trials.
A Q&A and open discussion took place in the latter half of the session, with participants asking the researchers about their projects and sharing their own thoughts.
Introduction – 0:00
Welcome from Mihaela – 2:17
Presentation 1 [Nonparametric estimation of heterogeneous treatment effects // Alicia Curth] – 4:48
Presentation 2 [A variational information bottleneck approach to multi-omics data integration // Changhee Lee] – 13:26
Presentation 3 [Learning matching representations for individualized organ transplantation allocation // Ahmed Alaa] – 20:40
Presentation 4 [SDF-Bayes: cautious optimism in safe dose-finding clinical trials with drug combinations and heterogeneous patient groups // Cong Shen] – 24:32
Q&A session – 36:50
Closing words from Mihaela – 53:27
Intro to next sessions – 54:23
Titles, authors and abstracts for all projects featured in this session are given below.
Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory to Learning Algorithms
The need to evaluate treatment effectiveness is ubiquitous in most of empirical science, and interest in flexibly investigating effect heterogeneity is growing rapidly. To do so, a multitude of model-agnostic, nonparametric meta-learners have been proposed in recent years. Such learners decompose the treatment effect estimation problem into separate sub-problems, each solvable using standard supervised learning methods. Choosing between different meta-learners in a data-driven manner is difficult, as it requires access to counterfactual information.
Therefore, with the ultimate goal of building better understanding of the conditions under which some learners can be expected to perform better than others a priori, we theoretically analyze four broad meta-learning strategies which rely on plug-in estimation and pseudo-outcome regression. We highlight how this theoretical reasoning can be used to guide principled algorithm design and translate our analyses into practice by considering a variety of neural network architectures as base-learners for the discussed meta-learning strategies.
In a simulation study, we showcase the relative strengths of the learners under different data-generating processes.
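To make the two families of strategies mentioned in the abstract more concrete, here is a minimal sketch of a plug-in learner and a pseudo-outcome regression learner. It is an illustration under stated assumptions, not the authors' implementation: the base learner is ordinary least squares rather than a neural network, the propensity score is assumed known (a randomized study with probability 0.5), no cross-fitting is used, and the names `fit_linear`, `t_learner`, `dr_learner`, and `simulate` are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns a prediction function."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def t_learner(X, w, y):
    """Plug-in strategy: fit one outcome model per arm and take the difference."""
    mu0 = fit_linear(X[w == 0], y[w == 0])
    mu1 = fit_linear(X[w == 1], y[w == 1])
    return lambda Z: mu1(Z) - mu0(Z)

def dr_learner(X, w, y, propensity):
    """Pseudo-outcome strategy: regress a doubly robust pseudo-outcome on X."""
    mu0 = fit_linear(X[w == 0], y[w == 0])
    mu1 = fit_linear(X[w == 1], y[w == 1])
    m0, m1 = mu0(X), mu1(X)
    # Plug-in difference plus inverse-propensity-weighted residual corrections.
    pseudo = (m1 - m0
              + w * (y - m1) / propensity
              - (1 - w) * (y - m0) / (1 - propensity))
    return fit_linear(X, pseudo)

def simulate(n=4000, seed=0):
    """Toy randomized data with a treatment effect that is linear in X[:, 0]."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    w = rng.binomial(1, 0.5, size=n)
    tau = 1.0 + 2.0 * X[:, 0]  # true effect, known only because we simulate
    y = X[:, 1] + w * tau + rng.normal(scale=0.5, size=n)
    return X, w, y, tau

X, w, y, tau = simulate()
tau_t = t_learner(X, w, y)(X)
tau_dr = dr_learner(X, w, y, propensity=0.5)(X)
rmse_t = float(np.sqrt(np.mean((tau_t - tau) ** 2)))
rmse_dr = float(np.sqrt(np.mean((tau_dr - tau) ** 2)))
print(f"T-learner RMSE: {rmse_t:.3f}, DR-learner RMSE: {rmse_dr:.3f}")
```

The error against the true effect can only be computed here because the data are simulated. On real data the counterfactual outcome is never observed, which is why the abstract notes that choosing between learners in a data-driven manner is difficult.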
A Variational Information Bottleneck Approach to Multi-Omics Data Integration
Integration of data from multiple omics techniques is becoming increasingly important in biomedical research. Due to non-uniformity and technical limitations in omics platforms, such integrative analyses on multiple omics, which we refer to as views, involve learning from incomplete observations with various view-missing patterns. This is challenging because i) complex interactions within and across observed views need to be properly addressed for optimal predictive power and ii) observations with various view-missing patterns need to be flexibly integrated.
To address such challenges, we propose a deep variational information bottleneck (IB) approach for incomplete multi-view observations. Our method applies the IB framework on marginal and joint representations of the observed views to focus on intra-view and inter-view interactions that are relevant for the target. Most importantly, by modeling the joint representations as a product of marginal representations, we can efficiently learn from observed views with various view-missing patterns.
Experiments on real-world datasets show that our method consistently achieves gain from data integration and outperforms state-of-the-art benchmarks.
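The idea of forming a joint representation as a product of marginal representations can be sketched as follows. This is a simplified illustration, not the paper's model: it assumes each view produces a diagonal Gaussian over the latent space, adds a standard-normal prior as an extra factor, and uses a hypothetical helper `product_of_gaussians`. In the actual method the per-view means and variances would come from learned encoders.

```python
import numpy as np

def product_of_gaussians(means, variances, observed):
    """Combine per-view diagonal Gaussians into one joint Gaussian.

    means, variances: arrays of shape (n_views, latent_dim)
    observed: boolean array of shape (n_views,), False for a missing view
    A standard-normal prior factor is always included (an assumption of this
    sketch), so the result is defined even when every view is missing.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    latent_dim = means.shape[1]
    precision = np.ones(latent_dim)        # prior N(0, I) has precision 1
    weighted_mean = np.zeros(latent_dim)   # prior mean is 0
    for m, v, seen in zip(means, variances, observed):
        if seen:  # missing views simply drop out of the product
            precision += 1.0 / v
            weighted_mean += m / v
    joint_var = 1.0 / precision
    joint_mean = joint_var * weighted_mean
    return joint_mean, joint_var

means = [[2.0, 0.0], [4.0, 0.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
both_mean, both_var = product_of_gaussians(means, variances, [True, True])
first_mean, first_var = product_of_gaussians(means, variances, [True, False])
print(both_mean, both_var)    # joint over both views
print(first_mean, first_var)  # joint when the second view is missing
```

Because a product of Gaussians only sums precisions and precision-weighted means, any subset of observed views yields a valid joint distribution without a separate model for each view-missing pattern.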
Learning Matching Representations for Individualized Organ Transplantation Allocation
Organ transplantation is often the last resort for treating end-stage illness, but the probability of a successful transplantation depends greatly on compatibility between donors and recipients. Current medical practice relies on coarse rules for donor-recipient matching, but is short of domain knowledge regarding the complex factors underlying organ compatibility.
In this paper, we formulate the problem of learning data-driven rules for organ matching using observational data for organ allocations and transplant outcomes. This problem departs from the standard supervised learning setup in that it involves matching the two feature spaces (i.e., donors and recipients), and requires estimating transplant outcomes under counterfactual matches not observed in the data. To address these problems, we propose a model based on representation learning to predict donor-recipient compatibility; our model learns representations that cluster donor features, and applies donor-invariant transformations to recipient features to predict outcomes for a given donor-recipient feature instance.
Experiments on semi-synthetic and real-world datasets show that our model outperforms state-of-the-art allocation methods and policies executed by human experts.
SDF-Bayes: Cautious Optimism in Safe Dose-Finding Clinical Trials with Drug Combinations and Heterogeneous Patient Groups
Hyun-Suk Lee, Cong Shen, William Zame, Jang-Won Lee, Mihaela van der Schaar
Phase I clinical trials are designed to test the safety (non-toxicity) of drugs and find the maximum tolerated dose (MTD). This task becomes significantly more challenging when multiple-drug dose-combinations (DC) are involved, due to the inherent conflict between the exponentially increasing DC candidates and the limited patient budget.
This paper proposes a novel Bayesian design, SDF-Bayes, for finding the MTD for drug combinations in the presence of safety constraints. Rather than the conventional principle of escalating or de-escalating the current dose of one drug (perhaps alternating between drugs), SDF-Bayes proceeds by cautious optimism: it chooses the next DC that, on the basis of current information, is most likely to be the MTD (optimism), subject to the constraint that it only chooses DCs that have a high probability of being safe (caution). We also propose an extension, SDF-Bayes-AR, that accounts for patient heterogeneity and enables heterogeneous patient recruitment.
Extensive experiments based on both synthetic and real-world datasets demonstrate the advantages of SDF-Bayes over state-of-the-art DC trial designs in terms of accuracy and safety.
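The cautious optimism rule described above can be illustrated with a toy selection step. This sketch makes strong simplifying assumptions that are not taken from the paper: each dose combination gets an independent Beta posterior over its toxicity probability, the MTD in a posterior draw is taken to be the combination whose sampled toxicity is closest to the target, and the function name `choose_next_dc`, the target of 0.3, and the safety level of 0.8 are all hypothetical.

```python
import numpy as np

def choose_next_dc(alpha, beta, target=0.3, safety_level=0.8,
                   n_samples=20000, seed=0):
    """Pick the next dose combination (DC) by a toy cautious-optimism rule.

    alpha, beta: Beta posterior parameters per DC (independence across DCs
    is an assumption of this sketch).
    Returns (index of the chosen DC or None, P(safe) per DC, P(MTD) per DC).
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n_dcs = len(alpha)
    # Monte Carlo draws of the toxicity probability of every DC.
    samples = rng.beta(alpha, beta, size=(n_samples, n_dcs))
    # Caution: posterior probability that toxicity does not exceed the target.
    p_safe = np.mean(samples <= target, axis=0)
    safe = p_safe >= safety_level
    # Optimism: how often each DC is the one closest to the target toxicity.
    closest = np.argmin(np.abs(samples - target), axis=1)
    p_mtd = np.bincount(closest, minlength=n_dcs) / n_samples
    if not safe.any():
        return None, p_safe, p_mtd
    # Choose the most likely MTD among the DCs that pass the safety check.
    choice = int(np.argmax(np.where(safe, p_mtd, -1.0)))
    return choice, p_safe, p_mtd

# Posterior parameters for four DCs (1 + toxic count, 1 + non-toxic count).
alpha = [1, 3, 6, 8]
beta = [13, 15, 8, 4]
choice, p_safe, p_mtd = choose_next_dc(alpha, beta)
print("chosen DC:", choice)
print("P(safe):", np.round(p_safe, 3))
print("P(MTD): ", np.round(p_mtd, 3))
```

With these illustrative counts, the third and fourth combinations fail the safety check however likely they are to be the MTD, so the rule selects the second combination (index 1), the most promising candidate among those considered safe.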