The van der Schaar Lab’s 2021 open house took place on October 15 as a by-invitation event for collaborators and sponsors. Mihaela van der Schaar and members of the lab presented a cohesive vision for machine learning for healthcare, while demonstrating how this vision is already becoming reality through a range of world-leading research projects.
The full three-hour event consisted of 10 mini-seminars (each under 10 minutes) given by lab members, focusing on projects or areas grouped around three broad themes: challenge, growth, and impact.
All 10 mini-seminars (as well as Mihaela’s introductory presentation) are provided in full below, along with links to any reference material. The Q&A/discussion breaks between sessions are not included.
The mini-seminars are provided in chronological order on this page, or you can jump to a specific video using the menu below. You can also view all videos as a single YouTube playlist here.
Introduction (Mihaela van der Schaar)
In this short introductory presentation Mihaela van der Schaar outlines the lab’s ongoing vision and research focus, shares some key resources and reference material, and explains our ongoing focus on using engagement to help build the future of ML for healthcare.
– Publications page
– Engagement sessions – Inspiration Exchange and Revolutionizing Healthcare
– Video tutorial series on ITE inference
– ICML 2020 tutorial, and ICML 2021 tutorial
– Impact page
– Software page, GitHub repo, and PyPI project page
Session 1 – challenge
This section focuses specifically on projects we feel are turning existing thinking on its head, challenging established wisdom, and leading the field.
ML interpretability (Jonathan Crabbé)
Ph.D. student Jonathan Crabbé explains the importance of machine learning interpretability and explainability, and outlines some key challenges in and beyond healthcare. Jonathan then focuses on the problem of feature importance, and introduces Dynamask, a parsimonious approach for feature importance in the time series setting.
– Explaining Time Series Predictions with Dynamic Masks (ICML 2021)
– Research pillar on interpretable machine learning
Personalized therapeutics (Alicia Curth)
Ph.D. student Alicia Curth introduces the problem of personalized therapeutics and heterogeneous treatment effects, a core area of study at the intersection of machine learning and healthcare. After recapping progress to date, Alicia focuses on the problem of modeling treatment effects using approaches such as meta-learners and altering inductive biases. She then moves on to the problem of estimating heterogeneous treatment effects from time-to-event data.
– Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory to Learning Algorithms (AISTATS 2021)
– On Inductive Biases for Heterogeneous Treatment Effect Estimation (NeurIPS 2021)
– Really Doing Great at Estimating CATE? A Critical Look at ML Benchmarking Practices in Treatment Effect Estimation (NeurIPS 2021)
– SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data (NeurIPS 2021)
– Research pillar on individualized treatment effect inference
– Video tutorial series on individualized treatment effect inference
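For readers unfamiliar with the term, the meta-learner idea mentioned in the summary above can be illustrated with a minimal sketch of a T-learner, one of the simplest meta-learners: fit a separate outcome model for the treated and control groups, then take the difference of their predictions as the estimated conditional average treatment effect (CATE). This is not the method from the papers listed above. The synthetic data, the randomized treatment assignment, the single covariate, and the linear base learner (`fit_line`) are all assumptions made purely for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x for a single covariate; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def t_learner(xs, ws, ys):
    """Fit one outcome model per treatment arm; return a function estimating the CATE at x."""
    # Outcome model for the treated group (w == 1)
    a1, b1 = fit_line([x for x, w in zip(xs, ws) if w == 1],
                      [y for y, w in zip(ys, ws) if w == 1])
    # Outcome model for the control group (w == 0)
    a0, b0 = fit_line([x for x, w in zip(xs, ws) if w == 0],
                      [y for y, w in zip(ys, ws) if w == 0])
    # Estimated effect = predicted treated outcome minus predicted control outcome
    return lambda x: (a1 + b1 * x) - (a0 + b0 * x)

# Hypothetical synthetic data: the true effect of treatment is 1 + 2x,
# and treatment is assigned at random (an assumption for this sketch).
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(2000)]
ws = [rng.randint(0, 1) for _ in range(2000)]
ys = [0.5 * x + w * (1.0 + 2.0 * x) + rng.gauss(0, 0.1) for x, w in zip(xs, ws)]

cate = t_learner(xs, ws, ys)
print(cate(0.0), cate(1.0))  # estimates of the true effects 1.0 and 3.0
```

In observational healthcare data, treatment is rarely assigned at random and the outcome models are rarely linear, which is part of why the choice of meta-learner and inductive bias matters in practice.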
Self-supervised learning for genomics (Fergus Imrie)
Postdoc Fergus Imrie introduces key problems surrounding the use of genomics data, starting with the lack of labeled data and the unique structure of tabular data. After introducing novel solutions to these problems, Fergus then demonstrates how ML for genomics can be used not only for prediction but also for discovery, with potential application in areas such as transcriptomics and proteomics.
– VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain (NeurIPS 2020)
– Self-Supervision Enhanced Feature Selection with Correlated Gates (ICLR 2022)
Session 2 – growth
The focus of this session is on projects that show how our lab is creating entirely new research areas, cultivating underdeveloped ones, and investing fresh resources.
Discovery using ML (Zhaozhi Qian)
Ph.D. student Zhaozhi Qian begins this mini-seminar by framing the broader problem of taking pharmacokinetic/pharmacodynamic models from lab research to implementation in practice. Zhaozhi introduces existing approaches such as latent variable models, and explains the shortcomings of such approaches. He then proposes a latent hybridization approach, demonstrating its effectiveness using a case study from the ICU setting.
– Integrating Expert ODEs into Neural ODEs: Pharmacology and Disease Progression (NeurIPS 2021)
Synthetic data (Boris van Breugel)
Ph.D. student Boris van Breugel explains how the inaccessibility of healthcare data continues to exert a dampening effect on potentially groundbreaking research in areas like machine learning. After introducing synthetic data generation as a powerful solution to this problem, he explains how synthetic approaches can also lead to datasets that are fairer and more useful than their real-world counterparts. Boris presents DECAF, an approach to generating fair synthetic data using causal knowledge.
– DECAF: Generating Fair Synthetic Data Using Causally-Aware Generative Networks (NeurIPS 2021)
– Research pillar on synthetic data
Quantitative epistemology (Alihan Hüyük)
Ph.D. student Alihan Hüyük introduces quantitative epistemology, a new and transformationally significant research pillar pioneered by the van der Schaar Lab with the purpose of understanding, supporting, and improving human decision-making. Alihan presents a broad framework for quantitative epistemology, as well as some of the lab’s initial forays into this entirely new area of research.
– Inverse Active Sensing: Modeling and Understanding Timely Decision-Making (ICML 2020)
– Scalable Bayesian Inverse Reinforcement Learning (ICLR 2021)
– Learning “What-if” Explanations for Sequential Decision-Making (ICLR 2021)
– Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning (ICLR 2021)
– Inverse Decision Modeling: Learning Interpretable Representations of Behavior (ICML 2021)
– Inverse Contextual Bandits: Learning How Behavior Evolves over Time (submitted, 2021)
– Research pillar on quantitative epistemology
Session 3 – impact
For this section, we selected a handful of projects and research areas with a particularly strong focus on utility and outreach.
Temporal phenotyping (Changhee Lee)
Recent lab graduate Changhee Lee introduces the broad problem of phenotyping, and shows how the lab’s existing work has already made a real-world impact. Changhee then explores the complexity of shifting from the static to the time-series setting, and the challenge of transforming complex data into actionable information. He presents a powerful approach to clustering patients based on their outcomes over time, and finishes with a demonstration of a brand-new interactive toolkit for temporal phenotyping.
– Application of a novel machine learning framework for predicting non-metastatic prostate cancer-specific mortality in men using the Surveillance, Epidemiology, and End Results (SEER) database (The Lancet Digital Health, 2021)
– Temporal Phenotyping using Deep Predictive Clustering of Disease Progression (ICML 2020)
Organ transplantation (Jeroen Berrevoets)
Ph.D. student Jeroen Berrevoets highlights a range of major problems (including scarcity and inefficient allocation) currently facing decision-makers in the incredibly important domain of organ transplantation. Jeroen explains the potential of machine learning in areas such as prediction, allocation, and matchmaking policy. He then introduces and explains some of the lab’s latest research conducted in collaboration with transplantation experts from the clinical world.
– OrganITE: Optimal transplant donor organ offering using an individual treatment effect (NeurIPS 2020)
– Learning Queueing Policies for Organ Transplantation Allocation using Interpretable Counterfactual Survival Analysis (ICML 2021)
– Spotlight on the lab’s organ transplantation research
Transfer learning (Trent Kyono)
Ph.D. student Trent Kyono begins his mini-seminar with a high-level introduction to transfer learning, and explains the value of being able to translate knowledge learned when solving one problem to other problems. Trent then focuses on the problem of domain adaptation, explaining the shortcomings of currently prevalent approaches and demonstrating the effectiveness of a newly developed unsupervised domain adaptation method.
– Exploiting Causal Structure for Robust Model Selection in Unsupervised Domain Adaptation (IEEE Transactions on Artificial Intelligence, 2021)
– Selecting Treatment Effects Models for Domain Adaptation Using Causal Knowledge (submitted, 2021)
Adjutorium (Bogdan Cebere)
Research engineer Bogdan Cebere presents Adjutorium, the van der Schaar Lab’s cutting-edge automated machine learning (AutoML) toolkit for static datasets. Bogdan demonstrates how the toolkit automatically generates unique and optimized models to solve custom user-defined problems based on user-defined datasets and parameters. He also shows how Adjutorium provides interpretable outputs, and how the toolkit itself can be easily extended with custom plugins.
– Note: the framework presented above is closed-source for now. We have so far provided access on a very limited basis to some of our lab’s partners and collaborators.
– More generally, you can find software and code via our software page, GitHub repo, and PyPI project page.
To inquire about Ph.D. studentships, visit this page.