van der Schaar Lab

Machine Learning meets Pharmacology

ML for pharmacology has the potential to revolutionise the way we do clinical trials and treat patients, and the van der Schaar lab is on the forefront of this research. Below are a few of our works in this area.

From Real-World Patient Data to Individualized Treatment Effects Using Machine Learning: Current and Future Methods to Address Underlying Challenges

Ioana Bica, Ahmed M Alaa, Craig Lambert, Mihaela van der Schaar

Clinical decision making needs to be supported by evidence that treatments are beneficial to individual patients. Although randomized control trials (RCTs) are the gold standard for testing and introducing new drugs, due to the focus on specific questions with respect to establishing efficacy and safety vs. standard treatment, they do not provide a full characterization of the heterogeneity in the final intended treatment population.

Conversely, real-world observational data, such as electronic health records (EHRs), contain large amounts of clinical information about heterogeneous patients and their response to treatments. In this paper, we introduce the main opportunities and challenges in using observational data for training machine learning methods to estimate individualized treatment effects and make treatment recommendations. We describe the modeling choices of the state-of-the-art machine learning methods for causal inference, developed for estimating treatment effects both in the cross-section and longitudinal settings. Additionally, we highlight future research directions that could lead to achieving the full potential of leveraging EHRs and machine learning for making individualized treatment recommendations. We also discuss how experimental data from RCTs and Pharmacometric and Quantitative Systems Pharmacology approaches can be used to not only improve machine learning methods, but also provide ways for validating them.

These future research directions will require us to collaborate across the scientific disciplines to incorporate models based on RCTs and known disease processes, physiology, and pharmacology into these machine learning models based on EHRs to fully optimize the opportunity these data present.

Integrating Expert ODEs into Neural ODEs: Pharmacology and Disease Progression

Zhaozhi Qian, William R Zame, Lucas M Fleuren, Paul Elbers, Mihaela van der Schaar

Modeling a system’s temporal behaviour in reaction to external stimuli is a fundamental problem in many areas. Pure Machine Learning (ML) approaches often fail in the small sample regime and cannot provide actionable insights beyond predictions. A promising modification has been to incorporate expert domain knowledge into ML models.

The application we consider is predicting the progression of disease under medications, where a plethora of domain knowledge is available from pharmacology. Pharmacological models describe the dynamics of carefully-chosen medically meaningful variables in terms of systems of Ordinary Differential Equations (ODEs). However, these models only describe a limited collection of variables, and these variables are often not observable in clinical environments. To close this gap, we propose the latent hybridisation model (LHM) that integrates a system of expert-designed ODEs with machine-learned Neural ODEs to fully describe the dynamics of the system and to link the expert and latent variables to observable quantities.

We evaluated LHM on synthetic data as well as real-world intensive care data of COVID-19 patients. LHM consistently outperforms previous works, especially when few training samples are available such as at the beginning of the pandemic.

Neural Laplace: Learning diverse classes of differential equations in the Laplace domain

Samuel HoltZhaozhi QianMihaela van der Schaar

Neural Ordinary Differential Equations model dynamical systems with ODEs learned by neural networks. However, ODEs are fundamentally inadequate to model systems with long-range dependencies or discontinuities, which are common in engineering and biological systems.

Broader classes of differential equations (DE) have been proposed as remedies, including delay differential equations and integro-differential equations. Furthermore, Neural ODE suffers from numerical instability when modelling stiff ODEs and ODEs with piecewise forcing functions. In this work, we propose Neural Laplace, a unified framework for learning diverse classes of DEs including all the aforementioned ones. Instead of modelling the dynamics in the time domain, we model it in the Laplace domain, where the history-dependencies and discontinuities in time can be represented as summations of complex exponentials. To make learning more efficient, we use the geometrical stereographic map of a Riemann sphere to induce more smoothness in the Laplace domain.

In the experiments, Neural Laplace shows superior performance in modelling and extrapolating the trajectories of diverse classes of DEs, including the ones with complex history dependency and abrupt changes.

D-CODE: Discovering Closed-form ODEs from Observed Trajectories

Zhaozhi QianKrzysztof KacprzykMihaela van der Schaar

For centuries, scientists have manually designed closed-form ordinary differential equations (ODEs) to model dynamical systems. An automated tool to distill closed-form ODEs from observed trajectories would accelerate the modeling process. Traditionally, symbolic regression is used to uncover a closed-form prediction function a = f(b) with label-features (ai, bi) as training examples,

However, an ODE models the time derivative x(t) of a dynamical system, e.g. x(t) = f(x(t),t), and the “label” x(t) is usually *not* observed. The existing ways to bridge this gap only perform well for a narrow range of settings with low measurement noise, frequent sampling, and non-chaotic dynamics. In this work, we propose the Discovery of Closed-form ODE framework (D-CODE), which advances symbolic regression beyond the paradigm of supervised learning.

D-CODE leverages a novel objective function based on the variational formulation of ODEs to bypass the unobserved time derivative. For formal justification, we prove that this objective is a valid proxy for the estimation error of the true (but unknown) ODE. In the experiments, D-CODE successfully discovered the governing equations of a diverse range of dynamical systems under challenging measurement settings with high noise and infrequent sampling.

D-CIPHER: Discovery of Closed-form Partial Differential Equations

Krzysztof KacprzykZhaozhi Qian, Mihaela van der Schaar

Closed-form differential equations, including partial differential equations and higher-order ordinary differential equations, are one of the most important tools used by scientists to model and better understand natural phenomena.

Discovering these equations directly from data is challenging because it requires modeling relationships between various derivatives that are not observed in the data (equation-data mismatch) and it involves searching across a huge space of possible equations. Current approaches make strong assumptions about the form of the equation and thus fail to discover many well-known systems. Moreover, many of them resolve the equation-data mismatch by estimating the derivatives, which makes them inadequate for noisy and infrequently sampled systems. To this end, we propose D-CIPHER, which is robust to measurement artifacts and can uncover a new and very general class of differential equations.

We further design a novel optimization procedure, CoLLie, to help D-CIPHER search through this class efficiently. Finally, we demonstrate empirically that it can discover many well-known equations that are beyond the capabilities of current methods.

Continuous-Time Modeling of Counterfactual Outcomes Using Neural Controlled Differential Equations

Nabeel Seedat
Fergus ImrieAlexis BellotZhaozhi Qian,
Mihaela van der Schaar

Systematic quantification of data quality is critical for consistent model performance. Prior works have focused on out-of-distribution data. Instead, we tackle an understudied yet eEstimating counterfactual outcomes over time has the potential to unlock personalized healthcare by assisting decision-makers to answer “what-if” questions. Existing causal inference approaches typically consider regular, discrete-time intervals between observations and treatment decisions and hence are unable to naturally model irregularly sampled data, which is the common setting in practice.

To handle arbitrary observation patterns, we interpret the data as samples from an underlying continuous-time process and propose to model its latent trajectory explicitly using the mathematics of controlled differential equations. This leads to a new approach, the Treatment Effect Neural Controlled Differential Equation (TE-CDE), that allows the potential outcomes to be evaluated at any time point. In addition, adversarial training is used to adjust for time-dependent confounding which is critical in longitudinal settings and is an added challenge not encountered in conventional time-series.

To assess solutions to this problem, we propose a controllable simulation environment based on a model of tumor growth for a range of scenarios with irregular sampling reflective of a variety of clinical scenarios. TE-CDE consistently outperforms existing approaches in all simulated scenarios with irregular sampling.

Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning

Alex Chan,
Mihaela van der Schaar

Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data – instead given access to a set of expert models and their predictions alongside some limited information about the dataset used to train them.

In scenarios from finance to the medical sciences, and even consumer practice, stakeholders have developed models on private data they either cannot, or do not want to, share. Given the value and legislation surrounding personal information, it is not surprising that only the models, and not the data, will be released – the pertinent question becoming: how best to use these models? Previous work has focused on global model selection or ensembling, with the result of a single final model across the feature space. Machine learning models perform notoriously poorly on data outside their training domain however, and so we argue that when ensembling models the weightings for individual instances must reflect their respective domains – in other words models that are more likely to have seen information on that instance should have more attention paid to them.

We introduce a method for such an instance-wise ensembling of models, including a novel representation learning step for handling sparse high-dimensional domains.

Finally, we demonstrate the need and generalisability of our method on classical machine learning tasks as well as highlighting a real world use case in the pharmacological setting of vancomycin precision dosing.

Deep Generative Symbolic Regression

Sam Holt,
Zhaozhi Qian, Mihaela van der Schaar

Symbolic regression (SR) aims to discover concise closed-form mathematical equations from data, a task fundamental to scientific discovery. However, the problem is highly challenging because closed-form equations lie in a complex combinatorial search space.

Existing methods, ranging from heuristic search to reinforcement learning, fail to scale with the number of input variables. We make the observation that closed-form equations often have structural characteristics and invariances (e.g., the commutative law) that could be further exploited to build more effective symbolic regression solutions. Motivated by this observation, our key contribution is to leverage pre-trained deep generative models to capture the intrinsic regularities of equations, thereby providing a solid foundation for subsequent optimization steps.

We show that our novel formalism unifies several prominent approaches of symbolic regression and offers a new perspective to justify and improve on the previous ad hoc designs, such as the usage of cross-entropy loss during pre-training. Specifically, we propose an instantiation of our framework, Deep Generative Symbolic Regression (DGSR). In our experiments, we show that DGSR achieves a higher recovery rate of true equations in the setting of a larger number of input variables, and it is more computationally efficient at inference time than state-of-the-art RL symbolic regression solutions.

Synthetic Model Combination: A new machine-learning method for pharmacometric model ensembling

Alex Chan, Richard Peck, Megan Gibbs, Mihaela van der Schaar

When aiming to make predictions over targets in the pharmacological setting, a data-focused approach aims to learn models based on a collection of labeled examples. Unfortunately, data sharing is not always possible, and this can result in many different models trained on disparate populations, leading to the natural question of how best to use and combine them when making a new prediction.

Previous work has focused on global model selection or ensembling, with the result of a single final model across the feature space. Machine-learning models perform notoriously poorly on data outside their training domain, however, due to a problem known as covariate shift, and so we argue that when ensembling models the weightings for individual instances must reflect their respective domains—in other words, models that are more likely to have seen information on that instance should have more attention paid to them. We introduce a method for such an instance-wise ensembling of models called Synthetic Model Combination (SMC), including a novel representation learning step for handling sparse high-dimensional domains.

We demonstrate the use of SMC on an example with dosing predictions for vancomycin, although emphasize the applicability of the method to any scenario involving the use of multiple models.

Andreas Bedorf