van der Schaar Lab


Please note: this page is a work in progress. Please treat it as a “stub” containing only basic information, rather than a full-fledged summary of our lab’s vision for clustering and our research to date.

Clustering patients (also referred to on this page as phenotyping and subgroup identification) is an important challenge that becomes particularly complicated in a dynamic setting where longitudinal datasets are in use. This page provides an overview of our lab’s work to date on clustering, with a special focus on our research on outcome-oriented clustering.

From unsupervised clustering to outcome-oriented clustering

The conventional notion of clustering seeks to group patients together in an unsupervised manner, based on their static or longitudinal features (covariates). However, unsupervised clustering does not account for patients’ observed outcomes (such as adverse events or the onset of comorbidities), and thus often leads to heterogeneous outcomes within a given cluster. This type of clustering therefore yields information that is of relatively limited use to clinicians and patients. After all, chronic diseases such as cancer, cystic fibrosis and dementia are heterogeneous in nature, with widely differing outcomes, even when the patients’ features seem relatively similar.

What clinicians and patients actually need to know is what types of events (including events related to competing risks) are likely to occur in the future, given the observations (features) available so far. We are, therefore, interested in a type of clustering or phenotyping in which patients are grouped based on similarity of future outcomes, rather than solely on similarity of observations.

One of our lab’s first projects to address this shortcoming was the “tree of predictors” (ToPs), an ensemble method first published in 2018.

Working in the supervised setting, ToPs captures the heterogeneity of the patient population by learning automatically from the data which features have the most predictive power and which have the most discriminative power for each time horizon. ToPs uses this knowledge to create clusters of patients and specific predictive models for each cluster. Both the clusters that are identified and the predictive models that are applied to them are readily interpretable.

ToPs differs from existing methods in that it discovers clusters in a data-driven manner, and then constructs and applies different predictive models to the discovered clusters. Conventional tree-based approaches split the feature space successively in order to maximize the homogeneity of each resulting cluster with respect to the labels. ToPs instead chooses each split so as to maximize the predictive accuracy of the model constructed for each resulting cluster. To this end, ToPs creates a tree of clusters (subsets of the feature space) and associates a predictive model with each such cluster (subset).
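
The difference between splitting for label homogeneity and splitting for predictive accuracy can be illustrated with a minimal sketch. This is not the lab's implementation: it assumes a single numeric feature, squared-error loss, and two hypothetical base learners (`fit_constant` and `fit_linear`). For brevity it scores candidate splits on the training data, whereas a practical version would score them on held-out data.

```python
from statistics import mean

def fit_constant(xs, ys):
    """Hypothetical base learner 1: always predict the mean label."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Hypothetical base learner 2: one-variable least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

BASE_LEARNERS = [fit_constant, fit_linear]

def best_model(xs, ys):
    """Return (loss, predictor) for the base learner with the lowest squared error."""
    scored = []
    for i, fit in enumerate(BASE_LEARNERS):
        f = fit(xs, ys)
        loss = sum((f(x) - y) ** 2 for x, y in zip(xs, ys))
        scored.append((loss, i, f))  # i breaks ties so functions are never compared
    loss, _, f = min(scored)
    return loss, f

def best_split(xs, ys, min_size=2):
    """Pick the threshold whose two sides, each with its own best model,
    give the lowest total loss. Returns (loss, threshold); threshold is
    None when no split beats a single model on all the data."""
    best = (best_model(xs, ys)[0], None)
    for t in sorted(set(xs))[1:]:
        left = [(x, y) for x, y in zip(xs, ys) if x < t]
        right = [(x, y) for x, y in zip(xs, ys) if x >= t]
        if len(left) < min_size or len(right) < min_size:
            continue
        loss = best_model(*zip(*left))[0] + best_model(*zip(*right))[0]
        if loss < best[0]:
            best = (loss, t)
    return best

# Toy data: a rising trend below x = 4 and a falling trend from x = 4 onwards.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 1, 2, 3, 6, 5, 4, 3]
loss, threshold = best_split(xs, ys)
print(threshold, loss)
```

In this toy example the chosen split separates the two trends, because a separate linear model then fits each side. The split is selected for how well each side can be predicted, not for how similar the labels on each side are.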

The figure below shows the tree of predictors for 3-year survival following cardiac transplantation. End nodes are shaded gray, and the 3 most relevant features are listed for each node as feature indices.

Abstracts and papers related to the lab’s work on ToPs (including an application of ToPs to the problem of cardiac transplantation) can be found below.

ToPs: Ensemble Learning with Trees of Predictors

Jinsung Yoon, William R. Zame, Mihaela van der Schaar

IEEE Transactions on Signal Processing, 2018

We present a new approach to ensemble learning. Our approach differs from previous approaches in that it constructs and applies different predictive models to different subsets of the feature space. It does this by constructing a tree of subsets of the feature space and associating a predictor (predictive model) to each node of the tree; we call the resulting object a tree of predictors.

The (locally) optimal tree of predictors is derived recursively; each step involves jointly optimizing the split of the terminal nodes of the previous tree and the choice of learner (from among a given set of base learners) and training set (hence predictor) for each set in the split. The features of a new instance determine a unique path through the optimal tree of predictors; the final prediction aggregates the predictions of the predictors along this path. Thus, our approach uses base learners to create complex learners that are matched to the characteristics of the data set while avoiding overfitting. We establish loss bounds for the final predictor in terms of the Rademacher complexity of the base learners.
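
The path-based aggregation described above can be sketched as follows. The `Node` structure, the feature name `age`, and the constant predictors are all hypothetical. The aggregation weights default to a uniform average purely as an assumption, since the abstract does not specify how the predictions along the path are weighted.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Node:
    predictor: Callable[[dict], float]   # model attached to this subset of the feature space
    feature: Optional[str] = None        # feature used to split (None for a terminal node)
    threshold: float = 0.0
    left: Optional["Node"] = None        # branch taken when x[feature] < threshold
    right: Optional["Node"] = None       # branch taken otherwise

def path_predictors(root, x):
    """Collect the predictors on the unique root-to-leaf path selected by x."""
    node, path = root, []
    while node is not None:
        path.append(node.predictor)
        if node.feature is None:
            break
        node = node.left if x[node.feature] < node.threshold else node.right
    return path

def predict(root, x, weights=None):
    """Aggregate the predictions made along the path (uniform weights by default)."""
    preds = [p(x) for p in path_predictors(root, x)]
    if weights is None:
        weights = [1.0 / len(preds)] * len(preds)   # assumption: simple average
    return sum(w * p for w, p in zip(weights, preds))

# Hypothetical two-level tree with constant predictors at each node.
tree = Node(
    predictor=lambda x: 0.5,
    feature="age",
    threshold=50,
    left=Node(predictor=lambda x: 0.2),
    right=Node(predictor=lambda x: 0.8),
)
younger = predict(tree, {"age": 40})   # averages the root and left-leaf predictions
older = predict(tree, {"age": 65})     # averages the root and right-leaf predictions
print(younger, older)
```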

We report the results of a number of experiments on a variety of datasets, showing that our approach provides statistically significant improvements over a wide variety of state-of-the-art machine learning algorithms, including various ensemble learning methods.

Risk prediction is crucial in many areas of medical practice, such as cardiac transplantation, but existing clinical risk-scoring methods have suboptimal performance.

We develop a novel risk prediction algorithm and test its performance on the database of all patients who were registered for cardiac transplantation in the United States during 1985-2015. We develop a new, interpretable, methodology (ToPs: Trees of Predictors) built on the principle that specific predictive (survival) models should be used for specific clusters within the patient population. ToPs discovers these specific clusters and the specific predictive model that performs best for each cluster.

In comparison with existing clinical risk scoring methods and state-of-the-art machine learning methods, our method provides significant improvements in survival predictions, both post- and pre-cardiac transplantation. For instance: in terms of 3-month survival post-transplantation, our method achieves AUC of 0.660; the best clinical risk scoring method (RSS) achieves 0.587. In terms of 3-year survival/mortality predictions post-transplantation (in comparison to RSS), holding specificity at 80.0%, our algorithm correctly predicts survival for 2,442 (14.0%) more patients (of 17,441 who actually survived); holding sensitivity at 80.0%, our algorithm correctly predicts mortality for 694 (13.0%) more patients (of 5,339 who did not survive). ToPs achieves similar improvements for other time horizons and for predictions pre-transplantation.
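
The percentages quoted above follow directly from the counts. A quick check, using only the figures given in the abstract (the helper name `percent` is ours):

```python
def percent(part, whole):
    """Share of `whole` represented by `part`, as a percentage to one decimal place."""
    return round(100 * part / whole, 1)

extra_survival_correct = percent(2442, 17441)   # additional survivors correctly predicted
extra_mortality_correct = percent(694, 5339)    # additional deaths correctly predicted
print(extra_survival_correct, extra_mortality_correct)
```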

ToPs discovers the most relevant features (covariates), uses available features to best advantage, and can adapt to changes in clinical practice. We show that, in comparison with existing clinical risk-scoring methods and other machine learning methods, ToPs significantly improves survival predictions both post- and pre-cardiac transplantation. ToPs provides a more accurate, personalized approach to survival prediction that can benefit patients, clinicians, and policymakers in making clinical decisions and setting clinical policy.

Because survival prediction is widely used in clinical decision-making across diseases and clinical specialties, the implications of our methods are far-reaching.

Outcome-oriented clustering in the time series setting

Temporal clustering has recently been used as a data-driven framework for partitioning patients with time-series observations into subgroups. Recent research has typically focused either on finding fixed-length, low-dimensional representations or on modifying the similarity measure, in both cases so that existing clustering algorithms can be applied to time-series observations.

Identifying patient subgroups with similar progression patterns can be advantageous for understanding such heterogeneous diseases. This allows clinicians to anticipate patients’ prognoses by comparing them to “similar” patients, and to design treatment guidelines tailored to homogeneous subgroups.

Our lab has developed a method for temporal phenotyping in this manner using deep predictive clustering of disease progression, as presented at ICML 2020. This provides a notion of temporal phenotyping that is predictive of similar future outcomes, on the basis of which doctors and patients can actively plan. The focus here is on learning discrete representations of past observations that best describe and predict future events and outcomes of interest.

Temporal phenotyping using deep predictive clustering of disease progression

Changhee Lee, Mihaela van der Schaar

ICML 2020

Due to the wider availability of modern electronic health records, patient care data is often being stored in the form of time-series. Clustering such time-series data is crucial for patient phenotyping, anticipating patients’ prognoses by identifying “similar” patients, and designing treatment guidelines that are tailored to homogeneous patient subgroups.

In this paper, we develop a deep learning approach for clustering time-series data, where each cluster comprises patients who share similar future outcomes of interest (e.g., adverse events, the onset of comorbidities). To encourage each cluster to have homogeneous future outcomes, the clustering is carried out by learning discrete representations that best describe the future outcome distribution based on novel loss functions.
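
The idea of grouping patients by their future outcome distribution, rather than by their raw features, can be illustrated with a deliberately simplified sketch. It is not the method in the paper, which learns the representations with a deep network and novel loss functions. Here we assume that each patient's predicted outcome distribution is already available, and we use Kullback-Leibler divergence as an assumed dissimilarity in a k-means-style loop. The function names and the toy numbers are hypothetical.

```python
import math

def kl(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def cluster_by_outcome(dists, init_centroids, n_iter=10):
    """Alternate between assigning each patient to the closest centroid (in KL)
    and resetting each centroid to the mean distribution of its members."""
    centroids = [list(c) for c in init_centroids]   # copy so the input is not mutated
    assign = []
    for _ in range(n_iter):
        assign = [
            min(range(len(centroids)), key=lambda k: kl(p, centroids[k]))
            for p in dists
        ]
        for k in range(len(centroids)):
            members = [p for p, a in zip(dists, assign) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids

# Toy predicted distributions over two outcomes (e.g. no event, adverse event).
patients = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
assign, centroids = cluster_by_outcome(patients, [[0.6, 0.4], [0.4, 0.6]])
print(assign, centroids)
```

Each resulting cluster is summarized by a single outcome distribution (its centroid), which is what makes the clusters homogeneous in terms of future outcomes.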

Experiments on two real-world datasets show that our model achieves superior clustering performance over state-of-the-art benchmarks and identifies meaningful clusters that can be translated into actionable information for clinical decision-making.

Developing machine learning algorithms for dynamic estimation of progression during active surveillance for prostate cancer

Changhee Lee, Alexander Light, Evgeny S. Saveliev, Mihaela van der Schaar, Vincent J. Gnanapragasam

npj Digital Medicine, 2022

Active Surveillance (AS) for prostate cancer is a management option that continually monitors early disease and considers intervention if progression occurs. A robust method to incorporate “live” updates of progression risk during follow-up has hitherto been lacking. To address this, we developed a deep learning-based individualized longitudinal survival model using Dynamic-DeepHit-Lite (DDHL) that learns data-driven distribution of time-to-event outcomes.

Further refining outputs, we used a reinforcement learning approach (Actor-Critic) for temporal predictive clustering (AC-TPC) to discover groups with similar time-to-event outcomes to support clinical utility. We applied these methods to data from 585 men on AS with longitudinal and comprehensive follow-up (median 4.4 years). Time-dependent C-indices and Brier scores were calculated and compared to Cox regression and landmarking methods. Both Cox and DDHL models including only baseline variables showed comparable C-indices but the DDHL model performance improved with additional follow-up data. With 3 years of data collection and 3 years follow-up the DDHL model had a C-index of 0.79 (± 0.11) compared to 0.70 (± 0.15) for landmarking Cox and 0.67 (± 0.09) for baseline Cox only. Model calibration was good across all models tested.
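
For readers unfamiliar with the C-index, the sketch below computes the plain (Harrell) concordance index for right-censored data. It is a simplification: the paper reports time-dependent C-indices, which additionally condition on a prediction time and an evaluation horizon. The toy follow-up times, event indicators, and risk scores are made up for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C: among comparable pairs, the share in which the patient
    whose event came earlier was given the higher predicted risk.
    Ties in predicted risk count as half."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue   # a pair is comparable only if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")

times = [2, 4, 6, 8]            # follow-up time in years
events = [1, 1, 0, 1]           # 1 = progression observed, 0 = censored
risks = [0.9, 0.4, 0.5, 0.1]    # predicted risk scores
c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.79, 0.70 and 0.67.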

The AC-TPC method further discovered 4 distinct outcome-related temporal clusters with distinct progression trajectories. Those in the lowest risk cluster had negligible progression risk while those in the highest cluster had a 50% risk of progression by 5 years. In summary we report a novel machine learning approach to inform personalised follow-up during active surveillance which improves predictive power with increasing data input over time.

Outcome-oriented deep temporal phenotyping of disease progression

Changhee Lee, Jem Rashbass, Mihaela van der Schaar

IEEE Transactions on Biomedical Engineering, 2020

Chronic diseases evolve slowly throughout a patient’s lifetime creating heterogeneous progression patterns that make clinical outcomes remarkably varied across individual patients. A tool capable of identifying temporal phenotypes based on the patients’ different progression patterns and clinical outcomes would allow clinicians to better forecast disease progression by recognizing a group of similar past patients, and to better design treatment guidelines that are tailored to specific phenotypes.

To build such a tool, we propose a deep learning approach, which we refer to as outcome-oriented deep temporal phenotyping (ODTP), to identify temporal phenotypes of disease progression considering what type of clinical outcomes will occur and when based on the longitudinal observations. More specifically, we model clinical outcomes throughout a patient’s longitudinal observations via time-to-event (TTE) processes whose conditional intensity functions are estimated as non-linear functions using a recurrent neural network. Temporal phenotyping of disease progression is carried out by our novel loss function that is specifically designed to learn discrete latent representations that best characterize the underlying TTE processes.
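
To make the link between an intensity (hazard) function and time-to-event outcomes concrete, the sketch below works in discrete time, which is a simplifying assumption on our part. In ODTP the intensities are estimated by a recurrent neural network; here the hazard values are simply made up, and the function names are hypothetical.

```python
def survival_curve(hazards):
    """S(t): probability of no event up to and including interval t,
    given per-interval hazards h_t = P(event in t | no event before t)."""
    surv, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        surv.append(s)
    return surv

def event_time_distribution(hazards):
    """P(event occurs in interval t) = h_t * S(t - 1), with S before the first interval = 1."""
    probs, s = [], 1.0
    for h in hazards:
        probs.append(s * h)
        s *= 1.0 - h
    return probs

hazards = [0.1, 0.2, 0.5]                # made-up hazards for three intervals
surv = survival_curve(hazards)           # approximately [0.9, 0.72, 0.36]
probs = event_time_distribution(hazards) # approximately [0.1, 0.18, 0.36]
print(surv, probs)
```

The event probabilities and the final survival probability sum to one, so a hazard sequence fully determines both what is expected to happen and when.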

The key insight here is that learning such discrete representations groups progression patterns considering the similarity in expected clinical outcomes, and thus naturally provides outcome-oriented temporal phenotypes.

We demonstrate the power of ODTP by applying it to a real-world heterogeneous cohort of 11,779 stage III breast cancer patients from the UK National Cancer Registration and Analysis Service. The experiments show that ODTP identifies temporal phenotypes that are strongly associated with the future clinical outcomes and achieves significant gain on the homogeneity and heterogeneity measures over existing methods.

Furthermore, we are able to identify the key driving factors that lead to transitions between phenotypes which can be translated into actionable information to support better clinical decision-making.

Associating outcome-oriented subgroups with longitudinal patterns

Outcome-oriented clusters can capture transitions in disease progression and allow clinicians to investigate the associated longitudinal patterns in patient trajectories. Existing temporal clustering approaches generally discover patient subgroups based solely on clinical status or outcome, which limits the prognostic value of the discovered clusters because the heterogeneity of longitudinal patterns within each subgroup is neglected.

To understand the full picture of disease progression, which manifests through heterogeneous temporal characteristics in disease trajectories, it is desirable to identify unique associations between longitudinal patterns and clinical outcomes. This provides greater diagnostic value to clinicians and enables tailored treatments with reference to “similar” patients, meaning those with both similar outcomes and homogeneous disease progression patterns over time.

To complement outcome-oriented clustering, our lab has developed a novel temporal clustering method that uncovers predictive temporal patterns descriptive of the underlying disease progression from labeled time-series data, as published in AISTATS 2023. This approach can not only identify clusters that have prognostic value but also offer interpretable information about disease progression patterns. This is achieved through constrained outcome-oriented clustering on a similarity graph that captures heterogeneities in the disease trajectories of individual patients.

T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression

Yuchao Qin, Mihaela van der Schaar, Changhee Lee

AISTATS 2023

Clustering time-series data in healthcare is crucial for clinical phenotyping to understand patients’ disease progression patterns and to design treatment guidelines tailored to homogeneous patient subgroups. While rich temporal dynamics enable the discovery of potential clusters beyond static correlations, two major challenges remain outstanding: i) discovery of predictive patterns from many potential temporal correlations in the multi-variate time-series data and ii) association of individual temporal patterns to the target label distribution that best characterizes the underlying clinical progression.

To address such challenges, we develop a novel temporal clustering method, T-Phenotype, to discover phenotypes of predictive temporal patterns from labeled time-series data. We introduce an efficient representation learning approach in frequency domain that can encode variable-length, irregularly-sampled time-series into a unified representation space, which is then applied to identify various temporal patterns that potentially contribute to the target label using a new notion of path-based similarity.
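
As a rough illustration of how variable-length, irregularly-sampled series can be mapped into one fixed-size representation in the frequency domain, the sketch below evaluates a non-uniform Fourier sum at a fixed set of frequencies. This is a stand-in chosen for simplicity and should not be read as the encoder used in T-Phenotype. The function name, the frequency grid, and the toy series are assumptions.

```python
import cmath
import math

def frequency_encode(times, values, freqs):
    """Encode (time, value) samples as real and imaginary parts of
    c_f = (1/n) * sum_i v_i * exp(-2*pi*i*f*t_i) for each frequency f.
    The output length depends only on len(freqs), not on the series length."""
    n = len(values)
    feats = []
    for f in freqs:
        c = sum(v * cmath.exp(-2j * math.pi * f * t) for t, v in zip(times, values)) / n
        feats.extend([c.real, c.imag])
    return feats

freqs = [0.0, 0.5, 1.0]   # assumed frequency grid (cycles per unit of time)

# Two patients with different numbers of visits at irregular times.
short_series = frequency_encode([0.0, 0.3, 0.9, 1.7, 2.0], [1, 1, 1, 1, 1], freqs)
long_series = frequency_encode(
    [0.0, 0.2, 0.5, 0.6, 1.1, 1.4, 1.5, 1.9, 2.4],
    [2.0, 2.1, 1.8, 1.7, 1.2, 1.0, 1.1, 0.9, 0.7],
    freqs,
)
print(len(short_series), len(long_series))
```

Because both patients end up in the same representation space, their trajectories can be compared directly even though they were observed at different times and for different numbers of visits.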

Throughout the experiments on synthetic and real-world datasets, we show that T-Phenotype achieves the best phenotype discovery performance over all the evaluated baselines. We further demonstrate the utility of T-Phenotype by uncovering clinically meaningful patient subgroups characterized by unique temporal patterns.

Other work on clustering and subtyping

New approaches to clustering and subtyping also feature in some of our lab’s earlier research. For example, the paper below introduces a personalized risk scoring method that learns a set of latent patient subtypes from offline electronic health record data, and trains a mixture of Gaussian process experts. Each expert models the physiological data streams associated with a specific patient subtype. Transfer learning techniques are used to learn the relationship between a patient’s latent subtype and static admission information (e.g., age, gender, transfer status, ICD-9 codes).

Personalized Risk Scoring for Critical Care Prognosis using Mixtures of Gaussian Processes

Ahmed M Alaa, Jinsung Yoon, Scott Hu, Mihaela van der Schaar

IEEE Transactions on Biomedical Engineering, 2017

In this paper, we develop a personalized real-time risk scoring algorithm that provides timely and granular assessments for the clinical acuity of ward patients based on their (temporal) lab tests and vital signs; the proposed risk scoring system ensures timely intensive care unit admissions for clinically deteriorating patients.

The risk scoring system is based on the idea of sequential hypothesis testing under an uncertain time horizon. The system learns a set of latent patient subtypes from the offline electronic health record data, and trains a mixture of Gaussian Process experts, where each expert models the physiological data streams associated with a specific patient subtype. Transfer learning techniques are used to learn the relationship between a patient’s latent subtype and her static admission information (e.g., age, gender, transfer status, ICD-9 codes, etc.).
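
The way a mixture of subtype-specific experts turns a stream of measurements into a personalized risk score can be sketched as follows. This is a heavily simplified illustration: independent Gaussians stand in for the Gaussian process experts (so temporal correlation is ignored), the prior over subtypes is fixed rather than learned from admission information, and every number below is hypothetical.

```python
import math

def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)

def subtype_posterior(observations, prior, experts):
    """Bayesian update of the subtype probabilities as measurements arrive.
    `experts` holds one (mean, std) pair per latent subtype."""
    log_post = [math.log(p) for p in prior]
    for x in observations:
        for k, (mu, sigma) in enumerate(experts):
            log_post[k] += gaussian_logpdf(x, mu, sigma)
    m = max(log_post)                              # subtract the max for numerical stability
    w = [math.exp(lp - m) for lp in log_post]
    total = sum(w)
    return [wi / total for wi in w]

def risk_score(observations, prior, experts, subtype_risk):
    """Risk as the posterior-weighted average of per-subtype risks."""
    post = subtype_posterior(observations, prior, experts)
    return sum(p * r for p, r in zip(post, subtype_risk))

experts = [(80.0, 10.0), (120.0, 10.0)]   # hypothetical heart-rate profile per subtype
prior = [0.5, 0.5]
subtype_risk = [0.05, 0.60]               # hypothetical deterioration risk per subtype

posterior = subtype_posterior([118.0, 125.0, 121.0], prior, experts)
score = risk_score([118.0, 125.0, 121.0], prior, experts, subtype_risk)
print(posterior, score)
```

As more measurements arrive, the posterior concentrates on the subtype that explains them best, and the risk score moves towards that subtype's risk.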

Experiments conducted on data from a heterogeneous cohort of 6321 patients admitted to Ronald Reagan UCLA medical center show that our score significantly outperforms the currently deployed risk scores, such as the Rothman index, MEWS, APACHE, and SOFA scores, in terms of timeliness, true positive rate, and positive predictive value. Our results reflect the importance of adopting the concepts of personalized medicine in critical care settings; significant accuracy and timeliness gains can be achieved by accounting for the patients’ heterogeneity.

The proposed risk scoring methodology can confer huge clinical and social benefits on a massive number of critically ill inpatients who exhibit adverse outcomes including, but not limited to, cardiac arrests, respiratory arrests, and septic shocks.

Learn more and get involved

Our research related to clustering is closely linked to a number of our lab’s other core areas of focus. If you’re interested in branching out from clustering, we’d recommend reviewing our summaries on time series in healthcare and survival analysis, competing risks, and comorbidities.

We would also encourage you to stay up-to-date with ongoing developments in this and other areas of machine learning for healthcare by signing up to take part in one of our two streams of online engagement sessions.

If you are a practicing clinician, please sign up for Revolutionizing Healthcare, which is a forum for members of the clinical community to share ideas and discuss topics that will define the future of machine learning in healthcare (no machine learning experience required).

If you are a machine learning student, you can join our Inspiration Exchange engagement sessions, in which we introduce and discuss new ideas and development of new methods, approaches, and techniques in machine learning for healthcare.

A full list of our papers on this and related topics can be found here.