van der Schaar Lab

Revolutionizing healthcare: an invitation to clinical professionals everywhere

What’s on this page?

  • Intro to Revolutionizing Healthcare (engagement session series for clinicians)
  • Vision for next-generation healthcare ecosystem, and role of machine learning
  • Machine learning tools for clinicians, and real-world applications/contexts:
    + Breast cancer
    + Cystic fibrosis
    + In-hospital critical care
    + Multimorbidity
    + Transplant survival
  • Plans for future Revolutionizing Healthcare sessions
  • Closing message

Dear clinical colleagues,

I would like to ask you to partner with me and the members of my lab as we embark on a new and exciting journey.

In one week, we will hold the first in a series of online discussions on how machine learning can change healthcare for the better. If you are a clinician—even if you don’t know anything about machine learning, or are perhaps skeptical—please consider joining us, and please reach out to your colleagues to get them involved as well.

The name of our new discussion series is Revolutionizing Healthcare. This might seem overambitious, but I truly believe it’s the right name for the challenging but transformative nature of the journey that lies ahead of us in the coming years. We want you to lead us on that journey and become agents of change; we hope to enable you to tackle the many institutional and conceptual barriers that can only be taken down from the inside.

Through these sessions, we hope to create an interdisciplinary community based on curiosity, shared goals, and mutual understanding. On one hand, we want to convey directly to you (since you’re our intended end users) how you could harness our lab’s potent and adaptable machine learning models to make the best possible decisions for your patients—whether you’re a GP aiming to diagnose patients early and screen them effectively, a specialist aiming to identify the best treatments for a particular patient, or an ICU clinician trying to manage scarce resources. On the other hand, we need your guidance to help us explore, frame and solve important real-world problems in the domain of healthcare, and to work our solutions into a cohesive ecosystem for healthcare delivery. As our Revolutionizing Healthcare series progresses and this community grows over time, we will discuss ways to design this ecosystem together piece by piece, actor by actor, and method by method.

In anticipation of our first session on September 29, I would like to use this post to share my vision of a more personalized, quantitative, and data-driven healthcare system, and to show the substance behind my excitement. This is still really just a sliver of what is possible; I hope that together we can paint a much more thorough picture over the coming months.

Envisaging a healthcare ecosystem powered by machine learning

Our ultimate goal is a full-fledged transformation that will create an entire ecosystem encompassing everything from (inter)national healthcare networks all the way down to individual practitioners and patients. I have presented an early version of this in my chapter for the Chief Medical Officer’s 2018 annual report.

At the center will be machine learning, driven by healthcare data (including electronic health records, prescription data, referral data, appointment data, and so forth), which will provide the actionable intelligence that is relevant to each specific user in each specific scenario. In this way, machine learning can inform and empower any and all actors within healthcare, while strengthening the quality of their mutual interactions.

While the diagram above could never possibly capture the actual complexity of a healthcare ecosystem, my intention in creating it was to highlight the sheer scale on which we need to consider the implementation of machine learning in healthcare. This is a system that requires all tools, for all users, for all purposes, to act together in concert. Such a system will need to be extraordinarily well-planned: it must be built on top of shared standards and protocols, it must offer a high degree of interoperability between users in different “layers,” and the tools provided must be created in modular fashion with well-defined interfaces, enabling users to switch between them seamlessly to create custom combinations of analytics that reflect their individual and context-specific needs. We will dedicate a future Revolutionizing Healthcare session to exploring and discussing several possibilities for such an architecture.

Making machine learning work for clinicians

In our first few Revolutionizing Healthcare sessions, we will be focusing specifically on clinicians like you, and demonstrating how machine learning tools can positively transform practically every area of your work. This is what I will focus on below, and I would like to start by addressing three questions that the clinicians I’ve worked with have often asked me—especially in the earlier stages of a collaboration.

The first such question is whether machine learning is really able to take advantage of existing healthcare data (as exemplified by electronic health records), and how data should be acquired and curated in order to get the most use out of it. This is a topic I could debate at great length, but for now I can summarize by saying that, while I firmly believe that healthcare data are the key that will unlock the transformation in healthcare, I also feel that electronic health records will need to be redesigned. As I’m sure many of you will agree, electronic health records currently tend to present hours of clerical busywork every day (actively competing with patient consultation time), with surprisingly little utility in return. These grievances are pithily rendered in a single sentence by Atul Gawande: “I’ve come to feel that a system that promised to increase my mastery over my work has, instead, increased my work’s mastery over me.” We need to rebuild electronic health records to take full advantage of the capabilities of machine learning, and, by doing so, strike a better balance between time invested by clinicians and actionable knowledge gained. We must turn them into true engines for healthcare. I hope to use our Revolutionizing Healthcare sessions to discuss the kind of data that will be most important in healthcare, as well as the best means of acquiring such data and putting them to work. On a side note, we will also discuss, in future Revolutionizing Healthcare sessions, how to approach the implementation of machine learning in countries that do not have a well-developed infrastructure for data.

Another question I’ve been asked is whether AI and machine learning can ever substitute for humans. This question is based on a fundamental misconception that I hope to thoroughly and repeatedly squash. There is no viable vision for healthcare that does not have humans i) at its heart and ii) in full control. It is not a case of choosing between one or the other. Humans and AI/machine learning each have unique strengths that must be combined in order to become more than the sum of their respective parts, rather than applied separately (for more on this, here’s a highly recommended article by Siddhartha Mukherjee).

Lastly, from time to time I am asked why only a few examples of techniques or approaches reach the public conversation about machine learning in healthcare; specifically, I’m talking about producing risk scores and static predictions for single diseases and conditions, and image recognition and its ability to spot tumors. My answer is that such examples are really only scratching the surface of what is possible. Through our Revolutionizing Healthcare series, I hope to demonstrate that the potential of machine learning goes far, far beyond these oft-cited examples. We can actually provide a simultaneous array of risk scores for a host of diseases or conditions (and these can be tuned, customized, and focused with ease). Furthermore, risk scores and predictions themselves represent just fragments of the extensive capabilities of machine learning. This brings me to the main topic of my post.

Introducing the “toolbox of toolboxes”

For years, our lab has been steadily developing a comprehensive and completely customizable range of tools for analytics, intelligence, and recommendations that can inform and support clinicians, empowering them with the best possible information to make treatment decisions, while also helping them to make the most efficient use of limited time and resources. I’d call the result a “toolbox,” but it’s really a “toolbox of toolboxes.” We want clinicians alone to have complete control over this toolbox of toolboxes: they should be able to harness its contents to fit their individual preferences, adapting them for specific situations, contexts, or patients.

We have developed and validated numerous formalisms, models, and algorithms for every single task referred to in the table below.

Screening
Accurate and personalized identification of at-risk patients

Risk score array
Instant provision of a deep and comprehensive range of risk scores

Monitoring
Determining what to monitor for each specific patient, and when

Treatment effect estimation
Designing optimal treatment courses with foresight into effects over time

Early warning systems
Accurate and timely prediction of clinical deterioration

Early diagnosis
Catching diseases and conditions earlier

Recommender systems
Suggesting information clinicians may find useful when treating specific patients

Competing risk analysis
Factoring multimorbidity considerations into treatment decisions

Dynamic survival analysis
Survival analysis weighted to prefer better-performing models

Resource allocation
Predictions and recommendations based on accurate forecasting of resource usage

I don’t want to rattle off a list of promising but vague names while skimping on details, so let me share some specific examples of applications that show precisely how many of the machine learning tools and techniques we have already developed can be effectively and efficiently applied in practice. Many of these were actually developed in close collaboration with, and with input from, clinical colleagues.

A few points I’d like to make before I start:

  • While I’ve attempted to illustrate the capabilities of machine learning by using a few specific diseases (e.g. cancer and cystic fibrosis) and settings as examples, most of the tools introduced here can be applied to an exceptionally wide range of diseases, conditions, or scenarios.
  • I am trying to give an introductory peek inside our “toolbox of toolboxes,” so please consider this a “representative sample” of our tools. We have a lot more, but the ones below have been selected to highlight the diversity of situations in which machine learning can aid decision-making, while also showing how these tools can work together synergistically. In fact, what we want most out of our Revolutionizing Healthcare sessions is inspiration for new approaches and methods based on real-world clinical needs!
  • I may describe some of these use cases in a manner unlike that of a clinician. This is because, despite many fruitful interdisciplinary collaborations, I remain a machine learning engineer, not a clinician—and it’s another reason my lab and I need your help!

Breast cancer

Breast cancer is one of many common diseases that can be fought more successfully by clinicians using machine learning at all stages of patient-clinician interaction, from screening through to treatment courses.

Screening: from “one-size-fits-all” to personalized

For diseases like breast cancer, screening often represents the start of the patient journey. Time is of the essence when diagnosing and treating breast cancer, since earlier diagnoses yield higher likelihood of survival. Yet adolescent and young adult female breast cancer patients are significantly less likely than older adults to be diagnosed with early-stage disease—a fact that decreases the likelihood of survival of thousands of women each year. Younger women are unlikely to be screened because current screening policies are guided by clinical practice guidelines that take a “one-size-fits-all” approach designed to work well on average for an entire population.

Since the risks and benefits of screening tests are functions of each patient’s features and characteristics, personalized screening policies tailored to the features of individuals would yield a significantly more accurate and efficient result. Several machine learning approaches to this problem have been developed by our own lab. One example of such a tool is ConfidentCare, a computer-aided clinical decision support system that provides personalized screening policies based on healthcare data. This is done by identifying clusters of patients with similar features, then learning the “best” screening procedure for each cluster, while ensuring that the learned screening policy satisfies a predefined accuracy requirement with a high level of confidence for every cluster. We have already demonstrated that, when applied to real-world data, ConfidentCare outperforms current clinical practice guidelines in terms of cost-efficiency and false positive rates.
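The cluster-then-select idea behind such a system can be illustrated at a deliberately tiny scale. To be clear, this is not the ConfidentCare implementation: the clusters, test names, costs, and accuracy numbers below are all fabricated, and the real system learns clusters from patient features and certifies accuracy with statistical confidence guarantees.

```python
# Toy sketch of per-cluster screening policy selection (NOT the
# ConfidentCare system). All names and numbers are fabricated.
cost = {"mammogram": 1.0, "ultrasound": 1.5, "mri": 6.0}  # relative costs

# Hypothetical validation accuracy of each test within each patient
# cluster; in practice the clusters are learned from healthcare data.
acc = {
    "low_risk":  {"mammogram": 0.95, "ultrasound": 0.89, "mri": 0.98},
    "mid_risk":  {"mammogram": 0.91, "ultrasound": 0.88, "mri": 0.97},
    "high_risk": {"mammogram": 0.83, "ultrasound": 0.90, "mri": 0.96},
}

REQUIRED_ACC = 0.90  # the predefined accuracy requirement

def screening_policy(cluster_acc):
    """For each cluster, pick the cheapest test meeting the accuracy bar."""
    policy = {}
    for cluster, test_acc in cluster_acc.items():
        feasible = [t for t, a in test_acc.items() if a >= REQUIRED_ACC]
        policy[cluster] = min(feasible, key=lambda t: cost[t])
    return policy

print(screening_policy(acc))
# High-risk patients are routed to a costlier procedure because the
# cheapest test misses the accuracy requirement for their cluster.
```

The key point the sketch preserves is that the policy is constrained first (meet the accuracy requirement for every cluster) and optimized second (minimize cost), rather than applying one procedure to everyone.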

Personalized screening policies will sometimes lead to recommendations to screen more often, or to use more accurate but perhaps more invasive or more expensive screening procedures. Sometimes they will recommend exactly the opposite, because many low-risk patients do not need to be screened as often and do not need costly or invasive screening procedures. Less frequent screening of patients at low risk frees up scarce resources to provide more frequent screening for patients at high risk. In addition, machine learning can learn the value of screening for the particular patient at hand, and how that value evolves over time. And it’s not only the timing of screening that should be individualized: machine learning can learn the relative value of various types of information (and how that value changes over time) and hence learn that some patients require a particular method of screening or follow-up while others do not.

More about the underlying research

ConfidentCare: A Clinical Decision Support System for Personalized Breast Cancer Screening
Ahmed Alaa, Kyeong H. Moon, William Hsu, Mihaela van der Schaar (IEEE Transactions on Multimedia, 2016)

DPSCREEN: Dynamic Personalized Screening
Kartik Ahuja, William Zame, Mihaela van der Schaar (NeurIPS 2017)

Chapter 10, Chief Medical Officer annual report 2018: better health within reach
William Zame, Mihaela van der Schaar (published 2018)

Deep Sensing: Active Sensing using Multi-directional Recurrent Neural Networks
Jinsung Yoon, William R. Zame, Mihaela van der Schaar (ICLR 2018)

Ensuring timely diagnosis by understanding decision-making

Providing personalized screening policies, as described above, is an important first step to identifying patients who may already have undiagnosed diseases or conditions, such as breast cancer. Clinicians still, however, need to make a sequence of diagnostic decisions: if a patient has been flagged for screening or presents with a set of symptoms, the clinician must decide which diagnostic tests should be prescribed, in which order they should be taken, and when an official diagnosis should be declared.

No two patients or clinicians are alike, which may explain why diagnosis is an area in which regional and institutional variation in clinical practice is commonplace. This variation has been shown to contribute substantially to healthcare costs.

This is an issue that can be addressed and mitigated by quantifying the different decision-making processes of clinicians and observing their demonstrated behavior. By observing the sequences of diagnostic tests prescribed, we can learn about the tradeoffs implicitly encoded into the patterns of diagnostic decision making. This applies, in particular, to the timing and motivators of decisions made regarding the cost-benefit balance of acquiring information—for example, when running diagnostic tests that may potentially be costly.

Our own lab is beginning to conduct research in this area. We have developed and validated a number of tools that could be used to audit and improve decision-making behavior. In the case of a breast cancer patient, for example, such tools could enable us to identify a patient whose diagnosis might have been delayed as a result of a belated decision to run a test. More generally, we could quantify and compare the priorities of decision-makers such as clinicians, or detect hidden biases. We could learn, for example, which diseases clinicians consider more important to diagnose correctly, or which tests tend to be over-prescribed. Such learnings can be used to improve consistency and quality of practice.
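A minimal audit in this spirit can be sketched as follows. This is not one of the cited methods, just an illustration of surfacing practice variation from logged episodes; the clinicians, tests, and timings are entirely fabricated.

```python
# Minimal decision-making audit sketch (not the cited methods):
# summarize logged diagnostic episodes per clinician to surface
# variation in time-to-diagnosis and test usage. Data are fabricated.
episodes = [
    # (clinician, tests_ordered, days_to_diagnosis)
    ("dr_a", ["exam", "mammogram"], 5),
    ("dr_a", ["exam", "mammogram", "biopsy"], 12),
    ("dr_b", ["exam", "mammogram", "mri", "biopsy"], 30),
    ("dr_b", ["exam", "mri", "mri", "biopsy"], 41),
]

def audit(episodes, test):
    """Per clinician: average days to diagnosis and `test` orders per case."""
    stats = {}
    for clinician, tests, days in episodes:
        s = stats.setdefault(clinician, {"n": 0, "days": 0, "orders": 0})
        s["n"] += 1
        s["days"] += days
        s["orders"] += tests.count(test)
    return {c: {"avg_days": s["days"] / s["n"],
                "orders_per_case": s["orders"] / s["n"]}
            for c, s in stats.items()}

print(audit(episodes, "mri"))
# One clinician both orders more MRIs and reaches a diagnosis later;
# a real audit would then ask whether the extra tests were warranted.
```

The published work goes much further, modeling the implicit cost-benefit tradeoffs behind such patterns rather than merely tabulating them.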

More about the underlying research

Balancing Suspense and Surprise: Timely Decision Making with Endogenous Information Acquisition
Ahmed M. Alaa, Mihaela van der Schaar (NeurIPS 2016)

ASAC: Active Sensing using Actor-Critic models
Jinsung Yoon, James Jordon, Mihaela van der Schaar (MLHC 2019)

Inverse Active Sensing: Modeling and Understanding Timely Decision-Making
Daniel Jarrett, Mihaela van der Schaar (ICML 2020)

Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning
Alihan Hüyük, Daniel Jarrett, Mihaela van der Schaar (submitted for publication, 2020)

Diagnosis: saving lives by catching cancer earlier

The sections above demonstrate how machine learning can help identify and flag at-risk patients, while also understanding and improving decision-making regarding the appropriate course of action for diagnosis. The diagnosis process itself can also benefit from the application of machine learning: while this capability is extremely well-documented (often to the exclusion of other notable achievements), it is still worth mentioning as part of a larger whole.

Machine learning algorithms have demonstrated a clear ability to aid radiologists in accurately identifying images that are problematic (i.e. those featuring areas that appear suspicious), and numerous studies have shown that the performance of radiologists is significantly improved when using a deep learning computer system as decision support in breast cancer diagnosis. This directly supports early diagnosis of breast cancer, thereby increasing the variety of available treatments and likelihood of survival.

We have explored the potential of such tools in a variety of studies, perhaps the most notable of which led to the development of a triaging system called MVMT, which has the potential to save countless hours for overworked radiologists and give them more time to focus on the difficult and complex cases that warrant additional scrutiny.

Results obtained on a private dataset show that MVMT reduced the number of radiologist readings required by 43% while improving the overall diagnostic accuracy in comparison to readings done by radiologists alone.
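The triage principle can be sketched very simply. This is not the MVMT code: it is a generic confidence-based triage loop, and the threshold and probability scores are illustrative only.

```python
# Minimal confidence-based triage sketch (not the MVMT system): scans
# the model is highly confident are normal get auto-cleared; everything
# else is queued for a radiologist. Threshold and scores are made up.
def triage(cases, threshold=0.99):
    """cases: list of (case_id, p_normal). Returns (auto_cleared, to_read)."""
    auto_cleared = [cid for cid, p in cases if p >= threshold]
    to_read = [cid for cid, p in cases if p < threshold]
    return auto_cleared, to_read

cases = [("a", 0.999), ("b", 0.42), ("c", 0.995), ("d", 0.87)]
cleared, queue = triage(cases)
print(cleared, queue)  # the radiologist's queue shrinks to uncertain cases
```

In a clinical deployment the threshold would be set conservatively and validated so that auto-cleared cases carry a negligible miss rate.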

More about the underlying research

Multi-view Multi-task Learning for Improving Autonomous Mammogram Diagnosis
Trent Kyono, Fiona J. Gilbert, Mihaela van der Schaar (MLHC 2019)

Getting the best results from the referral process

Once a patient has been screened and diagnosed, a number of key decisions need to be made regarding their options for care. In the case of a disease such as breast cancer, it is often the case that numerous specialists need to be appointed. The referral process, however, is far from simple, and variation is commonplace. In current clinical practice, patients are referred to experts in an ad-hoc manner based on the patient’s signs and symptoms, the patient’s or primary care physician’s preference, the insurance plan, and the availability of the physician. As a result, these decisions can be made in a somewhat arbitrary and potentially inefficient manner, meaning that patients may not end up seeing the right person.

A few years ago, we introduced a machine learning technique to optimize clinical workflows by personalizing the process of matching new patient cases with appropriate diagnostic expertise from a range of domain experts who specialize in similar types of cases. While our algorithm is general and can be applied in numerous medical scenarios, we illustrated its functionality and performance by applying it to a real-world breast cancer diagnosis dataset.

Such an approach also lends itself quite naturally to the task of composing a team of diverse experts (for example, an oncologist, a radiologist, and a general practitioner) to treat patients suffering from complex diseases such as breast cancer.

Operation of the proposed system for clinic i. In this example, the diagnostic decision for the patient is made by a human expert from clinic j. Then, after some time the true health state of the patient is revealed. Based on this, clinic i updates the diagnostic accuracy of clinic j for that context, while clinic j updates the diagnostic accuracy of its expert f.

When implemented in a clinical context, this kind of system could feature safeguards: for example, referral recommendations made by the system will be examined by a clinician before the final prediction is made. For any patient, it would therefore remain the clinician’s discretion whether to rely on or to disregard the recommendation of the algorithm. Similarly, recommendations could take into account the patient load of individual clinicians, ensuring that workloads remain as manageable as possible.
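The accuracy-tracking feedback loop described above can be sketched in a few lines. This is a hedged illustration, not the lab's algorithm: a simple running-accuracy tracker with an optimistic default for unseen experts, and all the expert and context names are invented.

```python
# Hedged sketch of the referral feedback loop (not the lab's algorithm):
# keep a running diagnostic-accuracy estimate per (context, expert),
# refer to the current best expert, and update the estimate once the
# true health state is revealed. Names and contexts are made up.
from collections import defaultdict

class ReferralTracker:
    def __init__(self, experts):
        self.experts = experts
        # (context, expert) -> [correct_diagnoses, total_cases]
        self.stats = defaultdict(lambda: [0, 0])

    def recommend(self, context):
        def score(expert):
            correct, total = self.stats[(context, expert)]
            return correct / total if total else 1.0  # optimistic default
        return max(self.experts, key=score)

    def update(self, context, expert, was_correct):
        s = self.stats[(context, expert)]
        s[0] += int(was_correct)
        s[1] += 1

tracker = ReferralTracker(["expert_a", "expert_b"])
tracker.update("dense_tissue", "expert_a", False)
tracker.update("dense_tissue", "expert_b", True)
print(tracker.recommend("dense_tissue"))  # future similar cases go to expert_b
```

A production system would balance exploration against exploitation more carefully (this is a contextual-bandit problem) and, as noted above, keep the clinician in the loop for every recommendation.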

Treatment effect estimation: developing effective and personalized treatment plans

Once a patient has been diagnosed with a disease such as breast cancer and referred to a specialist (or a group of specialists), a treatment plan for that patient needs to be developed. This involves identifying when to give treatments, and how to select among multiple treatments over time. These are important and complex problems with few existing solutions. Choosing the best treatment for a particular patient requires estimating the effects of a variety of treatments for that particular patient. To accomplish this, we need to understand personalized treatment effects and not just average treatment effects. Unfortunately, this is essentially still beyond the capabilities of state-of-the-art medical knowledge.

Recent advances in machine learning, however, have led to tools that are capable of learning personalized treatment effects by beginning where clinical trials leave off: learning from observational data captured in clinical practice.

The impact of accurate prediction of personalized treatment effects is hard to overestimate, because it provides a basis for the choice of treatment according to the features of the particular patient at hand. This leads to improved outcomes for patients, who receive the best treatment, and to improved use of resources (i.e. not wasting scarce resources on patients who will receive little or no additional benefit from them). Such methods can produce predictions that come with statistically valid confidence intervals, which are invaluable in a clinical context.

When applied to breast cancer, we can use machine learning to draw upon observational data to devise and predict the likely outcome of multiple treatment courses. This includes determination of which specific treatments will likely need to be given at various points in time. This allows clinicians to “mix and match” combinations of treatments, such as chemotherapy and radiotherapy, to maximize benefit and minimize risk.
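The distinction between average and individualized treatment effects can be made concrete with a toy stratified estimate. This is not one of the models cited below, just a minimal two-arm comparison within patient strata, and the records are fabricated (a real method must also correct for confounding in observational data).

```python
# Toy illustration of individualized vs average treatment effects
# (NOT the cited models): estimate mean outcome per treatment arm
# within each patient stratum, then read off the per-stratum effect.
# All records are fabricated; real data would need confounder control.
def fit_arm_means(records):
    """records: list of (stratum, treated, outcome);
    returns mean outcome per (stratum, treated) cell."""
    sums = {}
    for stratum, treated, y in records:
        s = sums.setdefault((stratum, treated), [0.0, 0])
        s[0] += y
        s[1] += 1
    return {k: v[0] / v[1] for k, v in sums.items()}

def individual_effect(means, stratum):
    return means[(stratum, True)] - means[(stratum, False)]

records = [
    ("young", True, 0.9), ("young", True, 0.8), ("young", False, 0.5),
    ("old",   True, 0.4), ("old",   False, 0.6), ("old",   False, 0.5),
]
means = fit_arm_means(records)
print(individual_effect(means, "young"))  # benefit for younger patients
print(individual_effect(means, "old"))    # possible harm for older ones
```

The average effect across the cohort would blur these two opposite conclusions together, which is exactly why personalized estimates matter for treatment choice.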

More about the underlying research

Bayesian Inference of Individualized Treatment Effects using Multi-task Gaussian Processes
Ahmed M. Alaa, Mihaela van der Schaar (NeurIPS 2017)

GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets
Jinsung Yoon, James Jordon, Mihaela van der Schaar (ICLR 2018)

Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks
Bryan Lim, Ahmed M. Alaa, Mihaela van der Schaar (NeurIPS 2018)

Validating Causal Inference Models via Influence Functions
Ahmed M. Alaa, Mihaela van der Schaar (ICML 2019)

Estimating counterfactual treatment outcomes over time through adversarially balanced representations
Ioana Bica, Ahmed M. Alaa, James Jordon, Mihaela van der Schaar (ICLR 2020)

A full list of papers on the topic of causal inference and treatment effect estimation can be found here.

Survival analysis with competing risks

Treatment decisions of the nature outlined immediately above, as well as ongoing patient management, need to factor in the evolving likelihood of survival over time, and must also include the possibility of complications arising from competing risks. The problem of survival analysis with competing risks has recently gained significant attention in the medical community due to the realization that many chronic diseases possess a shared biology. For example, cancer and cardiovascular disease (CVD) possess various similarities and possible interactions, including a number of similar risk factors. In addition, many existing cancer therapies increase a patient’s risk for CVD. At the same time, conventional methods for survival analysis, such as the Kaplan-Meier method and standard Cox proportional hazards regression, are not equipped to handle competing risks.
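The core accounting difference can be shown with a toy empirical cumulative incidence function (CIF). This is not the lab's modeling work, merely the textbook quantity that competing-risk methods estimate: each event is attributed to its cause, rather than analyzing one disease as if the other did not exist. The tiny uncensored cohort is fabricated.

```python
# Toy empirical cumulative incidence function for competing risks
# (not one of the lab's models): attribute each event to its cause.
# The cohort below is fabricated and, for simplicity, uncensored.
def cumulative_incidence(events, cause, t):
    """events: list of (time, cause). Fraction of the cohort that has
    experienced `cause` by time t (valid here because no censoring)."""
    n = len(events)
    return sum(1 for time, c in events if c == cause and time <= t) / n

cohort = [(2, "cvd"), (3, "cancer"), (5, "cvd"), (7, "cancer"), (9, "cvd")]
print(cumulative_incidence(cohort, "cvd", 5))     # 0.4
print(cumulative_incidence(cohort, "cancer", 5))  # 0.2
```

With censoring, and with the heterogeneity described above, this simple counting breaks down; that is the gap the machine learning models discussed next are designed to fill.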

It is obviously important that patients who are at risk of both cancer and CVD be provided with a joint prognosis of mortality due to the two competing diseases in order to properly manage therapeutic interventions. This is a challenging problem, since CVD patient cohorts are very heterogeneous; CVD exhibits complex phenotypes for which mortality rates can vary as much as 10-fold among patients in the same phenotype.

Machine learning, however, can accurately model survival of patients in such a highly heterogeneous cohort, while treating CVD and cancer as competing risks. We have demonstrated this with our lab’s own survival models for competing risks, DeepHit and Dynamic-DeepHit, which offer personalized actionable prognoses that clinicians can use to design personalized treatment plans. Experiments on real-world data have demonstrated that our models outperform state-of-the-art survival models.

We have also designed survival analysis tools whose predictions are the cumulative product of many different models, “weighted” to prioritize the best-performing ones and thereby yield the best possible result. This kind of versatility and adaptability (being able to compare and choose between many tools at once to get the best prediction) is a particular strength of machine learning.
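The weighting idea can be sketched generically. This is a plain validation-score-weighted average, not the lab's temporal quilting method, and the model names and scores are invented for illustration.

```python
# Generic performance-weighted model averaging sketch (not the lab's
# temporal quilting method). Models, risks, and scores are invented.
def weighted_prediction(predictions, scores):
    """predictions: {model: predicted risk};
    scores: {model: validation score, e.g. a concordance index}.
    Returns the score-weighted average risk."""
    total = sum(scores.values())
    return sum(predictions[m] * scores[m] / total for m in predictions)

preds = {"cox": 0.30, "deephit": 0.50, "boosting": 0.40}
cindex = {"cox": 0.65, "deephit": 0.80, "boosting": 0.75}  # made-up scores
print(round(weighted_prediction(preds, cindex), 4))
# The combined risk sits closest to the best-scoring models' predictions.
```

The published method is considerably more refined, weighting models differently across time horizons rather than with a single global score.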

More about the underlying research

Deep Multi-task Gaussian Processes for Survival Analysis with Competing Risks
Ahmed M. Alaa, Mihaela van der Schaar (NeurIPS 2017)

DeepHit: A Deep Learning Approach to Survival Analysis with Competing Risks
Changhee Lee, William Zame, Jinsung Yoon, Mihaela van der Schaar (AAAI 2018)

Multitask Boosting for Survival Analysis with Competing Risks
Alexis Bellot, Mihaela van der Schaar (NeurIPS 2018)

Temporal Quilting for Survival Analysis
Changhee Lee, William Zame, Ahmed Alaa, Mihaela van der Schaar (AISTATS 2019)

Dynamic-DeepHit: A Deep Learning Approach for Dynamic Survival Analysis With Competing Risks Based on Longitudinal Data
Changhee Lee, Jinsung Yoon, Mihaela van der Schaar (IEEE Transactions on Biomedical Engineering, 2020)

Cystic fibrosis

As a congenital chronic disease, cystic fibrosis presents a range of challenges that are distinct from those posed by a disease such as cancer. Patients require long-term monitoring and complex treatment regimens, especially given the increasing likelihood of multimorbidity with age. Again, however, these complexities can be approached more effectively by clinicians if they are armed with machine learning tools that can model the entire health trajectory of each individual patient.

Prognosis: getting the timing right for lung transplant referrals

Despite recent therapeutic advances that have significantly improved cystic fibrosis prognosis, only half of the current cystic fibrosis population are expected to live beyond the age of 40.

When the respiratory failure of individuals with cystic fibrosis becomes severe, a decision must be taken on whether to refer them for a lung transplant—a major operation that may increase their life expectancy, but which comes with significant risks of its own. The timing of this referral is crucial.

The decision to refer is typically based on forced expiratory volume (FEV): when this falls below 30% of healthy levels, a patient is generally deemed in need of a transplant. Yet research has revealed that just under half of the people referred in this way actually have a sufficiently urgent need for a transplant.

Our lab, in collaboration with the Cystic Fibrosis Trust, has been applying machine learning methods to improve this referral process. Specifically, we developed an algorithmic framework and provided it with data for the 115 variables associated with patients over three years. We then allowed the framework to discover for itself which variables were most important and, crucially, how they interacted with each other to produce a variety of patient outcomes.

The end result is a tool that suggests correct referrals two-thirds of the time, which represents a remarkable 35% improvement in accuracy over traditional methods. In other words, in a lung transplant waiting list of 100 patients, our model would replace 17 patients who were unnecessarily referred for a transplant with 17 other patients who truly needed one. Once again, this demonstrates the potential for machine learning techniques—in the hands of clinicians—to learn more deeply and from a broader range of information than is currently possible.
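The gain from considering interacting variables rather than a single threshold can be caricatured with a fabricated example. The patients, the second variable, and the hand-written combined rule below are all invented; they stand in for what the learned framework discovers automatically from its 115 variables.

```python
# Fabricated contrast between the single-threshold FEV rule and a rule
# combining variables (a hand-written stand-in for the learned model).
patients = [
    # (fev_pct, infections_per_year, truly_needs_transplant)
    (25, 4, True), (28, 0, False), (35, 5, True),
    (29, 1, False), (27, 3, True), (33, 0, False),
]

def fev_rule(p):
    return p[0] < 30  # refer iff FEV below 30% of healthy levels

def combined_rule(p):
    # interplay of variables: low FEV plus recurring infections, or
    # moderate FEV with very frequent infections
    return (p[0] < 30 and p[1] >= 2) or (p[0] < 40 and p[1] >= 5)

def accuracy(rule):
    return sum(rule(p) == p[2] for p in patients) / len(patients)

print(accuracy(fev_rule), accuracy(combined_rule))
# The single threshold both over-refers and under-refers on this toy
# cohort; the combined rule separates the two groups.
```

On real data no rule is perfect, of course; the point is that referral quality improves when more variables, and their interactions, are allowed to inform the decision.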

Side note: we discuss organ transplants in more depth later on.

More about the underlying research

Prognostication and Risk Factors for Cystic Fibrosis via Automated Machine Learning
Ahmed M. Alaa, Mihaela van der Schaar (Nature Scientific Reports, 2018)

Longitudinal disease trajectories

In addition to providing transplant decision support through prognosis as outlined above, machine learning tools can efficiently and effectively predict disease trajectories for cystic fibrosis patients. This can help inform treatment decisions and generally make cystic fibrosis a less disruptive force in the lives of its patients.

Currently available risk prediction methods are limited in their ability to deal with complex, heterogeneous, and longitudinal data such as those available in primary care records, or in their ability to deal with multiple competing risks. They utilize only a small fraction of the available longitudinal (repeated) measurements of biomarkers and other risk factors. In particular, even though biomarkers and other risk factors are measured repeatedly over time, survival analysis is typically based on the last available measurement—despite the informative nature of the evolution of biomarkers and risk factors.

In the case of cystic fibrosis, FEV is a crucial biomarker for assessing the severity of the disease: tracking its development over time allows clinicians to describe the progression of the disease and to anticipate the occurrence of respiratory failures. Therefore, to provide a better understanding of disease progression, it is essential to incorporate longitudinal measurements of key biomarkers and risk factors into a model. Rather than discarding valuable information recorded over time, this allows us to make better risk assessments of clinical events. To this end, the advent of modern electronic health records provides an opportunity for building models of disease progression that can predict individual-level disease trajectories, and distill intelligible and actionable representations of disease dynamics.
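The cost of keeping only the last measurement is easy to show with two fabricated FEV series. This sketch just summarizes a trajectory with its last value and average slope; the real models learn far richer temporal representations.

```python
# Sketch of why trajectories beat last measurements (illustrative
# only): summarize a biomarker series by last value plus average rate
# of change. Both FEV series below are fabricated.
def trajectory_features(series):
    """series: [(time, value), ...] sorted by time.
    Returns the last value and the average rate of change."""
    (t0, v0), (tn, vn) = series[0], series[-1]
    return {"last": vn, "slope": (vn - v0) / (tn - t0)}

stable    = [(0, 60), (1, 59), (2, 60), (3, 59)]
declining = [(0, 80), (1, 73), (2, 66), (3, 59)]

# Both patients present with the same last FEV reading, but their
# trajectories tell very different stories.
print(trajectory_features(stable))
print(trajectory_features(declining))
```

A model fed only the last measurement would treat these two patients identically, even though one is stable and the other is declining rapidly.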

Our own lab’s work has resulted in the creation of several models capable of predicting disease trajectories, including under competing risks. Real-world experiments have shown that these models demonstrate superior predictive accuracy compared to existing approaches, in addition to providing insights into disease progression dynamics.

More about the underlying research

Attentive State-Space Modeling of Disease Progression
Ahmed M. Alaa, Mihaela van der Schaar (NeurIPS 2019)

Dynamic-DeepHit: A Deep Learning Approach for Dynamic Survival Analysis With Competing Risks Based on Longitudinal Data
Changhee Lee, Jinsung Yoon, Mihaela van der Schaar (IEEE Trans Biomed Eng. 2020)

Clairvoyance: a Unified, End-to-End AutoML Pipeline for Medical Time Series
Daniel Jarrett, Jinsung Yoon, Ioana Bica, Zhaozhi Qian, Ari Ercole, Mihaela van der Schaar (submitted for publication, 2020)

Flexible Modelling of Longitudinal Medical Data: A Bayesian Nonparametric Approach
Alexis Bellot, Mihaela van der Schaar (ACM Transactions on Computing for Healthcare, 2020)

A full list of papers on the topic of time-series analysis can be found here.

Screening for comorbidities

Even if we are able to optimize organ transplant matching and timing while successfully predicting the evolution of disease trajectories over time, the development of new comorbidities remains a particularly worrisome long-term risk for cystic fibrosis patients.

An additional potential application of machine learning for cystic fibrosis patients is the development of personalized screening for comorbidities. Techniques our lab has developed allow us, for example, to assess and predict a patient’s likelihood over time of developing comorbidities such as diabetes. This can inform the frequency and nature of screening regimens, adapted to reflect the characteristics of individual patients and designed to balance the cost of screening against the risks of delayed detection.
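
The cost-versus-delay trade-off can be made concrete with a deliberately simplified sketch (all costs, risks, and candidate intervals below are hypothetical, and this is not our lab's screening methodology): more frequent screening costs more, but a longer interval means that a comorbidity which does develop goes undetected for longer on average.

```python
def expected_cost(interval_months, annual_risk, screen_cost, delay_cost):
    """Expected yearly cost: screening visits plus expected detection delay.
    If onset occurs, it goes undetected for interval/2 months on average."""
    screens_per_year = 12 / interval_months
    expected_delay_cost = annual_risk * (interval_months / 2) * delay_cost
    return screens_per_year * screen_cost + expected_delay_cost

def best_interval(annual_risk, screen_cost=100, delay_cost=400,
                  candidates=(3, 6, 12, 24)):
    """Pick the candidate screening interval minimizing expected cost."""
    return min(candidates,
               key=lambda m: expected_cost(m, annual_risk,
                                           screen_cost, delay_cost))
```

Under these toy numbers, a high-risk patient is assigned a short interval and a low-risk patient a long one — the personalization our models perform with far richer risk estimates.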

Side note: we discuss multimorbidity in more depth later on.

More about the underlying research

Disease-Atlas: Navigating Disease Trajectories using Deep Learning
Bryan Lim, Mihaela van der Schaar (MLHC 2018)

In-hospital critical care

While the examples above show the benefits of machine learning in the context of specific diseases, I would also like to show the same benefits from the perspective of specific clinical settings. One such setting is in-hospital critical care: every day, a multitude of difficult decisions regarding patient health and resource allocation must be made by both clinicians and administrators. These decisions affect the whole hospital’s ability to operate.

Early warning systems: inferring the unobservable

Each year in the U.S., roughly 150,000 people die from sudden cardiac arrest in hospitals. At least 50,000 of these deaths are considered preventable, and the American Heart Association has identified in-hospital cardiac arrest as a public health crisis. Timely prognosis and intervention could help ensure earlier ICU admission, dramatically increasing the odds of survival.

Standard medical techniques are generally unable to detect a patient’s true state: in the case of a patient whose ICU admission is too late, their true state may have been hidden by their seemingly stable observable data. What machine learning can offer is a framework for taking observable physiological data (from vital signs, lab tests and admission information) and integrating it to make real-time inferences about diagnosis and prognosis for individual patients—including predicting whether or not they will shift from one state to another.
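
The idea of inferring an unobservable state from observable data can be illustrated with a toy two-state filter (the states, transition probabilities, and observation probabilities below are invented for illustration; our actual models, cited below, are considerably richer): as abnormal readings accumulate, the belief that the patient is in a hidden "deteriorating" state rises, even before any single reading is alarming on its own.

```python
TRANS = {"stable": {"stable": 0.95, "deteriorating": 0.05},
         "deteriorating": {"stable": 0.10, "deteriorating": 0.90}}
# P(reading looks abnormal | hidden state) -- illustrative values only
P_ABNORMAL = {"stable": 0.10, "deteriorating": 0.70}

def update_belief(belief, obs_abnormal):
    """One forward-filtering step: predict, then weight by the observation."""
    predicted = {s: sum(belief[p] * TRANS[p][s] for p in belief)
                 for s in belief}
    like = lambda s: P_ABNORMAL[s] if obs_abnormal else 1 - P_ABNORMAL[s]
    unnorm = {s: predicted[s] * like(s) for s in predicted}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

belief = {"stable": 0.99, "deteriorating": 0.01}
for obs in [False, True, True, True]:  # vitals start normal, then turn abnormal
    belief = update_belief(belief, obs)
```

After three consecutive abnormal readings, the filter's belief in the hidden deteriorating state exceeds 90%, despite starting at 1% — a crude analogue of how a model can flag a patient before their decline is obvious.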

Our lab’s work so far has yielded a new dynamic survival prediction technique that can estimate a patient’s likelihood of survival as data are gathered over time. We first validated this model, which we named ForecastICU, at UCLA’s Ronald Reagan Medical Center, where it significantly outperformed the four risk scoring techniques commonly used in hospitals, consistently providing more accurate results when predicting which patients would be admitted to the ICU and which would be discharged.

While there is still scope for improvement, it is clear that such approaches are already capable of predicting whether or not a patient will require ICU admission more accurately than any existing technique. The real-world implications are enormous: in scenarios where minutes count, we can save hours, and lives.

More about the underlying research

ForecastICU: A Prognostic Decision Support System for Timely Prediction of Intensive Care Unit Admission
Jinsung Yoon, Ahmed M. Alaa, Scott Hu, Mihaela van der Schaar (ICML 2016)

Learning from Clinical Judgments: Semi-Markov-Modulated Marked Hawkes Processes for Risk Prognosis
Ahmed M. Alaa, Scott Hu, Mihaela van der Schaar (ICML 2017)

A Hidden Absorbing Semi-Markov Model for Informatively Censored Temporal Data: Learning and Inference
Ahmed M. Alaa, Mihaela van der Schaar (JMLR 2017)

Personalized Risk Scoring for Critical Care Prognosis using Mixtures of Gaussian Processes
Ahmed M. Alaa, Jinsung Yoon, Scott Hu, Mihaela van der Schaar (IEEE Transactions on Biomedical Engineering, 2018)

Resource allocation: making efficient, timely, and effective use of scarce equipment

The ongoing COVID-19 pandemic has highlighted the need for systems that can predict and plan the efficient usage of scarce resources such as ventilators and ICU beds. Since early this year, we have seen that, even if social policies help take the strain off healthcare systems at a national level, there’s no guarantee that individual hospitals (or entire regions) won’t still be stretched well beyond capacity. It’s important to ensure that hospitals remain armed with information that will help them manage peaks in demand for resources.

If we have access to high-quality datasets containing such information, we can use machine learning to answer questions such as:
– Which patients are most likely to need ventilators within a week?
– How many free ICU beds is this hospital likely to have in a week?
– Which of these two patients will get the most benefit from going on a ventilator today?
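
The hospital-level view can be built from patient-level scores in a simple way. The following sketch uses entirely hypothetical risk scores and is not CPAS itself: summing per-patient probabilities gives the expected number of ventilators needed, and the spread of those probabilities gives a crude planning margin.

```python
import math

# Hypothetical per-patient risk scores: probability of needing a
# ventilator within a week.
patient_risks = [0.05, 0.10, 0.70, 0.40, 0.85, 0.15, 0.60, 0.05]

# Mean of a sum of Bernoulli outcomes = sum of the probabilities.
expected_demand = sum(patient_risks)
# Variance of independent Bernoulli outcomes gives a rough safety margin.
std = math.sqrt(sum(p * (1 - p) for p in patient_risks))
planning_level = expected_demand + 2 * std  # e.g. plan for mean + 2 SD
```

Real systems must also account for new admissions, discharges, and correlations between patients, but even this simple aggregation shows how individual risk scores become an actionable picture of future demand.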

Our lab provided just such a system to NHS Digital and Public Health England in a partnership we entered back in April (we named our underlying model Cambridge Adjutorium; it was implemented within the NHS under the name CPAS). The automated machine learning pipeline in question assigns risk scores to COVID-19 patients based on their likelihood of ICU admission or ventilator usage. These are then aggregated across the hospital to give a picture of future demand on resources.

This forewarns clinicians and hospital administrators alike regarding upcoming resource shortages, giving them valuable time to adjust their operations or make arrangements to share resources with neighboring healthcare facilities.

Systems such as these can provide both administrators and clinicians with actionable information to support their decision-making. Patients will also benefit, since efficient allocation of critical care resources will save lives.

Navigating the complexities of multimorbidity

In the examples above, I have shown how machine learning can provide end-to-end support in treating single diseases, or even factoring comorbidities into treatment of a single disease. There is also, of course, a pressing need for tools that address multimorbidity. Multimorbidity affects a sizeable proportion of the global population, and is one of the most substantial burdens on medical resources and public health—especially since multimorbid patients tend to be the sickest, requiring extensive long-term care. It is a particularly complex challenge that may appear almost insurmountable using conventional medical and statistical approaches alone. We can, however, apply machine learning to tackle multimorbidity in all its complexity.

Personalized, dynamic morbidity networks

Current models of comorbidities in the medical literature vary from somewhat simplistic statistical approaches to slightly more advanced methods that involve mapping interactions between diseases, but in a static, non-personalized manner. The reality is that comorbid diseases co-occur and progress via complex temporal patterns that vary among individuals. The causal structure of relationships between diseases can be represented by networks that are dynamic in nature. For instance, long-term diabetes increases the risk of cardiovascular and renal disease, making high blood pressure and its complications, such as heart attacks, more likely.

Our own lab has been developing personalized and dynamic methods for mapping morbidity networks over time. The aim is to understand how the progression of various diseases may change for patients with different characteristics, and how specific morbidities (or groups of morbidities) may or may not affect the subsequent occurrence of other morbidities, or the strength of those effects. A problem like this is computationally extremely complex, meaning machine learning can succeed where humans cannot. In addition to mapping the causal and temporal relationships between comorbidities, we can then identify from data which patients are similar, and which are not.
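
The underlying data structure can be pictured as a directed graph in which each diagnosed condition changes the risk of others. The toy sketch below (the conditions and multipliers are illustrative only, and this is not our lab's deep diffusion model) shows how a patient's accumulated history yields a personalized risk for the next condition:

```python
# Directed edges: (existing condition, condition it influences) -> hazard multiplier
HAZARD_MULTIPLIER = {
    ("diabetes", "cardiovascular disease"): 2.0,
    ("diabetes", "renal disease"): 1.8,
    ("hypertension", "cardiovascular disease"): 1.6,
}

def adjusted_hazard(baseline, target, history):
    """Scale a baseline hazard by the patient's accumulated diagnoses."""
    h = baseline
    for condition in history:
        h *= HAZARD_MULTIPLIER.get((condition, target), 1.0)
    return h

# Same baseline, different histories -> different personalized risks:
r1 = adjusted_hazard(0.01, "cardiovascular disease", ["diabetes"])
r2 = adjusted_hazard(0.01, "cardiovascular disease", ["diabetes", "hypertension"])
```

In our actual work, both the network structure and the strengths of these influences are learned from data and evolve over time, rather than being fixed multipliers.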

Our experiments using this model have shown that such approaches can offer a more nuanced understanding of disease progression mechanisms, more accurate prediction of patient pathways, and better generalizability across different diseases. By taking into account the full disease history, comorbidity networks of this nature are well equipped to deal with individual-level disease trajectories in a data-driven fashion, improving over existing one-size-fits-all clinical guidelines.

Additionally, by applying the “recommender systems” described above, tools such as these can provide consistent but individually tailored reports that display only the information relevant to the user at hand. For example, an oncologist and a cardiologist could receive reports on a given patient, and—despite both reports being based on the same underlying information—would each see the information most relevant to their own specialization. In this way, machine learning can not only advance our ability to treat multimorbid patients, but also improve collaboration between the clinicians treating those patients—all in a systematic and consistent manner.

More about the underlying research

Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes
Zhaozhi Qian, Ahmed M. Alaa, Alexis Bellot, Mihaela van der Schaar, Jem Rashbass (AISTATS 2020)

Distributed Online Learning in Social Recommender Systems
Cem Tekin, Simpson Zhang, Mihaela van der Schaar (IEEE Journal of Selected Topics in Signal Processing, 2014)

Online Learning in Large-Scale Contextual Recommender Systems
Linqi Song, Cem Tekin, Mihaela van der Schaar (IEEE Transactions on Services Computing, 2016)

Optimizing transplant survival

Transplants, like critical care, are an area of medicine in which life-and-death decisions are frequently made regarding the use of scarce and valuable resources. In the U.S., an average of 17 patients died every day in 2018 while waiting for an organ transplant. As with the ICU equipment described above, donor lists force medical professionals to make tough decisions regarding the best use of limited resources (in this case, the organs themselves).

Making the right decisions with limited organs

In broad terms, the question of “Who gets the transplant?” should be answered on the basis of urgency and benefit. In other words, the decision relies on an estimate of how long each patient will survive without a transplant, and how much a transplant would improve each patient’s chance of survival. Just as accurate risk scores can save lives by ensuring that patients are admitted to the ICU at the right time, organs can save the most lives if transplantation decisions are based on accurate information.
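
The urgency-and-benefit logic can be reduced to a minimal illustration (the patients and survival figures below are invented, and this is emphatically not a clinical allocation policy): estimate each waitlisted patient's survival with and without a transplant, and rank by the difference.

```python
patients = {
    # name: (predicted years without transplant, predicted years with transplant)
    "A": (0.5, 6.0),  # very urgent, large benefit
    "B": (3.0, 7.0),  # less urgent, moderate benefit
    "C": (0.4, 1.2),  # very urgent, but limited expected benefit
}

def benefit(name):
    """Expected survival gain from transplantation, in years."""
    without_tx, with_tx = patients[name]
    return with_tx - without_tx

ranked = sorted(patients, key=benefit, reverse=True)
```

Note how patient C is the most urgent yet ranks last on benefit — precisely the kind of tension that makes accurate, personalized survival predictions so valuable to allocation decisions.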

Unfortunately, existing prediction methods yield recommendations that are statistically not much more effective than simply guessing which patient’s urgency and benefit are greater. One major obstacle is the fact that clinical scores tend to be calculated on a one-size-fits-all basis, while each patient is obviously different.

Again, though, machine learning can be used to sift through vast quantities of existing data and offer actionable recommendations tailored to individual patients.

To address challenges like these, our lab developed a methodology called ToPs (short for “trees of predictors”). ToPs can offer personalized predictions that address the interaction of multiple features and, importantly, take into account the difference between long-term survival and short-term survival.

The accuracy of ToPs was tested against U.S. heart transplant data from 1985 to 2015. Our model made recommendations based on predicted survival and mortality for roughly 35,000 patients who didn’t receive heart transplants and roughly 60,000 patients who did, and was able to outperform standard risk scoring techniques in predicting survival and mortality rates across a range of timeframes. Prediction of survival and mortality at 3 months after transplant was particularly impressive, with our model outpredicting MAGGIC, a leading heart failure risk calculation technique, by roughly a third. Naturally, this approach could be applied to any type of organ transplant.

Once again, machine learning can make a meaningful difference in the real world by enabling clinical professionals to make better-informed decisions about a waitlisted patient’s likely survival time without transplant, and their likely benefit from receiving a transplant. This, in turn, can save lives by ensuring that organs are used in cases where they are likely to be most effective.

More about the underlying research

Personalized survival predictions via Trees of Predictors: An application to cardiac transplantation
Jinsung Yoon, William Zame, Amitava Banerjee, Martin Cadeiras, Ahmed M. Alaa, Mihaela van der Schaar (PLOS ONE, 2018)

Personalized Donor-Recipient Matching

The probability of a successful transplant depends in a very subtle fashion on compatibility between the donor and the recipient, but current medical practice lacks domain knowledge regarding the complex nature of recipient-donor compatibility. Meanwhile, the advent of electronic health records (EHR) has inspired a data-driven approach in which complex recipient-donor compatibility patterns are discovered from observational data.

This led to the development of an automated system that learns recipient-donor compatibility patterns from EHR data, expressed as the probability of transplant success for a given recipient-donor pair. Clinicians can utilize this as a prognostic tool for managing organ transplant selection decisions: information about the donor and recipient is fed to the system, and the output is the probability of the transplant’s success. Experiments conducted on a public heart transplant dataset have demonstrated the superiority of our model to other competing benchmark algorithms, as well as clinical risk scores and statistical methods.

More about the underlying research

Personalized Donor-Recipient Matching for Organ Transplantation
Jinsung Yoon, Ahmed M. Alaa, Martin Cadeiras, Mihaela van der Schaar (AAAI, 2016)

Personalized survival predictions via Trees of Predictors: An application to cardiac transplantation
Jinsung Yoon, William Zame, Amitava Banerjee, Martin Cadeiras, Ahmed M. Alaa, Mihaela van der Schaar (PLOS ONE, 2018)

As I noted earlier, the tools introduced above are just a few examples that demonstrate how machine learning can revolutionize medicine by empowering decision-makers (clinicians, in this case) with actionable information that’s tailored to individual patients and scenarios. These are the kinds of topics we will be discussing in our initial Revolutionizing Healthcare sessions.

Again, however, there are many other fascinating and important areas we plan to save for specific discussion in later sessions. These include (but are not limited to):
– interpretability (ensuring our models are transparent and understandable)
– trustworthiness (providing confidence estimates)
– clinical guidelines and best practices (helping formulate and validate objective and consistent guidelines)
– data (determining the types of data we need for machine learning, and how we can compile datasets to extract the intelligence we need)
– healthcare economics (maximizing care quality and reducing cost of delivery)
– public health and policymaking
– clinical research (fueling new discoveries and improving efficiency of clinical trials)
– patient engagement with machine learning
– adapting developments, tools, and discoveries from one setting to another (enabling collaboration and sharing of knowledge among institutions, specializations, and countries)
– and many, many more topics that our lab has been working on (some of which I also touched on in 2019, in my chapter for the NHS Topol Review).

All of these deserve their own dedicated Revolutionizing Healthcare sessions because they, too, are essential parts of a much larger system that needs to be designed holistically in order to be effective.

Join us, share your thoughts, and help change healthcare

As I mentioned above, the purpose of our Revolutionizing Healthcare sessions is not only to discuss the kinds of tools that we can currently offer, but also to receive guidance about the tools we still need to build, and how this can be done. Put simply, we need the real-world medical problems to be framed for us. We also need guidance that will enable us to assemble the “toolbox of toolboxes” into a cohesive whole that’s optimally structured for individual users at every point in the healthcare ecosystem.

This is why our hope is that, from our second session onwards, you, the clinicians, will introduce specific problems or challenges to us. We can then discuss and define the topics you raise, examine potential solutions, and figure out how these could be applied in practice. I will then set our lab to the challenge of crafting these solutions, and we can share them with you in subsequent sessions. This would be an invaluable way to ensure that absolutely everything in our “toolbox of toolboxes” has been designed with clinicians and other healthcare users in mind, from start to finish.

I would like to conclude by asking any clinicians reading this, once again, to join us—and to invite any clinician friends or colleagues who may be interested. This is a very hard problem with no quick fixes, and we will need your guidance and your leadership. Let’s design the next generation of healthcare—together.

Sign up here:

Mihaela van der Schaar

Mihaela van der Schaar is the John Humphrey Plummer Professor of Machine Learning, Artificial Intelligence and Medicine at the University of Cambridge, a Fellow at The Alan Turing Institute in London, and a Chancellor’s Professor at UCLA.

Mihaela has received numerous awards, including the Oon Prize on Preventative Medicine from the University of Cambridge (2018), a National Science Foundation CAREER Award (2004), 3 IBM Faculty Awards, the IBM Exploratory Stream Analytics Innovation Award, the Philips Make a Difference Award and several best paper awards, including the IEEE Darlington Award.

In 2019, she was identified by the National Endowment for Science, Technology and the Arts as the most-cited female AI researcher in the UK. She was also elected a 2019 “Star in Computer Networking and Communications” by N²Women. Her research expertise spans signal and image processing, communication networks, network science, multimedia, game theory, distributed systems, machine learning, and AI.

Mihaela’s research focus is on machine learning, AI and operations research for healthcare and medicine.

Nick Maxfield

Nick oversees the van der Schaar Lab’s communications, including media relations, content creation, and maintenance of the lab’s online presence.

Nick studied Japanese (BA Hons.) at the University of Oxford, graduating in 2012. Nick previously worked in HQ communications roles at Toyota (2013-2016) and Nissan (2016-2020).

Given his humanities/languages background and experience in communications, Nick is well-positioned to highlight and explain the real-world impact of research that can often be quite esoteric. Thankfully, he is comfortable asking almost endless questions in order to understand a topic.