van der Schaar Lab

Automated machine learning will empower some and replace others

Images by @ASCHINCHON on

Many fear that machine learning and artificial intelligence will make professionals such as doctors obsolete. That such Frankensteinian fears follow new technologies is nothing new; as J. B. S. Haldane noted in an address to Cambridge students almost a century ago, “There is no great invention, from fire to flying, which has not been hailed as an insult to some god.”

New technologies inevitably change the way we work, upset balances, make some jobs obsolete and allow us to think in new ways—but this is to be celebrated, not feared. I believe the same will happen in medicine and healthcare.

Doctors will stay at the heart of clinical medicine for a long time, because medicine and the underlying pathologies of disease are very complex. There are many subtle interdependencies we do not recognise and cannot capture, and while some will become tractable to machine learning, designing and building the right machine learning models requires the insight of gifted and visionary clinicians.

The groups where I do see an impending obsolescence are those that apply standard off-the-shelf data science (statistics, machine learning) methods to essentially similar datasets. Much of this low-level customization can be automated, so that anyone in the clinical and healthcare domain can perform it easily and robustly using state-of-the-art data science technology. I believe the current armies of data scientists will no longer be needed to push data through a standard analysis sausage machine.

Machine learning – barriers to entry

Machine learning is only just starting to make inroads into clinical applications. There are a number of reasons this has taken time, but the most important ones (in my view) arise from mismatches between machine learning developers and the users of the models they develop, such as clinicians, medical researchers and healthcare managers. The machine learning community is excellent at building new, innovative models, but we generally don’t do this while considering the perspective of users.

Many machine learning models are complex to train and optimize and thus cannot be easily used off-the-shelf, so clinicians and medical researchers need either to be experts themselves in training and using machine learning models or to rely on experts. In addition, many models are not interpretable (see my previous post on black boxes!), meaning that their predictions or recommendations are hard to understand and trust, and are therefore not actionable.

Faced with such hurdles, of course it’s tempting for users in the clinical community to fall back on tried and familiar statistical models, which (while often less accurate and more dependent on restrictive assumptions) are easier to use and don’t require as high a degree of expertise on the part of the user.

The potential of automated machine learning

Machine learning has so many (potentially lifesaving!) advantages over existing statistical techniques, but the challenge is finding a way to make the most of them in each usage scenario. This is a topic I touched on a couple of weeks ago during my ICLR 2020 keynote (video below) in a section on automating the design of clinical predictive analytics, which I believe can empower clinicians, medical researchers and entire healthcare systems.

Many existing applications of machine learning in medicine are simply not as effective as they could be. There are many machine learning algorithms to choose from, and selecting which one is best in a particular setting is non-trivial – the results depend on the characteristics of the data, including the number of samples and the interactions among features and between features and outcomes, as well as on the performance metrics used. In addition, in medicine, we need entire processing pipelines that involve imputation, feature selection, prediction and calibration, and that issue interpretations associated with the predictions made.

This differs from the current compartmentalized way in which machine learning solutions are often hand-built and focus on one specific stage, without taking preceding or following stages into account. But even if an effort is made to connect these individual processing stages carefully, with consideration given to the overall pipeline effectiveness, there’s a further challenge in terms of complexity: since there are many different algorithms that could be used at any given stage in the pipeline, it’s almost impossible to manually work out which combination of algorithms and their associated hyper-parameters (out of thousands of possibilities) would yield the best results.
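To give a rough sense of the scale involved, here is a minimal sketch that simply counts the possible pipelines. The candidate algorithms listed for each stage and the number of hyper-parameter settings per predictor are illustrative assumptions, not the actual search space of any particular tool.

```python
from itertools import product
from math import prod

# Hypothetical candidate algorithms for each pipeline stage (illustrative only).
stages = {
    "imputation": ["mean", "median", "knn", "mice"],
    "feature_selection": ["none", "pca", "mutual_info"],
    "prediction": ["logistic", "random_forest", "gbm", "neural_net", "svm"],
    "calibration": ["none", "sigmoid", "isotonic"],
}

# Assumed coarse grid: 10 hyper-parameter settings for each prediction algorithm.
settings_per_predictor = 10

# One algorithm is chosen per stage, so the counts multiply.
n_algorithm_combinations = prod(len(options) for options in stages.values())
n_configurations = n_algorithm_combinations * settings_per_predictor

# Enumerating the combinations explicitly gives the same count.
all_pipelines = list(product(*stages.values()))

print(n_algorithm_combinations, n_configurations)
```

Even with this small menu the count runs into the thousands, and it grows multiplicatively with every algorithm or hyper-parameter added to any stage.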

So we need to use machine learning itself in order to craft the machine learning pipeline and select which algorithms and associated hyper-parameters of these algorithms should be employed to issue the best prediction. We call the process of selecting algorithms and building pipelines using machine learning “automated machine learning,” or for short, “autoML.”
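As a toy illustration of treating pipeline design as a search problem, the sketch below scores every combination of an imputation method and a classifier hyper-parameter on held-out data and keeps the best one. Everything in it is an assumption made for illustration: the synthetic single-biomarker data, the two imputers, and the threshold classifier. It uses plain exhaustive search as a stand-in, which is far simpler than the learning-based selection an autoML system would use.

```python
import random
from itertools import product
from statistics import mean, median

rng = random.Random(0)  # fixed seed so the toy data is reproducible

def make_patient():
    """Toy record: one biomarker (sometimes missing) and a binary outcome."""
    outcome = rng.random() < 0.5
    biomarker = rng.gauss(2.0 if outcome else 0.0, 1.0)
    if rng.random() < 0.2:  # roughly 20% of measurements are missing
        biomarker = None
    return biomarker, int(outcome)

data = [make_patient() for _ in range(400)]
train, valid = data[:300], data[300:]

# Hypothetical search space: imputation method x decision threshold.
IMPUTERS = {"mean": mean, "median": median}
THRESHOLDS = [0.0, 0.5, 1.0, 1.5, 2.0]

def validation_accuracy(imputer_name, threshold):
    """Fit the imputer on training data, then score the rule on held-out data."""
    observed = [x for x, _ in train if x is not None]
    fill_value = IMPUTERS[imputer_name](observed)
    correct = 0
    for x, y in valid:
        x = fill_value if x is None else x
        correct += int((x > threshold) == bool(y))
    return correct / len(valid)

# Score every (imputer, threshold) configuration and keep the best.
results = {cfg: validation_accuracy(*cfg) for cfg in product(IMPUTERS, THRESHOLDS)}
best_config = max(results, key=results.get)

print(best_config, results[best_config])
```

Exhaustive search is only feasible here because there are ten configurations; with thousands of candidate pipelines, the search itself has to be guided by a model of which configurations are likely to perform well.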

AutoML is essential in order to enable machine learning to be applied effectively and at scale: given the huge number of different diseases, different variables, and different needs, it’s not possible to hand-craft a model for each disease. 

Naturally, automated machine learning has the potential to solve these problems by joining up the stages of the pipeline and generating models that can be scaled and applied to a variety of diseases. But past automated machine learning methods (e.g. Auto-WEKA, Auto-Sklearn) have achieved only limited performance gains in medicine, because medicine isn’t the purpose they were built for. Their optimization is ad hoc, and they learn simplistically from other datasets; they don’t handle missing data well; they don’t capture uncertainty; and they address only classification problems (whereas clinical applications entail multiple additional problems such as survival analysis, competing risks, time-series data, and many others).


Given the issues outlined above, our lab set about developing a comprehensive autoML toolset for clinical use a few years ago. The result was AutoPrognosis, our first tool for crafting clinical scores at scale. 

AutoPrognosis incorporates all relevant algorithms that can be used at each stage in the pipeline: numerous imputation algorithms, feature preprocessing and selection algorithms, classification algorithms, and calibration methods, with the ability to easily incorporate new algorithms into the mix as needed. AutoPrognosis takes an innovative approach to evaluating all viable combinations of algorithms at each stage of the pipeline in order to determine the most accurate possible approach. This is a challenging task that was not previously possible due to the sheer complexity of the problem. We have overcome this by effectively learning from data (using cutting-edge in-house methods) the performance of the various algorithms and their associated hyper-parameters. Another important merit of AutoPrognosis is that its predictions are made with an ensemble methodology that aggregates the output of multiple pipelines (rather than simply picking the single most successful combination), thereby allowing us to provide uncertainty estimates while counteracting information loss.
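The idea of aggregating several pipelines rather than picking one can be sketched as a weighted average, with the disagreement between pipelines serving as a crude uncertainty estimate. The predicted risks and weights below are made-up numbers, and this weighting scheme is an assumption for illustration rather than the method AutoPrognosis itself uses.

```python
from math import sqrt

# Hypothetical outputs: each pipeline's predicted risk for one patient,
# paired with a weight reflecting how well that pipeline validated.
pipeline_predictions = [0.62, 0.70, 0.58, 0.66]
pipeline_weights = [0.4, 0.3, 0.2, 0.1]

def ensemble_prediction(predictions, weights):
    """Return the weighted mean risk and the weighted spread around it."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    risk = sum(w * p for w, p in zip(normalized, predictions))
    variance = sum(w * (p - risk) ** 2 for w, p in zip(normalized, predictions))
    return risk, sqrt(variance)

risk, spread = ensemble_prediction(pipeline_predictions, pipeline_weights)
print(f"risk = {risk:.3f}, spread = {spread:.3f}")
```

When the pipelines agree, the spread is small and the aggregated risk can be read with more confidence; when they disagree, the spread flags a prediction that deserves a closer look.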

AutoPrognosis does not only issue predictions. It also issues interpretations in order to ensure that the predictions are not black boxes, and can be understood by clinicians in a way that’s helpful and meaningful. The result is an end-to-end automated pipeline that takes clinical data as an input and provides predictions and explanations as outputs. Since 2018, my team has applied AutoPrognosis in many clinical applications showing how autoML can outperform other methods – both statistical and machine learning approaches. We’ve made new and meaningful discoveries about lung transplant risk factors for cystic fibrosis patients (see our article in Nature here), individualized risk predictions for breast cancer (see The Alan Turing Institute story here) and cardiovascular disease (using UK Biobank), among other applications. Additionally, AutoPrognosis is at the core of our Adjutorium system, which is currently being deployed by the NHS to help hospitals with capacity planning in the face of the COVID-19 pandemic (more here). 

What’s next? 

It is my strong belief that automated machine learning can empower medical researchers to make new discoveries, clinicians to better treat patients, and healthcare systems to operate more effectively and robustly.

Since the introduction of AutoPrognosis a few years ago, we have extended our suite of autoML algorithms to a wide variety of purposes other than static predictions, including survival analysis, competing risk analysis, time series predictions, and causal inference.

We have also recently released a brand-new autoML system called Clairvoyance, an end-to-end automated machine learning pipeline that works specifically with time-series data. Please consider this a teaser for now, since I hope to do Clairvoyance justice with a full introduction and demonstration in the near future. In the meantime, if you’d like to learn more about AutoPrognosis, I’d recommend watching the relevant section of my ICLR keynote below. Those who want to dig even deeper can visit the “Automated ML” section on our publications page.


Mihaela van der Schaar

Mihaela van der Schaar is the John Humphrey Plummer Professor of Machine Learning, Artificial Intelligence and Medicine at the University of Cambridge and a Fellow at The Alan Turing Institute in London.

Mihaela has received numerous awards, including the Oon Prize on Preventative Medicine from the University of Cambridge (2018), a National Science Foundation CAREER Award (2004), 3 IBM Faculty Awards, the IBM Exploratory Stream Analytics Innovation Award, the Philips Make a Difference Award and several best paper awards, including the IEEE Darlington Award.

In 2019, she was identified by the National Endowment for Science, Technology and the Arts as the most-cited female AI researcher in the UK. She was also elected as a 2019 “Star in Computer Networking and Communications” by N²Women. Her research expertise spans signal and image processing, communication networks, network science, multimedia, game theory, distributed systems, machine learning and AI.

Mihaela’s research focus is on machine learning, AI and operations research for healthcare and medicine.
