van der Schaar Lab

Million-patient study shows strength of machine learning in recommending breast cancer therapies

An extensive new study published on June 24, 2021 in Nature Machine Intelligence shows that a prognostic tool developed by the van der Schaar Lab can recommend therapies for breast cancer patients more reliably than methods that are currently considered international clinical best practice. The study draws on complex, high-quality cancer datasets from the U.K. and U.S. to demonstrate the accuracy of Adjutorium, a machine learning system for prognostication and treatment benefit prediction.

Individualized treatment benefit prediction: the key to making the right decisions for patients

Developed by the van der Schaar Lab’s researchers, Adjutorium is a cutting-edge machine learning tool that can be trained to inform treatment decisions for a wide range of diseases. It predicts an individual’s likely outcomes both with and without treatment; the difference between the two is their individualized “survival benefit” from treatment.
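The idea of survival benefit as a difference between two predictions can be illustrated with a minimal sketch. This is not Adjutorium's code: the `SurvivalPrediction` structure, the `survival_benefit` function, and the numbers are all hypothetical, chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class SurvivalPrediction:
    """Predicted probability of survival at a fixed horizon (hypothetical structure)."""
    with_treatment: float
    without_treatment: float


def survival_benefit(pred: SurvivalPrediction) -> float:
    """Individualized benefit: survival with treatment minus survival without it."""
    for p in (pred.with_treatment, pred.without_treatment):
        if not 0.0 <= p <= 1.0:
            raise ValueError("survival probabilities must lie in [0, 1]")
    return pred.with_treatment - pred.without_treatment


# Made-up numbers for illustration only; these are not results from the study.
patient = SurvivalPrediction(with_treatment=0.75, without_treatment=0.5)
benefit = survival_benefit(patient)
print(f"Estimated absolute survival benefit: {benefit:.2f}")
```

A benefit close to zero would suggest that a therapy's risks are unlikely to be offset, while a large benefit would argue for treatment; where that line sits is a clinical judgment this sketch does not attempt to model.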

In the case of breast cancer, Adjutorium was tasked with determining whether patients would benefit from adjuvant therapies prescribed in addition to surgery, such as chemotherapy and hormone therapy. While such therapies have improved outcomes for early-stage breast cancer patients since their introduction, they carry their own risks, which must be weighed carefully against the expected benefits. Accurately predicting survival benefit is critical to preventing a patient from being undertreated or overtreated.

Real-world clinical problems such as these lie at the heart of the van der Schaar Lab’s work. Describing the importance of the Adjutorium project, Prof. Mihaela van der Schaar explained: “Adjutorium exemplifies my lab’s focus on the development of new machine learning tools that can support clinicians by allowing them to make better, more accurate decisions for each patient they treat.”

Training and validating Adjutorium across multiple world-leading datasets

In developing and validating Adjutorium, the van der Schaar Lab’s researchers, with guidance from a diverse team of academic and clinical collaborators, conducted one of the largest studies of an AI or machine learning tool for cancer to date. The process involved the use of data from nationally representative large-scale cohorts of nearly 1 million women within the cancer registries of the U.K. and the U.S.

First, the model was trained and internally validated on roughly 396,000 patients from the U.K. National Cancer Registration and Analysis Service (NCRAS) dataset administered by Public Health England. In doing so, it became the first AI model to make use of this complex, high-quality real-world dataset.

Adjutorium was then externally validated on roughly 572,000 patients from the U.S. Surveillance, Epidemiology, and End Results (SEER) program. This makes Adjutorium one of only a handful of AI models to have been validated using datasets from two countries, an especially challenging undertaking given the differences between the two datasets and the markedly different healthcare systems and populations they describe.

Commenting on this achievement, Dr. Jem Rashbass, Executive Director for Data and Analytical Services at NHS Digital and a co-author of the article published in Nature Machine Intelligence, said: “Not only is this work based on a real-world population dataset from the National Cancer Registration and Analysis Service in England, but we have been able to show that the methodologies and learning based on this national data collection in the UK can be directly transferred to the SEER dataset in the U.S.”

Pitting Adjutorium against prevalent standards for clinical decision-making

At present, there are several methods for predicting the survival profiles of individual patients on the basis of their clinicopathological features. Of these, PREDICT is the most commonly used worldwide, and is the recommended tool for adjuvant therapy planning under the current NICE guidelines.

To assess the clinical benefit of using Adjutorium to support decisions regarding adjuvant therapies, Adjutorium’s predictions of treatment benefit were compared with those of PREDICT and with the actual decisions of multidisciplinary teams obtained from the NCRAS database.

Discriminative accuracy of Adjutorium vs. PREDICT v2.1, evaluated in sub-cohorts of patients stratified by diagnosis date. Measured using Uno’s concordance index; higher is better.

Row “a” shows discriminative accuracy with respect to all-cause mortality, whereas row “b” shows discriminative accuracy with respect to breast cancer-specific mortality.
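For readers unfamiliar with concordance indices, the sketch below shows the basic idea: the score is the fraction of comparable patient pairs that a model ranks in the right order. Note the assumption: this is the plain pairwise (Harrell-style) version, not Uno's estimator used in the study, which additionally reweights pairs to correct for censoring. The function name and the toy data are hypothetical.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Plain pairwise concordance (Harrell-style), NOT Uno's weighted estimator.

    A pair (i, j) is comparable when patient i had an observed event before
    patient j's recorded time. It is concordant when the model assigned
    patient i the higher risk. Tied risks count as half.
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("times, events and risks must have the same length")
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data for illustration only: follow-up time, event observed?, predicted risk.
times = [2.0, 4.0, 6.0, 8.0]
events = [True, True, False, True]
risks = [0.9, 0.4, 0.6, 0.1]
print(f"{concordance_index(times, events, risks):.2f}")  # prints 0.80
```

In this toy example four of the five comparable pairs are ranked correctly. A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why higher is better.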

Adjutorium uniformly outperformed PREDICT in predicting all-cause and breast cancer-specific mortality, both when validated internally within NCRAS and externally within the SEER cohort. Importantly, Adjutorium demonstrated superior accuracy in specific subgroups known to be under-served by PREDICT. This is likely because Adjutorium’s machine learning-based risk equation captured nuanced interactions and non-linear patterns that were not incorporated in PREDICT or similar prognostic tools.

Similarly, compared to observed decisions made by multidisciplinary teams, Adjutorium could have provided improved treatment decisions for 25% of the patient population (13% who were likely under-treated, and 12% who were likely over-treated).

Commenting on the results achieved by Adjutorium, Dr. Rashbass said: “This really is a game-changer and shows the power of machine learning approaches to cancer risk models. The approach provides more robust estimates than traditional statistical, epidemiological tools but also delivers better results on small subsets of patients who have largely been ignored in the past.”

Machine learning’s impact beyond breast cancer

While the recent Nature Machine Intelligence article demonstrates the value of Adjutorium when applied specifically to survival benefit prediction for breast cancer patients, AutoPrognosis, the machine learning framework at the heart of the model, is inherently adaptable. So far, it has been used in a number of different applications, including cardiovascular disease, cystic fibrosis, and ICU admission prediction (most recently in partnership with the U.K. NHS during the COVID-19 pandemic). Given high-quality datasets and guidance from clinicians, AutoPrognosis could provide treatment guidance for a comprehensive array of diseases and conditions.

To find out more about Adjutorium—and try out an interactive web app demonstrator—read the paper below or visit our dedicated Adjutorium webpage.

Machine learning to guide the use of adjuvant therapies for breast cancer

Ahmed M. Alaa, Deepti Gurdasani, Adrian L. Harris, Jem Rashbass, Mihaela van der Schaar

Nature Machine Intelligence, 2021

Accurate prediction of the individualized survival benefit of adjuvant therapy is key to making informed therapeutic decisions for patients with early invasive breast cancer. Machine learning technologies can enable accurate prognostication of patient outcomes under different treatment options by modelling complex interactions between risk factors in a data-driven fashion.

Here, we use an automated and interpretable machine learning algorithm to develop a breast cancer prognostication and treatment benefit prediction model—Adjutorium—using data from large-scale cohorts of nearly one million women captured in the national cancer registries of the United Kingdom and the United States.

We trained and internally validated the Adjutorium model on 395,862 patients from the UK National Cancer Registration and Analysis Service (NCRAS), and then externally validated the model among 571,635 patients from the US Surveillance, Epidemiology, and End Results (SEER) programme.

Adjutorium exhibited significantly improved accuracy compared to the major prognostic tool in current clinical use (PREDICT v2.1) in both internal and external validation. Importantly, our model substantially improved accuracy in specific subgroups known to be under-served by existing models.

Reference materials and further reading

Click here to find out more about the van der Schaar Lab’s many research projects related to cancer.

A full list of the van der Schaar Lab’s papers can be found here.

We encourage you to stay up-to-date with ongoing developments in this and other areas of machine learning for healthcare by signing up to take part in one of our two streams of online engagement sessions.

If you are a practicing clinician, please sign up for Revolutionizing Healthcare, which is a forum for members of the clinical community to share ideas and discuss topics that will define the future of machine learning in healthcare (no machine learning experience required).

If you are a machine learning student, you can join our Inspiration Exchange engagement sessions, in which we introduce and discuss new ideas and development of new methods, approaches, and techniques in machine learning for healthcare.

Nick Maxfield

Nick oversees the van der Schaar Lab’s communications, including media relations, content creation, and maintenance of the lab’s online presence.

Nick studied Japanese (BA Hons.) at the University of Oxford, graduating in 2012. Nick previously worked in HQ communications roles at Toyota (2013-2016) and Nissan (2016-2020).

Given his humanities/languages background and experience in communications, Nick is well-positioned to highlight and explain the real-world impact of research that can often be quite esoteric. Thankfully, he is comfortable asking almost endless questions in order to understand a topic.