On February 12, 2018, Mihaela van der Schaar delivered a presentation entitled “Machine learning and data science for medicine: a vision, some progress and opportunities” at the Turing Institute’s AI for Social Good talk series.
Event synopsis: Mihaela’s work uses data science and machine learning to create models that assist diagnosis and prognosis. Existing models suffer from two kinds of problems. Statistical models that are driven by theory or hypotheses are easy to apply and interpret, but they make many assumptions and often have inferior predictive accuracy. Machine learning models can be tailored to the data and often have superior predictive accuracy, but they are often hard to interpret and must be crafted separately for each disease, and there are a lot of diseases. In this talk I present a method (AutoPrognosis) that makes machine learning itself do both the crafting and the interpreting. For medicine, this is a complicated problem because missing data must be imputed, relevant features/covariates must be selected, and the most appropriate classifier(s) must be chosen. Moreover, there is no single “best” imputation algorithm, feature processing algorithm, or classification algorithm; a given imputation algorithm will work better with a particular feature processing algorithm and a particular classifier in a particular setting. To deal with these complications, we need an entire pipeline. Because there are many possible pipelines, we need a machine learning method to choose among them, and this is exactly what AutoPrognosis is: an automated process for creating a particular pipeline for each particular setting. Using a variety of medical datasets, we show that AutoPrognosis achieves performance that is significantly superior to that of existing clinical approaches and of statistical and machine learning methods.
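To make the idea of searching over whole pipelines concrete, here is a minimal sketch in plain Python. It is not AutoPrognosis and does not reflect its API: the synopsis does not say how AutoPrognosis searches, so this toy simply tries every combination of imputer, feature selector, and classifier and keeps the one with the best validation accuracy. All names (`make_data`, `fit_imputer`, `search`, and so on), the synthetic data, and the two options per stage are assumptions made for illustration.

```python
import itertools
import random
import statistics


def make_data(n=200, seed=0):
    """Synthetic binary task: feature 0 is informative, feature 1 is noise,
    and roughly 10% of the values are missing (None)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [2.0 * label + rng.gauss(0, 1), rng.gauss(0, 3)]
        X.append([None if rng.random() < 0.1 else v for v in row])
        y.append(label)
    return X, y


def dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Each stage is a "fit" function: it sees training data and returns a callable.
def fit_imputer(stat):
    def fit(X, y):
        fills = [stat([v for v in col if v is not None]) for col in zip(*X)]
        return lambda rows: [[f if v is None else v for v, f in zip(r, fills)]
                             for r in rows]
    return fit


def fit_select_all(X, y):
    return lambda rows: rows


def fit_select_top1(X, y):
    # Keep the single feature whose class means differ the most.
    def gap(j):
        pos = [r[j] for r, t in zip(X, y) if t == 1]
        neg = [r[j] for r, t in zip(X, y) if t == 0]
        return abs(statistics.mean(pos) - statistics.mean(neg))
    best = max(range(len(X[0])), key=gap)
    return lambda rows: [[r[best]] for r in rows]


def fit_centroid(X, y):
    cents = {c: [statistics.mean(col)
                 for col in zip(*[r for r, t in zip(X, y) if t == c])]
             for c in set(y)}
    return lambda rows: [min(cents, key=lambda c: dist(r, cents[c])) for r in rows]


def fit_1nn(X, y):
    return lambda rows: [y[min(range(len(X)), key=lambda i: dist(r, X[i]))]
                         for r in rows]


IMPUTERS = {"mean": fit_imputer(statistics.mean),
            "median": fit_imputer(statistics.median)}
SELECTORS = {"all": fit_select_all, "top1": fit_select_top1}
CLASSIFIERS = {"centroid": fit_centroid, "1nn": fit_1nn}


def evaluate(imp, sel, clf, train, valid):
    """Fit every stage on the training split only, then score on validation."""
    (Xtr, ytr), (Xva, yva) = train, valid
    f_imp = imp(Xtr, ytr)
    Xtr = f_imp(Xtr)
    f_sel = sel(Xtr, ytr)
    predict = clf(f_sel(Xtr), ytr)
    preds = predict(f_sel(f_imp(Xva)))
    return sum(p == t for p, t in zip(preds, yva)) / len(yva)


def search(train, valid):
    """Exhaustively score all imputer x selector x classifier combinations."""
    scores = {}
    for (ni, imp), (ns, sel), (nc, clf) in itertools.product(
            IMPUTERS.items(), SELECTORS.items(), CLASSIFIERS.items()):
        scores[(ni, ns, nc)] = evaluate(imp, sel, clf, train, valid)
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    X, y = make_data()
    best, scores = search((X[:150], y[:150]), (X[150:], y[150:]))
    print("best pipeline:", best, "accuracy:", scores[best])
```

With two options per stage there are only eight pipelines, so brute force is enough here. With realistic numbers of algorithms and hyperparameters the space grows multiplicatively, which is why an automated search method is needed. Picking the winner on a single validation split also gives an optimistic score; a more careful version would use cross-validation and a separate test set.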