van der Schaar Lab

NeurIPS 2023: Data-Centric AI for reliable and responsible AI

This NeurIPS Tutorial was presented by Mihaela van der Schaar, Isabelle Guyon and Nabeel Seedat (see presenter bios below) during the 37th Conference on Neural Information Processing Systems, which ran from 10 – 16 December, 2023.

Logistics

The live tutorial took place on 11 December 2023, from 11:45 to 14:15 PST, in Ballroom A-C.

About

Data-Centric AI has recently emerged as an important paradigm shift in machine learning and AI, placing the previously undervalued “data work” at the center of AI development. This tutorial aims to illuminate the fundamentals of Data-Centric AI and articulate its transformative potential.

Goals of the NeurIPS tutorial

We explore the motivation behind the data-centric approach, highlighting its power to improve model performance and to engender more trustworthy, fair, and unbiased AI systems. We also discuss benchmarking from a data-centric perspective.

Our examination extends to standardised documentation frameworks, showing how they form the backbone of this new paradigm. The tutorial covers state-of-the-art methodologies in these areas, which we contextualise around the high-stakes setting of healthcare. A focus of this tutorial is providing participants with an interactive and hands-on experience.

To this end, we provide coding/software tools and resources, thereby enabling practical engagement. The panel discussion, with experts spanning diverse industries, provides a dynamic platform for discourse, enabling a nuanced understanding of the implications and limitations of Data-Centric AI across different contexts.

Ultimately, our goal is for participants to gain a practical foundation in Data-Centric AI, so that they can use or contribute to Data-Centric AI research.

The Recording

The Tutorial Slides

Relevant Software

Relevant Resources

Presenter bios

The tutorial was presented by Mihaela van der Schaar, Isabelle Guyon, and Nabeel Seedat.

Mihaela van der Schaar


Mihaela van der Schaar is the John Humphrey Plummer Professor of Machine Learning, Artificial Intelligence and Medicine at the University of Cambridge, a Fellow at The Alan Turing Institute in London, and a Chancellor’s Professor at UCLA. Mihaela has received numerous awards, including the Oon Prize on Preventative Medicine from the University of Cambridge (2018), a National Science Foundation CAREER Award (2004), 3 IBM Faculty Awards, the IBM Exploratory Stream Analytics Innovation Award, the Philips Make a Difference Award and several best paper awards, including the IEEE Darlington Award. In 2019, she was identified by the National Endowment for Science, Technology and the Arts as the most-cited female AI researcher in the UK. She was also elected as a 2019 “Star in Computer Networking and Communications” by N²Women. Her research expertise spans signal and image processing, communication networks, network science, multimedia, game theory, distributed systems, machine learning and AI.
Mihaela’s research focus is on machine learning, AI and operations research for healthcare and medicine. In addition to leading the van der Schaar Lab, Mihaela is founder and director of the Cambridge Centre for AI in Medicine (CCAIM).

Isabelle Guyon


Isabelle Guyon is a Chaired Professor of Artificial Intelligence (PR EX1) and INRIA researcher in the Machine Learning and Optimization group (TAU team) at the Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Université Paris-Saclay, France. A graduate of ESPCI, she did her PhD in Paris in Gerard Dreyfus’ lab. After working at Bell Labs for 7 years, Isabelle moved to Berkeley, California.
She is an independent consultant with ClopiNet. Since 2015, Isabelle has held a full Professorship at Université Paris-Saclay, where she teaches Machine Learning and advises graduate students. From 2019 to 2021, she was the coordinator of the CS Artificial Intelligence master program at UPSaclay.
Isabelle was co-program chair of NeurIPS 2016 and co-general chair of NeurIPS 2017, and subsequently a NeurIPS board member. She is an AMIA and ELLIS fellow, an action editor at JMLR, and a CiML Springer series editor. She won the BBVA award in 2020.

Nabeel Seedat


Nabeel Seedat is a PhD candidate at the University of Cambridge. Nabeel’s research is focused on Data-Centric AI, uncertainty quantification and synthetic data. He has published papers on Data-Centric AI in leading ML conferences, including NeurIPS, ICML and AISTATS. Nabeel has recently given talks and presentations on Data-Centric AI to both industry (AstraZeneca, Discovery Limited) and academic research groups (Queen Mary University of London, University of Cape Town). He also has experience giving talks to diverse audiences at conferences including IEEE conferences, KDD and PyData.
Beyond Nabeel’s academic background in data-centric AI, he also has extensive industry experience working on data-centric problems. He has worked as a Machine Learning engineer across two multinational corporations (in the USA and South Africa), building real-world computer vision and NLP systems that currently serve millions of customers daily.

Nabeel Seedat

Before joining the van der Schaar Lab, Nabeel received a merit scholarship for a master’s degree at Cornell University, researching Bayesian deep learning and uncertainty estimation for high-stakes applications. In addition, he holds a master’s degree from the University of the Witwatersrand (South Africa), where he was awarded a National Research Foundation grant for his work applying signal processing and machine learning to Parkinson’s disease diagnostics in low-resource settings.

Professionally, Nabeel has worked as a machine learning engineer in the United States and South Africa. The computer vision and natural language processing models he worked on are currently deployed and serving millions of customers on a daily basis.

Nabeel is keenly aware that taking methods from the lab to the bedside “requires a unique focus beyond just high-performance predictive models; it requires the development of a toolkit of methods for transfer learning across domains and locations, learning on smaller datasets, understanding model biases and quantifying model reliability and uncertainty are fundamentally needed to bridge this divide.”

Nabeel’s research is supported by funding from the Cystic Fibrosis Trust.
