van der Schaar Lab

NeurIPS 2023: Data-Centric AI for reliable and responsible AI

This NeurIPS Tutorial will be presented by Mihaela van der Schaar, Isabelle Guyon and Nabeel Seedat (see presenter bios below) during the 37th Conference on Neural Information Processing Systems, running from 10 – 16 December, 2023.


The live tutorial will take place on 11 December 2023, from 11:45 to 14:15 PST, in Ballroom A-C.


Data-Centric AI has recently emerged as an important paradigm shift in machine learning and AI, placing the previously undervalued “data work” at the center of AI development. This tutorial aims to illuminate the fundamentals of Data-Centric AI and articulate its transformative potential.

Goals of the NeurIPS tutorial

We will explore the motivation behind the data-centric approach, highlighting its power to improve model performance and to engender more trustworthy, fair, and unbiased AI systems. We will also discuss benchmarking from a data-centric perspective.

Our examination extends to standardised documentation frameworks, showing how they form the backbone of this new paradigm. The tutorial will cover state-of-the-art methodologies that underpin these areas, which we will contextualise around the high-stakes setting of healthcare. A focus of this tutorial is providing participants with an interactive and hands-on experience.

To this end, we provide coding/software tools and resources, thereby enabling practical engagement. The panel discussion, with experts spanning diverse industries, will provide a dynamic platform for discourse, enabling a nuanced understanding of the implications and limitations of Data-Centric AI across different contexts.

Ultimately, our goal is for participants to gain a practical foundation in Data-Centric AI, so that they can use or contribute to Data-Centric AI research.

Relevant Software

Relevant Resources

Presenter bios

The tutorial will be presented by Mihaela van der Schaar, Isabelle Guyon, and Nabeel Seedat.

Mihaela van der Schaar

Mihaela van der Schaar is the John Humphrey Plummer Professor of Machine Learning, Artificial Intelligence and Medicine at the University of Cambridge, a Fellow at The Alan Turing Institute in London, and a Chancellor’s Professor at UCLA. Mihaela has received numerous awards, including the Oon Prize on Preventative Medicine from the University of Cambridge (2018), a National Science Foundation CAREER Award (2004), three IBM Faculty Awards, the IBM Exploratory Stream Analytics Innovation Award, the Philips Make a Difference Award and several best paper awards, including the IEEE Darlington Award. In 2019, she was identified by the National Endowment for Science, Technology and the Arts as the most-cited female AI researcher in the UK. She was also elected as a 2019 “Star in Computer Networking and Communications” by N²Women. Her research expertise spans signal and image processing, communication networks, network science, multimedia, game theory, distributed systems, machine learning and AI.
Mihaela’s research focus is on machine learning, AI and operations research for healthcare and medicine. In addition to leading the van der Schaar Lab, Mihaela is founder and director of the Cambridge Centre for AI in Medicine (CCAIM).

Isabelle Guyon

Isabelle Guyon is the Chaired Professor of Artificial Intelligence (PR EX1) and an INRIA researcher in the Machine Learning and Optimization (TAU) team at the Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Université Paris-Saclay, France. A graduate of ESPCI, she did her PhD in Paris in Gerard Dreyfus’ Lab. After working at Bell Labs for 7 years, Isabelle moved to Berkeley, California.
She is an independent consultant with ClopiNet. Since 2015, Isabelle has held a full professorship at Université Paris-Saclay, where she teaches machine learning and advises graduate students. From 2019 to 2021, she was the coordinator of the CS Artificial Intelligence master program at UPSaclay.
Isabelle was co-program chair of NeurIPS 2016 and co-general chair of NeurIPS 2017, and then a NeurIPS board member. She is an AMIA and ELLIS fellow, an action editor at JMLR, and a CiML Springer series editor. She won the BBVA award in 2020.

Nabeel Seedat

Nabeel Seedat is a PhD candidate at the University of Cambridge. Nabeel’s research is focused on Data-Centric AI, uncertainty quantification and synthetic data. He has published papers on Data-Centric AI in leading ML conferences, including NeurIPS, ICML and AISTATS. Nabeel has recently given talks and presentations on Data-Centric AI to both industry (AstraZeneca, Discovery Limited) and academic research groups (Queen Mary University of London, University of Cape Town). He also has experience giving talks to diverse audiences at conferences including IEEE conferences, KDD and PyData.
Beyond Nabeel’s academic background in data-centric AI, he also has extensive industry experience working on data-centric problems. He has worked as a Machine Learning engineer across two multinational corporations (in the USA and South Africa), building real-world computer vision and NLP systems that currently serve millions of customers daily.

Andreas Bedorf