This IJCAI Tutorial will be presented by Mihaela van der Schaar and Nabeel Seedat during the 32nd International Joint Conference on Artificial Intelligence, running from 19 to 25 August 2023. This is a hybrid event (in person/online) that you can register for here.
Title
Data-Centric AI: Foundation, Frontiers and Applications in the quest for robust and reliable AI systems
About
Data-Centric AI has recently emerged as an important paradigm shift in how AI is built, placing the previously undervalued “data work” at the centre of AI development. This tutorial introduces participants to the foundations of Data-Centric AI by exploring recent state-of-the-art methods and use-cases around characterising, auditing, and improving the data used in machine learning.
Goals of the lab
The quality of the data used to train Machine Learning (ML) models is crucial to the success or failure of AI. This is increasingly critical as data-hungry algorithms are deployed in high-stakes healthcare or finance settings. Despite its importance, “data work” has been undervalued as merely operational [9].
Hence, along with algorithmic improvement, there is an urgent need to shift the focus to the data used in AI/ML and its quality. The emergence of Data-Centric AI addresses this issue by developing tools for systematic characterisation, evaluation, and monitoring of the data used to train and evaluate ML models.
This tutorial introduces participants to the foundations of Data-Centric AI. We will provide a comprehensive introduction to recent state-of-the-art Data-Centric AI methods to (1) characterise, (2) generate and (3) evaluate the underlying ML data. A unique focus of the tutorial is showing how these Data-Centric components apply to different stages of the ML pipeline, with practical use-cases on tabular, image and text data. This end-to-end approach will enable participants to engage practically with Data-Centric AI for their own problems, whether from a researcher or a practitioner perspective. Additionally, we will explore the future of Data-Centric AI, discussing challenges and opportunities.
After the tutorial, participants will understand the need for Data-Centric AI and its essential components, and will have a foundation in state-of-the-art tools and methods so that they can either use or contribute to Data-Centric AI.
You can have a look at our Inspiration Exchange session on the topic of Data-Centric AI in healthcare here.
You can find our most recent Revolutionizing Healthcare sessions on data here and here.
Other useful links:
– Our lab’s publications
– Mihaela van der Schaar on Twitter and LinkedIn