van der Schaar Lab

Hide-and-seek privacy challenge

Introducing the NeurIPS 2020 hide-and-seek privacy challenge

Pitting “hiders” and “seekers” against one another, this privacy challenge is a novel two-tracked competition to explore the meaning and limitations of data privacy.

Importantly, rather than falling back on fixed theoretical notions of anonymity, we allow participants on both sides to uncover the best approaches in practice for launching or defending against privacy attacks.

The hide-and-seek challenge has been accepted as part of the competition track for NeurIPS 2020, and is being administered by the van der Schaar Lab with support from the University of Cambridge, Microsoft Research, and Amsterdam UMC. The challenge will run from July through mid-November 2020.

Challenge overview

Coupled with advances in machine learning, the vast quantities of clinical data now stored in machine-readable form have the potential to revolutionize healthcare. At the same time, this enterprise is threatened by the fact that patient data are inherently highly sensitive, and privacy concerns have recently been thrown into sharp relief by several high-profile data breaches that have greatly undermined public confidence.

In particular, the recent COVID-19 pandemic has shed light on the critical need for high-quality datasets to be made readily available to the research community without requiring individual agreements between each research group and each entity holding clinical data. At the same time, the pandemic has also highlighted the sheer scale of the organizational and interdisciplinary barriers that continue to prevent this from happening; simply put, existing data anonymization methods are clearly not sufficient to reassure the clinical community at large that machine learning researchers cannot misuse or abuse patient datasets.

We seek novel methods capable of bridging the gap between data-hungry techniques in machine learning and privacy-conscious applications in healthcare settings.

The clinical time-series setting poses a unique combination of challenges to data modeling and sharing. Due to the high dimensionality of clinical time series, adequate de-identification to preserve privacy while retaining data utility is difficult to achieve using common de-identification techniques.

An innovative approach to this problem is synthetic data generation. From a technical perspective, a good generative model for time-series data should preserve temporal dynamics, in the sense that new sequences respect the original relationships between high dimensional variables across time. From the privacy perspective, the model should prevent patient re-identification by limiting vulnerability to membership inference attacks.

The NeurIPS 2020 Hide-and-Seek Privacy Challenge is a novel two-tracked competition to simultaneously accelerate progress in tackling both problems. In our head-to-head format, participants in the synthetic data generation track (i.e. “hiders”) and the patient re-identification track (i.e. “seekers”) are directly pitted against each other by way of a new, high-quality intensive care time-series dataset: the AmsterdamUMCdb dataset. Ultimately, we seek to advance generative techniques for dense and high-dimensional temporal data streams that are (1) clinically meaningful in terms of fidelity and predictivity, as well as (2) capable of minimizing membership privacy risks in terms of the concrete notion of patient re-identification.

This competition provides a two-sided platform for synthetic data generation and patient re-identification methods to compete among and against each other. Our aim is to understand—through the practical task of membership inference attacks—the strengths and weaknesses of machine learning techniques on both sides of the privacy battle, in particular to organically uncover what existing (and potentially novel) notions of privacy and anonymity end up being the most meaningful in practice. We therefore invite participants to compete in either or both of two submission tracks of the interactive challenge: (1) the hider (i.e. synthetic data generation) track, and (2) the seeker (i.e. patient re-identification) track.

Original NeurIPS 2020 challenge proposal

This is our original proposal for consideration in the NeurIPS 2020 competition track. It serves to explain the broad strokes of the challenge, but certain details have changed since it was accepted (for example, classification is no longer a task).

Please use this document only to get a sense of why we’re holding this competition and how it’s structured. For competition specifics, please see the documentation on our BitBucket.

Competitor tasks

The competition will be a continuous process taking place over 3 months, during which participants are free to submit entries (i.e. algorithms) to either side of the competition. A complete evaluation procedure of all submitted entries will be conducted on a monthly basis, from which a leaderboard will be constructed for each track—respectively ranking submissions in order of performance.

Tasks for hiders

In the synthetic data generation track, participants are tasked with developing an algorithm that generates synthetic data on the basis of real data.

Their submission must be an algorithm (i.e. not just a trained model), whose input will be random subsets of an unseen subset of the dataset, and whose output is a synthetic dataset that contains entries from the same space as entries in the original dataset. At competition launch, participants will be given a subset of the dataset. They will be free to use this data to develop their algorithm and perform preliminary hyper-parameter selection, but may not use it to pre-train/initialise a model’s weights.

The synthetic data generated by each model will be evaluated in two ways: (1) similarity to the real data; and (2) resistance to re-identification. For each model this will be done on 10 random subsets of the non-public data.
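As a rough illustration of the hider interface described above, the following Python sketch shows an algorithm that takes a subset of real data and returns a synthetic dataset in the same space, together with a loop that applies it to 10 random subsets. It is a minimal sketch under stated assumptions, not the competition's evaluation code: the function names, the array layout (patients × time steps × features), the subset size, and the noise-based "generator" are all hypothetical stand-ins.

```python
import numpy as np


def noise_hider(real_data, noise_scale=0.1, seed=0):
    """Hypothetical baseline hider: a noisy copy of the real sequences.

    Assumes real_data has shape (n_patients, n_timesteps, n_features).
    The output has the same shape, so synthetic entries come from the
    same space as the real entries.
    """
    rng = np.random.default_rng(seed)
    # Scale the noise by each feature's standard deviation.
    feature_std = real_data.std(axis=(0, 1), keepdims=True)
    noise = rng.standard_normal(real_data.shape)
    return real_data + noise_scale * feature_std * noise


def evaluate_on_random_subsets(hider, private_data, n_subsets=10,
                               subset_size=100, seed=0):
    """Run the hider on several random subsets of the non-public data.

    Returns a list of (indices, synthetic_dataset) pairs; similarity and
    re-identification metrics would then be computed for each pair.
    """
    rng = np.random.default_rng(seed)
    runs = []
    for _ in range(n_subsets):
        idx = rng.choice(len(private_data), size=subset_size, replace=False)
        runs.append((idx, hider(private_data[idx])))
    return runs
```

Note that the submission is the algorithm itself (here, `noise_hider`), which is re-run from scratch on each subset rather than carrying over pre-trained weights.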

Tasks for seekers

In the patient re-identification track, participants are tasked with developing an algorithm that performs membership inference (aka patient re-identification) on synthetic data generation algorithms. Their submission must be an algorithm, which may contain trained models from the public data.

At competition launch, participants will be given the same public dataset as the generation track. In addition, as each generation algorithm is submitted it will be made publicly available alongside 10 synthetic datasets generated using 10 random (known) subsets of the public data (so that re-identification track participants do not need to – but are still welcome to – run the generation models themselves).

Re-identification algorithms will be evaluated on 10 synthetic datasets generated by each generation algorithm according to their classification accuracy. The synthetic datasets used for evaluation will be generated on the basis of random subsets of the unseen data.
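To make the seeker task concrete, here is a minimal Python sketch of one simple membership inference heuristic and of the classification-accuracy metric. It is an illustrative assumption, not a competition baseline: the nearest-neighbour rule, the median threshold, and the function names are all hypothetical.

```python
import numpy as np


def nearest_neighbour_seeker(synthetic, candidates):
    """Hypothetical seeker: flag candidates that lie close to synthetic data.

    Both arrays are assumed to have shape (n, n_timesteps, n_features).
    A candidate is predicted to be a member of the generator's input if
    its distance to the nearest synthetic sequence is at or below the
    median of those distances over all candidates.
    """
    syn = synthetic.reshape(len(synthetic), -1)
    cand = candidates.reshape(len(candidates), -1)
    # Pairwise squared Euclidean distances, shape (n_candidates, n_synthetic).
    dists = ((cand[:, None, :] - syn[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.min(axis=1)
    return nearest <= np.median(nearest)


def membership_accuracy(predicted, truth):
    """Fraction of candidates whose membership was classified correctly."""
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    return float(np.mean(predicted == truth))
```

The median threshold implicitly assumes that about half of the candidates are members; a real seeker would need to choose its decision rule according to the actual evaluation protocol.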

Schematics and descriptions for the mechanics of submissions and evaluations

Evaluation and scoring

Briefly, each head-to-head matchup is a zero-sum game.


Hiders will be scored according to how well their generation algorithms hold up to membership inference attacks.

In addition, hider submissions are required to adequately capture the feature and temporal correlations in the original data; accordingly, they must also first pass a minimum quality bar (in terms of fidelity and predictivity) in order to qualify for competition. (Although the trade-off between quality and privacy is very interesting in its own right, for purposes of fair comparison we fix the former to allow ranking in terms of the latter).


Seekers will be scored according to their accuracy at the membership inference task over each hider submission; that is, on correctly identifying whether a given instance was used in generating a given synthetic dataset.
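One way to picture the head-to-head scoring is as a matrix of seeker accuracies, one row per hider and one column per seeker. The Python sketch below is only an assumption about how such a matrix could be aggregated: the averaging rule, the "one minus accuracy" hider payoff, and the `qualified` mask for the minimum quality bar are hypothetical choices made for illustration, not the official scoring formula.

```python
import numpy as np


def score_matchups(accuracy, qualified=None):
    """Aggregate head-to-head results into per-track scores.

    accuracy[h, s] is the membership inference accuracy of seeker s
    against hider h. In each matchup the hider's payoff is taken to be
    1 - accuracy, so whatever the seeker gains the hider loses.

    qualified is an optional boolean mask marking the hiders that passed
    the minimum quality bar; hiders that did not pass are excluded.
    """
    accuracy = np.asarray(accuracy, dtype=float)
    if qualified is None:
        qualified = np.ones(accuracy.shape[0], dtype=bool)
    acc = accuracy[np.asarray(qualified, dtype=bool)]
    hider_scores = 1.0 - acc.mean(axis=1)   # higher = harder to attack
    seeker_scores = acc.mean(axis=0)        # higher = better attacker
    return hider_scores, seeker_scores
```

For example, with accuracies `[[0.5, 0.9], [0.6, 0.7]]`, the first hider would score 0.3 and the second 0.35, while the two seekers would score 0.55 and 0.8 respectively.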


This challenge will use a new dataset, AmsterdamUMCdb. The dataset was developed and released by Amsterdam UMC in the Netherlands and the European Society of Intensive Care Medicine (ESICM). It is the first freely accessible, comprehensive, high-resolution European intensive care database. It is also the first to have addressed compliance with the General Data Protection Regulation (GDPR, EU 2016/679) using an extensive risk-based de-identification approach.

ICU admissions represent some of the most data-dense patient episodes in healthcare, and these patients represent some of the sickest. Unlike other healthcare domains, ICU data is characterized by its granular, sequential nature, its high-dimensionality and variety of data types, as well as heterogeneous sampling patterns and frequencies. This combination of challenges poses distinctive complexities for modeling; at the same time, it offers huge potential for improving patient care in real-time settings of life-and-death decision-making—where patients are often at risk of deterioration over the span of hours or minutes. Crucially, while a range of diverse models have been investigated in medical literature, they are largely based on a small number of publicly available datasets. To date, the availability of alternative high-quality, dense, and high-dimensional datasets for verifying model generalizability has been limited—precisely due to concerns of privacy.

AmsterdamUMCdb contains approximately 1 billion clinical data points related to 23,106 admissions of 20,109 unique patients between 2003 and 2016. The released data points include patient monitor and life support device data, laboratory measurements, clinical observations and scores, medical procedures and tasks, medication, fluid balance, diagnosis groups, and clinical patient outcomes. Data granularity depends on the type of data and the admission year, but reaches up to one value per minute for data from patient monitors and life support devices. The data is much richer and more granular than that in other well-known, freely available intensive care databases such as MIMIC, and covers patients with higher illness acuity than is found in US datasets.


Microsoft will provide two $5,000 cash prizes, one for the winning team in each track.

In addition to the cash prizes, Microsoft has made available 50 grants of $250 in Azure cloud computing credits to students participating in the contest. Students are not required to use Azure to compete.


May 11, 2020
Publication of competition overview and rules

July 1 – November 15, 2020
Competition runs; generation submissions will be publicly available once submitted

November 15, 2020 (note: previously October 15, 2020)
Deadline for final submission

November 16, 2020
Organizers begin evaluation of submissions

December 1, 2020
Announce results and release evaluation data; release code of all re-identification algorithms

Eligibility and general restrictions

  • Employees of the van der Schaar Lab, Microsoft Research, Cambridge and Amsterdam UMC can participate but are ineligible for prizes.
  • Participants that have access to the AmsterdamUMCdb dataset will be required to declare this and will be ineligible for prizes.
  • Participants that are ineligible for prizes will not contribute to the scores of other teams.
  • To be eligible for the final scoring, participants are required to release the code of their submissions as open source.
  • If a submission does not run successfully it is the participants’ responsibility to debug it. Participants will be allowed to attempt to submit at most once per day.
  • Generation algorithms may only use the public data to define and tune hyperparameters of their algorithm but may not use the public data to initialise/pre-train a model.
  • Each generative and re-identification algorithm will be required to run within a specific time on a given GPU.

How to enter

Due to the sensitive nature of the dataset used for the hide-and-seek privacy challenge, we need to take extra measures to vet teams and ensure that all participants are aware of their responsibilities and liabilities. This means a few extra steps, but we’ll try to make the process as painless as possible.


Form a team, choose a team name.

Nominate a team liaison (someone who will share information with us throughout the contest) and a team supervisor.

Get each team member to create a CodaLab account and join the challenge.

Your supervisor should either be a practicing intensivist or a senior professional in machine learning from a recognized higher education institution or a registered company. Team supervisors can participate in the contest as team members, provided that they are not the only member of their team.


Have each team member download and sign a data access request form; send each form to your team supervisor for their signature.

It’s probably best if the team liaison handles this step.


Team liaisons submit their team’s info and all signed forms via our sign-up page (you’ll need a Google account for this).

Once we receive your application, we’ll contact your team supervisor by email within a few days. We hope to get back to you with a final answer within 2 weeks.


Organizing team

James Jordon

Lead coordinator // competition design // evaluation design // evaluation // baseline method provision

James Jordon is an Engineering Science PhD student at the University of Oxford. His primary research focus has been on generative models and their use for various tasks such as synthetic data generation, treatment-effect estimation and feature selection. He has published papers in several leading machine learning conferences including NeurIPS, ICML and ICLR.

Daniel Jarrett

Lead coordinator // competition design // evaluation design // evaluation // platform design and engineering

Daniel Jarrett is a Mathematics PhD student at the University of Cambridge. His primary research focus has been on representation learning for predictive, generative, and decision-making problems over time with a focus on healthcare. He has published in various journals and conferences including ICLR, NeurIPS, AISTATS, and The British Journal of Radiology.

Jinsung Yoon

Baseline method provision // data analysis // competition design advice // evaluation

Jinsung Yoon is a research scientist at Google Cloud AI. His main research interests are data imputation, model interpretation, transfer learning, and synthetic data generation using adversarial learning and reinforcement learning frameworks. He has published various papers and served as a reviewer in top-tier machine learning conferences (NeurIPS, ICML, ICLR, AAAI).

Paul Elbers

Domain expertise // data provision // competition design advice // evaluation design advice

Paul Elbers, MD, PhD, EDIC is a medical specialist in intensive care medicine at Amsterdam UMC, Amsterdam, The Netherlands. He also leads the Right Data Right Now research group at Amsterdam UMC that specifically aims to bring machine learning to the bedside of critically ill patients to improve their outcome. He is the deputy chair of the Data Science Section of the European Society of Intensive Care Medicine and co-chair of Amsterdam Medical Data Science, home of AmsterdamUMCdb, the first freely accessible European Intensive Care database.

Patrick Thoral

Domain expertise // data provision // competition design advice // evaluation design advice

Patrick Thoral, MD, EDIC works as an intensivist (a medical specialist in intensive care) at Amsterdam UMC, Amsterdam, The Netherlands. With a background in medicine as well as medical informatics, he is currently responsible for implementation of the electronic health record system in the ICU. To expedite improving patient outcomes using health care data, he played a major role in releasing AmsterdamUMCdb, the first freely accessible European Intensive Care database.

Ari Ercole

Domain expertise // data provision // competition design advice // evaluation design advice

Ari Ercole, MD, PhD, FICM, FRCA, FCI is a research-active intensive care attending physician at Cambridge University Hospitals NHS Foundation Trust with a PhD in physics and extensive experience in computing and ICU data modelling. He is chair of the European Society of Intensive Care Medicine Data Science Section and is a founding Fellow of the Faculty of Clinical Informatics. He has authored numerous peer-reviewed publications on the re-use of routinely collected ICU time-series data to improve predictions and care of intensive care patients and has been involved in a number of big-data projects such as the development of the Critical Care Health Informatics Collaborative database and the recent DAQCORD data curation guidelines.

Cheng Zhang

ML expertise // competition design advice // evaluation design advice

Cheng Zhang, PhD is a senior researcher at Microsoft Research Cambridge, UK. She leads the Data Efficient Decision Making (Project Azua) team at Microsoft. Before joining Microsoft, she was with the statistical machine learning group of Disney Research Pittsburgh, located at Carnegie Mellon University. She is interested in machine learning theory, including variational inference, deep generative models and sequential decision making under uncertainty, as well as in machine learning applications with social impact such as education and healthcare. She has published many papers in top machine learning venues including NeurIPS, ICML, ICLR and UAI. She co-organized the Symposium on Advances in Approximate Bayesian Inference from 2017 to 2019.

Danielle Belgrave

ML expertise // competition design advice // evaluation design advice

Danielle Belgrave, PhD is a principal researcher at Microsoft Research Cambridge, working at the intersection of machine learning and healthcare. The primary focus of her work is on developing probabilistic models to understand personalised healthcare strategies. She has published extensively on this intersection in high-impact medical journals. She is the tutorial chair of NeurIPS 2019 and 2020, diversity and inclusion chair of AISTATS 2020, a board member of the Deep Learning Indaba and of Women in Machine Learning, co-organiser of the first Khipu (2019), and program chair of WiML 2017, and has organised several other conferences and workshops.

Mihaela van der Schaar

General coordination and management // ML expertise // competition design advice // evaluation design advice

Mihaela van der Schaar is the John Humphrey Plummer Professor of Machine Learning, Artificial Intelligence and Medicine at the University of Cambridge, a Fellow at The Alan Turing Institute in London, and a Chancellor’s Professor at UCLA. Mihaela was elected IEEE Fellow in 2009. She has received numerous awards, including the Oon Prize on Preventative Medicine from the University of Cambridge (2018), a National Science Foundation CAREER Award (2004), 3 IBM Faculty Awards, the IBM Exploratory Stream Analytics Innovation Award, the Philips Make a Difference Award and several best paper awards, including the IEEE Darlington Award. Mihaela’s work has also led to 35 USA patents (many widely cited and adopted in standards) and 45+ contributions to international standards, for which she received 3 International ISO (International Organization for Standardization) Awards. In 2019, she was identified by the National Endowment for Science, Technology and the Arts as the most-cited female AI researcher in the UK. She was also elected as a 2019 “Star in Computer Networking and Communications” by N²Women. Her research expertise spans signal and image processing, communication networks, network science, multimedia, game theory, distributed systems, machine learning and AI.

Related reading

Our lab has published a number of papers on the topic of synthetic data and machine learning for privacy. To learn more, visit our publications page.

Contact the organizers

If you’d like to contact the organizers, please complete the form below. Please note that this form is for general inquiries, and cannot be used to enter the competition.