van der Schaar Lab

Quantitative epistemology: conceiving a new human-machine partnership

This page is authored and maintained by Mihaela van der Schaar and Nick Maxfield.

Pioneering a new field of research

Quantitative epistemology is a new and transformationally significant research pillar pioneered by the van der Schaar Lab. The purpose of this research is to develop a strand of machine learning aimed at understanding, supporting, and improving human decision-making. We aim to do so by building machine learning models of decision-making, including how humans acquire and learn from new information, establish and update their beliefs, and act on the basis of their cumulative knowledge. Because our approach uses observational data to study knowledge, and machine learning methods to support and improve knowledge acquisition and its impact on decision-making, we call this “quantitative epistemology.”

Our methods are aimed at studying human decision-making, identifying potential suboptimalities in beliefs and decision processes (such as cognitive biases, selective attention, imperfect retention of past experience, etc.), and understanding risk attitudes and their implications for learning and decision-making. This would allow us to construct decision support systems that provide humans with information pertinent to their intended actions, their possible alternatives and counterfactual outcomes, as well as other evidence to empower better decision-making.

Revisiting the roots of human (meta-)learning

Quantitative epistemology draws inspiration from the field of meta-learning. While meta-learning is arguably best known today as a subfield of machine learning, in this case we are referring to the original meaning of the term within the domains of social psychology and education, as coined by Donald B. Maudsley in his 1979 book entitled A theory of meta-learning and principles of facilitation: an organismic perspective.

Maudsley defined meta-learning as “the process by which learners become aware of and increasingly in control of habits of perception, inquiry, learning, and growth.” He put forward five requirements learners must observe in order to practice meta-learning successfully:
– Have a theory;
– Work in a supportive environment;
– Discover their rules and assumptions;
– Reconnect with reality-information from the environment; and
– Reorganize themselves by changing their rules/assumptions.

In reality, meta-learning remains extremely difficult for humans, even when the five requirements listed above are met. Our goal for quantitative epistemology, therefore, is to develop a new machine learning field aiming to empower humans to perform meta-learning. Our vision is to use machine learning to serve the purpose defined by Maudsley by empowering humans to improve and control their own perception, inquiry, learning, and growth, as well as their decision-making.

This is in keeping with our lab’s overall vision of using machine learning to learn human intelligence with the aim of empowering humans—rather than empowering machine intelligence.

A human-machine partnership based on empowerment, not replacement

As mentioned above, it is important to distinguish quantitative epistemology from existing work in AI and machine learning, such as imitation learning (i.e. replicating expert actions) and apprenticeship learning (i.e. matching expert returns), both of which intend to construct autonomous agents that can mimic and replace human demonstrators. Instead, we are concerned with leveraging machine learning to help humans become better decision-makers.

Quantitative epistemology entails developing machine learning models that capture how humans acquire new information, how they pay attention to such information, how their beliefs may be represented, how their internal models may be structured, how these different levels of knowledge are leveraged in the form of actions, and how such knowledge is learned and updated over time.

Quantitative epistemology envisages a new human-machine partnership in which machines support and empower humans, rather than replacing them.

The figure below depicts the broad strokes of this partnership in terms of long-term cycles in which a theory of meta-learning is built and continually honed, and in which humans are constantly being empowered to control their growth, perception, inquiry, learning, and decision-making.

Starting at the bottom left of the figure and moving clockwise:
1. humans act and perform meta-learning;
2. assumptions, structures, and rules, etc., can be studied using machine learning (quantitative epistemology) and developed into meta-learning models;
3. we can use these behavior models to distil hypotheses about meta-learning;
4. through the scientific process, we can build these hypotheses into a comprehensive and quantitative theory of meta-learning;
5a. we can reconnect this theory with reality-information and improve it cyclically over time;
5b. this process can also provide new advice, empowering humans to grow and further hone their perception, inquiry, learning, and decision-making.

Note: our use of “meta-learning models” here refers to models that examine the individual-specific thought processes and tendencies or biases that influence how humans make decisions when presented with specific information. Such models can examine characteristics including (but not limited to) an individual’s capacity for flexibility or adaptivity, tolerance of risk, or degree of optimism, and can also identify context-specific factors that drive changes in these characteristics. For instance, such models may identify that certain clinicians tend to be less optimistic when diagnosing patients at risk, or they may show how optimism and confirmation bias could lead to similar but differentiable behavior.

We can also use quantitative epistemology to build the “supportive environment” Maudsley defined as a requirement for successful meta-learning.

Starting at the very bottom of the figure and moving clockwise:
1. as in the previous figure, humans act and perform meta-learning;
2. machine learning tools (quantitative epistemology) can understand these decisions by building meta-learning models, identifying potential biases, errors, and inconsistencies, and providing advice;
3. humans are provided with this information;
4. humans inform the machine learning tools whether the adjustments or corrections provided about their behavior are effective or not, and offer clarifications about their decisions as well as rating the advice provided to them;
5. this serves to improve the understanding of the quantitative epistemology machine learning tools, driving a cycle that can further empower humans.

Applications of quantitative epistemology

Broadly, we currently see four potential areas of application for quantitative epistemology, none of which are limited to healthcare:

1. Decision Support
This is arguably the most intuitive and straightforward application of understanding human decision-making. In medicine, for example, we can combine a meaningful understanding of the basis on which decisions are made with normative standards for optimal decision-making in areas such as diagnosis, treatment, and resource allocation.

Furthermore, we can apply quantitative epistemology in a single-agent or multi-agent setting, using our understanding of decision-making to optimize decision-making across multiple individuals or groups, whether in a co-operative or a competitive setting.

2. Analysis of variation
In many fields such as healthcare, there is often remarkable regional, institutional, and subgroup-level variability in practice. This variability renders detection and quantification of biases crucial.

Quantitative epistemology can yield powerful tools to audit clinical decision-making to investigate variation in practice, biases, and sub-optimal decision-making, and understand where improvements can be made.

3. (Re)-Definition of Normative Standards
There are many areas in which normative standards have not been defined, or may need to be continually redefined. Through the application of quantitative epistemology, we can determine whether normative standards are realistic and effective representations of desired outcomes, enabling policy-makers to design better policies going forward.

4. Education and training
Quantitative epistemology aims to produce a data-driven, quantitative—and most importantly interpretable—description of the process by which humans form and adapt their beliefs and understanding of the world. This could yield enormous benefit in education and training: both the content and instructional methods employed in courses could be extensively tailored to specific individuals, taking into account their learning styles, biases, and preferences.

Quantitative epistemology in action: decision trajectories for Alzheimer’s patients

Intersection with other areas of research

Quantitative epistemology will complement and build upon projects across the lab’s other key research areas, including decision support systems, predictive analytics, automated ML, individualized treatment effect inference, interpretability, synthetic data, and more.

These points of intersection (and the immense potential for additional intersection) should be clear from the following descriptions of some of our initial projects related to quantitative epistemology.

Our work so far

Quantitative epistemology has become an area of significant focus for our lab’s researchers in recent years. Some of our first papers are shared below.

Online Decision Mediation
We develop a decision support assistant that serves as an intermediary between (oracle) expert behaviour and (imperfect) human behaviour. At each time, the algorithm observes an action chosen by a fallible agent, and decides whether to accept that agent’s decision, intervene with an alternative, or request the expert’s opinion. Successful mediation requires striking a balance between when to learn from the expert and when to intervene based on what is learned.

Online Decision Mediation

Daniel Jarrett, Alihan Hüyük, Mihaela van der Schaar

NeurIPS 2022
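
The accept / intervene / request loop described for Online Decision Mediation can be illustrated with a toy example. The sketch below is not the algorithm from the paper: it is a deliberately simple counting heuristic, and the class name `CountingMediator`, the `min_queries` threshold, and the `ask_expert` callback are all hypothetical names introduced here for illustration. It only shows the shape of the trade-off between learning from the expert and intervening based on what has been learned.

```python
from typing import Callable, Dict, Hashable, List, Tuple

ACCEPT, INTERVENE, REQUEST = "accept", "intervene", "request"


class CountingMediator:
    """Toy mediator (illustrative assumption, not the published method).

    For each context it counts which action the expert chose when asked.
    Until it has queried the expert `min_queries` times in a context it
    keeps requesting the expert's opinion; afterwards it accepts the human's
    action if it matches the expert's most frequent action, and otherwise
    intervenes with that action.
    """

    def __init__(self, n_actions: int, min_queries: int = 3) -> None:
        self.n_actions = n_actions
        self.min_queries = min_queries
        self.expert_counts: Dict[Hashable, List[int]] = {}

    def step(
        self,
        context: Hashable,
        human_action: int,
        ask_expert: Callable[[Hashable], int],
    ) -> Tuple[str, int]:
        """Return (decision, action actually taken) for one time step."""
        counts = self.expert_counts.setdefault(context, [0] * self.n_actions)

        # Not enough evidence yet: learn from the expert.
        if sum(counts) < self.min_queries:
            expert_action = ask_expert(context)
            counts[expert_action] += 1
            return REQUEST, expert_action

        # Enough evidence: compare the human's choice with what was learned.
        believed_best = max(range(self.n_actions), key=counts.__getitem__)
        if human_action == believed_best:
            return ACCEPT, human_action
        return INTERVENE, believed_best
```

In this toy version, raising `min_queries` means relying on the expert for longer before the mediator starts accepting or overriding the human's decisions on its own.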


Inverse Contextual Bandits: Learning How Behavior Evolves over Time
Conventional approaches to policy learning almost invariably assume stationarity in behaviour, but this is hardly true in practice: medical practice is constantly evolving as clinical professionals fine-tune their knowledge over time. To quantify how medical practice has been evolving, we develop a policy learning method that provides interpretable representations of decision-making, in particular capturing an agent’s non-stationary knowledge of the world, while operating in an offline manner.

Inverse Contextual Bandits: Learning How Behavior Evolves over Time

Alihan Hüyük, Daniel Jarrett, Mihaela van der Schaar

ICML 2022


Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies
We introduce a novel approach using deep state-space models to retrospectively estimate the factors that govern decision processes and how they change over time. By applying this technique to the analysis of organ donation acceptance decisions, we demonstrate its ability to provide valuable insights into human decision making and the potential for improving decision-making ability.

Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies

Alex J. Chan, Alicia Curth, Mihaela van der Schaar

ICLR 2022


POETREE: Interpretable Policy Learning with Adaptive Decision Trees
Policy Extraction through Decision Trees (POETREE) is a new framework for interpretable policy learning that is compatible with fully-offline and partially-observable clinical decision environments. It uses fully-differentiable tree architectures to learn a representation of patient history and adapt over time, resulting in decision tree policies that can outperform the state-of-the-art transparent models. This approach has the potential to improve future decision support systems and help us better understand, diagnose, and support real-world policies in healthcare.

POETREE: Interpretable Policy Learning with Adaptive Decision Trees

Alizée Pace, Alex J. Chan, Mihaela van der Schaar

ICLR 2022


Inferring Lexicographically-Ordered Rewards from Preferences
In modelling the preferences of agents over a set of alternatives, the dominant approach has been to find a single reward/utility function with the property that alternatives yielding higher rewards are preferred over alternatives yielding lower rewards. However, in many settings, preferences are based on multiple—often competing—objectives; a single reward function is not adequate to represent such preferences. In this paper, we propose a method for inferring multi-objective reward-based representations of an agent’s observed preferences.

Inferring Lexicographically-Ordered Rewards from Preferences

Alihan Hüyük, William R. Zame, Mihaela van der Schaar

AAAI 2022
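
To make the idea of lexicographically ordered objectives concrete, the following minimal sketch compares two alternatives whose rewards are listed in priority order. It illustrates only the preference structure, not the inference method from the paper; the function name `lex_prefers` and the tolerance parameter `tol` are assumptions introduced here.

```python
from typing import Sequence


def lex_prefers(a: Sequence[float], b: Sequence[float], tol: float = 0.0) -> int:
    """Compare two reward vectors lexicographically.

    `a` and `b` hold the rewards of two alternatives, ordered from the
    highest-priority objective to the lowest. Returns 1 if `a` is preferred,
    -1 if `b` is preferred, and 0 if neither is. A lower-priority objective
    is consulted only when all higher-priority rewards differ by at most
    `tol` (a hypothetical indifference threshold).
    """
    for reward_a, reward_b in zip(a, b):
        if reward_a - reward_b > tol:
            return 1
        if reward_b - reward_a > tol:
            return -1
    return 0


# First objective dominates: (2.0, 0.0) beats (1.0, 5.0), even though an
# unweighted sum of the rewards would rank them the other way round.
first_wins = lex_prefers((2.0, 0.0), (1.0, 5.0))

# Tie on the first objective, so the second objective decides.
second_decides = lex_prefers((1.0, 0.2), (1.0, 0.9))
```

With a positive `tol`, small differences in a high-priority objective are treated as ties, so the next objective decides: for instance, `lex_prefers((1.0, 0.5), (1.05, 0.1), tol=0.1)` falls through to the second objective.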


The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation
The Medkit-Learn(ing) Environment is a new tool that uses synthetic medical data to test and improve machine learning algorithms for imitation and inverse reinforcement learning in the healthcare field. With the Medkit, researchers can evaluate and compare their algorithms in a realistic medical setting, while inspecting their algorithms to validate that they learn appropriate features.

The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation

Alex J. Chan, Ioana Bica, Alihan Hüyük, Daniel Jarrett, Mihaela van der Schaar

NeurIPS 2021


Closing the loop in medical decision support by understanding clinical decision-making: A case study on organ transplantation
Using organ transplantation as a case study, we formalize the desiderata of methods for understanding clinical decision-making. We show that most existing machine learning methods are insufficient to meet these requirements and propose iTransplant, a novel data-driven framework to learn the factors affecting decisions on organ offers in an instance-wise fashion directly from clinical data, as a possible solution.

Closing the loop in medical decision support by understanding clinical decision-making: A case study on organ transplantation

Yuchao Qin, Fergus Imrie, Alihan Hüyük, Daniel Jarrett, Alexander Gimson, Mihaela van der Schaar

NeurIPS 2021


Inverse decision modeling (IDM)
In a paper accepted for publication at ICML 2021, we developed an expressive, unifying perspective on inverse decision modeling (IDM): a framework for learning parameterized representations of sequential decision behavior.

IDM enables us to quantify intuitive notions of bounded rationality—such as the apparent flexibility of decisions, tolerance for surprise, or optimism in beliefs—while also making such representations interpretable. In presenting IDM, we highlight its potential utility in real-world settings as an investigative device for auditing and understanding human decision-making.

Inverse Decision Modeling: Learning Interpretable Representations of Behavior

Daniel Jarrett, Alihan Hüyük, Mihaela van der Schaar

ICML 2021


Approximate Variational Reward Imitation Learning (AVRIL)
AVRIL, presented in a paper published at ICLR 2021, offers yet another potential approach to addressing the problem of studying decision-making in settings in which there is no access to knowledge of the environment dynamics nor intrinsic reward, nor even the ability to interact and test policies. As explained directly below, AVRIL offers reward inference in environments beyond the scope of current methods, as well as task performance competitive with focused offline imitation learning algorithms.

Scalable Bayesian Inverse Reinforcement Learning

Alex Chan, Mihaela van der Schaar

ICLR 2021


Counterfactual inverse reinforcement learning (CIRL)
In a paper published at ICLR 2021, we proposed learning explanations of expert decisions by modeling their reward function in terms of preferences with respect to counterfactual “what if” outcomes. In healthcare, for example, treatments often affect several patient covariates, by having both benefits and side-effects; decision-makers often make choices based on their preferences over these outcomes. By presenting decision-makers with counterfactuals, we can present them with potential outcomes of a particular action and model their preferences and reward functions. In the context of healthcare, doing this could enable us to quantify and inspect policies in different institutions and uncover the trade-offs and preferences associated with expert actions, as well as revealing the tendencies of individual practitioners to treat various diseases more or less aggressively.

Learning “What-if” Explanations for Sequential Decision-Making

Ioana Bica, Daniel Jarrett, Alihan Hüyük, Mihaela van der Schaar

ICLR 2021
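
The idea of a reward expressed as preferences over counterfactual outcomes can be sketched in a few lines. The example below assumes, purely for illustration, that the reward for an action is a weighted sum of its predicted “what if” outcomes; the function name, the outcome values, and the weights are all hypothetical, and in practice both the counterfactual predictions and the weights would have to be estimated from data.

```python
from typing import Dict, Sequence


def counterfactual_rewards(
    weights: Sequence[float],
    outcomes_by_action: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Score each action by a weighted sum of its predicted outcomes.

    `outcomes_by_action[action]` holds the predicted outcome vector if that
    action were taken; `weights` encodes how much the decision-maker is
    assumed to care about each outcome dimension.
    """
    return {
        action: sum(w * y for w, y in zip(weights, outcomes))
        for action, outcomes in outcomes_by_action.items()
    }


# Hypothetical predictions: (benefit, side-effect severity) per action.
predicted = {"treat": (0.8, 0.6), "wait": (0.3, 0.0)}

# A decision-maker who penalises side effects mildly...
mild = counterfactual_rewards((1.0, -0.5), predicted)
# ...and one who penalises them as strongly as they value the benefit.
strict = counterfactual_rewards((1.0, -1.0), predicted)

mild_choice = max(mild, key=mild.get)        # "treat"
strict_choice = max(strict, key=strict.get)  # "wait"
```

Under this assumed form, two decision-makers facing identical predicted outcomes can prefer different actions, and the weights make that difference in trade-offs explicit and inspectable.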


Interpretable policy learning (INTERPOLE)
The motivation behind INTERPOLE, introduced in a paper published at ICLR 2021, was to create a transparent description of behavior capable of locating the factors that contribute to individual decisions, in a language that can be readily understood by domain experts. Classical imitation learning approaches incorporate black-box hidden states that are rarely amenable to meaningful interpretation, while apprenticeship learning algorithms only offer high-level reward mappings that are not informative as to individual actions observed in the data. Additionally, INTERPOLE aims to accommodate partial observability, and operate completely offline.

During our work on INTERPOLE, we conducted experiments on both simulated and real-world data for the problem of Alzheimer’s disease diagnosis. We then sought feedback on our approach through a survey of 9 clinicians, who expressed an overwhelming preference for INTERPOLE by comparison with other potential approaches. Further details are provided earlier on this page.

Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning

Alihan Hüyük, Daniel Jarrett, Cem Tekin, Mihaela van der Schaar

ICLR 2021


Inverse active sensing
The first paper resulting from this push into new territory was titled “Inverse Active Sensing: Modeling and Understanding Timely Decision-Making,” and was published at ICML 2020. The paper takes the familiar concept of active sensing (the goal-oriented problem of efficiently selecting which information to acquire, and when and what decision to settle on) and inverts it, seeking to uncover an agent’s preferences and strategy for acquiring information given their observable decision-making behavior.

Inverse active sensing has a diverse range of potential applications both in and beyond healthcare. A particularly salient application might be understanding decision-making around diagnosis of patients. For instance, we expect doctors to care much more about correctly diagnosing a lethal disease than another condition that presents with similar symptoms, but do they actually? By how much? Inverse active sensing can help us answer questions like these by uncovering preferences that effectively underlie observed decision behavior.

Inverse Active Sensing: Modeling and Understanding Timely Decision-Making

Daniel Jarrett, Mihaela van der Schaar

ICML 2020


The path ahead

The work above showcases our first few tentative steps into quantitative epistemology. We have committed to substantial further investment of our lab’s time and resources on a long-term basis.

Quantitative epistemology can yield fascinating new insights into how humans learn and make decisions, and can bring about a new type of human-machine partnership based on empowerment, not replacement. While existing approaches (shown in blue above) can be incorporated into our research and recent work by our own lab (shown in purple) has helped us lay a partial foundation for this new area of research, we are truly entering uncharted territory. There are many complex questions (shown in green) to explore, and practically unlimited new discoveries to make. Our sincere hope is that our readers will share our vision for quantitative epistemology, and consider developing new machine learning methods within the quantitative epistemology agenda.

Going forward, our priorities will be:
– to hone our vision for what quantitative epistemology can become, how we can create a new human-machine partnership, and how in practice this can deliver social benefit (in and beyond healthcare);
– to construct a comprehensive theoretical foundation that will serve as the basis for development of models and methods (our ICML 2021 paper on inverse decision modeling is just one initial example of this); and
– to solve specific real-world problems in partnership with our network of clinical collaborators (such as the Alzheimer’s diagnosis example earlier on this page), while also using newly developed approaches to support clinical auditing, address variation in practice, and encourage the introduction of more quantitative and principled clinical guidelines in complex areas such as cancer and transplantation.

As we continue to expand the boundaries of quantitative epistemology ever further, this page will serve as a living map documenting our latest discoveries and reflecting our evolving understanding of this brand new area of research. Please continue to check back here for the latest updates.

You can find our related publications here.

Videos: NeurIPS 2021, ICML 2021, and Inspiration Exchange engagement session

This invited talk, entitled “Quantitative epistemology – empowering human meta-learning using machine learning,” was given by Mihaela van der Schaar on December 13, 2021, as part of the Workshop on Meta-Learning (MetaLearn) running alongside NeurIPS 2021.

On July 23, 2021, Mihaela van der Schaar gave a keynote talk entitled “Quantitative epistemology – conceiving a new human-machine partnership” as part of the ICML 2021 Interpretable Machine Learning in Healthcare (IMLH) Workshop.

The full talk can be found below, and is highly recommended viewing for anyone who would like to know more or get involved in the quantitative epistemology research agenda.

Our primary means of building a shared vision for machine learning for healthcare is through two groups of online engagement sessions: Inspiration Exchange (for machine learning students) and Revolutionizing Healthcare (for the healthcare community). If you would like to get involved, please visit the page below.

On July 12, 2021, our lab held an Inspiration Exchange session dedicated to introducing quantitative epistemology—including theory, approaches, and future directions for this brand new research area. The recording is available directly below.