The van der Schaar Lab regularly publishes papers at the top conferences in AI and machine learning (AISTATS, ICML, ICLR, and NeurIPS), and in leading journals.
How to use this page
You can use the table below to filter our publications by a range of categories, including author, year, conference, and research area. You can also use the search bar for custom queries—for example, to look up papers whose titles or abstracts include specific terms like “counterfactual,” or papers published in a certain journal, or even specific DOIs.
Clicking on a paper will show you relevant information, including the paper’s abstract and a URL to the paper’s published location (note: very recent papers may have been accepted for publication but may not yet be available online).
By default, the table below displays all of our published papers. Selecting the Top conferences tab will show you all of our papers published at AISTATS, ICLR, ICML, and NeurIPS.
If you want to share a URL linking to a specific type or group of papers (for example, papers on AutoML or papers from 2018), click the Shareable URLs tab.
Timestamp | Title | Authors | Year | URL | Abstract | DOI | Publication type | Conference | Journal/book name | ML sub-field | Clinical application | Past research |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2023/04/25 00:00 | Membership Inference Attacks against Synthetic Data through Overfitting Detection | B. van Breugel, H. Sun, Z. Qian, M. van der Schaar | 2023 | https://arxiv.org/abs/2302.12580 | Data is the foundation of most science. Unfortunately, sharing data can be obstructed by the risk of violating data privacy, impeding research in fields like healthcare. Synthetic data is a potential solution. It aims to generate data that has the same distribution as the original data, but that does not disclose information about individuals. Membership Inference Attacks (MIAs) are a common privacy attack, in which the attacker attempts to determine whether a particular real sample was used for training of the model. Previous works that propose MIAs against generative models either display low performance -- giving the false impression that data is highly private -- or need to assume access to internal generative model parameters -- a relatively low-risk scenario, as the data publisher often only releases synthetic data, not the model. In this work we argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution. We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model. Experimentally we show that DOMIAS is significantly more successful at MIA than previous work, especially at attacking uncommon samples. The latter is disconcerting since these samples may correspond to underrepresented groups. We also demonstrate how DOMIAS' MIA performance score provides an interpretable metric for privacy, giving data publishers a new tool for achieving the desired privacy-utility trade-off in their synthetic data. | Conference | AISTATS | Privacy-preserving ML & synthetic data | ||||
2023/04/25 00:00 | To Impute or not to Impute? Missing Data in Treatment Effect Estimation | J. Berrevoets, F. Imrie, T. Kyono, J. Jordon, M. van der Schaar | 2023 | https://proceedings.mlr.press/v206/berrevoets23a.html | Missing data is a systemic problem in practical scenarios that causes noise and bias when estimating treatment effects. This makes treatment effect estimation from data with missingness a particularly tricky endeavour. A key reason for this is that standard assumptions on missingness are rendered insufficient due to the presence of an additional variable, treatment, besides the input (e.g. an individual) and the label (e.g. an outcome). The treatment variable introduces additional complexity with respect to why some variables are missing that is not fully explored by previous work. In our work we introduce mixed confounded missingness (MCM), a new missingness mechanism where some missingness determines treatment selection and other missingness is determined by treatment selection. Given MCM, we show that naively imputing all data leads to poorly performing treatment effects models, as the act of imputation effectively removes information necessary to provide unbiased estimates. However, no imputation at all also leads to biased estimates, as missingness determined by treatment introduces bias in covariates. Our solution is selective imputation, where we use insights from MCM to inform precisely which variables should be imputed and which should not. We empirically demonstrate how various learners benefit from selective imputation compared to other solutions for missing data. We highlight that our experiments encompass both average treatment effects and conditional average treatment effects. | Conference | AISTATS | Causal inference | Treatment & trials, Missing data imputation | |||
2023/04/25 00:00 | Improving Adaptive Conformal Prediction Using Self-Supervised Learning | N. Seedat*, A. Jeffares*, F. Imrie, M. van der Schaar | 2023 | https://proceedings.mlr.press/v206/seedat23a/seedat23a.pdf | Conformal prediction is a powerful distribution-free tool for uncertainty quantification, establishing valid prediction intervals with finite-sample guarantees. To produce valid intervals which are also adaptive to the difficulty of each instance, a common approach is to compute normalized nonconformity scores on a separate calibration set. Self-supervised learning has been effectively utilized in many domains to learn general representations for downstream predictors. However, the use of self-supervision beyond model pretraining and representation learning has been largely unexplored. In this work, we investigate how self-supervised pretext tasks can improve the quality of the conformal regressors, specifically by improving the adaptability of conformal intervals. We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores. We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals. | https://doi.org/10.48550/arXiv.2302.12238 | Conference | AISTATS | Deep learning, Uncertainty estimation | |||
2023/04/25 00:00 | T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression | Y. Qin, M. van der Schaar, C. Lee | 2023 | https://arxiv.org/abs/2302.12619 | Clustering time-series data in healthcare is crucial for clinical phenotyping to understand patients' disease progression patterns and to design treatment guidelines tailored to homogeneous patient subgroups. While rich temporal dynamics enable the discovery of potential clusters beyond static correlations, two major challenges remain outstanding: i) discovery of predictive patterns from many potential temporal correlations in the multi-variate time-series data and ii) association of individual temporal patterns to the target label distribution that best characterizes the underlying clinical progression. To address such challenges, we develop a novel temporal clustering method, T-Phenotype, to discover phenotypes of predictive temporal patterns from labeled time-series data. We introduce an efficient representation learning approach in frequency domain that can encode variable-length, irregularly-sampled time-series into a unified representation space, which is then applied to identify various temporal patterns that potentially contribute to the target label using a new notion of path-based similarity. Throughout the experiments on synthetic and real-world datasets, we show that T-Phenotype achieves the best phenotype discovery performance over all the evaluated baselines. We further demonstrate the utility of T-Phenotype by uncovering clinically meaningful patient subgroups characterized by unique temporal patterns. | Conference | AISTATS | Time series analysis | Phenotyping & subgroup analysis | |||
2023/04/25 00:00 | Neural Laplace Control for Continuous-time Delayed Systems | S. Holt, A. Hüyük, M. van der Schaar | 2023 | https://arxiv.org/abs/2302.12604 | Many real-world offline reinforcement learning (RL) problems involve continuous-time environments with delays. Such environments are characterised by two distinctive features: firstly, the state x(t) is observed at irregular time intervals, and secondly, the current action a(t) only affects the future state x(t + g) with an unknown delay g > 0. A prime example of such an environment is satellite control where the communication link between earth and a satellite causes irregular observations and delays. Existing offline RL algorithms have achieved success in environments with irregularly observed states in time or known delays. However, environments involving both irregular observations in time and unknown delays remains an open and challenging problem. To this end, we propose Neural Laplace Control, a continuous-time model-based offline RL method that combines a Neural Laplace dynamics model with a model predictive control (MPC) planner—and is able to learn from an offline dataset sampled with irregular time intervals from an environment that has an inherent unknown constant delay. We show experimentally on continuous-time delayed environments that it is able to achieve near-expert policy performance. | Conference | AISTATS | Time series analysis | ||||
2023/04/25 00:00 | Understanding the Impact of Competing Risks on Heterogeneous Treatment Effect Estimation from Time-to-Event Data | A. Curth, M. van der Schaar | 2023 | https://arxiv.org/abs/2302.12718 | We study the problem of inferring heterogeneous treatment effects (HTEs) from time-to-event data in the presence of competing risks. Despite its great practical relevance, this problem has received little attention compared to its counterparts studying treatment effect estimation without time-to-event data or competing risks. We take an outcome modeling approach to estimating HTEs, and consider how and when existing prediction models for time-to-event data can be used as plug-in estimators for potential outcome predictions. We then investigate whether competing risks present new challenges for HTE estimation – in addition to the standard confounding problem – and find that, as there are multiple definitions of causal effects in this setting – namely total, direct and separable effects – competing risks can act as an additional source of covariate shift depending on the desired treatment effect interpretation and associated estimand. We theoretically analyze and empirically illustrate when and how these challenges play a role when using generic machine learning prediction models for the estimation of HTEs. | Conference | AISTATS | Treatment & trials | ||||
2023/04/25 00:00 | SurvivalGAN: Generating time-to-event Data for Survival Analysis | A. Norcliffe*, B. Cebere*, F. Imrie, P. Lio, M. van der Schaar | 2023 | https://arxiv.org/abs/2302.12749 | Synthetic data is becoming an increasingly promising technology for research; successful application can improve privacy, fairness and data democratization. While there are many methods for generating synthetic tabular data, the task remains non-trivial and unexplored for specific scenarios. One such scenario is survival data, where the key difficulty is censoring: we don’t know the time of event or whether one even occurred. Imbalance in censoring and time horizons causes generative models to experience three new failure modes specific to survival analysis: generating too few at-risk members; generating too many at-risk members; and censoring too early. We formalize these failure modes and provide three new generative metrics to quantify them. Following this, we propose SurvivalGAN, a generative model that handles survival data firstly by addressing the imbalance in the censoring and time horizons, and secondly by using a dedicated mechanism for approximating time-to-event/censoring. We evaluate this method via extensive experiments on medical datasets. SurvivalGAN outperforms multiple baselines at generating survival data, and in particular addresses the failure modes as measured by the new metrics, improving downstream performance of survival models. | Conference | AISTATS | Data-centric AI & reliable ML | ||||
2023/05/26 13:25 | TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization | A. Jeffares*, T. Liu*, J. Crabbé, F. Imrie, M. van der Schaar | 2023 | https://openreview.net/forum?id=n6H86gW8u0d | Despite their success with unstructured data, deep neural networks are not yet a panacea for structured tabular data. In the tabular domain, their efficiency crucially relies on various forms of regularization to prevent overfitting and provide strong generalization performance. Existing regularization techniques include broad modelling decisions such as choice of architecture, loss functions, and optimization methods. In this work, we introduce Tabular Neural Gradient Orthogonalization and Specialization (TANGOS), a novel framework for regularization in the tabular setting built on latent unit attributions. The gradient attribution of an activation with respect to a given input feature suggests how the neuron attends to that feature, and is often employed to interpret the predictions of deep networks. In TANGOS, we take a different approach and incorporate neuron attributions directly into training to encourage orthogonalization and specialization of latent attributions in a fully-connected network. Our regularizer encourages neurons to focus on sparse, non-overlapping input features and results in a set of diverse and specialized latent units. In the tabular domain, we demonstrate that our approach can lead to improved out-of-sample generalization performance, outperforming other popular regularization methods. We provide insight into why our regularizer is effective and demonstrate that TANGOS can be applied jointly with existing methods to achieve even greater generalization performance. | 10.48550/arXiv.2303.05506 | Conference | ICLR | Deep learning | |||
2023/05/26 16:49 | GOGGLE: Generative Modelling for Tabular Data by Learning Relational Structure | T. Liu, Z. Qian, J. Berrevoets, M. van der Schaar | 2023 | https://openreview.net/forum?id=fPVRcJqspu | Deep generative models learn highly complex and non-linear representations to generate realistic synthetic data. While they have achieved notable success in computer vision and natural language processing, similar advances have been less demonstrable in the tabular domain. This is partially because generative modelling of tabular data entails a particular set of challenges, including heterogeneous relationships, limited number of samples, and difficulties in incorporating prior knowledge. Additionally, unlike their counterparts in image and sequence domain, deep generative models for tabular data almost exclusively employ fully-connected layers, which encode weak inductive biases about relationships between inputs. Real-world data generating processes can often be represented using relational structures, which encode sparse, heterogeneous relationships between variables. In this work, we learn and exploit relational structure underlying tabular data to better model variable dependence, and as a natural means to introduce regularization on relationships and include prior knowledge. Specifically, we introduce GOGGLE, an end-to-end message passing scheme that jointly learns the relational structure and corresponding functional relationships as the basis of generating synthetic samples. Using real-world datasets, we provide empirical evidence that the proposed method is effective in generating realistic synthetic data and exploiting domain knowledge for downstream tasks. | Conference | ICLR | Deep learning, Privacy-preserving ML & synthetic data | ||||
2023/05/29 12:14 | When to Make and Break Commitments? | A. Hüyük, Z. Qian, M. van der Schaar | 2023 | https://openreview.net/forum?id=q8vgHfPdoQP | In many scenarios, decision-makers must commit to long-term actions until their resolution before receiving the payoff of said actions, and usually, staying committed to such actions incurs continual costs. For instance, in healthcare, a newly discovered treatment cannot be marketed to patients until a clinical trial is conducted, which both requires time and is also costly. Of course in such scenarios, not all commitments eventually pay off. For instance, a clinical trial might end up failing to show efficacy. Given the time pressure created by the continual cost of keeping a commitment, we aim to answer: When should a decision-maker break a commitment that is likely to fail—either to make an alternative commitment or to make no further commitments at all? First, we formulate this question as a new type of optimal stopping/switching problem called the optimal commitment problem (OCP). Then, we theoretically analyse OCP, and based on the insight we gain, propose a practical algorithm for solving it. Finally, we empirically evaluate the performance of our algorithm in running clinical trials with subpopulation selection. | Conference | ICLR | Quantitative epistemology (understanding decision-making) | Treatment & trials | |||
2023/05/29 12:15 | Deep Generative Symbolic Regression | S. Holt, Z. Qian, M. van der Schaar | 2023 | https://openreview.net/forum?id=o7koEEMA1bR | Symbolic regression (SR) aims to discover concise closed-form mathematical equations from data, a task fundamental to scientific discovery. However, the problem is highly challenging because closed-form equations lie in a complex combinatorial search space. Existing methods, ranging from heuristic search to reinforcement learning, fail to scale with the number of input variables. We make the observation that closed-form equations often have structural characteristics and invariances (e.g., the commutative law) that could be further exploited to build more effective symbolic regression solutions. Motivated by this observation, our key contribution is to leverage pre-trained deep generative models to capture the intrinsic regularities of equations, thereby providing a solid foundation for subsequent optimization steps. We show that our novel formalism unifies several prominent approaches of symbolic regression and offers a new perspective to justify and improve on the previous ad hoc designs, such as the usage of cross-entropy loss during pre-training. Specifically, we propose an instantiation of our framework, Deep Generative Symbolic Regression (DGSR). In our experiments, we show that DGSR achieves a higher recovery rate of true equations in the setting of a larger number of input variables, and it is more computationally efficient at inference time than state-of-the-art RL symbolic regression solutions. | Conference | ICLR | |||||
2023/07/23 00:00 | Differentiable and Transportable Structure Learning | J. Berrevoets, N. Seedat, F. Imrie, M. van der Schaar | 2023 | https://arxiv.org/abs/2206.06354 | Directed acyclic graphs (DAGs) encode a lot of information about a particular distribution in its structure. However, compute required to infer these structures is typically super-exponential in the number of variables, as inference requires a sweep of a combinatorially large space of potential structures. That is, until recent advances made it possible to search this space using a differentiable metric, drastically reducing search time. While this technique -- named NOTEARS -- is widely considered a seminal work in DAG-discovery, it concedes an important property in favour of differentiability: transportability. To be transportable, the structures discovered on one dataset must apply to another dataset from the same domain. In our paper, we introduce D-Struct which recovers transportability in the discovered structures through a novel architecture and loss function, while remaining completely differentiable. Because D-Struct remains differentiable, our method can be easily adopted in existing differentiable architectures, as was previously done with NOTEARS. In our experiments, we empirically validate D-Struct with respect to edge accuracy and structural Hamming distance in a variety of settings. | Conference | ICML | Causal inference | Scientific discovery | |||
2023/07/23 00:00 | Learning Representations Without Compositional Assumptions | T. Liu, J. Berrevoets, Z. Qian, M. van der Schaar | 2023 | https://arxiv.org/abs/2305.19726 | This paper addresses the issue of unsupervised representation learning on tabular datasets that contain feature sets from various sources of measurements. Traditional methods, which tackle this problem using the multi-view framework, are constrained by predefined assumptions that assume feature sets share the same information and representations should learn globally shared factors. However, this assumption is not always valid for real-world tabular datasets with complex dependencies between feature sets, resulting in localized information that is harder to learn. To overcome this limitation, we propose a data-driven approach that learns feature set dependencies by representing feature sets as graph nodes and their relationships as learnable edges. Furthermore, we introduce LEGATO, a novel hierarchical graph autoencoder that learns a smaller, latent graph to aggregate information from multiple views dynamically. This approach results in latent graph components that specialize in capturing localized information from different regions of the input, leading to superior downstream performance. | Conference | ICML | Deep learning | ||||
2023/07/23 00:00 | In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation | A. Curth, M. van der Schaar | 2023 | https://arxiv.org/abs/2302.02923 | Personalized treatment effect estimates are often of interest in high-stakes applications -- thus, before deploying a model estimating such effects in practice, one needs to be sure that the best candidate from the ever-growing machine learning toolbox for this task was chosen. Unfortunately, due to the absence of counterfactual information in practice, it is usually not possible to rely on standard validation metrics for doing so, leading to a well-known model selection dilemma in the treatment effect estimation literature. While some solutions have recently been investigated, systematic understanding of the strengths and weaknesses of different model selection criteria is still lacking. In this paper, instead of attempting to declare a global `winner', we therefore empirically investigate success- and failure modes of different selection criteria. We highlight that there is a complex interplay between selection strategies, candidate estimators and the data used for comparing them, and provide interesting insights into the relative (dis)advantages of different criteria alongside desiderata for the design of further illuminating empirical studies in this context. | Conference | ICML | Causal inference | Treatment & trials | |||
2023/07/23 00:00 | Adaptive Identification of Populations with Treatment Benefit in Clinical Trials: Machine Learning Challenges and Solutions | A. Curth, A. Hüyük, M. van der Schaar | 2023 | https://arxiv.org/abs/2208.05844 | We study the problem of adaptively identifying patient subpopulations that benefit from a given treatment during a confirmatory clinical trial. This type of adaptive clinical trial has been thoroughly studied in biostatistics, but has been allowed only limited adaptivity so far. Here, we aim to relax classical restrictions on such designs and investigate how to incorporate ideas from the recent machine learning literature on adaptive and online experimentation to make trials more flexible and efficient. We find that the unique characteristics of the subpopulation selection problem -- most importantly that (i) one is usually interested in finding subpopulations with any treatment benefit (and not necessarily the single subgroup with largest effect) given a limited budget and that (ii) effectiveness only has to be demonstrated across the subpopulation on average -- give rise to interesting challenges and new desiderata when designing algorithmic solutions. Building on these findings, we propose AdaGGI and AdaGCPI, two meta-algorithms for subpopulation construction. We empirically investigate their performance across a range of simulation scenarios and derive insights into their (dis)advantages across different settings. | Conference | ICML | Causal inference, Next-generation clinical trials | Treatment & trials | |||
2023/07/23 00:00 | Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time | T. Vanderschueren*, A. Curth*, W. Verbeke, M. van der Schaar | 2023 | https://arxiv.org/abs/2306.04255 | Machine learning (ML) holds great potential for accurately forecasting treatment outcomes over time, which could ultimately enable the adoption of more individualized treatment strategies in many practical applications. However, a significant challenge that has been largely overlooked by the ML literature on this topic is the presence of informative sampling in observational data. When instances are observed irregularly over time, sampling times are typically not random, but rather informative -- depending on the instance's characteristics, past outcomes, and administered treatments. In this work, we formalize informative sampling as a covariate shift problem and show that it can prohibit accurate estimation of treatment outcomes if not properly accounted for. To overcome this challenge, we present a general framework for learning treatment outcomes in the presence of informative sampling using inverse intensity-weighting, and propose a novel method, TESAR-CDE, that instantiates this framework using Neural CDEs. Using a simulation environment based on a clinical use case, we demonstrate the effectiveness of our approach in learning under informative sampling. | Conference | ICML | Causal inference, Time series analysis | Treatment & trials | |||
2023/07/23 00:00 | Synthetic data, real errors: how (not) to publish and use synthetic data | B. van Breugel, Z. Qian, M. van der Schaar | 2023 | https://arxiv.org/abs/2305.09235 | Generating synthetic data through generative models is gaining interest in the ML community and beyond, promising a future where datasets can be tailored to individual needs. Unfortunately, synthetic data is usually not perfect, resulting in potential errors in downstream tasks. In this work we explore how the generative process affects the downstream ML task. We show that the naive synthetic data approach -- using synthetic data as if it is real -- leads to downstream models and analyses that do not generalize well to real data. As a first step towards better ML in the synthetic data regime, we introduce Deep Generative Ensemble (DGE) -- a framework inspired by Deep Ensembles that aims to implicitly approximate the posterior distribution over the generative process model parameters. DGE improves downstream model training, evaluation, and uncertainty quantification, vastly outperforming the naive approach on average. The largest improvements are achieved for minority classes and low-density regions of the original data, for which the generative uncertainty is largest. | Conference | ICML | |||||
2023/01/13 13:08 | External validity of machine learning-based prognostic scores for cystic fibrosis: A retrospective study using the UK and Canadian registries | Y. Qin, A. Alaa, A. Floto, M. van der Schaar | 2023 | https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000179 | Precise and timely referral for lung transplantation is critical for the survival of cystic fibrosis patients with terminal illness. While machine learning (ML) models have been shown to achieve significant improvement in prognostic accuracy over current referral guidelines, the external validity of these models and their resulting referral policies has not been fully investigated. Here, we studied the external validity of machine learning-based prognostic models using annual follow-up data from the UK and Canadian Cystic Fibrosis Registries. Using a state-of-the-art automated ML framework, we derived a model for predicting poor clinical outcomes in patients enrolled in the UK registry, and conducted external validation of the derived model using the Canadian Cystic Fibrosis Registry. In particular, we studied the effect of (1) natural variations in patient characteristics across populations and (2) differences in clinical practice on the external validity of ML-based prognostic scores. Overall, a decrease in prognostic accuracy on the external validation set (AUCROC: 0.88, 95% CI 0.88-0.88) was observed compared to the internal validation accuracy (AUCROC: 0.91, 95% CI 0.90-0.92). Based on our ML model, analysis of feature contributions and risk strata revealed that, while external validation of ML models exhibited high precision on average, both factors (1) and (2) can undermine the external validity of ML models in patient subgroups with moderate risk for poor outcomes. A significant boost in prognostic power (F1 score) from 0.33 (95% CI 0.31-0.35) to 0.45 (95% CI 0.45-0.45) was observed in external validation when variations in these subgroups were accounted for in our model. Our study highlighted the significance of external validation of ML models for cystic fibrosis prognostication. The uncovered insights on key risk factors and patient subgroups can be used to guide the cross-population adaptation of ML-based models and inspire new research on applying transfer learning methods for fine-tuning ML models to cope with regional variations in clinical care. | 10.1371/journal.pdig.0000179 | Journal | PLOS Digital Health | Risk & prognosis | |||
2022/12/05 21:44 | The Potential and Pitfalls of Artificial Intelligence in Clinical Pharmacology | M. Johnson, M. Patel, A. Phipps, M. van der Schaar, D. Boulton, M. Gibbs | 2023 | https://ascpt.onlinelibrary.wiley.com/doi/10.1002/psp4.12902 | Artificial intelligence (AI) involves using data and algorithms to perform activities normally achieved through human intelligence. AI and its key component machine learning contextualize data and enhance decision making to transform how we operate, discover, and develop drugs. Transforming clinical pharmacology (CP) as AI-augmented CP (AI/CP) requires an ecosystem including digitized data collection, standardized processes, complementary technologies, and an ethical framework. This commentary aims to highlight the future perspectives of AI/CP in drug development. | 10.1002/psp4.12902 | Journal | CPT: Pharmacometrics & Systems Pharmacology | Clinical practice | |||
2023/03/06 10:43 | Synthcity: facilitating innovative use cases of synthetic data in different data modalities | Z. Qian, B. Cebere, M. van der Schaar | 2023 | https://arxiv.org/pdf/2301.07573.pdf | Synthcity is an open-source software package for innovative use cases of synthetic data in ML fairness, privacy and augmentation across diverse tabular data modalities, including static data, regular and irregular time series, data with censoring, multi-source data, composite data, and more. Synthcity provides the practitioners with a single access point to cutting edge research and tools in synthetic data. It also offers the community a playground for rapid experimentation and prototyping, a one-stop-shop for SOTA benchmarks, and an opportunity for extending research impact. The library can be accessed on GitHub and pip. We warmly invite the community to join the development effort by providing feedback, reporting bugs, and contributing code. | Journal | arXiv | Privacy-preserving ML & synthetic data | ||||
2023/04/24 12:53 | Synthetic Model Combination: A new machine-learning method for pharmacometric model ensembling | A. Chan, R. Peck, M. Gibbs, M. van der Schaar | 2023 | https://ascpt.onlinelibrary.wiley.com/doi/10.1002/psp4.12965 | When aiming to make predictions over targets in the pharmacological setting, a data-focused approach aims to learn models based on a collection of labeled examples. Unfortunately, data sharing is not always possible, and this can result in many different models trained on disparate populations, leading to the natural question of how best to use and combine them when making a new prediction. Previous work has focused on global model selection or ensembling, with the result of a single final model across the feature space. Machine-learning models perform notoriously poorly on data outside their training domain, however, due to a problem known as covariate shift, and so we argue that when ensembling models the weightings for individual instances must reflect their respective domains—in other words, models that are more likely to have seen information on that instance should have more attention paid to them. We introduce a method for such an instance-wise ensembling of models called Synthetic Model Combination (SMC), including a novel representation learning step for handling sparse high-dimensional domains. We demonstrate the use of SMC on an example with dosing predictions for vancomycin, although emphasize the applicability of the method to any scenario involving the use of multiple models. | 10.1002/psp4.12965 | Journal | CPT: Pharmacometrics & Systems Pharmacology | Transfer learning, Ensemble learning | Clinical practice | ||
2023/04/27 14:43 | Sex Differences in Heart Failure Following Acute Coronary Syndromes | E. Cenko*, O. Manfrini*, J. Yoon, M. van der Schaar, M. Bergami, Z. Vasiljevic, G. Mendieta, G. Stankovic, M. Vavlukis, S. Kedev, D. Miličić, L. Badimon, R. Bugiardini | 2023 | https://www.sciencedirect.com/science/article/pii/S2772963X23000492 | Background There have been conflicting reports regarding outcomes in women presenting with an acute coronary syndrome (ACS). Objectives The objective of the study was to examine sex-specific differences in 30-day mortality in patients with ACS and acute heart failure (HF) at the time of presentation. Methods This was a retrospective study of patients included in the International Survey of Acute Coronary Syndromes (ISACS Archives-NCT04008173). Acute HF was defined as Killip classes ≥2. Participants were stratified according to ACS presentation: ST-segment elevation myocardial infarction (STEMI) and non-ST-segment elevation ACS (NSTE-ACS). Differences in 30-day mortality and acute HF presentation at admission between sexes were examined using inverse propensity weighting based on the propensity score. Estimates were compared by test of interaction on the log scale. Results A total of 87,812 patients were included, of whom 30,922 (35.2%) were women. Mortality was higher in women compared with men in those presenting with STEMI (risk ratio (RR): 1.65; 95% CI: 1.56-1.73) and NSTE-ACS (RR: 1.18; 95% CI: 1.09-1.28; Pinteraction < 0.001). Acute HF was more common in women when compared to men with STEMI (RR: 1.24; 95% CI: 1.20-1.29) but not in those with NSTE-ACS (RR: 1.02; 95% CI: 0.97-1.08) (Pinteraction < 0.001). The presence of acute HF increased the risk of mortality for both sexes (odds ratio: 6.60; 95% CI: 6.25-6.98). Conclusions In patients presenting with ACS, mortality is higher in women. The presence of acute HF at hospital presentation increases the risk of mortality in both sexes. Women with STEMI are more likely to present with acute HF and this may, in part, explain sex differences in mortality. These findings may be helpful to improve sex-specific personalized risk stratification. | 10.1016/j.jacadv.2023.100294 | Journal | JACC: Advances | Risk & prognosis, Phenotyping & subgroup analysis | |||
2023/06/22 19:29 | AutoPrognosis 2.0: Democratizing Diagnostic and Prognostic Modeling in Healthcare with Automated Machine Learning | F. Imrie, B. Cebere, E. F. McKinney, M. van der Schaar | 2023 | https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000276 | Diagnostic and prognostic models are increasingly important in medicine and inform many clinical decisions. Recently, machine learning approaches have shown improvement over conventional modeling techniques by better capturing complex interactions between patient covariates in a data-driven manner. However, the use of machine learning introduces a number of technical and practical challenges that have thus far restricted widespread adoption of such techniques in clinical settings. To address these challenges and empower healthcare professionals, we present a machine learning framework, AutoPrognosis 2.0, to develop diagnostic and prognostic models. AutoPrognosis leverages state-of-the-art advances in automated machine learning to develop optimized machine learning pipelines, incorporates model explainability tools, and enables deployment of clinical demonstrators, without requiring significant technical expertise. Our framework eliminates the major technical obstacles to predictive modeling with machine learning that currently impede clinical adoption. To demonstrate AutoPrognosis 2.0, we provide an illustrative application where we construct a prognostic risk score for diabetes using the UK Biobank, a prospective study of 502,467 individuals. The models produced by our automated framework achieve greater discrimination for diabetes than expert clinical risk scores. Our risk score has been implemented as a web-based decision support tool and can be publicly accessed by patients and clinicians worldwide. In addition, AutoPrognosis 2.0 is provided as an open-source python package. By open-sourcing our framework as a tool for the community, clinicians and other medical practitioners will be able to readily develop new risk scores, personalized diagnostics, and prognostics using modern machine learning techniques. | 10.1371/journal.pdig.0000276 | Journal | PLOS Digital Health | Automated ML | |||
2023/06/29 12:32 | Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models | A. Bisercic, M. Nikolic, M. van der Schaar, B. Delibasic, P. Lio, A. Petrovic | 2023 | https://arxiv.org/pdf/2306.05052 | Tabular data is often hidden in text, particularly in medical diagnostic reports. Traditional machine learning (ML) models designed to work with tabular data cannot effectively process information in such form. On the other hand, large language models (LLMs), which excel at textual tasks, are probably not the best tool for modeling tabular data. Therefore, we propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM. Drawing upon the reasoning capabilities of LLMs, TEMED-LLM goes beyond traditional extraction techniques, accurately inferring tabular features, even when their names are not explicitly mentioned in the text. This is achieved by combining domain-specific reasoning guidelines with a proposed data validation and reasoning correction feedback loop. By applying interpretable ML models such as decision trees and logistic regression over the extracted and validated data, we obtain end-to-end interpretable predictions. We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics. Given its predictive performance, simplicity, and interpretability, TEMED-LLM underscores the potential of leveraging LLMs to improve the performance and trustworthiness of ML models in medical applications. | Other | arXiv | |||||
2023/06/29 12:36 | U-PASS: an Uncertainty-guided deep learning Pipeline for Automated Sleep Staging | E. Heremans, N. Seedat, B. Buyse, D. Testelmans, M. van der Schaar | 2023 | https://arxiv.org/pdf/2306.04663 | As machine learning becomes increasingly prevalent in critical fields such as healthcare, ensuring the safety and reliability of machine learning systems becomes paramount. A key component of reliability is the ability to estimate uncertainty, which enables the identification of areas of high and low confidence and helps to minimize the risk of error. In this study, we propose a machine learning pipeline called U-PASS tailored for clinical applications that incorporates uncertainty estimation at every stage of the process, including data acquisition, training, and model deployment. The training process is divided into a supervised pre-training step and a semi-supervised finetuning step. We apply our uncertainty-guided deep learning pipeline to the challenging problem of sleep staging and demonstrate that it systematically improves performance at every stage. By optimizing the training dataset, actively seeking informative samples, and deferring the most uncertain samples to an expert, we achieve an expert-level accuracy of 85% on a challenging clinical dataset of elderly sleep apnea patients, representing a significant improvement over the baseline accuracy of 75%. U-PASS represents a promising approach to incorporating uncertainty estimation into machine learning pipelines, thereby improving their reliability and unlocking their potential in clinical settings. | Other | arXiv | |||||
2023/06/29 12:38 | Automated machine learning as a partner in predictive modelling | T. Callender, M. van der Schaar | 2023 | https://www.thelancet.com/journals/landig/article/PIIS2589-7500(23)00054-7/fulltext | Machine learning promises to underpin personalised medicine. However, the expertise required to develop and deploy state-of-the-art machine learning algorithms has contributed to the inconsistent quality of model development, the shallow range of methods considered, and the relatively poor penetrance of machine learning models in clinical use. In this Comment, we discuss the emerging field of automated machine learning and propose that it could have a central role in the future of clinical risk prediction. We argue that automated machine learning can empower both modelling experts and nonexperts, democratise access to machine learning methods, and encode better standards in model development. Finally, we advocate that such frameworks be an initial step in model development to support practitioners to find the most suitable modelling approach for their question and to understand if machine learning … | Journal | The Lancet Digital Health | Risk & prognosis, Risk & disease trajectories | ||||
2023/06/29 12:39 | Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance | J. Crabbé, M. van der Schaar | 2023 | https://arxiv.org/pdf/2304.06715 | Interpretability methods are valuable only if their explanations faithfully describe the explained model. In this work, we consider neural networks whose predictions are invariant under a specific symmetry group. This includes popular architectures, ranging from convolutional to graph neural networks. Any explanation that faithfully explains this type of model needs to be in agreement with this invariance property. We formalize this intuition through the notion of explanation invariance and equivariance by leveraging the formalism from geometric deep learning. Through this rigorous formalism, we derive (1) two metrics to measure the robustness of any interpretability method with respect to the model symmetry group; (2) theoretical robustness guarantees for some popular interpretability methods and (3) a systematic approach to increase the invariance of any interpretability method with respect to a symmetry group. By empirically measuring our metrics for explanations of models associated with various modalities and symmetry groups, we derive a set of 5 guidelines to allow users and developers of interpretability methods to produce robust explanations. | Other | arXiv | Interpretability & explainability | ||||
2023/06/29 12:42 | Machine Learning with Requirements: a Manifesto | E. Giunchiglia, F. Imrie, M. van der Schaar, T. Lukasiewicz | 2023 | https://arxiv.org/pdf/2304.03674 | In recent years, machine learning has made great advancements that have been at the root of many breakthroughs in different application domains. However, it is still an open issue how to make them applicable to high-stakes or safety-critical application domains, as they can often be brittle and unreliable. In this paper, we argue that requirements definition and satisfaction can go a long way to make machine learning models even more fitting to the real world, especially in critical domains. To this end, we present two problems in which (i) requirements arise naturally, (ii) machine learning models are or can be fruitfully deployed, and (iii) neglecting the requirements can have dramatic consequences. We show how the requirements specification can be fruitfully integrated into the standard machine learning development pipeline, proposing a novel pyramid development process in which requirements definition may impact all the subsequent phases in the pipeline, and vice versa. | Other | arXiv | |||||
2023/06/29 12:44 | Beyond Privacy: Navigating the Opportunities and Challenges of Synthetic Data | B. van Breugel, M. van der Schaar | 2023 | https://arxiv.org/pdf/2304.03722 | Generating synthetic data through generative models is gaining interest in the ML community and beyond. In the past, synthetic data was often regarded as a means to private data release, but a surge of recent papers explore how its potential reaches much further than this -- from creating more fair data to data augmentation, and from simulation to text generated by ChatGPT. In this perspective we explore whether, and how, synthetic data may become a dominant force in the machine learning world, promising a future where datasets can be tailored to individual needs. Just as importantly, we discuss which fundamental challenges the community needs to overcome for wider relevance and application of synthetic data -- the most important of which is quantifying how much we can trust any finding or prediction drawn from synthetic data. | Other | arXiv | |||||
2023/06/29 12:46 | Matters Arising: PREDICT underestimates survival of patients with HER2-positive early-stage breast cancer | A. M. Alaa, A. L. Harris, M. van der Schaar | 2023 | https://www.nature.com/articles/s41523-023-00514-5 | In a recent paper in NPJ Breast Cancer, Agostinetto et al. 1 demonstrated the poor concordance between recently improved survival data for HER2-positive early-stage breast cancer with outcomes predicted by PREDICT 2.1. We replicated these findings in large-scale cohorts extracted from the UK and US patient registries and demonstrated that a publicly available machine learning-based prognostic model provides improved predictive accuracy. | Journal | npj Breast Cancer | |||||
2023/06/29 12:48 | Learning machines for health and beyond | M. Abroshan, O. Giles, S. Greenbury, J. Roberts, M. van der Schaar, J. S. Steyn, A. Wilson, M. Yong | 2023 | https://arxiv.org/pdf/2303.01513 | Machine learning techniques are effective for building predictive models because they are good at identifying patterns in large datasets. Development of a model for complex real life problems often stops at the point of publication, proof of concept or when made accessible through some mode of deployment. However, a model in the medical domain risks becoming obsolete as soon as patient demographics change. The maintenance and monitoring of predictive models post-publication is crucial to guarantee their safe and effective long term use. As machine learning techniques are effectively trained to look for patterns in available datasets, the performance of a model for complex real life problems will not peak and remain fixed at the point of publication or even point of deployment. Rather, data changes over time, and it also changes when models are transported to new places to be used by new demographics. | Other | arXiv | |||||
2023/06/29 12:52 | Assessing eligibility for lung cancer screening: Parsimonious multi-country ensemble machine learning models for lung cancer prediction | T. Callender, F. Imrie, B. Cebere, N. Pashayan, N. Navani, M. van der Schaar, S. M. Janes | 2023 | https://www.medrxiv.org/content/medrxiv/early/2023/01/29/2023.01.27.23284974.full.pdf | Background Ensemble machine learning could support the development of highly parsimonious prediction models that maintain the performance of more complex models whilst maximising simplicity and generalisability, supporting the widespread adoption of personalised screening. In this work, we aimed to develop and validate ensemble machine learning models to determine eligibility for risk-based lung cancer screening. Methods For model development, we used data from 216,714 ever-smokers in the UK Biobank prospective cohort and 26,616 high-risk ever-smokers in the control arm of the US National Lung Screening randomised controlled trial. We externally validated our models amongst the 49,593 participants in the chest radiography arm and amongst all 80,659 ever-smoking participants in the US Prostate, Lung, Colorectal and Ovarian Screening Trial (PLCO). Models were developed to predict the risk of two outcomes within five years from baseline: diagnosis of lung cancer, and death from lung cancer. We assessed model discrimination (area under the receiver operating curve, AUC), calibration (calibration curves and expected/observed ratio), overall performance (Brier scores), and net benefit with decision curve analysis. Results Models predicting lung cancer death (UCL-D) and incidence (UCL-I) using three variables – age, smoking duration, and pack-years – achieved or exceeded parity in discrimination, overall performance, and net benefit with comparators currently in use, despite requiring only one-quarter of the predictors. In external validation in the PLCO trial, UCL-D had an AUC of 0.803 (95% CI: 0.783-0.824) and was well calibrated with an expected/observed (E/O) ratio of 1.05 (95% CI: 0.95-1.19). UCL-I had an AUC of 0.787 (95% CI: 0.771-0.802), an E/O ratio of 1.0 (0.92-1.07). The sensitivity of UCL-D was 85.5% and UCL-I was 83.9%, at 5-year risk thresholds of 0.68% and 1.17%, respectively 7.9% and 6.2% higher than the USPSTF-2021 criteria at the same specificity. Conclusions We present parsimonious ensemble machine learning models to predict the risk of lung cancer in ever-smokers, demonstrating a novel approach that could simplify the implementation of risk-based lung cancer screening in multiple settings. | Journal | medRxiv | |||||
2021/10/11 13:39 | The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation | A. J. Chan, I. Bica, A. Huyuk, D. Jarrett, M. van der Schaar | 2021 | https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/26e359e83860db1d11b6acca57d8ea88-Abstract-round2.html | The goal of understanding decision-making behaviours in clinical environments is of paramount importance if we are to bring the strengths of machine learning to ultimately improve patient outcomes. Mainstream development of algorithms is often geared towards optimal performance in tasks that do not necessarily translate well into the medical regime---due to several factors including the lack of public availability of realistic data, the intrinsically offline nature of the problem, as well as the complexity and variety of human behaviours. We therefore present a new benchmarking suite designed specifically for medical sequential decision modelling: the Medkit-Learn(ing) Environment, a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data. While providing a standardised way to compare algorithms in a realistic medical setting, we employ a generating process that disentangles the policy and environment dynamics to allow for a range of customisations, thus enabling systematic evaluation of algorithms’ robustness against specific challenges prevalent in healthcare. | Conference | NeurIPS | Quantitative epistemology (understanding decision-making), Privacy-preserving ML & synthetic data | Clinical practice | |||
2021/10/11 13:33 | Really Doing Great at Estimating CATE? A Critical Look at ML Benchmarking Practices in Treatment Effect Estimation | A. Curth, D. Svensson, J. Weatherall, M. van der Schaar | 2021 | https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/2a79ea27c279e471f4d180b08d62b00a-Abstract-round2.html | The machine learning (ML) toolbox for estimation of heterogeneous treatment effects from observational data is expanding rapidly, yet many of its algorithms have been evaluated only on a very limited set of semi-synthetic benchmark datasets. In this paper, we investigate current benchmarking practices for ML-based conditional average treatment effect (CATE) estimators, with special focus on empirical evaluation based on the popular semi-synthetic IHDP benchmark. We identify problems with current practice and highlight that semi-synthetic benchmark datasets, which (unlike real-world benchmarks used elsewhere in ML) do not necessarily reflect properties of real data, can systematically favor some algorithms over others -- a fact that is rarely acknowledged but of immense relevance for interpretation of empirical results. Further, we argue that current evaluation metrics evaluate performance only for a small subset of possible use cases of CATE estimators, and discuss alternative metrics relevant for applications in personalized medicine. Additionally, we discuss alternatives for current benchmark datasets, and implications of our findings for benchmarking in CATE estimation. | Conference | NeurIPS | Causal inference | Treatment & trials | |||
2021/09/30 23:55 | MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms | T. Kyono*, Y. Zhang*, A. Bellot, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/c80bcf42c220b8f5c41f85344242f1b0-Abstract.html | Missing data is an important problem in machine learning practice. Starting from the premise that imputation methods should preserve the causal structure of the data, we develop a regularization scheme that encourages any baseline imputation method to be causally consistent with the underlying data generating mechanism. Our proposal is a causally-aware imputation algorithm (MIRACLE). MIRACLE iteratively refines the imputation of a baseline by simultaneously modeling the missingness generating mechanism, encouraging imputation to be consistent with the causal structure of the data. We conduct extensive experiments on synthetic and a variety of publicly available datasets to show that MIRACLE is able to consistently improve imputation over a variety of benchmark methods across all three missingness scenarios: at random, completely at random, and not at random. | Conference | NeurIPS | Deep learning, Causal inference | Missing data imputation | |||
2021/09/30 23:53 | DECAF: Generating Fair Synthetic Data Using Causally-Aware Generative Networks | T. Kyono*, B. van Breugel*, J. Berrevoets, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/ba9fab001f67381e56e410575874d967-Abstract.html | Machine learning models have been criticized for reflecting unfair biases in the training data. Instead of solving for this by introducing fair learning algorithms directly, we focus on generating fair synthetic data, such that any downstream learner is fair. Generating fair synthetic data from unfair data - while remaining truthful to the underlying data-generating process (DGP) - is non-trivial. In this paper, we introduce DECAF: a GAN-based fair synthetic data generator for tabular data. With DECAF we embed the DGP explicitly as a structural causal model in the input layers of the generator, allowing each variable to be reconstructed conditioned on its causal parents. This procedure enables inference time debiasing, where biased edges can be strategically removed for satisfying user-defined fairness requirements. The DECAF framework is versatile and compatible with several popular definitions of fairness. In our experiments, we show that DECAF successfully removes undesired bias and - in contrast to existing methods - is capable of generating high-quality synthetic data. Furthermore, we provide theoretical guarantees on the generator's convergence and the fairness of downstream models. | Conference | NeurIPS | Deep learning, Causal inference, Privacy-preserving ML & synthetic data | Clinical practice | |||
2021/09/30 20:32 | Closing the loop in medical decision support by understanding clinical decision-making: A case study on organ transplantation | Y. Qin*, F. Imrie*, A. Hüyük, D. Jarrett, A. E. Gimson, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/c344336196d5ec19bd54fd14befdde87-Abstract.html | Significant effort has been placed on developing decision support tools to improve patient care. However, drivers of real-world clinical decisions in complex medical scenarios are not yet well-understood, resulting in substantial gaps between these tools and practical applications. In light of this, we highlight that more attention on understanding clinical decision-making is required both to elucidate current clinical practices and to enable effective human-machine interactions. This is imperative in high-stakes scenarios with scarce available resources. Using organ transplantation as a case study, we formalize the desiderata of methods for understanding clinical decision-making. We show that most existing machine learning methods are insufficient to meet these requirements and propose iTransplant, a novel data-driven framework to learn the factors affecting decisions on organ offers in an instance-wise fashion directly from clinical data, as a possible solution. Through experiments on real-world liver transplantation data from OPTN, we demonstrate the use of iTransplant to: (1) discover which criteria are most important to clinicians for organ offer acceptance; (2) identify patient-specific organ preferences of clinicians allowing automatic patient stratification; and (3) explore variations in transplantation practices between different transplant centers. Finally, we emphasize that the insights gained by iTransplant can be used to inform the development of future decision support tools. | Conference | NeurIPS | Quantitative epistemology (understanding decision-making) | Clinical practice | |||
2021/09/30 14:34 | On Inductive Biases for Heterogeneous Treatment Effect Estimation | A. Curth, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/8526e0962a844e4a2f158d831d5fddf7-Abstract.html | We investigate how to exploit structural similarities of an individual's potential outcomes (POs) under different treatments to obtain better estimates of conditional average treatment effects in finite samples. Especially when it is unknown whether a treatment has an effect at all, it is natural to hypothesize that the POs are similar -- yet, some existing strategies for treatment effect estimation employ regularization schemes that implicitly encourage heterogeneity even when it does not exist and fail to fully make use of shared structure. In this paper, we investigate and compare three end-to-end learning strategies to overcome this problem -- based on regularization, reparametrization and a flexible multi-task architecture -- each encoding inductive bias favoring shared behavior across POs. To build understanding of their relative strengths, we implement all strategies using neural networks and conduct a wide range of semi-synthetic experiments. We observe that all three approaches can lead to substantial improvements upon numerous baselines and gain insight into performance differences across various experimental settings. | Conference | NeurIPS | Deep learning, Causal inference | Treatment & trials | |||
2021/09/30 14:32 | SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data | A. Curth*, C. Lee*, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/e0eacd983971634327ae1819ea8b6214-Abstract.html | We study the problem of inferring heterogeneous treatment effects from time-to-event data. While both the related problems of (i) estimating treatment effects for binary or continuous outcomes and (ii) predicting survival outcomes have been well studied in the recent machine learning literature, their combination -- albeit of high practical relevance -- has received considerably less attention. With the ultimate goal of reliably estimating the effects of treatments on instantaneous risk and survival probabilities, we focus on the problem of learning (discrete-time) treatment-specific conditional hazard functions. We find that unique challenges arise in this context due to a variety of covariate shift issues that go beyond a mere combination of well-studied confounding and censoring biases. We theoretically analyse their effects by adapting recent generalization bounds from domain adaptation and treatment effect estimation to our setting and discuss implications for model design. We use the resulting insights to propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations. We investigate performance across a range of experimental settings and empirically confirm that our method outperforms baselines by addressing covariate shifts from various sources. | Conference | NeurIPS | Deep learning, Causal inference, Survival analysis competing risks & comorbidities | Treatment & trials | |||
2021/09/30 13:53 | Explaining Latent Representations with a Corpus of Examples | J. Crabbé, Z. Qian, F. Imrie, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/65658fde58ab3c2b6e5132a39fae7cb9-Abstract.html | Modern machine learning models are complicated. Most of them rely on convoluted latent representations of their input to issue a prediction. To achieve greater transparency than a black-box that connects inputs to predictions, it is necessary to gain a deeper understanding of these latent representations. To that aim, we propose SimplEx: a user-centred method that provides example-based explanations with reference to a freely selected set of examples, called the corpus. SimplEx uses the corpus to improve the user’s understanding of the latent space with post-hoc explanations answering two questions: (1) Which corpus examples explain the prediction issued for a given test example? (2) What features of these corpus examples are relevant for the model to relate them to the test example? SimplEx provides an answer by reconstructing the test latent representation as a mixture of corpus latent representations. Further, we propose a novel approach, the integrated Jacobian, that allows SimplEx to make explicit the contribution of each corpus feature in the mixture. Through experiments on tasks ranging from mortality prediction to image classification, we demonstrate that these decompositions are robust and accurate. With illustrative use cases in medicine, we show that SimplEx empowers the user by highlighting relevant patterns in the corpus that explain model representations. Moreover, we demonstrate how the freedom in choosing the corpus allows the user to have personalized explanations in terms of examples that are meaningful for them. | Conference | NeurIPS | Deep learning, Interpretability & explainability | ||||
2021/09/30 13:35 | Estimating Multi-cause Treatment Effects via Single-cause Perturbation | Z. Qian, A. Curth, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/c793b3be8f18731f2a4c627fb3c6c63d-Abstract.html | Most existing methods for conditional average treatment effect estimation are designed to estimate the effect of a single cause -- only one variable can be intervened on at one time. However, many applications involve simultaneous intervention on multiple variables, which leads to multi-cause treatment effect problems. The multi-cause problem is challenging due to severe data scarcity -- we only observe the outcome corresponding to the treatment that was actually given but need to infer a large number of potential outcomes under different combinations of the causes. In this work, we propose Single-cause Perturbation (SCP), a novel two-step procedure to estimate the multi-cause treatment effect. SCP starts by augmenting the observational dataset with the estimated potential outcomes under single-cause interventions. It then performs covariate adjustment on the augmented dataset to obtain the estimator. SCP is agnostic to the exact choice of algorithm in either step. We show formally that the procedure is valid under standard assumptions in causal inference. We demonstrate the performance gain of SCP on extensive simulation and real data experiments. | Conference | NeurIPS | Deep learning, Causal inference | Risk & prognosis, Treatment & trials | |||
2021/09/30 13:32 | SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes | Z. Qian, Y. Zhang, I. Bica, A. M. Wood, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/19485224d128528da1602ca47383f078-Abstract.html | Most of the medical observational studies estimate the causal treatment effects using electronic health records (EHR), where a patient's covariates and outcomes are both observed longitudinally. However, previous methods focus only on adjusting for the covariates while neglecting the temporal structure in the outcomes. To bridge the gap, this paper develops a new method, SyncTwin, that learns a patient-specific time-constant representation from the pre-treatment observations. SyncTwin issues counterfactual prediction of a target patient by constructing a synthetic twin that closely matches the target in representation. The reliability of the estimated treatment effect can be assessed by comparing the observed and synthetic pre-treatment outcomes. The medical experts can interpret the estimate by examining the most important contributing individuals to the synthetic twin. In the real-data experiment, SyncTwin successfully reproduced the findings of a randomized controlled clinical trial using observational data, which demonstrates its usability in the complex real-world EHR. | Conference | NeurIPS | Deep learning, Causal inference, Time series analysis, Interpretability & explainability | Risk & prognosis, Treatment & trials | |||
2021/09/30 13:30 | Integrating Expert ODEs into Neural ODEs: Pharmacology and Disease Progression | Z. Qian, W. R. Zame, L. M. Fleuren, P. Elbers, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/5ea1649a31336092c05438df996a3e59-Abstract.html | Modeling a system's temporal behaviour in reaction to external stimuli is a fundamental problem in many areas. Pure Machine Learning (ML) approaches often fail in the small sample regime and cannot provide actionable insights beyond predictions. A promising modification has been to incorporate expert domain knowledge into ML models. The application we consider is predicting the progression of disease under medications, where a plethora of domain knowledge is available from pharmacology. Pharmacological models describe the dynamics of carefully-chosen medically meaningful variables in terms of systems of Ordinary Differential Equations (ODEs). However, these models only describe a limited collection of variables, and these variables are often not observable in clinical environments. To close this gap, we propose the latent hybridisation model (LHM) that integrates a system of expert-designed ODEs with machine-learned Neural ODEs to fully describe the dynamics of the system and to link the expert and latent variables to observable quantities. We evaluated LHM on synthetic data as well as real-world intensive care data of COVID-19 patients. LHM consistently outperforms previous works, especially when few training samples are available such as at the beginning of the pandemic. | Conference | NeurIPS | Time series analysis, Interpretability & explainability | Risk & prognosis, Risk & disease trajectories, Treatment & trials, Scientific discovery | |||
2021/09/30 12:52 | Time-series Generation by Contrastive Imitation | D. Jarrett, I. Bica, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/f2b4053221961416d47d497814a8064f-Abstract.html | Consider learning a generative model for time-series data. The sequential setting poses a unique challenge: Not only should the generator capture the *conditional* dynamics of (stepwise) transitions, but its open-loop rollouts should also preserve the *joint* distribution of (multi-step) trajectories. On one hand, autoregressive models trained by MLE allow learning and computing explicit transition distributions, but suffer from compounding error during rollouts. On the other hand, adversarial models based on GAN training alleviate such exposure bias, but transitions are implicit and hard to assess. In this work, we propose a novel framework marrying the best of both worlds: Motivated by a precise moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) *transition policy*, where the reinforcement signal is provided by a global (but stepwise-decomposable) *energy model* trained by contrastive estimation. In learning, the two components are trained cooperatively, avoiding the instabilities typical of adversarial objectives. Moreover, while the learned policy serves as the generator for sampling, the learned energy naturally serves as a trajectory-level measure for evaluating sample quality. By expressly training a policy to imitate sequential behavior of time-series features in a dataset, our approach embodies "generation by imitation". Theoretically, we demonstrate the correctness of our formulation and consistency of our algorithm. Empirically, we evaluate its ability to generate realistic samples using real-world datasets, and verify that it performs at or above the standard of existing benchmarks. | Conference | NeurIPS | Deep learning, Time series analysis, Privacy-preserving ML & synthetic data | ||||
2021/09/30 12:51 | Invariant Causal Imitation Learning for Generalizable Policies | I. Bica, D. Jarrett, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/204904e461002b28511d5880e1c36a0f-Abstract.html | Consider learning an imitation policy on the basis of demonstrated behavior from multiple environments, with an eye towards deployment in an unseen environment. Since the observable features from each setting may be different, directly learning individual policies as mappings from features to actions is prone to spurious correlations---and may not generalize well. However, the expert’s policy is often a function of a shared latent structure underlying those observable features that is invariant across settings. By leveraging data from multiple environments, we propose Invariant Causal Imitation Learning (ICIL), a novel technique in which we learn a feature representation that is invariant across domains, on the basis of which we learn an imitation policy that matches expert behavior. To cope with transition dynamics mismatch, ICIL learns a shared representation of causal features (for all training environments), that is disentangled from the specific representations of noise variables (for each of those environments). Moreover, to ensure that the learned policy matches the observation distribution of the expert's policy, ICIL estimates the energy of the expert's observations and uses a regularization term that minimizes the imitator policy's next state energy. Experimentally, we compare our methods against several benchmarks in control and healthcare tasks and show its effectiveness in learning imitation policies capable of generalizing to unseen environments. | Conference | NeurIPS | Causal inference, Transfer learning | Clinical practice | |||
2021/09/30 12:41 | Conformal Time-Series Forecasting | K. Stankevičiūtė, A. M. Alaa, M. van der Schaar | 2021 | https://papers.nips.cc/paper/2021/hash/312f1ba2a72318edaaa995a67835fad5-Abstract.html | Current approaches for (multi-horizon) time-series forecasting using recurrent neural networks (RNNs) focus on issuing point estimates, which are insufficient for informing decision-making in critical application domains wherein uncertainty estimates are also required. Existing methods for uncertainty quantification in RNN-based time-series forecasts are limited as they may require significant alterations to the underlying architecture, may be computationally complex, may be difficult to calibrate, may incur high sample complexity, and may not provide theoretical validity guarantees for the issued uncertainty intervals. In this work, we extend the inductive conformal prediction framework to the time-series forecasting setup, and propose a lightweight uncertainty estimation procedure to address the above limitations. With minimal exchangeability assumptions, our approach provides uncertainty intervals with theoretical guarantees on frequentist coverage for any multi-horizon forecast predictor and dataset. We demonstrate the effectiveness of the conformal forecasting framework by comparing it with existing baselines on a variety of synthetic and real-world datasets. | Conference | NeurIPS | Deep learning, Time series analysis, Uncertainty estimation | Risk & disease trajectories |||
2021/09/03 14:20 | Data-Driven Online Recommender Systems with Costly Information Acquisition | O. Atan, S. Ghoorchian, S. Maghsudi, M. van der Schaar | 2021 | https://ieeexplore.ieee.org/document/9540300 | In numerous recommender systems, collecting useful information from users is costly, implying that the recommender system has to make active choices by simultaneously learning the observations of the features’ states to make useful recommendations to users among available products and services. This paper integrates information acquisition decisions into the recommender system. To solve the aforementioned dual learning problem, we propose two different algorithms, namely Sim-OOS and Seq-OOS, where observations are made simultaneously and sequentially, respectively. We prove that both algorithms guarantee a sublinear regret. The developed recommender system can be applied to a variety of real-world applications, including medical informatics, smart transportation, finance, and cyber-security where collecting information before making decisions results in an excessive cost. We validate and evaluate our proposed policies in a medical decision-support system that recommends tests and treatments for breast cancer patients. | Journal | IEEE Transactions on Services Computing | Reinforcement learning ||||
2021/08/30 08:34 | Dynamic Stochastic Demand Response With Energy Storage | Y. Xiao, M. van der Schaar | 2021 | https://ieeexplore.ieee.org/document/9523526 | We consider a power system with an independent system operator (ISO), and distributed aggregators who have energy storage and purchase energy from the ISO to serve their customers. All the entities in the system are foresighted: each aggregator seeks to minimize its own long-term payments for energy purchase and operational costs of energy storage by deciding how much energy to buy from the ISO, and the ISO seeks to minimize the long-term total cost of the system (e.g. energy generation costs and the aggregators’ costs) by dispatching the energy production among the generators. The decision making of the foresighted entities is complicated for two reasons. First, the information is decentralized among the entities, namely each entity does not know the others’ states. Second, an aggregator’s current decision affects its future costs due to the coupling introduced by the energy storage. We propose a design framework in which the ISO provides each aggregator with a conjectured future price, and each aggregator distributively minimizes its own long-term cost based on its conjectured price as well as its locally-available information. We prove that the proposed framework can achieve the social optimum despite being decentralized and involving complex coupling. Simulation results show that the proposed foresighted demand side management achieves significant reduction in the total cost, compared to the optimal myopic demand side management (up to 60% reduction), and the foresighted demand side management based on the Lyapunov optimization framework (up to 30% reduction). | 10.1109/TSG.2021.3108096 | Journal | IEEE Transactions on Smart Grid | Reinforcement learning | |||
2021/08/27 14:29 | Prediction of Major Complications and Readmission After Lumbar Spinal Fusion: A Machine Learning–Driven Approach | A. A. Shah, S. K. Devana, C. Lee, A. Bugarin, E. L. Lord, A. N. Shamie, D. Y. Park, M. van der Schaar, N. F. SooHoo | 2021 | https://www.sciencedirect.com/science/article/abs/pii/S1878875021007750 | Given the significant cost and morbidity of patients undergoing lumbar fusion, accurate preoperative risk-stratification would be of great utility. We aim to develop a machine learning model for prediction of major complications and readmission after lumbar fusion. We also aim to identify the factors most important to performance of each tested model. We identified 38,788 adult patients who underwent lumbar fusion at any California hospital between 2015 and 2017. The primary outcome was major perioperative complication or readmission within 30 days. We build logistic regression and advanced machine learning models: XGBoost, AdaBoost, Gradient Boosting, and Random Forest. Discrimination and calibration were assessed using area under the receiver operating characteristic curve and Brier score, respectively. There were 4470 major complications (11.5%). The XGBoost algorithm demonstrates the highest discrimination of the machine learning models, outperforming regression. The variables most important to XGBoost performance include angina pectoris, metastatic cancer, teaching hospital status, history of concussion, comorbidity burden, and workers’ compensation insurance. Teaching hospital status and concussion history were not found to be important for regression. We report a machine learning algorithm for prediction of major complications and readmission after lumbar fusion that outperforms logistic regression. Notably, the predictors most important for XGBoost differed from those for regression. The superior performance of XGBoost may be due to the ability of advanced machine learning methods to capture relationships between variables that regression is unable to detect. This tool may identify and address potentially modifiable risk factors, helping risk-stratify patients and decrease complication rates. | 10.1016/j.wneu.2021.05.080 | Journal | World Neurosurgery | ||||
2021/08/22 23:41 | Machine learning-driven identification of novel patient factors for prediction of major complications after posterior cervical spinal fusion | A. Shah, S. Devana, C. Lee, A. Bugarin, E. Lord, A. Shamie, D. Park, M. van der Schaar, N. SooHoo | 2021 | https://link.springer.com/article/10.1007/s00586-021-06961-7 | Purpose: Posterior cervical fusion is associated with increased rates of complications and readmission when compared to anterior fusion. Machine learning (ML) models for risk stratification of patients undergoing posterior cervical fusion remain limited. We aim to develop a novel ensemble ML algorithm for prediction of major perioperative complications and readmission after posterior cervical fusion and identify factors important to model performance. Methods: This is a retrospective cohort study of adults who underwent posterior cervical fusion at non-federal California hospitals between 2015 and 2017. The primary outcome was readmission or major complication. We developed an ensemble model predicting complication risk using an automated ML framework. We compared performance with standard ML models and logistic regression (LR), ranking contribution of included variables to model performance. Results: Of the included 6822 patients, 18.8% suffered a major complication or readmission. The ensemble model demonstrated slightly superior predictive performance compared to LR and standard ML models. The most important features to performance include sex, malignancy, pneumonia, stroke, and teaching hospital status. Seven of the ten most important features for the ensemble model were markedly less important for LR. Conclusion: We report an ensemble ML model for prediction of major complications and readmission after posterior cervical fusion with a modest risk prediction advantage compared to LR and benchmark ML models. Notably, the features most important to the ensemble are markedly different from those for LR, suggesting that advanced ML methods may identify novel prognostic factors for adverse outcomes after posterior cervical fusion. | 10.1007/s00586-021-06961-7 | Journal | European Spine Journal | Risk & prognosis | |||
2021/08/04 09:31 | Exploiting Causal Structure for Robust Model Selection in Unsupervised Domain Adaptation | T. Kyono, M. van der Schaar | 2021 | https://ieeexplore.ieee.org/document/9503312/ | In many real-world settings, such as healthcare, machine learning models are trained and validated on one labeled domain and tested or deployed on another where feature distributions differ, i.e., there is covariate shift. When annotations are costly or prohibitive, an unsupervised domain adaptation (UDA) regime can be leveraged requiring only unlabeled samples in the target domain. Existing UDA methods are unable to factor in a model's predictive loss based on predictions in the target domain and therefore suboptimally leverage density ratios of only the input covariates in each domain. In this work we propose a model selection method for leveraging model predictions on a target domain without labels by exploiting the domain invariance of causal structure. We assume or learn a causal graph from the source domain, and select models that produce predicted distributions in the target domain that have the highest likelihood of fitting our causal graph. We thoroughly analyze our method under oracle knowledge using synthetic data. We then show on several real-world datasets, including several COVID-19 examples, that our method is able to improve on the state-of-the-art UDA algorithms for model selection. | 10.1109/TAI.2021.3101185 | Journal | IEEE Transactions on Artificial Intelligence | Causal inference, Transfer learning | |||
2021/07/20 23:04 | Concerns about the use of digoxin in acute coronary syndromes | R. Bugiardini, E. Cenko, J. Yoon, S. Kedev, C. Gale, Z. Vasiljevic, M. Bergami, D. Miličić, M. Zdravkovic, G. Krljanac, L. Badimon, O. Manfrini, M. van der Schaar | 2021 | https://academic.oup.com/ehjcvp/advance-article-abstract/doi/10.1093/ehjcvp/pvab055/6319496 | The use of digitalis has been plagued by controversy since its initial use. We aimed to determine the relationship between digoxin use and outcomes in hospitalized patients with acute coronary syndromes (ACSs) complicated by heart failure (HF) accounting for sex difference and prior heart diseases. Of the 25 187 patients presenting with acute HF (Killip class ≥2) in the International Survey of Acute Coronary Syndromes Archives (NCT04008173) registry, 4722 (18.7%) received digoxin on hospital admission. The main outcome measure was all-cause 30-day mortality. Estimates were evaluated by inverse probability of treatment weighting models. Women who received digoxin had a higher rate of death than women who did not receive it [33.8% vs. 29.2%; relative risk (RR) ratio: 1.24; 95% confidence interval (CI): 1.12–1.37]. Similar odds for mortality with digoxin were observed in men (28.5% vs. 24.9%; RR ratio: 1.20; 95% CI: 1.10–1.32). Comparable results were obtained in patients with no prior coronary heart disease (RR ratio: 1.26; 95% CI: 1.10–1.45 in women and RR ratio: 1.21; 95% CI: 1.06–1.39 in men) and those in sinus rhythm at admission (RR ratio: 1.34; 95% CI: 1.15–1.54 in women and RR ratio: 1.26; 95% CI: 1.10–1.45 in men). Digoxin therapy is associated with an increased risk of early death among women and men with ACS complicated by HF. This finding highlights the need for re-examination of digoxin use in the clinical setting of ACS. | 10.1093/ehjcvp/pvab055 | Journal | European Heart Journal - Cardiovascular Pharmacotherapy | ||||
2021/07/20 18:06 | Development of a Machine Learning Algorithm for Prediction of Complications and Unplanned Readmission following Reverse Total Shoulder Arthroplasty | S. Devana, A. Shah, C. Lee, V. Gudapati, A. Jensen, E. Cheung, C. Solorzano, M. van der Schaar, N. SooHoo | 2021 | https://journals.sagepub.com/doi/full/10.1177/24715492211038172 | The demand and incidence of anatomic total shoulder arthroplasty (aTSA) procedures are projected to increase substantially over the next decade. There is a paucity of accurate risk prediction models, which would be of great utility in minimizing morbidity and cost associated with major postoperative complications. Machine learning is a powerful predictive modeling tool and has become increasingly popular, especially in orthopedics. We aim to build an ML model for prediction of major complications and readmission following primary aTSA. | Journal | Journal of Shoulder and Elbow Arthroplasty | Risk & prognosis ||
2021/07/18 00:00 | Explaining Time Series Predictions With Dynamic Masks | J. Crabbé, M. van der Schaar | 2021 | http://proceedings.mlr.press/v139/crabbe21a.html | How can we explain the predictions of a machine learning model? When the data is structured as a multivariate time series, this question induces additional difficulties such as the necessity for the explanation to embody the time dependency and the large number of inputs. To address these challenges, we propose dynamic masks (Dynamask). This method produces instance-wise importance scores for each feature at each time step by fitting a perturbation mask to the input sequence. In order to incorporate the time dependency of the data, Dynamask studies the effects of dynamic perturbation operators. In order to tackle the large number of inputs, we propose a scheme to make the feature selection parsimonious (to select no more feature than necessary) and legible (a notion that we detail by making a parallel with information theory). With synthetic and real-world data, we demonstrate that the dynamic underpinning of Dynamask, together with its parsimony, offer a neat improvement in the identification of feature importance over time. The modularity of Dynamask makes it ideal as a plug-in to increase the transparency of a wide range of machine learning models in areas such as medicine and finance, where time series are abundant. | Conference | ICML | Feature selection, Interpretability & explainability, Time series analysis | ||||
2021/07/18 00:00 | Inverse Decision Modeling: Learning Interpretable Representations of Behavior | D. Jarrett, A. Hüyük, M. van der Schaar | 2021 | http://proceedings.mlr.press/v139/jarrett21a.html | Decision analysis deals with modeling and enhancing decision processes. A principal challenge in improving behavior is in obtaining a transparent *description* of existing behavior in the first place. In this paper, we develop an expressive, unifying perspective on *inverse decision modeling*: a framework for learning parameterized representations of sequential decision behavior. First, we formalize the *forward* problem (as a normative standard), subsuming common classes of control behavior. Second, we use this to formalize the *inverse* problem (as a descriptive model), generalizing existing work on imitation/reward learning—while opening up a much broader class of research problems in behavior representation. Finally, we instantiate this approach with an example (*inverse bounded rational control*), illustrating how this structure enables learning (interpretable) representations of (bounded) rationality—while naturally capturing intuitive notions of suboptimal actions, biased beliefs, and imperfect knowledge of environments. | Conference | ICML | Quantitative epistemology (understanding decision-making) | ||||
2021/07/18 00:00 | Learning Queueing Policies for Organ Transplantation Allocation using Interpretable Counterfactual Survival Analysis | J. Berrevoets, A. M. Alaa, Z. Qian, J. Jordon, A. Gimson, M. van der Schaar | 2021 | http://proceedings.mlr.press/v139/berrevoets21a.html | Organ transplantation is often the last resort for treating end-stage illnesses, but managing transplant wait-lists is challenging because of organ scarcity and the complexity of assessing donor-recipient compatibility. In this paper, we develop a data-driven model for (real-time) organ allocation using observational data for transplant outcomes. Our model integrates a queuing-theoretic framework with unsupervised learning to cluster the organs into “organ types”, and then construct priority queues (associated with each organ type) wherein incoming patients are assigned. To reason about organ allocations, the model uses synthetic controls to infer a patient’s survival outcomes under counterfactual allocations to the different organ types – the model is trained end-to-end to optimise the trade-off between patient waiting time and expected survival time. The usage of synthetic controls enables patient-level interpretations of allocation decisions that can be presented and understood by clinicians. We test our model on multiple data sets, and show that it outperforms other organ-allocation policies in terms of added life-years and death count. Furthermore, we introduce a novel organ-allocation simulator to accurately test new policies. | Conference | ICML | Causal inference, Interpretability & explainability, Survival analysis competing risks & comorbidities | Treatment & trials |||
2021/07/18 00:00 | Policy Analysis using Synthetic Controls in Continuous-Time | A. Bellot, M. van der Schaar | 2021 | http://proceedings.mlr.press/v139/bellot21a.html | Counterfactual estimation using synthetic controls is one of the most successful recent methodological developments in causal inference. Despite its popularity, the current description only considers time series aligned across units and synthetic controls expressed as linear combinations of observed control units. We propose a continuous-time alternative that models the latent counterfactual path explicitly using the formalism of controlled differential equations. This model is directly applicable to the general setting of irregularly-aligned multivariate time series and may be optimized in rich function spaces -- thereby improving on some limitations of existing approaches. | Conference | ICML | Causal inference | ||||
2021/07/16 23:50 | Doing Great at Estimating CATE? On the Neglected Assumptions in Benchmark Comparisons of Treatment Effect Estimators | A. Curth, M. van der Schaar | 2021 | https://arxiv.org/abs/2107.13346 | The machine learning toolbox for estimation of heterogeneous treatment effects from observational data is expanding rapidly, yet many of its algorithms have been evaluated only on a very limited set of semi-synthetic benchmark datasets. In this paper, we show that even in arguably the simplest setting -- estimation under ignorability assumptions -- the results of such empirical evaluations can be misleading if (i) the assumptions underlying the data-generating mechanisms in benchmark datasets and (ii) their interplay with baseline algorithms are inadequately discussed. We consider two popular machine learning benchmark datasets for evaluation of heterogeneous treatment effect estimators -- the IHDP and ACIC2016 datasets -- in detail. We identify problems with their current use and highlight that the inherent characteristics of the benchmark datasets favor some algorithms over others -- a fact that is rarely acknowledged but of immense relevance for interpretation of empirical results. We close by discussing implications and possible next steps. | Conference | ICML Workshop on the Neglected Assumptions in Causal Inference | Causal inference | ||||
2021/07/15 09:40 | Inverse Contextual Bandits: Learning How Behavior Evolves over Time | A. Hüyük, D. Jarrett, M. van der Schaar | 2021 | https://arxiv.org/abs/2107.06317 | Understanding an agent's priorities by observing their behavior is critical for transparency and accountability in decision processes, such as in healthcare. While conventional approaches to policy learning almost invariably assume stationarity in behavior, this is hardly true in practice: Medical practice is constantly evolving, and clinical professionals are constantly fine-tuning their priorities. We desire an approach to policy learning that provides (1) interpretable representations of decision-making, accounts for (2) non-stationarity in behavior, as well as operating in an (3) offline manner. First, we model the behavior of learning agents in terms of contextual bandits, and formalize the problem of inverse contextual bandits (ICB). Second, we propose two algorithms to tackle ICB, each making varying degrees of assumptions regarding the agent's learning strategy. Finally, through both real and simulated data for liver transplantations, we illustrate the applicability and explainability of our method, as well as validating its accuracy. | Other | Quantitative epistemology (understanding decision-making) | |||||
2021/07/14 18:01 | Machine learning to guide the use of adjuvant therapies for breast cancer | A. M. Alaa, D. Gurdasani, A. L. Harris, J. Rashbass, M. van der Schaar | 2021 | https://www.nature.com/articles/s42256-021-00353-8 | Accurate prediction of the individualized survival benefit of adjuvant therapy is key to making informed therapeutic decisions for patients with early invasive breast cancer. Machine learning technologies can enable accurate prognostication of patient outcomes under different treatment options by modelling complex interactions between risk factors in a data-driven fashion. Here, we use an automated and interpretable machine learning algorithm to develop a breast cancer prognostication and treatment benefit prediction model—Adjutorium—using data from large-scale cohorts of nearly one million women captured in the national cancer registries of the United Kingdom and the United States. We trained and internally validated the Adjutorium model on 395,862 patients from the UK National Cancer Registration and Analysis Service (NCRAS), and then externally validated the model among 571,635 patients from the US Surveillance, Epidemiology, and End Results (SEER) programme. Adjutorium exhibited significantly improved accuracy compared to the major prognostic tool in current clinical use (PREDICT v2.1) in both internal and external validation. Importantly, our model substantially improved accuracy in specific subgroups known to be under-served by existing models. Adjutorium is currently implemented as a web-based decision support tool (https://vanderschaar-lab.com/adjutorium/) to aid decisions on adjuvant therapy in women with early breast cancer, and can be publicly accessed by patients and clinicians worldwide. | 10.1038/s42256-021-00353-8 | Journal | Nature Machine Intelligence | Automated ML, Ensemble learning | Risk & prognosis | ||
2021/05/03 00:00 | Clairvoyance: A Pipeline Toolkit for Medical Time Series | D. Jarrett, J. Yoon, I. Bica, Z. Qian, A. Ercole, M. van der Schaar | 2021 | https://openreview.net/pdf?id=xnC8YwKUE3k | Time-series learning is the bread and butter of data-driven *clinical decision support*, and the recent explosion in ML research has demonstrated great potential in various healthcare settings. At the same time, medical time-series problems in the wild are challenging due to their highly *composite* nature: They entail design choices and interactions among components that preprocess data, impute missing values, select features, issue predictions, estimate uncertainty, and interpret models. Despite exponential growth in electronic patient data, there is a remarkable gap between the potential and realized utilization of ML for clinical research and decision support. In particular, orchestrating a real-world project lifecycle poses challenges in engineering (i.e. hard to build), evaluation (i.e. hard to assess), and efficiency (i.e. hard to optimize). Designed to address these issues simultaneously, Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a (i) software toolkit, (ii) empirical standard, and (iii) interface for optimization. Our ultimate goal lies in facilitating transparent and reproducible experimentation with complex inference workflows, providing integrated pathways for (1) personalized prediction, (2) treatment-effect estimation, and (3) information acquisition. Through illustrative examples on real-world data in outpatient, general wards, and intensive-care settings, we illustrate the applicability of the pipeline paradigm on core tasks in the healthcare journey. To the best of our knowledge, Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML. | Conference | ICLR | Automated ML, Time series analysis | Risk & prognosis | |||
2021/05/03 00:00 | Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning | A. Hüyük, D. Jarrett, C. Tekin, M. van der Schaar | 2021 | https://openreview.net/pdf?id=unI5ucw_Jk | Understanding human behavior from observed data is critical for transparency and accountability in decision-making. Consider real-world settings such as healthcare, in which modeling a decision-maker’s policy is challenging—with no access to underlying states, no knowledge of environment dynamics, and no allowance for live experimentation. We desire learning a data-driven representation of decision-making behavior that (1) inheres transparency by design, (2) accommodates partial observability, and (3) operates completely offline. To satisfy these key criteria, we propose a novel model-based Bayesian method for interpretable policy learning (“INTERPOLE”) that jointly estimates an agent’s (possibly biased) belief-update process together with their (possibly suboptimal) belief-action mapping. Through experiments on both simulated and real-world data for the problem of Alzheimer’s disease diagnosis, we illustrate the potential of our approach as an investigative device for auditing, quantifying, and understanding human decision-making behavior. | Conference | ICLR | Quantitative epistemology (understanding decision-making) ||||
2021/05/03 00:00 | Generative Time-series Modeling with Fourier Flows | A. M. Alaa, A. J. Chan, M. van der Schaar | 2021 | https://openreview.net/pdf?id=PpshD0AXfA | Generating synthetic time-series data is crucial in various application domains, such as medical prognosis, wherein research is hamstrung by the lack of access to data due to concerns over privacy. Most of the recently proposed methods for generating synthetic time-series rely on implicit likelihood modeling using generative adversarial networks (GANs)—but such models can be difficult to train, and may jeopardize privacy by “memorizing” temporal patterns in training data. In this paper, we propose an explicit likelihood model based on a novel class of normalizing flows that view time-series data in the frequency-domain rather than the time-domain. The proposed flow, dubbed a Fourier flow, uses a discrete Fourier transform (DFT) to convert variable-length time-series with arbitrary sampling periods into fixed-length spectral representations, then applies a (data-dependent) spectral filter to the frequency-transformed time-series. We show that, by virtue of the DFT analytic properties, the Jacobian determinants and inverse mapping for the Fourier flow can be computed efficiently in linearithmic time, without imposing explicit structural constraints as in existing flows such as NICE (Dinh et al. (2014)), RealNVP (Dinh et al. (2016)) and GLOW (Kingma & Dhariwal (2018)). Experiments show that Fourier flows perform competitively compared to state-of-the-art baselines. | Conference | ICLR | Privacy-preserving ML & synthetic data ||||
2021/05/03 00:00 | Learning "What-if" Explanations for Sequential Decision-Making | I. Bica, D. Jarrett, A. Hüyük, M. van der Schaar | 2021 | https://openreview.net/pdf?id=h0de3QWtGG | Building interpretable parameterizations of real-world decision-making on the basis of demonstrated behavior--i.e. trajectories of observations and actions made by an expert maximizing some unknown reward function--is essential for introspecting and auditing policies in different institutions. In this paper, we propose learning explanations of expert decisions by modeling their reward function in terms of preferences with respect to “what if” outcomes: Given the current history of observations, what would happen if we took a particular action? To learn these cost-benefit tradeoffs associated with the expert’s actions, we integrate counterfactual reasoning into batch inverse reinforcement learning. This offers a principled way of defining reward functions and explaining expert behavior, and also satisfies the constraints of real-world decision-making---where active experimentation is often impossible (e.g. in healthcare). Additionally, by estimating the effects of different actions, counterfactuals readily tackle the off-policy nature of policy evaluation in the batch setting, and can naturally accommodate settings where the expert policies depend on histories of observations rather than just current states. Through illustrative experiments in both real and simulated medical environments, we highlight the effectiveness of our batch, counterfactual inverse reinforcement learning approach in recovering accurate and interpretable descriptions of behavior. | Conference | ICLR | Quantitative epistemology (understanding decision-making) ||||
2021/05/03 00:00 | Scalable Bayesian Inverse Reinforcement Learning | A. J. Chan, M. van der Schaar | 2021 | https://openreview.net/pdf?id=4qR3coiNaIv | Bayesian inference over the reward presents an ideal solution to the ill-posed nature of the inverse reinforcement learning problem. Unfortunately current methods generally do not scale well beyond the small tabular setting due to the need for an inner-loop MDP solver, and even non-Bayesian methods that do themselves scale often require extensive interaction with the environment to perform well, being inappropriate for high stakes or costly applications such as healthcare. In this paper we introduce our method, Approximate Variational Reward Imitation Learning (AVRIL), that addresses both of these issues by jointly learning an approximate posterior distribution over the reward that scales to arbitrarily complicated state spaces alongside an appropriate policy in a completely offline manner through a variational approach to said latent reward. Applying our method to real medical data alongside classic control simulations, we demonstrate Bayesian reward inference in environments beyond the scope of current methods, as well as task performance competitive with focused offline imitation learning algorithms. | Conference | ICLR | Quantitative epistemology (understanding decision-making) | ||||
2021/04/13 00:00 | A Variational Information Bottleneck Approach to Multi-Omics Data Integration | C. Lee, M. van der Schaar | 2021 | http://proceedings.mlr.press/v130/lee21a.html | Integration of data from multiple omics techniques is becoming increasingly important in biomedical research. Due to non-uniformity and technical limitations in omics platforms, such integrative analyses on multiple omics, which we refer to as views, involve learning from incomplete observations with various view-missing patterns. This is challenging because i) complex interactions within and across observed views need to be properly addressed for optimal predictive power and ii) observations with various view-missing patterns need to be flexibly integrated. To address such challenges, we propose a deep variational information bottleneck (IB) approach for incomplete multi-view observations. Our method applies the IB framework on marginal and joint representations of the observed views to focus on intra-view and inter-view interactions that are relevant for the target. Most importantly, by modeling the joint representations as a product of marginal representations, we can efficiently learn from observed views with various view-missing patterns. Experiments on real-world datasets show that our method consistently achieves gain from data integration and outperforms state-of-the-art benchmarks. | Conference | AISTATS | Genomics | ||||
2021/04/13 00:00 | Learning Matching Representations for Individualized Organ Transplantation Allocation | C. Xu, A. M. Alaa, I. Bica, B. D. Ershoff, M. Cannesson, M. van der Schaar | 2021 | http://proceedings.mlr.press/v130/xu21e.html | Organ transplantation can improve life expectancy for recipients, but the probability of a successful transplant depends on the compatibility between donor and recipient features. Current medical practice relies on coarse rules for donor-recipient matching, but is short of domain knowledge regarding the complex factors underlying organ compatibility. In this paper, we formulate the problem of learning data-driven rules for donor-recipient matching using observational data for organ allocations and transplant outcomes. This problem departs from the standard supervised learning setup in that it involves matching two feature spaces (for donors and recipients), and requires estimating transplant outcomes under counterfactual matches not observed in the data. To address this problem, we propose a model based on representation learning to predict donor-recipient compatibility—our model learns representations that cluster donor features, and applies donor-invariant transformations to recipient features to predict transplant outcomes under a given donor-recipient feature instance. Experiments on several semi-synthetic and real-world datasets show that our model outperforms state-of-the-art allocation models and real-world policies executed by human experts. | Conference | AISTATS | Causal inference ||||
2021/04/13 00:00 | Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory to Learning Algorithms | A. Curth, M. van der Schaar | 2021 | http://proceedings.mlr.press/v130/curth21a.html | The need to evaluate treatment effectiveness is ubiquitous in most of empirical science, and interest in flexibly investigating effect heterogeneity is growing rapidly. To do so, a multitude of model-agnostic, nonparametric meta-learners have been proposed in recent years. Such learners decompose the treatment effect estimation problem into separate sub-problems, each solvable using standard supervised learning methods. Choosing between different meta-learners in a data-driven manner is difficult, as it requires access to counterfactual information. Therefore, with the ultimate goal of building better understanding of the conditions under which some learners can be expected to perform better than others a priori, we theoretically analyze four broad meta-learning strategies which rely on plug-in estimation and pseudo-outcome regression. We highlight how this theoretical reasoning can be used to guide principled algorithm design and translate our analyses into practice by considering a variety of neural network architectures as base-learners for the discussed meta-learning strategies. In a simulation study, we showcase the relative strengths of the learners under different data-generating processes. | Conference | AISTATS | Causal inference | ||||
2021/04/13 00:00 | SDF-Bayes: Cautious Optimism in Safe Dose-Finding Clinical Trials with Drug Combinations and Heterogeneous Patient Groups | H.-S. Lee, C. Shen, W. R. Zame, J.-W. Lee, M. van der Schaar | 2021 | http://proceedings.mlr.press/v130/lee21c.html | Phase I clinical trials are designed to test the safety (non-toxicity) of drugs and find the maximum tolerated dose (MTD). This task becomes significantly more challenging when multiple-drug dose-combinations (DC) are involved, due to the inherent conflict between the exponentially increasing DC candidates and the limited patient budget. This paper proposes a novel Bayesian design, SDF-Bayes, for finding the MTD for drug combinations in the presence of safety constraints. Rather than the conventional principle of escalating or de-escalating the current dose of one drug (perhaps alternating between drugs), SDF-Bayes proceeds by cautious optimism: it chooses the next DC that, on the basis of current information, is most likely to be the MTD (optimism), subject to the constraint that it only chooses DCs that have a high probability of being safe (caution). We also propose an extension, SDF-Bayes-AR, that accounts for patient heterogeneity and enables heterogeneous patient recruitment. Extensive experiments based on both synthetic and real-world datasets demonstrate the advantages of SDF-Bayes over state of the art DC trial designs in terms of accuracy and safety. | Conference | AISTATS | Next-generation clinical trials | ||||
2021/01/01 00:00 | A Kernel Two-Sample Test with Selection Bias | A. Bellot, M. van der Schaar | 2021 | https://www.auai.org/uai2021/pdf/uai2021.96.pdf | We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests. | Conference | Conference on Uncertainty in Artificial Intelligence (UAI) | |||||
2021/01/01 00:00 | A Novel, Potentially Universal Machine Learning Algorithm to Predict Complications in Total Knee Arthroplasty | S. Devana, A. Shah, C. Lee, A. Roney, N. SooHoo, M. van der Schaar | 2021 | https://www.sciencedirect.com/science/article/pii/S2352344121001175 | There remains a lack of accurate and validated outcome-prediction models in total knee arthroplasty (TKA). While machine learning (ML) is a powerful predictive tool, determining the proper algorithm to apply across diverse data sets is challenging. AutoPrognosis (AP) is a novel method that uses an automated ML framework to incorporate the best-performing stages of prognostic modeling into a single well-calibrated algorithm. We aimed to compare various ML methods to AP in predictive performance of complications after TKA. | 10.1016/j.artd.2021.06.020 | Journal | Arthroplasty Today ||||
2021/01/01 00:00 | Application of a novel machine learning framework for predicting non-metastatic prostate cancer-specific mortality in men using the Surveillance, Epidemiology, and End Results (SEER) database | C. Lee*, A. Light*, A. M. Alaa, D. Thurtle, M. van der Schaar, V. J. Gnanapragasam | 2021 | https://www.thelancet.com/journals/landig/article/PIIS2589-7500(20)30314-9/fulltext | Accurate prognostication is crucial in treatment decisions made for men diagnosed with non-metastatic prostate cancer. Current models rely on prespecified variables, which limits their performance. We aimed to investigate a novel machine learning approach to develop an improved prognostic model for predicting 10-year prostate cancer-specific mortality and compare its performance with existing validated models. We derived and tested a machine learning-based model using Survival Quilts, an algorithm that automatically selects and tunes ensembles of survival models using clinicopathological variables. Our study involved a US population-based cohort of 171 942 men diagnosed with non-metastatic prostate cancer between Jan 1, 2000, and Dec 31, 2016, from the prospectively maintained Surveillance, Epidemiology, and End Results (SEER) Program. The primary outcome was prediction of 10-year prostate cancer-specific mortality. Model discrimination was assessed using the concordance index (c-index), and calibration was assessed using Brier scores. The Survival Quilts model was compared with nine other prognostic models in clinical use, and decision curve analysis was done. 647 151 men with prostate cancer were enrolled into the SEER database, of whom 171 942 were included in this study. Discrimination improved with greater granularity, and multivariable models outperformed tier-based models. The Survival Quilts model showed good discrimination (c-index 0·829, 95% CI 0·820–0·838) for 10-year prostate cancer-specific mortality, which was similar to the top-ranked multivariable models: PREDICT Prostate (0·820, 0·811–0·829) and Memorial Sloan Kettering Cancer Center (MSKCC) nomogram (0·787, 0·776–0·798). All three multivariable models showed good calibration with low Brier scores (Survival Quilts 0·036, 95% CI 0·035–0·037; PREDICT Prostate 0·036, 0·035–0·037; MSKCC 0·037, 0·035–0·039). Of the tier-based systems, the Cancer of the Prostate Risk Assessment model (c-index 0·782, 95% CI 0·771–0·793) and Cambridge Prognostic Groups model (0·779, 0·767–0·791) showed higher discrimination for predicting 10-year prostate cancer-specific mortality. c-indices for models from the National Comprehensive Cancer Network, Genitourinary Radiation Oncologists of Canada, American Urological Association, European Association of Urology, and National Institute for Health and Care Excellence ranged from 0·711 (0·701–0·721) to 0·761 (0·750–0·772). Discrimination for the Survival Quilts model was maintained when stratified by age and ethnicity. Decision curve analysis showed an incremental net benefit from the Survival Quilts model compared with the MSKCC and PREDICT Prostate models currently used in practice. A novel machine learning-based approach produced a prognostic model, Survival Quilts, with discrimination for 10-year prostate cancer-specific mortality similar to the top-ranked prognostic models, using only standard clinicopathological variables. Future integration of additional data will likely improve model performance and accuracy for personalised prognostics. | 10.1016/S2589-7500(20)30314-9 | Journal | The Lancet Digital Health | Risk & prognosis |||
2021/01/01 00:00 | Application of Kernel Hypothesis Testing on Set-valued Data | A. Bellot, M. van der Schaar | 2021 | https://arxiv.org/abs/1907.04081 | We present a general framework for hypothesis testing on distributions of sets of individual examples. Sets may represent many common data sources such as groups of observations in time series, collections of words in text or a batch of images of a given phenomenon. This observation pattern, however, differs from the common assumptions required for hypothesis testing: each set differs in size, may have differing levels of noise, and also may incorporate nuisance variability, irrelevant for the analysis of the phenomenon of interest; all features that bias test decisions if not accounted for. In this paper, we propose to interpret sets as independent samples from a collection of latent probability distributions, and introduce kernel two-sample and independence tests in this latent space of distributions. We prove the consistency of tests and observe them to outperform in a wide range of synthetic experiments. Finally, we showcase their use in practice with experiments of healthcare and climate data, where previously heuristics were needed for feature extraction and testing. | Conference | Conference on Uncertainty in Artificial Intelligence (UAI) | |||||
2021/01/01 00:00 | Comparing COVID-19 risk factors in Brazil using machine learning: the importance of socioeconomic, demographic and structural factors | P. Baqui, V. Marra, A. M. Alaa, I. Bica, A. Ercole, M. van der Schaar | 2021 | https://www.nature.com/articles/s41598-021-95004-8 | The COVID-19 pandemic continues to have a devastating impact on Brazil. Brazil’s social, health and economic crises are aggravated by strong societal inequities and persisting political disarray. This complex scenario motivates careful study of the clinical, socioeconomic, demographic and structural factors contributing to increased risk of mortality from SARS-CoV-2 in Brazil specifically. We consider the Brazilian SIVEP-Gripe catalog, a very rich respiratory infection dataset which allows us to estimate the importance of several non-laboratorial and socio-geographic factors on COVID-19 mortality. We analyze the catalog using machine learning algorithms to account for likely complex interdependence between metrics. The XGBoost algorithm achieved excellent performance, producing an AUC-ROC of 0.813 (95%CI 0.810–0.817), and outperforming logistic regression. Using our model we found that, in Brazil, socioeconomic, geographical and structural factors are more important than individual comorbidities. Particularly important factors were: The state of residence and its development index; the distance to the hospital (especially for rural and less developed areas); the level of education; hospital funding model and strain. Ethnicity is also confirmed to be more important than comorbidities but less than the aforementioned factors. Socioeconomic and structural factors are as important as biological factors in determining the outcome of COVID-19. This has important consequences for policy making, especially on vaccination/non-pharmacological preventative measures, hospital management and healthcare network organization. | 10.1101/2021.03.11.21253380 | Journal | Nature Scientific Reports | ||||
2021/01/01 00:00 | Personalized Education in the Artificial Intelligence Era: What to Expect Next | S. Maghsudi, A. Lan, J. Xu, M. van der Schaar | 2021 | https://ieeexplore.ieee.org/document/9418572 | The objective of personalized learning is to design an effective knowledge acquisition track that matches the learner's strengths and bypasses his/her weaknesses to ultimately meet his/her desired goal. This concept emerged several years ago and is being adopted by a rapidly growing number of educational institutions around the globe. In recent years, the rise of artificial intelligence (AI) and machine learning (ML), together with advances in big data analysis, has introduced novel perspectives that enhance personalized education in numerous ways. By taking advantage of AI/ML methods, the educational platform precisely acquires the student's characteristics. This is done, in part, by observing past experiences as well as analyzing the available big data through exploring the learners' features and similarities. It can, for example, recommend the most appropriate content among numerous accessible ones, advise a well-designed long-term curriculum, and connect appropriate learners by suggestion, accurate performance evaluation, and so forth. Still, several aspects of AI-based personalized education remain unexplored. These include, among others, compensating for the adverse effects of the absence of peers, creating and maintaining motivations for learning, increasing the diversity, removing the biases induced by data and algorithms, and so on. In this article, while providing a brief review of state-of-the-art research, we investigate the challenges of AI/ML-based personalized education and discuss potential solutions. | 10.1109/MSP.2021.3055032 | Journal | IEEE Signal Processing Magazine | Personalized Education |
2021/01/01 00:00 | Selecting Treatment Effects Models for Domain Adaptation Using Causal Knowledge | T. Kyono, I. Bica, Z. Qian, M. van der Schaar | 2021 | https://arxiv.org/abs/2102.06271 | Selecting causal inference models for estimating individualized treatment effects (ITE) from observational data presents a unique challenge since the counterfactual outcomes are never observed. The problem is challenged further in the unsupervised domain adaptation (UDA) setting where we only have access to labeled samples in the source domain, but desire selecting a model that achieves good performance on a target domain for which only unlabeled samples are available. Existing techniques for UDA model selection are designed for the predictive setting. These methods examine discriminative density ratios between the input covariates in the source and target domain and do not factor in the model's predictions in the target domain. Because of this, two models with identical performance on the source domain would receive the same risk score by existing methods, but in reality, have significantly different performance in the test domain. We leverage the invariance of causal structures across domains to propose a novel model selection metric specifically designed for ITE methods under the UDA setting. In particular, we propose selecting models whose predictions of interventions' effects satisfy known causal structures in the target domain. Experimentally, our method selects ITE models that are more robust to covariate shifts on several healthcare datasets, including estimating the effect of ventilation in COVID-19 patients from different geographic locations. | Other | Causal inference | Treatment & trials | ||||
2021/01/01 00:00 | Sharing ICU Patient Data Responsibly Under the Society of Critical Care Medicine/European Society of Intensive Care Medicine Joint Data Science Collaboration: The Amsterdam University Medical Centers Database (AmsterdamUMCdb) Example | P. Thoral, ..., M. van der Schaar, ..., A. Ercole, P. Elbers | 2021 | https://journals.lww.com/ccmjournal/Fulltext/2021/06000/Sharing_ICU_Patient_Data_Responsibly_Under_the.16.aspx | OBJECTIVES: Critical care medicine is a natural environment for machine learning approaches to improve outcomes for critically ill patients as admissions to ICUs generate vast amounts of data. However, technical, legal, ethical, and privacy concerns have so far limited the critical care medicine community from making these data readily available. The Society of Critical Care Medicine and the European Society of Intensive Care Medicine have identified ICU patient data sharing as one of the priorities under their Joint Data Science Collaboration. To encourage ICUs worldwide to share their patient data responsibly, we now describe the development and release of Amsterdam University Medical Centers Database (AmsterdamUMCdb), the first freely available critical care database in full compliance with privacy laws from both the United States and Europe, as an example of the feasibility of sharing complex critical care data. SETTING: University hospital ICU. SUBJECTS: Data from ICU patients admitted between 2003 and 2016. INTERVENTIONS: We used a risk-based deidentification strategy to maintain data utility while preserving privacy. In addition, we implemented contractual and governance processes, and a communication strategy. Patient organizations, supporting hospitals, and experts on ethics and privacy audited these processes and the database. MEASUREMENTS AND MAIN RESULTS: AmsterdamUMCdb contains approximately 1 billion clinical data points from 23,106 admissions of 20,109 patients. The privacy audit concluded that reidentification is not reasonably likely, and AmsterdamUMCdb can therefore be considered as anonymous information, both in the context of the U.S. Health Insurance Portability and Accountability Act and the European General Data Protection Regulation. The ethics audit concluded that responsible data sharing imposes minimal burden, whereas the potential benefit is tremendous. CONCLUSIONS: Technical, legal, ethical, and privacy challenges related to responsible data sharing can be addressed using a multidisciplinary approach. A risk-based deidentification strategy, that complies with both U.S. and European privacy regulations, should be the preferred approach to releasing ICU patient data. This supports the shared Society of Critical Care Medicine and European Society of Intensive Care Medicine vision to improve critical care outcomes through scientific inquiry of vast and combined ICU datasets. | 10.1097/CCM.0000000000004916 | Journal | Critical Care Medicine | ||||
2021/01/01 00:00 | Smoking and sex differences in first manifestation of cardiovascular disease | Z. Vasiljevic, M. Scarpone, M. Bergami, J. Yoon, M. van der Schaar, G. Krljanac, M. Asanin, G. Davidovic, S. Simovic, O. Manfrini, N. Mickovski Katalina, L. Badimon, E. Cenko, R. Bugiardini | 2021 | https://www.atherosclerosis-journal.com/article/S0021-9150(21)01194-1/fulltext | An increasing proportion of women believe that smoking few cigarettes daily substantially reduces their risk of developing cardiovascular (CV) related disorders. The effect of low intensity smoking is still largely understudied. We investigated the relation among sex, age, cigarette smoking and ST segment elevation myocardial infarction (STEMI) as initial manifestation of CV disease. We analyzed data of 50,713 acute coronary syndrome patients with no prior manifestation of CV disease from the ISACS-Archives (NCT04008173) registry. We compared the rates of STEMI in current smokers (n = 11,530) versus nonsmokers (n = 39,183). In the young middle age group (<60 years), there was evidence of a more harmful effect in women compared with men (RR ratios: 1.90; 95% CI: 1.69–2.14 versus 1.68; 95% CI: 1.56–1.80). This association persisted even in women who smoked 1 to 10 packs per year (RR ratios: 2.02; 95% CI: 1.65 to 2.48 versus 1.38; 95% CI: 1.22 to 1.57). In the older group, rates of STEMI were similar for women and men (RR ratios: 1.36; 95% CI: 1.22–1.53 versus 1.39; 95% CI: 1.28–1.50). STEMI was associated with a twofold higher 30-day mortality rate in young middle age women compared with men of the same age (odds ratios, 5.54; 95% CI, 3.83–8.03 vs. 2.93; 95% CI, 2.33–3.69). Low intensity smoking provides inadequate protection in young - middle age women as they still have a substantially higher rate of STEMI and related mortality compared with men even smoking less than 10 packs per year. This finding is worrying as more young - middle age women are smoking, and rates of smoking among young-middle age men continue to fall. | 10.1016/j.atherosclerosis.2021.06.909 | Journal | Atherosclerosis | ||||
2021/01/01 00:00 | Triage of 2D Mammographic Images Using Multi-view Multi-task Convolutional Neural Networks | T. Kyono, F. J. Gilbert, M. van der Schaar | 2021 | https://dl.acm.org/doi/10.1145/3453166 | With an aging and growing population, the number of women receiving mammograms is increasing. However, existing techniques for autonomous diagnosis do not surpass a well-trained radiologist. Therefore, to reduce the number of mammograms that require examination by a radiologist, subject to preserving the diagnostic accuracy observed in current clinical practice, we develop Man and Machine Mammography Oracle (MAMMO)—a clinical decision support system capable of determining whether its predicted diagnoses require further radiologist examination. We first introduce a novel multi-view convolutional neural network (CNN) trained using multi-task learning (MTL) to diagnose mammograms and predict the radiological assessments known to be associated with cancer. MTL improves diagnostic performance and triage efficiency while providing an additional layer of model interpretability. Furthermore, we introduce a novel triage network that takes as input the radiological assessment and diagnostic predictions of the multi-view CNN and determines whether the radiologist or CNN will most likely provide the correct diagnosis. Results obtained on a dataset of over 7,000 patients show that MAMMO reduced the number of diagnostic mammograms requiring radiologist reading by 42.8% while improving the overall diagnostic accuracy in comparison to readings done by radiologists alone. | 10.1145/3453166 | Journal | ACM Transactions on Computing for Healthcare | Deep learning | Medical imaging, Screening | ||
2020/12/06 00:00 | Hide-and-Seek Privacy Challenge: Synthetic Data Generation vs. Patient Re-identification | J. Jordon, D. Jarrett, E. Saveliev, J. Yoon, P. Elbers, P. Thoral, A. Ercole, C. Zhang, D. Belgrave, M. van der Schaar | 2021 | http://proceedings.mlr.press/v133/jordon21a.html | The clinical time-series setting poses a unique combination of challenges to data modelling and sharing. Due to the high dimensionality of clinical time series, adequate de-identification to preserve privacy while retaining data utility is difficult to achieve using common de-identification techniques. An innovative approach to this problem is synthetic data generation. From a technical perspective, a good generative model for time-series data should preserve temporal dynamics; new sequences should respect the original relationships between high-dimensional variables across time. From the privacy perspective, the model should prevent patient re-identification. The NeurIPS 2020 Hide-and-Seek Privacy Challenge was a novel two-tracked competition to simultaneously accelerate progress in tackling both problems. In our head-to-head format, participants in the generation track (hiders) and the patient re-identification track (seekers) were directly pitted against each other by way of a new, high-quality intensive care time-series dataset: the AmsterdamUMCdb dataset. In this paper we present an overview of the competition design, as well as highlighting areas we feel should be changed for future iterations of this competition. | Conference | NeurIPS 2020 Competition and Demonstration Track | Privacy-preserving ML & synthetic data | ||||
2020/12/06 00:00 | CASTLE: Regularization via Auxiliary Causal Graph Discovery | T. Kyono, Y. Zhang, M. van der Schaar | 2020 | https://papers.nips.cc/paper/2020/hash/1068bceb19323fe72b2b344ccf85c254-Abstract.html | Regularization improves generalization of supervised models to out-of-sample data. Prior works have shown that prediction in the causal direction (effect from cause) results in lower testing error than the anti-causal direction. However, existing regularization methods are agnostic of causality. We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables. CASTLE learns the causal directed acyclical graph (DAG) as an adjacency matrix embedded in the neural network's input layers, thereby facilitating the discovery of optimal predictors. Furthermore, CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features. We provide a theoretical generalization bound for our approach and conduct experiments on a plethora of synthetic and real publicly available datasets demonstrating that CASTLE consistently leads to better out-of-sample predictions as compared to other popular benchmark regularizers. | Conference | NeurIPS | Causal inference, Deep learning | ||||
2020/12/06 00:00 | Estimating the Effects of Continuous-valued Interventions using Generative Adversarial Networks | I. Bica, J. Jordon, M. van der Schaar | 2020 | https://papers.nips.cc/paper/2020/hash/bea5955b308361a1b07bc55042e25e54-Abstract.html | While much attention has been given to the problem of estimating the effect of discrete interventions from observational data, relatively little work has been done in the setting of continuous-valued interventions, such as treatments associated with a dosage parameter. In this paper, we tackle this problem by building on a modification of the generative adversarial networks (GANs) framework. Our model, SCIGAN, is flexible and capable of simultaneously estimating counterfactual outcomes for several different continuous interventions. The key idea is to use a significantly modified GAN model to learn to generate counterfactual outcomes, which can then be used to learn an inference model, using standard supervised methods, capable of estimating these counterfactuals for a new sample. To address the challenges presented by shifting to continuous interventions, we propose a novel architecture for our discriminator - we build a hierarchical discriminator that leverages the structure of the continuous intervention setting. Moreover, we provide theoretical results to support our use of the GAN framework and of the hierarchical discriminator. In the experiments section, we introduce a new semi-synthetic data simulation for use in the continuous intervention setting and demonstrate improvements over the existing benchmark models. | Conference | NeurIPS | Causal inference, Deep learning | Treatment & trials |
2020/12/06 00:00 | Gradient Regularized V-Learning for Dynamic Treatment Regimes | Y. Zhang, M. van der Schaar | 2020 | https://papers.nips.cc/paper/2020/hash/17b3c7061788dbe82de5abe9f6fe22b3-Abstract.html | Deciding how to optimally treat a patient, including how to select treatments over time among the multiple available treatments, represents one of the most important issues that need to be addressed in medicine today. A dynamic treatment regime (DTR) is a sequence of treatment rules indicating how to individualize treatments for a patient based on the previously assigned treatments and the evolving covariate history. However, DTR evaluation and learning based on offline data remain challenging problems due to the bias introduced by time-varying confounders that affect treatment assignment over time; this may lead to suboptimal treatment rules being used in practice. In this paper, we introduce Gradient Regularized V-learning (GRV), a novel method for estimating the value function of a DTR. GRV regularizes the underlying outcome and propensity score models with respect to the optimality condition in semiparametric estimation theory. On the basis of this design, we construct estimators that are efficient and stable in finite samples regime. Using multiple simulation studies and one real-world medical dataset, we demonstrate that our method is superior in DTR evaluation and learning, thereby providing improved treatment options over time for patients. | Conference | NeurIPS | Causal inference | Treatment & trials | |||
2020/12/06 00:00 | Learning outside the Black-Box: The pursuit of interpretable models | J. Crabbé, Y. Zhang, W. R. Zame, M. van der Schaar | 2020 | https://proceedings.neurips.cc/paper/2020/hash/ce758408f6ef98d7c7a7b786eca7b3a8-Abstract.html | Machine learning has proved its ability to produce accurate models -- but the deployment of these models outside the machine learning community has been hindered by the difficulties of interpreting these models. This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function. Our algorithm employs a variation of projection pursuit in which the ridge functions are chosen to be Meijer G-functions, rather than the usual polynomial splines. Because Meijer G-functions are differentiable in their parameters, we can "tune" the parameters of the representation by gradient descent; as a consequence, our algorithm is efficient. Using five familiar data sets from the UCI repository and two familiar machine learning algorithms, we demonstrate that our algorithm produces global interpretations that are both faithful (highly accurate) and parsimonious (involve a small number of terms). Our interpretations permit easy understanding of the relative importance of features and feature interactions. Our interpretation algorithm represents a leap forward from the previous state of the art. | Conference | NeurIPS | Interpretability & explainability | ||||
2020/12/06 00:00 | OrganITE: Optimal transplant donor organ offering using an individual treatment effect | J. Berrevoets, J. Jordon, I. Bica, A. Gimson, M. van der Schaar | 2020 | https://papers.nips.cc/paper/2020/hash/e7c573c14a09b84f6b7782ce3965f335-Abstract.html | Transplant-organs are a scarce medical resource. The uniqueness of each organ and the patients' heterogeneous responses to the organs present a unique and challenging machine learning problem. In this problem there are two key challenges: (i) assigning each organ "optimally" to a patient in the queue; (ii) accurately estimating the potential outcomes associated with each patient and each possible organ. In this paper, we introduce OrganITE, an organ-to-patient assignment methodology that assigns organs based not only on its own estimates of the potential outcomes but also on organ scarcity. By modelling and accounting for organ scarcity we significantly increase total life years across the population, compared to the existing greedy approaches that simply optimise life years for the current organ available. Moreover, we propose an individualised treatment effect model capable of addressing the high dimensionality of the organ space. We test our method on real and simulated data, resulting in as much as an additional year of life expectancy as compared to existing organ-to-patient policies. | Conference | NeurIPS | Causal inference | Treatment & trials | |||
2020/12/06 00:00 | Robust Recursive Partitioning for Heterogeneous Treatment Effects with Uncertainty Quantification | H.-S. Lee, Y. Zhang, W. R. Zame, C. Shen, J.-W. Lee, M. van der Schaar | 2020 | https://proceedings.neurips.cc//paper/2020/hash/1819020b02e926785cf3be594d957696-Abstract.html | Subgroup analysis of treatment effects plays an important role in applications from medicine to public policy to recommender systems. It allows physicians (for example) to identify groups of patients for whom a given drug or treatment is likely to be effective and groups of patients for which it is not. Most of the current methods of subgroup analysis begin with a particular algorithm for estimating individualized treatment effects (ITE) and identify subgroups by maximizing the difference across subgroups of the average treatment effect in each subgroup. These approaches have several weaknesses: they rely on a particular algorithm for estimating ITE, they ignore (in)homogeneity within identified subgroups, and they do not produce good confidence estimates. This paper develops a new method for subgroup analysis, R2P, that addresses all these weaknesses. R2P uses an arbitrary, exogenously prescribed algorithm for estimating ITE and quantifies the uncertainty of the ITE estimation, using a construction that is more robust than other methods. Experiments using synthetic and semi-synthetic datasets (based on real data) demonstrate that R2P constructs partitions that are simultaneously more homogeneous within groups and more heterogeneous across groups than the partitions produced by other methods. Moreover, because R2P can employ any ITE estimator, it also produces much narrower confidence intervals with a prescribed coverage guarantee than other methods. | Conference | NeurIPS | Next-generation clinical trials, Causal inference, Uncertainty estimation | Treatment & trials | |||
2020/12/06 00:00 | Strictly Batch Imitation Learning by Energy-based Distribution Matching | D. Jarrett, I. Bica, M. van der Schaar | 2020 | https://papers.nips.cc/paper/2020/hash/524f141e189d2a00968c3d48cadd4159-Abstract.html | Consider learning a policy purely on the basis of demonstrated behavior---that is, with no access to reinforcement signals, no knowledge of transition dynamics, and no further interaction with the environment. This strictly batch imitation learning problem arises wherever live experimentation is costly, such as in healthcare. One solution is simply to retrofit existing algorithms for apprenticeship learning to work in the offline setting. But such an approach leans heavily on off-policy evaluation or offline model estimation, and can be indirect and inefficient. We argue that a good solution should be able to explicitly parameterize a policy (i.e. respecting action conditionals), implicitly learn from rollout dynamics (i.e. leveraging state marginals), and---crucially---operate in an entirely offline fashion. To address this challenge, we propose a novel technique by energy-based distribution matching (EDM): By identifying parameterizations of the (discriminative) model of a policy with the (generative) energy function for state distributions, EDM yields a simple but effective solution that equivalently minimizes a divergence between the occupancy measure for the demonstrator and a model thereof for the imitator. Through experiments with application to control and healthcare settings, we illustrate consistent performance gains over existing algorithms for strictly batch imitation learning. | Conference | NeurIPS | Reinforcement learning, Quantitative epistemology (understanding decision-making) | ||||
2020/12/06 00:00 | VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain | J. Yoon, Y. Zhang, J. Jordon, M. van der Schaar | 2020 | https://papers.nips.cc/paper/2020/hash/7d97667a3e056acab9aaf653807b4a03-Abstract.html | Self- and semi-supervised learning frameworks have made significant progress in training machine learning models with limited labeled data in image and language domains. These methods heavily rely on the unique structure in the domain datasets (such as spatial relationships in images or semantic relationships in language). They are not adaptable to general tabular data which does not have the same explicit structure as image and language data. In this paper, we fill this gap by proposing novel self- and semi-supervised learning frameworks for tabular data, which we refer to collectively as VIME (Value Imputation and Mask Estimation). We create a novel pretext task of estimating mask vectors from corrupted tabular data in addition to the reconstruction pretext task for self-supervised learning. We also introduce a novel tabular data augmentation method for self- and semi-supervised learning frameworks. In experiments, we evaluate the proposed framework in multiple tabular datasets from various application domains, such as genomics and clinical data. VIME exceeds state-of-the-art performance in comparison to the existing baseline methods. | Conference | NeurIPS | Deep learning | Genomics | |||
2020/12/06 00:00 | When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes | Z. Qian, A. M. Alaa, M. van der Schaar | 2020 | https://papers.nips.cc/paper/2020/hash/79a3308b13cd31f096d8a4a34f96b66b-Abstract.html | The coronavirus disease 2019 (COVID-19) global pandemic has led many countries to impose unprecedented lockdown measures in order to slow down the outbreak. Questions on whether governments have acted promptly enough, and whether lockdown measures can be lifted soon have since been central in public discourse. Data-driven models that predict COVID-19 fatalities under different lockdown policy scenarios are essential for addressing these questions, and for informing governments on future policy directions. To this end, this paper develops a Bayesian model for predicting the effects of COVID-19 containment policies in a global context — we treat each country as a distinct data point, and exploit variations of policies across countries to learn country-specific policy effects. Our model utilizes a two-layer Gaussian process (GP) prior — the lower layer uses a compartmental SEIR (Susceptible, Exposed, Infected, Recovered) model as a prior mean function with “country-and-policy-specific” parameters that capture fatality curves under different “counterfactual” policies within each country, whereas the upper layer is shared across all countries, and learns lower-layer SEIR parameters as a function of country features and policy indicators. Our model combines the solid mechanistic foundations of SEIR models (Bayesian priors) with the flexible data-driven modeling and gradient-based optimization routines of machine learning (Bayesian posteriors) — i.e., the entire model is trained end-to-end via stochastic variational inference. We compare the projections of our model with other models listed by the Center for Disease Control (CDC), and provide scenario analyses for various lockdown and reopening strategies highlighting their impact on COVID-19 fatalities. | Conference | NeurIPS | Causal inference, Time series analysis | ||||
2020/08/26 00:00 | Contextual Constrained Learning for Dose-Finding Clinical Trials | H.-S. Lee, C. Shen, J. Jordon, M. van der Schaar | 2020 | http://proceedings.mlr.press/v108/lee20a.html | Clinical trials in the medical domain are constrained by budgets. The number of patients that can be recruited is therefore limited. When a patient population is heterogeneous, this creates difficulties in learning subgroup specific responses to a particular drug and especially for a variety of dosages. In addition, patient recruitment can be difficult by the fact that clinical trials do not aim to provide a benefit to any given patient in the trial. In this paper, we propose C3T-Budget, a contextual constrained clinical trial algorithm for dose-finding under both budget and safety constraints. The algorithm aims to maximize drug efficacy within the clinical trial while also learning about the drug being tested. C3T-Budget recruits patients with consideration of the remaining budget, the remaining time, and the characteristics of each group, such as the population distribution, estimated expected efficacy, and estimation credibility. In addition, the algorithm aims to avoid unsafe dosages. These characteristics are further illustrated in a simulated clinical trial study, which corroborates the theoretical analysis and demonstrates an efficient budget usage as well as a balanced learning-treatment trade-off. | Conference | AISTATS | Next-generation clinical trials, Reinforcement learning | Treatment & trials | |||
2020/08/26 00:00 | Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes | Z. Qian, A. M. Alaa, A. Bellot, J. Rashbass, M. van der Schaar | 2020 | http://proceedings.mlr.press/v108/qian20a.html | Comorbid diseases co-occur and progress via complex temporal patterns that vary among individuals. In electronic medical records, we only observe onsets of diseases, but not their triggering comorbidities — i.e., the mechanisms underlying temporal relations between diseases need to be inferred. Learning such temporal patterns from event data is crucial for understanding disease pathology and predicting prognoses. To this end, we develop deep diffusion processes (DDP) to model ‘dynamic comorbidity networks’, i.e., the temporal relationships between comorbid disease onsets expressed through a dynamic graph. A DDP comprises events modelled as a multi-dimensional point process, with an intensity function parameterized by the edges of a dynamic weighted graph. The graph structure is modulated by a neural network that maps patient history to edge weights, enabling rich temporal representations for disease trajectories. The DDP parameters decouple into clinically meaningful components, which enables serving the dual purpose of accurate risk prediction and intelligible representation of disease pathology. We illustrate these features in experiments using cancer registry data. | Conference | AISTATS | Causal inference, Interpretability & explainability, Survival analysis competing risks & comorbidities, Time series analysis | Risk & disease trajectories |
2020/08/26 00:00 | Learning Overlapping Representations for the Estimation of Individualized Treatment Effects | Y. Zhang, A. Bellot, M. van der Schaar | 2020 | http://proceedings.mlr.press/v108/zhang20c.html | The choice of making an intervention depends on its potential benefit or harm in comparison to alternatives. Estimating the likely outcome of alternatives from observational data is a challenging problem as all outcomes are never observed, and selection bias precludes the direct comparison of differently intervened groups. Despite their empirical success, we show that algorithms that learn domain-invariant representations of inputs (on which to make predictions) are often inappropriate, and develop generalization bounds that demonstrate the dependence on domain overlap and highlight the need for invertible latent maps. Based on these results, we develop a deep kernel regression algorithm and posterior regularization framework that substantially outperforms the state-of-the-art on a variety of benchmarks data sets. | Conference | AISTATS | Causal inference | Treatment & trials | |||
2020/08/26 00:00 | Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning | Y. Zhang, D. Jarrett, M. van der Schaar | 2020 | http://proceedings.mlr.press/v108/zhang20f.html | An essential problem in automated machine learning (AutoML) is that of model selection. A unique challenge in the sequential setting is the fact that the optimal model itself may vary over time, depending on the distribution of features and labels available up to each point in time. In this paper, we propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting. This is accomplished by treating the performance at each time step as its own black-box function. In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions using deep kernel learning (DKL). To the best of our knowledge, we are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose. Using multiple real-world datasets, we verify that our proposed method outperforms both standard BO and multi-objective BO algorithms on a variety of sequence prediction tasks. | Conference | AISTATS | Automated ML, Time series analysis | ||||
2020/07/12 00:00 | Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions | A. M. Alaa, M. van der Schaar | 2020 | http://proceedings.mlr.press/v119/alaa20a.html | Deep learning models achieve high predictive accuracy across a broad spectrum of tasks, but rigorously quantifying their predictive uncertainty remains challenging. Usable estimates of predictive uncertainty should (1) cover the true prediction targets with high probability, and (2) discriminate between high- and low confidence prediction instances. Existing methods for uncertainty quantification are based predominantly on Bayesian neural networks; these may fall short of (1) and (2) — i.e., Bayesian credible intervals do not guarantee frequentist coverage, and approximate posterior inference undermines discriminative accuracy. In this paper, we develop the discriminative jackknife (DJ), a frequentist procedure that utilizes influence functions of a model’s loss functional to construct a jackknife (or leave one-out) estimator of predictive confidence intervals. The DJ satisfies (1) and (2), is applicable to a wide range of deep learning models, is easy to implement, and can be applied in a post-hoc fashion without interfering with model training or compromising its accuracy. Experiments demonstrate that DJ performs competitively compared to existing Bayesian and non-Bayesian regression baselines. | Conference | ICML | Uncertainty estimation |
2020/07/12 00:00 | Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions | A. M. Alaa, M. van der Schaar | 2020 | http://proceedings.mlr.press/v119/alaa20b.html | Recurrent neural networks (RNNs) are instrumental in modelling sequential and time-series data. Yet, when using RNNs to inform decision-making, predictions by themselves are not sufficient — we also need estimates of predictive uncertainty. Existing approaches for uncertainty quantification in RNNs are based predominantly on Bayesian methods; these are computationally prohibitive, and require major alterations to the RNN architecture and training. Capitalizing on ideas from classical jackknife resampling, we develop a frequentist alternative that: (a) does not interfere with model training or compromise its accuracy, (b) applies to any RNN architecture, and (c) provides theoretical coverage guarantees on the estimated uncertainty intervals. Our method derives predictive uncertainty from the variability of the (jackknife) sampling distribution of the RNN outputs, which is estimated by repeatedly deleting “blocks” of (temporally-correlated) training data, and collecting the predictions of the RNN re-trained on the remaining data. To avoid exhaustive re-training, we utilize influence functions to estimate the effect of removing training data blocks on the learned RNN parameters. Using data from a critical care setting, we demonstrate the utility of uncertainty quantification in sequential decision-making. | Conference | ICML | Time series analysis, Uncertainty estimation |
2020/07/12 00:00 | Inverse Active Sensing: Modeling and Understanding Timely Decision-Making | D. Jarrett, M. van der Schaar | 2020 | http://proceedings.mlr.press/v119/jarrett20a.html | Evidence-based decision-making entails collecting (costly) observations about an underlying phenomenon of interest, and subsequently committing to an (informed) decision on the basis of accumulated evidence. In this setting, *active sensing* is the goal-oriented problem of efficiently selecting which acquisitions to make, and when and what decision to settle on. As its complement, *inverse active sensing* seeks to uncover an agent’s preferences and strategy given their observable decision-making behavior. In this paper, we develop an expressive, unified framework for the general setting of evidence-based decision-making under endogenous, context-dependent time pressure—which requires negotiating (subjective) tradeoffs between accuracy, speediness, and cost of information. Using this language, we demonstrate how it enables *modeling* intuitive notions of surprise, suspense, and optimality in decision strategies (the forward problem). Finally, we illustrate how this formulation enables *understanding* decision-making behavior by quantifying preferences implicit in observed decision strategies (the inverse problem). | Conference | ICML | Quantitative epistemology (understanding decision-making) | ||||
2020/07/12 00:00 | Learning for Dose Allocation in Adaptive Clinical Trials with Safety Constraints | C. Shen, S. Villar, Z. Wang, M. van der Schaar | 2020 | http://proceedings.mlr.press/v119/shen20d.html | Phase I dose-finding trials are increasingly challenging as the relationship between efficacy and toxicity of new compounds (or combination of them) becomes more complex. Despite this, most commonly used methods in practice focus on identifying a Maximum Tolerated Dose (MTD) by learning only from toxicity events. We present a novel adaptive clinical trial methodology, called Safe Efficacy Exploration Dose Allocation (SEEDA), that aims at maximizing the cumulative efficacies while satisfying the toxicity safety constraint with high probability. We evaluate performance objectives that have operational meanings in practical clinical trials, including cumulative efficacy, recommendation/allocation success probabilities, toxicity violation probability, and sample efficiency. An extended SEEDA-Plateau algorithm that is tailored for the increase-then-plateau efficacy behavior of molecularly targeted agents (MTA) is also presented. Through numerical experiments using both synthetic and real-world datasets, we show that SEEDA outperforms state-of-the-art clinical trial designs by finding the optimal dose with higher success rate and fewer patients. | Conference | ICML | Next-generation clinical trials | Treatment & trials | |||
2020/07/12 00:00 | Temporal Phenotyping using Deep Predictive Clustering of Disease Progression | C. Lee, M. van der Schaar | 2020 | http://proceedings.mlr.press/v119/lee20h.html | Due to the wider availability of modern electronic health records, patient care data is often being stored in the form of time-series. Clustering such time-series data is crucial for patient phenotyping, anticipating patients’ prognoses by identifying “similar” patients, and designing treatment guidelines that are tailored to homogeneous patient subgroups. In this paper, we develop a deep learning approach for clustering time-series data, where each cluster comprises patients who share similar future outcomes of interest (e.g., adverse events, the onset of comorbidities). To encourage each cluster to have homogeneous future outcomes, the clustering is carried out by learning discrete representations that best describe the future outcome distribution based on novel loss functions. Experiments on two real-world datasets show that our model achieves superior clustering performance over state-of-the-art benchmarks and identifies meaningful clusters that can be translated into actionable information for clinical decision-making. | Conference | ICML | Deep learning, Time series analysis, Survival analysis competing risks & comorbidities | Phenotyping & subgroup analysis, Risk & disease trajectories | |||
2020/07/12 00:00 | Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders | I. Bica, A. M. Alaa, M. van der Schaar | 2020 | http://proceedings.mlr.press/v119/bica20a.html | The estimation of treatment effects is a pervasive problem in medicine. Existing methods for estimating treatment effects from longitudinal observational data assume that there are no hidden confounders, an assumption that is not testable in practice and, if it does not hold, leads to biased estimates. In this paper, we develop the Time Series Deconfounder, a method that leverages the assignment of multiple treatments over time to enable the estimation of treatment effects in the presence of multi-cause hidden confounders. The Time Series Deconfounder uses a novel recurrent neural network architecture with multitask output to build a factor model over time and infer latent variables that render the assigned treatments conditionally independent; then, it performs causal inference using these latent variables that act as substitutes for the multi-cause unobserved confounders. We provide a theoretical analysis for obtaining unbiased causal effects of time-varying exposures using the Time Series Deconfounder. Using both simulated and real data we show the effectiveness of our method in deconfounding the estimation of treatment responses over time. | Conference | ICML | Causal inference, Deep learning, Time series analysis | Treatment & trials | |||
2020/07/12 00:00 | Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift | A. J. Chan, A. M. Alaa, Z. Qian, M. van der Schaar | 2020 | http://proceedings.mlr.press/v119/chan20a.html | Modern neural networks have proven to be powerful function approximators, providing state-of-the-art performance in a multitude of applications. They however fall short in their ability to quantify confidence in their predictions — this is crucial in high-stakes applications that involve critical decision-making. Bayesian neural networks (BNNs) aim at solving this problem by placing a prior distribution over the network’s parameters, thereby inducing a posterior distribution that encapsulates predictive uncertainty. While existing variants of BNNs based on Monte Carlo dropout produce reliable (albeit approximate) uncertainty estimates over in-distribution data, they tend to exhibit over-confidence in predictions made on target data whose feature distribution differs from the training data, i.e., the covariate shift setup. In this paper, we develop an approximate Bayesian inference scheme based on posterior regularisation, wherein unlabelled target data are used as “pseudo-labels” of model confidence that are used to regularise the model’s loss on labelled source data. We show that this approach significantly improves the accuracy of uncertainty quantification on covariate-shifted data sets, with minimal modification to the underlying model architecture. We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations. | Conference | ICML | Transfer learning, Uncertainty estimation | ||||
2020/07/02 14:09 | Ethnic and regional variations in hospital mortality from COVID-19 in Brazil: a cross-sectional observational study | P. Baqui, I. Bica, V. Marra, A. Ercole, M. van der Schaar | 2020 | https://www.thelancet.com/journals/langlo/article/PIIS2214-109X(20)30285-0/fulltext | Brazil ranks second worldwide in total number of COVID-19 cases and deaths. Understanding the possible socioeconomic and ethnic health inequities is particularly important given the diverse population and fragile political and economic situation. We aimed to characterise the COVID-19 pandemic in Brazil and assess variations in mortality according to region, ethnicity, comorbidities, and symptoms. | 10.1016/S2214-109X(20)30285-0 | Journal | The Lancet Global Health | ||||
2023/06/29 12:53 | TemporAI: Facilitating Machine Learning Innovation in Time Domain Tasks for Medicine | E. S. Saveliev, M. van der Schaar | 2023 | https://arxiv.org/pdf/2301.12260 | TemporAI is an open source Python software library for machine learning (ML) tasks involving data with a time component, focused on medicine and healthcare use cases. It supports data in time series, static, and event modalities and provides an interface for prediction, causal inference, and time-to-event analysis, as well as common preprocessing utilities and model interpretability methods. The library aims to facilitate innovation in the medical ML space by offering a standardized temporal setting toolkit for model development, prototyping and benchmarking, bridging the gaps in the ML research, healthcare professional, medical/pharmacological industry, and data science communities. TemporAI is available on GitHub (https://github.com/vanderschaarlab/temporai) and we welcome community engagement through use, feedback, and code contributions. | Other | arXiv | Time series analysis |
2023/06/29 12:54 | Joint Training of Deep Ensembles Fails Due to Learner Collusion | A. Jeffares, T. Liu, J. Crabbé, M. van der Schaar | 2023 | https://arxiv.org/pdf/2301.11323 | Ensembles of machine learning models have been well established as a powerful method of improving performance over a single model. Traditionally, ensembling algorithms train their base learners independently or sequentially with the goal of optimizing their joint performance. In the case of deep ensembles of neural networks, we are provided with the opportunity to directly optimize the true objective: the joint performance of the ensemble as a whole. Surprisingly, however, directly minimizing the loss of the ensemble appears to rarely be applied in practice. Instead, most previous research trains individual models independently with ensembling performed post hoc. In this work, we show that this is for good reason - joint optimization of ensemble loss results in degenerate behavior. We approach this problem by decomposing the ensemble objective into the strength of the base learners and the diversity between them. We discover that joint optimization results in a phenomenon in which base learners collude to artificially inflate their apparent diversity. This pseudo-diversity fails to generalize beyond the training data, causing a larger generalization gap. We proceed to demonstrate the practical implications of this effect finding that, in some cases, a balance between independent training and joint optimization can improve performance over the former while avoiding the degeneracies of the latter. | Other | arXiv | |||||
2023/06/29 12:56 | Synthetic data for privacy-preserving clinical risk prediction | Z. Qian, T. Callender, B. Cebere, S. M. Janes, N. Navani, M. van der Schaar | 2023 | https://www.medrxiv.org/content/10.1101/2023.05.18.23290114.full.pdf | Synthetic data promise privacy-preserving data sharing for healthcare research and development. Compared with other privacy-enhancing approaches - such as federated learning - analyses performed on synthetic data can be applied downstream without modification, such that synthetic data can act in place of real data for a wide range of use cases. However, the role that synthetic data might play in all aspects of clinical model development remains unknown. In this work, we used state-of-the-art generators explicitly designed for privacy preservation to create a synthetic version of the UK Biobank before building prognostic models for lung cancer under several data release assumptions. We demonstrate that synthetic data can be effectively used throughout the modelling pipeline even without eventual access to the real data. Furthermore, we show the implications of different data release approaches on how synthetic data could be deployed within the healthcare system. | Journal | medRxiv | Risk & prognosis | ||||
2023/08/10 17:06 | Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare | F. Imrie, R. Davis, M. van der Schaar | 2023 | https://www.nature.com/articles/s42256-023-00698-2 | Applications of machine learning are becoming increasingly common in medicine and healthcare, enabling more accurate predictive models. However, this often comes at the cost of interpretability, limiting the clinical impact of machine learning methods. To realize the potential of machine learning in healthcare, it is critical to understand such models from the perspective of multiple stakeholders and various angles, necessitating different types of explanation. In this Perspective, we explore five fundamentally different types of post-hoc machine learning interpretability. We highlight the different types of information that they provide, and describe when each can be useful. We examine the various stakeholders in healthcare, delving into their specific objectives, requirements and goals. We discuss how current notions of interpretability can help meet these and what is required for each stakeholder to make machine learning models clinically impactful. Finally, to facilitate adoption, we release an open-source interpretability library containing implementations of the different types of interpretability, including tools for visualizing and exploring the explanations. | https://doi.org/10.1038/s42256-023-00698-2 | Journal | Nature Machine Intelligence | Interpretability & explainability | |||
2023/09/08 15:49 | Novel Preoperative Risk Stratification Using Digital Phenotyping Applying a Scalable Machine Learning Approach | P. L. Langlois, F. Imrie, M. A. Geraldo, T. Wingert, N. Lahrichi, M. van der Schaar, M. Cannesson | 2023 | Introduction: Classification of perioperative risk is important for patient care, resource allocation, and guiding shared decision-making. Using discriminative features from the electronic health record (EHR), machine learning algorithms can create digital phenotypes among heterogenous populations, representing distinct patient subpopulations grouped by shared characteristics, from which we can personalize care, anticipate clinical care trajectories, and explore therapies. We hypothesized that digital phenotypes in pre-operative settings are associated with postoperative adverse events including in-hospital and 30-day mortality, 30-day surgical redo, intensive care unit (ICU) admission, and hospital length of stay (LOS). Methods: We identified all laminectomies, colectomies, and thoracic surgeries performed over a 9-year period from a large hospital system. Seventy-seven readily extractable preoperative features were first selected from clinical consensus, including demographics, medical history, and lab results. Three surgery-specific datasets were built and split into derivation and validation cohorts using chronological occurrence. Consensus k-means clustering was performed independently on each derivation cohort, from which phenotypes’ characteristics were explored. Cluster assignments were used to train a random forest model to assign patient phenotypes in validation cohorts. We reconducted descriptive analyses on validation cohorts to confirm the similarity of patient characteristics with derivation cohorts, and quantified the association of each phenotype with postoperative adverse events by using the area under receiver operating characteristic curve (AUROC). We compared our approach to ASA alone and investigated a combination of our phenotypes with the ASA score. Results: A total of 7,251 patients met inclusion criteria, of which 2,480 were held out in a validation dataset based on chronological occurrence. Using segmentation metrics and clinical consensus, three distinct phenotypes were created for each surgery. The main features used for segmentation included urgency of the procedure, pre-operative LOS, age, and comorbidities. The most relevant characteristics varied for each of the three surgeries. Low-risk phenotype alpha was the most common (2039/2480, 82%) while high-risk phenotype gamma was the rarest (302/2480, 12%). Adverse outcomes progressively increased from phenotypes alpha to gamma, including 30-day mortality (0.3%, 2.1% and 6.0%, respectively), in-hospital mortality (0.2%, 2.3% and 7.3%) and prolonged hospital LOS (3.4%, 22.1% and 25.8%). When combined with ASA score, digital phenotypes achieved higher AUROC than ASA score alone (hospital mortality: 0.91 vs. 0.84; prolonged hospitalization: 0.80 vs 0.71). Conclusion: For three frequently performed surgeries, we identified three digital phenotypes. The typical profiles of each phenotype were described and could be used to anticipate adverse postoperative events. | Journal | Anesthesia & Analgesia | Phenotyping & subgroup analysis |
2021/12/01 14:02 | Inferring Lexicographically-Ordered Rewards from Preferences | A. Hüyük, W. R. Zame, M. van der Schaar | 2022 | Modeling the preferences of agents over a set of alternatives is a principal concern in many areas. The dominant approach has been to find a single reward/utility function with the property that alternatives yielding higher rewards are preferred over alternatives yielding lower rewards. However, in many settings, preferences are based on multiple—often competing—objectives; a single reward function is not adequate to represent such preferences. This paper proposes a method for inferring multi-objective reward-based representations of an agent's observed preferences. We model the agent's priorities over different objectives as entering lexicographically, so that objectives with lower priorities matter only when the agent is indifferent with respect to objectives with higher priorities. We offer two example applications in healthcare—one inspired by cancer treatment, the other inspired by organ transplantation—to illustrate how the lexicographically-ordered rewards we learn can provide a better understanding of a decision-maker's preferences and help improve policies when used in reinforcement learning. | Conference | AAAI | Quantitative epistemology (understanding decision-making) | |||||
2022/01/19 14:46 | Identifiable Energy-based Representations: An Application to Estimating Heterogeneous Causal Effects | Y. Zhang*, J. Berrevoets*, M. van der Schaar | 2022 | https://arxiv.org/pdf/2108.03039 | Conditional average treatment effects (CATEs) allow us to understand the effect heterogeneity across a large population of individuals. However, typical CATE learners assume all confounding variables are measured in order for the CATE to be identifiable. This requirement can be satisfied by collecting many variables, at the expense of increased sample complexity for estimating CATEs. To combat this, we propose an energy-based model (EBM) that learns a low-dimensional representation of the variables by employing a noise contrastive loss function. With our EBM we introduce a preprocessing step that alleviates the dimensionality curse for any existing learner developed for estimating CATEs. We prove that our EBM keeps the representations partially identifiable up to some universal constant, as well as having universal approximation capability. These properties enable the representations to converge and keep the CATE estimates consistent. Experiments demonstrate the convergence of the representations, as well as show that estimating CATEs on our representations performs better than on the variables or the representations obtained through other dimensionality reduction methods. | Conference | AISTATS | Deep learning, Causal inference | Treatment & trials | |||
2022/01/24 01:00 | Self-Supervision Enhanced Feature Selection with Correlated Gates | C. Lee*, F. Imrie*, M. van der Schaar | 2022 | https://openreview.net/pdf?id=oDFvtxzPOx | Discovering relevant input features for predicting a target variable is a key scientific question. However, in many domains, such as medicine and biology, feature selection is confounded by a scarcity of labeled samples coupled with significant correlations among features. In this paper, we propose a novel deep learning approach to feature selection that addresses both challenges simultaneously. First, we pre-train the network using unlabeled samples within a self-supervised learning framework via solving pretext tasks that require the network to learn informative representations from partial feature sets. Then, we fine-tune the pre-trained network to discover relevant features using labeled samples. During both training phases, we explicitly account for the correlation structure of the input features by generating correlated gate vectors from a multivariate Bernoulli distribution. Experiments on multiple real-world datasets including clinical and omics demonstrate that our model discovers relevant features that provide superior prediction performance compared to the state-of-the-art benchmarks, especially in practical scenarios where there is often limited labeled data and high correlations among features. | Conference | ICLR | Deep learning, Feature selection | Genomics | |||
2022/01/21 17:26 | Neural graphical modelling in continuous-time: consistency guarantees and algorithms | A. Bellot, K. Branson, M. van der Schaar | 2022 | The discovery of structure from time series data is a key problem in fields of study working with complex systems. Most identifiability results and learning algorithms assume the underlying dynamics to be discrete in time. Comparatively few, in contrast, explicitly define dependencies in infinitesimal intervals of time, independently of the scale of observation and of the regularity of sampling. In this paper, we consider score-based structure learning for the study of dynamical systems. We prove that for vector fields parameterized in a large class of neural networks, least squares optimization with adaptive regularization schemes consistently recovers directed graphs of local independencies in systems of stochastic differential equations. Using this insight, we propose a score-based learning algorithm based on penalized Neural Ordinary Differential Equations (modelling the mean process) that we show to be applicable to the general setting of irregularly-sampled multivariate time series and to outperform the state of the art across a range of dynamical systems. | Conference | ICLR | Causal inference, Time series analysis | Risk & disease trajectories, Scientific discovery | ||||
2022/01/21 11:27 | D-CODE: Discovering Closed-form ODEs from Observed Trajectories | Z. Qian, K. Kacprzyk, M. van der Schaar | 2022 | https://openreview.net/forum?id=wENMvIsxNN | For centuries, scientists have manually designed closed-form ordinary differential equations (ODEs) to model dynamical systems. An automated tool to distill closed-form ODEs from observed trajectories would accelerate the modeling process. Traditionally, symbolic regression is used to uncover a closed-form prediction function a = f(b) with label-feature pairs (a_i, b_i) as training examples. However, an ODE models the time derivative ẋ(t) of a dynamical system, e.g. ẋ(t) = f(x(t),t), and the “label” ẋ(t) is usually not observed. The existing ways to bridge this gap only perform well for a narrow range of settings with low measurement noise and frequent sampling. In this work, we propose the Discovery of Closed-form ODE framework (D-CODE), which advances symbolic regression beyond the paradigm of supervised learning. D-CODE uses a novel objective function based on the variational formulation of ODEs to bypass the unobserved time derivative. For formal justification, we prove that this objective is a valid proxy for the estimation error of the true (but unknown) ODE. In the experiments, D-CODE successfully discovered the governing equations of a diverse range of dynamical systems under challenging measurement settings with high noise and infrequent sampling. | Conference | ICLR | Time series analysis, Interpretability & explainability | Risk & disease trajectories, Treatment & trials, Scientific discovery | |
2022/01/21 10:17 | Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies | A. J. Chan, A. Curth, M. van der Schaar | 2022 | https://openreview.net/forum?id=DYypjaRdph2 | Human decision making is well known to be imperfect and the ability to analyse such processes individually is crucial when attempting to aid or improve a decision-maker's ability to perform a task, e.g. to alert them to potential biases or oversights on their part. To do so, it is necessary to develop interpretable representations of how agents make decisions and how this process changes over time as the agent learns online in reaction to the accrued experience. To then understand the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse to this online learning problem. By interpreting actions within a potential outcomes framework, we introduce a meaningful mapping based on agents choosing an action they believe to have the greatest treatment effect. We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them, using a novel architecture built upon an expressive family of deep state-space models. Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time. | Conference | ICLR | Interpretability & explainability, Uncertainty estimation | Clinical practice | |||
2022/01/21 10:14 | POETREE: Interpretable Policy Learning with Adaptive Decision Trees | A. Pace, A. J. Chan, M. van der Schaar | 2022 | https://openreview.net/forum?id=AJsI-ymaKn_ | Building models of human decision-making from observed behaviour is critical to better understand, diagnose and support real-world policies such as clinical care. As established policy learning approaches remain focused on imitation performance, they fall short of explaining the demonstrated decision-making process. Policy Extraction through decision Trees (POETREE) is a novel framework for interpretable policy learning, compatible with fully-offline and partially-observable clinical decision environments -- and builds probabilistic tree policies determining physician actions based on patients' observations and medical history. Fully-differentiable tree architectures are grown incrementally during optimization to adapt their complexity to the modelling task, and learn a representation of patient history through recurrence, resulting in decision tree policies that adapt over time with patient information. This policy learning method outperforms the state-of-the-art on real and synthetic medical datasets, both in terms of understanding, quantifying and evaluating observed behaviour as well as in accurately replicating it -- with potential to improve future decision support systems. | Conference | ICLR | Interpretability & explainability, Quantitative epistemology (understanding decision-making) | Clinical practice | |||
2022/05/16 16:19 | Continuous-Time Modeling of Counterfactual Outcomes Using Neural Controlled Differential Equations | N. Seedat*, F. Imrie*, A. Bellot, Z. Qian, M. van der Schaar | 2022 | https://proceedings.mlr.press/v162/seedat22b.html | Estimating counterfactual outcomes over time has the potential to unlock personalized healthcare by assisting decision-makers to answer “what-if” questions. Existing causal inference approaches typically consider regular, discrete-time intervals between observations and treatment decisions and hence are unable to naturally model irregularly sampled data, which is the common setting in practice. To handle arbitrary observation patterns, we interpret the data as samples from an underlying continuous-time process and propose to model its latent trajectory explicitly using the mathematics of controlled differential equations. This leads to a new approach, the Treatment Effect Neural Controlled Differential Equation (TE-CDE), that allows the potential outcomes to be evaluated at any time point. In addition, adversarial training is used to adjust for time-dependent confounding which is critical in longitudinal settings and is an added challenge not encountered in conventional time-series. To assess solutions to this problem, we propose a controllable simulation environment based on a model of tumor growth for a range of scenarios with irregular sampling reflective of a variety of clinical scenarios. TE-CDE consistently outperforms existing approaches in all simulated scenarios with irregular sampling. | Conference | ICML | Causal inference, Time series analysis | Treatment & trials, Risk & disease trajectories | |||
2022/05/16 16:15 | Data-SUITE: Data-centric identification of in-distribution incongruous examples | N. Seedat, J. Crabbé, M. van der Schaar | 2022 | https://proceedings.mlr.press/v162/seedat22a.html | Systematic quantification of data quality is critical for consistent model performance. Prior works have focused on out-of-distribution data. Instead, we tackle an understudied yet equally important problem of characterizing incongruous regions of in-distribution (ID) data, which may arise from feature space heterogeneity. To this end, we propose a paradigm shift with Data-SUITE: a data-centric framework to identify these regions, independent of a task-specific model. Data-SUITE leverages copula modeling, representation learning, and conformal prediction to build feature-wise confidence interval estimators based on a set of training instances. These estimators can be used to evaluate the congruence of test instances with respect to the training set, to answer two practically useful questions: (1) which test instances will be reliably predicted by a model trained with the training instances? and (2) can we identify incongruous regions of the feature space so that data owners understand the data’s limitations or guide future data collection? We empirically validate Data-SUITE’s performance and coverage guarantees and demonstrate on cross-site medical data, biased data, and data with concept drift, that Data-SUITE best identifies ID regions where a downstream model may be reliable (independent of said model). We also illustrate how these identified regions can provide insights into datasets and highlight their limitations. | Conference | ICML | Data-centric AI & reliable ML | |
2022/05/16 16:09 | HyperImpute: Generalized Iterative Imputation with Automatic Model Selection | D. Jarrett, B. Cebere, T. Liu, A. Curth, M. van der Schaar | 2022 | https://proceedings.mlr.press/v162/jarrett22a.html | Consider the problem of imputing missing values in a dataset. On the one hand, conventional approaches using iterative imputation benefit from the simplicity and customizability of learning conditional distributions directly, but suffer from the practical requirement for appropriate model specification of each and every variable. On the other hand, recent methods using deep generative modeling benefit from the capacity and efficiency of learning with neural network function approximators, but are often difficult to optimize and rely on stronger data assumptions. In this work, we study an approach that marries the advantages of both: We propose HyperImpute, a generalized iterative imputation framework for adaptively and automatically configuring column-wise models and their hyperparameters. Practically, we provide a concrete implementation with out-of-the-box learners, optimizers, simulators, and extensible interfaces. Empirically, we investigate this framework via comprehensive experiments and sensitivities on a variety of public datasets, and demonstrate its ability to generate accurate imputations relative to a strong suite of benchmarks. Contrary to recent work, we believe our findings constitute a strong defense of the iterative imputation paradigm. | Conference | ICML | Automated ML | Missing data imputation | |
2022/05/16 16:08 | Inverse Contextual Bandits: Learning How Behavior Evolves over Time | A. Hüyük, D. Jarrett, M. van der Schaar | 2022 | https://proceedings.mlr.press/v162/huyuk22a.html | Understanding a decision-maker's priorities by observing their behavior is critical for transparency and accountability in decision processes—such as in healthcare. Though conventional approaches to policy learning almost invariably assume stationarity in behavior, this is hardly true in practice: Medical practice is constantly evolving as clinical professionals fine-tune their knowledge over time. For instance, as the medical community's understanding of organ transplantations has progressed over the years, a pertinent question is: How have actual organ allocation policies been evolving? To give an answer, we desire a policy learning method that provides interpretable representations of decision-making, in particular capturing an agent's non-stationary knowledge of the world, as well as operating in an offline manner. First, we model the evolving behavior of decision-makers in terms of contextual bandits, and formalize the problem of Inverse Contextual Bandits ("ICB"). Second, we propose two concrete algorithms as solutions, learning parametric and non-parametric representations of an agent's behavior. Finally, using both real and simulated data for liver transplantations, we illustrate the applicability and explainability of our method, as well as benchmarking and validating the accuracy of our algorithms. | Conference | ICML | Quantitative epistemology (understanding decision-making) | ||||
2022/05/16 16:07 | Label-Free Explainability for Unsupervised Models | J. Crabbé, M. van der Schaar | 2022 | https://proceedings.mlr.press/v162/crabbe22a.html | Unsupervised black-box models are challenging to interpret. Indeed, most existing explainability methods require labels to select which component(s) of the black-box's output to interpret. In the absence of labels, black-box outputs often are representation vectors whose components do not correspond to any meaningful quantity. Hence, choosing which component(s) to interpret in a label-free unsupervised/self-supervised setting is an important, yet unsolved problem. To bridge this gap in the literature, we introduce two crucial extensions of post-hoc explanation techniques: (1) label-free feature importance and (2) label-free example importance that respectively highlight influential features and training examples for a black-box to construct representations at inference time. We demonstrate that our extensions can be successfully implemented as simple wrappers around many existing feature and example importance methods. We illustrate the utility of our label-free explainability paradigm through a qualitative and quantitative comparison of representation spaces learned by various autoencoders trained on distinct unsupervised tasks. | Conference | ICML | Interpretability & explainability | ||||
2022/05/16 16:04 | Neural Laplace: Learning diverse classes of differential equations in the Laplace domain | S. Holt, Z. Qian, M. van der Schaar | 2022 | https://proceedings.mlr.press/v162/holt22a.html | Neural Ordinary Differential Equations model dynamical systems with ODEs learned by neural networks. However, ODEs are fundamentally inadequate to model systems with long-range dependencies or discontinuities, which are common in engineering and biological systems. Broader classes of differential equations (DE) have been proposed as remedies, including delay differential equations and integro-differential equations. Furthermore, Neural ODE suffers from numerical instability when modelling stiff ODEs and ODEs with piecewise forcing functions. In this work, we propose Neural Laplace, a unifying framework for learning diverse classes of DEs including all the aforementioned ones. Instead of modelling the dynamics in the time domain, we model it in the Laplace domain, where the history-dependencies and discontinuities in time can be represented as summations of complex exponentials. To make learning more efficient, we use the geometrical stereographic map of a Riemann sphere to induce more smoothness in the Laplace domain. In the experiments, Neural Laplace shows superior performance in modelling and extrapolating the trajectories of diverse classes of DEs, including the ones with complex history dependency and abrupt changes. | Conference | ICML | Time series analysis | Scientific discovery, Risk & disease trajectories | |
2022/05/16 15:05 | How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models | A. M. Alaa, B. van Breugel, E. S. Saveliev, M. van der Schaar | 2022 | https://proceedings.mlr.press/v162/alaa22a.html | Devising domain- and model-agnostic evaluation metrics for generative models is an important and as yet unresolved problem. Most existing metrics, which were tailored solely to the image synthesis setup, exhibit a limited capacity for diagnosing the different modes of failure of generative models across broader application domains. In this paper, we introduce a 3-dimensional evaluation metric, (α-Precision, β-Recall, Authenticity), that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion. Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity. We introduce generalization as an additional, independent dimension (to the fidelity-diversity trade-off) that quantifies the extent to which a model copies training data—a crucial performance indicator when modeling sensitive data with requirements on privacy. The three metric components correspond to (interpretable) probabilistic quantities, and are estimated via sample-level binary classification. The sample-level nature of our metric inspires a novel use case which we call model auditing, wherein we judge the quality of individual samples generated by a (black-box) model, discarding low-quality samples and hence improving the overall model performance in a post-hoc manner. | Conference | ICML | Privacy-preserving ML & synthetic data | ||||
2023/02/01 12:53 | Composite Feature Selection Using Deep Ensembles | F. Imrie*, A. Norcliffe*, P. Lio, M. van der Schaar | 2022 | https://arxiv.org/abs/2211.00631 | In many real world problems, features do not act alone but in combination with each other. For example, in genomics, diseases might not be caused by any single mutation but require the presence of multiple mutations. Prior work on feature selection either seeks to identify individual features or can only determine relevant groups from a predefined set. We investigate the problem of discovering groups of predictive features without predefined grouping. To do so, we define predictive groups in terms of linear and non-linear interactions between features. We introduce a novel deep learning architecture that uses an ensemble of feature selection models to find predictive groups, without requiring candidate groups to be provided. The selected groups are sparse and exhibit minimum overlap. Furthermore, we propose a new metric to measure similarity between discovered groups and the ground truth. We demonstrate the utility of our model on multiple synthetic tasks and semi-synthetic chemistry datasets, where the ground truth structure is known, as well as an image dataset and a real-world cancer dataset. | Conference | NeurIPS | Deep learning, Feature selection | Genomics, Scientific discovery | |
2023/02/01 12:52 | Transfer Learning on Heterogeneous Feature Spaces for Treatment Effects Estimation | I. Bica, M. van der Schaar | 2022 | https://arxiv.org/abs/2210.06183 | Consider the problem of improving the estimation of conditional average treatment effects (CATE) for a target domain of interest by leveraging related information from a source domain with a different feature space. This heterogeneous transfer learning problem for CATE estimation is ubiquitous in areas such as healthcare where we may wish to evaluate the effectiveness of a treatment for a new patient population for which different clinical covariates and limited data are available. In this paper, we address this problem by introducing several building blocks that use representation learning to handle the heterogeneous feature spaces and a flexible multi-task architecture with shared and private layers to transfer information between potential outcome functions across domains. Then, we show how these building blocks can be used to recover transfer learning equivalents of the standard CATE learners. On a new semi-synthetic data simulation benchmark for heterogeneous transfer learning we not only demonstrate performance improvements of our heterogeneous transfer causal effect learners across datasets, but also provide insights into the differences between these learners from a transfer perspective. | Conference | NeurIPS | Causal inference, Transfer learning | Treatment & trials | |
2023/02/01 12:51 | Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning | A. J. Chan, M. van der Schaar | 2022 | https://arxiv.org/abs/2210.05320 | Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data – instead given access to a set of expert models and their predictions alongside some limited information about the dataset used to train them. In scenarios from finance to the medical sciences, and even consumer practice, stakeholders have developed models on private data they either cannot, or do not want to, share. Given the value and legislation surrounding personal information, it is not surprising that only the models, and not the data, will be released – the pertinent question becoming: how best to use these models? Previous work has focused on global model selection or ensembling, with the result of a single final model across the feature space. Machine learning models perform notoriously poorly on data outside their training domain however, and so we argue that when ensembling models the weightings for individual instances must reflect their respective domains – in other words models that are more likely to have seen information on that instance should have more attention paid to them. We introduce a method for such an instance-wise ensembling of models, including a novel representation learning step for handling sparse high-dimensional domains. Finally, we demonstrate the need and generalisability of our method on classical machine learning tasks as well as highlighting a real world use case in the pharmacological setting of vancomycin precision dosing. | Conference | NeurIPS | Ensemble learning | Treatment & trials | |||
2023/02/01 12:51 | Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data | N. Seedat, J. Crabbé, I. Bica, M. van der Schaar | 2022 | https://arxiv.org/abs/2210.13043 | High model performance, on average, can hide that models may systematically underperform on subgroups of the data. We consider the tabular setting, which surfaces the unique issue of outcome heterogeneity – this is prevalent in areas such as healthcare, where patients with similar features can have different outcomes, thus making reliable predictions challenging. To tackle this, we propose Data-IQ, a framework to systematically stratify examples into subgroups with respect to their outcomes. We do this by analyzing the behavior of individual examples during training, based on their predictive confidence and, importantly, the aleatoric (data) uncertainty. Capturing the aleatoric uncertainty permits a principled characterization and then subsequent stratification of data examples into three distinct subgroups (Easy, Ambiguous, Hard). We experimentally demonstrate the benefits of Data-IQ on four real-world medical datasets. We show that Data-IQ’s characterization of examples is most robust to variation across similarly performant (yet different) models, compared to baselines. Since Data-IQ can be used with any ML model (including neural networks, gradient boosting etc.), this property ensures consistency of data characterization, while allowing flexible model selection. Taking this a step further, we demonstrate that the subgroups enable us to construct new approaches to both feature acquisition and dataset selection. Furthermore, we highlight how the subgroups can inform reliable model usage, noting the significant impact of the Ambiguous subgroup on model generalization. | Conference | NeurIPS | Data-centric AI & reliable ML | Risk & prognosis | |||
2023/02/01 12:50 | Online Decision Mediation from Scratch | D. Jarrett, A. Hüyük, M. van der Schaar | 2022 | https://openreview.net/forum?id=2ZfUNW7SoaS | Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior: At each time, the algorithm observes an action chosen by a fallible agent, and decides whether to accept that agent’s decision, intervene with an alternative, or request the expert’s opinion. For instance, in clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances, thus real-world decision support is often limited to monitoring and forecasting. Instead, such an intermediary would strike a prudent balance between the former (purely prescriptive) and latter (purely descriptive) approaches, while providing an efficient interface between human mistakes and expert feedback. In this work, we first formalize the sequential problem of online decision mediation—that is, of simultaneously learning and evaluating mediator policies from scratch with abstentive feedback: In each round, deferring to the oracle obviates the risk of error, but incurs an upfront penalty, and reveals the otherwise hidden expert action as a new training data point. Second, we motivate and propose a solution that seeks to trade off (immediate) loss terms against (future) improvements in generalization error; in doing so, we identify why conventional bandit algorithms may fail. Finally, through experiments and sensitivities on a variety of datasets, we illustrate consistent gains over applicable benchmarks on performance measures with respect to the mediator policy, the learned model, and the decision-making system as a whole. | Conference | NeurIPS | Quantitative Epistemology | Clinical Practice | |||
2023/02/01 12:49 | Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability | J. Crabbé, A. Curth, I. Bica, M. van der Schaar | 2022 | https://arxiv.org/abs/2206.08363 | Estimating personalized effects of treatments is a complex, yet pervasive problem. To tackle it, recent developments in the machine learning (ML) literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools: due to their flexibility, modularity and ability to learn constrained representations, neural networks in particular have become central to this literature. Unfortunately, the assets of such black boxes come at a cost: models typically involve countless nontrivial operations, making it difficult to understand what they have learned. Yet, understanding these models can be crucial — in a medical context, for example, discovered knowledge on treatment effect heterogeneity could inform treatment prescription in clinical practice. In this work, we therefore use post-hoc feature importance methods to identify features that influence the model’s predictions. This allows us to evaluate treatment effect estimators along a new and important dimension that has been overlooked in previous work: We construct a benchmarking environment to empirically investigate the ability of personalized treatment effect models to identify predictive covariates — covariates that determine differential responses to treatment. Our benchmarking environment then enables us to provide new insight into the strengths and weaknesses of different types of treatment effects models as we modulate different challenges specific to treatment effect estimation — e.g. the ratio of prognostic to predictive information, the possible nonlinearity of potential outcomes and the presence and type of confounding. | Conference | NeurIPS | Interpretability & Explainability | Treatment & Trials | |||
2023/02/01 12:48 | Concept Activation Regions: A Generalized Framework for Concept-Based Explanations | J. Crabbé, M. van der Schaar | 2022 | https://arxiv.org/abs/2209.11222 | Concept-based explanations permit to understand the predictions of a deep neural network (DNN) through the lens of concepts specified by users. Existing methods assume that the examples illustrating a concept are mapped in a fixed direction of the DNN’s latent space. When this holds true, the concept can be represented by a concept activation vector (CAV) pointing in that direction. In this work, we propose to relax this assumption by allowing concept examples to be scattered across different clusters in the DNN’s latent space. Each concept is then represented by a region of the DNN’s latent space that includes these clusters and that we call concept activation region (CAR). To formalize this idea, we introduce an extension of the CAV formalism that is based on the kernel trick and support vector classifiers. This CAR formalism yields global concept-based explanations and local concept-based feature importance. We prove that CAR explanations built with radial kernels are invariant under latent space isometries. In this way, CAR assigns the same explanations to latent spaces that have the same geometry. We further demonstrate empirically that CARs offer (1) more accurate descriptions of how concepts are scattered in the DNN’s latent space; (2) global explanations that are closer to human concept annotations and (3) concept-based feature importance that meaningfully relate concepts with each other. Finally, we use CARs to show that DNNs can autonomously rediscover known scientific concepts, such as the prostate cancer grading system. | Conference | NeurIPS | Interpretability & Explainability, Deep Learning | ||||
2022/11/22 17:30 | Development and External Validation of A Risk Calculator for Prediction of Major Complications and Readmission after Anterior Cervical Discectomy and Fusion | A. A. Shah, S. K. Devana, C. Lee, T. E. Olson, A. Upfill-Brown, W. L. Sheppard, E. L. Lord, A. N. Shamie, M. van der Schaar, N. F. SooHoo, D. Y. Park. | 2022 | https://journals.lww.com/spinejournal/Abstract/9900/Development_and_External_Validation_of_A_Risk.167.aspx | Study Design. Retrospective, case-control study Objective. We aim to build a risk calculator predicting major perioperative complications after anterior cervical fusion. Additionally, we aim to externally validate this calculator with an institutional cohort of patients who underwent anterior cervical discectomy and fusion (ACDF). Summary of Background Data. The average age and proportion of patients with at least one comorbidity undergoing ACDF have increased in recent years. Given the increased morbidity and cost associated with perioperative complications and unplanned readmission, accurate risk stratification of patients undergoing ACDF is of great clinical utility. Methods. This is a retrospective cohort study of adults who underwent anterior cervical fusion at any non-federal California hospital between 2015-2017. The primary outcome was major perioperative complication or 30-day readmission. We built standard and ensemble machine learning models for risk prediction, assessing discrimination and calibration. The best-performing model was validated on an external cohort comprised of consecutive adult patients who underwent ACDF at our institution between 2013-2020. Results. A total of 23,184 patients were included in this study; there were 1,886 cases of major complication or readmissions. The ensemble model was well-calibrated and demonstrated an area under the receiver operating characteristic curve (AUROC) of 0.728. The variables most important for the ensemble model include male sex, medical comorbidities, history of complications, and teaching hospital status. The ensemble model was evaluated on the validation cohort (n=260) with an AUROC of 0.802. The ensemble algorithm was used to build a web-based risk calculator. Conclusion. We report derivation and external validation of an ensemble algorithm for prediction of major perioperative complications and 30-day readmission after anterior cervical fusion. This model has excellent discrimination and is well-calibrated when tested on a contemporaneous external cohort of ACDF cases. | 10.1097/BRS.0000000000004531 | Journal | Spine | Risk & prognosis | |||
2022/11/21 19:29 | Sex differences and disparities in cardiovascular outcomes of COVID-19 | R. Bugiardini, S. Nava, G. Caramori, J. Yoon, L. Badimon, M. Bergami, E. Cenko, A. David, I. Demiri, M. Dorobantu, O. Fronea, R. Jankovic, S. Kedev, N. Ladjevic, R. Lasica, G. Loncar, G. Mancuso, G. Mendieta, D. Miličić, P. Mjehović, M. Pašalić, M. Petrović, L. Poposka, M. Scarpone, M. Stefanovic, M. van der Schaar, Z. Vasiljevic, M. Vavlukis, M. L. V. Pittao, V. Vukomanovic, M. Zdravkovic, O. Manfrini | 2022 | Background: Previous analyses of sex differences in case fatality rates using population-level data had limited adjustment for key patient clinical characteristics thought to be associated with COVID-19 outcomes. We aimed to estimate the risk of specific organ dysfunctions and mortality in women and men. Methods and Results: This retrospective cross-sectional study included 17 hospitals within 5 European countries participating in the International Survey of Acute Coronavirus Syndromes (ISACS) COVID-19 (NCT05188612). Participants were individuals hospitalized with positive SARS-CoV-2 from March 2020 to February 2022. Risk-adjusted ratios (RR) of in-hospital mortality, acute respiratory failure (ARF), acute heart failure (AHF), and acute kidney injury (AKI) were calculated for women versus men. Estimates were evaluated by inverse probability weighting and logistic regression models. The overall care cohort included 4,499 patients with COVID-19-associated hospitalizations. Of these, 1,524 (33.9%) were admitted to ICU, and 1,117 (24.8%) died during hospitalization. Compared with men, women were less likely to be admitted to ICU (RR: 0.80; 95% CI: 0.71–0.91). In general wards (GW) and ICU cohorts, the adjusted women-to-men RRs for in-hospital mortality were 1.13 (95% CI: 0.90–1.42) and 0.86 (95% CI: 0.70–1.05; p_interaction=0.04). Development of AHF, AKI and ARF was associated with increased mortality risk (ORs: 2.27, 95% CI: 1.73–2.98; 3.85, 95% CI: 3.21–4.63; and 3.95, 95% CI: 3.04–5.14, respectively). The adjusted RRs for AKI and ARF were comparable among women and men regardless of intensity of care. By contrast, female sex was associated with higher odds for AHF in GW, but not in ICU (RRs: 1.25, 95% CI: 0.94–1.67 versus 0.83, 95% CI: 0.59–1.16; p_interaction=0.04). Conclusions: Women in GW were at increased risk of AHF and in-hospital mortality for COVID-19 compared with men. For patients receiving ICU care, fatal complications including AHF and mortality appeared to be independent of sex. Equitable access to COVID-19 ICU care is needed to minimize the unfavourable outcome of women presenting with COVID-19-related complications. | Journal | Cardiovascular Research | Survival analysis competing risks & comorbidities | Risk & prognosis | |
2022/07/18 16:09 | A risk calculator for prediction of C5 nerve root palsy after instrumented cervical fusion | A. A. Shah, S. Devana, C. Lee, A. Bugarin, M. Hong, A. Upfill-Brown, G. Blumstein, E. Lord, A. Shamie, M. van der Schaar, N. SooHoo, D. Park | 2022 | Background: C5 palsy is a common post-operative complication after cervical fusion and is associated with increased healthcare costs and diminished quality of life. Accurate prediction of C5 palsy would be of great clinical utility, allowing for appropriate pre-operative counseling and risk stratification. We primarily aim to develop an algorithm for prediction of C5 palsy after instrumented cervical fusion and identify novel features for risk prediction. Additionally, we aim to build a risk calculator to provide patient-specific prediction for C5 palsy. Methods: We identified adult patients who underwent instrumented cervical fusion at our tertiary care medical center between 2013-2020. The primary outcome was post-operative C5 palsy. We developed ensemble machine learning, standard machine learning, and logistic regression models predicting risk of C5 palsy – assessing discrimination and calibration. The features most important for the ensemble model were identified. Additionally, a web-based risk calculator was built for the ensemble model. Results: A total of 1,024 patients were included, with 52 cases of C5 palsy. The ensemble model was well-calibrated and demonstrated excellent discrimination with an area under the receiver-operating characteristic curve of 0.773. The following features were the most important for ensemble model performance: diabetes mellitus, bipolar disorder, C5 or C4 level, surgical approach, pre-operative non-motor neurologic symptoms, degenerative disease, number of fused levels, and age. Conclusions: We report a risk calculator that provides patient-specific risk of C5 palsy after instrumented cervical fusion. Individualized risk prediction for patients may facilitate improved pre-operative patient counseling and risk stratification as well as potential intra-operative mitigating measures. This tool may also aid with addressing potentially modifiable risk factors such as diabetes and obesity. | Journal | World Neurosurgery | Ensemble learning, Survival analysis competing risks & comorbidities | Risk & prognosis | |
2022/07/14 17:28 | Developing machine learning algorithms for dynamic estimation of progression during active surveillance for prostate cancer | C. Lee, A. Light, E. S. Saveliev, M. van der Schaar, V. Gnanapragasam | 2022 | https://rdcu.be/cS91j | Active Surveillance (AS) for prostate cancer is a management option that continually monitors early disease and considers intervention if progression occurs. A robust method to incorporate “live” updates of progression risk during follow-up has hitherto been lacking. To address this, we developed a deep learning-based individualized longitudinal survival model using Dynamic-DeepHit-Lite (DDHL) that learns data-driven distribution of time-to-event outcomes. Further refining outputs, we used a reinforcement learning approach (Actor-Critic) for temporal predictive clustering (AC-TPC) to discover groups with similar time-to-event outcomes to support clinical utility. We applied these methods to data from 585 men on AS with longitudinal and comprehensive follow-up (median 4.4 years). Time-dependent C-indices and Brier scores were calculated and compared to Cox regression and landmarking methods. Both Cox and DDHL models including only baseline variables showed comparable C-indices but the DDHL model performance improved with additional follow-up data. With 3 years of data collection and 3 years follow-up the DDHL model had a C-index of 0.79 (± 0.11) compared to 0.70 (± 0.15) for landmarking Cox and 0.67 (± 0.09) for baseline Cox only. Model calibration was good across all models tested. The AC-TPC method further discovered 4 distinct outcome-related temporal clusters with distinct progression trajectories. Those in the lowest risk cluster had negligible progression risk while those in the highest cluster had a 50% risk of progression by 5 years. In summary we report a novel machine learning approach to inform personalised follow-up during active surveillance which improves predictive power with increasing data input over time. | 10.1038/s41746-022-00659-w | Journal | npj Digital Medicine | Deep learning, Time series analysis, Survival analysis competing risks & comorbidities | Risk & prognosis, Risk & disease trajectories, Clinical practice, Phenotyping & subgroup analysis | ||
2022/03/29 16:07 | Reduced Heart Failure and Mortality in Patients Receiving Statin Therapy before Initial Acute Coronary Syndrome | R. Bugiardini, J. Yoon, G. Mendieta, S. Kedev, M. Zdravkovic, Z. Vasiljevic, D. Miličić, O. Manfrini, M. van der Schaar, C. P. Gale, M. Bergami, L. Badimon, E. Cenko | 2022 | Despite guideline-based recommendations for atherosclerotic cardiovascular disease (ASCVD), the use of statins in primary prevention is still controversial, owing to the variable risk of events in this population. The present study from the ISACS-Archives registry suggests that in patients with ACSs as a first manifestation of ASCVD prior use of statins reduces acute heart failure (AHF) events and confers a survival benefit from AHF. Benefits are consistent regardless of age and sex. In the absence of definitive evidence from trials our data provide sufficient grounds for further recommendation of statin therapy in the primary prevention setting. | Journal | Journal of the American College of Cardiology | Survival analysis competing risks & comorbidities | Treatment & trials, Clinical practice | ||||
2022/02/01 17:16 | Artificial intelligence and machine learning algorithms for early detection of skin cancer in community and primary care settings: a systematic review | O. T. Jones, R. N. Matin, M. van der Schaar, K. P. Bhayankaram, C. K. I. Ranmuthu, M. S. Islam, D. Behiyat, R. Boscott, N. Calanzani, J. Emery, H. C. Williams, F. M. Walter | 2022 | https://www.thelancet.com/journals/landig/article/PIIS2589-7500(22)00023-1/fulltext | Skin cancers occur commonly worldwide. The prognosis and disease burden are highly dependent on the cancer type and disease stage at diagnosis. We systematically reviewed studies on artificial intelligence and machine learning (AI/ML) algorithms that aim to facilitate the early diagnosis of skin cancers, focusing on their application in primary and community care settings. We searched MEDLINE, Embase, Scopus, and Web of Science (from Jan 1, 2000, to Aug 9, 2021) for all studies providing evidence on applying AI/ML algorithms to the early diagnosis of skin cancer, including all study designs and languages. The primary outcome was diagnostic accuracy of the algorithms for skin cancers. The secondary outcomes included an overview of AI/ML methods, evaluation approaches, cost-effectiveness, and acceptability to patients and clinicians. We identified 14 224 studies. Only two studies used data from clinical settings with a low prevalence of skin cancers. We reported data from all 272 studies that could be relevant in primary care. The primary outcomes showed reasonable mean diagnostic accuracy for melanoma (89.5% [range 59.7–100%]), squamous cell carcinoma (85.3% [71.0–97.8%]), and basal cell carcinoma (87.6% [70.0–99.7%]). The secondary outcomes showed a heterogeneity of AI/ML methods and study designs, with high amounts of incomplete reporting (eg, patient demographics and methods of data collection). Few studies used data on populations with a low prevalence of skin cancers to train and test their algorithms; therefore, the widespread adoption into community and primary care practice cannot currently be recommended until efficacy in these populations is shown. We did not identify any health economic, patient, or clinician acceptability data for any of the included studies. We propose a methodological checklist for use in the development of new AI/ML algorithms to detect skin cancer, to facilitate their design, evaluation, and implementation. | Journal | The Lancet Digital Health | |
2022/01/12 16:30 | Can we reliably automate clinical prognostic modelling? A retrospective cohort study for ICU triage prediction of in-hospital mortality of COVID-19 patients in the Netherlands | I. Vagliano, S. Brinkman, A. Abu-Hanna, M. S. Arbous, D. A. Dongelmans, P. W. G. Elbers, D. W. de Lange, N. F. de Keizer, M. C. Schut, M. van der Schaar | 2022 | Building Machine Learning (ML) models in healthcare may suffer from time-consuming and potentially biased pre-selection of predictors by hand that can result in limited or trivial selection of suitable models. We aimed to assess the predictive performance of automating the process of building ML models (AutoML) for in-hospital mortality prediction modelling of triage COVID-19 patients at ICU admission versus expert-based predictor pre-selection followed by logistic regression. We conducted an observational study of all COVID-19 patients admitted to Dutch ICUs between February and July 2020. We included 2,690 COVID-19 patients from 70 ICUs participating in the Dutch National Intensive Care Evaluation registry. The main outcome measure was in-hospital mortality. We assessed model performance (at admission and after 24 hours, respectively) of AutoML compared to the more traditional approach of predictor pre-selection and logistic regression. AutoML delivers prediction models with fair discriminatory performance, and good calibration and accuracy, which are as good as regression models with expert-based predictor pre-selection. In the context of the restricted availability of data in an ICU quality registry, extending the models with variables that are available at 24 hours after admission showed a small (but significant) performance increase. | Journal | International Journal of Medical Informatics | Automated ML | |
2022/01/07 13:37 | Development of a Machine Learning Algorithm for Prediction of Complications and Unplanned Readmission following Primary Anatomic Total Shoulder Replacements | S. Devana, A. Shah, C. Lee, A. Jensen, E. Cheung, M. van der Schaar, N. SooHoo | 2022 | https://journals.sagepub.com/doi/pdf/10.1177/24715492221075444 | The demand and incidence of anatomic total shoulder arthroplasty (aTSA) procedures are projected to increase substantially over the next decade. There is a paucity of accurate risk prediction models which would be of great utility in minimizing morbidity and costs associated with major post-operative complications. Machine learning is a powerful predictive modeling tool and has become increasingly popular, especially in orthopedics. We aimed to build an ML model for prediction of major complications and readmission following primary aTSA. A large California administrative database was retrospectively reviewed for all adults undergoing primary aTSA between 2015-2017. The primary outcome was any major complication or readmission following aTSA. A wide scope of standard ML benchmarks, including logistic regression (LR), XGBoost, Gradient boosting, AdaBoost and Random Forest, were employed to determine their power to predict outcomes. Additionally, important patient features to the prediction models were identified. There were a total of 10,302 aTSAs with 598 (5.8%) having at least one major post-operative complication or readmission. XGBoost had the highest discriminative power (area under the receiver operating characteristic curve, AUROC, of 0.689) of the 5 ML benchmarks, with an area under the precision-recall curve (AUPRC) of 0.207. History of implant complication, severe chronic kidney disease, teaching hospital status, coronary artery disease and male sex were the most important features for the performance of XGBoost. In addition, XGBoost identified teaching hospital status and male sex as markedly more important predictors of outcomes compared to LR models. We report a well calibrated XGBoost ML algorithm for predicting major complications and 30-day readmission following aTSA. History of prior implant complication was the most important patient feature for XGBoost performance, a novel patient feature that surgeons should consider when counseling patients. | Journal | Journal of Shoulder and Elbow Arthroplasty | |
2021/12/15 13:54 | Conservative Policy Construction Using Variational Autoencoders for Logged Data with Missing Values | M. Abroshan, K. Yip, C. Tekin, M. van der Schaar | 2022 | https://ieeexplore.ieee.org/document/9675815 | In high-stakes applications of data-driven decision making like healthcare, it is of paramount importance to learn a policy that maximizes the reward while avoiding potentially dangerous actions when there is uncertainty. There are two main challenges usually associated with this problem. Firstly, learning through online exploration is not possible due to the critical nature of such applications. Therefore, we need to resort to observational datasets with no counterfactuals. Secondly, such datasets are usually imperfect, additionally cursed with missing values in the attributes of features. In this paper, we consider the problem of constructing personalized policies using logged data when there are missing values in the attributes of features in both training and test data. The goal is to recommend an action (treatment) when X̃, a degraded version of X with missing values, is observed. We consider three strategies for dealing with missingness. In particular, we introduce the conservative strategy where the policy is designed to safely handle the uncertainty due to missingness. In order to implement this strategy, we need to estimate the posterior distribution p(X|X̃); we use a variational autoencoder to achieve this. In particular, our method is based on partial variational autoencoders (PVAE), which are designed to capture the underlying structure of features with missing values. | 10.1109/TNNLS.2021.3136385 | Journal | IEEE Transactions on Neural Networks and Learning Systems | Deep learning, Causal inference | Treatment & trials | |
2021/11/18 14:52 | MARS: Assisting Human with Information Processing Tasks Using Machine Learning | C. Shen, Z. Qian, A. Huyuk, M. van der Schaar | 2022 | https://dl.acm.org/doi/10.1145/3494582 | This article studies the problem of automated information processing from large volumes of unstructured, heterogeneous, and sometimes untrustworthy data sources. The main contribution is a novel framework called Machine Assisted Record Selection (MARS). Instead of today’s standard practice of relying on human experts to manually decide the order of records for processing, MARS learns the optimal record selection via an online learning algorithm. It further integrates algorithm-based record selection and processing with human-based error resolution to achieve a balanced task allocation between machine and human. Both fixed and adaptive MARS algorithms are proposed, leveraging different statistical knowledge about the existence, quality, and cost associated with the records. Experiments using semi-synthetic data that are generated from real-world patient record processing in the UK national cancer registry are carried out, which demonstrate a significant (3- to 4-fold) performance gain over fixed-order processing. MARS represents one of the few examples demonstrating that machine learning can assist humans with complex jobs by automating complex triaging tasks. | 10.1145/3494582 | Journal | ACM Transactions on Computing for Healthcare | Reinforcement learning | Clinical practice | |
2023/09/20 16:06 | Bridging the worlds of pharmacometrics and machine learning | K. Stankevičiūtė, J.-B. Woillard, R. W. Peck, P. Marquet, M. van der Schaar | 2023 | https://link.springer.com/article/10.1007/s40262-023-01310-x | Precision medicine requires individualized modeling of disease and drug dynamics, with machine-learning based computational techniques gaining increasing popularity. The complexity of either field, however, makes current pharmacological problems opaque to machine learning practitioners, and state-of-the-art machine learning methods inaccessible to pharmacometricians. To help bridge the two worlds, we provide an introduction to current problems and techniques in pharmacometrics—ranging from pharmacokinetic and pharmacodynamic modeling to pharmacometric simulations, model-informed precision dosing and systems pharmacology—and review some of the machine learning approaches to address them. We hope this would facilitate collaboration between experts, with complementary strengths of principled pharmacometric modeling and flexibility of machine learning leading to synergistic effects in pharmacological applications. | Journal | Clinical Pharmacokinetics | Clinical practice | ||||
2023/12/04 12:11 | Navigating Data-Centric Artificial Intelligence with DC-Check: Advances, Challenges, and Opportunities | N. Seedat, F. Imrie, M. van der Schaar | 2024 | https://arxiv.org/abs/2211.05764 | Data-centric AI is an emerging paradigm that emphasizes the critical role of data in real-world machine learning (ML) systems --- as a complement to model development. However, data-centric AI is still in its infancy, lacking a standardized framework that outlines necessary data-centric considerations at various stages of the ML pipeline: Data, Training, Testing, and Deployment. This lack of guidance hampers effective communication and design of data-centric driven ML systems. To address this critical gap, we introduce DC-Check, an actionable checklist-style framework that encapsulates data-centric considerations for ML systems. DC-Check is aimed at both practitioners and researchers to serve as a reference guide to data-centric AI development. Around each question in DC-Check, we discuss the applicability of different approaches, survey the state of the art, and highlight specific data-centric AI challenges and research opportunities. While developing DC-Check, we also undertook an analysis of the current data-centric AI landscape. The insights obtained from this exploration support the DC-Check framework, reinforcing its utility and relevance in the rapidly evolving field. To make DC-Check and related resources easily accessible, we provide a DC-Check companion website (https://www.vanderschaar-lab.com/dc-check), which will serve as a living resource, updated as methods and tools evolve. | Journal | IEEE Transactions on Artificial Intelligence | Data-centric AI & reliable ML | ||||
2023/12/04 12:13 | TRIAGE: Characterizing and auditing training data for improved regression | N. Seedat, J. Crabbé, Z. Qian, M. van der Schaar | 2023 | https://arxiv.org/abs/2310.18970 | Data quality is crucial for robust machine learning algorithms, with the recent interest in data-centric AI emphasizing the importance of training data characterization. However, current data characterization methods are largely focused on classification settings, with regression settings largely understudied. To address this, we introduce TRIAGE, a novel data characterization framework tailored to regression tasks and compatible with a broad class of regressors. TRIAGE utilizes conformal predictive distributions to provide a model-agnostic scoring method, the TRIAGE score. We operationalize the score to analyze individual samples' training dynamics and characterize samples as under-, over-, or well-estimated by the model. We show that TRIAGE's characterization is consistent and highlight its utility to improve performance via data sculpting/filtering, in multiple regression settings. Additionally, beyond sample level, we show TRIAGE enables new approaches to dataset selection and feature acquisition. Overall, TRIAGE highlights the value unlocked by data characterization in real-world regression applications. | Conference | NeurIPS | Uncertainty estimation, Data-centric AI & reliable ML | Risk & prognosis, Clinical practice | |||
2023/12/04 12:14 | Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A Comprehensive Benchmark | L. Hansen*, N. Seedat*, M. van der Schaar, A. Petrovic | 2023 | https://arxiv.org/abs/2310.16981 | Synthetic data serves as an alternative in training machine learning models, particularly when real-world data is limited or inaccessible. However, ensuring that synthetic data mirrors the complex nuances of real-world data is a challenging task. This paper addresses this issue by exploring the potential of integrating data-centric AI techniques which profile the data to guide the synthetic data generation process. Moreover, we shed light on the often ignored consequences of neglecting these data profiles during synthetic data generation -- despite seemingly high statistical fidelity. Subsequently, we propose a novel framework to evaluate the integration of data profiles to guide the creation of more representative synthetic data. In an empirical study, we evaluate the performance of five state-of-the-art models for tabular data generation on eleven distinct tabular datasets. The findings offer critical insights into the successes and limitations of current synthetic data generation techniques. Finally, we provide practical recommendations for integrating data-centric insights into the synthetic data generation process, with a specific focus on classification performance, model selection, and feature selection. This study aims to reevaluate conventional approaches to synthetic data generation and promote the application of data-centric AI techniques in improving the quality and effectiveness of synthetic data. | Conference | NeurIPS | Privacy-preserving ML & synthetic data, Data-centric AI & reliable ML | ||||
2023/12/04 12:17 | Can You Rely on Your Model Evaluation? Improving Model Evaluation with Synthetic Test Data | B. van Breugel*, N. Seedat*, F. Imrie, M. van der Schaar | 2023 | https://arxiv.org/abs/2310.16524 | Evaluating the performance of machine learning models on diverse and underrepresented subgroups is essential for ensuring fairness and reliability in real-world applications. However, accurately assessing model performance becomes challenging due to two main issues: (1) a scarcity of test data, especially for small subgroups, and (2) possible distributional shifts in the model's deployment setting, which may not align with the available test data. In this work, we introduce 3S Testing, a deep generative modeling framework to facilitate model evaluation by generating synthetic test sets for small subgroups and simulating distributional shifts. Our experiments demonstrate that 3S Testing outperforms traditional baselines -- including real test data alone -- in estimating model performance on minority subgroups and under plausible distributional shifts. In addition, 3S offers intervals around its performance estimates, exhibiting superior coverage of the ground truth compared to existing approaches. Overall, these results raise the question of whether we need a paradigm shift away from limited real test data towards synthetic test data. | Conference | NeurIPS | Deep learning, Privacy-preserving ML & synthetic data | Risk & prognosis | |||
2023/12/04 18:09 | Risk-Averse Active Sensing for Timely Outcome Prediction under Cost Pressure | Y. Qin, M. van der Schaar, C. Lee | 2023 | https://openreview.net/forum?id=aw1vLo7TE7 | Timely outcome prediction is essential in healthcare to enable early detection and intervention of adverse events. However, in longitudinal follow-ups to patients' health status, cost-efficient acquisition of patient covariates is usually necessary due to the significant expense involved in screening and lab tests. To balance the timely and accurate outcome predictions with acquisition costs, an effective active sensing strategy is crucial. In this paper, we propose a novel risk-averse active sensing approach RAS that addresses the composite decision problem of when to conduct the acquisition and which measurements to make. Our approach decomposes the policy into two sub-policies: acquisition scheduler and feature selector, respectively. Moreover, we introduce a novel risk-aversion training strategy to focus on the underrepresented subgroup of high-risk patients for whom timely and accurate prediction of disease progression is of greater value. Our method outperforms baseline active sensing approaches in experiments with both synthetic and real-world datasets, and we illustrate the significance of our policy decomposition and the necessity of a risk-averse sensing policy through case studies. | Conference | NeurIPS | Reinforcement learning | Early warning systems | |||
2020/04/26 00:00 | Estimating Counterfactual Treatment Outcomes over Time through Adversarially Balanced Representations | I. Bica, A. M. Alaa, J. Jordon, M. van der Schaar | 2020 | https://openreview.net/pdf?id=BJg866NFvB | Identifying when to give treatments to patients and how to select among multiple treatments over time are important medical problems with a few existing solutions. In this paper, we introduce the Counterfactual Recurrent Network (CRN), a novel sequence-to-sequence model that leverages the increasingly available patient observational data to estimate treatment effects over time and answer such medical questions. To handle the bias from time-varying confounders, covariates affecting the treatment assignment policy in the observational data, CRN uses domain adversarial training to build balancing representations of the patient history. At each timestep, CRN constructs a treatment invariant representation which removes the association between patient history and treatment assignments and thus can be reliably used for making counterfactual predictions. On a simulated model of tumour growth, with varying degree of time-dependent confounding, we show how our model achieves lower error in estimating counterfactuals and in choosing the correct treatment and timing of treatment than current state-of-the-art methods. | Conference | ICLR | Causal inference, Deep learning, Time series analysis | Treatment & trials | |||
2020/04/26 00:00 | Target-Embedding Autoencoders for Supervised Representation Learning | D. Jarrett, M. van der Schaar | 2020 | https://openreview.net/pdf?id=BygXFkSYDH | Autoencoder-based learning has emerged as a staple for disciplining representations in unsupervised and semi-supervised settings. This paper analyzes a framework for improving generalization in a purely supervised setting, where the target space is high-dimensional. We motivate and formalize the general framework of target-embedding autoencoders (TEA) for supervised prediction, learning intermediate latent representations jointly optimized to be both predictable from features as well as predictive of targets---encoding the prior that variations in targets are driven by a compact set of underlying factors. As our theoretical contribution, we provide a guarantee of generalization for linear TEAs by demonstrating uniform stability, interpreting the benefit of the auxiliary reconstruction task as a form of regularization. As our empirical contribution, we extend validation of this approach beyond existing static classification applications to multivariate sequence forecasting, verifying their advantage on both linear and nonlinear recurrent architectures---thereby underscoring the further generality of this framework beyond feedforward instantiations. | Conference | ICLR | Deep learning | ||||
2020/01/01 00:00 | A Non-Stationary Bandit-Learning Approach to Energy-Efficient Femto-Caching with Rateless-Coded Transmission | S. Maghsudi, M. van der Schaar | 2020 | https://ieeexplore.ieee.org/document/9080557 | The ever-increasing demand for media streaming together with limited backhaul capacity renders developing efficient file-delivery methods imperative. One such method is femto-caching, which, despite its great potential, imposes several challenges such as efficient resource management. We study a resource allocation problem for joint caching and transmission in small cell networks, where the system operates in two consecutive phases: (i) cache placement, and (ii) joint file- and transmit power selection followed by broadcasting. We define the utility of every small base station in terms of the number of successful reconstructions per unit of transmission power. We then formulate the problem as to select a file from the cache together with a transmission power level for every broadcast round so that the accumulated utility over the horizon is maximized. The former problem boils down to a stochastic knapsack problem, and we cast the latter as a multi-armed bandit problem. We develop a solution to each problem and provide theoretical and numerical evaluations. In contrast to the state-of-the-art research, the proposed approach is especially suitable for networks with time-variant statistical properties. Moreover, it is applicable and operates well even when no initial information about the statistical characteristics of the random parameters such as file popularity and channel quality is available. | 10.1109/TWC.2020.2989179 | Journal | IEEE Transactions on Wireless Communications | Multi-armed bandits | |||
2020/01/01 00:00 | A primer on coupled state-switching models for multiple interacting time series | J. Pohle, R. Langrock, R. King, F.-H. Jensen, M. van der Schaar | 2020 | https://journals.sagepub.com/doi/10.1177/1471082X20956423 | State-switching models such as hidden Markov models or Markov-switching regression models are routinely applied to analyse sequences of observations that are driven by underlying non-observable states. Coupled state-switching models extend these approaches to address the case of multiple observation sequences whose underlying state variables interact. In this article, we provide an overview of the modelling techniques related to coupling in state-switching models, thereby forming a rich and flexible statistical framework particularly useful for modelling correlated time series. Simulation experiments demonstrate the relevance of being able to account for an asynchronous evolution as well as interactions between the underlying latent processes. The models are further illustrated using two case studies related to (a) interactions between a dolphin mother and her calf as inferred from movement data and (b) electronic health record data collected on 696 patients within an intensive care unit. | 10.1177/1471082X20956423 | Journal | Statistical Modelling | ||||
2020/01/01 00:00 | Anonymization through Data Synthesis using Generative Adversarial Networks (ADS-GAN) | J. Yoon, L. N. Drumright, M. van der Schaar | 2020 | https://ieeexplore.ieee.org/document/9034117 | The medical and machine learning communities are relying on the promise of artificial intelligence (AI) to transform medicine through enabling more accurate decisions and personalized treatment. However, progress is slow. Legal and ethical issues around unconsented patient data and privacy are among the limiting factors in data sharing, resulting in a significant barrier in accessing routinely collected electronic health records (EHR) by the machine learning community. We propose a novel framework for generating synthetic data that closely approximates the joint distribution of variables in an original EHR dataset, providing a readily accessible, legally and ethically appropriate solution to support more open data sharing, enabling the development of AI solutions. In order to address issues around lack of clarity in defining sufficient anonymization, we created a quantifiable, mathematical definition for “identifiability”. We used a conditional generative adversarial networks (GAN) framework to generate synthetic data while minimizing patient identifiability, which is defined based on the probability of re-identification given the combination of all data on any individual patient. We compared models fitted to our synthetically generated data to those fitted to the real data across four independent datasets to evaluate similarity in model performance, while assessing the extent to which original observations can be identified from the synthetic data. Our model, ADS-GAN, consistently outperformed state-of-the-art methods, and demonstrated reliability in the joint distributions. We propose that this method could be used to develop datasets that can be made publicly available while considerably lowering the risk of breaching patient confidentiality. | 10.1109/JBHI.2020.2980262 | Journal | IEEE Journal of Biomedical and Health Informatics | Privacy-preserving ML & synthetic data | |||
2020/01/01 00:00 | Aspirin for Primary Prevention of ST Segment Elevation Myocardial Infarction in Persons with Diabetes and Multiple Risk Factors | R. Bugiardini, S. Pavasović, S. Kedev, M. Vavlukis, Z. Vasiljevic, M. Bergami, D. Miličić, O. Manfrini, E. Cenko, L. Badimon, J. Yoon, M. van der Schaar | 2020 | https://www.thelancet.com/journals/eclinm/article/PIIS2589-5370(20)30292-3/fulltext | Controversy exists as to whether low-dose aspirin use may give benefit in primary prevention of cardiovascular (CV) events. We hypothesized that the benefits of aspirin are underevaluated. We investigated 12,123 Caucasian patients presenting to hospital with acute coronary syndromes as first manifestation of CV disease from 2010 to 2019 in the ISACS-TC multicenter registry (ClinicalTrials.gov, NCT01218776). Individual risk of ST segment elevation myocardial infarction (STEMI) and its association with 30-day mortality was quantified using inverse probability of treatment weighting models matching for concomitant medications. Estimates were compared by test of interaction on the log scale. The risk of STEMI was lower in the aspirin users (absolute reduction: 6·8%; OR: 0·73; 95%CI: 0·65–0·82) regardless of sex (p for interaction=0·1962) or age (p for interaction=0·1209). Benefits of aspirin were seen in patients with hypertension, hypercholesterolemia, and in smokers. In contrast, aspirin failed to demonstrate a significant risk reduction in STEMI among diabetic patients (OR:1·10;95%CI:0·89–1·35) with a significant interaction (p: <0·0001) when compared with controls (OR:0·64,95%CI:0·56–0·73). Stratification of diabetes in risk categories revealed benefits (p interaction=0·0864) only in patients with concomitant hypertension and hypercholesterolemia (OR:0·87, 95% CI:0·65–1·15), but not in smokers. STEMI was strongly related to 30-day mortality (OR:1·93; 95%CI:1·59–2·35). Low-dose aspirin reduces the risk of STEMI as initial manifestation of CV disease with potential benefit in mortality. Patients with diabetes derive substantial benefit from aspirin only in the presence of multiple risk factors. In the era of precision medicine, a more tailored strategy is required. | 10.1016/j.eclinm.2020.100548 | Journal | The Lancet EClinicalMedicine | ||||
2020/01/01 00:00 | AutoCP: Automated Pipelines for Accurate Prediction Intervals | Y. Zhang, W. R. Zame, M. van der Schaar | 2020 | https://arxiv.org/abs/2006.14099 | Successful application of machine learning models to real-world prediction problems, e.g. financial forecasting and personalized medicine, has proved to be challenging, because such settings require limiting and quantifying the uncertainty in the model predictions, i.e. providing valid and accurate prediction intervals. Conformal Prediction is a distribution-free approach to construct valid prediction intervals in finite samples. However, the prediction intervals constructed by Conformal Prediction are often (because of over-fitting, inappropriate measures of nonconformity, or other issues) overly conservative and hence inadequate for the application(s) at hand. This paper proposes an AutoML framework called Automatic Machine Learning for Conformal Prediction (AutoCP). Unlike the familiar AutoML frameworks that attempt to select the best prediction model, AutoCP constructs prediction intervals that achieve the user-specified target coverage rate while optimizing the interval length to be accurate and less conservative. We tested AutoCP on a variety of datasets and found that it significantly outperforms benchmark algorithms. | Other | Automated ML, Uncertainty estimation | |||||
2020/01/01 00:00 | Between-centre differences for COVID-19 ICU mortality from early data in England | Z. Qian, A. M. Alaa, M. van der Schaar, A. Ercole | 2020 | https://link.springer.com/article/10.1007%2Fs00134-020-06150-y | Since the first cases in November 2019, the spread of SARS-CoV-2 infections has placed unprecedented strain on healthcare. The intensive care unit (ICU) is of particular concern as large numbers of patients with severe respiratory complications mean that in some areas, ICUs have been completely overwhelmed. Understanding determinants of ICU outcome is crucial both for surge planning and shared decision making. Whilst a number of risk scores have been published, they do not specifically look at this population. Furthermore, ICU availability, admission policy and structure vary across Europe, as do demographics and government policy. Thus, it is likely that ICU outcomes could also vary significantly by region, motivating an individualised modelling approach. UK mortality has been particularly high, and we sought to urgently identify predictors of mortality in patients admitted to the ICU with COVID-19. | 10.1007/s00134-020-06150-y | Journal | Intensive Care Medicine | ||||
2020/01/01 00:00 | CPAS: the UK's National Machine Learning-based Hospital Capacity Planning System for COVID-19 | Z. Qian, A. M. Alaa, M. van der Schaar | 2020 | https://pubmed.ncbi.nlm.nih.gov/33250568/ | The coronavirus disease 2019 (COVID-19) global pandemic poses the threat of overwhelming healthcare systems with unprecedented demands for intensive care resources. Managing these demands cannot be effectively conducted without a nationwide collective effort that relies on data to forecast hospital demands on the national, regional, hospital and individual levels. To this end, we developed the COVID-19 Capacity Planning and Analysis System (CPAS)—a machine learning-based system for hospital resource planning that we have successfully deployed at individual hospitals and across regions in the UK in coordination with NHS Digital. In this paper, we discuss the main challenges of deploying a machine learning-based decision support system at national scale, and explain how CPAS addresses these challenges by (1) defining the appropriate learning problem, (2) combining bottom-up and top-down analytical approaches, (3) using state-of-the-art machine learning algorithms, (4) integrating heterogeneous data sources, and (5) presenting the result with an interactive and transparent interface. CPAS is one of the first machine learning-based systems to be deployed in hospitals on a national scale to address the COVID-19 pandemic—we conclude the paper with a summary of the lessons learned from this experience. | 10.1007/s10994-020-05921-4 | Journal | Machine Learning | ||||
2020/01/01 00:00 | Deep Generative Models for 3D Linker Design | F. Imrie, A. Bradley, M. van der Schaar, C. Deane | 2020 | https://pubs.acs.org/doi/10.1021/acs.jcim.9b01120 | Rational compound design remains a challenging problem for both computational methods and medicinal chemists. Computational generative methods have begun to show promising results for the design problem. However, they have not yet used the power of three-dimensional (3D) structural information. We have developed a novel graph-based deep generative model that combines state-of-the-art machine learning techniques with structural knowledge. Our method (“DeLinker”) takes two fragments or partial structures and designs a molecule incorporating both. The generation process is protein-context-dependent, utilizing the relative distance and orientation between the partial structures. This 3D information is vital to successful compound design, and we demonstrate its impact on the generation process and the limitations of omitting such information. In a large-scale evaluation, DeLinker designed 60% more molecules with high 3D similarity to the original molecule than a database baseline. When considering the more relevant problem of longer linkers with at least five atoms, the outperformance increased to 200%. We demonstrate the effectiveness and applicability of this approach on a diverse range of design problems: fragment linking, scaffold hopping, and proteolysis targeting chimera (PROTAC) design. As far as we are aware, this is the first molecular generative model to incorporate 3D structural information directly in the design process. The code is available at https://github.com/oxpig/DeLinker. | 10.1021/acs.jcim.9b01120 | Journal | Journal of Chemical Information and Modeling | Deep learning, Privacy-preserving ML & synthetic data | |||
2020/01/01 00:00 | Development of a Novel, Potentially Universal Machine Learning Algorithm for Prediction of Complications After Total Hip Arthroplasty | A. Shah, S. Devana, C. Lee, R. Kianian, N. SooHoo, M. van der Schaar | 2020 | https://www.arthroplastyjournal.org/article/S0883-5403(20)31300-0/fulltext | As the prevalence of hip osteoarthritis increases, the number of total hip arthroplasty (THA) procedures performed is also projected to increase. Accurately risk-stratifying patients who undergo THA would be of great utility, given the significant cost and morbidity associated with developing perioperative complications. We aim to develop a novel machine learning (ML)-based ensemble algorithm for the prediction of major complications after THA, as well as compare its performance against standard benchmark ML methods. This is a retrospective cohort study of 89,986 adults who underwent primary THA at any California-licensed hospital between 2015 and 2017. The primary outcome was major complications (eg infection, venous thromboembolism, cardiac complication, pulmonary complication). We developed a model predicting complication risk using AutoPrognosis, an automated ML framework that configures the optimally performing ensemble of ML-based prognostic models. We compared our model with logistic regression and standard benchmark ML models, assessing discrimination and calibration. There were 545 patients who had major complications (0.61%). Our novel algorithm was well-calibrated and improved risk prediction compared to logistic regression, as well as outperformed the other four standard benchmark ML algorithms. The variables most important for AutoPrognosis (eg malnutrition, dementia, cancer) differ from those that are most important for logistic regression (eg chronic atherosclerosis, renal failure, chronic obstructive pulmonary disease). We report a novel ensemble ML algorithm for the prediction of major complications after THA. It demonstrates superior risk prediction compared to logistic regression and other standard ML benchmark algorithms. By providing accurate prognostic information, this algorithm may facilitate more informed preoperative shared decision-making. | 10.1016/j.arth.2020.12.040 | Journal | The Journal of Arthroplasty | ||||
2020/01/01 00:00 | Feedback Adaptive Learning for Medical and Educational Application Recommendation | C. Tekin, S. Elahi, M. van der Schaar | 2020 | https://ieeexplore.ieee.org/document/9253629 | Recommending applications (apps) to improve health or educational outcomes requires long-term planning and adaptation based on the user feedback, as it is imperative to recommend the right app at the right time to improve engagement and benefit. We model the challenging task of app recommendation for these specific categories of apps-or alike-using a new reinforcement learning method referred to as episodic multi-armed bandit (eMAB). In eMAB, the learner recommends apps to individual users and observes their interactions with the recommendations on a weekly basis. It then uses this data to maximize the total payoff of all users by learning to recommend specific apps. Since computing the optimal recommendation sequence is intractable, as a benchmark, we define an oracle that sequentially recommends apps to maximize the expected immediate gain. Then, we propose our online learning algorithm, named FeedBack Adaptive Learning (FeedBAL), and prove that its regret with respect to the benchmark increases logarithmically in expectation. We demonstrate the effectiveness of FeedBAL on recommending mental health apps based on data from an app suite and show that it results in a substantial increase in the number of app sessions compared with episodic versions of ϵn -greedy, Thompson sampling, and collaborative filtering methods. | 10.1109/TSC.2020.3037224 | Journal | IEEE Transactions on Services Computing | Multi-armed bandits | Personalized Education | ||
2020/01/01 00:00 | Flexible Modelling of Longitudinal Medical Data: A Bayesian Nonparametric Approach | A. Bellot, M. van der Schaar | 2020 | https://dl.acm.org/doi/10.1145/3377164 | Using electronic medical records to learn personalized risk trajectories poses significant challenges because often very few samples are available in a patient’s history, and, when available, their information content is highly diverse. In this article, we consider how to integrate sparsely sampled longitudinal data, missing measurements informative of the underlying health status, and static information to estimate (dynamically, as new information becomes available) personalized survival distributions. We achieve this by developing a nonparametric probabilistic model that generates survival trajectories, and corresponding uncertainty estimates, from an ensemble of Bayesian trees in which time is incorporated explicitly to learn variable interactions over time, without needing to specify the longitudinal process beforehand. As such, the changing influence on survival of variables over time is inferred from the data directly, which we analyze with post-processing statistics derived from our model. | 10.1145/3377164 | Journal | ACM Transactions on Computing for Healthcare | Survival analysis competing risks & comorbidities, Time series analysis | |||
2020/01/01 00:00 | From real-world patient data to individualized treatment effects using machine learning: Current and future methods to address underlying challenges | I. Bica, A. M. Alaa, C. Lambert, M. van der Schaar | 2020 | https://ascpt.onlinelibrary.wiley.com/doi/abs/10.1002/cpt.1907 | Clinical decision making needs to be supported by evidence that treatments are beneficial to individual patients. Although randomized control trials (RCTs) are the gold standard for testing and introducing new drugs, due to the focus on specific questions with respect to establishing efficacy and safety vs. standard treatment, they do not provide a full characterization of the heterogeneity in the final intended treatment population. Conversely, real-world observational data, such as electronic health records (EHRs), contain large amounts of clinical information about heterogeneous patients and their response to treatments. In this paper, we introduce the main opportunities and challenges in using observational data for training machine learning methods to estimate individualized treatment effects and make treatment recommendations. We describe the modeling choices of the state-of-the-art machine learning methods for causal inference, developed for estimating treatment effects both in the cross-section and longitudinal settings. Additionally, we highlight future research directions that could lead to achieving the full potential of leveraging EHRs and machine learning for making individualized treatment recommendations. We also discuss how experimental data from RCTs and Pharmacometric and Quantitative Systems Pharmacology approaches can be used to not only improve machine learning methods, but also provide ways for validating them. These future research directions will require us to collaborate across the scientific disciplines to incorporate models based on RCTs and known disease processes, physiology, and pharmacology into these machine learning models based on EHRs to fully optimize the opportunity these data present. | 10.1002/cpt.1907 | Journal | Clinical Pharmacology & Therapeutics | Causal inference | Treatment & trials | ||
2020/01/01 00:00 | How Artificial Intelligence and Machine Learning Can Help Healthcare Systems Respond to COVID-19 | M. van der Schaar, A. M. Alaa, A. Floto, A. Gimson, S. Scholtes, A. Wood, E. McKinney, D. Jarrett, P. Lio, A. Ercole | 2020 | https://link.springer.com/article/10.1007/s10994-020-05928-x | The COVID-19 global pandemic is a threat not only to the health of millions of individuals, but also to the stability of infrastructure and economies around the world. The disease will inevitably place an overwhelming burden on healthcare systems that cannot be effectively dealt with by existing facilities or responses based on conventional approaches. We believe that a rigorous clinical and societal response can only be mounted by using intelligence derived from a variety of data sources to better utilize scarce healthcare resources, provide personalized patient management plans, inform policy, and expedite clinical trials. In this paper, we introduce five of the most important challenges in responding to COVID-19 and show how each of them can be addressed by recent developments in machine learning (ML) and artificial intelligence (AI). We argue that the integration of these techniques into local, national, and international healthcare systems will save lives, and propose specific methods by which implementation can happen swiftly and efficiently. We offer to extend these resources and knowledge to assist policymakers seeking to implement these techniques. | 10.1007/s10994-020-05928-x | Journal | Machine Learning | ||||
2020/01/01 00:00 | Large-scale ICU data sharing for global collaboration: the first 1633 critically ill COVID-19 patients in the Dutch Data Warehouse | L. Fleuren, ..., M. van der Schaar, ..., P. Thoral, P. Elbers | 2020 | https://link.springer.com/article/10.1007%2Fs00134-021-06361-x | The coronavirus disease 2019 (COVID-19) pandemic continues to stretch intensive care unit (ICU) capacity to its limits worldwide and optimizing management of critically ill COVID-19 patients remains urgently needed. Fortunately, most ICUs deploy Electronic Health Records (EHRs) to routinely capture high-frequency clinical information. These data reflect the clinical practice variation resulting from the novelty of COVID-19 as well as the variation in patient characteristics and outcomes between centers. Therefore, these data may be employed to better understand the clinical course of COVID-19 and individualize treatment. | 10.1007/s00134-021-06361-x | Journal | Intensive Care Medicine | ||||
2020/01/01 00:00 | Machine learning for clinical trials in the era of COVID-19 | W. R. Zame, I. Bica, C. Shen, A. Curth, H.-S. Lee, S. Bailey, J. Weatherall, D. Wright, F. Bretz, M. van der Schaar | 2020 | https://www.tandfonline.com/doi/full/10.1080/19466315.2020.1797867 | The world is in the midst of a pandemic. We still know little about the disease COVID-19 or about the virus (SARS-CoV-2) that causes it. We do not have a vaccine or a treatment (aside from managing symptoms). We do not know if recovery from COVID-19 produces immunity, and if so for how long, hence we do not know if “herd immunity” will eventually reduce the risk or if a successful vaccine can be developed – and this knowledge may be a long time coming. In the meantime, the COVID-19 pandemic is presenting enormous challenges to medical research, and to clinical trials in particular. This paper identifies some of those challenges and suggests ways in which machine learning can help in response to those challenges. We identify three areas of challenge: ongoing clinical trials for non-COVID-19 drugs; clinical trials for repurposing drugs to treat COVID-19, and clinical trials for new drugs to treat COVID-19. Within each of these areas, we identify aspects for which we believe machine learning can provide invaluable assistance. | 10.1080/19466315.2020.1797867 | Journal | Statistics in Biopharmaceutical Research | Next-generation clinical trials, Causal inference | Treatment & trials | ||
2020/01/01 00:00 | Opportunities for Machine Learning to Transform Care for People with Cystic Fibrosis | M. Abroshan, A. M. Alaa, O. Rayner, M. van der Schaar | 2020 | https://www.cysticfibrosisjournal.com/article/S1569-1993(20)30008-4/fulltext | The availability of high-quality data from patient registries provides a robust starting point for using Machine Learning (ML) techniques to enhance the care of the patient with cystic fibrosis (CF). Capitalizing on the wealth of information provided by registry data, ML techniques can augment clinical workflows by making individual-level predictions for a patient's prognosis that are tailored to their specific traits, features, and medical history. Such personalized approaches become especially relevant as CFTR modulators precipitate a shift to mutation-based medicine. ML-based techniques can help provide clinicians with a refined understanding of patient heterogeneity. Here, we discuss several areas where ML techniques can help underpin a personalized approach to patient management. | 10.1016/j.jcf.2020.01.002 | Journal | Journal of Cystic Fibrosis | ||||
2020/01/01 00:00 | Outcome-Oriented Deep Temporal Phenotyping of Disease Progression | C. Lee, J. Rashbass, M. van der Schaar | 2020 | https://pubmed.ncbi.nlm.nih.gov/33259292/ | Chronic diseases evolve slowly throughout a patient's lifetime, creating heterogeneous progression patterns that make clinical outcomes remarkably varied across individual patients. A tool capable of identifying temporal phenotypes based on the patients' different progression patterns and clinical outcomes would allow clinicians to better forecast disease progression by recognizing a group of similar past patients, and to better design treatment guidelines that are tailored to specific phenotypes. To build such a tool, we propose a deep learning approach, which we refer to as outcome-oriented deep temporal phenotyping (ODTP), to identify temporal phenotypes of disease progression considering what type of clinical outcomes will occur and when based on the longitudinal observations. More specifically, we model clinical outcomes throughout a patient's longitudinal observations via time-to-event (TTE) processes whose conditional intensity functions are estimated as non-linear functions using a recurrent neural network. Temporal phenotyping of disease progression is carried out by our novel loss function that is specifically designed to learn discrete latent representations that best characterize the underlying TTE processes. The key insight here is that learning such discrete representations groups progression patterns considering the similarity in expected clinical outcomes, and thus naturally provides outcome-oriented temporal phenotypes. We demonstrate the power of ODTP by applying it to a real-world heterogeneous cohort of 11 779 stage III breast cancer patients from the U.K. National Cancer Registration and Analysis Service. The experiments show that ODTP identifies temporal phenotypes that are strongly associated with the future clinical outcomes and achieves significant gain on the homogeneity and heterogeneity measures over existing methods. Furthermore, we are able to identify the key driving factors that lead to transitions between phenotypes which can be translated into actionable information to support better clinical decision-making. | 10.1109/TBME.2020.3041815 | Journal | IEEE Transactions on Biomedical Engineering | Deep learning, Survival analysis competing risks & comorbidities, Time series analysis | Phenotyping & subgroup analysis, Risk & disease trajectories | ||
2020/01/01 00:00 | Predicting the Risk of Inpatient Hypoglycemia with Machine Learning using Electronic Health Records | Y. Ruan, A. Bellot, Z. Moysova, G. D. Tan, A. Lumb, J. Davies, M. van der Schaar, R. Rea | 2020 | https://care.diabetesjournals.org/content/early/2020/05/11/dc19-1743 | We analyzed data from inpatients with diabetes admitted to a large university hospital to predict the risk of hypoglycemia through the use of machine learning algorithms. Four years of data were extracted from a hospital electronic health record system. This included laboratory and point-of-care blood glucose (BG) values to identify biochemical and clinically significant hypoglycemic episodes (BG ≤3.9 and ≤2.9 mmol/L, respectively). We used patient demographics, administered medications, vital signs, laboratory results, and procedures performed during the hospital stays to inform the model. Two iterations of the data set included the doses of insulin administered and the past history of inpatient hypoglycemia. Eighteen different prediction models were compared using the area under the receiver operating characteristic curve (AUROC) through a 10-fold cross validation. We analyzed data obtained from 17,658 inpatients with diabetes who underwent 32,758 admissions between July 2014 and August 2018. The predictive factors from the logistic regression model included people undergoing procedures, weight, type of diabetes, oxygen saturation level, use of medications (insulin, sulfonylurea, and metformin), and albumin levels. The machine learning model with the best performance was the XGBoost model (AUROC 0.96). This outperformed the logistic regression model, which had an AUROC of 0.75 for the estimation of the risk of clinically significant hypoglycemia. Advanced machine learning models are superior to logistic regression models in predicting the risk of hypoglycemia in inpatients with diabetes. Trials of such models should be conducted in real time to evaluate their utility to reduce inpatient hypoglycemia. | 10.2337/dc19-1743 | Journal | Diabetes Care | Risk & prognosis | |||
2020/01/01 00:00 | Prior Beta-Blocker Therapy for Hypertension and Sex-Based Differences in Heart Failure Among Patients with Incident Coronary Heart Disease | R. Bugiardini, J. Yoon, S. Kedev, G. Stankovic, Z. Vasiljevic, D. Miličić, O. Manfrini, M. van der Schaar, C. Gale, L. Badimon, E. Cenko | 2020 | https://www.ahajournals.org/doi/10.1161/HYPERTENSIONAHA.120.15323 | The usefulness of β-blockers has been questioned for patients who have hypertension without a prior manifestation of coronary heart disease or heart failure. In addition, sex-based differences in the efficacy of β-blockers for prevention of heart failure during acute myocardial ischemia have never been evaluated. We explored whether the effect of β-blocker therapy varied according to sex among patients with hypertension who have no prior history of cardiovascular disease. Data were drawn from the ISACS (International Survey of Acute Coronary Syndromes)-Archives. The study population consisted of 13 764 patients presenting with acute coronary syndromes. There were 2590 patients in whom hypertension was treated previously with β-blocker (954 women and 1636 men). Primary outcome measure was the incidence of heart failure according to Killip class classification. Subsidiary analyses were conducted to estimate the association between heart failure and all-cause mortality at 30 days. Outcome rates were assessed using the inverse probability of treatment weighting and logistic regression models. Estimates were compared by test of interaction on the log scale. Among patients taking β-blockers before admission, there was an absolute difference of 4.6% between women and men in the rate of heart failure (Killip ≥2) at hospital presentation (21.3% versus 16.7%; relative risk ratio, 1.35 [95% CI, 1.10–1.65]). In contrast, the rate of heart failure was approximately similar among women and men who did not receive β-blockers (17.2% versus 16.1%; relative risk ratio, 1.09 [95% CI, 0.97–1.21]). The test of interaction identified a significant (P=0.034) association between sex and β-blocker therapy. Heart failure was predictive of mortality at 30 days in both women (odds ratio, 7.54 [95% CI, 5.78–9.83]) and men (odds ratio, 9.62 [95% CI, 7.67–12.07]). In conclusion, β-blocker use may be an acute precipitant of heart failure in new-onset coronary heart disease among women, but not men. Heart failure increases the risk of death. | 10.1161/HYPERTENSIONAHA.120.15323 | Journal | Hypertension | Treatment & trials | |||
2020/01/01 00:00 | Reputational Dynamics in Financial Networks During a Crisis | S. Zhang, M. van der Schaar | 2020 | https://www.sciencedirect.com/science/article/pii/S1572308920300589 | Firm reputation plays a vital role in financial networks, and it is especially impactful in times of market stress or a financial crisis. This paper uses a novel theoretical model to study reputational dynamics in financial networks, taking into account that firms begin with incomplete information, learn about others over time, and update their connections as their beliefs evolve. In our model, stronger firms develop high reputations and remain in the network, while weaker firms will eventually drop in reputation and get shut out. We show that more information revelation during crisis generally increases network fragility and harms social welfare. It is thus crucial to maintain anonymity among firms during a crisis. Certain network structures, such as core-periphery networks, are more systemically resilient against negative informational effects. | 10.1016/j.jfs.2020.100759 | Journal | Journal of Financial Stability | ||||
2020/01/01 00:00 | Retrospective cohort study of admission timing and mortality following COVID-19 infection in England | A. M. Alaa, Z. Qian, J. Rashbass, J. Benger, M. van der Schaar | 2020 | https://bmjopen.bmj.com/content/10/11/e042712 | Objectives We investigated whether the timing of hospital admission is associated with the risk of mortality for patients with COVID-19 in England, and the factors associated with a longer interval between symptom onset and hospital admission. Design Retrospective observational cohort study of data collected by the COVID-19 Hospitalisation in England Surveillance System (CHESS). Data were analysed using multivariate regression analysis. Setting Acute hospital trusts in England that submit data to CHESS routinely. Participants Of 14 150 patients included in CHESS until 13 May 2020, 401 lacked a confirmed diagnosis of COVID-19 and 7666 lacked a recorded date of symptom onset. This left 6083 individuals, of whom 15 were excluded because the time between symptom onset and hospital admission exceeded 3 months. The study cohort therefore comprised 6068 unique individuals. Main outcome measures All-cause mortality during the study period. Results Timing of hospital admission was an independent predictor of mortality following adjustment for age, sex, comorbidities, ethnicity and obesity. Each additional day between symptom onset and hospital admission was associated with a 1% increase in mortality risk (HR 1.01; p<0.005). Healthcare workers were most likely to have an increased interval between symptom onset and hospital admission, as were people from Black, Asian and minority ethnic (BAME) backgrounds, and patients with obesity. Conclusion The timing of hospital admission is associated with mortality in patients with COVID-19. Healthcare workers and individuals from a BAME background are at greater risk of later admission, which may contribute to reports of poorer outcomes in these groups. Strategies to identify and admit patients with high-risk and those showing signs of deterioration in a timely way may reduce the consequent mortality from COVID-19, and should be explored. | 10.1136/bmjopen-2020-042712 | Journal | BMJ Open | ||||
2020/01/01 00:00 | Risk-Aware Multi-Armed Bandits with Refined Upper Confidence Bounds | X. Liu, M. Derakhshani, S. Lambotharan, M. van der Schaar | 2020 | https://ieeexplore.ieee.org/document/9309321 | The classical multi-armed bandit (MAB) framework studies the exploration-exploitation dilemma of the decision-making problem and always treats the arm with the highest expected reward as the optimal choice. However, in some applications, an arm with a high expected reward can be risky to play if the variance is high. Hence, the variation of the reward should be considered to make the arm-selection process risk-aware. In this letter, the mean-variance metric is investigated to measure the uncertainty of the received rewards. We first study a risk-aware MAB problem when the reward follows a Gaussian distribution, and a concentration inequality on the variance is developed to design a Gaussian risk-aware upper confidence bound algorithm. Furthermore, we extend this algorithm to a novel asymptotic risk-aware upper confidence bound algorithm by developing an upper confidence bound of the variance based on the asymptotic distribution of the sample variance. Theoretical analysis proves that both proposed algorithms achieve the O(log(T)) regret. Finally, numerical results demonstrate that our algorithms outperform several risk-aware MAB algorithms. | 10.1109/LSP.2020.3047725 | Journal | IEEE Signal Processing Letters | Multi-armed bandits | |||
2020/01/01 00:00 | Sex Differences in Modifiable Risk Factors and Severity of Coronary Artery Disease | R. Bugiardini, S. Kedev, O. Manfrini, M. Vavlukis, G. Stankovic, M. Scarpone, D. Miličić, Z. Vasiljevic, L. Badimon, E. Cenko, J. Yoon, M. van der Schaar | 2020 | https://www.ahajournals.org/doi/10.1161/JAHA.120.017235 | It is still unknown whether traditional risk factors may have a sex‐specific impact on coronary artery disease (CAD) burden. We identified 14 793 patients who underwent coronary angiography for acute coronary syndromes in the ISACS‐TC (International Survey of Acute Coronary Syndromes in Transitional Countries; ClinicalTrials.gov, NCT01218776) registry from 2010 to 2019. The main outcome measure was the association between traditional risk factors and severity of CAD and its relationship with 30‐day mortality. Relative risk (RR) ratios and 95% CIs were calculated from the ratio of the absolute risks of women versus men using inverse probability of weighting. Estimates were compared by test of interaction on the log scale. Severity of CAD was categorized as obstructive (≥50% stenosis) versus nonobstructive CAD. The RR ratio for obstructive CAD in women versus men among people without diabetes mellitus was 0.49 (95% CI, 0.41–0.60) and among those with diabetes mellitus was 0.89 (95% CI, 0.62–1.29), with an interaction by diabetes mellitus status of P=0.002. Exposure to smoking shifted the RR ratios from 0.50 (95% CI, 0.41–0.61) in nonsmokers to 0.75 (95% CI, 0.54–1.03) in current smokers, with an interaction by smoking status of P=0.018. There were no significant sex‐related interactions with hypercholesterolemia and hypertension. Women with obstructive CAD had higher 30‐day mortality rates than men (RR, 1.75; 95% CI, 1.48–2.07). No sex differences in mortality were observed in patients with nonobstructive CAD. Obstructive CAD in women signifies a higher risk for mortality compared with men. Current smoking and diabetes mellitus disproportionally increase the risk of obstructive CAD in women. Achieving the goal of improving cardiovascular health in women still requires intensive efforts toward further implementation of lifestyle and treatment interventions. | 10.1161/JAHA.120.017235 | Journal | Journal of the American Heart Association | ||||
2020/01/01 00:00 | Synthetic Data: Opening the data floodgates to enable faster, more directed development of machine learning methods | J. Jordon, A. Wilson, M. van der Schaar | 2020 | https://arxiv.org/abs/2012.04580 | Many ground-breaking advancements in machine learning can be attributed to the availability of a large volume of rich data. Unfortunately, many large-scale datasets are highly sensitive, such as healthcare data, and are not widely available to the machine learning community. Generating synthetic data with privacy guarantees provides one such solution, allowing meaningful research to be carried out "at scale" - by allowing the entirety of the machine learning community to potentially accelerate progress within a given field. In this article, we provide a high-level view of synthetic data: what it means, how we might evaluate it and how we might use it. | Other | Privacy-preserving ML & synthetic data | |||||
2020/01/01 00:00 | The Value of Patient and Tumor Factors in Predicting Preoperative Breast MRI Outcomes | H. Rahbar, D. S. Hippe, A. M. Alaa, S. H. Cheeney, M. van der Schaar, S. C. Partridge, C. I. Lee | 2020 | https://pubs.rsna.org/doi/10.1148/rycan.2020190099 | Purpose To identify patient and tumor features that predict true-positive, false-positive, and negative breast preoperative MRI outcomes. Materials and Methods Using a breast MRI database from a large regional cancer center, the authors retrospectively identified all women with unilateral breast cancer who underwent preoperative MRI from January 2005 to February 2015. A total of 1396 women with complete data were included. Patient features (ie, age, breast density) and index tumor features (ie, type, grade, hormone receptor, human epidermal growth factor receptor type 2/neu, Ki-67) were extracted and compared with preoperative MRI outcomes (ie, true positive, false positive, negative) using univariate (ie, Fisher exact) and multivariate machine learning approaches (ie, least absolute shrinkage and selection operator, AutoPrognosis). Overall prediction performance was summarized using the area under the receiver operating characteristic curve (AUC), calculated using internal validation techniques (bootstrap and cross-validation) to account for model training. Results At the examination level, 181 additional cancers were identified among 1396 total preoperative MRI examinations (median patient age, 56 years; range, 25–94 years), resulting in a positive predictive value for biopsy of 43% (181 true-positive findings of 419 core-needle biopsies). In univariate analysis, no patient or tumor feature was associated with a true-positive outcome (P > .05), although greater mammographic density (P = .022) and younger age (< 50 years, P = .025) were associated with false-positive examinations. Machine learning approaches provided weak performance for predicting true-positive, false-positive, and negative examinations (AUC range, 0.50–0.57). Conclusion Commonly used patient and tumor factors driving expert opinion for the use of preoperative MRI provide limited predictive value for determining preoperative MRI outcomes in women. | 10.1148/rycan.2020190099 | Journal | Radiology: Imaging Cancer | Medical imaging, Risk & prognosis | |||
2019/12/08 00:00 | Attentive State-Space Modeling of Disease Progression | A. M. Alaa, M. van der Schaar | 2019 | https://papers.nips.cc/paper/2019/hash/1d0932d7f57ce74d9d9931a2c6db8a06-Abstract.html | Models of disease progression are instrumental for predicting patient outcomes and understanding disease dynamics. Existing models provide the patient with pragmatic (supervised) predictions of risk, but do not provide the clinician with intelligible (unsupervised) representations of disease pathophysiology. In this paper, we develop the attentive state-space model, a deep probabilistic model that learns accurate and interpretable structured representations for disease trajectories. Unlike Markovian state-space models, in which the dynamics are memoryless, our model uses an attention mechanism to create "memoryful" dynamics, whereby attention weights determine the dependence of future disease states on past medical history. To learn the model parameters from medical records, we develop an inference algorithm that simultaneously learns a compiled inference network and the model parameters, leveraging the attentive state-space representation to construct a "Rao-Blackwellized" variational approximation of the posterior state distribution. Experiments on data from the UK Cystic Fibrosis registry show that our model demonstrates superior predictive accuracy and provides insights into the progression of chronic disease. | Conference | NeurIPS | Deep learning, Interpretability & explainability, Survival analysis competing risks & comorbidities, Time series analysis | Risk & disease trajectories | |||
2019/12/08 00:00 | Conditional Independence Testing using Generative Adversarial Networks | A. Bellot, M. van der Schaar | 2019 | https://papers.nips.cc/paper/2019/hash/dc87c13749315c7217cdc4ac692e704c-Abstract.html | We consider the hypothesis testing problem of detecting conditional dependence, with a focus on high-dimensional feature spaces. Our contribution is a new test statistic based on samples from a generative adversarial network designed to approximate directly a conditional distribution that encodes the null hypothesis, in a manner that maximizes power (the rate of true negatives). We show that such an approach requires only that density approximation be viable in order to ensure that we control type I error (the rate of false positives); in particular, no assumptions need to be made on the form of the distributions or feature dependencies. Using synthetic simulations with high-dimensional data we demonstrate significant gains in power over competing methods. In addition, we illustrate the use of our test to discover causal markers of disease in genetic data. | Conference | NeurIPS | Causal inference, Deep learning, Feature selection, Time series analysis | ||||
2019/12/08 00:00 | Demystifying Black-box Models with Symbolic Metamodels | A. M. Alaa, M. van der Schaar | 2019 | https://papers.nips.cc/paper/2019/hash/567b8f5f423af15818a068235807edc0-Abstract.html | Understanding the predictions of a machine learning model can be as crucial as the model's accuracy in many application domains. However, the black-box nature of most highly-accurate (complex) models is a major hindrance to their interpretability. To address this issue, we introduce the symbolic metamodeling framework — a general methodology for interpreting predictions by converting "black-box" models into "white-box" functions that are understandable to human subjects. A symbolic metamodel is a model of a model, i.e., a surrogate model of a trained (machine learning) model expressed through a succinct symbolic expression that comprises familiar mathematical functions and can be subjected to symbolic manipulation. We parameterize symbolic metamodels using Meijer G-functions — a class of complex-valued contour integrals that depend on scalar parameters, and whose solutions reduce to familiar elementary, algebraic, analytic and closed-form functions for different parameter settings. This parameterization enables efficient optimization of metamodels via gradient descent, and allows discovering the functional forms learned by a machine learning model with minimal a priori assumptions. We show that symbolic metamodeling provides an all-encompassing framework for model interpretation — all common forms of global and local explanations of a model can be analytically derived from its symbolic metamodel. | Conference | NeurIPS | Feature selection, Interpretability & explainability | ||||
2019/12/08 00:00 | Differentially Private Bagging: Improved Utility and Cheaper Privacy than Subsample-and-Aggregate | J. Jordon, J. Yoon, M. van der Schaar | 2019 | https://proceedings.neurips.cc/paper/2019/hash/5dec707028b05bcbd3a1db5640f842c5-Abstract.html | Differential Privacy is a popular and well-studied notion of privacy. In the era of big data that we are in, privacy concerns are becoming ever more prevalent and thus differential privacy is being turned to as one such solution. A popular method for ensuring differential privacy of a classifier is known as subsample-and-aggregate, in which the dataset is divided into distinct chunks and a model is learned on each chunk, after which it is aggregated. This approach allows for easy analysis of the model on the data and thus differential privacy can be easily applied. In this paper, we extend this approach by dividing the data several times (rather than just once) and learning models on each chunk within each division. The first benefit of this approach is the natural improvement of utility by aggregating models trained on a more diverse range of subsets of the data (as demonstrated by the well-known bagging technique). The second benefit is that, through analysis that we provide in the paper, we can derive tighter differential privacy guarantees when several queries are made to this mechanism. In order to derive these guarantees, we introduce the upwards and downwards moments accountants and derive bounds for these moments accountants in a data-driven fashion. We demonstrate the improvements our model makes over standard subsample-and-aggregate in two datasets (Heart Failure (private) and UCI Adult (public)). | Conference | NeurIPS | Deep learning, Privacy-preserving ML & synthetic data | ||||
2019/12/08 00:00 | Time-series Generative Adversarial Networks | J. Yoon, D. Jarrett, M. van der Schaar | 2019 | https://papers.nips.cc/paper/2019/hash/c9efe5f26cd17ba6216bbe2a7d26d490-Abstract.html | A good generative model for time-series data should preserve temporal dynamics, in the sense that new sequences respect the original relationships between variables across time. Existing methods that bring generative adversarial networks (GANs) into the sequential setting do not adequately attend to the temporal correlations unique to time-series data. At the same time, supervised models for sequence prediction - which allow finer control over network dynamics - are inherently deterministic. We propose a novel framework for generating realistic time-series data that combines the flexibility of the unsupervised paradigm with the control afforded by supervised training. Through a learned embedding space jointly optimized with both supervised and adversarial objectives, we encourage the network to adhere to the dynamics of the training data during sampling. Empirically, we evaluate the ability of our method to generate realistic samples using a variety of real and synthetic time-series datasets. Qualitatively and quantitatively, we find that the proposed framework consistently and significantly outperforms state-of-the-art benchmarks with respect to measures of similarity and predictive ability. | Conference | NeurIPS | Deep learning, Privacy-preserving ML & synthetic data, Time series analysis | ||||
2019/06/09 00:00 | Validating Causal Inference Models via Influence Functions | A. M. Alaa, M. van der Schaar | 2019 | http://proceedings.mlr.press/v97/alaa19a.html | The problem of estimating causal effects of treatments from observational data falls beyond the realm of supervised learning — because counterfactual data is inaccessible, we can never observe the true causal effects. In the absence of "supervision", how can we evaluate the performance of causal inference methods? In this paper, we use influence functions — the functional derivatives of a loss function — to develop a model validation procedure that estimates the estimation error of causal inference methods. Our procedure utilizes a Taylor-like expansion to approximate the loss function of a method on a given dataset in terms of the influence functions of its loss on a "synthesized", proximal dataset with known causal effects. Under minimal regularity assumptions, we show that our procedure is consistent and efficient. Experiments on 77 benchmark datasets show that using our procedure, we can accurately predict the comparative performances of state-of-the-art causal inference methods applied to a given observational study. | Conference | ICML | Automated ML, Causal inference | Treatment & trials | | |
2019/05/06 00:00 | INVASE: Instance-wise Variable Selection using Neural Networks | J. Yoon, J. Jordon, M. van der Schaar | 2019 | https://openreview.net/pdf?id=BJg_roAcK7 | The advent of big data brings with it data with more and more dimensions and thus a growing need to be able to efficiently select which features to use for a variety of problems. While global feature selection has been a well-studied problem for quite some time, only recently has the paradigm of instance-wise feature selection been developed. In this paper, we propose a new instance-wise feature selection method, which we term INVASE. INVASE consists of 3 neural networks: a selector network, a predictor network and a baseline network, which are used to train the selector network using the actor-critic methodology. Using this methodology, INVASE is capable of flexibly discovering feature subsets of a different size for each instance, addressing a key limitation of existing state-of-the-art methods. We demonstrate through a mixture of synthetic and real data experiments that INVASE significantly outperforms state-of-the-art benchmarks. | Conference | ICLR | Deep learning, Feature selection, Interpretability & explainability | | | |
2019/05/06 00:00 | KnockoffGAN: Generating Knockoffs for Feature Selection using Generative Adversarial Networks | J. Jordon, J. Yoon, M. van der Schaar | 2019 | https://openreview.net/pdf?id=ByeZ5jC5YQ | Feature selection is a pervasive problem. The discovery of relevant features can be as important for performing a particular task (such as to avoid overfitting in prediction) as it can be for understanding the underlying processes governing the true label (such as discovering relevant genetic factors for a disease). Machine learning driven feature selection can enable discovery from large, high-dimensional, non-linear observational datasets by creating a subset of features for experts to focus on. In order to use expert time most efficiently, we need a principled methodology capable of controlling the False Discovery Rate. In this work, we build on the promising Knockoff framework by developing a flexible knockoff generation model. We adapt the Generative Adversarial Networks framework to allow us to generate knockoffs with no assumptions on the feature distribution. Our model consists of 4 networks, a generator, a discriminator, a stability network and a power network. We demonstrate the capability of our model to perform feature selection, showing that it performs as well as the originally proposed knockoff generation model in the Gaussian setting and that it outperforms the original model in non-Gaussian settings, including on a real-world dataset. | Conference | ICLR | Deep learning, Feature selection | ||||
2019/05/06 00:00 | PATE-GAN: Generating Synthetic Data with Differential Privacy Guarantees | J. Yoon, J. Jordon, M. van der Schaar | 2019 | https://openreview.net/pdf?id=S1zk9iRqF7 | Machine learning has the potential to assist many communities in using the large datasets that are becoming more and more available. Unfortunately, much of that potential is not being realized because it would require sharing data in a way that compromises privacy. In this paper, we investigate a method for ensuring (differential) privacy of the generator of the Generative Adversarial Nets (GAN) framework. The resulting model can be used for generating synthetic data on which algorithms can be trained and validated, and on which competitions can be conducted, without compromising the privacy of the original dataset. Our method modifies the Private Aggregation of Teacher Ensembles (PATE) framework and applies it to GANs. Our modified framework (which we call PATE-GAN) allows us to tightly bound the influence of any individual sample on the model, resulting in tight differential privacy guarantees and thus an improved performance over models with the same guarantees. We also look at measuring the quality of synthetic data from a new angle; we assert that for the synthetic data to be useful for machine learning researchers, the relative performance of two algorithms (trained and tested) on the synthetic dataset should be the same as their relative performance (when trained and tested) on the original dataset. Our experiments, on various datasets, demonstrate that PATE-GAN consistently outperforms the state-of-the-art method with respect to this and other notions of synthetic data quality. | Conference | ICLR | Deep learning, Privacy-preserving ML & synthetic data | ||||
2019/04/16 00:00 | Boosting Survival Predictions with Auxiliary Data from Heterogeneous Domains | A. Bellot, M. van der Schaar | 2019 | http://proceedings.mlr.press/v89/bellot19a.html | Survival models derived from health care data are an important support to inform critical screening and therapeutic decisions. Most models, however, do not generalize to populations outside the marginal and conditional distribution assumptions for which they were derived. This presents a significant barrier to the deployment of machine learning techniques into wider clinical practice as most medical studies are data scarce, especially for the analysis of time-to-event outcomes. In this work we propose a survival prediction model that is able to improve predictions on a small data domain of interest - such as a local hospital - by leveraging related data from other domains - such as data from other hospitals. We construct an ensemble of weak survival predictors which iteratively adapt the marginal distributions of the source and target data such that similar source patients contribute to the fit and ultimately improve predictions on target patients of interest. This represents the first boosting-based transfer learning algorithm in the survival analysis literature. We demonstrate the performance and utility of our algorithm on synthetic and real healthcare data collected at various locations. | Conference | AISTATS | Ensemble learning, Transfer learning | Risk & prognosis | | |
2019/04/16 00:00 | Sequential Patient Recruitment and Allocation for Adaptive Clinical Trials | O. Atan, W. R. Zame, M. van der Schaar | 2019 | http://proceedings.mlr.press/v89/atan19a.html | Randomized Controlled Trials (RCTs) are the gold standard for comparing the effectiveness of a new treatment to the current one (the control). Most RCTs allocate the patients to the treatment group and the control group by uniform randomization. We show that this procedure can be highly sub-optimal (in terms of learning) if – as is often the case – patients can be recruited in cohorts (rather than all at once), the effects on each cohort can be observed before recruiting the next cohort, and the effects are heterogeneous across identifiable subgroups of patients. We formulate the patient allocation problem as a finite stage Markov Decision Process in which the objective is to minimize a given weighted combination of type-I and type-II errors. Because finding the exact solution to this Markov Decision Process is computationally intractable, we propose an algorithm – Knowledge Gradient for Randomized Controlled Trials (RCT-KG) – that yields an approximate solution. Our experiment on a synthetic dataset with Bernoulli outcomes shows that for a given size of trial our method achieves significant reduction in error, and to achieve a prescribed level of confidence (in identifying whether the treatment is superior to the control), our method requires many fewer patients. | Conference | AISTATS | Next-generation clinical trials, Reinforcement learning | Treatment & trials | | |
2019/04/16 00:00 | Temporal Quilting for Survival Analysis | C. Lee, W. R. Zame, A. M. Alaa, M. van der Schaar | 2019 | http://proceedings.mlr.press/v89/lee19a.html | The importance of survival analysis in many disciplines (especially in medicine) has led to the development of a variety of approaches to modeling the survival function. Models constructed via various approaches offer different strengths and weaknesses in terms of discriminative performance and calibration, but no one model is best across all datasets or even across all time horizons within a single dataset. Because we require both good calibration and good discriminative performance over different time horizons, conventional model selection and ensemble approaches are not applicable. This paper develops a novel approach that combines the collective intelligence of different underlying survival models to produce a valid survival function that is well-calibrated and offers superior discriminative performance at different time horizons. Empirical results show that our approach provides significant gains over the benchmarks on a variety of real-world datasets. | Conference | AISTATS | Automated ML, Ensemble learning, Survival analysis competing risks & comorbidities | Risk & prognosis | |||
2019/01/01 00:00 | A Bandit Learning Approach to Energy-Efficient Femto-Caching under Uncertainty | S. Maghsudi, M. van der Schaar | 2019 | https://ieeexplore.ieee.org/document/9013630 | We address a resource allocation problem for joint caching and broadcast transmission in small cell networks with time-varying statistical properties. Each small base station (SBS) selects some files to store in its capacity-limited cache, given no prior information about the random and dynamic parameters such as file popularity, channel quality, and network traffic. Moreover, at consecutive rounds, a file is selected from the cache to broadcast. We define the utility of the SBS in terms of the number of successful file receptions per power consumption. The problem is formulated as to place the cache, and afterward select a file from the cache together with a transmission power for every broadcast round. The goal is to maximize the accumulated utility over the horizon. Therefore, we decompose the initial problem into two sub-problems: (i) cache placement, and (ii) joint file- and transmit power selection. The former problem boils down to a stochastic knapsack problem with stationary items' value, whereas the latter is cast as a multi-armed bandit problem with mortal arms. We develop a solution to each problem and evaluate the proposed solutions by theoretical and numerical analysis. | 10.1109/GLOBECOM38437.2019.9013630 | Conference | IEEE Global Communications Conference (GLOBECOM) | Multi-armed bandits | | |
2019/01/01 00:00 | A Non-stationary Online Learning Approach to Mobility Management | Y. Zhou, C. Shen, M. van der Schaar | 2019 | https://ieeexplore.ieee.org/document/8624614 | Efficient mobility management is an important problem in modern wireless networks with heterogeneous cell sizes and increased node densities. We show that optimization-based mobility protocols cannot achieve long-term optimal performance, particularly for ultra-dense networks in a time-varying environment. To address the complex system dynamics, especially the possible change of statistics due to user movement and environment changes, we propose piece-wise stationary online-learning algorithms to learn the varying throughput distribution and solve the frequent handover problem. The proposed MMBD/MMBSW algorithms are proved to achieve sublinear regret performance in finite time horizon and a linear, non-trivial rigorous regret bound for infinite time horizon. We also study the robustness of the MMBD/MMBSW algorithms under delayed or missing feedback. The simulations show that the proposed algorithms can outperform 3GPP protocols with optimal thresholds. More importantly, they are more robust to system dynamics which are commonly present in practical ultra-dense wireless networks. | 10.1109/TWC.2019.2893168 | Journal | IEEE Transactions on Wireless Communications | Communications and Networks | |||
2019/01/01 00:00 | ASAC: Active Sensing using Actor-Critic Models | J. Yoon, J. Jordon, M. van der Schaar | 2019 | http://proceedings.mlr.press/v106/yoon19a.html | Deciding what and when to observe is critical when making observations is costly. In a medical setting where observations can be made sequentially, making these observations (or not) should be an active choice. We refer to this as the active sensing problem. In this paper, we propose a novel deep learning framework, which we call ASAC (Active Sensing using Actor-Critic models) to address this problem. ASAC consists of two networks: a selector network and a predictor network. The selector network uses previously selected observations to determine what should be observed in the future. The predictor network uses the observations selected by the selector network to predict a label, providing feedback to the selector network (well-selected variables should be predictive of the label). The goal of the selector network is then to select variables that balance the cost of observing the selected variables with their predictive power; we wish to preserve the conditional label distribution. During training, we use the actor-critic models to allow the loss of the selector to be “back-propagated” through the sampling process. The selector network “acts” by selecting future observations to make. The predictor network acts as a “critic” by feeding predictive errors for the selected variables back to the selector network. In our experiments, we show that ASAC significantly outperforms state-of-the-art benchmarks on two real-world medical datasets. | Conference | Machine Learning for Healthcare Conference (MLHC) | Deep learning, Feature selection, Time series analysis | Screening | | |
2019/01/01 00:00 | Cardiovascular Disease Risk Prediction using Automated Machine Learning: A Prospective Study of 423,604 UK Biobank Participants | A. M. Alaa, T. Bolton, E. Di Angelantonio, J. H. F. Rudd, M. van der Schaar | 2019 | https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0213653 | Identifying people at risk of cardiovascular diseases (CVD) is a cornerstone of preventative cardiology. Risk prediction models currently recommended by clinical guidelines are typically based on a limited number of predictors with sub-optimal performance across all patient groups. Data-driven techniques based on machine learning (ML) might improve the performance of risk predictions by agnostically discovering novel risk predictors and learning the complex interactions between them. We tested (1) whether ML techniques based on a state-of-the-art automated ML framework (AutoPrognosis) could improve CVD risk prediction compared to traditional approaches, and (2) whether considering non-traditional variables could increase the accuracy of CVD risk predictions. Using data on 423,604 participants without CVD at baseline in UK Biobank, we developed a ML-based model for predicting CVD risk based on 473 available variables. Our ML-based model was derived using AutoPrognosis, an algorithmic tool that automatically selects and tunes ensembles of ML modeling pipelines (comprising data imputation, feature processing, classification and calibration algorithms). We compared our model with a well-established risk prediction algorithm based on conventional CVD risk factors (Framingham score), a Cox proportional hazards (PH) model based on familiar risk factors (i.e., age, gender, smoking status, systolic blood pressure, history of diabetes, reception of treatments for hypertension and body mass index), and a Cox PH model based on all of the 473 available variables. Predictive performances were assessed using area under the receiver operating characteristic curve (AUC-ROC). Overall, our AutoPrognosis model improved risk prediction (AUC-ROC: 0.774, 95% CI: 0.768-0.780) compared to Framingham score (AUC-ROC: 0.724, 95% CI: 0.720-0.728, p < 0.001), Cox PH model with conventional risk factors (AUC-ROC: 0.734, 95% CI: 0.729-0.739, p < 0.001), and Cox PH model with all UK Biobank variables (AUC-ROC: 0.758, 95% CI: 0.753-0.763, p < 0.001). Out of 4,801 CVD cases recorded within 5 years of baseline, AutoPrognosis was able to correctly predict 368 more cases compared to the Framingham score. Our AutoPrognosis model included predictors that are not usually considered in existing risk prediction models, such as the individuals’ usual walking pace and their self-reported overall health rating. Furthermore, our model improved risk prediction in potentially relevant sub-populations, such as in individuals with history of diabetes. We also highlight the relative benefits accrued from including more information into a predictive model (information gain) as compared to the benefits of using more complex models (modeling gain). Our AutoPrognosis model improves the accuracy of CVD risk prediction in the UK Biobank population. This approach performs well in traditionally poorly served patient subgroups. Additionally, AutoPrognosis uncovered novel predictors for CVD disease that may now be tested in prospective studies. We found that the “information gain” achieved by considering more risk factors in the predictive model was significantly higher than the “modeling gain” achieved by adopting complex predictive models. | 10.1371/journal.pone.0213653 | Journal | PloS One | Automated ML, Ensemble learning | Phenotyping & subgroup analysis, Risk & prognosis | |
2019/01/01 00:00 | "De novo" Heart Failure: a Mechanism Underscoring Sex Differences in Outcomes after ST-Segment Elevation Myocardial Infarction | E. Cenko, M. van der Schaar, J. Yoon, O. Manfrini, Z. Vasiljevic, M. Vavlukis, S. Kedev, M. Asanin, D. Miličić, L. Badimon, R. Bugiardini | 2019 | http://www.onlinejacc.org/content/73/9_Supplement_1/68 | Background ST-Segment Elevation Myocardial Infarction (STEMI) complicated by symptoms of acute heart failure (HF) is associated with excess mortality. Yet the relative contribution of sex to the development of acute HF and its related outcomes remains controversial. We aimed to compare the incidence and outcomes of patients with HF during index admission for STEMI according to sex and prior HF status: pre-existing diagnosis of HF, as assessed by past medical history, or no prior HF. Methods Cohort study using a population-based registry consisting of 8,409 STEMI patients with acute HF status recorded at baseline. Adjusted 30-day mortality and HF rates at index admission were estimated using inverse probability of weighting and logistic regression models. HF was defined as Killip class 2 or higher and classified according to prior medical history as acute “de novo” or decompensated HF. Results A total of 2,526 women and 5,883 men had HF status recorded at baseline and were included in the analysis. Of these patients, 2,403 (95.1%) women and 5,664 (96.3%) men have never experienced HF before index admission. After adjustment for baseline clinical covariates, the incidence of “de novo” HF was significantly higher for women than for men (29.4% vs 21.9 %, OR 1.23; 95%CI 1.10-1.38). For “de novo” HF presentations women have higher 30-day mortality than men (9.5% vs 6.2%: OR 1.58; 95%CI 1.33-1.88). After adjusting for potential confounders, a history of pre-existing HF was strongly associated with increased risk of acute decompensated HF at index admission (OR 3.89; 95%CI, 3.02-5.01). Nevertheless, when women and men presented with acute decompensated HF their outcomes are equally negative with a 30-day mortality of 11.3% vs 12.9%, respectively (OR 0.86; 95%CI 0.43-1.70). Conclusion Female sex has differing effects among patients with STEMI according to prior medical history of HF. It worsens outcomes in patients with acute “de novo” HF but has neutral effects in those with acute decompensated HF. “De novo” HF is a key feature to explain mortality difference between sexes. | 10.1016/S0735-1097(19)30677-1 | Journal | Journal of the American College of Cardiology | ||||
2019/01/01 00:00 | Dynamic Matching and Allocation of Tasks | K. Ahuja, M. van der Schaar | 2019 | https://dl.acm.org/doi/10.1145/3369925 | In many two-sided markets, the parties to be matched have incomplete information about their characteristics. We consider the settings where the parties engaged are extremely patient and are interested in long-term partnerships. Hence, once the final matches are determined, they persist for a long time. Each side has an opportunity to learn (some) relevant information about the other before final matches are made. For instance, clients seeking workers to perform tasks often conduct interviews that require the workers to perform some tasks and thereby provide information to both sides. The performance of a worker in such an interview—and hence the information revealed—depends both on the inherent characteristics of the worker and the task and also on the actions taken by the worker (e.g., the effort expended), which are not observed by the client. Thus, there is moral hazard. Our goal is to derive a dynamic matching mechanism that facilitates learning on both sides before final matches are achieved and ensures that the worker side does not have incentive to obscure learning of their characteristics through their actions. We derive such a mechanism that leads to final matching that achieves optimal performance (revenue) in equilibrium. We show that the equilibrium strategy is long-run coalitionally stable, which means there is no subset of workers and clients that can gain by deviating from the equilibrium strategy. We derive all the results under the modeling assumption that the utilities of the agents are defined as limit of means of the utility obtained in each interaction. | 10.1145/3369925 | Journal | ACM Transactions on Economics and Computation | Personalized Education | | |
2019/01/01 00:00 | Dynamic Prediction in Clinical Survival Analysis using Temporal Convolutional Networks | D. Jarrett, J. Yoon, M. van der Schaar | 2019 | https://ieeexplore.ieee.org/document/8765241 | Accurate prediction of disease trajectories is critical for early identification and timely treatment of patients at risk. Conventional methods in survival analysis are often constrained by strong parametric assumptions and limited in their ability to learn from high-dimensional data. This paper develops a novel convolutional approach that addresses the drawbacks of both traditional statistical approaches as well as recent neural network models for survival. We present Match-Net: a missingness-aware temporal convolutional hitting-time network, designed to capture temporal dependencies and heterogeneous interactions in covariate trajectories and patterns of missingness. To the best of our knowledge, this is the first investigation of temporal convolutions in the context of dynamic prediction for personalized risk prognosis. Using real-world data from the Alzheimer's Disease Neuroimaging Initiative, we demonstrate state-of-the-art performance without making any assumptions regarding underlying longitudinal or time-to-event processes, attesting to the model's potential utility in clinical decision support. | 10.1109/JBHI.2019.2929264 | Journal | IEEE Journal of Biomedical and Health Informatics | Deep learning, Time series analysis | Risk & disease trajectories, Risk & prognosis | |
2019/01/01 00:00 | Dynamic-DeepHit: A Deep Learning Approach for Dynamic Survival Analysis with Competing Risks based on Longitudinal Data | C. Lee, J. Yoon, M. van der Schaar | 2019 | https://ieeexplore.ieee.org/document/8681104 | Currently available risk prediction methods are limited in their ability to deal with complex, heterogeneous, and longitudinal data such as that available in primary care records, or in their ability to deal with multiple competing risks. This paper develops a novel deep learning approach that is able to successfully address current limitations of standard statistical approaches such as landmarking and joint modeling. Our approach, which we call Dynamic-DeepHit, flexibly incorporates the available longitudinal data comprising various repeated measurements (rather than only the last available measurements) in order to issue dynamically updated survival predictions for one or multiple competing risk(s). Dynamic-DeepHit learns the time-to-event distributions without the need to make any assumptions about the underlying stochastic models for the longitudinal and the time-to-event processes. Thus, unlike existing works in statistics, our method is able to learn data-driven associations between the longitudinal data and the various associated risks without underlying model specifications. We demonstrate the power of our approach by applying it to a real-world longitudinal dataset from the U.K. Cystic Fibrosis Registry, which includes a heterogeneous cohort of 5883 adult patients with annual follow-ups between 2009 and 2015. The results show that Dynamic-DeepHit provides a drastic improvement in discriminating individual risks of different forms of failures due to cystic fibrosis. Furthermore, our analysis utilizes post-processing statistics that provide clinical insight by measuring the influence of each covariate on risk predictions and the temporal importance of longitudinal measurements, thereby enabling us to identify covariates that are influential for different competing risks. | 10.1109/TBME.2019.2909027 | Journal | IEEE Transactions on Biomedical Engineering | Deep learning, Survival analysis competing risks & comorbidities, Time series analysis | Risk & disease trajectories, Risk & prognosis | |
2019/01/01 00:00 | Estimating counterfactual treatment outcomes over time through adversarially balanced representations | I. Bica, A. M. Alaa, M. van der Schaar | 2019 | https://openreview.net/pdf?id=BJg866NFvB | Identifying when to give treatments to patients and how to select among multiple treatments over time are important medical problems with a few existing solutions. In this paper, we introduce the Counterfactual Recurrent Network (CRN), a novel sequence-to-sequence model that leverages the increasingly available patient observational data to estimate treatment effects over time and answer such medical questions. To handle the bias from time-varying confounders, covariates affecting the treatment assignment policy in the observational data, CRN uses domain adversarial training to build balancing representations of the patient history. At each timestep, CRN constructs a treatment invariant representation which removes the association between patient history and treatment assignments and thus can be reliably used for making counterfactual predictions. On a simulated model of tumour growth, with varying degree of time-dependent confounding, we show how our model achieves lower error in estimating counterfactuals and in choosing the correct treatment and timing of treatment than current state-of-the-art methods. | Conference | NeurIPS Machine Learning for Health Workshop | Causal inference, Deep learning, Time series analysis | Treatment & trials | | |
2019/01/01 00:00 | Improving Workflow Efficiency for Mammography Using Machine Learning | T. Kyono, F. J. Gilbert, M. van der Schaar | 2019 | https://www.jacr.org/article/S1546-1440(19)30596-4/fulltext | The aim of this study was to determine whether machine learning could reduce the number of mammograms the radiologist must read by using a machine-learning classifier to correctly identify normal mammograms and to select the uncertain and abnormal examinations for radiological interpretation. Mammograms in a research data set from over 7,000 women who were recalled for assessment at six UK National Health Service Breast Screening Program centers were used. A convolutional neural network in conjunction with multitask learning was used to extract imaging features from mammograms that mimic the radiological assessment provided by a radiologist, the patient's nonimaging features, and pathology outcomes. A deep neural network was then used to concatenate and fuse multiple mammogram views to predict both a diagnosis and a recommendation of whether or not additional radiological assessment was needed. Ten-fold cross-validation was used on 2,000 randomly selected patients from the data set; the remainder of the data set was used for convolutional neural network training. While maintaining an acceptable negative predictive value of 0.99, the proposed model was able to identify 34% (95% confidence interval, 25%-43%) and 91% (95% confidence interval: 88%-94%) of the negative mammograms for test sets with a cancer prevalence of 15% and 1%, respectively. Machine learning was leveraged to successfully reduce the number of normal mammograms that radiologists need to read without degrading diagnostic accuracy. | 10.1016/j.jacr.2019.05.012 | Journal | Journal of the American College of Radiology | Medical imaging, Screening | |||
2019/01/01 00:00 | Joint Concordance Index | K. Ahuja, M. van der Schaar | 2019 | https://ieeexplore.ieee.org/document/9048941 | Existing metrics in competing risks survival analysis such as concordance and accuracy do not evaluate a model's ability to jointly predict the event type and the event time. To address these limitations, we propose a new metric, which we call the joint concordance. The joint concordance measures a model's ability to predict the overall risk profile, i.e., risk of death from different event types. We develop a consistent estimator for the new metric that accounts for the censoring bias. We use the new metric to develop a variable importance ranking approach. Using the real and synthetic data experiments, we show that models selected using the existing metrics are worse than those selected using joint concordance at jointly predicting the event type and event time. We show that the existing approaches for variable importance ranking often fail to recognize the importance of the event-specific risk factors, whereas, the proposed approach does not, since it compares risk factors based on their contribution to the prediction of the different event-types. To summarize, joint concordance is helpful for model comparisons and variable importance ranking and has the potential to impact applications such as risk-stratification and treatment planning in multimorbid populations. | 10.1109/IEEECONF44664.2019.9048941 | Conference | Asilomar Conference on Signals, Systems, and Computers | Survival analysis competing risks & comorbidities | |||
2019/01/01 00:00 | Lifelong Bayesian Optimization | Y. Zhang, J. Jordon, A. M. Alaa, M. van der Schaar | 2019 | https://arxiv.org/abs/1905.12280 | Automatic Machine Learning (Auto-ML) systems tackle the problem of automating the design of prediction models or pipelines for data science. In this paper, we present Lifelong Bayesian Optimization (LBO), an online, multitask Bayesian optimization (BO) algorithm designed to solve the problem of model selection for datasets arriving and evolving over time. To be suitable for "lifelong" Bayesian Optimization, an algorithm needs to scale with the ever increasing number of acquisitions and should be able to leverage past optimizations in learning the current best model. We cast the problem of model selection as a black-box function optimization problem. In LBO, we exploit the correlation between functions by using components of previously learned functions to speed up the learning process for newly arriving datasets. Experiments on real and synthetic data show that LBO outperforms standard BO algorithms applied repeatedly on the data. | Other | Automated ML | |||||
2019/01/01 00:00 | Multi-view Multi-task Learning for Improving Autonomous Mammogram Diagnosis | T. Kyono, F. J. Gilbert, M. van der Schaar | 2019 | http://proceedings.mlr.press/v106/kyono19a.html | The number of women requiring screening and diagnostic mammography is increasing. The recent promise of machine learning on medical images has led to an influx of studies using deep learning for autonomous mammogram diagnosis. We present a novel multi-view multi-task (MVMT) convolutional neural network (CNN) trained to predict the radiological assessments known to be associated with cancer, such as breast density, conspicuity, etc., in addition to cancer diagnosis. We show on full-field mammograms that multi-task learning has three advantages: 1) learning refined feature representations associated with cancer improves the classification performance of the diagnosis task, 2) issuing radiological assessments provides an additional layer of model interpretability that a radiologist can use to debug and scrutinize the diagnoses provided by the CNN, and 3) improves the radiological workflow by providing automated annotation of radiological reports. Results obtained on a private dataset of over 7,000 patients show that our MVMT network attained an AUROC and AUPRC of 0.855 ± 0.021 and 0.646 ± 0.023, respectively, and improved on the performance of other state-of-the-art multi-view CNNs. | Conference | Machine Learning for Healthcare Conference (MLHC) | Deep learning | Medical imaging, Risk & prognosis, Screening | | |
2019/01/01 00:00 | Optimal Piecewise Approximations for Model Interpretation | K. Ahuja, W. R. Zame, M. van der Schaar | 2019 | https://ieeexplore.ieee.org/document/9049045 | Recent literature interprets the predictions of "black-box" machine learning models (Neural Networks, Random Forests, etc.) by approximating these models in terms of simpler models such as piecewise linear or piecewise constant models. Existing literature does not provide guarantees on whether these approximations reflect the nature of the predictive model well, which can result in misleading interpretations. We provide a tractable dynamic programming algorithm that partitions the feature space into subsets and assigns a local model (constant/linear model) to provide piecewise constant/piecewise linear interpretations of an arbitrary predictive model. When approximation loss (between the interpretation and the predictive model) is measured in terms of mean squared error, our approximation is optimal; for more general loss functions, our interpretation is approximately optimal. Therefore, in both cases it probably approximately correctly (PAC) learns the predictive model. Experiments with real and synthetic data show that it provides significant improvements (in terms of mean squared error) over competing approaches. We also show real use cases to establish the utility of the proposed approach over competing approaches. | 10.1109/IEEECONF44664.2019.9049045 | Conference | Asilomar Conference on Signals, Systems, and Computers | Interpretability & explainability | |||
2019/01/01 00:00 | Sex-Related Differences in Heart Failure After ST-Segment Elevation Myocardial Infarction | E. Cenko, M. van der Schaar, J. Yoon, O. Manfrini, Z. Vasiljevic, M. Vavlukis, S. Kedev, D. Miličić, L. Badimon, R. Bugiardini | 2019 | http://www.onlinejacc.org/content/74/19/2379?_ga=2.67716561.1569861805.1573499147-258653805.1573499147 | ST-segment elevation myocardial infarction (STEMI) complicated by symptoms of acute de novo heart failure is associated with excess mortality. Whether development of heart failure and its outcomes differ by sex is unknown. This study sought to examine the relationships among sex, acute heart failure, and related outcomes after STEMI in patients with no prior history of heart failure recorded at baseline. Patients were recruited from a network of hospitals in the ISACS-TC (International Survey of Acute Coronary Syndromes in Transitional Countries) registry ( NCT01218776). Main outcome measures were incidence of Killip class ≥II at hospital presentation and risk-adjusted 30-day mortality rates were estimated using inverse probability of weighting and logistic regression models. This study included 10,443 patients (3,112 women). After covariate adjustment and matching for age, cardiovascular risk factors, comorbidities, disease severity, and delay to hospital presentation, the incidence of de novo heart failure at hospital presentation was significantly higher for women than for men (25.1% vs. 20.0%, odds ratio [OR]: 1.34; 95% confidence interval [CI]: 1.21 to 1.48). Women with de novo heart failure had higher 30-day mortality than did their male counterparts (25.1% vs. 20.6%; OR: 1.29; 95% CI: 1.05 to 1.58). The sex-related difference in mortality rates was still apparent in patients with de novo heart failure undergoing reperfusion therapy after hospital presentation (21.3% vs. 15.7%; OR: 1.45; 95% CI: 1.07 to 1.96). Women are at higher risk to develop de novo heart failure after STEMI and women with de novo heart failure have worse survival than do their male counterparts. Therefore, de novo heart failure is a key feature to explain mortality gap after STEMI among women and men. | 10.1016/j.jacc.2019.08.1047 | Journal | Journal of the American College of Cardiology | ||||
2019/01/01 00:00 | Sex-Specific Treatment Effects After Primary Percutaneous Intervention: A Study on Coronary Blood Flow and Delay to Hospital Presentation | E. Cenko, M. van der Schaar, J. Yoon, S. Kedev, M. Vavlukis, Z. Vasiljevic, M. Asanin, D. Miličić, O. Manfrini, L. Badimon, R. Bugiardini | 2019 | https://www.ahajournals.org/doi/full/10.1161/JAHA.118.011190 | We hypothesized that female sex is a treatment effect modifier of blood flow and related 30‐day mortality after primary percutaneous coronary intervention (PCI) for ST‐segment–elevation myocardial infarction and that the magnitude of the effect on outcomes differs depending on delay to hospital presentation. We identified 2596 patients enrolled in the ISACS‐TC (International Survey of Acute Coronary Syndromes in Transitional Countries) registry from 2010 to 2016. Primary outcome was the occurrence of 30‐day mortality. Key secondary outcome was the rate of suboptimal post‐PCI Thrombolysis in Myocardial Infarction (TIMI; flow grade 0–2). Multivariate logistic regression and inverse probability of treatment weighted models were adjusted for baseline clinical covariates. We characterized patient outcomes associated with a delay from symptom onset to hospital presentation of ≤120 minutes. In multivariable regression models, female sex was associated with postprocedural TIMI flow grade 0 to 2 (odds ratio [OR], 1.68; 95% CI, 1.15–2.44) and higher mortality (OR, 1.72; 95% CI, 1.02–2.90). Using inverse probability of treatment weighting, 30‐day mortality was higher in women compared with men (4.8% versus 2.5%; OR, 2.00; 95% CI, 1.27–3.15). Likewise, we found a significant sex difference in post‐PCI TIMI flow grade 0 to 2 (8.8% versus 5.0%; OR, 1.83; 95% CI, 1.31–2.56). The sex gap in mortality was no longer significant for patients having hospital presentation of ≤120 minutes (OR, 1.28; 95% CI, 0.35–4.69). Sex difference in post‐PCI TIMI flow grade was consistent regardless of time to hospital presentation. Delay to hospital presentation and suboptimal post‐PCI TIMI flow grade are variables independently associated with excess mortality in women, suggesting complementary mechanisms of reduced survival. | 10.1161/JAHA.118.011190 | Journal | Journal of the American Heart Association | | | |
2019/01/01 00:00 | Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders | I. Bica, A. M. Alaa, M. van der Schaar | 2019 | http://proceedings.mlr.press/v119/bica20a.html | The estimation of treatment effects is a pervasive problem in medicine. Existing methods for estimating treatment effects from longitudinal observational data assume that there are no hidden confounders, an assumption that is not testable in practice and, if it does not hold, leads to biased estimates. In this paper, we develop the Time Series Deconfounder, a method that leverages the assignment of multiple treatments over time to enable the estimation of treatment effects in the presence of multi-cause hidden confounders. The Time Series Deconfounder uses a novel recurrent neural network architecture with multitask output to build a factor model over time and infer latent variables that render the assigned treatments conditionally independent; then, it performs causal inference using these latent variables that act as substitutes for the multi-cause unobserved confounders. We provide a theoretical analysis for obtaining unbiased causal effects of time-varying exposures using the Time Series Deconfounder. Using both simulated and real data we show the effectiveness of our method in deconfounding the estimation of treatment responses over time. | Conference | NeurIPS Machine Learning for Health Workshop | Causal inference, Deep learning, Time series analysis | Treatment & trials | |||
2019/01/01 00:00 | Working Alone and Working With Others: Implications for the Malthusian Era | K. Ahuja, M. van der Schaar, W. R. Zame | 2019 | https://link.springer.com/article/10.1007%2Fs00199-019-01234-3 | This paper presents a stylized dynamic model to study the impact of the social organization of production during the Malthusian Era (after the Neolithic Age and before the Industrial Revolution), during which there was little or no economic growth. The focus is on the division of time between working alone (individualism) and working with others (collectivism). This division of time matters because individuals have different productive abilities. A greater fraction of time spent working with others raises the income of current Low-ability individuals—but it may also lower the income of High-ability individuals and hence lower the bequests they leave for future Low-ability individuals. In the presence of congestion effects, these forces interact in a very complicated way. The paper analyzes the comparative statics implications of this division of time on economic outcomes in the (unique, non-degenerate) Malthusian steady state. It finds that a greater fraction of time spent working with others (a greater degree of collectivism) leads to a larger population, smaller per capita income, and lower income inequality. Some historical evidence is consistent with these predictions. | 10.1007/s00199-019-01234-3 | Journal | Economic Theory | | | |
2018/12/03 00:00 | Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks | B. Lim, A. M. Alaa, M. van der Schaar | 2018 | https://papers.nips.cc/paper/2018/hash/56e6a93212e4482d99c84a639d254b67-Abstract.html | Electronic health records provide a rich source of data for machine learning methods to learn dynamic treatment responses over time. However, any direct estimation is hampered by the presence of time-dependent confounding, where actions taken are dependent on time-varying variables related to the outcome of interest. Drawing inspiration from marginal structural models, a class of methods in epidemiology which use propensity weighting to adjust for time-dependent confounders, we introduce the Recurrent Marginal Structural Network - a sequence-to-sequence architecture for forecasting a patient's expected response to a series of planned treatments. Using simulations of a state-of-the-art pharmacokinetic-pharmacodynamic (PK-PD) model of tumor growth, we demonstrate the ability of our network to accurately learn unbiased treatment responses from observational data – even under changes in the policy of treatment assignments – and performance gains over benchmarks. | Conference | NeurIPS | Causal inference, Deep learning, Time series analysis | Treatment & trials | |||
2018/12/03 00:00 | Multitask Boosting for Survival Analysis with Competing Risks | A. Bellot, M. van der Schaar | 2018 | https://papers.nips.cc/paper/2018/hash/2afe4567e1bf64d32a5527244d104cea-Abstract.html | The co-occurrence of multiple diseases among the general population is an important problem as those patients have more risk of complications and represent a large share of health care expenditure. Learning to predict time-to-event probabilities for these patients is a challenging problem because the risks of events are correlated (there are competing risks) with often only few patients experiencing individual events of interest, and of those only a fraction are actually observed in the data. We introduce in this paper a survival model with the flexibility to leverage a common representation of related events that is designed to correct for the strong imbalance in observed outcomes. The procedure is sequential: outcome-specific survival distributions form the components of nonparametric multivariate estimators which we combine into an ensemble in such a way as to ensure accurate predictions on all outcome types simultaneously. Our algorithm is general and represents the first boosting-like method for time-to-event data with multiple outcomes. We demonstrate the performance of our algorithm on synthetic and real data. | Conference | NeurIPS | Ensemble learning, Survival analysis competing risks & comorbidities | Risk & prognosis | |||
2018/07/10 00:00 | AutoPrognosis: Automated Clinical Prognostic Modeling via Bayesian Optimization with Structured Kernel Learning | A. M. Alaa, M. van der Schaar | 2018 | http://proceedings.mlr.press/v80/alaa18b.html | Clinical prognostic models derived from large-scale healthcare data can inform critical diagnostic and therapeutic decisions. To enable off-the-shelf usage of machine learning (ML) in prognostic research, we developed AUTOPROGNOSIS: a system for automating the design of predictive modeling pipelines tailored for clinical prognosis. AUTOPROGNOSIS optimizes ensembles of pipeline configurations efficiently using a novel batched Bayesian optimization (BO) algorithm that learns a low-dimensional decomposition of the pipelines’ high-dimensional hyperparameter space in concurrence with the BO procedure. This is achieved by modeling the pipelines’ performances as a black-box function with a Gaussian process prior, and modeling the “similarities” between the pipelines’ baseline algorithms via a sparse additive kernel with a Dirichlet prior. Meta-learning is used to warm-start BO with external data from “similar” patient cohorts by calibrating the priors using an algorithm that mimics the empirical Bayes method. The system automatically explains its predictions by presenting the clinicians with logical association rules that link patients’ features to predicted risk strata. We demonstrate the utility of AUTOPROGNOSIS using 10 major patient cohorts representing various aspects of cardiovascular patient care. | Conference | ICML | Automated ML, Ensemble learning | Risk & prognosis | | |
2018/07/10 00:00 | GAIN: Missing Data Imputation using Generative Adversarial Nets | J. Yoon, J. Jordon, M. van der Schaar | 2018 | http://proceedings.mlr.press/v80/yoon18a.html | We propose a novel method for imputing missing data by adapting the well-known Generative Adversarial Nets (GAN) framework. Accordingly, we call our method Generative Adversarial Imputation Nets (GAIN). The generator (G) observes some components of a real data vector, imputes the missing components conditioned on what is actually observed, and outputs a completed vector. The discriminator (D) then takes a completed vector and attempts to determine which components were actually observed and which were imputed. To ensure that D forces G to learn the desired distribution, we provide D with some additional information in the form of a hint vector. The hint reveals to D partial information about the missingness of the original sample, which is used by D to focus its attention on the imputation quality of particular components. This hint ensures that G does in fact learn to generate according to the true data distribution. We tested our method on various datasets and found that GAIN significantly outperforms state-of-the-art imputation methods. | Conference | ICML | Deep learning | Missing Data Imputation | |||
2018/07/10 00:00 | Limits of Estimating Heterogeneous Treatment Effects: Guidelines for Practical Algorithm Design | A. M. Alaa, M. van der Schaar | 2018 | http://proceedings.mlr.press/v80/alaa18a.html | Estimating heterogeneous treatment effects from observational data is a central problem in many domains. Because counterfactual data is inaccessible, the problem differs fundamentally from supervised learning, and entails a more complex set of modeling choices. Despite a variety of recently proposed algorithmic solutions, a principled guideline for building estimators of treatment effects using machine learning algorithms is still lacking. In this paper, we provide such a guideline by characterizing the fundamental limits of estimating heterogeneous treatment effects, and establishing conditions under which these limits can be achieved. Our analysis reveals that the relative importance of the different aspects of observational data vary with the sample size. For instance, we show that selection bias matters only in small-sample regimes, whereas with a large sample size, the way an algorithm models the control and treated outcomes is what bottlenecks its performance. Guided by our analysis, we build a practical algorithm for estimating treatment effects using a non-stationary Gaussian process with doubly-robust hyperparameters. Using a standard semi-synthetic simulation setup, we show that our algorithm outperforms the state-of-the-art, and that the behavior of existing algorithms conforms with our analysis. | Conference | ICML | Causal inference | Treatment & trials | | |
2018/07/10 00:00 | RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks | J. Yoon, J. Jordon, M. van der Schaar | 2018 | http://proceedings.mlr.press/v80/yoon18b.html | Training complex machine learning models for prediction often requires a large amount of data that is not always readily available. Leveraging these external datasets from related but different sources is therefore an important task if good predictive models are to be built for deployment in settings where data can be rare. In this paper we propose a novel approach to the problem in which we use multiple GAN architectures to learn to translate from one dataset to another, thereby allowing us to effectively enlarge the target dataset, and therefore learn better predictive models than if we simply used the target dataset. We show the utility of such an approach, demonstrating that our method improves the prediction performance on the target domain over using just the target dataset and also show that our framework outperforms several other benchmarks on a collection of real-world medical datasets. | Conference | ICML | Deep learning, Transfer learning, Privacy-preserving ML & synthetic data | ||||
2018/04/30 00:00 | Deep Sensing: Active Sensing using Multi-directional Recurrent Neural Networks | J. Yoon, W. R. Zame, M. van der Schaar | 2018 | https://openreview.net/pdf?id=r1SnX5xCb | For every prediction we might wish to make, we must decide what to observe (what source of information) and when to observe it. Because making observations is costly, this decision must trade off the value of information against the cost of observation. Making observations (sensing) should be an active choice. To solve the problem of active sensing we develop a novel deep learning architecture: Deep Sensing. At training time, Deep Sensing learns how to issue predictions at various cost-performance points. To do this, it creates multiple representations at various performance levels associated with different measurement rates (costs). This requires learning how to estimate the value of real measurements vs. inferred measurements, which in turn requires learning how to infer missing (unobserved) measurements. To infer missing measurements, we develop a Multi-directional Recurrent Neural Network (M-RNN). An M-RNN differs from a bi-directional RNN in that it sequentially operates across streams in addition to within streams, and because the timing of inputs into the hidden layers is both lagged and advanced. At runtime, the operator prescribes a performance level or a cost constraint, and Deep Sensing determines what measurements to take and what to infer from those measurements, and then issues predictions. To demonstrate the power of our method, we apply it to two real-world medical datasets with significantly improved performance. | Conference | ICLR | Deep learning, Time series analysis | Early warning systems, Missing Data Imputation, Risk & disease trajectories, Screening | |||
2018/04/30 00:00 | GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets | J. Yoon, J. Jordon, M. van der Schaar | 2018 | https://openreview.net/pdf?id=ByKWUeWA- | Estimating individualized treatment effects (ITE) is a challenging task due to the need for an individual's potential outcomes to be learned from biased data and without having access to the counterfactuals. We propose a novel method for inferring ITE based on the Generative Adversarial Nets (GANs) framework. Our method, termed Generative Adversarial Nets for inference of Individualized Treatment Effects (GANITE), is motivated by the possibility that we can capture the uncertainty in the counterfactual distributions by attempting to learn them using a GAN. We generate proxies of the counterfactual outcomes using a counterfactual generator, G, and then pass these proxies to an ITE generator, I, in order to train it. By modeling both of these using the GAN framework, we are able to infer based on the factual data, while still accounting for the unseen counterfactuals. We test our method on three real-world datasets (with both binary and multiple treatments) and show that GANITE outperforms state-of-the-art methods. | Conference | ICLR | Causal inference, Deep learning | ||||
2018/04/09 00:00 | Tree-based Bayesian Mixture Model for Competing Risks | A. Bellot, M. van der Schaar | 2018 | http://proceedings.mlr.press/v84/bellot18a.html | Many chronic diseases possess a shared biology. Therapies designed for patients at risk of multiple diseases need to account for the shared impact they may have on related diseases to ensure maximum overall well-being. Learning from data in this setting differs from classical survival analysis methods since the incidence of an event of interest may be obscured by other related competing events. We develop a semi-parametric Bayesian regression model for survival analysis with competing risks, which can be used for jointly assessing a patient’s risk of multiple (competing) adverse outcomes. We construct a Hierarchical Bayesian Mixture (HBM) model to describe survival paths in which a patient’s covariates influence both the estimation of the type of adverse event and the subsequent survival trajectory through Multivariate Random Forests. In addition, variable importance measures, which are essential for clinical interpretability, are induced naturally by our model. We aim with this setting to provide accurate individual estimates but also interpretable conclusions for use as a clinical decision support tool. We compare our method with various state-of-the-art benchmarks on both synthetic and clinical data. | Conference | AISTATS | Survival analysis competing risks & comorbidities | Risk & prognosis | | |
2018/01/01 00:00 | Forecasting Disease Trajectories in Alzheimer's Disease Using Deep Learning | B. Lim, M. van der Schaar | 2018 | https://arxiv.org/abs/1807.03159 | Joint models for longitudinal and time-to-event data are commonly used in longitudinal studies to forecast disease trajectories over time. Despite the many advantages of joint modeling, the standard forms suffer from limitations that arise from a fixed model specification and computational difficulties when applied to large datasets. We adopt a deep learning approach to address these limitations, enhancing existing methods with the flexibility and scalability of deep neural networks while retaining the benefits of joint modeling. Using data from the Alzheimer's Disease Neuroimaging Initiative, we show improvements in performance and scalability compared to traditional methods. | Conference | KDD Workshop on Machine Learning for Medicine and Healthcare | Deep learning, Time series analysis | Risk & disease trajectories | | |
2018/01/01 00:00 | A Hierarchical Bayesian Model for Personalized Survival Predictions | A. Bellot, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8353457 | We study the problem of personalizing survival estimates of patients in heterogeneous populations for clinical decision support. The desiderata are to improve predictions by making them personalized to the patient-at-hand, to better understand diseases and their risk factors, and to provide interpretable model outputs to clinicians. To enable accurate survival prognosis in heterogeneous populations we propose a novel probabilistic survival model which flexibly captures individual traits through a hierarchical latent variable formulation. Survival paths are estimated by jointly sampling the location and shape of the individual survival distribution resulting in patient-specific curves with quantifiable uncertainty estimates. An understanding of model predictions is paramount in medical practice where decisions have major social consequences. We develop a personalized interpreter that can be used to test the effect of covariates on each individual patient, in contrast to traditional methods that focus on population average effects. We extensively validated the proposed approach in various clinical settings, with a special focus on cardiovascular disease. | 10.1109/JBHI.2018.2832599 | Journal | IEEE Journal of Biomedical and Health Informatics | Survival analysis competing risks & comorbidities | Phenotyping & subgroup analysis, Risk & prognosis | ||
2018/01/01 00:00 | A Non-Stationary Online Learning Approach to Mobility Management | Y. Zhou, C. Shen, X. Luo, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8422163 | Efficient mobility management is an important problem in modern wireless networks with heterogeneous cell sizes and increased node densities. We show that optimization-based mobility protocols cannot achieve long-term optimal performance, particularly in a time-varying environment for ultra-dense networks. To address the complex system dynamics, especially the possible change of statistics due to user movement and environment changes, we propose piece-wise stationary online-learning algorithms to track the activities of small base stations and solve frequent handover (FHO) problems. The BASD/BASSW algorithms are proven to achieve sublinear regret performance over a finite time horizon and a linear, non-trivial rigorous bound for an infinite time horizon. We study the robustness of the BASD/BASSW algorithms under missing feedback. Simulations show that the proposed algorithms can outperform 3GPP protocols with the best threshold, and tend to be more robust than 3GPP to various dynamics which are common in practical ultra-dense wireless networks. | 10.1109/ICC.2018.8422163 | Conference | IEEE International Conference on Communications (ICC) Wireless Networking Symposium | Communications and Networks | | |
2018/01/01 00:00 | ACW-RNN: Adaptive Clockwork Recurrent Neural Networks for Early Warning Systems in Hospitals | Q. Feng, J. Yoon, M. van der Schaar | 2018 | https://www.vanderschaar-lab.com/papers/AIMed_Abstract_Adaptive_CWRNN.pdf | Early Warning Systems (EWS), which use physiological datastreams to timely predict clinical deterioration for patients in regular wards and Intensive Care Units (ICUs), have been shown to have life-saving impact. We propose a novel Recurrent Neural Network (RNN) architecture that can address the unique challenges of implementing EWS in hospitals: 1) learning from multi-variate physiological datastreams spanning long periods of time; 2) learning from imbalanced data, where only a small percentage of patients are experiencing adverse events; 3) learning from the multi-variate physiological datastreams in different hospital settings (regular wards as compared to ICU), where more types of measurements are collected and more frequently, and the patterns of clinical deterioration often differ. Our model is based on Clockwork RNN (CW-RNN), which effectively captures temporal correlations by learning multiresolution representations, but goes one step further. While CW-RNN learns only fixed multiresolution representations, our proposed model, dubbed Adaptive Clockwork RNN (ACW-RNN), can learn adaptive multi-resolution representations based on the temporal correlations between physiological datastreams and their impact on clinical deterioration. This enables ACW-RNN to effectively learn various patterns of deterioration from small and imbalanced datasets. We show that ACW-RNN can effectively operate in both regular wards and ICUs, and that it achieves large performance improvements over state-of-the-art deep learning models as well as existing clinical risk scores. To the best of our knowledge, this is the first deep learning solution which has been shown to issue timely and accurate predictions for clinical deterioration in both regular wards and ICU. | Other | Early warning systems, Risk & disease trajectories | | | | |
2018/01/01 00:00 | Adaptive Contextual Learning for Unit Commitment in Microgrids with Renewable Energy Sources | H.-S. Lee, C. Tekin, M. van der Schaar, J. W. Lee | 2018 | https://ieeexplore.ieee.org/document/8392717 | In this paper, we study a unit commitment (UC) problem where the goal is to minimize the operating costs of a microgrid that involves renewable energy sources. Since traditional UC algorithms use a priori information about uncertainties such as the load demand and the renewable power outputs, their performances highly depend on the accuracy of the a priori information, especially in microgrids due to their limited scale and size. This makes the algorithms impractical in settings where the past data are not sufficient to construct an accurate prior of the uncertainties. To resolve this issue, we develop an adaptively partitioned contextual learning algorithm for UC (AP-CLUC) that learns the best UC schedule and minimizes the total cost over time in an online manner without requiring any a priori information. AP-CLUC effectively learns the effects of the uncertainties on the cost by adaptively considering context information strongly correlated with the uncertainties, such as the past load demand and weather conditions. For AP-CLUC, we first prove an analytical bound on the performance, which shows that its average total cost converges to that of the optimal policy with perfect a priori information. Then, we show via simulations that AP-CLUC achieves competitive performance with respect to the traditional UC algorithms with perfect a priori information, and it achieves better performance than them even with small errors on the information. These results demonstrate the effectiveness of utilizing the context information and the adaptive management of the past data for the UC problem. | 10.1109/JSTSP.2018.2849855 | Journal | IEEE Journal of Selected Topics in Signal Processing | Multi-agent learning | |||
2018/01/01 00:00 | Bayesian Nonparametric Causal Inference: Information Rates and Learning Algorithms | A. M. Alaa, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8387845 | We investigate the problem of estimating the causal effect of a treatment on individual subjects from observational data; this is a central problem in various application domains, including healthcare, social sciences, and online advertising. Within the Neyman-Rubin potential outcomes model, we use the Kullback-Leibler (KL) divergence between the estimated and true distributions as a measure of accuracy of the estimate, and we define the information rate of the Bayesian causal inference procedure as the (asymptotic equivalence class of the) expected value of the KL divergence between the estimated and true distributions as a function of the number of samples. Using Fano's method, we establish a fundamental limit on the information rate that can be achieved by any Bayesian estimator, and show that this fundamental limit is independent of the selection bias in the observational data. We characterize the Bayesian priors on the potential (factual and counterfactual) outcomes that achieve the optimal information rate. We go on to propose a prior adaptation procedure (which we call the information-based empirical Bayes procedure) that optimizes the Bayesian prior by maximizing an information-theoretic criterion on the recovered causal effects rather than maximizing the marginal likelihood of the observed (factual) data. Building on our analysis, we construct an information-optimal Bayesian causal inference algorithm. This algorithm embeds the potential outcomes in a vector-valued reproducing Kernel Hilbert space, and uses a multitask Gaussian process prior over that space to infer the individualized causal effects. We show that for such a prior, the proposed information-based empirical Bayes method adapts the smoothness of the multitask Gaussian process to the true smoothness of the causal effect function by balancing a tradeoff between the factual bias and the counterfactual variance. We conduct experiments on a well-known real-world dataset and show that our model significantly outperforms the state-of-the-art causal inference models. | 10.1109/JSTSP.2018.2848230 | Journal | IEEE Journal of Selected Topics in Signal Processing | Causal inference | |||
2018/01/01 00:00 | Boosted Trees for Risk Prognosis | A. Bellot, M. van der Schaar | 2018 | http://proceedings.mlr.press/v85/bellot18a.html | We present a new approach to ensemble learning for risk prognosis in heterogeneous medical populations. Our aim is to improve overall prognosis by focusing on under-represented patient subgroups with an atypical disease presentation; with current prognostic tools, these subgroups are being consistently mis-estimated. Our method proceeds sequentially by learning nonparametric survival estimators which iteratively learn to improve predictions of previously misdiagnosed patients - a process called boosting. This results in fully nonparametric survival estimates, that is, constrained neither by assumptions regarding the baseline hazard nor assumptions regarding the underlying covariate interactions - and thus differentiating our approach from existing boosting methods for survival analysis. In addition, our approach yields a measure of the relative covariate importance that accurately identifies relevant covariates within complex survival dynamics, thereby informing further medical understanding of disease interactions. We study the properties of our approach on a variety of heterogeneous medical datasets, demonstrating significant performance improvements over existing survival and ensemble methods. | Conference | Machine Learning for Healthcare Conference (MLHC) | Ensemble learning | Risk & prognosis | |||
2018/01/01 00:00 | Constructing Effective Personalized Policies Using Counterfactual Inference from Biased Data Sets with Many Features | O. Atan, W. R. Zame, Q. Feng, M. van der Schaar | 2018 | https://link.springer.com/article/10.1007%2Fs10994-018-5768-3 | This paper proposes a novel approach for constructing effective personalized policies when the observed data lacks counterfactual information, is biased and possesses many features. The approach is applicable in a wide variety of settings from healthcare to advertising to education to finance. These settings have in common that the decision maker can observe, for each previous instance, an array of features of the instance, the action taken in that instance, and the reward realized—but not the rewards of actions that were not taken: the counterfactual information. Learning in such settings is made even more difficult because the observed data is typically biased by the existing policy (that generated the data) and because the array of features that might affect the reward in a particular instance—and hence should be taken into account in deciding on an action in each particular instance—is often vast. The approach presented here estimates propensity scores for the observed data, infers counterfactuals, identifies a (relatively small) number of features that are (most) relevant for each possible action and instance, and prescribes a policy to be followed. Comparison of the proposed algorithm against state-of-the-art algorithms on actual datasets demonstrates that the proposed algorithm achieves a significant improvement in performance. | 10.1007/s10994-018-5768-3 | Journal | Machine Learning | Causal inference, Feature selection, Reinforcement learning | | |
2018/01/01 00:00 | Context-Aware Hierarchical Online Learning for Performance Maximization in Mobile Crowdsourcing | S. Muller, C. Tekin, M. van der Schaar, A. Klein | 2018 | https://ieeexplore.ieee.org/document/8353505 | In mobile crowdsourcing (MCS), mobile users accomplish outsourced human intelligence tasks. MCS requires an appropriate task assignment strategy, since different workers may have different performance in terms of acceptance rate and quality. Task assignment is challenging, since a worker's performance 1) may fluctuate, depending on both the worker's current personal context and the task context and 2) is not known a priori, but has to be learned over time. Moreover, learning context-specific worker performance requires access to context information, which may not be available at a central entity due to communication overhead or privacy concerns. In addition, evaluating worker performance might require costly quality assessments. In this paper, we propose a context-aware hierarchical online learning algorithm addressing the problem of performance maximization in MCS. In our algorithm, a local controller (LC) in the mobile device of a worker regularly observes the worker's context, her/his decisions to accept or decline tasks and the quality in completing tasks. Based on these observations, the LC regularly estimates the worker's context-specific performance. The mobile crowdsourcing platform (MCSP) then selects workers based on performance estimates received from the LCs. This hierarchical approach enables the LCs to learn context-specific worker performance and it enables the MCSP to select suitable workers. In addition, our algorithm preserves worker context locally, and it keeps the number of required quality assessments low. We prove that our algorithm converges to the optimal task assignment strategy. Moreover, the algorithm outperforms simpler task assignment strategies in experiments based on synthetic and real data. | 10.1109/TNET.2018.2828415 | Journal | IEEE/ACM Transactions on Networking | Communications and Networks | |||
2018/01/01 00:00 | Counterfactual Policy Optimization Using Domain-Adversarial Neural Networks | O. Atan, W. R. Zame, M. van der Schaar | 2018 | https://www.vanderschaar-lab.com/papers/cf_treat_v5 | Choosing optimal (or at least better) policies is an important problem in domains from medicine to education to finance and many others. One approach to this problem is through controlled experiments/trials - but controlled experiments are expensive. Hence it is important to choose the best policies on the basis of observational data. This presents two difficult challenges: (i) missing counterfactuals, and (ii) selection bias. This paper presents theoretical bounds on estimation errors of counterfactuals from observational data by making connections to domain adaptation theory. It also presents a principled way of choosing optimal policies using domain adversarial neural networks. We illustrate the effectiveness of domain adversarial training, together with various features of our algorithm, on a semi-synthetic breast cancer dataset. | Conference | ICML Causal Machine Learning Workshop | Causal inference, Deep learning, Reinforcement learning | | | |
2018/01/01 00:00 | Coupled Markov-switching regression: inference and a case study using electronic health record data | J. Pohle, R. King, M. van der Schaar, R. Langrock | 2018 | https://www.vanderschaar-lab.com/papers/IWSM2018_coupled%20Markov%20switching%20regression.pdf | Coupled hidden Markov models (HMMs) are designed to capture the structure of multivariate time series whose underlying latent state variables interact, but do not evolve synchronously. Here we extend coupled HMMs to allow for covariates in the observed time series, which leads to the class of coupled Markov-switching regression models. The method is applied to electronic health record data of 702 patients from an intensive care unit at UCLA, where the aim is to gain a better understanding of the course of a disease as well as early-warning signs of potentially critical developments. | Conference | International Workshop on Statistical Modeling (IWSM) | Time series analysis | | | |
2018/01/01 00:00 | Deep-Treat: Learning Optimal Personalized Treatments from Observational Data using Neural Networks | O. Atan, J. Jordon, M. van der Schaar | 2018 | https://ojs.aaai.org/index.php/AAAI/article/view/11841 | We propose a novel approach for constructing effective treatment policies when the observed data is biased and lacks counterfactual information. Learning in settings where the observed data does not contain all possible outcomes for all treatments is difficult since the observed data is typically biased due to existing clinical guidelines. This is an important problem in the medical domain as collecting unbiased data is expensive and so learning from the wealth of existing biased data is a worthwhile task. Our approach separates the problem into two stages: first we reduce the bias by learning a representation map using a novel auto-encoder network---this allows us to control the trade-off between the bias-reduction and the information loss---and then we construct effective treatment policies on the transformed data using a novel feedforward network. Separation of the problem into these two stages creates an algorithm that can be adapted to the problem at hand---the bias-reduction step can be performed as a preprocessing step for other algorithms. We compare our algorithm against state-of-the-art algorithms on two semi-synthetic datasets and demonstrate that our algorithm achieves a significant improvement in performance. | Conference | AAAI | Causal inference, Deep learning, Reinforcement learning | Treatment & trials | | |
2018/01/01 00:00 | DeepHit: A Deep Learning Approach to Survival Analysis with Competing Risks | C. Lee, W. R. Zame, J. Yoon, M. van der Schaar | 2018 | https://ojs.aaai.org/index.php/AAAI/article/view/11842 | Survival analysis (time-to-event analysis) is widely used in economics and finance, engineering, medicine and many other areas. A fundamental problem is to understand the relationship between the covariates and the (distribution of) survival times (times-to-event). Much of the previous work has approached the problem by viewing the survival time as the first hitting time of a stochastic process, assuming a specific form for the underlying stochastic process, using available data to learn the relationship between the covariates and the parameters of the model, and then deducing the relationship between covariates and the distribution of first hitting times (the risk). However, previous models rely on strong parametric assumptions that are often violated. This paper proposes a very different approach to survival analysis, DeepHit, that uses a deep neural network to learn the distribution of survival times directly. DeepHit makes no assumptions about the underlying stochastic process and allows for the possibility that the relationship between covariates and risk(s) changes over time. Most importantly, DeepHit smoothly handles competing risks; i.e. settings in which there is more than one possible event of interest. Comparisons with previous models on the basis of real and synthetic datasets demonstrate that DeepHit achieves large and statistically significant performance improvements over previous state-of-the-art methods. | Conference | AAAI | Deep learning, Survival analysis competing risks & comorbidities | Risk & prognosis | | |
2018/01/01 00:00 | Disease-Atlas: Navigating Disease Trajectories using Deep Learning | B. Lim, M. van der Schaar | 2018 | http://proceedings.mlr.press/v85/lim18a.html | Joint models for longitudinal and time-to-event data are commonly used in longitudinal studies to forecast disease trajectories over time. While there are many advantages to joint modeling, the standard forms suffer from limitations that arise from a fixed model specification, and computational difficulties when applied to high-dimensional datasets. In this paper, we propose a deep learning approach to address these limitations, enhancing existing methods with the inherent flexibility and scalability of deep neural networks, while retaining the benefits of joint modeling. Using longitudinal data from a real-world medical dataset, we demonstrate improvements in performance and scalability, as well as robustness in the presence of irregularly sampled data. | Conference | Machine Learning for Healthcare Conference (MLHC) | Deep learning, Time series analysis | Risk & disease trajectories, Risk & prognosis, Screening | |||
2018/01/01 00:00 | Distributed Task Management in Cyber-Physical Systems: How to Cooperate under Uncertainty? | S. Maghsudi, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8585045 | We consider the problem of task allocation in a network of cyber-physical systems (CPSs). The network can have different states, and the tasks are of different types. The task arrival is stochastic and state-dependent. Every CPS is capable of performing each type of task with some specific state-dependent efficiency. The CPSs have to agree on task allocation prior to knowing about the realized network's state and/or the arrived tasks. We model the problem as a multi-state stochastic cooperative game with state uncertainty. We then use the concept of deterministic equivalence and sequential core to solve the problem. We establish the non-emptiness of the strong sequential core in our designed task allocation game and investigate its characteristics including uniqueness and optimality. Moreover, we prove that in the task allocation game, the strong sequential core is equivalent to Walrasian equilibrium under state uncertainty; consequently, it can be implemented by using the Walras' tatonnement process. | 10.1109/TCCN.2018.2888970 | Journal | IEEE Transactions on Cognitive Communications and Networking | Multi-agent learning | Communications and Networks | ||
2018/01/01 00:00 | Distributed Task Management in Cyber-Physical Systems: How to Cooperate under Uncertainty? | S. Maghsudi, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8647643 | We consider the problem of task allocation in a network of cyber-physical systems (CPSs). The network can have different states, and the tasks are of different types. The task arrival is stochastic and state-dependent. Every CPS is capable of performing each type of task with some specific state-dependent efficiency. The CPSs have to agree on task allocation prior to knowing about the realized network's state and/or the arrived tasks. We model the problem as a multistate stochastic cooperative game with state uncertainty. We then use the concept of deterministic equivalence and sequential core to solve the problem. We establish the non-emptiness of the strong sequential core in our designed task allocation game and investigate its characteristics including uniqueness and optimality. Moreover, we prove that in the task allocation game, the strong sequential core is equivalent to Walrasian equilibrium under state uncertainty; consequently, it can be implemented by using the Walras' tatonnement process. | 10.1109/GLOCOM.2018.8647643 | Conference | IEEE Global Communications Conference (GLOBECOM) Ad Hoc and Sensor Networking Symposium (AHSN) | Multi-agent learning | Communications and Networks | ||
2018/01/01 00:00 | Estimating Missing Data in Temporal Data Streams Using Multi-directional Recurrent Neural Networks | J. Yoon, W. R. Zame, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8485748 | Missing data is a ubiquitous problem. It is especially challenging in medical settings because many streams of measurements are collected at different (and often irregular) times. Accurate estimation of the missing measurements is critical for many reasons, including diagnosis, prognosis, and treatment. Existing methods address this estimation problem by interpolating within data streams or imputing across data streams (both of which ignore important information) or ignoring the temporal aspect of the data and imposing strong assumptions about the nature of the data-generating process and/or the pattern of missing data (both of which are especially problematic for medical data). We propose a new approach, based on a novel deep learning architecture that we call a Multi-directional Recurrent Neural Network that interpolates within data streams and imputes across data streams. We demonstrate the power of our approach by applying it to five real-world medical datasets. We show that it provides dramatically improved estimation of missing measurements in comparison to 11 state-of-the-art benchmarks (including Spline and Cubic Interpolations, MICE, MissForest, matrix completion, and several RNN methods); typical improvements in Root Mean Squared Error are between 35% and 50%. Additional experiments based on the same five datasets demonstrate that the improvements provided by our method are extremely robust. | 10.1109/TBME.2018.2874712 | Journal | IEEE Transactions on Biomedical Engineering | Deep learning, Time series analysis | Early warning systems, Missing data imputation, Risk & disease trajectories, Screening | |
2018/01/01 00:00 | Estimation of Individual Treatment Effect in Latent Confounder Models via Adversarial Learning | C. Lee, N. Mastronarde, M. van der Schaar | 2018 | https://arxiv.org/abs/1811.08943 | Estimating the individual treatment effect (ITE) from observational data is essential in medicine. A central challenge in estimating the ITE is handling confounders, which are factors that affect both an intervention and its outcome. Most previous work relies on the unconfoundedness assumption, which posits that all the confounders are measured in the observational data. However, if there are unmeasurable (latent) confounders, then confounding bias is introduced. Fortunately, noisy proxies for the latent confounders are often available and can be used to make an unbiased estimate of the ITE. In this paper, we develop a novel adversarial learning framework to make unbiased estimates of the ITE using noisy proxies. | Conference | NeurIPS Machine Learning for Health Workshop | Causal inference, Deep learning | Treatment & trials | |||
2018/01/01 00:00 | Feature Selection for Survival Analysis with Competing Risks using Deep Learning | C. Rietschel, J. Yoon, M. van der Schaar | 2018 | https://arxiv.org/abs/1811.09317 | Deep learning models for survival analysis have gained significant attention in the literature, but they suffer from severe performance deficits when the dataset contains many irrelevant features. We give empirical evidence for this problem in real-world medical settings using the state-of-the-art model DeepHit. Furthermore, we develop methods to improve the deep learning model through novel approaches to feature selection in survival analysis. We propose filter methods for hard feature selection and a neural network architecture that weights features for soft feature selection. Our experiments on two real-world medical datasets demonstrate that substantial performance improvements against the original models are achievable. | Conference | NeurIPS Machine Learning for Health Workshop | Deep learning, Feature selection | Risk & prognosis | |||
2018/01/01 00:00 | Finding It Now: Networked Classifiers in Real-Time Stream Mining Systems | R. Ducasse, C. Tekin, M. van der Schaar | 2018 | https://link.springer.com/chapter/10.1007/978-3-319-91734-4_3 | The aim of this chapter is to describe and optimize the specifications of signal processing systems, aimed at extracting in real time valuable information out of large-scale decentralized datasets. A first section will explain the motivations and stakes and describe key characteristics and challenges of stream mining applications. We then formalize an analytical framework which will be used to describe and optimize distributed stream mining knowledge extraction from large-scale streams. In stream mining applications, classifiers are organized into a connected topology mapped onto a distributed infrastructure. We will study linear chains and optimize the ordering of the classifiers to increase the accuracy of classification and minimize delay. We then present a decentralized decision framework for joint topology construction and local classifier configuration. In many cases, the accuracies of classifiers are not known beforehand. In the last section, we look at how to learn the classifiers' characteristics online without increasing computation overhead. Stream mining is an active field of research, at the crossing of various disciplines, including multimedia signal processing, distributed systems, machine learning, etc. As such, we will indicate several areas for future research and development. | 10.1007/978-3-319-91734-4 | Chapter | Handbook of Signal Processing Systems | | | |
2018/01/01 00:00 | Forecasting Individualized Disease Trajectories using Interpretable Deep Learning | A. M. Alaa, M. van der Schaar | 2018 | https://arxiv.org/abs/1810.10489 | Disease progression models are instrumental in predicting individual-level health trajectories and understanding disease dynamics. Existing models are capable of providing either accurate predictions of patients' prognoses or clinically interpretable representations of disease pathophysiology, but not both. In this paper, we develop the phased attentive state space (PASS) model of disease progression, a deep probabilistic model that captures complex representations for disease progression while maintaining clinical interpretability. Unlike Markovian state space models which assume memoryless dynamics, PASS uses an attention mechanism to induce "memoryful" state transitions, whereby repeatedly updated attention weights are used to focus on past state realizations that best predict future states. This gives rise to complex, non-stationary state dynamics that remain interpretable through the generated attention weights, which designate the relationships between the realized state variables for individual patients. PASS uses phased LSTM units (with time gates controlled by parametrized oscillations) to generate the attention weights in continuous time, which enables handling irregularly-sampled and potentially missing medical observations. Experiments on data from a real-world cohort of patients show that PASS successfully balances the tradeoff between accuracy and interpretability: it demonstrates superior predictive accuracy and learns insightful individual-level representations of disease progression. | Other | Risk & disease trajectories | | | | |
2018/01/01 00:00 | Generalized Global Bandit and Its Application in Cellular Coverage Optimization | C. Shen, R. Zhou, C. Tekin, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8269316 | Motivated by the engineering problem of cellular coverage optimization, we propose a novel multiarmed bandit model called generalized global bandit. We develop a series of greedy algorithms that have the capability to handle nonmonotonic but decomposable reward functions, multidimensional global parameters, and switching costs. The proposed algorithms are rigorously analyzed under the multiarmed bandit framework, where we show that they achieve bounded regret, and hence, they are guaranteed to converge to the optimal arm in finite time. The algorithms are then applied to the cellular coverage optimization problem to achieve the optimal tradeoff between sufficient small cell coverage and limited macroleakage without prior knowledge of the deployment environment. The performance advantage of the new algorithms over existing bandit solutions is revealed analytically and further confirmed via numerical simulations. The key element behind the performance improvement is a more efficient "trial and error" mechanism, in which any trial will help improve the knowledge of all candidate power levels. | 10.1109/JSTSP.2018.2798164 | Journal | IEEE Journal of Selected Topics in Signal Processing | Reinforcement learning | Communications and Networks | |
2018/01/01 00:00 | Global Bandits | O. Atan, C. Tekin, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8337089 | Multiarmed bandits (MABs) model sequential decision-making problems, in which a learner sequentially chooses arms with unknown reward distributions in order to maximize its cumulative reward. Most of the prior works on MAB assume that the reward distributions of each arm are independent. But in a wide variety of decision problems- from drug dosage to dynamic pricing-the expected rewards of different arms are correlated, so that selecting one arm provides information about the expected rewards of other arms as well. We propose and analyze a class of models of such decision problems, which we call global bandits (GB). In the case in which rewards of all arms are deterministic functions of a single unknown parameter, we construct a greedy policy that achieves bounded regret, with a bound that depends on the single true parameter of the problem. Hence, this policy selects suboptimal arms only finitely many times with probability one. For this case, we also obtain a bound on regret that is independent of the true parameter; this bound is sublinear, with an exponent that depends on the informativeness of the arms. We also propose a variant of the greedy policy that achieves Õ(√T) worst case and O(1) parameter-dependent regret. Finally, we perform experiments on dynamic pricing and show that the proposed algorithms achieve significant gains with respect to the well-known benchmarks. | 10.1109/TNNLS.2018.2818742 | Journal | IEEE Transactions on Neural Networks and Learning Systems | Reinforcement learning | |||
2018/01/01 00:00 | Late PCI in STEMI: A Complex Interaction between Delay and Age | R. Bugiardini, E. Cenko, J. Yoon, B. Ricci, D. Miličić, S. Kedev, Z. Vasiljevic, O. Manfrini, M. van der Schaar, L. Badimon | 2018 | http://www.onlinejacc.org/content/71/11_Supplement/A44 | Background: Data supporting the use of percutaneous coronary intervention (PCI) in ST-segment elevation myocardial infarction (STEMI) beyond a 12-hour cut-off are sparse and contradictory. We aimed to investigate whether different delays to hospital presentation and patient clinical characteristics may interfere with PCI, resulting in a heterogeneous treatment effect in a population of clinically stable patients with STEMI presenting 12-48 hours after symptom onset. Methods: Cohort study using a population-based registry (ISACS-TC, NCT01218776) consisting of 2,730 clinically stable patients with STEMI who presented 12-48 hours after symptom onset undergoing PCI or routine medical treatment (RMT). To adjust for confounding, we used inverse probability of treatment weighted (IPTW) models. Effect modifier changes were estimated by tests for interaction. The primary outcome was the composite of 30-day all-cause mortality and severe left ventricular dysfunction (EF<30%). The secondary outcome was 30-day all-cause mortality. Results: Baseline characteristics were well matched between PCI and RMT groups. There were significant interactions for the primary outcome (p=0.001) and the secondary outcome (p=0.006) with time from symptom onset to hospital presentation (>12-24 hours or ≥25-48 hours). Patients were then stratified based on their time from symptom onset to hospital presentation into two delay cohorts. Among patients presenting ≥25-48 hours after symptom onset, the primary outcome occurred in 5.4% of patients undergoing PCI compared with 8.7% of patients managed with RMT (OR: 0.60, 95% CI 0.40-0.89). This benefit was driven mainly by a reduction in mortality with PCI (1.8% vs 5.7%; OR: 0.30, 95% CI 0.16-0.58). Estimates of PCI effect in the 25-48-hour sample varied with age (OR: 0.26 for patients ≥65 years old and 1.33 for those younger; interaction test P<0.001). There was no difference in outcomes of patients undergoing PCI between 12 and 24 hours. Conclusion: Clinically stable STEMI patients may benefit from PCI beyond the recommended 12-hour cut-off. The advantage varies considerably depending on PCI-related delay and patients’ age. | 10.1016/S0735-1097(18)30585-0 | Journal | Journal of the American College of Cardiology | Treatment & trials | | |
2018/01/01 00:00 | Machine Learning for Identifying the Value of Digital Breast Tomosynthesis using Data from a Multicentre Retrospective Study | A. M. Alaa, F. J. Gilbert, Y. Huang, M. van der Schaar | 2018 | https://www.vanderschaar-lab.com/papers/RSNA_2018.pdf | We sought to identify subgroups of women for whom digital breast tomosynthesis (DBT) showed improved diagnostic accuracy for different types of malignant lesions than 2D mammography. The study used multicenter retrospective data from 6,040 women (934 biopsy-confirmed cancers) who underwent both DBT and 2D mammography. An ensemble of 20 state-of-the-art machine learning models was created to predict biopsy outcomes based on radiological classification of 2D and DBT images, Volpara breast composition measures and age. We used this ensemble to assess the diagnostic accuracy of DBT- and 2D-based predictors, to identify subgroups of women for whom DBT is more informative, and to quantify the value of the individual predictors with respect to different types of malignant lesions. | Other | Medical imaging | |||||
2018/01/01 00:00 | Machine learning for individualised medicine | M. van der Schaar, W. R. Zame | 2018 | https://www.vanderschaar-lab.com/papers/Annual_Report_CMO_2018.pdf | In 2015, Eric Topol presented persuasive evidence and arguments that current advances in medical technology – and further advances that are likely to come in the very near future (including wearable devices, faster and cheaper means of genomic sequencing) – will enormously increase the amount of data that is available for individual patients. But as Topol points out, if we wish to use this data to predict, prevent and treat illness – especially chronic illness – data will not be enough: “we want answers, not just data.” In particular, we need predictive power. And as Topol also points out, “such predictive power must rely on machine learning.” | Chapter | Annual Report of Chief Medical Officer, Department of Health and Social Care, United Kingdom | |||||
2018/01/01 00:00 | MAMMO: A Deep Learning Solution for Facilitating Radiologist-Machine Collaboration in Breast Cancer Diagnosis | T. Kyono, F. J. Gilbert, M. van der Schaar | 2018 | https://arxiv.org/abs/1811.02661 | With an aging and growing population, the number of women requiring either screening or symptomatic mammograms is increasing. To reduce the number of mammograms that need to be read by a radiologist while keeping the diagnostic accuracy the same or better than current clinical practice, we develop Man and Machine Mammography Oracle (MAMMO) - a clinical decision support system capable of triaging mammograms into those that can be confidently classified by a machine and those that cannot be, thus requiring the reading of a radiologist. The first component of MAMMO is a novel multi-view convolutional neural network (CNN) with multi-task learning (MTL). MTL enables the CNN to learn the radiological assessments known to be associated with cancer, such as breast density, conspicuity, suspicion, etc., in addition to learning the primary task of cancer diagnosis. We show that MTL has two advantages: 1) learning refined feature representations associated with cancer improves the classification performance of the diagnosis task and 2) issuing radiological assessments provides an additional layer of model interpretability that a radiologist can use to debug and scrutinize the diagnoses provided by the CNN. The second component of MAMMO is a triage network, which takes as input the radiological assessment and diagnostic predictions of the first network's MTL outputs and determines which mammograms can be correctly and confidently diagnosed by the CNN and which mammograms cannot, thus needing to be read by a radiologist. Results obtained on a private dataset of 8,162 patients show that MAMMO reduced the number of radiologist readings by 42.8% while improving the overall diagnostic accuracy in comparison to readings done by radiologists alone. We analyze the triage of patients decided by MAMMO to gain a better understanding of what unique mammogram characteristics require radiologists' expertise. | Other | Deep learning | Medical imaging, Screening | ||||
2018/01/01 00:00 | MATCH-Net: Dynamic Prediction in Survival Analysis using Convolutional Neural Networks | D. Jarrett, J. Yoon, M. van der Schaar | 2018 | https://arxiv.org/abs/1811.10746 | Accurate prediction of disease trajectories is critical for early identification and timely treatment of patients at risk. Conventional methods in survival analysis are often constrained by strong parametric assumptions and limited in their ability to learn from high-dimensional data, while existing neural network models are not readily adapted to the longitudinal setting. This paper develops a novel convolutional approach that addresses these drawbacks. We present MATCH-Net: a Missingness-Aware Temporal Convolutional Hitting-time Network, designed to capture temporal dependencies and heterogeneous interactions in covariate trajectories and patterns of missingness. To the best of our knowledge, this is the first investigation of temporal convolutions in the context of dynamic prediction for personalized risk prognosis. Using real-world data from the Alzheimer's Disease Neuroimaging Initiative, we demonstrate state-of-the-art performance without making any assumptions regarding underlying longitudinal or time-to-event processes, attesting to the model's potential utility in clinical decision support. | Conference | NeurIPS Machine Learning for Health Workshop | Deep learning, Time series analysis | Risk & disease trajectories, Screening | | |
2018/01/01 00:00 | Measuring the quality of Synthetic data for use in competitions | J. Jordon, J. Yoon, M. van der Schaar | 2018 | https://arxiv.org/abs/1806.11345 | Machine learning has the potential to assist many communities in using the large datasets that are becoming more and more available. Unfortunately, much of that potential is not being realized because it would require sharing data in a way that compromises privacy. In order to overcome this hurdle, several methods have been proposed that generate synthetic data while preserving the privacy of the real data. In this paper we consider a key characteristic that synthetic data should have in order to be useful for machine learning researchers - the relative performance of two algorithms (trained and tested) on the synthetic dataset should be the same as their relative performance (when trained and tested) on the original dataset. | Conference | KDD Workshop on Machine Learning for Medicine and Healthcare | Deep learning, Privacy-preserving ML & synthetic data | ||||
2018/01/01 00:00 | Mnemosyne: A Decision Support System for Early Detection of Dementia | A. M. Alaa, D. J. Llewellyn, C. Routledge, M. van der Schaar | 2018 | https://www.vanderschaar-lab.com/papers/dementia.pdf | Determining the underlying etiology of dementia can be challenging. Computer-based Clinical Decision Support Systems (CDSS) have the potential to provide an objective comparison of data and assist clinicians. We aimed to assess the diagnostic impact of a CDSS, the PredictND tool, for differential diagnosis of dementia in memory clinics. In this prospective multicenter study, we recruited 779 patients with either subjective cognitive decline (n=252), mild cognitive impairment (n=219) or any type of dementia (n=274) and followed them for a minimum of 12 months. Based on all available patient baseline data (demographics, neuropsychological tests, cerebrospinal fluid biomarkers, and MRI visual and computed ratings), the PredictND tool provides a comprehensive overview and analysis of the data with a likelihood index for five diagnostic groups: Alzheimer's disease, vascular dementia, dementia with Lewy bodies, frontotemporal dementia and subjective cognitive decline. At baseline, a clinician defined an etiological diagnosis and confidence in the diagnosis, first without and subsequently with the PredictND tool. The follow-up diagnosis was used as the reference diagnosis. In total, 747 patients completed the follow-up visits (53% female, 69±10 years). The etiological diagnosis changed in 13% of all cases when using the PredictND tool, but the diagnostic accuracy did not change significantly. Confidence in the diagnosis, measured by a visual analogue scale (VAS, 0-100%), increased (ΔVAS=3.0%, p<0.0001), especially in correctly changed diagnoses (ΔVAS=7.2%, p=0.0011). Adding the PredictND tool to the diagnostic evaluation affected the diagnosis and increased clinicians' confidence in the diagnosis, indicating that CDSSs could aid clinicians in the differential diagnosis of dementia. | Other | | | | | |
2018/01/01 00:00 | Multiagent systems: learning, strategic behavior, cooperation, and network formation | C. Tekin, S. Zhang, J. Xu, M. van der Schaar | 2018 | https://www.sciencedirect.com/science/article/pii/B9780128136775000262 | Many applications ranging from crowdsourcing to recommender systems involve informationally decentralized agents repeatedly interacting with each other in order to reach their goals. These networked agents base their decisions on incomplete information, which they gather through interactions with their neighbors or through cooperation, which is often costly. This chapter presents a discussion on decentralized learning algorithms that enable the agents to achieve their goals through repeated interaction. First, we discuss cooperative online learning algorithms that help the agents to discover beneficial connections with each other and exploit these connections to maximize the reward. For this case, we explain the relation between the learning speed, network topology, and cooperation cost. Then, we focus on how informationally decentralized agents form cooperation networks through learning. We explain how learning features prominently in many real-world interactions, and greatly affects the evolution of social networks. Links that otherwise would not have formed may now appear, and a much greater variety of network configurations can be reached. We show that the impact of learning on efficiency and social welfare could be either positive or negative. We also demonstrate the use of the aforementioned methods in popularity prediction, recommender systems, expert selection, and multimedia content aggregation. | 10.1016/B978-0-12-813677-5.00026-2 | Chapter | Cooperative and Graph Signal Processing | | | |
2018/01/01 00:00 | Online Learning in Limit Order Book Trade Execution | N. Akbarzadeh, C. Tekin, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8416773 | In this paper, we propose an online learning algorithm for optimal execution in the limit order book of a financial asset. Given a certain number of shares to sell and an allocated time window to complete the transaction, the proposed algorithm dynamically learns the optimal number of shares to sell via market orders at prespecified time slots within the allocated time interval. We model this problem as a Markov Decision Process (MDP), which is then solved by dynamic programming. First, we prove that the optimal policy has a specific form, which requires either selling no shares or the maximum allowed amount of shares at each time slot. Then, we consider the learning problem, in which the state transition probabilities are unknown and need to be learned on the fly. We propose a learning algorithm that exploits the form of the optimal policy when choosing the amount to trade. Interestingly, this algorithm achieves bounded regret with respect to the optimal policy computed based on the complete knowledge of the market dynamics. Our numerical results on several finance datasets show that the proposed algorithm performs significantly better than the traditional Q-learning algorithm by exploiting the structure of the problem. | 10.1109/TSP.2018.2858188 | Journal | IEEE Transactions on Signal Processing | ||||
2018/01/01 00:00 | Personalized survival predictions via Trees of Predictors: An application to cardiac transplantation | J. Yoon, W. R. Zame, A. Banerjee, M. Cadeiras, A. M. Alaa, M. van der Schaar | 2018 | http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0194985 | Risk prediction is crucial in many areas of medical practice, such as cardiac transplantation, but existing clinical risk-scoring methods have suboptimal performance. We develop a novel risk prediction algorithm and test its performance on the database of all patients who were registered for cardiac transplantation in the United States during 1985-2015. We develop a new, interpretable, methodology (ToPs: Trees of Predictors) built on the principle that specific predictive (survival) models should be used for specific clusters within the patient population. ToPs discovers these specific clusters and the specific predictive model that performs best for each cluster. In comparison with existing clinical risk scoring methods and state-of-the-art machine learning methods, our method provides significant improvements in survival predictions, both post- and pre-cardiac transplantation. For instance: in terms of 3-month survival post-transplantation, our method achieves AUC of 0.660; the best clinical risk scoring method (RSS) achieves 0.587. In terms of 3-year survival/mortality predictions post-transplantation (in comparison to RSS), holding specificity at 80.0%, our algorithm correctly predicts survival for 2,442 (14.0%) more patients (of 17,441 who actually survived); holding sensitivity at 80.0%, our algorithm correctly predicts mortality for 694 (13.0%) more patients (of 5,339 who did not survive). ToPs achieves similar improvements for other time horizons and for predictions pre-transplantation. ToPs discovers the most relevant features (covariates), uses available features to best advantage, and can adapt to changes in clinical practice. We show that, in comparison with existing clinical risk-scoring methods and other machine learning methods, ToPs significantly improves survival predictions both post- and pre-cardiac transplantation. ToPs provides a more accurate, personalized approach to survival prediction that can benefit patients, clinicians, and policymakers in making clinical decisions and setting clinical policy. Because survival prediction is widely used in clinical decision-making across diseases and clinical specialties, the implications of our methods are far-reaching. | 10.1371/journal.pone.0194985 | Journal | PloS One | Ensemble learning | Phenotyping & subgroup analysis, Risk & prognosis | ||
2018/01/01 00:00 | Prognostication and Risk Factors for Cystic Fibrosis via Automated Machine Learning | A. M. Alaa, M. van der Schaar | 2018 | https://www.nature.com/articles/s41598-018-29523-2 | Accurate prediction of survival for cystic fibrosis (CF) patients is instrumental in establishing the optimal timing for referring patients with terminal respiratory failure for lung transplantation (LT). Current practice considers referring patients for LT evaluation once the forced expiratory volume (FEV1) drops below 30% of its predicted nominal value. While FEV1 is indeed a strong predictor of CF-related mortality, we hypothesized that the survival behavior of CF patients exhibits a lot more heterogeneity. To this end, we developed an algorithmic framework, which we call AutoPrognosis, that leverages the power of machine learning to automate the process of constructing clinical prognostic models, and used it to build a prognostic model for CF using data from a contemporary cohort that involved 99% of the CF population in the UK. AutoPrognosis uses Bayesian optimization techniques to automate the process of configuring ensembles of machine learning pipelines, which involve imputation, feature processing, classification and calibration algorithms. Because it is automated, it can be used by clinical researchers to build prognostic models without the need for in-depth knowledge of machine learning. Our experiments revealed that the accuracy of the model learned by AutoPrognosis is superior to that of existing guidelines and other competing models. | 10.1038/s41598-018-29523-2 | Journal | Nature Scientific Reports | Automated ML, Feature selection | Risk & prognosis, Scientific discovery | ||
2018/01/01 00:00 | Protein Family-specific Models using Deep Neural Networks and Transfer Learning Improve Virtual Screening and Highlight the Need for More Data | F. Imrie, A. Bradley, M. van der Schaar, C. Deane | 2018 | https://pubs.acs.org/doi/10.1021/acs.jcim.8b00350 | Machine learning has shown enormous potential for computer-aided drug discovery. Here we show how modern convolutional neural networks (CNNs) can be applied to structure-based virtual screening. We have coupled our densely connected CNN (DenseNet) with a transfer learning approach which we use to produce an ensemble of protein family-specific models. We conduct an in-depth empirical study and provide the first guidelines on the minimum requirements for adopting a protein family-specific model. Our method also highlights the need for additional data, even in data-rich protein families. Our approach outperforms recent benchmarks on the DUD-E data set and an independent test set constructed from the ChEMBL database. Using a clustered cross-validation on DUD-E, we achieve an average AUC ROC of 0.92 and a 0.5% ROC enrichment factor of 79. This represents an improvement in early enrichment of over 75% compared to a recent machine learning benchmark. Our results demonstrate that the continued improvements in machine learning architecture for computer vision apply to structure-based virtual screening. | 10.1021/acs.jcim.8b00350 | Journal | Journal of Chemical Information and Modeling | Deep learning, Transfer learning | |||
2018/01/01 00:00 | Repeated Network Games with Dominant Actions and Individual Rationality | Y. Song, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8485406 | Three kinds of activities happen in a network: agents link or unlink, agents act, and agents observe and learn. The existing literature on network formation, on network games, and on monitoring in networks studies each of these in isolation. We propose a framework in which these happen together. In our model, agents repeatedly choose to whom they link and the actions they take with their neighbors. Agents learn by observing links, actions of their neighbors, and (perhaps) from a public signal. We define a repeated network game and an equilibrium of this game. An essential feature is that the network formed and hence the games played are determined endogenously and may be different at each moment of time. Equilibrium networks and actions are strongly inter-dependent. When a dominant action is available to each agent and each agent's individual rationality is satisfied, we prove a Convergence Theorem that characterizes the networks and actions that persist in the steady state. We also show that endogenously formed networks with the associated action profiles often yield a higher social welfare than exogenously prescribed ones. Finally, we show that in the absence of an informative monitoring structure, cooperation in sparse networks may fail even when agents are patient. | 10.1109/TNSE.2018.2874485 | Journal | IEEE Transactions on Network Science and Engineering | Communications and Networks, Networks | | |
2018/01/01 00:00 | RNN-SURV: a Deep Recurrent Model for Survival Analysis | E. Giunchiglia, A. Nemchenko, M. van der Schaar | 2018 | https://link.springer.com/chapter/10.1007/978-3-030-01424-7_3 | Current medical practice is driven by clinical guidelines which are designed for the “average” patient. Deep learning is enabling medicine to become personalized to the patient at hand. In this paper we present a new recurrent neural network model for personalized survival analysis called rnn-surv. Our model is able to exploit censored data to compute both the risk score and the survival function of each patient. At each time step, the network takes as input the features characterizing the patient and the identifier of the time step, creates an embedding, and outputs the value of the survival function in that time step. Finally, the values of the survival function are linearly combined to compute the unique risk score. Thanks to the model structure and the training designed to exploit two loss functions, our model achieves a better concordance index (C-index) than state-of-the-art approaches. | 10.1007/978-3-030-01424-7_3 | Conference | International Conference on Artificial Neural Networks (ICANN) | Deep learning, Time series analysis | |
2018/01/01 00:00 | Sex difference in the impact of delay to reperfusion on coronary blood flow and outcomes in ST-segment elevation myocardial infarction | E. Cenko, O. Manfrini, S. Kedev, G. Stankovic, Z. Vasiljevic, M. van der Schaar, J. Yoon, M. Vavlukis, O. Kalpak, D. Miličić, A. Koller, L. Badimon, R. Bugiardini | 2018 | https://academic.oup.com/eurheartj/article/39/suppl_1/ehy564.P580/5081808 | Delay from symptom onset to reperfusion by primary percutaneous coronary intervention (PCI) is longer in women and has been linked to increased mortality and worse clinical outcome. The mechanism underlying this association is still unclear. We sought to investigate the impact of delay from symptom onset to hospital presentation on sex difference in TIMI flow grades and 30-day mortality after primary PCI for STEMI. The current study evaluated 2596 patients with STEMI who underwent primary PCI within 12 hours from symptom onset and had a stent implantation between 2010 and 2016 in the ISACS-TC registry (ClinicalTrials.gov, NCT01218776). Main outcome measures were adjusted 30-day mortality rates and suboptimal post-PCI TIMI (Thrombolysis In Myocardial Infarction) flow (grades ≤2) estimated using inverse probability of treatment weighted (IPTW) models. Time from symptom onset to hospital presentation was classified as <2 hours, <6 hours, and <12 hours. Early reperfusion (<2 hours) was not associated with significant sex differences in the rates of mortality and final post-PCI TIMI flow (grades ≤2). Sex differences in outcomes differed if analyzing patients with a ≥2-hour delay. Mortality rates were 4.0% for women versus 2.1% for men with an OR of 1.94 (95% CI: 1.09 to 3.47) in patients with <6 hours delay, and 4.6% for women versus 2.3% for men with an OR of 2.02 (95% CI: 1.24 to 3.27) in patients with <12 hours delay. The odds of TIMI ≤2 in women versus men were 1.40 (95% CI: 0.85 to 2.31) in patients with <6 hours delay, and 1.49 (95% CI: 0.99 to 2.24) in patients with <12 hours delay. Longer delays to reperfusion are associated with sex differences in the rates of 30-day mortality and worse outcome in women. Women are more vulnerable to prolonged untreated ischemia. This effect appears not to be mediated by less successful reperfusion. | 10.1093/eurheartj/ehy564.P580 | Journal | European Heart Journal | Treatment & trials | |
2018/01/01 00:00 | Sex Differences in Outcomes After STEMI: Effect Modification by Treatment Strategy and Age | E. Cenko, J. Yoon, S. Kedev, G. Stankovic, Z. Vasiljevic, G. Krljanac, O. Kalpak, B. Ricci, D. Miličić, O. Manfrini, M. van der Schaar, L. Badimon, R. Bugiardini | 2018 | https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2677062 | Previous works have shown that women hospitalized with ST-segment elevation myocardial infarction (STEMI) have higher short-term mortality rates than men. However, it is unclear if these differences persist among patients undergoing contemporary primary percutaneous coronary intervention (PCI). To investigate whether the risk of 30-day mortality after STEMI is higher in women than men and, if so, to assess the role of age, medications, and primary PCI in this excess of risk. From January 2010 to January 2016, a total of 8834 patients were hospitalized and received medical treatment for STEMI in 41 hospitals referring data to the International Survey of Acute Coronary Syndromes in Transitional Countries (ISACS-TC) registry (NCT01218776). Adjusted 30-day mortality rates estimated using inverse probability of treatment weighted (IPTW) logistic regression models. There were 2657 women with a mean (SD) age of 66.1 (11.6) years and 6177 men with a mean (SD) age of 59.9 (11.7) years included in the study. Thirty-day mortality was significantly higher for women than for men (11.6% vs 6.0%, P < .001). The gap in sex-specific mortality narrowed if restricting the analysis to men and women undergoing primary PCI (7.1% vs 3.3%, P < .001). After multivariable adjustment for comorbidities and treatment covariates, women under 60 had higher early mortality risk than men of the same age category (OR, 1.88; 95% CI, 1.04-3.26; P = .02). The risk in the subgroups aged 60 to 74 years and over 75 years was not significantly different between sexes (OR, 1.28; 95% CI, 0.88-1.88; P = .19 and OR, 1.17; 95% CI, 0.80-1.73; P = .40; respectively). After IPTW adjustment for baseline clinical covariates, the relationship among sex, age category, and 30-day mortality was similar (OR, 1.56 [95% CI, 1.05-2.3]; OR, 1.49 [95% CI, 1.15-1.92]; and OR, 1.21 [95% CI, 0.93-1.57]; respectively). Younger age was associated with higher 30-day mortality rates in women with STEMI even after adjustment for medications, primary PCI, and other coexisting comorbidities. This difference declines after age 60 and is no longer observed in oldest women. | 10.1001/jamainternmed.2018.0514 | Journal | JAMA Internal Medicine | Phenotyping & subgroup analysis, Treatment & trials | |||
2018/01/01 00:00 | Siamese Survival Analysis with Competing Risks | A. Nemchenko, T. Kyono, M. van der Schaar | 2018 | https://link.springer.com/chapter/10.1007/978-3-030-01424-7_26 | Survival analysis in the presence of multiple possible adverse events, i.e., competing risks, is a pervasive problem in many industries (healthcare, finance, etc.). Since only one event is typically observed, the incidence of an event of interest is often obscured by other related competing events. This nonidentifiability, or inability to estimate true cause-specific survival curves from empirical data, further complicates competing risk survival analysis. We introduce Siamese Survival Prognosis Network (SSPN), a novel deep learning architecture for estimating personalized risk scores in the presence of competing risks. SSPN circumvents the nonidentifiability problem by avoiding the estimation of cause-specific survival curves and instead determines pairwise concordant time-dependent risks, where longer event times are assigned lower risks. Furthermore, SSPN is able to directly optimize an approximation to the C-discrimination index, rather than relying on well-known metrics which are unable to capture the unique requirements of survival analysis with competing risks. | 10.1007/978-3-030-01424-7_26 | Conference | International Conference on Artificial Neural Networks (ICANN) | Deep learning | |
2018/01/01 00:00 | ToPs: Ensemble Learning with Trees of Predictors | J. Yoon, W. R. Zame, M. van der Schaar | 2018 | https://ieeexplore.ieee.org/document/8294229 | We present a new approach to ensemble learning. Our approach differs from previous approaches in that it constructs and applies different predictive models to different subsets of the feature space. It does this by constructing a tree of subsets of the feature space and associating a predictor (predictive model) to each node of the tree; we call the resulting object a tree of predictors. The (locally) optimal tree of predictors is derived recursively; each step involves jointly optimizing the split of the terminal nodes of the previous tree and the choice of learner (from among a given set of base learners) and training set (hence predictor) for each set in the split. The features of a new instance determine a unique path through the optimal tree of predictors; the final prediction aggregates the predictions of the predictors along this path. Thus, our approach uses base learners to create complex learners that are matched to the characteristics of the data set while avoiding overfitting. We establish loss bounds for the final predictor in terms of the Rademacher complexity of the base learners. We report the results of a number of experiments on a variety of datasets, showing that our approach provides statistically significant improvements over a wide variety of state-of-the-art machine learning algorithms, including various ensemble learning methods. | 10.1109/TSP.2018.2807402 | Journal | IEEE Transactions on Signal Processing | Ensemble learning | Phenotyping & subgroup analysis, Risk & prognosis | |
2018/01/01 00:00 | What is Interpretable? Using Machine Learning to Design Interpretable Decision-Support Systems | O. Lahav, N. Mastronarde, M. van der Schaar | 2018 | https://arxiv.org/abs/1811.10799 | Recent efforts in Machine Learning (ML) interpretability have focused on creating methods for explaining black-box ML models. However, these methods rely on the assumption that simple approximations, such as linear models or decision-trees, are inherently human-interpretable, which has not been empirically tested. Additionally, past efforts have focused exclusively on comprehension, neglecting to explore the trust component necessary to convince non-technical experts, such as clinicians, to utilize ML models in practice. In this paper, we posit that reinforcement learning (RL) can be used to learn what is interpretable to different users and, consequently, build their trust in ML models. To validate this idea, we first train a neural network to provide risk assessments for heart failure patients. We then design an RL-based clinical decision-support system (DSS) around the neural network model, which can learn from its interactions with users. We conduct an experiment involving a diverse set of clinicians from multiple institutions in three different countries. Our results demonstrate that ML experts cannot accurately predict which system outputs will maximize clinicians' confidence in the underlying neural network model, and suggest additional findings that have broad implications for the future of research into ML interpretability and the use of ML in medicine. | Conference | NeurIPS Machine Learning for Health Workshop | Interpretability & explainability | |
2017/12/04 00:00 | Bayesian Inference of Individualized Treatment Effects using Multi-task Gaussian Processes | A. M. Alaa, M. van der Schaar | 2017 | http://papers.nips.cc/paper/6934-bayesian-inference-of-individualized-treatment-effects-using-multi-task-gaussian-processes | Predicated on the increasing abundance of electronic health records, we investigate the problem of inferring individualized treatment effects using observational data. Stemming from the potential outcomes model, we propose a novel multi-task learning framework in which factual and counterfactual outcomes are modeled as the outputs of a function in a vector-valued reproducing kernel Hilbert space (vvRKHS). We develop a nonparametric Bayesian method for learning the treatment effects using a multi-task Gaussian process (GP) with a linear coregionalization kernel as a prior over the vvRKHS. The Bayesian approach allows us to compute individualized measures of confidence in our estimates via pointwise credible intervals, which are crucial for realizing the full potential of precision medicine. The impact of selection bias is alleviated via a risk-based empirical Bayes method for adapting the multi-task GP prior, which jointly minimizes the empirical error in factual outcomes and the uncertainty in (unobserved) counterfactual outcomes. We conduct experiments on observational datasets for an interventional social program applied to premature infants, and a left ventricular assist device applied to cardiac patients wait-listed for a heart transplant. In both experiments, we show that our method significantly outperforms the state-of-the-art. | Conference | NeurIPS | Causal inference | Treatment & trials | |||
2017/12/04 00:00 | Deep Multi-task Gaussian Processes for Survival Analysis with Competing Risks | A. M. Alaa, M. van der Schaar | 2017 | http://papers.nips.cc/paper/6827-deep-multi-task-gaussian-processes-for-survival-analysis-with-competing-risks | Designing optimal treatment plans for patients with comorbidities requires accurate cause-specific mortality prognosis. Motivated by the recent availability of linked electronic health records, we develop a nonparametric Bayesian model for survival analysis with competing risks, which can be used for jointly assessing a patient's risk of multiple (competing) adverse outcomes. The model views a patient's survival times with respect to the competing risks as the outputs of a deep multi-task Gaussian process (DMGP), the inputs to which are the patients' covariates. Unlike parametric survival analysis methods based on Cox and Weibull models, our model uses DMGPs to capture complex non-linear interactions between the patients' covariates and cause-specific survival times, thereby learning flexible patient-specific and cause-specific survival curves, all in a data-driven fashion without explicit parametric assumptions on the hazard rates. We propose a variational inference algorithm that is capable of learning the model parameters from time-to-event data while handling right censoring. Experiments on synthetic and real data show that our model outperforms the state-of-the-art survival models. | Conference | NeurIPS | Survival analysis competing risks & comorbidities | Risk & prognosis | |||
2017/12/04 00:00 | DPSCREEN: Dynamic Personalized Screening | K. Ahuja, W. R. Zame, M. van der Schaar | 2017 | http://papers.nips.cc/paper/6731-dpscreen-dynamic-personalized-screening | Screening is important for the diagnosis and treatment of a wide variety of diseases. A good screening policy should be personalized to the disease, to the features of the patient and to the dynamic history of the patient (including the history of screening). The growth of electronic health records data has led to the development of many models to predict the onset and progression of different diseases. However, there has been limited work to address the personalized screening for these different diseases. In this work, we develop the first framework to construct screening policies for a large class of disease models. The disease is modeled as a finite state stochastic process with an absorbing disease state. The patient observes an external information process (for instance, self-examinations, discovering comorbidities, etc.) which can trigger the patient to arrive at the clinician earlier than scheduled screenings. The clinician carries out the tests; based on the test results and the external information it schedules the next arrival. Computing the exactly optimal screening policy that balances the delay in the detection against the frequency of screenings is computationally intractable; this paper provides a computationally tractable construction of an approximately optimal policy. As an illustration, we make use of a large breast cancer data set. The constructed policy screens patients more or less often according to their initial risk (it is personalized to the features of the patient) and according to the results of previous screens (it is personalized to the history of the patient). In comparison with existing clinical policies, the constructed policy leads to large reductions (28-68%) in the number of screens performed while achieving the same expected delays in disease detection. | Conference | NeurIPS | Time series analysis | Clinical practice, Screening | |
2017/08/06 00:00 | Learning from Clinical Judgments: Semi-Markov-Modulated Marked Hawkes Processes for Risk Prognosis | A. M. Alaa, S. Hu, M. van der Schaar | 2017 | http://proceedings.mlr.press/v70/alaa17a.html | Critically ill patients in regular wards are vulnerable to unanticipated adverse events which require prompt transfer to the intensive care unit (ICU). To allow for accurate prognosis of deteriorating patients, we develop a novel continuous-time probabilistic model for a monitored patient’s temporal sequence of physiological data. Our model captures “informatively sampled” patient episodes: the clinicians’ decisions on when to observe a hospitalized patient’s vital signs and lab tests over time are represented by a marked Hawkes process, with intensity parameters that are modulated by the patient’s latent clinical states, and with observable physiological data (mark process) modeled as a switching multi-task Gaussian process. In addition, our model captures “informatively censored” patient episodes by representing the patient’s latent clinical states as an absorbing semi-Markov jump process. The model parameters are learned from offline patient episodes in the electronic health records via an EM-based algorithm. Experiments conducted on a cohort of patients admitted to a major medical center over a 3-year period show that risk prognosis based on our model significantly outperforms the currently deployed medical risk scores and other baseline machine learning algorithms. | Conference | ICML | Time series analysis | Early warning systems, Risk & disease trajectories | |||
2017/01/01 00:00 | A Hidden Absorbing Semi-Markov Model for Informatively Censored Temporal Data: Learning and Inference | A. M. Alaa, M. van der Schaar | 2017 | https://www.jmlr.org/papers/v19/16-656.html | Modeling continuous-time physiological processes that manifest a patient's evolving clinical states is a key step in approaching many problems in healthcare. In this paper, we develop the Hidden Absorbing Semi-Markov Model (HASMM): a versatile probabilistic model that is capable of capturing the modern electronic health record (EHR) data. Unlike existing models, the HASMM accommodates irregularly sampled, temporally correlated, and informatively censored physiological data, and can describe non-stationary clinical state transitions. Learning the HASMM parameters from the EHR data is achieved via a novel forward-filtering backward-sampling Monte-Carlo EM algorithm that exploits the knowledge of the end-point clinical outcomes (informative censoring) in the EHR data, and implements the E-step by sequentially sampling the patients' clinical states in the reverse-time direction while conditioning on the future states. Real-time inferences are drawn via a forward-filtering algorithm that operates on a virtually constructed discrete-time embedded Markov chain that mirrors the patient's continuous-time state trajectory. We demonstrate the prognostic utility of the HASMM in a critical care prognosis setting using a real-world dataset for patients admitted to the Ronald Reagan UCLA Medical Center. In particular, we show that using HASMMs, a patient's clinical deterioration can be predicted 8-9 hours prior to intensive care unit admission, with a 22% AUC gain compared to the Rothman index, which is the state-of-the-art critical care risk scoring technology. | 10.5555/3291125.3291129 | Journal | Journal of Machine Learning Research | Time series analysis | Early warning systems, Risk & disease trajectories | ||
2017/01/01 00:00 | A Learning Approach to Frequent Handover Mitigations in 3GPP Mobility Protocols | C. Shen, M. van der Schaar | 2017 | https://ieeexplore.ieee.org/document/7925950 | The industry standard 3GPP mobility solutions are analyzed through the lens of bandit learning theory. In particular, it is shown that the original 3GPP handover protocol, developed primarily from a radio frequency and load balancing perspective, can be viewed as a special case of the ε-greedy bandit algorithm, and thus its sub-optimality can be characterized via the regret analysis. Inspired by the equivalence between 3GPP handover protocols and bandit algorithms, we rigorously analyze the performance of cell range expansion in 3GPP handover enhancement, and further propose a learning-based approach to address the frequent handover (FHO) challenges in ultra-dense networks. The key component is to explicitly consider the handover cost to discourage FHOs. Rather surprisingly, we prove that the bandit-inspired scheme with handover cost can be viewed as an enhancement to the simple sticky biasing solution in 3GPP that has been developed to partially address the FHO problem, and hence lay a theoretical foundation for this industrial intuition. | 10.1109/WCNC.2017.7925950 | Conference | IEEE Wireless Communications and Networking Conference (WCNC) | | |
2017/01/01 00:00 | A Machine Learning Approach for Tracking and Predicting Student Performance in Degree Programs | J. Xu, K. H. Moon, M. van der Schaar | 2017 | https://ieeexplore.ieee.org/document/7894238 | Accurately predicting students' future performance based on their ongoing academic records is crucial for effectively carrying out necessary pedagogical interventions to ensure students' on-time and satisfactory graduation. Although there is a rich literature on predicting student performance when solving problems or studying for courses using data-driven approaches, predicting student performance in completing degrees (e.g., college programs) is much less studied and faces new challenges: (1) Students differ tremendously in terms of backgrounds and selected courses; (2) courses are not equally informative for making accurate predictions; and (3) students' evolving progress needs to be incorporated into the prediction. In this paper, we develop a novel machine learning method for predicting student performance in degree programs that is able to address these key challenges. The proposed method has two major features. First, a bilayered structure comprising multiple base predictors and a cascade of ensemble predictors is developed for making predictions based on students' evolving performance states. Second, a data-driven approach based on latent factor models and probabilistic matrix factorization is proposed to discover course relevance, which is important for constructing efficient base predictors. Through extensive simulations on an undergraduate student dataset collected over three years at University of California, Los Angeles, we show that the proposed method achieves superior performance to benchmark approaches. | 10.1109/JSTSP.2017.2692560 | Journal | IEEE Journal of Selected Topics in Signal Processing | Personalized Education | |||
2017/01/01 00:00 | A Micro-foundation of Social Capital in Evolving Social Networks | A. M. Alaa, K. Ahuja, M. van der Schaar | 2017 | https://ieeexplore.ieee.org/document/7986996 | A social network confers benefits and advantages on individuals (and on groups); the literature refers to these benefits and advantages as social capital. An individual's social capital depends on its position in the network and on the shape of the network, but positions in the network and the shape of the network are determined endogenously and change as the network forms and evolves. This paper presents a micro-founded mathematical model of the evolution of a social network and of the social capital of individuals within the network. The evolution of the network and of social capital are driven by exogenous and endogenous processes (entry, meeting, linking) that have both random and deterministic components. These processes are influenced by the extent to which individuals are homophilic (prefer others of their own type), structurally opportunistic (prefer neighbors of neighbors to strangers), socially gregarious (desire more or fewer connections) and by the distribution of types in the society. In the analysis, we identify different kinds of social capital: bonding capital refers to links to others; popularity capital refers to links from others; bridging capital refers to connections between others. We show that each form of capital plays a different role and is affected differently by the characteristics of the society. Bonding capital is created by forming a circle of connections; homophily increases bonding capital because it makes this circle of connections more homogeneous. Popularity capital leads to preferential attachment: individuals who become popular tend to become more and more popular because others are more likely to link to them. Homophily creates inequality in the popularity capital attained by different social categories; more gregarious types of agents are more likely to become popular. However, in homophilic societies, individuals who belong to less gregarious, less opportunistic, or major types are likely to be more central in the network and thus acquire a bridging capital. And, while extreme homophily maximizes an individual's bonding capital, it also creates structural holes in the network, which hinder the exchange of ideas and information across social categories. Such structural holes represent a potential source of bridging capital: non-homophilic (tolerant or open-minded) individuals can fill these holes and broker interactions at the interface between different social categories. | 10.1109/TNSE.2017.2729782 | Journal | IEEE Transactions on Network Science and Engineering | Communications and Networks, Networks | |
2017/01/01 00:00 | A Personalized Approach to Asthma Control Over Time: Discovering Phenotypes Using Machine Learning | M. K. Ross, J. Yoon, K. Moon, M. van der Schaar | 2017 | https://www.atsjournals.org/doi/abs/10.1164/ajrccm-conference.2017.195.1_MeetingAbstracts.A5093 | Over 22 million people suffer from asthma in the United States. Up to 30% of cases are difficult to treat, and the poor treatment response is thought to be due to differences in underlying pathophysiology. The variation in pathophysiology is reflected by recently described granular phenotypes (endotypes). It is not known which medication each endotype best responds to. In order to address this knowledge gap, we applied our novel machine learning approach (Predictor Pursuit) to identify endotypes and response to medication over time. The discoveries can be used to provide real-time decision support to clinicians for personalized asthma treatment recommendations for their patients. We studied 1,019 patients enrolled in the Childhood Asthma Management Program (CAMP) Trial who had >4 clinical follow-ups documented. Asthma “severity” and “control” were determined using features from the 2007 NAEPP Asthma Guideline criteria. We defined “controlled” patients as those whose last three visits met criteria for “well-controlled” from the guideline. We defined overweight as body mass index >85 percentile for age. We evaluated differences in response to medication with a statistical method (proportionality test) with a significance level of 0.05. Patients were equally distributed in terms of asthma severity level: intermittent (23.6%), mild (23.1%), moderate (24.3%), and severe (29.1%) persistent. For the entire study population, we did not find a statistical difference in the level of control between the budesonide group and the nedocromil group (34.4% (Bud) vs 32.3% (Ned), p-value=0.28). However, our method was able to discover the obese-asthma endotype, which had a significant difference in asthma control. More specifically, the obese asthma patients were more controlled with nedocromil (20.3% vs 37.9%, p-value<0.015) as compared to non-obese asthma patients who responded more favorably to budesonide (38.2% vs 30.9%, p-value<0.044). These methods successfully described asthma control over time in a protocolled trial. The findings of our algorithm also support the clinical construct that obese-asthma patients have a more neutrophilic inflammation profile and are at risk to respond less favorably to corticosteroids. In future work, we will apply Predictor Pursuit methods to larger clinical datasets from the EHR. We will focus on the analysis of asthma control over time as applied to fine-grained asthma phenotypes. Our eventual goal is automated real-time clinical decision support for asthma management. | 10.1164/ajrccm-conference.2017.195.1_MeetingAbstracts.A5093 | Conference | American Thoracic Society (ATS) International Conference | Treatment & trials | |
2017/01/01 00:00 | Actionable intelligence and online learning for semantic computing | C. Tekin, M. van der Schaar | 2017 | http://www.worldscientific.com/doi/abs/10.1142/S2425038416300111 | As the world becomes more connected and instrumented, high dimensional, heterogeneous and time-varying data streams are collected and need to be analyzed on the fly to extract the actionable intelligence from the data streams and make timely decisions based on this knowledge. This requires that appropriate classifiers are invoked to process the incoming streams and find the relevant knowledge. Thus, a key challenge becomes choosing online, at run-time, which classifier should be deployed to make the best possible predictions on the incoming streams. In this paper, we survey a class of methods capable of performing online learning in stream-based semantic computing tasks: multi-armed bandits (MABs). Adopting MABs for stream mining poses numerous new challenges and requires many new innovations. Most importantly, the MABs will need to explicitly consider and track online the time-varying characteristics of the data streams and to learn quickly what the relevant information is out of the vast, heterogeneous and possibly high-dimensional data streams. In this paper, we discuss contextual MAB methods, which use similarities in context (meta-data) information to make decisions, and discuss their advantages when applied to stream mining for semantic computing. These methods can be adapted to discover in real-time the relevant contexts guiding the stream mining decisions, and track the best classifier in the presence of concept drift. Moreover, we also discuss how stream mining of multiple data sources can be performed by deploying cooperative MAB solutions and ensemble learning. We conclude the paper by discussing the numerous other advantages of MABs that will benefit semantic computing applications. | 10.1142/S2425038416300111 | Chapter | Encyclopedia with Semantic Computing and Robotic Intelligence | Multi-armed bandits | |
2017/01/01 00:00 | Bandit Strategies for Blindly Attacking Networks | S. Amuru, R. M. Buehrer, M. van der Schaar | 2017 | http://ieeexplore.ieee.org/document/7996834/ | Can we optimally attack networks (in terms of disrupting the ability of the nodes in the network to communicate) when the network topology is unknown? In this paper, we show that it is not always possible to do so when the network topology is unknown a priori. Specifically, we develop multi-armed bandit-based techniques that enable the attacker to learn the best network attack strategies and also discuss the potential limitations that the attacker faces in such blind scenarios. | 10.1109/ICC.2017.7996834 | Conference | IEEE International Conference on Communications (ICC) | | |
2017/01/01 00:00 | Cognitive Radio Networks for Delay-Sensitive Applications: Games and Learning | Y. Xiao, M. van der Schaar | 2017 | https://link.springer.com/referenceworkentry/10.1007/978-981-10-1389-8_28-1 | We have witnessed an explosion in wireless video traffic in recent years. Video applications are bandwidth intensive and delay sensitive and hence require efficient utilization of spectrum resources. Born to utilize wireless spectrum more efficiently, cognitive radio networks are promising candidates for deployment of wireless video applications. In this chapter, we introduce our recent advances in foresighted resource allocation mechanisms that enable multiuser wireless video applications over cognitive radio networks. The introduced resource allocation mechanisms are foresighted, in the sense that they optimize the long-term video quality of the wireless users. Due to the temporal coupling of delay-sensitive video applications, such foresighted mechanisms outperform mechanisms that maximize the short-term video quality. Moreover, the introduced resource allocation mechanisms allow wireless users to optimize while learning the unknown dynamics in the environment (e.g., incoming traffic, primary user activities). Finally, we introduce variations of the mechanisms that are suitable for networks with self-interested users. These mechanisms ensure efficient video resource allocation even when the users are self-interested and aim to maximize their individual video quality. The foresighted resource allocation mechanisms introduced in this chapter are built upon our theoretical advances in multiuser Markov decision processes, reinforcement learning, and dynamic mechanism design. | 10.1007/978-981-10-1389-8_28-1 | Chapter | Handbook of Cognitive Radio | ||||
2017/01/01 00:00 | Context-Aware Proactive Content Caching With Service Differentiation in Wireless Networks | S. Muller, O. Atan, M. van der Schaar, A. Klein | 2017 | https://ieeexplore.ieee.org/document/7775114 | Content caching in small base stations or wireless infostations is considered to be a suitable approach to improve the efficiency in wireless content delivery. Placing the optimal content into local caches is crucial due to storage limitations, but it requires knowledge about the content popularity distribution, which is often not available in advance. Moreover, local content popularity is subject to fluctuations, since mobile users with different interests connect to the caching entity over time. Which content a user prefers may depend on the user's context. In this paper, we propose a novel algorithm for context-aware proactive caching. The algorithm learns context-specific content popularity online by regularly observing context information of connected users, updating the cache content and observing cache hits subsequently. We derive a sublinear regret bound, which characterizes the learning speed and proves that our algorithm converges to the optimal cache content placement strategy in terms of maximizing the number of cache hits. Furthermore, our algorithm supports service differentiation by allowing operators of caching entities to prioritize customer groups. Our numerical results confirm that our algorithm outperforms state-of-the-art algorithms in a real world data set, with an increase in the number of cache hits of at least 14%. | 10.1109/TWC.2016.2636139 | Journal | IEEE Transactions on Wireless Communications | Multi-agent learning | Communications and Networks | ||
2017/01/01 00:00 | Deep Counterfactual Networks with Propensity-Dropout | A. M. Alaa, M. Weisz, M. van der Schaar | 2017 | https://arxiv.org/abs/1706.05966 | We propose a novel approach for inferring the individualized causal effects of a treatment (intervention) from observational data. Our approach conceptualizes causal inference as a multitask learning problem; we model a subject's potential outcomes using a deep multitask network with a set of shared layers among the factual and counterfactual outcomes, and a set of outcome-specific layers. The impact of selection bias in the observational data is alleviated via a propensity-dropout regularization scheme, in which the network is thinned for every training example via a dropout probability that depends on the associated propensity score. The network is trained in alternating phases, where in each phase we use the training examples of one of the two potential outcomes (treated and control populations) to update the weights of the shared layers and the respective outcome-specific layers. Experiments conducted on data based on a real-world observational study show that our algorithm outperforms the state-of-the-art. | Conference | ICML Workshop on Principled Approaches to Deep Learning | Causal inference, Deep learning | ||||
2017/01/01 00:00 | Discovering Pediatric Asthma Phenotypes Based on Response to Controller Medication Using Machine Learning | M. K. Ross, J. Yoon, A. van der Schaar, M. van der Schaar | 2017 | https://www.atsjournals.org/doi/full/10.1513/AnnalsATS.201702-101OC | Pediatric asthma has variable underlying inflammation and symptom control. Approaches to addressing this heterogeneity, such as clustering methods to find phenotypes and predict outcomes, have been investigated. However, clustering based on the relationship between treatment and clinical outcome has not been performed, and machine learning approaches for long-term outcome prediction in pediatric asthma have not been studied in depth. Our objectives were to use our novel machine learning algorithm, predictor pursuit (PP), to discover pediatric asthma phenotypes on the basis of asthma control in response to controller medications, to predict longitudinal asthma control among children with asthma, and to identify features associated with asthma control within each discovered pediatric phenotype. We applied PP to the Childhood Asthma Management Program study data (n = 1,019) to discover phenotypes on the basis of asthma control between assigned controller therapy groups (budesonide vs. nedocromil). We confirmed PP’s ability to discover phenotypes using the Asthma Clinical Research Network/Childhood Asthma Research and Education network data. We next predicted children’s asthma control over time and compared PP’s performance with that of traditional prediction methods. Last, we identified clinical features most correlated with asthma control in the discovered phenotypes. Four phenotypes were discovered in both datasets: allergic not obese (A+/O−), obese not allergic (A−/O+), allergic and obese (A+/O+), and not allergic not obese (A−/O−). Of the children with well-controlled asthma in the Childhood Asthma Management Program dataset, we found more nonobese children treated with budesonide than with nedocromil (P = 0.015) and more obese children treated with nedocromil than with budesonide (P = 0.008). Within the obese group, more A+/O+ children’s asthma was well controlled with nedocromil than with budesonide (P = 0.022) or with placebo (P = 0.011). The PP algorithm performed significantly better (P < 0.001) than traditional machine learning algorithms for both short- and long-term asthma control prediction. Asthma control and bronchodilator response were the features most predictive of short-term asthma control, regardless of type of controller medication or phenotype. Bronchodilator response and serum eosinophils were the most predictive features of asthma control, regardless of type of controller medication or phenotype. Advanced statistical machine learning approaches can be powerful tools for discovery of phenotypes based on treatment response and can aid in asthma control prediction in complex medical conditions such as asthma. | 10.1513/AnnalsATS.201702-101OC | Journal | Annals of the American Thoracic Society | Phenotyping & subgroup analysis, Treatment & trials | |||
2017/01/01 00:00 | Distributed Learning for Stochastic Generalized Nash Equilibrium Problems | C.-K. Yu, M. van der Schaar, A. Sayed | 2017 | https://ieeexplore.ieee.org/document/7903731 | This paper examines a stochastic formulation of the generalized Nash equilibrium problem where agents are subject to randomness in the environment of unknown statistical distribution. We focus on fully distributed online learning by agents and employ penalized individual cost functions to deal with coupled constraints. Three stochastic gradient strategies are developed with constant step-sizes. We allow the agents to use heterogeneous step-sizes and show that the penalty solution is able to approach the Nash equilibrium in a stable manner within O(μ_max), for small step-size value μ_max and sufficiently large penalty parameters. The operation of the algorithm is illustrated by considering the network Cournot competition problem. | 10.1109/TSP.2017.2695451 | Journal | IEEE Transactions on Signal Processing | Communications and Networks | |
2017/01/01 00:00 | Dynamic, Data-Driven Processing of Multispectral Video Streams | H. Li, K. Sudusinghe, Y. Liu, J. Yoon, M. van der Schaar, E. Blasch, S. S. Bhattacharyya | 2017 | https://ieeexplore.ieee.org/document/8039188 | In this article, we have introduced a novel system design framework for dynamic, data-driven processing of multispectral video streams using LD techniques. The framework is motivated by the need for efficient and accurate video processing in a wide variety of systems for air and ground environments. This framework, called LDspectral, is designed to incorporate selection of subsets of bands as a core, front-end step in the video processing process. Band subset selection opens up a large design space for data-driven adaptation that influences key metrics, including accuracy and computational efficiency. We have demonstrated a prototype implementation of LDspectral applied to a background subtraction application. Through experiments with LDspectral on a relevant data set, we have demonstrated the utility of flexible, optimized BSS in the navigation of operational trade-offs for multispectral video processing systems. The current version of LDspectral is developed for input streams in which the multispectral images are well aligned across the different bands. The band subset processing subsystem in LDspectral can readily be extended to incorporate image registration, which would be useful to extend the capabilities of the overall system to handle images that are not aligned. Such extension together with the integrated optimization of associated operational tradeoffs is a useful direction for future work. | 10.1109/MAES.2017.160132 | Journal | IEEE Aerospace and Electronic Systems Magazine | | |
2017/01/01 00:00 | E-RNN: Entangled Recurrent Neural Networks for Causal Prediction | J. Yoon, M. van der Schaar | 2017 | https://www.vanderschaar-lab.com/papers/E-RNN_Final.pdf | We propose a novel architecture of recurrent neural networks (RNNs) for causal prediction which we call Entangled RNN (E-RNN). To issue causal predictions, E-RNN can propagate the backward hidden states of Bi-RNN through an additional forward hidden layer. Unlike a 2-layer RNN, all the hidden states of E-RNN depend on all the inputs seen so far. Furthermore, unlike a Bi-RNN, for causal prediction, E-RNN depends on both the forward and backward hidden states. Importantly, E-RNN is a general architecture that can be combined with various RNN techniques such as multi-layer, dropout, and GRU. Using three real-world datasets, we show that E-RNN significantly and consistently improves the performance of previous RNN architectures with the same complexity. | Conference | ICML Workshop on Principled Approaches to Deep Learning | Deep learning, Time series analysis | |
2017/01/01 00:00 | Functional Contour-following via Haptic Perception and Reinforcement Learning | R. Hellman, C. Tekin, M. van der Schaar, V. Santos | 2017 | https://ieeexplore.ieee.org/document/8039205 | Many tasks involve the fine manipulation of objects despite limited visual feedback. In such scenarios, tactile and proprioceptive feedback can be leveraged for task completion. We present an approach for real-time haptic perception and decision-making for a haptics-driven, functional contour-following task: the closure of a ziplock bag. This task is challenging for robots because the bag is deformable, transparent, and visually occluded by artificial fingertip sensors that are also compliant. A deep neural net classifier was trained to estimate the state of a zipper within a robot's pinch grasp. A Contextual Multi-Armed Bandit (C-MAB) reinforcement learning algorithm was implemented to maximize cumulative rewards by balancing exploration versus exploitation of the state-action space. The C-MAB learner outperformed a benchmark Q-learner by more efficiently exploring the state-action space while learning a hard-to-code task. The learned C-MAB policy was tested with novel ziplock bag scenarios and contours (wire, rope). Importantly, this work contributes to the development of reinforcement learning approaches that account for limited resources such as hardware life and researcher time. As robots are used to perform complex, physically interactive tasks in unstructured or unmodeled environments, it becomes important to develop methods that enable efficient and effective learning with physical testbeds. | 10.1109/TOH.2017.2753233 | Journal | IEEE Transactions on Haptics | Reinforcement learning, Multi-armed bandits | |||
2017/01/01 00:00 | Individualism, Collectivism and Economic Outcomes: A Theory and Some Evidence | K. Ahuja, M. van der Schaar, W. R. Zame | 2017 | https://www.vanderschaar-lab.com/papers/Ahuja_Individualism.pdf | This paper presents a dynamic model to study the impact of individualism (time spent working alone) and collectivism (complementary time spent working with others) on economic outcomes in different societies during the Malthusian Era. The model is driven by opposing forces: a greater degree of collectivism provides a higher safety net for low quality workers but a greater degree of individualism allows high quality workers to leave larger bequests. The model suggests that more individualistic societies display smaller populations, greater per capita income and greater income inequality. Some (limited) historical evidence is consistent with these predictions. | Other | Networks | |
2017/01/01 00:00 | Individualized Risk Prognosis for Critical Care Patients: A Multi-task Gaussian Process Model | A. M. Alaa, J. Yoon, S. Hu, M. van der Schaar | 2017 | https://arxiv.org/abs/1705.07674 | We report the development and validation of a data-driven real-time risk score that provides timely assessments for the clinical acuity of ward patients based on their temporal lab tests and vital signs, which allows for timely intensive care unit (ICU) admissions. Unlike the existing risk scoring technologies, the proposed score is individualized; it uses the electronic health record (EHR) data to cluster the patients based on their static covariates into subcohorts of similar patients, and then learns a separate temporal, non-stationary multi-task Gaussian Process (GP) model that captures the physiology of every subcohort. Experiments conducted on data from a heterogeneous cohort of 6,094 patients admitted to the Ronald Reagan UCLA medical center show that our risk score significantly outperforms the state-of-the-art risk scoring technologies, such as the Rothman index and MEWS, in terms of timeliness, true positive rate (TPR), and positive predictive value (PPV). In particular, the proposed score increases the AUC by 20% and 38% as compared to the Rothman index and MEWS respectively, and can predict ICU admissions 8 hours before clinicians at a PPV of 35% and a TPR of 50%. Moreover, we show that the proposed risk score allows for better decisions on when to discharge clinically stable patients from the ward, thereby improving the efficiency of hospital resource utilization. | Conference | Big Data in Medicine: Tools, Transformation and Translation | Time series analysis | |
2017/01/01 00:00 | Learn to Adapt: Self-Optimizing Small Cell Transmit Power with Correlated Bandit Learning | Z. Wang, C. Shen, X. Luo, M. van der Schaar | 2017 | https://ieeexplore.ieee.org/document/7997146 | Judiciously setting the base station transmit power that matches its deployment environment is a key problem in ultra dense networks and heterogeneous in-building cellular deployments. A unique characteristic of this problem is the tradeoff between sufficient indoor coverage and limited outdoor leakage, which has to be met without explicit knowledge of the environment. In this paper, we address the small base station (SBS) transmit power assignment problem based on stochastic bandit theory. We explicitly consider power switching penalties to discourage frequent changes of the transmit power, which causes varying coverage and uneven user experience. Unlike existing solutions that rely on RF surveys in the target area, we take advantage of the user behavior with simple coverage feedback in the network. In addition, the proposed power assignment algorithms follow the Bayesian principle to utilize the available prior knowledge and correlation structure from the self configuration phase. Simulations mimicking practical deployments are performed for both single and multiple SBS scenarios, and the resulting power settings are compared to the state-of-the-art solutions. Significant performance gains of the proposed algorithms are observed. | 10.1109/ICC.2017.7997146 | Conference | IEEE International Conference on Communications (ICC) | ||||
2017/01/01 00:00 | Machine Learning Techniques for Risk Stratification of Non-ST-Elevation Acute Coronary Syndrome: The Role of Diabetes and Age | B. Ricci, M. van der Schaar, J. Yoon, E. Cenko, Z. Vasiljevic, M. Dorobantu, M. Zdravkovic, S. Kedev, O. Kalpak, D. Miličić, O. Manfrini, L. Badimon, R. Bugiardini | 2017 | https://www.ahajournals.org/doi/abs/10.1161/circ.136.suppl_1.15892 | Introduction: Patients with diabetes and NSTE-ACS exhibit a highly variable risk of mortality and morbidity, even when undergoing similar therapeutic strategies. Machine-learning (ML) algorithms represent a novel approach, which may give insights on outcome prediction through risk stratification. Hypothesis: To investigate the impact of early (≤24 hrs) PCI compared with only routine medical treatment (RMT) without PCI on outcomes in pts with NSTE-ACS and diabetes. Methods: Cohort study using a population-based registry (ISACS-TC, 41 hospitals, 12 European countries) from 2010 to 2016. ML models were compared with traditional statistical methods using logistic regression combined with propensity matched analysis and inverse probability of treatment weighting of outcomes from a landmark of 24 hours from hospitalization. The primary endpoint was 30-day all-cause mortality. The secondary endpoint was the composite outcome of 30-day all-cause mortality and left ventricular dysfunction (ejection fraction<40%). Results: Of 1250 NSTE-ACS first-day survivors with diabetes (median age 67 yrs, IQR 60 to 74 yrs; 59% men), 470 (38%) received early PCI and 780 RMT. Unadjusted rates of the primary end-point were higher in the RMT group than the early PCI group (6.3%; 49 events vs. 2.5%; 12 events). After propensity-matched analysis as well as after inverse probability-of-treatment weighting, early PCI was associated with a significant reduction in the primary end-point (OR: 0.44; 95%CI: 0.21 to 0.92 and 0.49; 95%CI: 0.28 to 0.86, respectively). The critical factor for personalization with ML algorithms was age (≥ 65 yrs). The direction and magnitude of the association between early PCI and the primary end-point remained unchanged after ML personalization in the older age (OR: 0.35; 95%CI: 0.14 to 0.92). Younger age had no association with 30-day all-cause mortality. Similar results were also obtained for the secondary endpoint. Conclusions: ML significantly improves accuracy of cardiovascular risk prediction in pts with diabetes hospitalized with NSTE-ACS. Pts of 65 yrs or older may benefit most from an early PCI strategy performed ≤ 24 hours after presentation. Conservative therapies may avoid unnecessary procedures in the younger pts. | 10.1161/circ.136.suppl_1.15892 | Journal | Circulation | Treatment & trials | |
2017/01/01 00:00 | Multi-directional Recurrent Neural Networks: A Novel Method for Estimating Missing Data | J. Yoon, W. R. Zame, M. van der Schaar | 2017 | http://roseyu.com/time-series-workshop/submissions/TSW2017_paper_12.pdf | Most time-series datasets with multiple data streams have (many) missing measurements that need to be estimated. Most existing methods address this estimation problem either by interpolating within data streams or imputing across data streams; we develop a novel approach that does both. Our approach is based on a deep learning architecture that we call a Multidirectional Recurrent Neural Network (M-RNN). An M-RNN differs from a bi-directional RNN in that it operates across streams in addition to within streams, and because the timing of inputs into the hidden layers is both lagged and advanced. To demonstrate the power of our approach we apply it to a familiar real-world medical dataset and demonstrate significantly improved performance. | Conference | ICML Time Series Workshop | Deep learning, Time series analysis | ||||
2017/01/01 00:00 | Online Learning in Limit Order Book Trade Execution | N. Akbarzadeh, C. Tekin, M. van der Schaar | 2017 | https://ieeexplore.ieee.org/document/8416773 | In this paper, we propose an online learning algorithm for optimal execution in the limit order book of a financial asset. Given a certain number of shares to sell and an allocated time window to complete the transaction, the proposed algorithm dynamically learns the optimal number of shares to sell via market orders at prespecified time slots within the allocated time interval. We model this problem as a Markov Decision Process (MDP), which is then solved by dynamic programming. First, we prove that the optimal policy has a specific form, which requires either selling no shares or the maximum allowed amount of shares at each time slot. Then, we consider the learning problem, in which the state transition probabilities are unknown and need to be learned on the fly. We propose a learning algorithm that exploits the form of the optimal policy when choosing the amount to trade. Interestingly, this algorithm achieves bounded regret with respect to the optimal policy computed based on the complete knowledge of the market dynamics. Our numerical results on several finance datasets show that the proposed algorithm performs significantly better than the traditional Q-learning algorithm by exploiting the structure of the problem. | 10.1109/TSP.2018.2858188 | Conference | IEEE Global Conference on Signal and Information Processing (GlobalSIP) Symposium on Signal and Information Processing for Finance and Business | ||||
2017/01/01 00:00 | Personalized Donor-Recipient Matching for Organ Transplantation | J. Yoon, A. M. Alaa, M. Cadeiras, M. van der Schaar | 2017 | https://ojs.aaai.org/index.php/AAAI/article/view/10711 | Organ transplants can improve the life expectancy and quality of life for the recipient but carry the risk of serious post-operative complications, such as septic shock and organ rejection. The probability of a successful transplant depends in a very subtle fashion on compatibility between the donor and the recipient - but current medical practice is short of domain knowledge regarding the complex nature of recipient-donor compatibility. Hence a data-driven approach for learning compatibility has the potential for significant improvements in match quality. This paper proposes a novel system (ConfidentMatch) that is trained using data from electronic health records. ConfidentMatch predicts the success of an organ transplant (in terms of the 3-year survival rates) on the basis of clinical and demographic traits of the donor and recipient. ConfidentMatch captures the heterogeneity of the donor and recipient traits by optimally dividing the feature space into clusters and constructing different optimal predictive models to each cluster. The system controls the complexity of the learned predictive model in a way that allows for assuring more granular and accurate predictions for a larger number of potential recipient-donor pairs, thereby ensuring that predictions are "personalized" and tailored to individual characteristics to the finest possible granularity. Experiments conducted on the UNOS heart transplant dataset show the superiority of the prognostic value of ConfidentMatch to other competing benchmarks; ConfidentMatch can provide predictions of success with 95% accuracy for 5,489 patients of a total population of 9,620 patients, which corresponds to 410 more patients than the most competitive benchmark algorithm (DeepBoost). | Conference | AAAI | Ensemble learning | Phenotyping & subgroup analysis | |||
2017/01/01 00:00 | Personalized Risk Prediction using Predictive Pursuit Machine Learning: A Pilot Study in Cardiac Transplantation | J. Yoon, W. R. Zame, A. Banerjee, M. Cadeiras, A. M. Alaa, M. van der Schaar | 2017 | https://vanderschaar-lab.com/papers/Evidence_Live_2017.pdf | Across healthcare, risk prediction tools are suboptimal whether used in diagnosis, prognosis or treatment planning. Use of retrospective datasets for derivation and validation, lack of personalisation of tools and inadequate use of machine learning are some of the problems with the current paradigm. To improve accuracy of risk prediction using a novel personalized machine learning model ("predictive pursuit") by conducting a pilot study of wait-list and post-transplant mortality in cardiac transplantation. Our novel "predictive pursuit" algorithm out-performs currently available clinical risk prediction scores as well as the best machine learning tools for prediction of wait-list and post-cardiac transplant mortality. The predictive pursuit algorithm has potential to personalise and greatly improve accuracy of risk prediction. | Conference | European Society of Cardiology Congress | |||||
2017/01/01 00:00 | Personalized Risk Prediction using Predictive Pursuit Machine Learning: A Pilot Study in Cardiac Transplantation | J. Yoon, W. R. Zame, A. Banerjee, M. Cadeiras, A. M. Alaa, M. van der Schaar | 2017 | https://vanderschaar-lab.com/papers/Evidence_Live_2017.pdf | Across healthcare, risk prediction tools are suboptimal whether used in diagnosis, prognosis or treatment planning. Use of retrospective datasets for derivation and validation, lack of personalisation of tools and inadequate use of machine learning are some of the problems with the current paradigm. We aim to improve the accuracy of risk prediction using a novel personalized machine learning model ("predictive pursuit") by conducting a pilot study of wait-list and post-transplant mortality in cardiac transplantation. Our novel "predictive pursuit" algorithm out-performs currently available clinical risk prediction scores as well as the best machine learning tools for prediction of wait-list and post-cardiac transplant mortality. The predictive pursuit algorithm has potential to personalise and greatly improve accuracy of risk prediction. | Conference | Evidence Live Conference |
2017/01/01 00:00 | Personalized Risk Scoring for Critical Care Prognosis using Mixtures of Gaussian Processes | A. M. Alaa, J. Yoon, S. Hu, M. van der Schaar | 2017 | https://ieeexplore.ieee.org/document/7913647 | In this paper, we develop a personalized real-time risk scoring algorithm that provides timely and granular assessments for the clinical acuity of ward patients based on their (temporal) lab tests and vital signs; the proposed risk scoring system ensures timely intensive care unit admissions for clinically deteriorating patients. The risk scoring system is based on the idea of sequential hypothesis testing under an uncertain time horizon. The system learns a set of latent patient subtypes from the offline electronic health record data, and trains a mixture of Gaussian Process experts, where each expert models the physiological data streams associated with a specific patient subtype. Transfer learning techniques are used to learn the relationship between a patient's latent subtype and her static admission information (e.g., age, gender, transfer status, ICD-9 codes, etc). Experiments conducted on data from a heterogeneous cohort of 6321 patients admitted to Ronald Reagan UCLA medical center show that our score significantly outperforms the currently deployed risk scores, such as the Rothman index, MEWS, APACHE, and SOFA scores, in terms of timeliness, true positive rate, and positive predictive value. Our results reflect the importance of adopting the concepts of personalized medicine in critical care settings; significant accuracy and timeliness gains can be achieved by accounting for the patients' heterogeneity. Significance: The proposed risk scoring methodology can confer huge clinical and social benefits on a massive number of critically ill inpatients who exhibit adverse outcomes including, but not limited to, cardiac arrests, respiratory arrests, and septic shocks. | 10.1109/TBME.2017.2698602 | Journal | IEEE Transactions on Biomedical Engineering | Time series analysis | Early warning systems, Risk & disease trajectories | ||
2017/01/01 00:00 | Progressive Prediction of Student Performance in College Programs | J. Xu, Y. Han, D. Marcu, M. van der Schaar | 2017 | https://ojs.aaai.org/index.php/AAAI/article/view/10713 | Accurately predicting students' future performance based on their tracked academic records in college programs is crucial for effectively carrying out necessary pedagogical interventions to ensure students' on-time graduation. Although there is a rich literature on predicting student performance in solving problems and studying courses using data-driven approaches, predicting student performance in completing college programs is much less studied and faces new challenges, mainly due to the diversity of courses selected by students and the requirement of continuous tracking and incorporation of students' evolving progress. In this paper, we develop a novel algorithm that enables progressive prediction of students' performance by adapting ensemble learning techniques and utilizing education-specific domain knowledge. We prove its prediction performance guarantee and show its performance improvement against benchmark algorithms on a real-world student dataset from UCLA. | Conference | AAAI | Personalized Education |
2016/12/05 00:00 | A Non-parametric Learning Method for Confidently Estimating Patient's Clinical State and Dynamics | W. Hoiles, M. van der Schaar | 2016 | https://papers.nips.cc/paper/6454-a-non-parametric-learning-method-for-confidently-estimating-patients-clinical-state-and-dynamics | Estimating a patient's clinical state from multiple concurrent physiological streams plays an important role in determining if a therapeutic intervention is necessary and for triaging patients in the hospital. In this paper we construct a non-parametric learning algorithm to estimate the clinical state of a patient. The algorithm addresses several known challenges with clinical state estimation such as eliminating bias introduced by therapeutic intervention censoring, increasing the timeliness of state estimation while ensuring a sufficient accuracy, and the ability to detect anomalous clinical states. These benefits are obtained by combining the tools of non-parametric Bayesian inference, permutation testing, and generalizations of the empirical Bernstein inequality. The algorithm is validated using real-world data from a cancer ward in a large academic hospital. | Conference | NeurIPS | Time series analysis |
2016/12/05 00:00 | Balancing Suspense and Surprise: Timely Decision Making with Endogenous Information Acquisition | A. M. Alaa, M. van der Schaar | 2016 | https://papers.nips.cc/paper/6062-balancing-suspense-and-surprise-timely-decision-making-with-endogenous-information-acquisition | We develop a Bayesian model for decision-making under time pressure with endogenous information acquisition. In our model, the decision-maker decides when to observe (costly) information by sampling an underlying continuous-time stochastic process (time series) that conveys information about the potential occurrence/non-occurrence of an adverse event which will terminate the decision-making process. In her attempt to predict the occurrence of the adverse event, the decision-maker follows a policy that determines when to acquire information from the time series (continuation), and when to stop acquiring information and make a final prediction (stopping). We show that the optimal policy has a "rendezvous" structure, i.e. a structure in which whenever a new information sample is gathered from the time series, the optimal "date" for acquiring the next sample becomes computable. The optimal interval between two information samples balances a trade-off between the decision maker’s "surprise", i.e. the drift in her posterior belief after observing new information, and "suspense", i.e. the probability that the adverse event occurs in the time interval between two information samples. Moreover, we characterize the continuation and stopping regions in the decision-maker’s state-space, and show that they depend not only on the decision-maker’s beliefs, but also on the "context", i.e. the current realization of the time series. | Conference | NeurIPS | Time series analysis | ||||
2016/06/19 00:00 | Bounded Off-Policy Evaluation with Missing Data for Course Recommendation and Curriculum Design | W. Hoiles, M. van der Schaar | 2016 | http://proceedings.mlr.press/v48/hoiles16.html | Successfully recommending personalized course schedules is a difficult problem given the diversity of students' knowledge, learning behaviour, and goals. This paper presents personalized course recommendation and curriculum design algorithms that exploit logged student data. The algorithms are based on the regression estimator for contextual multi-armed bandits with a penalized variance term. Guarantees on the predictive performance of the algorithms are provided using empirical Bernstein bounds. We also provide guidelines for including expert domain knowledge into the recommendations. Using undergraduate engineering logged data from a post-secondary institution we illustrate the performance of these algorithms. | Conference | ICML | Causal inference, Multi-armed bandits | Personalized Education |
2016/06/19 00:00 | ForecastICU: A Prognostic Decision Support System for Timely Prediction of Intensive Care Unit Admission | J. Yoon, A. M. Alaa, S. Hu, M. van der Schaar | 2016 | http://proceedings.mlr.press/v48/yoon16.html | We develop ForecastICU: a prognostic decision support system that monitors hospitalized patients and prompts alarms for intensive care unit (ICU) admissions. ForecastICU is first trained in an offline stage by constructing a Bayesian belief system that corresponds to its belief about how trajectories of physiological data streams of the patient map to a clinical status. After that, ForecastICU monitors a new patient in real-time by observing her physiological data stream, updating its belief about her status over time, and prompting an alarm whenever its belief process hits a predefined threshold (confidence). Using a real-world dataset obtained from UCLA Ronald Reagan Medical Center, we show that ForecastICU can predict ICU admissions 9 hours before a physician’s decision (for a sensitivity of 40% and a precision of 50%). Also, ForecastICU performs consistently better than other state-of-the-art machine learning algorithms in terms of sensitivity, precision, and timeliness: it can predict ICU admissions 3 hours earlier, and offers a 7.8% gain in sensitivity and a 5.1% gain in precision compared to the best state-of-the-art algorithm. Moreover, ForecastICU offers an area under curve (AUC) gain of 22.3% compared to the Rothman index, which is the currently deployed technology in most hospital wards. | Conference | ICML | Time series analysis | Early warning systems, Risk & disease trajectories | |||
2016/01/01 00:00 | A Non-stochastic Learning Approach to Energy Efficient Mobility Management | C. Shen, C. Tekin, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7572176 | Energy efficient mobility management is an important problem in modern wireless networks with heterogeneous cell sizes and increased node densities. We show that optimization-based mobility protocols cannot achieve long-term optimal energy consumption, particularly for ultra-dense networks (UDNs). To address the complex dynamics of UDN, we propose a non-stochastic online-learning approach, which does not make any assumption on the statistical behavior of the small base station (SBS) activities. In addition, we introduce handover cost to the overall energy consumption, which forces the resulting solution to explicitly minimize frequent handovers. The proposed batched randomization with exponential weighting (BREW) algorithm relies on batching to explore in bulk, and hence reduces unnecessary handovers. We prove that the regret of BREW is sublinear in time, thus guaranteeing its convergence to the optimal SBS selection. We further study the robustness of the BREW algorithm to delayed or missing feedback. Moreover, we study the setting where SBSs can be dynamically turned ON and OFF. We prove that sublinear regret is impossible with respect to arbitrary SBS ON/OFF, and then develop a novel learning strategy, called ranking expert (RE), that simultaneously takes into account the handover cost and the availability of SBS. To address the high complexity of RE, we propose a contextual ranking expert (CRE) algorithm that only assigns experts in a given context. Rigorous regret bounds are proved for both RE and CRE with respect to the best expert. Simulations show that not only do the proposed mobility algorithms greatly reduce the system energy consumption, but they are also robust to various dynamics which are common in practical ultra-dense wireless networks. | 10.1109/JSAC.2016.2612038 | Journal | IEEE Journal on Selected Areas in Communications | Communications and Networks |
2016/01/01 00:00 | A Semi-Markov Switching Linear Gaussian Model for Censored Physiological Data | A. M. Alaa, J. Yoon, S. Hu, M. van der Schaar | 2016 | https://arxiv.org/abs/1611.05146 | Critically ill patients in regular wards are vulnerable to unanticipated clinical deterioration which requires timely transfer to the intensive care unit (ICU). To allow for risk scoring and patient monitoring in such a setting, we develop a novel Semi-Markov Switching Linear Gaussian Model (SSLGM) for the inpatients' physiology. The model captures the patients' latent clinical states and their corresponding observable lab tests and vital signs. We present an efficient unsupervised learning algorithm that capitalizes on the informatively censored data in the electronic health records (EHR) to learn the parameters of the SSLGM; the learned model is then used to assess the new inpatients' risk for clinical deterioration in an online fashion, allowing for timely ICU admission. Experiments conducted on a heterogeneous cohort of 6,094 patients admitted to a large academic medical center show that the proposed model significantly outperforms the currently deployed risk scores such as Rothman index, MEWS, SOFA and APACHE. | Conference | NeurIPS Machine Learning for Health Workshop | Time series analysis | Early warning systems |
2016/01/01 00:00 | A Theory of Individualism, Collectivism and Economic Outcomes | K. Ahuja, M. van der Schaar, W. R. Zame | 2016 | http://arxiv.org/abs/1512.01230 | This paper presents a dynamic model to study the impact on the economic outcomes in different societies during the Malthusian Era of individualism (time spent working alone) and collectivism (complementary time spent working with others). The model is driven by opposing forces: a greater degree of collectivism provides a higher safety net for low quality workers but a greater degree of individualism allows high quality workers to leave larger bequests. The model suggests that more individualistic societies display smaller populations, greater per capita income and greater income inequality. Some (limited) historical evidence is consistent with these predictions. | Other | Game Theory and Applications | |||||
2016/01/01 00:00 | Adaptive Ensemble Learning with Confidence Bounds | C. Tekin, J. Yoon, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7738590 | Extracting actionable intelligence from distributed, heterogeneous, correlated, and high-dimensional data sources requires run-time processing and learning both locally and globally. In the last decade, a large number of meta-learning techniques have been proposed in which local learners make online predictions based on their locally collected data instances, and feed these predictions to an ensemble learner, which fuses them and issues a global prediction. However, most of these works do not provide performance guarantees or, when they do, these guarantees are asymptotic. None of these existing works provide confidence estimates about the issued predictions or rate of learning guarantees for the ensemble learner. In this paper, we provide a systematic ensemble learning method called Hedged Bandits, which comes with both long-run (asymptotic) and short-run (rate of learning) performance guarantees. Moreover, our approach yields performance guarantees with respect to the optimal local prediction strategy, and is also able to adapt its predictions in a data-driven manner. We illustrate the performance of Hedged Bandits in the context of medical informatics and show that it outperforms numerous online and offline ensemble learning methods. | 10.1109/TSP.2016.2626250 | Journal | IEEE Transactions on Signal Processing | Ensemble learning, Multi-agent learning | |||
2016/01/01 00:00 | Adaptive ensemble learning with confidence bounds for personalized diagnosis | C. Tekin, J. Yoon, M. van der Schaar | 2016 | https://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/view/12590 | With the advances in the field of medical informatics, automated clinical decision support systems are becoming the de facto standard in personalized diagnosis. In order to establish high accuracy and confidence in personalized diagnosis, massive amounts of distributed, heterogeneous, correlated and high-dimensional patient data from different sources such as wearable sensors, mobile applications, Electronic Health Record (EHR) databases etc. need to be processed. This requires learning both locally and globally due to privacy constraints and/or distributed nature of the multi-modal medical data. In the last decade, a large number of meta-learning techniques have been proposed in which local learners make online predictions based on their locally-collected data instances, and feed these predictions to an ensemble learner, which fuses them and issues a global prediction. However, most of these works do not provide performance guarantees or, when they do, these guarantees are asymptotic. None of these existing works provide confidence estimates about the issued predictions or rate of learning guarantees for the ensemble learner. In this paper, we provide a systematic ensemble learning method called Hedged Bandits, which comes with both long run (asymptotic) and short run (rate of learning) performance guarantees. Moreover, we show that our proposed method outperforms all existing ensemble learning techniques, even in the presence of concept drift. | Conference | AAAI Workshop on Expanding the Boundaries of Health Informatics using AI (HIAI) | Ensemble learning |
2016/01/01 00:00 | Adaptive Learning for Stochastic Generalized Nash Equilibrium Problems | C.-K. Yu, M. van der Schaar, A. H. Sayed | 2016 | http://ieeexplore.ieee.org/document/7472597 | This work examines a stochastic formulation of the generalized Nash equilibrium problem (GNEP) where agents are subject to randomness in the environment of unknown statistical distribution. Three stochastic gradient strategies are developed by relying on a penalty-based approach where the constrained GNEP formulation is replaced by a penalized unconstrained formulation. It is shown that this penalty solution is able to approach the Nash equilibrium in a stable manner within O(p), for small step-size values p. The operation of the algorithms is illustrated by considering the Cournot competition problem. | 10.1109/ICASSP.2016.7472597 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | ||||
2016/01/01 00:00 | Big-Data Streaming Applications Scheduling Based on Staged Multi-armed Bandits | K. Kanoun, C. Tekin, D. Atienza, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7447706 | Several techniques have been recently proposed to adapt Big-Data streaming applications to existing many core platforms. Among these techniques, online reinforcement learning methods have been proposed that learn how to adapt at run-time the throughput and resources allocated to the various streaming tasks depending on dynamically changing data stream characteristics and the desired application performance (e.g., accuracy). However, most state-of-the-art techniques consider only a single stream input in their application model and assume that the system knows the amount of resources to allocate to each task to achieve a desired performance. To address these limitations, in this paper we propose a new systematic and efficient methodology and associated algorithms for online learning and energy-efficient scheduling of Big-Data streaming applications with multiple streams on many core systems with resource constraints. We formalize the problem of multi-stream scheduling as a staged decision problem in which the performance obtained for various resource allocations is unknown. The proposed scheduling methodology uses a novel class of online adaptive learning techniques which we refer to as staged multi-armed bandits (S-MAB). Our scheduler is able to learn online which processing method to assign to each stream and how to allocate its resources over time in order to maximize the performance on the fly, at run-time, without having access to any offline information. The proposed scheduler, applied on a face detection streaming application and without using any offline information, is able to achieve similar performance compared to an optimal semi-online solution that has full knowledge of the input stream where the differences in throughput, observed quality, resource usage and energy efficiency are less than 1, 0.3, 0.2 and 4 percent respectively. | 10.1109/TC.2016.2550454 | Journal | IEEE Transactions on Computers | Reinforcement learning, Multi-armed bandits |
2016/01/01 00:00 | Bits Learning: User-adjustable Privacy versus Accuracy in Internet Traffic Classification | Z. Yuan, J. Xu, Y. Xue, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7393470 | During the past decade, a great number of machine learning (ML)-based methods have been studied for accurate traffic classification. Flow features such as the discretizations of the first five packet sizes (PS) and flow ports (FP) are considered the best discriminators for per-flow classification. For the first time, this letter proposes to treat the first n-bits of a flow (BitFlow) as features and compares its overall performance with the well-known ACAS (automated construction of application signatures) that takes the first n-bytes of a flow (ByteFlow) as features. The results show that BitFlow achieves not only a higher classification accuracy but also 1-3 orders of magnitude faster speed than ACAS in training and classifying. More importantly, this letter also proposes to treat the first n-bits of each of the first few packet payloads (BitPack) as features, which enables a user-adjustable tradeoff between user privacy protection and classification accuracy maximization. The experiments show that BitPack can significantly outperform BitFlow, PS, and FP. | 10.1109/LCOMM.2016.2521837 | Journal | IEEE Communications Letters | Communications and Networks | |||
2016/01/01 00:00 | Blind Network Interdiction Strategies - A Learning Approach | S. Amuru, R. M. Buehrer, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7436818 | Network interdiction refers to disrupting a network in an attempt to either analyze the network’s vulnerabilities or to undermine a network’s communication capabilities. A vast majority of the works that have studied network interdiction assume a priori knowledge of the network topology. However, such knowledge may not be available in real-time settings. For instance, in practical electronic warfare-type settings, an attacker that intends to disrupt communication in the network may not know the topology a priori. Hence, it is necessary to develop online learning strategies that enable the attacker to interdict communication in the underlying network in real-time. In this paper, we develop several learning techniques that enable the attacker to learn the best network interdiction strategies (in terms of the best nodes to attack to maximally disrupt communication in the network) and also discuss the potential limitations that the attacker faces in such blind scenarios. We consider settings where 1) only one node can be attacked and 2) multiple nodes can be attacked in the network. In addition to the single-attacker setting, we also discuss learning strategies when multiple attackers attack the network and discuss the limitations they face in real-time settings. Several different network topologies are considered in this study using which we show that under the blind settings considered in this paper, except for some simple network topologies, the attacker cannot optimally (measured in terms of the number of flows stopped) attack the network. | 10.1109/TCCN.2016.2542078 | Journal | IEEE Transactions on Cognitive Communications and Networking | Communications and Networks |
2016/01/01 00:00 | Collision Detection by Networked Sensors | L. Canzian, U. Demiryurek, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7342965 | The advances in sensor technologies enable real-time collection of high-fidelity spatiotemporal data on transportation networks of major cities. We consider a set of speed sensors that are spatially distributed along a street and can communicate via an exogenously determined network. In this paper, we address the problem of detecting in real-time collisions that occur within a certain distance from each sensor. The speed sensors observe the average speed value of the cars at regular time intervals and adopt a threshold-based approach to generate local predictions. Each sensor exchanges its local predictions with its neighbors and aggregates the local predictions it receives using a weighted majority aggregation rule to generate a final prediction. Since collisions are eventually reported (e.g., by a police officer or by crowd-sourced information), we assume that the information about the real occurrence of a collision is eventually given to the sensors. We propose an online learning rule that exploits this feedback to adapt the weights that each sensor gives to different local predictions. In the realizable case, i.e., when there exist unknown weights that would allow the sensors to distinguish between collisions and normal traffic behaviors, we determine an upper bound for the worst-case misdetection and false alarm probabilities of our scheme. We evaluate our scheme with traffic datasets collected from the segment of the 405 freeway that passes through Los Angeles County and the results show the efficacy of the proposed approach. | 10.1109/TSIPN.2015.2504721 | Journal | IEEE Transactions on Signal and Information Processing over Networks | Multi-agent learning | Communications and Networks | ||
2016/01/01 00:00 | ConfidentCare: A Clinical Decision Support System for Personalized Breast Cancer Screening | A. M. Alaa, K. H. Moon, W. Hsu, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7506232 | Breast cancer screening policies attempt to achieve timely diagnosis by regularly screening healthy women via various imaging tests. Various clinical decisions are needed to manage the screening process: selecting initial screening tests, interpreting test results, and deciding if further diagnostic tests are required. Current screening policies are guided by clinical practice guidelines (CPGs), which represent a "one-size-fits-all" approach, designed to work well (on average) for a population, and can only offer coarse expert-based patient stratification that is not rigorously validated through data. Since the risks and benefits of screening tests are functions of each patient's features, personalized screening policies tailored to the features of individuals are desirable. To address this issue, we developed ConfidentCare: a computer-aided clinical decision support system that learns a personalized screening policy from electronic health record (EHR) data. By a "personalized screening policy," we mean a clustering of women's features, and a set of customized screening guidelines for each cluster. ConfidentCare operates by computing clusters of patients with similar features, then learning the "best" screening procedure for each cluster using a supervised learning algorithm. The algorithm ensures that the learned screening policy satisfies a predefined accuracy requirement with a high level of confidence for every cluster. By applying ConfidentCare to real-world data, we show that it outperforms the current CPGs in terms of cost efficiency and false positive rates: a reduction of 31% in the false positive rate can be achieved. | 10.1109/TMM.2016.2589160 | Journal | IEEE Transactions on Multimedia | Screening, Medical imaging, Phenotyping & subgroup analysis |
2016/01/01 00:00 | Context-based Unsupervised Ensemble Learning and Feature Ranking | E. Soltanmohammadi, M. Naraghi-Pour, M. van der Schaar | 2016 | http://link.springer.com/article/10.1007/s10994-016-5576-6 | In ensemble systems, several experts, which may have access to possibly different data, make decisions which are then fused by a combiner (meta-learner) to obtain a final result. Such ensemble-based systems are well-suited for processing big-data from sources such as social media, in-stream monitoring systems, networks, and markets, and provide more accurate results than single expert systems. However, most existing ensemble-learning techniques have two limitations: (i) they are supervised, and hence they require access to the true label, which is often unknown in practice, and (ii) they are not able to evaluate the impact of the various data features/contexts on the final decision, and hence they do not learn which data is required. In this paper we propose a joint estimation–detection method for evaluating the accuracy of each expert as a function of the data features/context and for fusing the experts' decisions. The proposed method is unsupervised: the true labels are not available and no prior information is assumed regarding the performance of each expert. Extensive simulation results show the improvement of the proposed method as compared to the state-of-the-art approaches. We also provide a systematic, unsupervised method for ranking the informativeness of each feature on the decision making process. | 10.1007/s10994-016-5576-6 | Journal | Machine Learning | Ensemble learning, Feature selection |
2016/01/01 00:00 | Contextual Learning for Unit Commitment with Renewable Energy Sources | H.-S. Lee, C. Tekin, M. van der Schaar, J.-W. Lee | 2016 | https://ieeexplore.ieee.org/document/7905966 | In this paper, we study a unit commitment (UC) problem minimizing operating costs of the power system with renewable energy sources. We develop a contextual learning algorithm for UC (CLUC) which learns which UC schedule to choose based on the context information such as past load demand and weather condition. CLUC does not require any prior knowledge on the uncertainties such as the load demand and the renewable power outputs, and learns them over time using the context information. We characterize the performance of CLUC analytically, and prove its optimality in terms of the long-term average cost. Through the simulation results, we show the performance of CLUC and the effectiveness of utilizing the context information in the UC problem. | 10.1109/GlobalSIP.2016.7905966 | Conference | IEEE Global Conference on Signal and Information Processing (GlobalSIP) | ||||
2016/01/01 00:00 | Discovery and Clinical Decision Support for Personalized Healthcare | J. Yoon, C. Davtyan, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7482682 | With the advent of electronic health records, more data are continuously collected for individual patients, and more data are available for review from past patients. Despite this, it has not yet been possible to successfully use this data to systematically build clinical decision support systems that can produce personalized clinical recommendations to assist clinicians in providing individualized healthcare. In this paper, we present a novel approach, discovery engine (DE), that discovers which patient characteristics are most relevant for predicting the correct diagnosis and/or recommending the best treatment regimen for each patient. We demonstrate the performance of DE in two clinical settings: diagnosis of breast cancer as well as a personalized recommendation for a specific chemotherapy regimen for breast cancer patients. For each distinct clinical recommendation, different patient features are relevant; DE can discover these different relevant features and use them to recommend personalized clinical decisions. The DE approach achieves a 16.6% improvement over existing state-of-the-art recommendation algorithms regarding kappa coefficients for recommending the personalized chemotherapy regimens. For diagnostic predictions, the DE approach achieves a 2.18% and 4.20% improvement over existing state-of-the-art prediction algorithms regarding prediction error rate and false positive rate, respectively. We also demonstrate that the performance of our approach is robust against missing information and that the relevant features discovered by DE are confirmed by clinical references. | 10.1109/JBHI.2016.2574857 | Journal | IEEE Journal of Biomedical and Health Informatics | Feature selection | Treatment & trials | ||
2016/01/01 00:00 | Distributed Online Learning and Stream Processing for a Smarter Planet | D. S. Turaga, M. van der Schaar | 2016 | https://onlinelibrary.wiley.com/doi/10.1002/9781119187202.ch10 | Smarter Planet applications for transportation, healthcare, energy and utilities have been en- | 10.1002/9781119187202.ch10 | Chapter | Fog for 5G and IoT | ||||
2016/01/01 00:00 | Dynamic Network Formation with Foresighted Agents | Y. Song, M. van der Schaar | 2016 | https://link.springer.com/article/10.1007/s00182-020-00714-4 | What networks can form and persist when agents are self-interested? Can such networks be efficient? A substantial theoretical literature predicts that various networks emerge randomly and efficiency is unlikely to be sustained, but these predictions are in stark contrast to empirical findings. In this paper, we present a new model of network formation. In contrast to the existing literature, we assume that agents are foresighted (rather than myopic) and have some but not necessarily all information about the history. We provide a tight characterization of the sustainable networks; in particular, efficient networks can form and persist if they provide every agent a strictly positive payoff. Our results are robust to model variations, while evidence from empirical networks suggests a modest improvement in prediction by our model compared with models with agent myopia. | 10.1007/s00182-020-00714-4 | Journal | International Journal of Game Theory | Game Theory and Applications, Networks |
2016/01/01 00:00 | Endogenous Matching in a Dynamic Assignment Model with Adverse Selection and Moral Hazard | M. van der Schaar, Y. Xiao, W. R. Zame | 2016 | https://www.vanderschaar-lab.com/papers/XiaoMatching.pdf | This paper formulates and analyzes a dynamic assignment model with one-sided adverse selection (unobserved worker characteristics) and moral hazard (unobserved worker effort). It defines a notion of stationary equilibrium in which workers are matched to tasks endogenously on the basis of observable output. For each given payment schedule, such an equilibrium exists and is unique. At equilibrium, adverse selection is eliminated and moral hazard is mitigated. Firm profit in equilibrium is compared against natural benchmarks. | Other | Game Theory and Applications | |||||
2016/01/01 00:00 | Foresighted Demand Side Management | Y. Xiao, M. van der Schaar | 2016 | https://arxiv.org/abs/1401.2185 | We consider a smart grid with an independent system operator (ISO), and distributed aggregators who have energy storage and purchase energy from the ISO to serve its customers. All the entities in the system are foresighted: each aggregator seeks to minimize its own long-term payments for energy purchase and operational costs of energy storage by deciding how much energy to buy from the ISO, and the ISO seeks to minimize the long-term total cost of the system (e.g. energy generation costs and the aggregators' costs) by dispatching the energy production among the generators. The decision making of the entities is complicated for two reasons. First, the information is decentralized: the ISO does not know the aggregators' states (i.e. their energy consumption requests from customers and the amount of energy in their storage), and each aggregator does not know the other aggregators' states or the ISO's state (i.e. the energy generation costs and the status of the transmission lines). Second, the coupling among the aggregators is unknown to them. Specifically, each aggregator's energy purchase affects the price, and hence the payments of the other aggregators. However, none of them knows how its decision influences the price because the price is determined by the ISO based on its state. We propose a design framework in which the ISO provides each aggregator with a conjectured future price, and each aggregator distributively minimizes its own long-term cost based on its conjectured price as well as its local information. The proposed framework can achieve the social optimum despite being decentralized and involving complex coupling among the various entities. | Multi-agent learning | ||||||
2016/01/01 00:00 | From Acquaintances to Friends: Homophily and Learning in Networks | S. Zhang, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7859293 | This paper considers the evolution of a network in a discrete time, stochastic setting in which agents learn about each other through repeated interactions and maintain/break links on the basis of what they learn. Agents exhibit homophily, the preference to link with others who are similar to themselves, and they have a limited capacity for links. They thus maintain links with others learned to be similar to themselves and cut links to those learned to be dissimilar to themselves. We introduce a new equilibrium concept we term “matching pairwise stable equilibrium”, and we prove that such equilibrium is unique in our model. We show that higher levels of homophily decrease the (average) number of links that agents form. However, the effect of homophily is anomalous: mutually beneficial links may be dropped before learning is completed, thereby resulting in sparser networks and less clustering than under complete information. Homophily also exhibits an interesting interaction with the presence of incomplete information: initially, greater levels of homophily increase the difference between the complete and incomplete information networks, but sufficiently high levels of homophily eventually decrease the difference. Complete and incomplete information networks differ most when the degree of homophily is intermediate. | 10.1109/JSAC.2017.2672238 | Journal | IEEE Journal on Selected Areas in Communications | Multi-agent learning | Communications and Networks, Game Theory and Applications, Networks | ||
2016/01/01 00:00 | Incentive Design in Peer Review: Rating and Repeated Endogenous Matching | Y. Xiao, F. Dörfler, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/8502090 | Peer review (e.g., grading assignments in Massive Open Online Courses (MOOCs), academic paper review) is an effective and scalable method to evaluate the products (e.g., assignments, papers) of a large number of agents when the number of dedicated reviewing experts (e.g., teaching assistants, editors) is limited. Peer review poses two key challenges: 1) identifying the reviewers' intrinsic capabilities (i.e., adverse selection) and 2) incentivizing the reviewers to exert high effort (i.e., moral hazard). Some works in mechanism design address pure adverse selection using one-shot matching rules, and pure moral hazard was addressed in repeated games with exogenously given and fixed matching rules. However, in peer review systems exhibiting both adverse selection and moral hazard, one-shot or exogenous matching rules do not link agents' current behavior with future matches and future payoffs, and as we prove, will induce myopic behavior (i.e., exerting the lowest effort) resulting in the lowest review quality. In this paper, we propose for the first time a solution that simultaneously solves adverse selection and moral hazard. Our solution exploits the repeated interactions of agents, utilizes ratings to summarize agents' past review quality, and designs matching rules that endogenously depend on agents' ratings. Our proposed matching rules are easy to implement and require no knowledge about agents' private information (e.g., their benefit and cost functions). Yet, they are effective in guiding the system to an equilibrium where the agents are incentivized to exert high effort and receive ratings that precisely reflect their review quality. Using several illustrative examples, we quantify the significant performance gains obtained by our proposed mechanism as compared to existing one-shot or exogenous matching rules. | 10.1109/TNSE.2018.2877578 | Journal | IEEE Transactions on Network Science and Engineering | Networks, Personalized Education | |||
2016/01/01 00:00 | Jamming Bandits—A Novel Learning Method for Optimal Jamming | S. Amuru, C. Tekin, M. van der Schaar, M. Buehrer | 2016 | https://ieeexplore.ieee.org/document/7362035 | Can an intelligent jammer learn and adapt to unknown environments in an electronic warfare-type scenario? In this paper, we answer this question in the positive, by developing a cognitive jammer that adaptively and optimally disrupts the communication between a victim transmitter-receiver pair. We formalize the problem using a multiarmed bandit framework where the jammer can choose various physical layer parameters such as the signaling scheme, power level and the on-off/pulsing duration in an attempt to obtain power efficient jamming strategies. We first present online learning algorithms to maximize the jamming efficacy against static transmitter-receiver pairs and prove that these algorithms converge to the optimal (in terms of the error rate inflicted at the victim and the energy used) jamming strategy. Even more importantly, we prove that the rate of convergence to the optimal jamming strategy is sublinear, i.e., the learning is fast in comparison to existing reinforcement learning algorithms, which is particularly important in dynamically changing wireless environments. Also, we characterize the performance of the proposed bandit-based learning algorithm against multiple static and adaptive transmitter-receiver pairs. | 10.1109/TWC.2015.2510643 | Journal | IEEE Transactions on Wireless Communications | Reinforcement learning | Communications and Networks | ||
2016/01/01 00:00 | Optimal Repeated Spectrum Sharing by Delay-Sensitive Users | Y. Xiao, M. van der Schaar | 2016 | https://www.cambridge.org/core/books/cloud-radio-access-networks/optimal-repeated-spectrum-sharing-by-delaysensitive-users/A6E7610A49E434CF4F718318FB525E6C | The spectrum is becoming an increasingly scarce resource, owing to the emergence of a plethora of bandwidth-intensive and delay-critical applications (e.g. multimedia streaming, video conferencing, and gaming). To achieve the gigabit data rates required by next-generation wireless systems, we need to manage efficiently the interference among a multitude of wireless devices, most of which have limited computational capability. Central to interference management are spectrum-sharing policies, which specify when and at which power level each device should access the spectrum. Given the heterogeneity and the huge number of distributed wireless devices, it is computationally hard to design efficient spectrum sharing policies. | 10.1017/9781316529669.017 | Chapter | Cloud Radio Access Networks: Principles, Technologies, and Applications | ||||
2016/01/01 00:00 | Personalized Active Learning for Activity Classification using Wireless Wearable Sensors | J. Xu, J. Y. Xu, L. Song, G. Pottie, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7452393 | Enabling accurate and low-cost classification of a range of motion activities is important for numerous applications, ranging from disease treatment and in-community rehabilitation of patients to athlete training. This paper proposes a novel contextual online learning method for activity classification based on data captured by low-cost, body-worn inertial sensors, and smartphones. The proposed method is able to address the unique challenges arising in enabling online, personalized and adaptive activity classification without requiring a training phase from the individual. Another key challenge of activity classification is that the labels may change over time, as the data as well as the activity to be monitored evolve continuously, and the true label is often costly and difficult to obtain. The proposed algorithm is able to actively learn when to ask for the true label by assessing the benefits and costs of obtaining them. We rigorously characterize the performance of the proposed learning algorithm, and our experiments show that it outperforms existing algorithms. | 10.1109/JSTSP.2016.2553648 | Journal | IEEE Journal of Selected Topics in Signal Processing | Communications and Networks |
2016/01/01 00:00 | Personalized Course Sequence Recommendations | J. Xu, T. Xiang, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7524023 | Given the variability in student learning, it is becoming increasingly important to tailor courses as well as course sequences to student needs. This paper presents a systematic methodology for offering personalized course sequence recommendations to students. First, a forward-search backward-induction algorithm is developed that can optimally select course sequences to decrease the time required for a student to graduate. The algorithm accounts for prerequisite requirements (typically present in higher level education) and course availability. Second, using the tools of multiarmed bandits, an algorithm is developed that can optimally recommend a course sequence that both reduces the time to graduate while also increasing the overall GPA of the student. The algorithm dynamically learns how students with different contextual backgrounds perform for given course sequences and, then, recommends an optimal course sequence for new students. Using real-world student data from the UCLA Mechanical and Aerospace Engineering Department, we illustrate how the proposed algorithms outperform other methods that do not include student contextual information when making course sequence recommendations. | 10.1109/TSP.2016.2595495 | Journal | IEEE Transactions on Signal Processing | Personalized Education | |||
2016/01/01 00:00 | Personalized Risk Scoring for Critical Care Patients using Mixtures of Gaussian Process Experts | A. M. Alaa, J. Yoon, S. Hu, M. van der Schaar | 2016 | https://arxiv.org/abs/1605.00959 | We develop a personalized real time risk scoring algorithm that provides timely and granular assessments for the clinical acuity of ward patients based on their (temporal) lab tests and vital signs. Heterogeneity of the patient population is captured via a hierarchical latent class model. The proposed algorithm aims to discover the number of latent classes in the patient population, and train a mixture of Gaussian Process (GP) experts, where each expert models the physiological data streams associated with a specific class. Self-taught transfer learning is used to transfer the knowledge of latent classes learned from the domain of clinically stable patients to the domain of clinically deteriorating patients. For new patients, the posterior beliefs of all GP experts about the patient's clinical status given her physiological data stream are computed, and a personalized risk score is evaluated as a weighted average of those beliefs, where the weights are learned from the patient's hospital admission information. Experiments on a heterogeneous cohort of 6,313 patients admitted to Ronald Reagan UCLA medical center show that our risk score outperforms the currently deployed risk scores, such as MEWS and Rothman scores. | Conference | ICML Workshop on Computational Frameworks for Personalization | Time series analysis | Early warning systems |
2016/01/01 00:00 | Popularity-Driven Content Caching | S. Li, J. Xu, M. van der Schaar, W. Li | 2016 | https://ieeexplore.ieee.org/document/7524381 | This paper presents a novel cache replacement method - Popularity-Driven Content Caching (PopCaching). PopCaching learns the popularity of content and uses it to determine which content it should store and which it should evict from the cache. Popularity is learned in an online fashion, requires no training phase and hence, it is more responsive to continuously changing trends of content popularity. We prove that the learning regret of PopCaching (i.e., the gap between the hit rate achieved by PopCaching and that by the optimal caching policy with hindsight) is sublinear in the number of content requests. Therefore, PopCaching converges fast and asymptotically achieves the optimal cache hit rate. We further demonstrate the effectiveness of PopCaching by applying it to a movie.douban.com dataset that contains over 38 million requests. Our results show significant cache hit rate lift compared to existing algorithms, and the improvements can exceed 40% when the cache capacity is limited. In addition, PopCaching has low complexity. | 10.1109/INFOCOM.2016.7524381 | Conference | IEEE International Conference on Computer Communications (INFOCOM) | ||||
2016/01/01 00:00 | Predicting Grades | Y. Meier, J. Xu, O. Atan, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7313031 | To increase efficacy in traditional classroom courses as well as in Massive Open Online Courses (MOOCs), automated systems supporting the instructor are needed. One important problem is to automatically detect students that are going to do poorly in a course early enough to be able to take remedial actions. Existing grade prediction systems focus on maximizing the accuracy of the prediction while overlooking the importance of issuing timely and personalized predictions. This paper proposes an algorithm that predicts the final grade of each student in a class. It issues a prediction for each student individually, when the expected accuracy of the prediction is sufficient. The algorithm learns online what is the optimal prediction and time to issue a prediction based on past history of students' performance in a course. We derive a confidence estimate for the prediction accuracy and demonstrate the performance of our algorithm on a dataset obtained based on the performance of approximately 700 UCLA undergraduate students who have taken an introductory digital signal processing course over the past seven years. We demonstrate that for 85% of the students we can predict with 76% accuracy whether they are going to do well or poorly in the class after the fourth course week. Using data obtained from a pilot course, our methodology suggests that it is effective to perform early in-class assessments such as quizzes, which result in timely performance prediction for each student, thereby enabling timely interventions by the instructor (at the student or class level) when necessary. | 10.1109/TSP.2015.2496278 | Journal | IEEE Transactions on Signal Processing | Personalized Education |
2016/01/01 00:00 | Repeated Matching Mechanism Design with Moral Hazard and Adverse Selection | K. Ahuja, M. van der Schaar | 2016 | http://arxiv.org/abs/1602.02439 | In many two-sided markets, the parties to be matched have incomplete information about their characteristics. We consider the settings where the parties engaged are extremely patient and are interested in long-term partnerships. Hence, once the final matches are determined, they persist for a long time. Each side has an opportunity to learn (some) relevant information about the other before final matches are made. For instance, clients seeking workers to perform tasks often conduct interviews that require the workers to perform some tasks and thereby provide information to both sides. The performance of a worker in such an interview - and hence the information revealed - depends both on the inherent characteristics of the worker and the task and also on the actions taken by the worker (e.g. the effort expended), which are not observed by the client. Thus there is moral hazard. Our goal is to derive a dynamic matching mechanism that facilitates learning on both sides before final matches are achieved and ensures that the worker side does not have incentive to obscure learning of their characteristics through their actions. We derive such a mechanism that leads to final matchings that achieve optimal performance (revenue) in equilibrium. We show that the equilibrium strategy is long-run coalitionally stable, which means there is no subset of workers and clients that can gain by deviating from the equilibrium strategy. We derive all the results under the modeling assumption that the utilities of the agents are defined as limit of means of the utility obtained in each interaction. | Other | Game Theory and Applications |
2016/01/01 00:00 | Reputational Learning and Network Dynamics | S. Zhang, M. van der Schaar | 2016 | https://arxiv.org/abs/1507.04065v1 | In many real world networks agents are initially unsure of each other's qualities and learn about each other over time via repeated interactions. This paper is the first to provide a methodology for studying the formation of such networks, taking into account that agents differ from each other, that they begin with incomplete information, and that they must learn through observations which connections/links to form and which to break. The network dynamics in our model vary drastically from the dynamics emerging in models of complete information. With incomplete information and learning, agents who provide high benefits will develop high reputations and remain in the network, while agents who provide low benefits will drop in reputation and become ostracized. We show, among many other things, that the information to which agents have access and the speed at which they learn and act can have tremendous impact on the resulting network dynamics. Using our model, we can also compute the ex ante social welfare given an arbitrary initial network, which allows us to characterize the socially optimal network structures for different sets of agents. Importantly, we show through examples that the optimal network structure depends sharply on both the initial beliefs of the agents, as well as the rate of learning by the agents. | Other | Game Theory and Applications, Networks |
2016/01/01 00:00 | Smart Caching in Wireless Small Cell Networks via Contextual Multi-Armed Bandits | S. Muller, O. Atan, M. van der Schaar, A. Klein | 2016 | https://ieeexplore.ieee.org/document/7511570 | A promising architecture for content caching in wireless small cell networks is storing popular files at small base stations (sBSs) with limited storage capacities. Using localized communication, an sBS serves local user requests, while reducing the load on the macro cellular network. The sBS should cache the most popular files to maximize the number of cache hits. Content popularity is described by a popularity profile containing the expected demand of each file. Assuming a fixed popularity profile of which the sBS has complete knowledge, the optimal content placement problem reduces to ranking the files according to their expected demands and caching the highest ranked ones. Instead, we assume that the popularity profile is varying, for example depending on fluctuating types of users in the vicinity of the sBS, and it is unknown a priori. We present a novel algorithm based on contextual multi-armed bandits, in which the sBS regularly updates its cache content and observes the demands for cached files in different contexts, thereby learning context-dependent popularity profiles over time. We derive a sub-linear regret bound, proving that our algorithm learns smart caching. Our numerical results confirm that by exploiting contextual information, our algorithm outperforms reference algorithms in various scenarios. | 10.1109/ICC.2016.7511570 | Conference | IEEE International Conference on Communications (ICC) | Multi-armed bandits | |||
2016/01/01 00:00 | Social Norm Incentives for Network Coding in MANETs | C. Wu, M. Gerla, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7851063 | The performance of mobile ad hoc network transmissions subject to disruption, loss, interference, and jamming can be significantly improved with the use of network coding (NC). However, NC requires extra work for forwarders, including additional bandwidth consumption due to transmitting overheads for redundant NC packets and additional processing due to generating the NC packets. Selfish forwarders may prefer to simply forward packets without coding them to avoid such overhead. This is especially true when network coding must be protected from pollution attacks, which involves additional, often processor intensive, pollution detection procedures. To drive selfish nodes to cooperate and encode the packets, this paper introduces social norm-based incentives. The social norm consists of a social strategy and a reputation system with reward and punishment connected with node behavior. Packet coding and forwarding are modeled and formalized as a repeated NC forwarding game. The conditions for the sustainability (or compliance) of the social norm are identified, and a sustainable social norm that maximizes the social utility is designed via selecting the optimal design parameters, including the social strategy, reputation threshold, reputation update frequency, and the generation size of network coding. For this game, the impacts of packet loss rate and transmission patterns on performance are evaluated, and their impacts on the decision of selecting the optimal social norm are discussed. Finally, practical issues, including distributed reputation dissemination and the existence of altruistic and malicious users, are discussed. | 10.1109/TNET.2017.2656059 | Journal | IEEE/ACM Transactions on Networking | Communications and Networks | |||
2016/01/01 00:00 | To Relay or Not to Relay: Learning Device-to-Device Relaying Strategies in Cellular Networks | N. Mastronarde, V. Patel, J. Xu, L. Liu, M. van der Schaar | 2016 | https://ieeexplore.ieee.org/document/7181721 | We consider a cellular network where mobile transceiver devices that are owned by self-interested users are incentivized to cooperate with each other using tokens, which they exchange electronically to "buy" and "sell" downlink relay services, thereby increasing the network's capacity compared to a network that only supports base station-to-device (B2D) communications. We investigate how an individual device in the network can learn its optimal cooperation policy online, which it uses to decide whether or not to provide downlink relay services for other devices in exchange for tokens. We propose a supervised learning algorithm that devices can deploy to learn their optimal cooperation strategies online given their experienced network environment. We then systematically evaluate the learning algorithm in various deployment scenarios. Our simulation results suggest that devices have the greatest incentive to cooperate when the network contains (i) many devices with high energy budgets for relaying, (ii) many highly mobile users (e.g., users in motor vehicles), and (iii) neither too few nor too many tokens. Additionally, within the token system, self-interested devices can effectively learn to cooperate online, and achieve up to 20 percent throughput gains on average compared to B2D communications alone, all while selfishly maximizing their own utilities. | 10.1109/TMC.2015.2465379 | Journal | IEEE Transactions on Mobile Computing | Multi-agent learning | Communications and Networks | ||
2016/01/01 00:00 | Trend-Aware Video Caching through Online Learning | S. Li, J. Xu, M. van der Schaar, W. Li | 2016 | https://ieeexplore.ieee.org/document/7524790 | This paper presents Trend-Caching, a novel cache replacement method that optimizes cache performance according to the trends of video content. Trend-Caching explicitly learns the popularity trend of video content and uses it to determine which video it should store and which it should evict from the cache. Popularity is learned in an online fashion and requires no training phase, hence it is more responsive to continuously changing trends of videos. We prove that the learning regret of Trend-Caching (i.e., the gap between the hit rate achieved by Trend-Caching and that by the optimal caching policy with hindsight) is sublinear in the number of video requests, thereby guaranteeing both fast convergence and asymptotically optimal cache hit rate. We further validate the effectiveness of Trend-Caching by applying it to a movie.douban.com dataset that contains over 38 million requests. Our results show significant cache hit rate lift compared to existing algorithms, and the improvements can exceed 40% when the cache capacity is limited. Furthermore, Trend-Caching has low complexity. | 10.1109/TMM.2016.2596042 | Journal | IEEE Transactions on Multimedia | Multi-agent learning | Communications and Networks | ||
2015/07/06 00:00 | Context-based Unsupervised Data Fusion for Decision Making | E. Soltanmohammadi, M. Naraghi-Pour, M. van der Schaar | 2015 | http://proceedings.mlr.press/v37/soltanmohammadi15.html | Big Data received from sources such as social media, in-stream monitoring systems, networks, and markets is often mined for discovering patterns, detecting anomalies, and making decisions or predictions. In distributed learning and real-time processing of Big Data, ensemble-based systems, in which a fusion center (FC) is used to combine the local decisions of several classifiers, have been shown to be superior to single-expert systems. However, optimal design of the FC requires knowledge of the accuracy of the individual classifiers which, in many cases, is not available. Moreover, in many applications supervised training of the FC is not feasible since the true labels of the data set are not available. In this paper, we propose an unsupervised joint estimation-detection scheme to estimate the accuracies of the local classifiers as functions of data context and to fuse the local decisions of the classifiers. Numerical results show the dramatic improvement of the proposed method as compared with state-of-the-art approaches. | Conference | ICML |||||
2015/05/09 00:00 | Global Multi-armed Bandits with Hölder Continuity | O. Atan, C. Tekin, M. van der Schaar | 2015 | http://proceedings.mlr.press/v38/atan15.html | Standard Multi-Armed Bandit (MAB) problems assume that the arms are independent. However, in many application scenarios, the information obtained by playing an arm provides information about the remainder of the arms. Hence, in such applications, this informativeness can and should be exploited to enable faster convergence to the optimal solution. In this paper, we formalize a new class of multi-armed bandit methods, the Global Multi-armed Bandit (GMAB), in which arms are globally informative through a global parameter, i.e., choosing an arm reveals information about all the arms. We propose a greedy policy for the GMAB which always selects the arm with the highest estimated expected reward, and prove that it achieves bounded parameter-dependent regret. Hence, this policy selects suboptimal arms only finitely many times, and after a finite number of initial time steps, the optimal arm is selected in all of the remaining time steps with probability one. In addition, we also study how the informativeness of the arms about each other’s rewards affects the speed of learning. Specifically, we prove that the parameter-free (worst-case) regret is sublinear in time, and decreases with the informativeness of the arms. We also prove a sublinear-in-time Bayesian risk bound for the GMAB which reduces to the well-known Bayesian risk bound for linearly parameterized bandits when the arms are fully informative. GMABs have applications ranging from drug dosage control to dynamic pricing. | Conference | AISTATS | Reinforcement learning, Multi-armed bandits ||||
2015/05/09 00:00 | Real-time stream mining: online knowledge extraction using classifier networks | L. Canzian, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7293299 | The world is increasingly information-driven. Vast amounts of data are being produced by different sources and in diverse formats. It is becoming critical to endow assessment systems with the ability to process streaming information from sensors in real time in order to better manage physical systems, derive informed decisions, tweak production processes, and optimize logistics choices. This article first surveys the works dealing with building, adapting, and managing networks of classifiers, then describes the challenges and limitations of the current approaches, discusses possible directions to deal with these limitations, and presents some open research questions that need to be investigated. | 10.1109/MNET.2015.7293299 | Journal | IEEE Network | Multi-agent learning | |||
2015/05/09 00:00 | Using contextual learning to improve diagnostic accuracy: application in breast cancer screening | L. Song, W. Hsu, J. Xu, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7064753 | Clinicians need to routinely make management decisions about patients who are at risk for a disease such as breast cancer. This paper presents a novel clinical decision support tool that is capable of helping physicians make diagnostic decisions. We apply this support system to improve the specificity of breast cancer screening and diagnosis. The system utilizes clinical context (e.g., demographics, medical history) to minimize the false positive rates while avoiding false negatives. An online contextual learning algorithm is used to update the diagnostic strategy presented to the physicians over time. We analytically evaluate the diagnostic performance loss of the proposed algorithm, in which the true patient distribution is not known and needs to be learned, as compared with the optimal strategy where all information is assumed known, and prove that the false positive rate of the proposed learning algorithm asymptotically converges to the optimum. In addition, our algorithm also has the important merit that it can provide individualized confidence estimates about the accuracy of the diagnosis recommendation. Moreover, the relevancy of contextual features is assessed, enabling the approach to identify specific contextual features that provide the most value of information in reducing diagnostic errors. Experiments were conducted using patient data collected at a large academic medical center. Our proposed approach outperforms the current clinical practice by 36% in terms of false positive rate given a 2% false negative rate. | 10.1109/JBHI.2015.2414934 | Journal | IEEE Journal of Biomedical and Health Informatics | Medical imaging | |||
2015/01/01 00:00 | A Data-Driven Approach for Matching Clinical Expertise to Individual Cases | O. Atan, C. Tekin, M. van der Schaar, W. Hsu | 2015 | https://ieeexplore.ieee.org/document/7178342 | Hospitals are increasingly utilizing business intelligence and analytics tools to mine electronic health data to uncover inefficiencies in care delivery (e.g., slow turnaround times, high readmission rates). Given that the expertise and experience of healthcare providers may vary significantly, an area of potential improvement is optimizing the way patient cases are recommended to clinical experts (e.g., the pathologist who is most adept at diagnosing a rare cancer). In this paper, we propose an expert selection system that automatically matches a given patient case to the best available expert considering both the available contextual information about a patient (e.g., demographics, medical history, signs and symptoms, past interventions) and the congestion of the expert. We prove that as the number of patients grows, the proposed algorithm will discover the best expert to select for patients with a specific context. Moreover, the algorithm also provides confidence bounds on the diagnostic accuracy of the expert it selects. While the proposed system can be applied in many scenarios, we demonstrate its performance in the context of assigning mammography exams to individual radiologists for interpretation. We show that our proposed system can improve current clinical practice by improving overall sensitivity and specificity of screening exams compared to random assignment. Finally, since each expert can only take a certain number of diagnosis decisions on a daily basis, we show how our system can take the experts' workload into account, as well as their expertise, when deciding how to select experts. | 10.1109/ICASSP.2015.7178342 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) ||||
2015/01/01 00:00 | A Dynamic Network Formation Model for Understanding Bacterial Self-Organization Into Micro-Colonies | L. Canzian, K. Zhao, G. C. Wong, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7181698 | We propose a general parametrizable model to capture the dynamic interaction among bacteria in the formation of micro-colonies. Micro-colonies represent the first social step towards the formation of structured multicellular communities known as bacterial biofilms, which protect the bacteria against antimicrobials. In our model, bacteria can form links in the form of intercellular adhesins (such as polysaccharides) to collaborate in the production of resources that are fundamental to protect them against antimicrobials. Since maintaining a link can be costly, we assume that each bacterium forms and maintains a link only if the benefit received from the link is larger than the cost, and we formalize the interaction among bacteria as a dynamic network formation game. We rigorously characterize some of the key properties of the network evolution depending on the parameters of the system. In particular, we derive the parameters under which it is guaranteed that all bacteria will join micro-colonies and the parameters under which it is guaranteed that some bacteria will not join micro-colonies. Importantly, our study does not only characterize the properties of networks emerging in equilibrium, but it also provides important insights on how the network dynamically evolves and on how the formation history impacts the emerging networks in equilibrium. This analysis can be used to develop methods to influence on-the-fly the evolution of the network, and such methods can be useful to treat or prevent biofilm-related diseases. | 10.1109/TMBMC.2015.2465515 | Journal | IEEE Transactions on Molecular, Biological, and Multi-Scale Communications | Communications and Networks, Networks |||
2015/01/01 00:00 | A Reinforcement Learning-based Data-Link Protocol for Underwater Acoustic Communications | V. Di Valerio, C. Petrioli, L. Pescosolido, M. van der Schaar | 2015 | https://dl.acm.org/doi/10.1145/2831296.2831338 | We consider an underwater acoustic link where a sender transmits a flow of packets to a receiver through a channel with time varying quality. We address the problem of scheduling packet transmissions, forward error correction (FEC) code selection, and channel probing to achieve the best trade-off between energy consumption and latency. Unlike previous works, which assume complete knowledge of the statistics of the underwater acoustic environment, we make the protocol learn the optimal behavior based on experience, without relying on any prior knowledge on the environment. We design a Reinforcement Learning (RL)-based protocol which learns how to minimize a cost function that combines delay and energy consumption, while at the same time ensuring packet delivery. Starting from a basic Q-learning strategy, we design two learning algorithms to speed up learning time, and compare the performance of the proposed solutions with the Q-learning-based strategy and with an aggressive strategy which always transmits all the packets in the buffer. The results show that the proposed techniques outperform the aggressive policy and Q-learning, and are successful in achieving good tradeoffs between energy consumption and packet delivery latency (PDL). | 10.1145/2831296.2831338 | Conference | ACM International Conference on Underwater Networks & Systems (WUWNet) | Reinforcement learning |||
2015/01/01 00:00 | A Systematic Learning Method for Optimal Jamming | S. Amuru, C. Tekin, M. van der Schaar, M. Buehrer | 2015 | https://ieeexplore.ieee.org/document/7248754 | Can an intelligent jammer learn and adapt to unknown environments in an electronic warfare-type scenario? In this paper, we answer this question in the positive, by developing a cognitive jammer that disrupts the communication between a victim transmitter-receiver pair. We formalize the problem using a novel multi-armed bandit framework where the jammer can choose various physical layer parameters such as signaling scheme, power level and the on-off/pulsing duration in an attempt to obtain power efficient jamming strategies. We first present novel online learning algorithms to maximize the jamming efficacy against static transmitter-receiver pairs i.e., the case when the victim does not change its communication technique despite the presence of interference. We prove that our learning algorithm converges to the optimal jamming strategy. Even more importantly, we prove that the rate of convergence to the optimal jamming strategy is sub-linear, i.e. the learning is fast, which is important in dynamically changing wireless environments. Also, we characterize the performance of the proposed bandit-based learning algorithm against adaptive transmitter-receiver pairs. | 10.1109/ICC.2015.7248754 | Conference | IEEE International Conference on Communications (ICC) | Multi-armed bandits | |||
2015/01/01 00:00 | Active Learning in Context-Driven Stream Mining with an Application to Image Mining | C. Tekin, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7126997 | We propose an image stream mining method in which images arrive with contexts (metadata) and need to be processed in real time by the image mining system (IMS), which needs to make predictions and derive actionable intelligence from these streams. After extracting the features of the image by preprocessing, IMS determines online the classifier to use on the extracted features to make a prediction using the context of the image. A key challenge associated with stream mining is that the prediction accuracy of the classifiers is unknown, since the image source is unknown; thus, these accuracies need to be learned online. Another key challenge of stream mining is that learning can only be done by observing the true label, but this is costly to obtain. To address these challenges, we model the image stream mining problem as an active, online contextual experts problem, where the context of the image is used to guide the classifier selection decision. We develop an active learning algorithm and show that it achieves regret sublinear in the number of images that have been observed so far. To further illustrate and assess the performance of our proposed methods, we apply them to diagnose breast cancer from the images of cellular samples obtained from the fine needle aspirate of breast mass. Our findings show that very high diagnosis accuracy can be achieved by actively obtaining only a small fraction of true labels through surgical biopsies. Other applications include video surveillance and video traffic monitoring. | 10.1109/TIP.2015.2446936 | Journal | IEEE Transactions on Image Processing | Reinforcement learning | Medical imaging | ||
2015/01/01 00:00 | Adaptive Prioritized Random Linear Coding and Scheduling for Layered Data Delivery from Multiple Servers | N. Thomos, E. Kurdoglu, P. Frossard, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7091934 | In this paper, we deal with the problem of jointly determining the optimal coding strategy and the scheduling decisions when receivers obtain layered data from multiple servers. The layered data is encoded by means of prioritized random linear coding (PRLC) in order to be resilient to channel loss while respecting the unequal levels of importance in the data, and data blocks are transmitted simultaneously in order to reduce decoding delays and improve the delivery performance. We formulate the optimal coding and scheduling decisions problem in our novel framework with the help of Markov decision processes (MDP), which are effective tools for modeling adaptive streaming systems. Reinforcement learning approaches are then proposed to derive reduced computational complexity solutions to the adaptive coding and scheduling problems. The novel reinforcement learning approaches and the MDP solution are examined in an illustrative example for scalable video transmission. Our methods offer large performance gains over competing methods that deliver the data blocks sequentially. The experimental evaluation also shows that our novel algorithms offer continuous playback and guarantee small quality variations, which is not the case for baseline solutions. Finally, our work highlights the advantages of reinforcement learning algorithms to forecast the temporal evolution of data demands and to decide the optimal coding and scheduling decisions. | 10.1109/TMM.2015.2425228 | Journal | IEEE Transactions on Multimedia | Communications and Networks |||
2015/01/01 00:00 | Caring Analytics for Adults With Special Needs | M. Wolf, M. van der Schaar, H. Kim, J. Xu | 2015 | https://ieeexplore.ieee.org/document/7118178 | We propose a novel caring analytics system for assisting with the long-term care of adults with special needs. Our proposed system combines sensor network-driven activity analysis and online learning algorithms to analyze each resident's care. The analysis should result in a variety of reports and alerts on activities of interest (is the resident eating regularly?) as well as recommendations (try a different type of food). We do so in a complex environment: each home contains several residents, one or more caregivers, and visitors. Our system must extract the activity of each resident from this noisy environment. Moreover, the conditions of the residents vary widely, and the recommendation system must be robust even though the available information may be limited. | 10.1109/MDAT.2015.2441717 | Journal | IEEE Design & Test | Reinforcement learning | |||
2015/01/01 00:00 | Big-Data Streaming Applications Scheduling with Online Learning and Concept Drift Detection | K. Kanoun, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7092635 | Several techniques have been proposed to adapt Big-Data streaming applications to resource constraints. These techniques are mostly implemented at the application layer, make simplistic assumptions about the system resources, and are often agnostic to the system capabilities. Moreover, they often assume that the data streams' characteristics and their processing needs are stationary, which is not true in practice. In fact, data streams are highly dynamic and may also experience concept drift, thereby requiring continuous online adaptation of the throughput and quality to each processing task. Hence, existing solutions for Big-Data streaming applications are often too conservative or too aggressive. To address these limitations, we propose an online energy-efficient scheduler which maximizes the QoS (i.e., throughput and output quality) of Big-Data streaming applications under energy and resource constraints. Our scheduler uses online adaptive reinforcement learning techniques and requires no offline information. Moreover, our scheduler is able to detect concept drifts and to smoothly adapt the scheduling strategy. Our experiments, performed on a chain of tasks modeling a real-life streaming application, demonstrate that our scheduler is able to learn the scheduling policy and to adapt it such that it maximizes the targeted QoS under a given energy constraint as the Big-Data characteristics dynamically change. | 10.7873/DATE.2015.0786 | Conference | Design, Automation & Test in Europe Conference & Exhibition (DATE) ||||
2015/01/01 00:00 | Contextual Online Learning for Multimedia Content Aggregation | C. Tekin, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7041202 | The last decade has witnessed a tremendous growth in the volume as well as the diversity of multimedia content generated by a multitude of sources (news agencies, social media, etc.). Faced with a variety of content choices, consumers are exhibiting diverse preferences for content; their preferences often depend on the context in which they consume content as well as various exogenous events. To satisfy the consumers' demand for such diverse content, multimedia content aggregators (CAs) have emerged which gather content from numerous multimedia sources. A key challenge for such systems is to accurately predict what type of content each of its consumers prefers in a certain context, and adapt these predictions to the evolving consumers' preferences, contexts, and content characteristics. We propose a novel, distributed, online multimedia content aggregation framework, which gathers content generated by multiple heterogeneous producers to fulfill its consumers' demand for content. Since both the multimedia content characteristics and the consumers' preferences and contexts are unknown, the optimal content aggregation strategy is unknown a priori. Our proposed content aggregation algorithm is able to learn online what content to gather and how to match content and users by exploiting similarities between consumer types. We prove bounds for our proposed learning algorithms that guarantee both the accuracy of the predictions as well as the learning speed. Importantly, our algorithms operate efficiently even when feedback from consumers is missing or content and preferences evolve over time. Illustrative results highlight the merits of the proposed content aggregation system in a variety of settings. | 10.1109/TMM.2015.2403234 | Journal | IEEE Transactions on Multimedia ||||
2015/01/01 00:00 | Discover Relevant Sources: A Multi-Armed Bandit Approach | O. Atan, M. van der Schaar | 2015 | https://www.vanderschaar-lab.com/papers/Sourceselection.pdf | Existing work on online learning for decision making takes the information available as a given and focuses solely on choosing the best actions given this information. Instead, in this paper, the decision maker needs to simultaneously learn both what decisions to make and what source(s) of information to consult/gather data from in order to inform its decisions such that its reward is maximized. We formalize this dual-learning and online decision making problem as a multi-armed bandit problem. If it were known in advance which sources were relevant for which decisions, the problem would be simple, but this information is not known in advance. We propose algorithms that discover the relevant source(s) over time, while simultaneously learning what actions to take based on the information revealed by the selected source(s). Our algorithm resembles that of the well-known UCB algorithm but adds to it the online discovery of what specific sources are relevant to consult to inform specific decisions. We prove logarithmic regret bounds and also provide a matching lower bound on the number of times a wrong source is selected, which is achieved by RSUCB for specific cases. The proposed algorithm can be applied in many applications including clinical decision assist systems for medical diagnosis, recommender systems, actionable intelligence, etc. where observing the complete information of a patient or a consumer or consulting all the available sources to gather intelligence is not feasible. | Other | Feature selection, Reinforcement learning, Multi-armed bandits |||||
2015/01/01 00:00 | Discover the Expert: Context-Adaptive Expert Selection for Medical Diagnosis | C. Tekin, O. Atan, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/6998045 | In this paper, we propose an expert selection system that learns online the best expert to assign to each patient depending on the context of the patient. In general, the context can include an enormous number and variety of information related to the patient's health condition, age, gender, previous drug doses, and so forth, but the most relevant information is embedded in only a few contexts. If these most relevant contexts were known in advance, learning would be relatively simple but they are not. Moreover, the relevant contexts may be different for different health conditions. To address these challenges, we develop a new class of algorithms aimed at discovering the most relevant contexts and the best clinic and expert to use to make a diagnosis given a patient's contexts. We prove that as the number of patients grows, the proposed context-adaptive algorithm will discover the optimal expert to select for patients with a specific context. Moreover, the algorithm also provides confidence bounds on the diagnostic accuracy of the expert it selects, which can be considered by the primary care physician before making the final decision. While our algorithm is general and can be applied in numerous medical scenarios, we illustrate its functionality and performance by applying it to a real-world breast cancer diagnosis data set. Finally, while the application we consider in this paper is medical diagnosis, our proposed algorithm can be applied in other environments where expertise needs to be discovered. | 10.1109/TETC.2014.2386133 | Journal | IEEE Transactions on Emerging Topics in Computing | Multi-agent learning, Reinforcement learning | Clinical practice | Communications and Networks, Networks, Personalized Education | |
2015/01/01 00:00 | Discovering Action-Dependent Relevance: Learning from Logged Data | O. Atan, C. Tekin, J. Xu, M. van der Schaar | 2015 | https://www.vanderschaar-lab.com/papers/Discovering_Relevance_v4.pdf | In many learning problems, the decision maker is provided with various (types of) context information that she might utilize to select actions in order to maximize performance/rewards. But not all information is equally relevant: some context information may be more relevant to the decision problem at hand. Discovering and exploiting the most relevant context information speeds up learning, reduces costs and eliminates noise introduced by irrelevant context information. In many settings, discovering and exploiting the most relevant context information converts intractable problems into tractable problems. This paper develops methods to discover the relevant context information and learn the best actions to take on the basis of a logged bandit dataset and establishes performance bounds for these methods. These methods deal effectively with the two central challenges. The first is that only the rewards of actions actually taken will be observed; counterfactual reward observations are not available. The second is that the relevant context information can be different for different actions. Applications of these methods include clinical decision support systems, smart cities, recommender systems. | Other | Feature selection, Reinforcement learning | |||||
2015/01/01 00:00 | Distributed Interference Management Policies for Heterogeneous Small Cell Networks | K. Ahuja, Y. Xiao, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7067417 | We study the problem of distributed interference management in a network of heterogeneous small cells with different cell sizes, different numbers of user equipments (UEs) served, and different throughput requirements by UEs. We consider the uplink transmission, where each UE determines when and at what power level it should transmit to its serving small cell base station (SBS). We propose a general framework for designing distributed interference management policies, which exploits weak interference among non-neighboring UEs by letting them transmit simultaneously (i.e., spatial reuse), while eliminating strong interference among neighboring UEs by letting them transmit in different time slots. The design of optimal interference management policies has two key steps. Ideally, we need to find all the subsets of non-interfering UEs, i.e., the maximal independent sets (MISs) of the interference graph, but this is computationally intractable even when solved in a centralized manner. Then, to maximize some given network performance criterion subject to UEs' minimum throughput requirements, we need to determine the optimal fraction of time occupied by each MIS, which requires global information (e.g., all the UEs' throughput requirements and channel gains). In our framework, we first propose a distributed algorithm for the UE-SBS pairs to find a subset of MISs in logarithmic time (with respect to the number of UEs). Then we propose a novel problem reformulation which enables UE-SBS pairs to determine the optimal fraction of time occupied by each MIS with only local message exchange among the neighbors in the interference graph. Despite the fact that our interference management policies are distributed and utilize only local information, we can analytically bound their performance under a wide range of heterogeneous deployment scenarios in terms of the competitive ratio with respect to the optimal network performance, which can only be obtained in a centralized manner with NP complexity. Remarkably, we prove that the competitive ratio is independent of the network size. Through extensive simulations, we show that our proposed policies achieve significant performance improvements (ranging from 160% to 700%) over state-of-the-art policies. | 10.1109/JSAC.2015.2417014 | Journal | IEEE Journal on Selected Areas in Communications | Communications and Networks |||
2015/01/01 00:00 | Distributed Multi-Agent Online Learning Based on Global Feedback | J. Xu, C. Tekin, S. Zhang, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7041172 | In this paper, we develop online learning algorithms that enable the agents to cooperatively learn how to maximize the overall reward in scenarios where only noisy global feedback is available without exchanging any information among themselves. We prove that our algorithms' learning regrets (the losses incurred by the algorithms due to uncertainty) are logarithmically increasing in time and thus the time average reward converges to the optimal average reward. Moreover, we also illustrate how the regret depends on the size of the action space, and we show that this relationship is influenced by the informativeness of the reward structure with regard to each agent's individual action. When the overall reward is fully informative, regret is shown to be linear in the total number of actions of all the agents. When the reward function is not informative, regret is linear in the number of joint actions. Our analytic and numerical results show that the proposed learning algorithms significantly outperform existing online learning solutions in terms of regret and learning speed. We illustrate how our theoretical framework can be used in practice by applying it to online Big Data mining using distributed classifiers. | 10.1109/TSP.2015.2403288 | Journal | IEEE Transactions on Signal Processing | Communications and Networks, Networks |||
2015/01/01 00:00 | Distributed Online Learning via Cooperative Contextual Bandits | C. Tekin, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7103356 | In this paper, we propose a novel framework for decentralized, online learning by many learners. At each moment of time, an instance characterized by a certain context may arrive to each learner; based on the context, the learner can select one of its own actions (which gives a reward and provides information) or request assistance from another learner. In the latter case, the requester pays a cost and receives the reward but the provider learns the information. In our framework, learners are modeled as cooperative contextual bandits. Each learner seeks to maximize the expected reward from its arrivals, which involves trading off the reward received from its own actions, the information learned from its own actions, the reward received from the actions requested of others and the cost paid for these actions-taking into account what it has learned about the value of assistance from each other learner. We develop distributed online learning algorithms and provide analytic bounds to compare the efficiency of these algorithms with the complete-knowledge (oracle) benchmark (in which the expected reward of every action in every context is known by every learner). Our estimates show that regret (the loss incurred by the algorithm) is sublinear in time. Our theoretical framework can be used in many practical applications including Big Data mining, event detection in surveillance sensor networks and distributed online recommendation systems. | 10.1109/TSP.2015.2430837 | Journal | IEEE Transactions on Signal Processing | Multi-agent learning, Reinforcement learning | Communications and Networks, Networks ||
2015/01/01 00:00 | Dynamic Network Formation with Incomplete Information | Y. Song, M. van der Schaar | 2015 | https://link.springer.com/article/10.1007/s00199-015-0858-y | How do networks form and what is their ultimate topology? Most of the literature that addresses these questions assumes complete information: agents know in advance the value of linking even with agents they have never met and with whom they have had no previous interaction (direct or indirect). This paper addresses the same questions under the much more natural assumption of incomplete information: agents do not know in advance—but must learn—the value of linking. We show that incomplete information has profound implications for the formation process and the ultimate topology. Under complete information, the network topologies that form and are stable typically consist of agents of relatively high value only. Under incomplete information, a much wider collection of network topologies can emerge and be stable. Moreover, even with the same topology, the locations of agents can be very different: An agent can achieve a central position purely as the result of chance rather than as the result of merit. All of this can occur even in settings where agents eventually learn everything so that information, although initially incomplete, eventually becomes complete. The ultimate network topology depends significantly on the formation history, which is natural and true in practice, and incomplete information makes this phenomenon more prevalent. | 10.1007/s00199-015-0858-y | Journal | Economic Theory | Multi-agent learning | Communications and Networks, Game Theory and Applications, Networks | ||
2015/01/01 00:00 | Dynamic Pricing and Energy Consumption Scheduling with Reinforcement Learning | B.-G. Kim, Y. Zhang, M. van der Schaar, J.-W. Lee | 2015 | https://ieeexplore.ieee.org/document/7321806 | In this paper, we study a dynamic pricing and energy consumption scheduling problem in the microgrid where the service provider acts as a broker between the utility company and customers by purchasing electric energy from the utility company and selling it to the customers. For the service provider, even though dynamic pricing is an efficient tool to manage the microgrid, the implementation of dynamic pricing is highly challenging due to the lack of the customer-side information and the various types of uncertainties in the microgrid. Similarly, the customers also face challenges in scheduling their energy consumption due to the uncertainty of the retail electricity price. In order to overcome the challenges of implementing dynamic pricing and energy consumption scheduling, we develop reinforcement learning algorithms that allow each of the service provider and the customers to learn its strategy without a priori information about the microgrid. Through numerical results, we show that the proposed reinforcement learning-based dynamic pricing algorithm can effectively work without a priori information about the system dynamics and the proposed energy consumption scheduling algorithm further reduces the system cost thanks to the learning capability of each customer. | 10.1109/TSG.2015.2495145 | Journal | IEEE Transactions on Smart Grid | Reinforcement learning | Communications and Networks | ||
2015/01/01 00:00 | Efficient Interference Management Policies for Femtocell Networks | K. Ahuja, Y. Xiao, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7098410 | Managing interference in a network of macrocells underlaid with femtocells presents an important, yet challenging problem. A majority of spatial (frequency/time) reuse based approaches partition the users based on coloring the interference graph, which is shown to be suboptimal. Some spatial time reuse based approaches schedule the maximal independent sets (MISs) in a cyclic, (weighted) round-robin fashion, which is inefficient for delay-sensitive applications. Our proposed policies schedule the MISs in a non-cyclic fashion, aiming to optimize any given network performance criterion for delay-sensitive applications while fulfilling minimum throughput requirements of the users. Importantly, we do not take the interference graph as given as in existing works; we propose an optimal construction of the interference graph. We prove that under certain conditions, the proposed policy achieves the optimal network performance. For large networks, we propose a low-complexity algorithm for computing the proposed policy. We show that the policy computed achieves a constant competitive ratio (with respect to the optimal network performance), which is independent of the network size, under a wide range of deployment scenarios. The policy can be implemented in a decentralized manner by the users. Compared to the existing policies, our proposed policies can achieve improvement of up to 130% in large-scale deployments. | 10.1109/TWC.2015.2428239 | Journal | IEEE Transactions on Wireless Communications | Communications and Networks |||
2015/01/01 00:00 | Efficient Outcomes in Repeated Games with Limited Monitoring | M. van der Schaar, Y. Xiao, W. R. Zame | 2015 | https://link.springer.com/article/10.1007/s00199-015-0893-8 | The folk theorem for infinitely repeated games with imperfect public monitoring implies that for a general class of games, nearly efficient payoffs can be supported in perfect public equilibrium (PPE), provided the monitoring structure is sufficiently rich and players are arbitrarily patient. This paper shows that for stage games in which actions of players interfere strongly with each other, exactly efficient payoffs can be supported in PPE even when the monitoring structure is not rich and players are not arbitrarily patient. The class of stage games we study abstracts many environments including resource sharing. | 10.1007/s00199-015-0893-8 | Journal | Economic Theory | Communications and Networks, Game Theory and Applications |||
2015/01/01 00:00 | Efficient Working and Shirking in Information Sharing Networks | J. Xu, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7012009 | In many systems, agents interact repeatedly with each other over an exogenously determined network and need to cooperate with each other by producing and sharing valuable knowledge or information with the agents with which they are connected. However, producing and sharing information can be costly for the agents themselves, while providing no direct immediate benefit to them. Hence, there are incentives for individual agents to shirk rather than to work, that is, to free ride on the information production and sharing of other agents rather than to produce information themselves. In this paper, we develop a systematic framework for designing rating systems aimed at promoting efficient production and sharing in these networks, thereby significantly improving the social welfare (i.e., sum utility of agents) of such networks. The proposed schemes operate effectively even in settings where monitoring of agent behavior is subject to significant errors. In many scenarios our schemes achieve maximum social welfare; in others, we prove that optimal schemes necessarily fall short of maximum social welfare due to imperfect monitoring. The distinction between these scenarios arises from the tension between the social value of producing for others and the strategic value of withholding production. In some scenarios, the optimal scheme allows that less-productive agents shirk (not produce); this creates the largest incentives for more-productive agents to work at the socially-desired level. We establish conditions under which recommending “work” to all agents is the optimal strategy and develop low-complexity algorithms to determine the optimal strategy in general settings for arbitrary information sharing networks. | 10.1109/JSAC.2015.2393432 | Journal | IEEE Journal on Selected Areas in Communications | Multi-agent learning | Communications and Networks, Networks ||
2015/01/01 00:00 | Ensemble of Distributed Learners for Online Classification of Dynamic Data Streams | L. Canzian, Y. Zhang, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7274771 | We present a distributed online learning scheme to classify data captured from distributed and dynamic data sources. Our scheme consists of multiple distributed local learners, which analyze different streams of data that are correlated to a common event that needs to be classified. Each learner uses a local classifier to make a local prediction. The local predictions are then collected by each learner and combined using a weighted majority rule to output the final prediction. We propose a novel online ensemble learning algorithm to update the aggregation rule in order to adapt to the underlying data dynamics. We rigorously determine an upper bound for the worst-case mis-classification probability of our algorithm, which tends asymptotically to 0 if the misclassification probability of the best (unknown) static aggregation rule is 0. Then we extend our algorithm to address challenges specific to the distributed implementation and prove new bounds that apply to these settings. Finally, we test our scheme by performing an evaluation study on several data sets. | 10.1109/TSIPN.2015.2470125 | Journal | IEEE Transactions on Signal and Information Processing over Networks | Ensemble learning | |||
2015/01/01 00:00 | eTutor: Online Learning for Personalized Education | C. Tekin, J. Braun, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7179032 | Given recent advances in information technology and artificial intelligence, web-based education systems have become complementary and, in some cases, viable alternatives to traditional classroom teaching. The popularity of these systems stems from their ability to make education available to a large demographic (see MOOCs). However, existing systems do not take advantage of the personalization which becomes possible when web-based education is offered: they continue to be one-size-fits-all. In this paper, we aim to provide a first systematic method for designing a personalized web-based education system. Personalizing education is challenging: (i) students need to be provided personalized teaching and training depending on their contexts (e.g. classes already taken, methods of learning preferred, etc.), (ii) for each specific context, the best teaching and training method (e.g. type and order of teaching materials to be shown) must be learned, (iii) teaching and training should be adapted online, based on the scores/feedback (e.g. tests, quizzes, final exam, likes/dislikes etc.) of the students. Our personalized online system, e-Tutor, is able to address these challenges by learning how to adapt the teaching methodology (in this case what sequence of teaching material to present to a student) to maximize her performance in the final exam, while minimizing the time spent by the students to learn the course (and possibly dropouts). We illustrate the efficiency of the proposed method on a real-world eTutor platform which is used for remedial training for a Digital Signal Processing (DSP) course. | 10.1109/ICASSP.2015.7179032 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) ||||
2015/01/01 00:00 | Game Theoretic Design of MAC Protocols: Pricing versus Intervention | L. Canzian, M. Zorzi, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7247691 | In many wireless communication networks a common channel is shared by multiple users who must compete to gain access to it. The operation of the network by self-interested and strategic users usually leads to the overuse of the channel resources and to substantial inefficiencies. Hence, incentive schemes are needed to overcome the inefficiencies of non-cooperative equilibrium. In this work, we consider a slotted-Aloha-like random access protocol and two incentive schemes: pricing and intervention. We provide criteria for the protocol designer to choose between the two schemes and to design the best policy for the selected scheme, depending on the system parameters. Our results show that intervention can achieve the maximum efficiency in the perfect monitoring scenario. In the imperfect monitoring scenario, instead, the performance of the system depends on the beliefs of the different entities and, in some cases, there exists a threshold for the number of users such that, for a number of users lower than the threshold, intervention outperforms pricing, whereas for a number of users higher than the threshold pricing outperforms intervention. | 10.1109/TCOMM.2015.2477341 | Journal | IEEE Transactions on Communications | Communications and Networks |||
2015/01/01 00:00 | Incentive-Compatible Demand-Side Management for Smart Grids based on Review Strategies | J. Xu, M. van der Schaar | 2015 | https://asp-eurasipjournals.springeropen.com/articles/10.1186/s13634-015-0235-9 | Demand-side load management is able to significantly improve the energy efficiency of smart grids. Since the electricity production cost depends on the aggregate energy usage of multiple consumers, an important incentive problem emerges: self-interested consumers want to increase their own utilities by consuming more than the socially optimal amount of energy during peak hours since the increased cost is shared among the entire set of consumers. To incentivize self-interested consumers to take the socially optimal scheduling actions, we design a new class of protocols based on review strategies. These strategies work as follows: first, a review stage takes place in which a statistical test is performed based on the daily prices of the previous billing cycle to determine whether or not the other consumers schedule their electricity loads in a socially optimal way. If the test fails, the consumers trigger a punishment phase in which, for a certain time, they adjust their energy scheduling in such a way that everybody in the consumer set is punished due to an increased price. Using a carefully designed protocol based on such review strategies, consumers then have incentives to take the socially optimal load scheduling to avoid entering this punishment phase. We rigorously characterize the impact of deploying protocols based on review strategies on the system’s as well as the users’ performance and determine the optimal design (optimal billing cycle, punishment length, etc.) for various smart grid deployment scenarios. Even though this paper considers a simplified smart grid model, our analysis provides important and useful insights for designing incentive-compatible demand-side management schemes based on aggregate energy usage information in a variety of practical scenarios. | 10.1186/s13634-015-0235-9 | Journal | EURASIP Journal on Advances in Signal Processing | ||||
2015/01/01 00:00 | Information-Sharing over Adaptive Networks with Self-interested Agents | C.-K. Yu, M. van der Schaar, A. H. Sayed | 2015 | https://ieeexplore.ieee.org/document/7128740 | We examine the behavior of multiagent networks where information-sharing is subject to a positive communications cost over the edges linking the agents. We consider a general mean-square-error formulation, where all agents are interested in estimating the same target vector. We first show that in the absence of any incentives to cooperate, the optimal strategy for the agents is to behave in a selfish manner with each agent seeking the optimal solution independently of the other agents. Pareto inefficiency arises as a result of the fact that agents are not using historical data to predict the behavior of their neighbors and to know whether they will reciprocate and participate in sharing information. Motivated by this observation, we develop a reputation protocol to summarize the opponent's past actions into a reputation score, which can then be used to form a belief about the opponent's subsequent actions. The reputation protocol entices agents to cooperate and turns their optimal strategy into an action-choosing strategy that enhances the overall social benefit of the network. In particular, we show that when the communications cost becomes large, the expected social benefit of the proposed protocol outperforms the social benefit that is obtained by cooperative agents that always share data. We perform a detailed mean-square-error analysis of the evolution of the network over three domains: (1) far-field; (2) near-field; and (3) middle-field, and show that the network behavior is stable for sufficiently small step-sizes. The various theoretical results are illustrated by numerical simulations. | 10.1109/TSIPN.2015.2447832 | Journal | IEEE Transactions on Signal and Information Processing over Networks | Communications and Networks |||
2015/01/01 00:00 | Mining the Situation: Spatiotemporal Traffic Prediction with Big Data | J. Xu, D. Deng, U. Demiryurek, C. Shahabi, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7001625 | With the vast availability of traffic sensors from which traffic information can be derived, a lot of research effort has been devoted to developing traffic prediction techniques, which in turn improve route navigation, traffic regulation, urban area planning, etc. One key challenge in traffic prediction is how much to rely on prediction models that are constructed using historical data in real-time traffic situations, which may differ from that of the historical data and change over time. In this paper, we propose a novel online framework that could learn from the current traffic situation (or context) in real-time and predict the future traffic by matching the current situation to the most effective prediction model trained using historical data. As real-time traffic arrives, the traffic context space is adaptively partitioned in order to efficiently estimate the effectiveness of each base predictor in different situations. We obtain and prove both short-term and long-term performance guarantees (bounds) for our online algorithm. The proposed algorithm also works effectively in scenarios where the true labels (i.e., realized traffic) are missing or become available with delay. Using the proposed framework, the context dimension that is the most relevant to traffic prediction can also be revealed, which can further reduce the implementation complexity as well as inform traffic policy making. Our experiments with real-world data in real-life conditions show that the proposed approach significantly outperforms existing solutions. | 10.1109/JSTSP.2015.2389196 | Journal | IEEE Journal of Selected Topics in Signal Processing | Multi-agent learning | |||
2015/01/01 00:00 | Multiobjective Design Optimization in the Lightweight Dataflow for DDDAS Environment (LiD4E) | K. Sudusinghe, Y. Jiao, H. Ben Salem, M. van der Schaar, S. S. Bhattacharyya | 2015 | https://www.sciencedirect.com/science/article/pii/S1877050915011722 | In this paper, we introduce new methods for multiobjective, system-level optimization that have been incorporated into the Lightweight Dataflow for Dynamic Data Driven Application Systems (DDDAS) Environment (LiD4E). LiD4E is a design tool for optimized implementation of dynamic, data-driven stream mining systems using high-level dataflow models of computation. More specifically, we develop in this paper new methods for integrated modeling and optimization of real-time stream mining constraints, multidimensional stream mining performance (precision and recall), and energy efficiency. Using a design methodology centered on data-driven control of and coordination between alternative dataflow subsystems for stream mining (classification modes), we develop systematic methods for exploring complex, multidimensional design spaces associated with dynamic stream mining systems, and deriving sets of Pareto-optimal system configurations that can be switched among based on data characteristics and operating constraints. | 10.1016/j.procs.2015.05.364 | Conference | International Conference on Computational Science (ICCS) | ||||
2015/01/01 00:00 | Network Formation Games based on Conditional Independence Graphs | S. Barbarossa, P. Di Lorenzo, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7178510 | The goal of this paper is to propose a network formation game where strategic agents decide whether to form or sever a link with other agents depending on the net balance between the benefit resulting from the additional information coming from the new link and the cost associated to establish the link. Unlike previous works, where the benefits are functions of the distances among the involved agents, in our work the benefit is a function of the mutual information that can be exchanged among the agents, conditioned on the information already available before setting up the link. An interesting result of our network formation game is that, under certain conditions, the final network topology tends to match the topology of the Markov graph describing the conditional independencies among the random variables observed in each node, at least when the cost of forming a link is small. | 10.1109/ICASSP.2015.7178510 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) ||||
2015/01/01 00:00 | Online Transfer Learning for Differential Diagnosis Determination | J. Xu, D. Sow, D. S. Turaga, M. van der Schaar | 2015 | https://aaai.org/ocs/index.php/WS/AAAIW15/paper/view/10102 | In this paper we present a novel online transfer learning approach to determine the set of tests to perform, and the sequence in which they need to be performed, in order to develop an accurate diagnosis while minimizing the cost of performing the tests. Our learning approach can be incorporated as part of a clinical decision support system (CDSS) with which clinicians can interact. The approach builds on a contextual bandit framework and uses online transfer learning to overcome limitations with the availability of rich training data sets that capture different conditions, context, test results as well as outcomes. We provide confidence bounds for our recommended policies, which is essential in order to build the trust of clinicians. We evaluate the algorithm against different transfer learning approaches on real-world patient alarm datasets collected from Neurological Intensive Care Units (reducing costs by 20%). | Conference | AAAI Workshop on the World Wide Web and Public Health Intelligence | Transfer learning ||||
2015/01/01 00:00 | Optimal Foresighted Multi-User Wireless Video | Y. Xiao, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/6840971 | Recent years have seen an explosion in wireless video communication systems. Optimization in such systems is crucial - but most existing methods intended to optimize the performance of multi-user wireless video transmission are inefficient. Some works (e.g., Network Utility Maximization (NUM)) are myopic: they choose actions to maximize instantaneous video quality while ignoring the future impact of these actions. Such myopic solutions are known to be inferior to foresighted solutions that optimize the long-term video quality. Alternatively, foresighted solutions such as rate-distortion optimized packet scheduling focus on single-user wireless video transmission, while ignoring the resource allocation among the users. In this paper, we propose a general framework of foresighted resource allocation among multiple video users sharing a wireless network. Our framework allows each user to flexibly choose individual cross-layer strategies. Our proposed resource allocation is optimal in terms of the total payoff (e.g., video quality) of the users. A key challenge in developing foresighted solutions for multiple video users is that the users' decisions are coupled. To decouple the users' decisions, we adopt a novel dual decomposition approach, which differs from the conventional optimization solutions such as NUM, and determines foresighted policies. Specifically, we propose an informationally-decentralized algorithm in which the network manager updates state- and user-dependent resource “prices” (i.e., the dual variables associated with the resource constraints), and the users make individual packet scheduling decisions based on these prices. Because a priori knowledge of the system dynamics is almost never available at run-time, the proposed solution can learn online while performing the foresighted optimization. Simulation results show 7 dB and 3 dB improvements in Peak Signal-to-Noise Ratio (PSNR) over myopic solutions and existing foresighted solutions, respectively. | 10.1109/JSTSP.2014.2332299 | Journal | IEEE Journal of Selected Topics in Signal Processing | Multi-agent learning | Communications and Networks | ||
2015/01/01 00:00 | Optimal Intervention for Incentivizing the Adoption of Commercial Electric Vehicles | Y. Xiao, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7418247 | While electric vehicles (EVs) have great potential in reducing greenhouse gas emissions, successful EV adoption depends largely on the availability of public charging stations. The lack of charging stations poses an even more serious problem for the adoption of commercial EVs (e.g., trucks used for freight transportation), because the (commercial EV) fleet planner needs to build their own specialized charging stations, requiring additional investment in commercial EV purchase. This paper presents a first mathematical model and rigorous analysis for incentivizing the adoption of commercial EVs. We propose an intervention policy for the social planner (e.g., the government) to promote commercial EV adoption. The intervention policy includes a subsidy for EV purchase (i.e., expenditure) and a carbon tax for gas emissions (i.e., income). We propose provably fast algorithms for the social planner to find the optimal budget-balanced intervention policy. We analyze in detail the effect of the intervention policy on commercial EV adoption, and prove that the proposed intervention policy achieves higher commercial EV adoption rates. | 10.1109/GlobalSIP.2015.7418247 | Conference | IEEE Global Conference on Signal and Information Processing (GlobalSIP) | ||||
2015/01/01 00:00 | Personalized Grade Prediction: A Data Mining Approach | Y. Meier, J. Xu, O. Atan, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7373410 | To increase efficacy in traditional classroom courses as well as in Massive Open Online Courses (MOOCs), automated systems supporting the instructor are needed. One important problem is to automatically detect students that are going to do poorly in a course early enough to be able to take remedial actions. This paper proposes an algorithm that predicts the final grade of each student in a class. It issues a prediction for each student individually, when the expected accuracy of the prediction is sufficient. The algorithm learns online what is the optimal prediction and time to issue a prediction based on past history of students' performance in a course. We demonstrate the performance of our algorithm on a dataset based on the performance of approximately 700 undergraduate students who have taken an introductory digital signal processing course over the past 7 years. Using data obtained from a pilot course, our methodology suggests that it is effective to perform early in-class assessments such as quizzes, which result in timely performance prediction for each student, thereby enabling timely interventions by the instructor (at the student or class level) when necessary. | 10.1109/ICDM.2015.54 | Conference | IEEE International Conference on Data Mining (ICDM) | | | |
2015/01/01 00:00 | RELEAF: An Algorithm for Learning and Exploiting Relevance | C. Tekin, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7039192 | Recommender systems, medical diagnosis, network security, etc., require on-going learning and decision-making in real time. These - and many others - represent perfect examples of the opportunities and difficulties presented by Big Data: the available information often arrives from a variety of sources and has diverse features so that learning from all the sources may be valuable but integrating what is learned is subject to the curse of dimensionality. This paper develops and analyzes algorithms that allow efficient learning and decision-making while avoiding the curse of dimensionality. We formalize the information available to the learner/decision-maker at a particular time as a context vector which the learner should consider when taking actions. In general the context vector is very high dimensional, but in many settings, the most relevant information is embedded into only a few relevant dimensions. If these relevant dimensions were known in advance, the problem would be simple - but they are not. Moreover, the relevant dimensions may be different for different actions. Our algorithm learns the relevant dimensions for each action, and makes decisions based on what it has learned. Formally, we build on the structure of a contextual multi-armed bandit by adding and exploiting a relevance relation. We prove a general regret bound for our algorithm whose time order depends only on the maximum number of relevant dimensions among all the actions, which in the special case where the relevance relation is single-valued (a function), reduces to Õ(T^(2(√2-1))); in the absence of a relevance relation, the best known contextual bandit algorithms achieve regret Õ(T^((D+1)/(D+2))), where D is the full dimension of the context vector. Our algorithm alternates between exploring and exploiting and does not require observing outcomes during exploitation (so allows for active learning). Moreover, during exploitation, suboptimal actions are chosen with arbitrarily low probability. Our algorithm is tested on datasets arising from network security and online news article recommendations. | 10.1109/JSTSP.2015.2402646 | Journal | IEEE Journal of Selected Topics in Signal Processing | Multi-armed bandits | | |
2015/01/01 00:00 | Self-organizing Networks of Information Gathering Cognitive Agents | A. M. Alaa, K. Ahuja, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7323822 | In many scenarios, networks emerge endogenously as cognitive agents establish links in order to exchange information. Network formation has been widely studied in economics, but only on the basis of simplistic models that assume that the value of each additional piece of information is constant. In this paper, we present a first model and associated analysis for network formation under the much more realistic assumption that the value of each additional piece of information depends on the type of that piece of information and on the information already possessed: information may be complementary or redundant. We model the formation of a network as a noncooperative game in which the actions are the formation of links and the benefit of forming a link is the value of the information exchanged minus the cost of forming the link. We characterize the topologies of the networks emerging at a Nash equilibrium (NE) of this game and compare the efficiency of equilibrium networks with the efficiency of centrally designed networks. To quantify the impact of information redundancy and linking cost on social information loss we provide estimates for the price of anarchy (PoA), and to quantify the impact on individual information loss we introduce and provide estimates for a measure we call maximum information loss (MIL). Finally, we consider the setting in which agents are not endowed with information, but must produce it. We show that the validity of the well-known “law of the few” depends on how information aggregates, in particular, the “law of the few” fails when information displays complementarities. | 10.1109/TCCN.2015.2499284 | Journal | IEEE Transactions on Cognitive Communications and Networking | Communications and Networks, Networks | |||
2015/01/01 00:00 | Silence is Gold: Strategic Interference Mitigation Using Tokens in Heterogeneous Small Cell Networks | C. Shen, J. Xu, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7070657 | Electronic tokens have been successfully used as incentive mechanisms to stimulate self-interested network nodes to relay other nodes' traffic. In other words, tokens are paid to buy transmission (relaying) services. In this work, we propose a novel distributed token exchange framework, which can be used in heterogeneous small cell networks to successfully mitigate interference among the self-interested users. Contrary to the traditional role of buying transmission, tokens are exchanged between users to buy silence. Heterogeneity poses unique challenges for interference mitigation, which are difficult to handle with previous solutions but can be effectively tackled with the proposed token design. This paper focuses on the rigorous design of the optimal token scheme that minimizes the system outage probability. We first analyze the optimal strategies of individual users, which only consider their own utility maximization and do not care about the system-wise performance. We prove that under some mild conditions the optimal strategy has a simple threshold structure. We then analytically derive the optimal token supply that minimizes the network outage probability. Analysis shows that even if each user adopts the optimal strategy that only maximizes its own utility, a careful token system design can lead to a significant overall network performance improvement. Simulation results show that not only does the proposed token system design greatly improve the network outage probability, it also improves the overall network QoS, particularly when the deployment density is high. | 10.1109/JSAC.2015.2417012 | Journal | IEEE Journal on Selected Areas in Communications | Communications and Networks | |||
2015/01/01 00:00 | Socially-Optimal Design of Service Exchange Platforms with Imperfect Monitoring | Y. Xiao, M. van der Schaar | 2015 | https://dl.acm.org/doi/10.1145/2785627 | We study the design of service exchange platforms in which long-lived anonymous users exchange services with each other. The users are randomly and repeatedly matched into pairs of clients and servers, and each server can choose to provide high-quality or low-quality services to the client with whom it is matched. Since the users are anonymous and incur high costs (e.g., exert high effort) in providing high-quality services, it is crucial that the platform incentivizes users to provide high-quality services. Rating mechanisms have been shown to work effectively as incentive schemes in such platforms. A rating mechanism labels each user by a rating, which summarizes the user's past behaviors, recommends a desirable behavior to each server (e.g., provide higher-quality services for clients with higher ratings), and updates each server's rating based on the recommendation and its client's report on the service quality. Based on this recommendation, a low-rating user is less likely to obtain high-quality services, thereby providing users with incentives to obtain high ratings by providing high-quality services. However, if monitoring or reporting is imperfect—clients do not perfectly assess the quality or the reports are lost—a user's rating may not be updated correctly. In the presence of such errors, existing rating mechanisms cannot achieve the social optimum. In this article, we propose the first rating mechanism that does achieve the social optimum, even in the presence of monitoring or reporting errors. On one hand, the socially-optimal rating mechanism needs to be complicated enough, because the optimal recommended behavior depends not only on the current rating distribution, but also (necessarily) on the history of past rating distributions in the platform. On the other hand, we prove that the social optimum can be achieved by “simple” rating mechanisms that use binary rating labels and a small set of (three) recommended behaviors. We provide design guidelines of socially-optimal rating mechanisms and a low-complexity online algorithm for the rating mechanism to determine the optimal recommended behavior. | 10.1145/2785627 | Journal | ACM Transactions on Economics and Computation | Communications and Networks, Networks | |||
2015/01/01 00:00 | The Population Dynamics of Websites | K. Ahuja, S. Zhang, M. van der Schaar | 2015 | https://dl.acm.org/doi/10.1145/2847220.2847237 | Websites derive revenue by advertising or charging fees for services and so their profit depends on their user base -- the number of users visiting the website. But how should websites control their user base? This paper is the first to address and answer this question. It builds a model in which, starting from an initial user base, the website controls the growth of the population by choosing the intensity of referrals and targeted ads to potential users. A larger population provides more profit to the website, but building a larger population through referrals and targeted ads is costly; the optimal policy must therefore balance the marginal benefit of adding users against the marginal cost of referrals and targeted ads. The nature of the optimal policy depends on a number of factors. Most obvious is the initial user base; websites starting with a small initial population should offer many referrals and targeted ads at the beginning, but then decrease referrals and targeted ads over time. Less obvious factors are the type of website and the typical length of time users remain on the site: the optimal policy for a website that generates most of its revenue from a core group of users who remain on the site for a long time -- e.g., mobile and online gaming sites -- should be more aggressive and protective of its user base than that of a website whose revenue is more uniformly distributed across users who remain on the site only briefly. When arrivals and exits are stochastic, the optimal policy is more aggressive -- offering more referrals and targeted ads. | 10.1145/2847220.2847237 | Conference | ACM Electronic Commerce (EC) Workshop on the Economics of Networks, Systems and Computation (NetEcon) | ||||
2015/01/01 00:00 | Timely Event Detection by Networked Learners | L. Canzian, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7004055 | We consider a set of distributed learners that are interconnected via an exogenously-determined network. The learners observe different data streams that are related to common events of interest, which need to be detected in a timely manner. Each learner is equipped with a set of local classifiers, which generate local predictions about the common event based on the locally observed data streams. In this work, we address the following key questions: (1) Can the learners improve their detection accuracy by exchanging and aggregating information? (2) Can the learners improve the timeliness of their detections by forming clusters, i.e., by collecting information only from surrounding learners? (3) Given a specific tradeoff between detection accuracy and detection delay, is it desirable to aggregate a large amount of information, or is it better to focus on the most recent and relevant information? To address these questions, we propose a cooperative online learning scheme in which each learner maintains a set of weight vectors (one for each possible cluster), selects a cluster and the corresponding weight vector, generates a local prediction, disseminates it through the network, and combines all the received local predictions from the learners belonging to the selected cluster by using a weighted majority rule. The optimal cluster and weight vector that a learner should adopt depend on the specific network topology, on the location of the learner in the network, and on the characteristics of the data streams. To learn such optimal values, we propose a general online learning rule that exploits only the feedbacks that the learners receive. We determine an upper bound for the worst-case mis-detection probability and for the worst-case prediction delay of our scheme in the realizable case. Numerical simulations show that the proposed scheme is able to successfully adapt to the unknown characteristics of the data streams and can achieve substantial performance gains with respect to a scheme in which the learners act individually or a scheme in which the learners always aggregate all available local predictions. We numerically evaluate the impact that different network topologies have on the final performance. Finally, we discuss several surprising existing trade-offs. | 10.1109/TSP.2015.2389765 | Journal | IEEE Transactions on Signal Processing | Communications and Networks | |||
2015/01/01 00:00 | Timely video popularity forecasting based on social networks | J. Xu, H. Li, J. Liu, M. van der Schaar | 2015 | https://ieeexplore.ieee.org/document/7218618 | This paper presents Pop-Forecast, a systematic method for accurately forecasting the popularity of videos promoted through social networks. Pop-Forecast aims to optimize the forecasting accuracy and the timeliness with which forecasts are issued, by explicitly taking into account the dynamic propagation of videos in social networks. The forecasting is performed online and requires no training phase or a priori knowledge. We analytically bound the performance loss of Pop-Forecast as compared to that obtained by an omniscient oracle and prove that the bound is sublinear in the number of video arrivals, thereby guaranteeing its fast rate of convergence as well as its asymptotic convergence to the optimal performance. We validate the performance of Pop-Forecast through extensive experiments using real-world data traces collected from the videos shared in RenRen, one of the largest online social networks in China. These experiments show that our proposed method outperforms existing approaches for popularity prediction (which do not take into account the propagation in social network) by more than 30% in terms of prediction rewards. | 10.1109/INFOCOM.2015.7218618 | Conference | IEEE International Conference on Computer Communications (INFOCOM) | ||||
2015/01/01 00:00 | To Send or Not To Send - Learning MAC Contention | S. Amuru, Y. Xiao, M. van der Schaar, M. Buehrer | 2015 | https://ieeexplore.ieee.org/document/7417224 | The exponential back-off mechanism, proposed for reducing MAC-layer contention in the 802.11 standard, is sub-optimal in terms of the network throughput. This back-off mechanism and its improved variants are especially inefficient under unknown dynamics such as packet arrivals and user entry/exit. In this paper, we formulate the problem of optimizing this back-off mechanism as a Markov decision process, and propose online learning algorithms to learn the optimal back-off schemes under unknown dynamics. By exploiting the fact that some components of the system dynamics (such as protocol states) are known because the users follow the common 802.11 protocol, we propose a post-decision state (PDS)-based learning algorithm to speed up the learning process. Compared to traditional Q-learning algorithms, the advantages of the proposed online learning algorithm are that 1) it exploits partial information about the system so that less information needs to be learned in comparison to other learning algorithms, and 2) it removes the necessity for action exploration which usually impedes the learning process of conventional learning algorithms (such as Q-Learning). We prove the optimality of the proposed PDS-based learning algorithm and via numerical results demonstrate the improvement over existing protocols and Q-learning in terms of throughput and convergence speed. We first address this problem from a single-user perspective and later describe the challenges involved and present new insights into the multi-user learning scenarios, especially in cases where the MDP models of the users are coupled with each other. | 10.1109/GLOCOM.2015.7417224 | Conference | IEEE Global Communications Conference (GLOBECOM) | | | |
2015/01/01 00:00 | BitMiner: Bits Mining in Internet Traffic Classification | Z. Yuan, Y. Xue, M. van der Schaar | 2015 | http://www.sigcomm.org/node/3789 | Traditionally, signatures used for traffic classification are constructed at the byte-level. However, as more and more data-transfer formats of network protocols and applications are encoded at the bit-level, byte-level signatures are losing their effectiveness in traffic classification. In this poster, we creatively construct bit-level signatures by associating the bit-values with their bit-positions in each traffic flow. Furthermore, we present BitMiner, an automated traffic mining tool that can mine application signatures at the most fine-grained bit-level granularity. Our preliminary test on popular peer-to-peer (P2P) applications, e.g. Skype, Google Hangouts, PPTV, eMule, Xunlei and QQDownload, reveals that although they all have no byte-level signatures, there are significant bit-level signatures hidden in their traffic. | 10.1145/2829988.2789997 | Conference | ACM Conference on Special Interest Group on Data Communication (SIGCOMM) | ||||
2014/12/08 00:00 | Discovering, Learning and Exploiting Relevance | C. Tekin, M. van der Schaar | 2014 | https://papers.nips.cc/paper/2014/hash/99bcfcd754a98ce89cb86f73acc04645-Abstract.html | In this paper we consider the problem of learning online what is the information to consider when making sequential decisions. We formalize this as a contextual multi-armed bandit problem where a high-dimensional (D-dimensional) context vector arrives to a learner which needs to select an action to maximize its expected reward at each time step. Each dimension of the context vector is called a type. We assume that there exists an unknown relation between actions and types, called the relevance relation, such that the reward of an action only depends on the contexts of the relevant types. When the relation is a function, i.e., the reward of an action only depends on the context of a single type, and the expected reward of an action is Lipschitz continuous in the context of its relevant type, we propose an algorithm that achieves Õ(T^γ) regret with a high probability, where γ = 2/(1 + √2). Our algorithm achieves this by learning the unknown relevance relation, whereas prior contextual bandit algorithms that do not exploit the existence of a relevance relation will have Õ(T^((D+1)/(D+2))) regret. Our algorithm alternates between exploring and exploiting, it does not require reward observations in exploitations, and it guarantees with a high probability that actions with suboptimality greater than ϵ are never selected in exploitations. Our proposed method can be applied to a variety of learning applications including medical diagnosis, recommender systems, popularity prediction from social networks, network security etc., where at each instance of time vast amounts of different types of information are available to the decision maker, but the effect of an action depends only on a single type. | Conference | NeurIPS | Feature selection, Multi-armed bandits | | |
2014/01/01 00:00 | A Dynamic Model of Certification and Reputation | M. van der Schaar, S. Zhang | 2014 | https://www.jstor.org/stable/43562997 | Markets typically have many ways of learning about quality, with two of the most important being reputational forces and certification, and these types of learning often interact with and influence each other. This paper is the first to consider markets where learning occurs through these different sources simultaneously, which allows us to investigate the rich interplay and dynamics that can arise. Our work offers four main insights: (1) Without certification, market learning through reputation alone can get "stuck" at inefficient levels and high-quality agents may get forced out of the market. (2) Certification "frees" the reputation of agents, allowing good agents to keep working even after an unfortunate string of bad signals. (3) Certification can be both beneficial and harmful from a social perspective, so a social planner must choose the certification scheme carefully. In particular, the market will tend to demand more certification than socially optimal because the market does not bear the certification costs. (4) Certification and reputational learning can act as complementary forces so that the social welfare produced by certification can be increased by faster information revelation. | 10.1007/s00199-014-0836-9 | Journal | Economic Theory | | | |
2014/01/01 00:00 | A Dynamic Model of Certification and Reputation (1) | M. van der Schaar, S. Zhang | 2014 | https://dl.acm.org/doi/10.1145/2600057.2602870 | Markets typically have many ways of learning about quality, with two of the most important being reputational forces and certification, and these types of learning often interact with and influence each other. This paper is the first to consider markets where learning occurs through these different sources simultaneously, which allows us to demonstrate the rich interplay and dynamics that can arise. Our work offers four main insights: (1) Without certification, market learning through reputation alone can get 'stuck' at inefficient levels and high quality agents may get forced out of the market. (2) Certification 'frees' the reputation of agents, allowing good agents to keep working even after an unfortunate string of bad signals. (3) Certification can be both beneficial and harmful, and so the social planner must choose the certification scheme carefully. In particular, the market will tend to demand more certification than socially optimal because the market does not bear the certification costs. (4) Certification and reputational learning can act as complementary forces so that a more informative reputational mechanism will increase the social welfare generated by certification. | 10.1145/2600057.2602870 | Conference | ACM Conference on Economics and Computation (EC) | ||||
2014/01/01 00:00 | A Network of Cooperative Learners For Data-Driven Stream Mining | L. Canzian, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6854132 | We propose and analyze a distributed learning system to classify data captured from distributed and dynamic data streams. Our scheme consists of multiple distributed learners that are interconnected via an exogenously-determined network. Each learner observes a specific data stream, which is correlated to a common event that needs to be classified, and maintains a set of local classifiers and a weight for each local classifier. We propose a cooperative online learning scheme in which the learners exchange information through the network both to compute an aggregate prediction and to adapt the weights to the dynamic characteristics of the data streams. The information dissemination protocol is designed to minimize the time required to compute the final prediction. We determine an upper bound for the worst-case misclassification probability of our scheme, which depends on the misclassification probability of the best (unknown) static aggregation rule. Importantly, this bound tends to zero if the misclassification probability of the best static aggregation rule tends to zero. When applied to well-known data sets experiencing concept drifts, our scheme exhibits gains ranging from 20% to 70% with respect to state-of-the-art solutions. | 10.1109/ICASSP.2014.6854132 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | | | |
2014/01/01 00:00 | A Unified Online Directed Acyclic Graph Flow Manager for Multicore Schedulers | K. Kanoun, D. Atienza, N. Mastronarde, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6742974 | Numerous Directed-Acyclic Graph (DAG) schedulers have been developed to improve the energy efficiency of various multi-core systems. However, the DAG monitoring modules proposed by these schedulers make a priori assumptions about the workload and relationship between the task dependencies. Thus, schedulers are limited to work on a limited subset of DAG models. To address this problem, we propose a unified online DAG monitoring solution independent from the connected scheduler and able to handle all possible DAG models. Our novel low-complexity solution processes online the DAG of the application and provides relevant information about each task that can be used by any scheduler connected to it. Using H.264/AVC video decoding as an illustrative application and multiple configurations of complex synthetic DAGs, we demonstrate that our solution connected to an external simple energy-efficient scheduler is able to achieve significant improvements in energy-efficiency and deadline miss rates compared to existing approaches. | 10.1109/ASPDAC.2014.6742974 | Conference | Asia and South Pacific Design Automation Conference (ASP-DAC) | ||||
2014/01/01 00:00 | An Experts Learning Approach to Mobile Service Offloading | C. Tekin, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7028516 | Mobile devices are more and more often called on to perform services which require too much computation power and battery energy. If delay is an important consideration, offloading to the cloud may be too slow and a better approach is to offload to a resource-rich machine in the proximity of the device. This paper develops a new approach to this problem in which the machines are viewed as a collection of experts - but experts that are coupled in space and in time: the current action at a given machine affects the future state of the given machine and of other machines to which the given machine is connected. At any time, given the state and unknown dynamics of the system, the experts available at that time should cooperatively pick the best available actions. Within this framework, we propose online learning algorithms that result in substantial savings in energy consumption. | 10.1109/ALLERTON.2014.7028516 | Conference | Allerton | | | |
2014/01/01 00:00 | Bandit Framework For Systematic Learning In Wireless Video-Based Face Recognition | O. Atan, Y. Andreopoulos, C. Tekin, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6832509 | Video-based object or face recognition services on mobile devices have recently garnered significant attention, given that video cameras are now ubiquitous in all mobile communication devices. In one of the most typical scenarios for such services, each mobile device captures and transmits video frames over wireless to a remote computing cluster (a.k.a. “cloud” computing infrastructure) that performs the heavy-duty video feature extraction and recognition tasks for a large number of mobile devices. A major challenge of such scenarios stems from the highly varying contention levels in the wireless transmission, as well as the variation in the task-scheduling congestion in the cloud. In order for each device to adapt the transmission, feature extraction and search parameters and maximize its object or face recognition rate under such contention and congestion variability, we propose a systematic learning framework based on multi-user multi-armed bandits. The performance loss under two instantiations of the proposed framework is characterized by the derivation of upper bounds for the achievable short-term and long-term loss in the expected recognition rate per face recognition attempt against the “oracle” solution that assumes a-priori knowledge of the system performance under every possible setting. Unlike well-known reinforcement learning techniques that exhibit very slow convergence when operating in highly-dynamic environments, the proposed bandit-based systematic learning quickly approaches the optimal transmission and cloud resource allocation policies based on feedback on the experienced dynamics (contention and congestion levels). To validate our approach, time-constrained simulation results are presented via: (i) contention-based H.264/AVC video streaming over IEEE 802.11 WLANs and (ii) principal-component based face recognition algorithms running under varying congestion levels of a cloud-computing infrastructure. Against state-of-the-art reinforcement learning methods, our framework is shown to provide 17.8% ~ 44.5% reduction of the number of video frames that must be processed by the cloud for recognition and 11.5% ~ 36.5% reduction in the video traffic over the WLAN. | 10.1109/JSTSP.2014.2330799 | Journal | IEEE Journal of Selected Topics in Signal Processing | Reinforcement learning, Multi-armed bandits | Communications and Networks | ||
2014/01/01 00:00 | Bandit Framework for Systematic Learning in Wireless Video-Based Face Recognition | O. Atan, Y. Andreopoulos, C. Tekin, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6853687 | In most video-based object or face recognition services on mobile devices, each device captures and transmits video frames over wireless to a remote computing service (a.k.a. “cloud”) that performs the heavy-duty video feature extraction and recognition tasks for a large number of mobile devices. The major challenges of such scenarios stem from the highly-varying contention levels in the wireless local area network (WLAN), as well as the variation in the task-scheduling congestion in the cloud. In order for each device to maximize its object or face recognition rate under such contention and congestion variability, we propose a systematic learning framework based on multi-armed bandits. Unlike well-known reinforcement learning techniques that exhibit very slow convergence rates when operating in highly-dynamic environments, the proposed bandit-based systematic learning quickly approaches the optimal transmission and processing-complexity policies based on feedback on the experienced dynamics (contention and congestion levels). Comparisons against state-of-the-art reinforcement learning methods demonstrate that this makes our proposal especially suitable for the highly-dynamic levels of wireless contention and cloud scheduling congestion. | 10.1109/ICASSP.2014.6853687 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | Reinforcement learning, Multi-armed bandits | Communications and Networks | ||
2014/01/01 00:00 | Clustering Based Online Learning in Recommender Systems: A Bandit Approach | L. Song, C. Tekin, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6854459 | A big challenge for the design and implementation of large-scale online services is determining what items to recommend to their users. For instance, Netflix makes movie recommendations; Amazon makes product recommendations; and Yahoo! makes webpage recommendations. In these systems, items are recommended based on the characteristics and circumstances of the users, which are provided to the recommender as contexts (e.g., search history, time, and location). The task of building an efficient recommender system is challenging due to the fact that both the item space and the context space are very large. Existing works either focus on a large item space without contexts, a large context space with a small number of items, or they jointly consider the space of items and contexts together to solve the online recommendation problem. In contrast, we develop an algorithm that performs exploration and exploitation in the context space and the item space separately, combining clustering of the items with information aggregation in the context space. Basically, given a user's context, our algorithm aggregates its past history over a ball centered on the user's context, whose radius decreases at a rate that allows sufficiently accurate estimates of the payoffs such that the recommended payoffs converge to the true (unknown) payoffs. Theoretical results show that our algorithm achieves a learning regret that is sublinear in time, where the regret is the payoff difference between the oracle optimal benchmark, in which the preferences of users for certain items in certain contexts are known, and our algorithm, in which the information is incomplete. Numerical results show that our algorithm significantly outperforms (by over 48%) the existing algorithms in terms of regret. | 10.1109/ICASSP.2014.6854459 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | Reinforcement learning | | |
2014/01/01 00:00 | Coalitional Games with Intervention: Application to Spectrum Leasing in Cognitive Radio | J. Alcaraz, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6846374 | We consider a spectrum leasing system in which secondary networks offer offload services to a primary network (PN) in exchange for temporary access to the PN's spectrum. When the coverage areas of several secondary access nodes (SANs) overlap, they compete for primary users (PUs), which benefits the PN, except when the SANs collude and coordinate their prices, forming a cartel. As a result, the PN obtains lower transmission rates for the serviced PUs. Our coalitional game analysis shows that stable cartels always exist and can form easily. To protect the spectrum owner's interests and enforce market regulation, we propose an intervention framework in which an intervention manager counteracts cartel formation. The specific features that make wireless systems different from conventional markets enable the manager to modify the set of achievable outcomes. The intervention capability is limited; thus, the objective is to design an intervention rule maximizing the PN transmission rate within the given constraints. Importantly, the intervention can solely act as a threat or a warning that does not need to be executed in practice. To reduce the computational effort, we also propose a low-complexity intervention rule that performs similarly to the optimal one in terms of assurable PN rate increment and outperforms other effective approaches. | 10.1109/TWC.2014.2333512 | Journal | IEEE Transactions on Wireless Communications | Communications and Networks | |||
2014/01/01 00:00 | Collective Ratings for Online Communities with Strategic Users | Y. Zhang, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6805221 | Despite the success of emerging online communities, they face a serious practical challenge: the participating agents are strategic, and incentive mechanisms are needed to compel such agents to provide high-quality services. Traditional mechanisms based on pricing and direct reciprocity schemes are not effective in providing incentives in such communities due to their unique features: large number of agents able to perform diverse services, imperfect monitoring of agents' service quality, etc. To compel agents to provide high-quality services, we develop a novel game-theoretic framework for providing incentives using rating-based pricing schemes. In our framework, the service-providing agents are not rated individually; instead, they are divided into separate groups based on their expertise, location, etc., and are rated collectively, as a group. A collective rating is updated for each group based on the quality of service provided by all the agents appertaining to the group. Depending on whether a group of agents collectively contributes a sufficiently high level of services or not, the agents in the group are rewarded or punished through increased or decreased collective rating, which will lead to higher or lower payments they receive in the future. We systematically analyze how the group size and the rating scheme affect the community designer's revenue as well as the social welfare of the agents and, based on this analysis, we design optimal rating protocols and show that these protocols can significantly improve the social welfare of the community as compared to a variety of alternative incentive mechanisms. | 10.1109/TSP.2014.2320457 | Journal | IEEE Transactions on Signal Processing | Multi-agent learning | Communications and Networks, Game Theory and Applications | |
2014/01/01 00:00 | Context-Adaptive Big Data Stream Mining | C. Tekin, L. Canzian, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7028494 | Emerging stream mining applications require classification of large data streams generated by single or multiple heterogeneous sources. Different classifiers can be used to produce predictions. However, in many practical scenarios the distribution over data and labels (and hence the accuracies of the classifiers) may be unknown a priori and may change in unpredictable ways over time. We consider data streams that are characterized by their context information which can be used as meta-data to choose which classifier should be used to make a specific prediction. Since the context information can be high dimensional, learning the best classifiers to make predictions using contexts suffers from the curse of dimensionality. In this paper, we propose a context-adaptive learning algorithm which learns online what is the best context, learner, and classifier to use to process a data stream. Then, we theoretically bound the regret of the proposed algorithm and show that its time order is independent of the dimension of the context space. Our numerical results illustrate that our algorithm outperforms most prior online learning algorithms, for which such online performance bounds have not been proven. | 10.1109/ALLERTON.2014.7028494 | Conference | Allerton | ||||
2014/01/01 00:00 | Context-Aware Online Spatiotemporal Traffic Prediction | J. Xu, D. Deng, U. Demiryurek, C. Shahabi, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7022576 | With the availability of traffic sensor data, various techniques have been proposed to make congestion prediction by utilizing those datasets. One key challenge in predicting traffic congestion is how much to rely on the historical data vs. the real-time data. To better utilize both the historical and real-time data, in this paper we propose a novel online framework that could learn the current situation from the real-time data and predict the future using the most effective predictor in this situation from a set of predictors that are trained using historical data. In particular, the proposed framework uses a set of base predictors (e.g. a support vector machine or a Bayes classifier) and learns in real-time the most effective one to use in different contexts (e.g. time, location, weather condition). As real-time traffic data arrives, the context space is adaptively partitioned in order to efficiently estimate the effectiveness of each predictor in different contexts. We obtain and prove both short-term and long-term performance guarantees (bounds) for our online algorithm. Our experiments with real-world data in real-life conditions show that the proposed approach significantly outperforms existing solutions. | 10.1109/ICDMW.2014.102 | Conference | IEEE International Conference on Data Mining Workshop | | | |
2014/01/01 00:00 | Context-Driven Online Learning for Activity Classification in Wireless Health | J. Xu, J. Y. Xu, L. Song, G. Pottie, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7037171 | Enabling accurate and low-cost classification of a range of motion activities is of significant importance for wireless health through body-worn inertial sensors and smartphones, due to the need by healthcare and fitness professionals to monitor exercises for quality and compliance. This paper proposes a novel contextual multi-armed bandits approach for large-scale activity classification. The proposed method is able to address the unique challenges arising from scaling, lack of training data and adaptation by melding context augmentation and continuous online learning into traditional activity classification. We rigorously characterize the performance of the proposed learning algorithm and prove that the learning regret (i.e. reward loss) is sublinear in time, thereby ensuring fast convergence to the optimal reward as well as providing short-term performance guarantees. Our experiments show that the proposed algorithm outperforms existing algorithms in terms of both providing higher classification accuracy as well as lower energy consumption. | 10.1109/GLOCOM.2014.7037171 | Conference | IEEE Global Communications Conference (GLOBECOM) | Multi-armed bandits | | |
2014/01/01 00:00 | Cooperative Multi-Agent Learning and Coordination for Cognitive Radio Networks | W. R. Zame, J. Xu, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6683132 | The radio spectrum is a scarce resource. Cognitive radio stretches this resource by enabling secondary stations to operate in portions of the spectrum that are reserved for primary stations but not currently used by the primary stations. As it is whenever stations share resources, coordination is a central issue in cognitive radio networks: absent coordination, there may be collision, congestion or interference, with concomitant loss of performance. Cognitive radio networks require coordination of secondary stations with primary stations (so that secondary stations should not interfere with primary stations) and of secondary stations with each other. Coordination in this setting is especially challenging because of the various types of sensing errors. This paper proposes novel protocols that enable secondary stations to learn and teach with the goal of coordinating to achieve a round-robin Time Division Multiple Access (TDMA) schedule. These protocols are completely distributed (requiring neither central control nor the exchange of any control messages), fast (with speeds exceeding those of existing protocols), efficient (in terms of throughput and delay) and scalable. The protocols proposed rely on cooperative learning, exploiting the ability of stations to learn from and condition on their own histories while simultaneously teaching other stations about these histories. Analytic results and simulations illustrate the power of these protocols. | 10.1109/JSAC.2014.140308 | Journal | IEEE Journal on Selected Areas in Communications | Multi-agent learning | Communications and Networks, Game Theory and Applications | ||
2014/01/01 00:00 | Data-driven Stream Mining Systems for Computer Vision | S. Bhattacharyya, M. van der Schaar, O. Atan, C. Tekin, K. Sudusinghe | 2014 | https://link.springer.com/chapter/10.1007%2F978-3-319-09387-1_12 | In this chapter, we discuss the state of the art and future challenges in adaptive stream mining systems for computer vision. Adaptive stream mining in this context involves the extraction of knowledge from image and video streams in real-time, and from sources that are possibly distributed and heterogeneous. With advances in sensor and digital processing technologies, we are able to deploy networks involving large numbers of cameras that acquire increasing volumes of image data for diverse applications in monitoring and surveillance. However, to exploit the potential of such extensive networks for image acquisition, important challenges must be addressed in efficient communication and analysis of such data under constraints on power consumption, communication bandwidth, and end-to-end latency. We discuss these challenges in this chapter, and we also discuss important directions for research in addressing such challenges using dynamic, data-driven methodologies. | 10.1007/978-3-319-09387-1_12 | Chapter | Advances in Embedded Computer Vision | ||||
2014/01/01 00:00 | Decentralized Foresighted Energy Purchase and Procurement With Renewable Generation and Energy Storage | Y. Xiao, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7040270 | We study a power system with one independent system operator (ISO) who procures energy from energy generators, and decentralized aggregators who purchase energy from the ISO to serve their customers. With the penetration of renewable energy generation, the aggregators are adopting energy storage to deal with the high volatility in supply and prices. In the presence of energy storage, it is beneficial for all the entities (i.e. the ISO and aggregators) to make foresighted decisions (i.e. energy procurement decisions by the ISO and energy purchase decisions by the aggregators) to minimize their long-term costs. However, the optimal foresighted decision making is complicated mainly because the information required to make optimal decisions is decentralized among the entities. We propose a design framework in which the ISO provides each aggregator with a conjectured future price, and each aggregator distributively minimizes its own long-term cost based on its conjectured price as well as its local information. The proposed framework can achieve the social optimum despite the decentralized information among the entities. Simulation results demonstrate significant reduction in the total cost by the proposed foresighted energy purchase and procurement (EPP), compared to the optimal myopic EPP (up to 60% reduction), and the foresighted EPP based on the Lyapunov optimization framework (up to 30% reduction). | 10.1109/CDC.2014.7040270 | Conference | IEEE Conference on Decision and Control (CDC) | ||||
2014/01/01 00:00 | Demand Side Management in Smart Grids using a Repeated Game Framework | L. Song, Y. Xiao, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6840289 | Demand-side management (DSM) is a key solution for reducing the peak-time power consumption in smart grids. To provide incentives for consumers to shift their consumption to off-peak times, the utility company charges consumers differential prices for using power at different times of the day. Consumers take into account these differential prices when deciding when and how much power to consume daily. Importantly, while consumers enjoy lower billing costs when shifting their power usage to off-peak times, they also incur discomfort costs due to the altering of their power consumption patterns. Existing works propose stationary strategies for the myopic consumers to minimize their short-term billing and discomfort costs. In contrast, we model the interaction emerging among self-interested and foresighted consumers as a repeated energy scheduling game and prove that the stationary strategies are suboptimal in terms of long-term total billing and discomfort costs. Subsequently, we propose a novel framework for determining optimal nonstationary DSM strategies, in which consumers can choose different daily power consumption patterns depending on their preferences, routines, and needs. As a direct consequence of the nonstationary DSM policy, different subsets of consumers are allowed to use power in peak times at a low price. The subset of consumers that are selected daily to have their joint discomfort and billing costs minimized is determined based on the consumers' power consumption preferences as well as on the past history of which consumers have shifted their usage previously. Importantly, we show that the proposed strategies are incentive compatible. Simulations confirm that, given the same peak-to-average ratio, the proposed strategy can reduce the total cost (billing and discomfort costs) by up to 50% compared to existing DSM strategies. | 10.1109/JSAC.2014.2332119 | Journal | IEEE Journal on Selected Areas in Communications | Multi-agent learning | | |
2014/01/01 00:00 | Distributed Online Learning in Social Recommender Systems | C. Tekin, S. Zhang, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6709807 | In this paper, we consider decentralized sequential decision making in distributed online recommender systems, where items are recommended to users based on their search query as well as their specific background including history of bought items, gender and age, all of which comprise the context information of the user. In contrast to centralized recommender systems, in which there is a single centralized seller who has access to the complete inventory of items as well as the complete record of sales and user information, in decentralized recommender systems each seller/learner only has access to the inventory of items and user information for its own products and not the products and user information of other sellers, but can get commission if it sells an item of another seller. Therefore, the sellers must distributedly find out for an incoming user which items to recommend (from the set of own items or items of another seller), in order to maximize the revenue from own sales and commissions. We formulate this problem as a cooperative contextual bandit problem, analytically bound the performance of the sellers compared to the best recommendation strategy given the complete realization of user arrivals and the inventory of items, as well as the context-dependent purchase probabilities of each item, and verify our results via numerical examples on a distributed data set adapted based on Amazon data. We evaluate the dependence of the performance of a seller on the inventory of items the seller has, the number of connections it has with the other sellers, and the commissions which the seller gets by selling items of other sellers to its users. | 10.1109/JSTSP.2014.2299517 | Journal | IEEE Journal of Selected Topics in Signal Processing | Multi-agent learning | Communications and Networks | ||
2014/01/01 00:00 | Dynamic Incentive Design for Participation in Direct Load Scheduling Programs | M. Alizadeh, Y. Xiao, A. Scaglione, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6876120 | Interruptible Load (IL) programs have long been an accepted measure to intelligently and reliably shed demand in case of contingencies in the power grid. However, the emerging market for Electric Vehicles (EV) and the notion of providing non-emergency ancillary services through the demand side have sparked new interest in designing direct load scheduling programs that manage the consumption of appliances on a day-to-day basis. In this paper, we define a mechanism for a Load Serving Entity (LSE) to strategically compensate customers that allow the LSE to directly schedule their consumption, every time they want to use an eligible appliance. We study how the LSE can compute such incentives by forecasting its profits from shifting the load of recruited appliances to hours when electricity is cheap, or by providing ancillary services, such as regulation and load following. To make the problem scalable and tractable we use a novel clustering approach to describe appliance load and laxity. In our model, customers choose to participate in this program strategically, in response to incentives posted by the LSE in publicly available menus. Since 1) appliances have different levels of demand flexibility; and 2) demand flexibility has a time-varying value to the LSE due to changing wholesale prices, we allow the incentives to vary dynamically with time and appliance cluster. We study the economic effects of the implementation of such program on a population of EVs, using real-world data for vehicle arrival and charge patterns. | 10.1109/JSTSP.2014.2347003 | Journal | IEEE Journal of Selected Topics in Signal Processing | ||||
2014/01/01 00:00 | Dynamic Network Formation with Incomplete Information | Y. Song, M. van der Schaar | 2015 | https://link.springer.com/article/10.1007/s00199-015-0858-y | How do networks form and what is their ultimate topology? Most of the literature that addresses these questions assumes complete information: agents know in advance the value of linking even with agents they have never met and with whom they have had no previous interaction (direct or indirect). This paper addresses the same questions under the much more natural assumption of incomplete information: agents do not know in advance—but must learn—the value of linking. We show that incomplete information has profound implications for the formation process and the ultimate topology. Under complete information, the network topologies that form and are stable typically consist of agents of relatively high value only. Under incomplete information, a much wider collection of network topologies can emerge and be stable. Moreover, even with the same topology, the locations of agents can be very different: An agent can achieve a central position purely as the result of chance rather than as the result of merit. All of this can occur even in settings where agents eventually learn everything so that information, although initially incomplete, eventually becomes complete. The ultimate network topology depends significantly on the formation history, which is natural and true in practice, and incomplete information makes this phenomenon more prevalent. | Conference | Southwest Economic Theory Conference | Multi-agent learning | Communications and Networks, Game Theory and Applications, Networks | |||
2014/01/01 00:00 | Dynamic Pricing for Smart Grid with Reinforcement Learning | B. Kim, Y. Zhang, M. van der Schaar, J. Lee | 2014 | https://ieeexplore.ieee.org/document/6849306 | In the smart grid system, dynamic pricing can be an efficient tool for the service provider which enables efficient and automated management of the grid. However, in practice, the lack of information about the customers' time-varying load demand and energy consumption patterns and the volatility of electricity price in the wholesale market make the implementation of dynamic pricing highly challenging. In this paper, we study a dynamic pricing problem in the smart grid system where the service provider decides the electricity price in the retail market. In order to overcome the challenges in implementing dynamic pricing, we develop a reinforcement learning algorithm. To resolve the drawbacks of the conventional reinforcement learning algorithm such as high computational complexity and low convergence speed, we propose an approximate state definition and adopt virtual experience. Numerical results show that the proposed reinforcement learning algorithm can effectively work without a priori information of the system dynamics. | 10.1109/INFCOMW.2014.6849306 | Conference | IEEE International Conference on Computer Communications (INFOCOM) Workshop on Communications and Control for Smart Energy Systems | Reinforcement learning | |||
2014/01/01 00:00 | Dynamic Scheduling for Energy Minimization in Delay-Sensitive Stream Mining | S. Ren, N. Deligiannis, Y. Andreopoulos, M. Islam, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6877705 | Numerous stream mining applications, such as visual detection, online patient monitoring, and video search and retrieval, are emerging on both mobile and high-performance computing systems. These applications are subject to responsiveness (i.e., delay) constraints for user interactivity and, at the same time, must be optimized for energy efficiency. The increasingly heterogeneous power-versus-performance profile of modern hardware presents new opportunities for energy saving as well as challenges. For example, employing low-performance processing nodes can save energy but may violate delay requirements, whereas employing high-performance processing nodes can deliver a fast response but may unnecessarily waste energy. Existing scheduling algorithms balance energy versus delay assuming constant processing and power requirements throughout the execution of a stream mining task and without exploiting hardware heterogeneity. In this paper, we propose a novel framework for dynamic scheduling for energy minimization (DSE) that leverages this emerging hardware heterogeneity. By optimally determining the processing speeds for hardware executing classifiers, DSE minimizes the average energy consumption while satisfying an average delay constraint. To assess the performance of DSE, we build a face detection application based on the Viola-Jones classifier chain and conduct experimental studies via heterogeneous processor system emulation. The results show that, under the same delay requirement, DSE reduces the average energy consumption by up to 50% in comparison to conventional scheduling that does not exploit hardware heterogeneity. We also demonstrate that DSE is robust against processing node switching overhead and model inaccuracy. | 10.1109/TSP.2014.2347260 | Journal | IEEE Transactions on Signal Processing | Communications and Networks | |||
2014/01/01 00:00 | Energy-efficient Nonstationary Spectrum Sharing | Y. Xiao, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6730890 | We develop a novel design framework for energy-efficient spectrum sharing among autonomous users who aim to minimize their energy consumption subject to minimum throughput requirements. Most existing works propose stationary spectrum sharing policies, in which users transmit at fixed power levels. Since users transmit simultaneously under stationary policies, to fulfill minimum throughput requirements, they need to transmit at high power levels to overcome interference. To improve energy efficiency, we construct nonstationary spectrum sharing policies, in which the users transmit at time-varying power levels. Specifically, we focus on TDMA (time-division multiple access) policies in which one user transmits at each time (but not in a round-robin fashion). The proposed policy can be implemented by each user running a low-complexity algorithm in a decentralized manner. It achieves high energy efficiency even when the users have erroneous and binary feedback about their interference levels. Moreover, it can adapt to dynamic entry and exit of users. The proposed policy is also deviation-proof, namely autonomous users will find it in their self-interest to follow it. Compared to existing policies, the proposed policy can achieve an energy saving of up to 90% under a large number of users. | 10.1109/TCOMM.2014.012614.130953 | Journal | IEEE Transactions on Communications | Communications and Networks | |
2014/01/01 00:00 | Ensemble Online Clustering through Decentralized Observations | D. Katselis, C. L. Beck, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7039497 | We investigate the problem of online learning for an ensemble of agents clustering incoming data, i.e., the problem of combining online local clustering decisions made by distributed agents to improve knowledge and accuracy of implicit clusters hidden in the incoming data streams. We focus on clustering using the well-known K-means algorithm for numerical data due to its efficiency in clustering large data sets. Nevertheless, our results can be straightforwardly extended to, e.g., the K-modes variant of the K-means algorithm to handle categorical data, as well as to other clustering algorithms. We show that the proposed ensemble online solutions, which are based on a simple majority-voting scheme, converge to the centralized solutions that would be made by a fusion center, that is, the solutions resulting from one agent with access to all information across agents. Given the dimensions of the clustering model, the aforementioned convergence is demonstrated to be achievable for relatively small sizes of the ensemble. | 10.1109/CDC.2014.7039497 | Conference | IEEE Conference on Decision and Control (CDC) | ||||
2014/01/01 00:00 | Forecasting Popularity of Videos using Social Media | J. Xu, M. van der Schaar, J. Liu, H. Li | 2014 | https://ieeexplore.ieee.org/document/6955832 | This paper presents a systematic online prediction method (Social-Forecast) that is capable of accurately forecasting the popularity of videos promoted by social media. Social-Forecast explicitly considers the dynamically changing and evolving propagation patterns of videos in social media when making popularity forecasts, thereby being situation and context aware. Social-Forecast aims to maximize the forecast reward, which is defined as a tradeoff between the popularity prediction accuracy and the timeliness with which a prediction is issued. The forecasting is performed online and requires no training phase or a priori knowledge. We analytically bound the prediction performance loss of Social-Forecast as compared to that obtained by an omniscient oracle and prove that the bound is sublinear in the number of video arrivals, thereby guaranteeing its short-term performance as well as its asymptotic convergence to the optimal performance. In addition, we conduct extensive experiments using real-world data traces collected from the videos shared in RenRen, one of the largest online social networks in China. These experiments show that our proposed method outperforms existing view-based approaches for popularity prediction (which are not context-aware) by more than 30% in terms of prediction rewards. | 10.1109/JSTSP.2014.2370942 | Journal | IEEE Journal of Selected Topics in Signal Processing | Multi-agent learning | Communications and Networks | |
2014/01/01 00:00 | Incentivizing Information Sharing in Networks | J. Xu, Y. Song, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6854648 | For many networks (e.g. opinion consensus, cooperative estimation, distributed learning and adaptation) to proliferate and operate efficiently, the participating agents need to collaborate with each other by repeatedly sharing information, which is often costly and brings no direct immediate benefit to the agents. In this paper, we develop a systematic framework for designing distributed rating protocols aimed at incentivizing the strategic agents to collaborate with each other by sharing information. The proposed incentive protocols exploit the ongoing nature of the agents' interactions to assign ratings and, through them, determine future rewards and punishments through social reciprocation. Unlike existing rating protocols, the proposed protocol operates in a distributed manner, and takes into consideration the underlying interconnectivity of agents as well as their heterogeneity. We prove that in many deployment scenarios, adopting the proposed rating protocols achieves full efficiency (i.e. the price of anarchy is one) even with strategic agents. | 10.1109/ICASSP.2014.6854648 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | | | |
2014/01/01 00:00 | Intervention Framework for Counteracting Collusion in Spectrum Leasing Systems | J. Alcaraz, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6855021 | We consider a spectrum leasing system in which secondary networks offer offload services to a primary network (PN) in exchange for temporary access to the PN's spectrum. When the secondary networks collude and coordinate their prices, forming a cartel, the PN experiences cartel overcharge, which in our scenario implies lower transmission rates for the serviced primary users. To protect the spectrum owner's interests and possibly enforce market regulation, we propose an intervention framework in which an intervention device or manager (possibly with the authorization and/or supervision of an external regulatory agency) counteracts cartel formation. This framework exploits the specific features that make wireless systems different from conventional markets, enabling the manager to modify the set of achievable outcomes. The intervention capability is limited, so the objective is to design an intervention rule which maximizes the PN transmission rate within the given constraints. | 10.1109/ICASSP.2014.6855021 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | | | |
2014/01/01 00:00 | Low Power and Scalable Many-Core Architecture for Big-Data Stream Computing | K. Kanoun, M. Ruggiero, D. Atienza, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6903408 | In recent years, the process of examining large amounts of different types of data, or Big-Data, in an effort to uncover hidden patterns or unknown correlations has become a major need in our society. In this context, stream mining applications are now widely used in several domains such as financial analysis, video annotation, surveillance, medical services, traffic prediction, etc. In order to cope with the Big-Data stream input and its high variability, modern stream mining applications implement systems with heterogeneous classifiers and adapt online to variations in the characteristics of the input data stream. Moreover, unlike existing architectures for video processing and compression applications, where the processing units are reconfigurable in terms of parameters and possibly even functions as the input data is changing, in Big-Data stream mining applications the complete computing pipeline is changing, as entirely new classifiers and processing functions are invoked depending on the input stream. As a result, new approaches to reconfigurable hardware platform architectures are needed to handle Big-Data streams. However, hardware solutions that have been proposed so far for stream mining applications either target high performance computing without any power consideration (thus limiting their applicability in small-scale computing infrastructures or current embedded systems), or they are simply dedicated to a specific learning algorithm (i.e., limited to run with a single type of classifier). Therefore, in this paper we propose a novel low-power many-core architecture for stream mining applications that is able to cope with the dynamic data-driven nature of stream mining applications while consuming limited power. Our exploration indicates that this newly proposed architecture is able to adapt to different classifier complexities thanks to its multiple scalable vector processing units and their re-configurability feature at run-time. Moreover, our platform architecture includes a memory hierarchy optimized for Big-Data streaming and implements modern fine-grained power management techniques over all the different types of cores, thereby allowing minimum energy consumption for each type of executed classifier. | 10.1109/ISVLSI.2014.77 | Conference | IEEE Computer Society Annual Symposium on VLSI (ISVLSI) | | | |
2014/01/01 00:00 | Model Based Design Environment for Data-Driven Embedded Signal Processing Systems | K. Sudusinghe, I. Cho, M. van der Schaar, S. Bhattacharyya | 2014 | https://www.sciencedirect.com/science/article/pii/S1877050914002841 | In this paper, we investigate new design methods for data-driven digital signal processing (DSP) systems that are targeted to resource- and energy-constrained embedded environments, such as UAVs, mobile communication platforms and wireless sensor networks. Signal processing applications, such as keyword matching, speaker identification, and face recognition, are of great importance in such environments. Due to critical application constraints on energy consumption, real-time performance, computational resources, and core application accuracy, the design spaces for such applications are highly complex. Thus, conventional static methods for configuring and executing such embedded DSP systems are severely limited in the degree to which processing tasks can adapt to current operating conditions and mission requirements. We address this limitation by developing a novel design framework for multi-mode, data-driven signal processing systems, where different application modes with complementary trade-offs are selected, configured, executed, and switched dynamically, in a data-driven manner. We demonstrate the utility of our proposed new design methods on an energy-constrained, multi-mode face detection application. | 10.1016/j.procs.2014.05.107 | Conference | International Conference on Computational Science (ICCS) | | | |
2014/01/01 00:00 | Network Dynamics with Incomplete Information and Learning | J. Xu, S. Zhang, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7028586 | We analyze networks that feature reputational learning: how links are initially formed by agents under incomplete information, how agents learn about their neighbors through these links, and how links may ultimately become broken. We show that the type of information agents have access to, and the speed at which agents learn about each other, can have tremendous repercussions for the network evolution and the overall network social welfare. Specifically, faster learning can often be harmful for networks as a whole if agents are myopic, because agents fail to fully internalize the benefits of experimentation and break off links too quickly. As a result, preventing two agents from linking with each other can be socially beneficial, even if the two agents are initially believed to be of high quality. This is due to the fact that having fewer connections slows the rate of learning about these agents, which can be socially beneficial. Another method of solving the informational problem is to impose costs for breaking links, in order to incentivize agents to experiment more carefully. | 10.1109/ALLERTON.2014.7028586 | Conference | Allerton | ||||
2014/01/01 00:00 | Non-stationary Demand Side Management Method for Smart Grids | L. Song, Y. Xiao, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6855110 | Demand side management (DSM) is a key solution for reducing the peak-time power consumption in smart grids. The consumers choose their power consumption patterns according to different prices charged at different times of the day. Importantly, consumers incur discomfort costs from altering their power consumption patterns. Existing works propose stationary strategies for consumers that myopically minimize their short-term billing and discomfort costs. In contrast, we model the interaction emerging among self-interested consumers as a repeated energy scheduling game in which consumers foresightedly minimize their long-term total costs. We then propose a novel methodology for determining optimal nonstationary DSM strategies in which consumers can choose different daily power consumption patterns depending on their preferences and routines, as well as on their past history of actions. We prove that the existing stationary strategies are suboptimal in terms of long-term total billing and discomfort costs and that the proposed strategies are optimal and incentive-compatible (strategy-proof). Simulations confirm that, given the same peak-to-average ratio, the proposed strategy can reduce the total cost (billing and discomfort costs) by up to 50% compared to existing DSM strategies. | 10.1109/ICASSP.2014.6855110 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | | | |
2014/01/01 00:00 | Non-stationary Resource Allocation Policies for Delay-constrained Video Streaming: Application to Video over Internet-of-Things-enabled Networks | J. Xu, Y. Andreopoulos, Y. Xiao, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6774597 | Due to the high bandwidth requirements and stringent delay constraints of multi-user wireless video transmission applications, ensuring that all video senders have sufficient transmission opportunities to use before their delay deadlines expire is a longstanding research problem. We propose a novel solution that addresses this problem without assuming detailed packet-level knowledge, which is unavailable at resource allocation time (i.e. prior to the actual compression and transmission). Instead, we translate the transmission delay deadlines of each sender's video packets into a monotonically-decreasing weight distribution within the considered time horizon. Higher weights are assigned to the slots that have higher probability for deadline-abiding delivery. Given the sets of weights of the senders' video streams, we propose the low-complexity Delay-Aware Resource Allocation (DARA) approach to compute the optimal slot allocation policy that maximizes the deadline-abiding delivery of all senders. A unique characteristic of the DARA approach is that it yields a non-stationary slot allocation policy that depends on the allocation of previous slots. This is in contrast with all existing slot allocation policies such as round-robin or rate-adaptive round-robin policies, which are stationary because the allocation of the current slot does not depend on the allocation of previous slots. We prove that the DARA approach is optimal for weight distributions that are exponentially decreasing in time. We further implement our framework for real-time video streaming in wireless personal area networks that are gaining significant traction within the new Internet-of-Things (IoT) paradigm. For multiple surveillance videos encoded with H.264/AVC and streamed via the 6tisch framework that simulates the IoT-oriented IEEE 802.15.4e TSCH medium access control, our solution is shown to be the only one that ensures all video bitstreams are delivered with acceptable quality in a deadline-abiding manner. | 10.1109/JSAC.2014.140410 | Journal | IEEE Journal on Selected Areas in Communications | Communications and Networks | |||
2014/01/01 00:00 | Online Energy-Efficient Task-Graph Scheduling for Multicore Platforms | K. Kanoun, N. Mastronarde, D. Atienza, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6856301 | Numerous directed acyclic graph (DAG) schedulers have been developed to improve the energy efficiency of various multicore platforms. However, these schedulers make a priori assumptions about the relationship between the task dependencies, and they are unable to adapt online to the characteristics of each application without offline profiling data. Therefore, we propose a novel energy-efficient online scheduling solution for the general DAG model to address the two aforementioned problems. Our proposed scheduler is able to adapt at run-time to the characteristics of each application by making smart foresighted decisions, which take into account the impact of current scheduling decisions on the present and future deadline miss rates and energy efficiency. Moreover, our scheduler is able to efficiently handle execution with very limited resources by avoiding scheduling tasks that are expected to miss their deadlines and do not have an impact on future deadlines. We validate our approach against state-of-the-art solutions. In our first set of experiments, our results with the H.264 video decoder demonstrate that the proposed low-complexity solution for the general DAG model reduces the energy consumption by up to 15% compared to an existing sophisticated and complex scheduler that was specifically built for the H.264 video decoder application. In our second set of experiments, our results with different configurations of synthetic DAGs demonstrate that our proposed solution is able to reduce the energy consumption by up to 55% and the deadline miss rates by up to 99% compared to a second existing scheduling solution. Finally, we show that our DAG flow manager and scheduler have low complexities on a real mobile platform and we show that our solution is resilient to workload prediction errors by using different estimator accuracies. | 10.1109/TCAD.2014.2316094 | Journal | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | ||||
2014/01/01 00:00 | Online Learning in Large-scale Contextual Recommender Systems | L. Song, C. Tekin, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6940318 | In this paper, we propose a novel large-scale, context-aware recommender system that provides accurate recommendations, scalability to a large number of diverse users and items, differential services, and does not suffer from “cold start” problems. Our proposed recommendation system relies on a novel algorithm which learns online the item preferences of users based on their click behavior, and constructs online item-cluster trees. The recommendations are then made by choosing an item-cluster level and then selecting an item within that cluster as a recommendation for the user. This approach is able to significantly improve the learning speed when the number of users and items is large, while still providing high recommendation accuracy. Each time a user arrives at the website, the system makes a recommendation based on the estimations of item payoffs by exploiting past context arrivals in a neighborhood of the current user's context. It exploits the similarity of contexts to learn how to make better recommendations even when the number and diversity of users and items is large. This also addresses the cold start problem by using the information gained from similar users and items to make recommendations for new users and items. We theoretically prove that the proposed algorithm for item recommendations converges to the optimal item recommendations in the long-run. We also bound the probability of making a suboptimal item recommendation for each user arriving to the system while the system is learning. Experimental results show that our approach outperforms the state-of-the-art algorithms by over 20 percent in terms of click through rates. | 10.1109/TSC.2014.2365795 | Journal | IEEE Transactions on Services Computing | ||||
2014/01/01 00:00 | Optimal Foresighted Packet Scheduling and Resource Allocation for Multi-user Video Transmission in 4G Cellular Networks | Y. Xiao, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6853688 | We study joint resource allocation and packet scheduling for multi-user video transmission in a 4G cellular network, where the base station (BS) allocates resources (i.e. bandwidth) among the users and each user schedules its video packets based on the allocated resources. Most existing works either propose myopic solutions for multi-user video transmission, in which the resource allocation and packet scheduling is designed to maximize the short-term video quality, or propose foresighted packet scheduling solutions for single-user video transmission which maximize the long-term video quality. In this work, we propose foresighted resource allocation and packet scheduling solutions for multi-user video transmission. Specifically, we develop a low-complexity algorithm in which the BS updates the prices of resources for each user and the users make individual packet scheduling decisions based on the prices. The algorithm can be implemented by the BS and the users in a decentralized manner, and converges to the optimal prices under which the users' optimal decisions maximize the long-term total video quality subject to per-user minimum video quality guarantees. Simulation results show 7 dB and 3 dB improvements in PSNR (Peak Signal-to-Noise Ratio) over myopic solutions and existing foresighted solutions, respectively. | 10.1109/ICASSP.2014.6853688 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | ||||
2014/01/01 00:00 | Rating and Matching in Peer Review Systems | Y. Xiao, F. Dörfler, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7028435 | Peer review (e.g., review of research papers) is essential for the success of the scientific community. In peer review, the reviewers voluntarily exert costly effort in reviewing papers. Hence, it is important to design mechanisms to elicit high effort from reviewers. Exploiting the fact that the researchers interact with each other repeatedly (e.g., by submitting and reviewing papers over years), we propose a rating and matching mechanism to elicit high effort from reviewers. Our proposed mechanism overcomes two major difficulties, namely adverse selection (i.e., the unidentifiable quality of heterogeneous reviewers) and moral hazard (i.e., the unobservable effort levels from reviewers). Specifically, our proposed mechanism assigns and updates ratings for the researchers, and matches researchers' papers to reviewers with similar ratings. In this way, the mechanism identifies different types of reviewers by their ratings, and incentivizes different reviewers to exert high effort. Focusing on the matching rule, we first provide design guidelines for a general matching rule that leads the system to an equilibrium, where the reviewers' types are identified and their high efforts are elicited. Then we study in detail a baseline matching rule that assigns each researcher's paper to one of the two reviewers with the closest ratings, provide guidelines on how to choose the initial ratings, and analyze equilibrium review quality and equilibrium ratings. Finally, we extend the baseline matching rule to two classes. The first extension provides extra reward and/or punishment by adjusting the probabilities of matching each researcher's paper to its neighbors. The second extension provides extra reward and/or punishment by allowing each researcher's paper to be matched to reviewers other than its neighbors. We prove that it is beneficial (in the sense that the optimal equilibrium review quality is higher) to reward reviewers in the first extension, and to punish reviewers in the second extension, due to the different ways the reward and punishment are carried out. We also prove that our proposed matching rules elicit much higher effort from reviewers, compared to matching rules that mimic the current mechanisms of assigning papers. | 10.1109/ALLERTON.2014.7028435 | Conference | Allerton | | | |
2014/01/01 00:00 | Rating Protocols for Online Communities | Y. Zhang, J. Park, M. van der Schaar | 2014 | https://dl.acm.org/doi/abs/10.1145/2560794 | Sustaining cooperation among self-interested agents is critical for the proliferation of emerging online communities. Providing incentives for cooperation in online communities is particularly challenging because of their unique features: a large population of anonymous agents having asymmetric interests and dynamically joining and leaving the community, operation errors, and agents trying to whitewash when they have a low standing in the community. In this article, we take these features into consideration and propose a framework for designing and analyzing a class of incentive schemes based on rating protocols, which consist of a rating scheme and a recommended strategy. We first define the concept of sustainable rating protocols under which every agent has the incentive to follow the recommended strategy given the deployed rating scheme. We then formulate the problem of designing an optimal rating protocol, which selects the protocol that maximizes the overall social welfare among all sustainable rating protocols. Using the proposed framework, we study the structure of optimal rating protocols and explore the impact of one-sided rating, punishment lengths, and whitewashing on optimal rating protocols. Our results show that optimal rating protocols are capable of sustaining cooperation, with the amount of cooperation varying depending on the community characteristics. | 10.1145/2560794 | Journal | ACM Transactions on Economics and Computation | Multi-agent learning | Communications and Networks, Game Theory and Applications | ||
2014/01/01 00:00 | Robust Additively Coupled Games in the Presence of Bounded Uncertainty in Communication Networks | S. Parsaeefard, A. R. Sharafat, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6619406 | We propose a novel scheme for robust communications in scenarios where channel gains, interference levels, or other measured values are uncertain or erroneously measured due to channel variations, delayed feedback, and users' mobility. When the exact values of such measurements are known, it has been shown in the literature that multiuser wireless interactions can be modeled as additively coupled games (ACGs) in which users converge to a unique Nash equilibrium by following a distributed best-response algorithm. However, in practice, such measurements are uncertain or erroneous, and hence, it is important to analyze how these uncertainties and errors affect the performance of the users playing ACGs. Most importantly, novel adjustment schemes are needed to ensure that the utility of each user is preserved under such uncertainties, i.e., introduce robustness against uncertainties and errors in ACGs. We utilize the worst case robust optimization techniques to analyze the impact of uncertainties on the users' performance and to build robust ACGs (RACGs). We derive sufficient conditions for the existence and uniqueness of their robust equilibrium and compare the outcome of an RACG and an ACG at their respective equilibria in terms of both utilities and the actions taken by the users. To reach the RACG's equilibrium, we propose a novel distributed best-response algorithm and derive sufficient conditions for its convergence. Our analytical results are supported by simulations for power control games in interference channels and for flow control in Jackson networks. | 10.1109/TVT.2013.2284344 | Journal | IEEE Transactions on Vehicular Technology | Communications and Networks | |||
2014/01/01 00:00 | Robust Power Control for Heterogeneous Users in Shared Unlicensed Bands | S. Parsaeefard, A. R. Sharafat, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6807571 | We develop a robust formalism for power control games in unlicensed bands between two groups of users competing for the spectrum: informed-users (leaders) who have advanced capabilities to extract side-information about other users and their strategies, and uninformed-users (followers) who can only observe the aggregate interference caused by others. Such nominal leader-follower games have been previously studied in the power control literature; however, these prior works fail to capture an important aspect of such interactions: the side-information and observations made by users may be uncertain, which has an important impact on users' strategies and network performance. Thus, in this paper we propose a new, robust game-theoretic formalism and solution which takes these uncertainties into account. Specifically, each group chooses its actions by solving its respective worst-case robust optimization problems. We show how various types of uncertainties affect the social utility of each group, and identify in which deployment scenarios the social utility of the robust game is higher than that of the nominal game. Importantly, we show that robust solutions in such games are more energy efficient. Finally, our theoretical formalism, analysis and solutions are complemented by simulations. | 10.1109/TWC.2014.042314.130981 | Journal | IEEE Transactions on Wireless Communications | Communications and Networks | |||
2014/01/01 00:00 | Sharing in Networks of Strategic Agents | J. Xu, Y. Song, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6787069 | In social, economic and engineering networks, connected agents need to cooperate by repeatedly sharing information and/or goods. Typically, sharing is costly and there are no immediate benefits for agents who share. Hence, agents who strategically aim to maximize their own individual utilities will “free-ride” because they lack incentives to cooperate/share, thereby leading to inefficient operation or even collapse of networks. To incentivize the strategic agents to cooperate with each other, we design distributed rating protocols which exploit the ongoing nature of the agents' interactions to assign ratings and through them, determine future rewards and punishments: agents that have behaved as directed enjoy high ratings-and hence greater future access to the information/goods of others; agents that have not behaved as directed enjoy low ratings-and hence less future access to the information/goods of others. Unlike existing rating protocols, the proposed protocol operates in a distributed manner and takes into consideration the underlying interconnectivity of agents as well as their heterogeneity. We prove that in many networks, the price of anarchy (PoA) obtained by adopting the proposed rating protocols is 1, that is, the optimal social welfare is attained. In networks where PoA is larger than 1, we show that the proposed rating protocol significantly outperforms existing incentive mechanisms. Last but not least, the proposed rating protocols can also operate efficiently in dynamic networks, where new agents enter the network over time. | 10.1109/JSTSP.2014.2316597 | Journal | IEEE Journal of Selected Topics in Signal Processing | Multi-agent learning | Communications and Networks | ||
2014/01/01 00:00 | Silence is Gold: Strategic Small Cell Interference Management Using Tokens | C. Shen, J. Xu, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7037493 | Electronic tokens have proven to be an effective incentive scheme for stimulating self-interested network nodes to transmit other nodes' traffic. In other words, tokens are paid to buy transmission. In this work, we propose a novel token framework in a distributed small cell network and design the token system for improved interference mitigation. Contrary to the traditional role of tokens for buying transmission, they are exchanged between users to buy "silence". We focus on designing the optimal token system that minimizes the system outage probability. We first analyze the optimal strategies of individual users, which only consider their own utility maximization and do not care about the system-wise performance. We show that, under some mild conditions, the optimal strategy has a simple threshold structure. We then analytically derive the optimal token supply that minimizes the network outage probability. Simulation results show that not only does the proposed token system design greatly reduce the network outage probability (by up to 75%), it also improves the overall small cell network QoS, particularly when the deployment density is high. | 10.1109/GLOCOM.2014.7037493 | Conference | IEEE Global Communications Conference (GLOBECOM) | Communications and Networks | |
2014/01/01 00:00 | Spectrum Sharing For Delay-Sensitive Applications With Continuing QoS Guarantees | Y. Xiao, K. Ahuja, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7036982 | We study a wireless network in which multiple users stream delay-sensitive applications such as video conferencing and video streaming. Existing spectrum sharing policies, which determine when users access the spectrum and at what power levels, are either constant (i.e. users transmit simultaneously, at constant power levels) or weighted round-robin time-division multiple access (TDMA) (i.e. users access the spectrum in turn, one at a time). Due to multi-user interference, constant policies have low spectrum efficiency. We show that round-robin policies are inefficient for delay-sensitive applications because the various "positions" (i.e. transmission opportunities) in a cycle are not created equal: earlier transmission opportunities are more desirable since they enable users to transmit with lower delays. Specifically, we show that (weighted) round-robin TDMA policies cannot simultaneously achieve high network performance and low transmission delays. This problem is exacerbated when the number of users is large. We propose a novel framework for designing optimal TDMA spectrum sharing policies for delay-sensitive applications, which can guarantee their continuing QoS (CQoS), i.e. the desired throughput (and the resulting transmission delay) starting from every moment in time is guaranteed for each user. We prove that the fulfillment of CQoS guarantees provides strict upper bounds on the transmission delays incurred by the users. We construct the optimal TDMA policy that maximizes the desired network performance (e.g. max-min fairness or social welfare) subject to the users' CQoS guarantees. The key feature of the proposed policy is that it is not cyclic as in (weighted) round-robin policies. Instead, it adaptively determines which user should transmit next, based on the users' remaining amounts of transmission opportunities needed to achieve the desired performance. We also propose a low-complexity algorithm, which is run by each user in a distributed manner, to construct the optimal policy. Simulation results demonstrate that our proposed policy significantly outperforms the optimal constant policy and round-robin policies by up to 6 dB and 4 dB in peak signal-to-noise ratio (PSNR) for video streaming. | 10.1109/GLOCOM.2014.7036982 | Conference | IEEE Global Communications Conference (GLOBECOM) | ||||
2014/01/01 00:00 | Structure-aware Stochastic Load Management in Smart Grids | Y. Zhang, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6848212 | Load management based on dynamic pricing has been advocated as a key approach for demand-side management in smart grids. By appropriately pricing energy, economic incentives are given to consumers to shift their usage away from peak hours, thereby limiting the amount of energy that needs to be produced. However, traditional pricing-based load management methods usually rely on the assumption that the statistics of the system dynamics (e.g. the time-varying electricity price, the arrival distribution of consumers' load demands) are known a priori, which is not true in practice. In this paper, we propose a novel price-dependent load scheduling algorithm which, unlike previous works, can operate optimally in systems where such statistical knowledge is unknown. We consider a power grid system where each consumer is equipped with an energy storage device that has the capability of storing electrical energy during peak hours. Specifically, we allow each consumer to proactively determine the amount of energy to purchase from the utility companies (or energy producers) while taking into consideration that its load demand and the electricity price dynamically vary over time in an a priori unknown manner. We first assume that all the dynamics are known and formulate the real-time load scheduling as a Markov decision process and systematically unravel the structural properties exhibited by the resulting optimal load scheduling policy. By utilizing these structural properties, we then prove that our proposed load scheduling algorithm can learn the system dynamics in an online manner and converge to the optimal solution. A distinctive feature of our algorithm is that it actively exploits partial information about the system dynamics so that less information needs to be learned than when using conventional reinforcement learning methods, which significantly improves the adaptation speed and the runtime performance. Our simulation results demonstrate that the proposed load scheduling algorithm improves efficiency by more than 30% compared to existing state-of-the-art online learning algorithms. | 10.1109/INFOCOM.2014.6848212 | Conference | IEEE International Conference on Computer Communications (INFOCOM) | | | |
2014/01/01 00:00 | Structure-Aware Stochastic Storage Management In Smart Grids | Y. Zhang, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6874494 | Demand-side management has been proposed as an important solution for improving the energy consumption efficiency in smart grids. However, traditional pricing-based demand-side management methods usually rely on the assumption that the statistics of the system dynamics (e.g., the time-varying electricity price, the arrival distribution of consumers' demanded load) are known a priori, which does not hold in practice. In this paper, we propose a novel price-aware energy storage management algorithm for consumption scheduling which, unlike previous works, can operate optimally in systems where such statistical knowledge is unknown. We consider a power grid system where each consumer is equipped with an electrical energy storage device. Each consumer proactively determines how much energy to purchase from the energy producers by taking into consideration the time-varying and a priori unknown system dynamics, in order to maximize its own energy consumption utility. We first formulate the real-time energy storage management and demand response of the consumers as a Markov decision process and then propose an online learning algorithm that enables the consumers to learn the unknown system dynamics on-the-fly and have their energy storage management policies converge to the optimum. Our simulation results validate the efficacy of our algorithm, which helps consumers achieve higher average utility as opposed to other state-of-the-art online learning algorithms and energy storage management algorithms. | 10.1109/JSTSP.2014.2346477 | Journal | IEEE Journal of Selected Topics in Signal Processing | Multi-agent learning | Game Theory and Applications | ||
2014/01/01 00:00 | Technology Choices and Pricing Policies in Public and Private Wireless Networks | Y. Xiao, W. R. Zame, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/6930770 | This paper studies the provision of a wireless network by a monopolistic provider who may be either benevolent (seeking to maximize social welfare, namely the sum utility of all the users) or selfish (seeking to maximize provider profit). The paper addresses the following questions: Under what circumstances is it feasible for a provider, either benevolent or selfish, to operate a network in such a way as to cover costs? How is the optimal behavior of a benevolent provider different from the optimal behavior of a selfish provider? And, most importantly, how does the medium access control (MAC) technology influence the answers to these questions? To address these questions, we build a general model, and provide analysis and simulations for simplified but typical scenarios; the focus in these scenarios is on the contrast between the outcomes obtained under carrier-sensing multiple access (CSMA) and outcomes obtained under time-division multiple access (TDMA). Simulation results demonstrate that differences in MAC technology can have a significant effect on social welfare, on provider profit, and even on the (financial) feasibility of a wireless network. | 10.1109/TWC.2014.2364123 | Journal | IEEE Transactions on Wireless Communications | Communications and Networks, Game Theory and Applications | |||
2014/01/01 00:00 | To Tax or To Subsidize: The Economics of User-Generated Content Platforms | S. Ren, M. van der Schaar | 2014 | https://onlinelibrary.wiley.com/doi/10.1002/9781118899250.ch13 | As user-generated content platforms are becoming an integral part of our lives, both Internet users and platform owners are eager to see the continuing growth of such platforms, which are nonetheless largely hindered by a number of obstacles, notably low-quality content and a lack of revenue sources. Typically, users can view the content for free while they post content on user-generated content platforms voluntarily. This chapter provides a new model for user-generated content platforms, in order to provide a formal analysis of when subsidizing or taxing content producers is profit-maximizing for the intermediary. The model is a three-stage game that is played by the intermediary, content producers, and content viewers. The chapter focuses on a class of payment schemes in which the intermediary subsidizes or taxes content producers on a per-content-view basis, while it provides the service for free to content viewers. | 10.1002/9781118899250.ch13 | Chapter | Smart Data Pricing | | | |
2014/01/01 00:00 | Towards a Theory of Societal Co-Evolution: Individualism versus Collectivism | K. Ahuja, S. Zhang, M. van der Schaar | 2014 | https://ieeexplore.ieee.org/document/7032223 | Substantial empirical research has shown that the level of individualism vs. collectivism is one of the most critical and important determinants of societal traits, such as economic growth, economic institutions and health conditions. But the exact nature of this impact has thus far not been well understood in an analytical setting. In this work, we develop one of the first theoretical models that analytically studies the impact of individualism-collectivism on the society. We model the growth of an individual's welfare (wealth, resources and health) as depending not only on himself, but also on the level of collectivism, i.e. the level of dependence on the rest of the individuals in the society, which leads to a co-evolutionary setting. Based on our model, we are able to predict the impact of individualism-collectivism on various societal metrics, such as average welfare, average lifetime, total population, cumulative welfare and average inequality. We analytically show that individualism has a positive impact on average welfare and cumulative welfare, but comes with the drawbacks of lower average life-time, lower total population and higher average inequality. | 10.1109/GlobalSIP.2014.7032223 | Conference | IEEE Global Conference on Signal and Information Processing (GlobalSIP) | ||||
2013/01/01 00:00 | A design methodology for distributed adaptive stream mining systems | S. Won, I. Cho, K. Sudusinghe, J. Xu, Y. Zhang, M. van der Schaar, S. S. Bhattacharyya | 2013 | http://www.sciencedirect.com/science/article/pii/S1877050913005681 | Data-driven, adaptive computations are key to enabling the deployment of accurate and efficient stream mining systems, which invoke suitably configured queries in real-time on streams of input data. Due to the physical separation among data sources and computational resources, it is often necessary to deploy such stream mining systems in a distributed fashion, where local learners have access to disjoint subsets of the data that is to be mined, and forward their intermediate results to an ensemble learner that combines the results from the local learners. In this paper, we develop a design methodology for integrated design, simulation, and implementation of dynamic data-driven adaptive stream mining systems. By systematically integrating considerations associated with local embedded processing, classifier configuration, data-driven adaptation and networked communication, our approach allows for effective assessment, prototyping, and implementation of alternative distributed design methods for data-driven, adaptive stream mining systems. We demonstrate our results on a dynamic data-driven application involving patient health care monitoring. | 10.1016/j.procs.2013.05.425 | Conference | International Conference on Computational Science (ICCS) | | | |
2013/01/01 00:00 | A Fast Online Learning Algorithm for Distributed Mining of BigData | Y. Zhang, D. Sow, D. S. Turaga, M. van der Schaar | 2013 | https://dl.acm.org/doi/abs/10.1145/2627534.2627562 | BigData analytics require that distributed mining of numerous data streams is performed in real-time. Unique challenges associated with designing such distributed mining systems are: online adaptation to incoming data characteristics, online processing of large amounts of heterogeneous data, limited data access and communication capabilities between distributed learners, etc. We propose a general framework for distributed data mining and develop an efficient online learning algorithm based on this. Our framework consists of an ensemble learner and multiple local learners, which can only access different parts of the incoming data. By exploiting the correlations of the learning models among local learners, our proposed learning algorithms can optimize the prediction accuracy while requiring significantly less information exchange and computational complexity than existing state-of-the-art learning solutions. | 10.1145/2627534.2627562 | Conference | ACM SIGMETRICS Big Data Analytics workshop | ||||
2013/01/01 00:00 | A Learning Based Congestion Control for Multimedia Transmission in Wireless Networks | O. Habachi, N. Mastronarde, H. Shiang, M. van der Schaar, Y. Hayel | 2013 | https://ieeexplore.ieee.org/document/6607585 | The intense throughput and stringent delay requirements of Internet multimedia applications have spurred the need for new transport protocols with flexible transmission control. Current TCP congestion control adopts an Additive Increase Multiplicative Decrease (AIMD) algorithm that linearly increases or exponentially decreases the congestion window based on transmission acknowledgements. In this paper, we propose an AIMD-based media-aware congestion control that determines the optimal congestion window updating policy for multimedia transmission. The media-aware congestion control is formulated as a Partially Observable Markov Decision Process (POMDP), which maximizes the long-term expected quality of the received multimedia data. Moreover, we propose a reinforcement learning algorithm in order to estimate the environment and adapt to the source and network variations on the fly. Simulation results show that the proposed approach can significantly improve the received video quality, particularly at high source rates, compared to conventional TCP. | 10.1109/ICME.2013.6607585 | Conference | International Conference on Multimedia and Expo (ICME) | | | |
2013/01/01 00:00 | A Novel Framework for Design and Implementation of Adaptive Stream Mining Systems | K. Sudusinghe, S. Won, M. van der Schaar, S. Bhattacharyya | 2013 | https://ieeexplore.ieee.org/document/6607565 | With the increasing need for accurate mining and classification from multimedia data content, and the growth of such multimedia applications in mobile and distributed architectures, stream mining systems require increasing amounts of flexibility, extensibility, and adaptivity for effective deployment. To address this challenge, we propose a novel approach that rigorously integrates foundations of dataflow modeling for high level signal processing system design, and adaptive stream mining based on dynamic topologies of classifiers. In particular, we introduce a new design environment, called the lightweight dataflow for dynamic data driven application systems (LiD4E) environment. LiD4E provides formal semantics, rooted in dataflow principles, for design and implementation of a broad class of multimedia stream mining topologies. We demonstrate the capabilities of LiD4E using a face detection application that systematically adapts the type of classifier used based on dynamically changing application constraints. | 10.1109/ICME.2013.6607565 | Conference | International Conference on Multimedia and Expo (ICME) | ||||
2013/01/01 00:00 | Bidirectional Energy Trading and Residential Load Scheduling with Electric Vehicles in the Smart Grid | B.-G. Kim, S. Ren, M. van der Schaar, J.-W. Lee | 2013 | https://ieeexplore.ieee.org/document/6547831 | Electric vehicles (EVs) will play an important role in the future smart grid because of their capabilities of storing electrical energy in their batteries during off-peak hours and supplying the stored energy to the power grid during peak hours. In this paper, we consider a power system with an aggregator and multiple customers with EVs and propose novel electricity load scheduling algorithms which, unlike previous works, jointly consider the load scheduling for appliances and the energy trading using EVs. Specifically, we allow customers to determine how much energy to purchase from or to sell to the aggregator while taking into consideration the load demands of their residential appliances and the associated electricity bill. We propose two different approaches: a collaborative and a non-collaborative approach. In the collaborative approach, we develop an optimal distributed load scheduling algorithm that maximizes the social welfare of the power system. In the non-collaborative approach, we model the energy scheduling problem as a non-cooperative game among self-interested customers, where each customer determines its own load scheduling and energy trading to maximize its own profit. In order to resolve the unfairness between heavy and light customers in the non-collaborative approach, we propose a tiered billing scheme that can control the electricity rates for customers according to their different energy consumption levels. In both approaches, we also consider the uncertainty in the load demands, with which customers' actual energy consumption may vary from the scheduled energy consumption. To study the impact of the uncertainty, we use the worst-case-uncertainty approach and develop distributed load scheduling algorithms that provide the guaranteed minimum performances in uncertain environments. Subsequently, we show when energy trading leads to an increase in the social welfare and we determine what are the customers' incentives to participate in the energy trading in various usage scenarios including practical environments with uncertain load demands. | 10.1109/JSAC.2013.130706 | Journal | IEEE Journal on Selected Areas in Communications | ||||
2013/01/01 00:00 | Bidirectional Energy Trading for Residential Load Scheduling and Electric Vehicles | B.-G. Kim, S. Ren, M. van der Schaar, J.-W. Lee | 2013 | https://ieeexplore.ieee.org/document/6566842 | Electric vehicles (EVs) will play an important role in the future smart grid because of their capabilities of storing electrical energy in their batteries during off-peak hours and supplying the stored energy to the power grid during peak hours. In this paper, we consider a power system with an aggregator and multiple customers with EVs and propose a novel electricity load scheduling which, unlike previous works, jointly considers the load scheduling for appliances and the energy trading using EVs. Specifically, we allow customers to determine how much energy to purchase from or to sell to the aggregator while taking into consideration the load demands of their residential appliances and the associated electricity bill. Under the assumption of the collaborative system where the customers agree to maximize the social welfare of the power system, we develop an optimal distributed load scheduling algorithm that maximizes the social welfare. Through numerical results, we show when the energy trading leads to an increase in the social welfare in various usage scenarios. | 10.1109/INFCOM.2013.6566842 | Conference | IEEE International Conference on Computer Communications (INFOCOM) | ||||
2013/01/01 00:00 | Cluster Formation Over Adaptive Networks with Selfish Agents | C.K. Yu, M. van der Schaar, A.H. Sayed | 2013 | https://ieeexplore.ieee.org/document/6811727 | We examine the problem of adaptation and learning over networks with selfish agents. In order to motivate agents to cooperate, we allow the agents to select their partners according to whether they can help them reduce their utility costs. We divide the operation of the network into two stages: a cluster formation stage and an information sharing stage. During cluster formation, agents evaluate a long-term combined cost function and decide on whether to cooperate or not with other agents. During the subsequent information sharing phase, agents share and process information over their sub-networks. Simulations illustrate how the clustering technique enhances the mean-square-error performance of the agents over non-cooperative processing. | Conference | European Signal Processing Conference (EUSIPCO) | |||||
2013/01/01 00:00 | Conjecture-Based Load Balancing for Delay-Sensitive Users Without Message Exchanges | H.P. Shiang, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6508952 | In this paper, we study how multiple users can balance their traffic loads to share common resources in an efficient and distributed manner, without message exchanges. Specifically, we study a deployment scenario where users deploy delay-sensitive applications over a wireless multipath network and aim to minimize their own expected delays. Since the performance of a user's load balancing strategy depends on the strategies that are deployed by other users, it becomes important that a user considers the multiuser coupling when making its own load balancing decisions. We model this multiuser interaction as a load balancing game (LBG) and show that users can converge to an ε-consistent conjectural equilibrium by building near-accurate beliefs about the remaining capacities on each path. Based on these beliefs, users can make load balancing decisions without explicitly knowing the actions of the other users. In such a conjecture-based LBG, we analytically show that, if a leader is elected to build beliefs about how the users' aggregate transmission strategies affect the remaining resources, then this leader can use this knowledge to shape its traffic such that the multiuser interaction can achieve an efficient allocation across paths. Even if no leader is present in the game, as long as the users follow a set of prescribed rules for building beliefs, they can reach efficient outcomes in a distributed manner. Importantly, the proposed distributed load balancing solution can also be applied to other multiuser communication and networking problems where message exchanges are prohibited (or prohibitively expensive in terms of delay or bandwidth), ranging from multichannel selection in wireless networks to relay assignment in multivehicle networks. | 10.1109/TVT.2013.2260188 | Journal | IEEE Transactions on Vehicular Technology | Multi-agent learning, Reinforcement learning | Communications and Networks | |
2013/01/01 00:00 | Distributed Demand Side Management Among Foresighted Decision Makers in Power Networks | Y. Xiao, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6810521 | We consider a power network with an independent system operator (ISO), and geographically distributed aggregators who have energy storage and purchase energy from the ISO to serve their customers. All the entities in the system are foresighted: each aggregator minimizes its own long-term payments for energy purchase and operational costs of energy storage by deciding how much energy to buy from the ISO, and the ISO minimizes the long-term total cost of the network (i.e. energy generation costs and aggregators' costs) by dispatching energy generation among the generators. The decision making of the foresighted entities is complicated because 1) the information required to make optimal decisions is decentralized among the entities, and 2) the coupling (through the prices) among the aggregators is complicated. We propose a design framework in which the ISO provides each aggregator with a conjectured future price, and each aggregator distributively minimizes its own long-term cost based on its conjectured price as well as its local information. The proposed framework can achieve the social optimum despite the decentralized information and complex coupling among the entities. Simulation results demonstrate significant reduction in the total cost by the proposed foresighted demand side management (DSM), compared to the optimal myopic DSM (up to 60% reduction), and the foresighted DSM based on the Lyapunov optimization framework (up to 30% reduction). | 10.1109/ACSSC.2013.6810521 | Conference | Asilomar Conference on Signals, Systems, and Computers | | | |
2013/01/01 00:00 | Distributed Online Big Data Classification Using Context Information | C. Tekin, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6736696 | Distributed, online data mining systems have emerged as a result of applications requiring analysis of large amounts of correlated and high-dimensional data produced by multiple distributed data sources. We propose a distributed online data classification framework where data is gathered by distributed data sources and processed by a heterogeneous set of distributed learners which learn online, at run-time, how to classify the different data streams either by using their locally available classification functions or by helping each other by classifying each other's data. Importantly, since the data is gathered at different locations, sending the data to another learner to process incurs additional costs such as delays, and hence this will only be beneficial if the benefits obtained from a better classification exceed the costs. We model the problem of joint classification by the distributed and heterogeneous learners from multiple data sources as a distributed contextual bandit problem where each data instance is characterized by a specific context. We develop a distributed online learning algorithm for which we can prove sublinear regret. Compared to prior work in distributed online data mining, our work is the first to provide analytic regret results characterizing the performance of the proposed algorithm. | 10.1109/Allerton.2013.6736696 | Conference | Allerton | | | |
2013/01/01 00:00 | Distributed Spectrum Sensing in the Presence of Selfish Users | C. K. Yu, M. van der Schaar, A. H. Sayed | 2013 | https://ieeexplore.ieee.org/document/6714090 | We study the problem of decentralized spectrum sensing in the presence of selfish secondary users. We employ diffusion strategies to guide the estimation process and a reputation mechanism to encourage secondary users to participate in the sharing of information. Simulation results illustrate the performance of the proposed technique for spectrum sensing over cognitive radios. | 10.1109/CAMSAP.2013.6714090 | Conference | IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP) | ||||
2013/01/01 00:00 | Dynamic Scheduling and Pricing in Wireless Cloud Computing | S. Ren, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6512487 | In this paper, we consider a wireless cloud computing system in which the service provider operates a data center and provides cloud services to its subscribers at dynamic prices. We propose a joint optimization of scheduling and pricing decisions for delay-tolerant batch services to maximize the service provider's long-term profit. Unlike the existing research on jointly scheduling and pricing that focuses on static or asymptotic analysis, we focus on a dynamic setting and develop a provably-efficient Dynamic Scheduling and Pricing (Dyn-SP) algorithm which, without the necessity of predicting the future information, can be applied to an arbitrarily random environment that may follow an arbitrary trajectory over time. We prove that, compared to the optimal offline algorithm with future information, Dyn-SP produces a close-to-optimal average profit while bounding the job queue length in the data center. We perform a trace-based simulation study to validate Dyn-SP. In particular, we show both analytically and numerically that a desired tradeoff between the profit and queueing delay can be obtained by appropriately tuning the control parameter. Our results also indicate that, compared to the existing algorithms which neglect demand-side management, cooling system energy consumption, and/or the queue length information, Dyn-SP achieves a higher average profit while incurring (almost) the same average queueing delay. | 10.1109/TMC.2013.57 | Journal | IEEE Transactions on Mobile Computing | Communications and Networks | | |
2013/01/01 00:00 | Efficient Online Exchange via Fiat Money | M. van der Schaar, J. Xu, W. R. Zame | 2013 | https://link.springer.com/article/10.1007%2Fs00199-013-0744-4 | In many online systems, individuals provide services for each other; the recipient of the service obtains a benefit but the provider of the service incurs a cost. If benefit exceeds cost, provision of the service increases social welfare and should therefore be encouraged—but the individuals providing the service gain no (immediate) benefit from providing the service and hence have an incentive to withhold service. Hence, there is scope for designing a protocol that improves welfare by encouraging exchange. To operate successfully within the confines of the online environment, such a protocol should be distributed, robust, and consistent with individual incentives. This paper proposes and analyzes protocols that rely solely on the exchange of fiat money or tokens. The analysis has much in common with work on search models of money but the requirements of the environment also lead to many differences from previous analyses—and some surprises; in particular, existence of equilibrium becomes a thorny problem and the optimal quantity of money is different. | 10.1007/s00199-013-0744-4 | Journal | Economic Theory | Multi-agent learning | Communications and Networks, Game Theory and Applications, Networks | |
2013/01/01 00:00 | Efficient Resource Provisioning and Rate Selection for Stream Mining in a Community Cloud | S. Ren, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6413271 | Real-time stream mining such as surveillance and personal health monitoring, which involves sophisticated mathematical operations, is computation-intensive and prohibitive for mobile devices due to the hardware/computation constraints. To satisfy the growing demand for stream mining in mobile networks, we propose to employ a cloud-based stream mining system in which the mobile devices send via wireless links unclassified media streams to the cloud for classification. We aim at minimizing the classification-energy cost, defined as an affine combination of classification cost and energy consumption at the cloud, subject to an average stream mining delay constraint (which is important in real-time applications). To address the challenge of time-varying wireless channel conditions without a priori information about the channel statistics, we develop an online algorithm in which the cloud operator can dynamically adjust its resource provisioning on the fly and the mobile devices can adapt their transmission rates to the instantaneous channel conditions. It is proved that, at the expense of increasing the average stream mining delay, the online algorithm achieves a classification-energy cost that can be pushed arbitrarily close to the minimum cost achieved by the optimal offline algorithm. Extensive simulations are conducted to validate the analysis. | 10.1109/TMM.2013.2240673 | Journal | IEEE Transactions on Multimedia | ||||
2013/01/01 00:00 | Energy-Efficient Design of Real-Time Stream Mining Systems | S. Ren, C. Lan, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6638327 | In this paper, we propose an efficient solution for supporting real-time stream mining applications on heterogeneous systems operating at various processing speeds. Unlike the existing solutions that (1) rely on accurate knowledge or prediction of the service demand of each individual service request and (2) only consider a single type of delay constraint (e.g., typically, average or maximum delay), we propose an optimal algorithm, MinEnergy-MD, which determines the processing speeds for all classifiers based on the probability distribution of the service demand to minimize the average energy consumption while simultaneously satisfying multiple delay constraints. We conduct an extensive study to quantify the performance of MinEnergy-MD. | 10.1109/ICASSP.2013.6638327 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | ||||
2013/01/01 00:00 | Energy-efficient Nonstationary Power Control in Cognitive Radio Networks | Y. Xiao, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6831536 | Spectrum sharing policies are essential for cognitive radio networks, where primary and secondary users aim to minimize their average energy consumptions subject to minimum throughput requirements. Most existing works proposed stationary spectrum sharing policies, in which users transmit simultaneously at fixed power levels, and need to transmit at high power levels due to multi-user interference. In this paper, we propose nonstationary spectrum sharing policies in which users transmit in a TDMA fashion (but not necessarily in a round-robin manner). Due to the absence of multi-user interference and the ability to let users adaptively switch between transmission and dormancy, our proposed policy greatly improves the spectrum and energy efficiency, and ensures no interference to primary users. Moreover, the proposed policy achieves high energy efficiency even when users have erroneous and binary feedback about their received interference and noise power levels. The proposed policy is also deviation-proof, namely the autonomous users find it in their self-interests to comply with the policy. The proposed policy can be implemented by each user running a low-complexity algorithm in a distributed fashion. Compared to existing policies, the proposed policies can achieve an energy saving of up to 80%. | 10.1109/GLOCOM.2013.6831536 | Conference | IEEE Global Communications Conference (GLOBECOM) | ||||
2013/01/01 00:00 | Entry and Spectrum Sharing Scheme Selection in Femtocell Communications Markets | S. Ren, J. Park, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6205636 | Focusing on a femtocell communications market, we study the entrant network service provider's (NSP's) long-term decision: whether to enter the market and which spectrum sharing technology to select to maximize its profit. This long-term decision is closely related to the entrant's pricing strategy and the users' aggregate demand, which we model as medium-term and short-term decisions, respectively. We consider two markets, one with no incumbent and the other with one incumbent. For both markets, we show the existence and uniqueness of an equilibrium point in the user subscription dynamics and provide a sufficient condition for the convergence of the dynamics. For the market with no incumbent, we derive upper and lower bounds on the optimal price and market share that maximize the entrant's revenue, based on which the entrant selects an available technology to maximize its long-term profit. For the market with one incumbent, we model competition between the two NSPs as a noncooperative game, in which the incumbent and the entrant choose their market shares independently, and provide a sufficient condition that guarantees the existence of at least one pure Nash equilibrium. Finally, we formalize the problem of entry and spectrum-sharing scheme selection for the entrant and provide numerical results to complement our analysis. | 10.1109/TNET.2012.2198073 | Journal | IEEE/ACM Transactions on Networking | Communications and Networks | |||
2013/01/01 00:00 | Finding It Now: Construction and Configuration of Networked Classifiers in Real-Time | R. Ducasse, M. van der Schaar | 2013 | http://link.springer.com/chapter/10.1007/978-1-4614-6859-2_4 | As data is becoming more and more prolific and complex, the ability to process it and extract valuable information has become a critical requirement. However, performing such signal processing tasks requires solving multiple challenges. Indeed, information must frequently be extracted (a) from many distinct data streams, (b) using limited resources, and (c) in real time to be of value. The aim of this chapter is to describe and optimize the specifications of signal processing systems aimed at extracting valuable information in real time from large-scale decentralized datasets. A first section will explain the motivations and stakes which have made stream mining a new and emerging field of research and describe key characteristics and challenges of stream mining applications. We then formalize an analytical framework which will be used to describe and optimize distributed stream mining knowledge extraction from large scale streams. In stream mining applications, classifiers are organized into a connected topology mapped onto a distributed infrastructure. We will study linear chains of classifiers, determine how the ordering of the classifiers in the chain impacts classification accuracy and delay, and determine how to choose the most suitable order of classifiers. Finally, we present a decentralized decision framework upon which distributed algorithms for joint topology construction and local classifier configuration can be constructed. Stream mining is an active field of research, at the crossing of various disciplines, including multimedia signal processing, distributed systems, machine learning, etc. As such, we will indicate several areas for future research and development. | 10.1007/978-1-4614-6859-2_4 | Chapter | Handbook of Signal Processing Systems | | | |
2013/01/01 00:00 | Game Theoretic Design of MAC Protocols: Pricing and Intervention in Slotted-Aloha | L. Canzian, Y. Xiao, M. Zorzi, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6736594 | In many wireless communication networks a common channel is shared by multiple users who must compete to gain access to it. The operation of the network by self-interested and strategic users usually leads to the overuse of the channel resources and to substantial inefficiencies. Hence, incentive schemes are needed to overcome the inefficiencies of non-cooperative equilibrium. In this work we consider a slotted-Aloha random access protocol and two incentive schemes: pricing and intervention. We provide criteria for the designer of the protocol to choose between the two schemes and to design the best policy for the selected scheme, depending on the system parameters. Our results show that intervention can achieve the maximum efficiency in the perfect monitoring scenario. In the imperfect monitoring scenario, instead, there exists a threshold for the number of users such that, for a number of users lower than the threshold, intervention outperforms pricing, whereas, for a number of users higher than the threshold, pricing outperforms intervention. | 10.1109/Allerton.2013.6736594 | Conference | Allerton | | | |
2013/01/01 00:00 | Incentive Design for Direct Load Control Programs | M. Alizadeh, Y. Xiao, A. Scaglione, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6736638 | We study the problem of optimal incentive design for voluntary participation of electricity customers in a Direct Load Scheduling (DLS) program, a new form of Direct Load Control (DLC) based on a three way communication protocol between customers, embedded controls in flexible appliances, and the central entity in charge of the program. Participation decisions are made in real-time on an event-based basis, with every customer that needs to use a flexible appliance considering whether to join the program given current incentives. Customers have different interpretations of the level of risk associated with committing to pass over the control over the consumption schedule of their devices to an operator, and these risk levels are only privately known. The operator maximizes his expected profit of operating the DLS program by posting the right participation incentives for different appliance types, in a publicly available and dynamically updated table. Customers are then faced with the dynamic decision making problem of whether to take the incentives and participate or not. We define an optimization framework to determine the profit-maximizing incentives for the operator. In doing so, we also investigate the utility that the operator expects to gain from recruiting different types of devices. These utilities also provide an upper-bound on the benefits that can be attained from any type of demand response program. | 10.1109/Allerton.2013.6736638 | Conference | Allerton | ||||
2013/01/01 00:00 | Incentive Design for Heterogeneous User-Generated Content Networks | J. Xu, M. van der Schaar | 2013 | https://dl.acm.org/doi/abs/10.1145/2627534.2627545 | This paper designs rating systems aimed at incentivizing users in UGC networks to produce content, thereby significantly improving the social welfare of such networks. We explicitly consider that monitoring users' production activities is imperfect. Such imperfect monitoring will lead to undesired rating drops for users, thereby reducing the social welfare of the network. The network topology constraint and users' heterogeneity further complicate the optimal rating system design problem since users' incentives are complexly coupled. This paper determines optimal recommendation strategies under a variety of monitoring scenarios. Our results suggest that, surprisingly, allowing a certain level of free-riding behavior may lead to higher social welfare than incentivizing all users to produce. | 10.1145/2627534.2627545 | Conference | ACM SIGMETRICS W-Pin+NetEcon workshop | | | |
2013/01/01 00:00 | Incentive Provision and Job Allocation in Social Cloud Systems | Y. Zhang, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6544547 | Social cloud systems, which aggregate the computing capabilities of a large pool of users, have emerged in recent years as a key solution for resource provision and sharing in large-scale online communities due to their inherent flexibility and cost-effectiveness. However, the performance and reliability of these systems depend on the users' cooperative behavior in sharing their computing capabilities. Hence, incentive mechanisms are needed to deter users from free-riding. In this paper, we first model the selfish behavior of the users supplying resources and aiming to maximize their own benefits, and compute the performance of the resulting non-cooperative equilibrium, which is highly inefficient. We then augment the existing job allocation schemes currently implemented in social cloud systems with a novel class of incentive mechanisms based on reputation-based pricing and collective punishment schemes that compel suppliers to change their selfish strategies in a manner that improves the efficiency of the system. We study the cloud system operator's problem of jointly optimizing the incentive mechanism and the job allocation scheme in order to find an optimal social cloud protocol which eliminates the free-riding behavior of suppliers while maximizing the social welfare of the system. We rigorously prove that, using only simple designs for both the incentive mechanism and the job allocation scheme, the resulting protocol provides significant improvements in terms of the social welfare compared to existing social cloud systems. | 10.1109/JSAC.2013.SUP.0513053 | Journal | IEEE Journal on Selected Areas in Communications | Multi-agent learning | Communications and Networks | ||
2013/01/01 00:00 | Intervention with Complete and Incomplete Information: Application to Flow Control | L. Canzian, Y. Xiao, W. R. Zame, M. Zorzi, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6544191 | Most congestion control schemes are based on user cooperation, i.e., they implicitly assume that users are willing to share their private information and to take actions such that the network operates efficiently. However, a self-interested and strategic user might exploit such schemes to obtain an individual gain at the expense of the other users, misrepresenting its private information and overusing the resources. We first quantify the inefficiency of the network in the presence of selfish users for two different scenarios: in the complete information case - in which the users have no private information - and in the incomplete information case - in which the users have private information. Then, we ask whether the congestion control scheme can be designed to be robust to self-interested strategic users. To reach this objective, we use an intervention scheme. For the complete information scenario we describe a scheme that is able to give the users an incentive to optimally use the resources. For the incomplete information scenario we describe two schemes that provide the users with an incentive to report truthfully and to use the resources efficiently, although not always optimally. Illustrative results show that the considered schemes can considerably improve the efficiency of the network. | 10.1109/TCOMM.2013.061013.120559 | Journal | IEEE Transactions on Communications | Communications and Networks, Game Theory and Applications | | |
2013/01/01 00:00 | Intervention with Private Information, Imperfect Monitoring and Costly Communication | L. Canzian, Y. Xiao, W. R. Zame, M. Zorzi, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6544194 | This paper studies the interaction between a designer and a group of strategic and self-interested users who possess information the designer does not have. Because the users are strategic and self-interested, they will act to their own advantage, which will often be different from the interest of the designer, even if the latter is benevolent and seeks to maximize (some measure of) social welfare. In the settings we consider, the designer and the users can communicate (perhaps with noise), the designer can observe the actions of the users (perhaps with error) and the designer can commit to (plans of) actions - interventions - of its own. The designer's problem is to construct and implement a mechanism that provides incentives for the users to communicate and act in such a way as to further the interest of the designer - despite the fact that they are strategic and self-interested and possess private information. To address the designer's problem we propose a general and flexible framework that applies to many scenarios. To illustrate the usefulness of this framework, we discuss some simple examples, leaving further applications to other papers. In an important class of environments, we find conditions under which the designer can obtain its benchmark optimum - the utility that could be obtained if it had all information and could command the actions of the users - and conditions under which it cannot. More broadly we are able to characterize the solution to the designer's problem, even when it does not yield the benchmark optimum. Because the optimal mechanism may be difficult to construct and implement, we also propose a simpler and more readily implemented mechanism that, while falling short of the optimum, still yields the designer a "good" result. | 10.1109/TCOMM.2013.061013.120558 | Journal | IEEE Transactions on Communications | Communications and Networks, Game Theory and Applications | |||
2013/01/01 00:00 | Joint Design of Dynamic Scheduling and Pricing in Wireless Cloud Computing | S. Ren, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6566760 | In this paper, we consider a wireless cloud computing system in which a profit-maximizing wireless service provider provides cloud computing services to its subscribers. In particular, we focus on batch services, which, due to their non-urgent nature, allow more scheduling flexibility than their interactive counterparts. Unlike the existing research that studied separately demand-side management and energy cost saving techniques (both of which are critical to profit maximization), we propose a provably-efficient Dynamic Scheduling and Pricing (Dyn-SP) algorithm which proactively adapts the service demand to workload scheduling in the data center and opportunistically utilizes low electricity prices to process batch jobs for energy cost saving. Without the necessity of predicting future information as assumed by some prior works, Dyn-SP can be applied to an arbitrarily random environment in which the electricity price, available renewable energy supply, and wireless network capacities may evolve over time as arbitrary stochastic processes. It is proved that, compared to the optimal offline algorithm with future information, Dyn-SP can produce a close-to-optimal long-term profit while bounding the job queue length in the data center. We also show both analytically and numerically that a desired tradeoff between the profit and queueing delay can be obtained by appropriately tuning the control parameter. Finally, we perform a simulation study to demonstrate the effectiveness of Dyn-SP. | 10.1109/INFCOM.2013.6566760 | Conference | IEEE International Conference on Computer Communications (INFOCOM) | | | |
2013/01/01 00:00 | Joint Physical-Layer and System-Level Power Management for Delay-Sensitive Wireless Communications | N. Mastronarde, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6148231 | We consider the problem of energy-efficient point-to-point transmission of delay-sensitive data (e.g., multimedia data) over a fading channel. Existing research on this topic utilizes either physical-layer centric solutions, namely power-control and adaptive modulation and coding (AMC), or system-level solutions based on dynamic power management (DPM); however, there is currently no rigorous and unified framework for simultaneously utilizing both physical-layer centric and system-level techniques to achieve the minimum possible energy consumption, under delay constraints, in the presence of stochastic and a priori unknown traffic and channel conditions. In this paper, we propose such a framework. We formulate the stochastic optimization problem as a Markov decision process (MDP) and solve it online using reinforcement learning (RL). The advantages of the proposed online method are that 1) it does not require a priori knowledge of the traffic arrival and channel statistics to determine the jointly optimal power-control, AMC, and DPM policies; 2) it exploits partial information about the system so that less information needs to be learned than when using conventional reinforcement learning algorithms; and 3) it obviates the need for action exploration, which severely limits the adaptation speed and runtime performance of conventional reinforcement learning algorithms. Our results show that the proposed learning algorithms can converge up to two orders of magnitude faster than a state-of-the-art learning algorithm for physical layer power-control and up to three orders of magnitude faster than conventional reinforcement learning algorithms. | 10.1109/TMC.2012.36 | Journal | IEEE Transactions on Mobile Computing | Communications and Networks | |||
2013/01/01 00:00 | Joint Scheduling-Traffic Admission Control: Structural Results and Online Learning Algorithm | K. Phan, T. Le-Ngoc, F. Fu, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6655460 | This work studies the joint scheduling-admission control (SAC) problem over a fading channel. In particular, the optimal trade-off between maximizing the throughput and minimizing the queue size (or average congestion) is investigated. The SAC problem is formulated as a constrained Markov decision process (MDP) to maximize a utility defined as a function of the throughput and the queue size. The structural properties of the optimal policies are subsequently derived. When the statistical knowledge of the traffic arrival and channel processes is not available, we propose an online learning algorithm for the optimal policies. The analysis and algorithm development rely on the reformulation of Bellman's optimality dynamic programming equation using suitably defined value functions, which can be learned using online time-averaging. | 10.1109/ICC.2013.6655460 | Conference | IEEE International Conference on Communications (ICC) Wireless Communications Symposium | | | |
2013/01/01 00:00 | Learning Optimal Classifier Chains for Real-time Big Data Mining | J. Xu, C. Tekin, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6736568 | A plethora of emerging Big Data applications require processing and analyzing streams of data to extract valuable information in real-time. For this, chains of classifiers which can detect various concepts need to be constructed in real-time. In this paper, we propose online distributed algorithms which can learn how to construct the optimal classifier chain in order to maximize the stream mining performance (i.e. mining accuracy minus cost) based on the dynamically-changing data characteristics. The proposed solution does not require the distributed local classifiers to exchange any information when learning at runtime. Moreover, our algorithm requires only limited feedback of the mining performance to enable the learning of the optimal classifier chain. We model the learning problem of the optimal classifier chain at run-time as a multi-player multi-armed bandit problem with limited feedback. To the best of our knowledge, this paper is the first to apply bandit techniques to stream mining problems. However, existing bandit algorithms are inefficient in the considered scenario due to the fact that each component classifier learns its optimal classification functions using only the aggregate overall reward without knowing its own individual reward and without exchanging information with other classifiers. We prove that the proposed algorithms achieve logarithmic learning regret uniformly over time and hence, they are order optimal. Therefore, the long-term time average performance loss tends to zero. We also design learning algorithms whose regret is linear in the number of classification functions. This is much smaller than the regret results which can be obtained using existing bandit algorithms that scale linearly in the number of classifier chains and exponentially in the number of classification functions. | 10.1109/Allerton.2013.6736568 | Conference | Allerton | Multi-armed bandits | | |
2013/01/01 00:00 | Learning Perfect Coordination with Minimal Feedback in Wireless Multi-Access Communications | W. R. Zame, J. Xu, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6831675 | Coordination is a central problem whenever stations (or nodes or users) share resources across a network. In the absence of coordination, there will be collision, congestion or interference, with concomitant loss of performance. This paper proposes new protocols, which we call perfect coordination (PC) protocols, that solve the coordination problem. PC protocols are completely distributed (requiring neither central control nor the exchange of any control messages), fast (with speeds comparable to those of any existing protocols), fully efficient (achieving perfect coordination, with no collisions and no gaps) and require minimal feedback. PC protocols rely heavily on learning, exploiting the possibility to use both actions and silence as messages and the ability of stations to learn from their own histories while simultaneously enabling the learning of other stations. PC protocols can be formulated as finite automata and implemented using currently existing technology (e.g., wireless cards). Simulations show that, in a variety of deployment scenarios, PC protocols outperform existing state-of-the-art protocols - despite requiring much less feedback. | 10.1109/GLOCOM.2013.6831675 | Conference | IEEE Global Communications Conference (GLOBECOM) | Game Theory and Applications | |||
2013/01/01 00:00 | Learning relaying strategies in cellular D2D Networks with Token-Based Incentives | N. Mastronarde, V. Patel, J. Xu, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6824980 | We consider a cellular network where intelligent cellular devices owned by selfish users are incentivized to cooperate with each other by using tokens, which they exchange electronically to “buy” and “sell” downlink relay services, thereby increasing the network's capacity. We endow each device with the ability to learn its optimal cooperation strategy online in order to maximize its long-term utility in the dynamic network environment. We investigate the impact of the token exchange system on the overall downlink network performance and the performance of individual devices in various deployment scenarios involving mixtures of high and low mobility users. Our results suggest that devices have the greatest incentive to cooperate when the network contains many highly mobile users (e.g., users in motor vehicles). Moreover, within the token system, devices can effectively learn to cooperate online, and achieve over 20% higher throughput on average than with direct transmission alone, all while selfishly maximizing their own utility. | 10.1109/GLOCOMW.2013.6824980 | Conference | IEEE Global Communications Conference (GLOBECOM) International Workshop on Emerging Technologies for LTE-Advanced and Beyond-4G | ||||
2013/01/01 00:00 | Low-complexity reinforcement learning for delay-sensitive compression in networked video stream mining | X. Zhu, C. Lan, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6607600 | In networked video stream mining systems, real-time video contents are captured remotely and, subsequently, encoded and transmitted over bandwidth-constrained networks for classification at the receiver. One key task at the encoder is to adapt its compression on the fly based on time-varying network bandwidth and video characteristics - while attaining low delay and high classification accuracy. In this paper, we formalize the decision at the encoder side as an infinite horizon Markov Decision Process (MDP). We employ low-complexity, model-free reinforcement learning schemes to solve this problem efficiently under dynamic and unknown environment. Our proposed scheme adopts the technique of virtual experience (VE) update to drastically speed up convergence over conventional Q-learning, allowing the encoder to react to abrupt network changes on the order of minutes, instead of hours. In comparison to myopic optimization, it consistently achieves higher overall reward and lower sending delay under various network conditions. | 10.1109/ICME.2013.6607600 | Conference | International Conference on Multimedia and Expo (ICME) | Reinforcement learning | |||
2013/01/01 00:00 | Markov Decision Process Based Energy-Efficient On-Line Scheduling for Slice-Parallel Video Decoders on Multicore Systems | N. Mastronarde, K. Kanoun, D. Atienza, P. Frossard, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6374263 | We consider the problem of energy-efficient on-line scheduling for slice-parallel video decoders on multicore systems with Dynamic Voltage Frequency Scaling (DVFS) enabled processors. In the past, scheduling and DVFS policies in multi-core systems have been formulated heuristically due to the inherent complexity of the on-line multicore scheduling problem. The key contribution of this paper is that we rigorously formulate the problem as a Markov decision process (MDP), which simultaneously takes into account the on-line scheduling and per-core DVFS capabilities; the power consumption of the processor cores and caches; and the loss tolerant and dynamic nature of the video decoder. The objective of the MDP is to minimize long-term power consumption subject to a minimum Quality of Service (QoS) constraint related to the decoder's throughput. We evaluate the proposed on-line scheduling algorithm in Matlab using realistic video decoding traces generated from a cycle-accurate multiprocessor ARM simulator. | 10.1109/TMM.2012.2231668 | Journal | IEEE Transactions on Multimedia | Reinforcement learning | |||
2013/01/01 00:00 | Markov Decision Process Based Energy-efficient Scheduling for Slice-parallel Video Decoding | N. Mastronarde, K. Kanoun, D. Atienza, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6618393 | We consider the problem of energy-efficient scheduling for slice-parallel video decoders on multicore systems with Dynamic Voltage Frequency Scaling (DVFS) enabled processors. We rigorously formulate the problem as a Markov decision process (MDP), which simultaneously considers the on-line scheduling and per-core DVFS capabilities; the power consumption of the processor cores and caches; and the loss tolerant and dynamic nature of the video decoder. The objective is to minimize long-term power consumption subject to a minimum Quality of Service (QoS) constraint related to the decoder's throughput. We evaluate the proposed scheduling algorithm using traces generated from a cycle-accurate multiprocessor ARM simulator. | 10.1109/ICMEW.2013.6618393 | Conference | International Conference on Multimedia and Expo (ICME) | Reinforcement learning | |||
2013/01/01 00:00 | Nonstationary Resource Sharing with Imperfect Binary Feedback: An Optimal Design Framework for Cost Minimization | Y. Xiao, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6736625 | We develop a novel design framework for decentralized resource sharing among self-interested users, who adjust their resource usage levels to minimize the costs of resource usage (e.g. energy consumption or payment) while fulfilling minimum payoff (e.g. throughput) requirements. We model the users' interaction as a repeated resource sharing game with imperfect monitoring, which captures the following features of the considered interaction. First, the users are decentralized and self-interested, i.e. they aim to minimize their own costs based on their locally available information and will not “blindly” follow the prescribed resource sharing rules unless it is in their self-interests to do so. Second, the users coexist in the system for some time and interact with each other repeatedly. Finally, the players receive a binary feedback informing them about the imperfectly measured interference/congestion level. The key feature of our proposed policy is that it is nonstationary, namely the users choose time-varying resource usage levels. This is in contrast with all existing policies, which are stationary and dictate users to choose constant resource usage levels. The proposed nonstationary policy is also deviation-proof, in that the self-interested users find it in their self-interests to comply with the policy, and it can be constructed by a low-complexity online algorithm that is run by each user in a distributed fashion. Moreover, our proposed policy only requires the users to have imperfect binary feedback, as opposed to existing solutions based on repeated game models which require a large amount of feedback. The proposed design framework applies to many resource sharing systems, such as power control, medium access control (MAC), and flow control. As a motivating example, we investigate the performance improvement of our nonstationary policy over state-of-the-art policies in power control, and show that significant performance gain (up to 90% energy saving) can be achieved. | 10.1109/Allerton.2013.6736625 | Conference | Allerton | ||||
2013/01/01 00:00 | Online Learning based Congestion Control for Adaptive Multimedia Transmission | O. Habachi, H. Shiang, M. van der Schaar, Y. Hayel | 2013 | https://ieeexplore.ieee.org/document/6400258 | The increase of Internet application requirements, such as throughput and delay, has spurred the need for transport protocols with flexible transmission control. Current TCP congestion control adopts an Additive Increase Multiplicative Decrease (AIMD) algorithm that linearly increases or exponentially decreases the congestion window based on transmission acknowledgments. In this paper, we propose an AIMD-like media-aware congestion control that determines the optimal congestion window updating policy for multimedia transmission. The media-aware congestion control problem is formulated as a Partially Observable Markov Decision Process (POMDP), which maximizes the long-term expected quality of the received multimedia application. The solution of this POMDP problem gives a policy adapted to multimedia applications' characteristics (i.e., distortion impacts and delay deadlines of multimedia packets). Note that to obtain the optimal congestion policy, the sender requires the complete statistical knowledge of both multimedia traffic and the network environment, which may not be available in practice. Hence, an online reinforcement learning in the POMDP-based solution provides a powerful tool to accurately estimate the environment and to adapt the source to network variations on the fly. Simulation results show that the proposed online learning approach can significantly improve the received video quality while maintaining the responsiveness and TCP-friendliness of the congestion control in various network scenarios. | 10.1109/TSP.2012.2237171 | Journal | IEEE Transactions on Signal Processing | Communications and Networks | |||
2013/01/01 00:00 | Optimal Scheduling over Time-Varying Channels with Traffic Admission Control: Structural Results and Online Learning Algorithms | K.T. Phan, T. Le-Ngoc, M. van der Schaar, F. Fu | 2013 | https://ieeexplore.ieee.org/document/6585729 | This work studies the joint scheduling-admission control (SAC) problem for a single user over a fading channel. Specifically, the SAC problem is formulated as a constrained Markov decision process (MDP) to maximize a utility defined as a function of the throughput and queue size. The optimal throughput-queue size trade-off is investigated. Optimal policies and their structural properties (i.e., monotonicity and convexity) are derived for two models: simultaneous and sequential scheduling and admission control actions. Furthermore, we propose online learning algorithms for the optimal policies for the two models when the statistical knowledge of the time-varying traffic arrival and channel processes is unknown. The analysis and algorithm development rely on the reformulation of Bellman's optimality equations using suitably defined state-value functions, which can be learned online, at transmission time, using time-averaging. The learning algorithms have lower complexity and converge faster than conventional Q-learning algorithms. This work also builds a connection between the MDP based formulation and the Lyapunov optimization based formulation for the SAC problem. Illustrative results demonstrate the performance of the proposed algorithms in various settings. | 10.1109/TW.2013.081913.121525 | Journal | IEEE Transactions on Wireless Communications | Communications and Networks | | |
2013/01/01 00:00 | Rating systems for enhanced cyber-security investments | J. Xu, Y. Zhang, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6638191 | Networked agents often share security risks but lack the incentive to make (sufficient) security investments if the cost exceeds their own benefit, even though doing so would be socially beneficial. In this paper, we develop a systematic and rigorous framework based on rating systems for analyzing and significantly improving the mutual security of a network of agents that interact frequently over a long period of time. When designing the optimal rating systems, we explicitly consider the imperfect monitoring of the agents' investment actions and the heterogeneity of agents in terms of both generated traffic and underlying connectivity. Our analysis shows how the optimal rating system design should adapt to different monitoring and connectivity conditions. Even though this paper considers a simplified model of the networked agents' security, our analysis provides important and useful insights for designing rating systems that can significantly improve the mutual security of real networks in a variety of practical scenarios. | 10.1109/ICASSP.2013.6638191 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | | | |
2013/01/01 00:00 | Reputation Design for Adaptive Networks with Selfish Agents | C.K. Yu, M. van der Schaar, A.H. Sayed | 2013 | https://ieeexplore.ieee.org/document/6612032 | We consider a general information-sharing game over adaptive networks with selfish agents, in which a diffusion strategy is employed to estimate a common target parameter. The benefit and cost of sharing information are embedded into the individual utility functions. We formulate the interactions among selfish agents as successive one-shot games and show that the dominant strategy is for agents not to share information with each other. In order to encourage cooperation among selfish agents, we design a reputation scheme that enables agents to utilize the historic summary of other agents' past actions to predict future returns that would result from being cooperative i.e., from sharing information with other agents. Simulations illustrate the benefits of the combined diffusion and reputation strategies for learning over networks with selfish agents. | 10.1109/SPAWC.2013.6612032 | Conference | IEEE Workshop on Signal Processing Advances in Wireless Communications (SPAWC) | ||||
2013/01/01 00:00 | Robust Reputation Protocol Design for Online Communities: A Stochastic Stability Analysis | Y. Zhang, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6525348 | This paper proposes a new class of incentive mechanisms aiming at compelling self-interested users in online communities to cooperate with each other by exchanging resources or services. Examples of such communities are social multimedia platforms, social networks, online labor markets, crowdsourcing platforms, etc. To optimize their individual long-term performance, users adapt their strategies by solving individual stochastic control problems. The users' adaptation catalyzes a stochastic dynamic process, in which the strategies of users in the community evolve over time. We first characterize the structural properties of the users' best response strategies. Subsequently, using these structural results we design incentive mechanisms based on reputation protocols for governing the online communities, which can “manage” the long-run evolution of the community. We prove that by appropriately penalizing and rewarding users based on their behavior in the community, such incentive mechanisms can eliminate free-riding and ensure that the community converges to a desirable equilibrium selected by the community designer such that social welfare is maximized and in which users find it in their self-interest to cooperate with each other. | 10.1109/JSTSP.2013.2263785 | Journal | IEEE Journal of Selected Topics in Signal Processing | Multi-agent learning, Reinforcement learning | Communications and Networks, Game Theory and Applications | |
2013/01/01 00:00 | Socially-Optimal Design of Crowdsourcing Platforms With Reputation Update Errors | Y. Xiao, Y. Zhang, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6638667 | Crowdsourcing systems (e.g. Yahoo! Answers and Amazon Mechanical Turk) provide a platform for requesters, who have tasks to solve, to ask for help from workers. Vital to the proliferation of crowdsourcing systems is incentivizing the workers to exert high effort to provide high-quality services. Reputation mechanisms have been shown to work effectively as incentive schemes in crowdsourcing systems. A reputation agency updates the reputations of the workers based on the requesters' reports on the quality of the workers' services. A low-reputation worker is less likely to get served when it requests help, which provides incentives for the workers to obtain a high reputation by exerting high effort. However, reputation update errors are inevitable, because of either system errors such as loss of reports, or inaccurate reports, resulting from the difficulty in accurately assessing the quality of a worker's service. The reputation update error prevents existing reputation mechanisms from achieving the social optimum. In this paper, we propose a simple binary reputation mechanism, which has only two reputation labels (“good” and “bad”). To the best of our knowledge, our proposed reputation mechanism is the first that is proven to be able to achieve the social optimum even in the presence of reputation update errors. We provide design guidelines for socially-optimal binary reputation mechanisms. | 10.1109/ICASSP.2013.6638667 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | ||||
2013/01/01 00:00 | Spectrum Sharing Policies for Heterogeneous Delay-Sensitive Users: A Novel Design Framework | Y. Xiao, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6736509 | We develop a novel design framework for spectrum sharing among distributed users with heterogeneous delay-sensitivity (e.g. users with video streaming that requires low delay, and users with video conferencing that requires very low delay). Most existing spectrum sharing policies are stationary, i.e. users transmit at constant power levels simultaneously. Under stationary policies, the users have low throughput due to the strong interference from each other. Nonstationary spectrum sharing policies, which allow users to transmit at time-varying power levels, can significantly improve the spectrum efficiency. The most well-known and simple nonstationary policy is the round-robin TDMA (time-division multiple access) policy, in which the users access the spectrum in turn. Although the round-robin TDMA policy increases the spectrum efficiency by eliminating multi-user interference, it is suboptimal in terms of quality of experience for delay-sensitive users, especially when they have heterogeneous delay-sensitivity. This is because the round-robin TDMA policy allocates the users' transmission opportunities in a predetermined order such that they have (roughly) the same amount of transmission opportunities in any duration of time. However, some users may have earlier deadlines and need more transmission opportunities early on, while some can wait until later. This heterogeneity in delay-sensitivity is not considered in the round-robin TDMA policy. In this paper, we propose nonstationary policies that allocate the transmission opportunities based on the users' delay-sensitivity and their past deadline-abiding transmissions. As we will see, the optimal policy is not cyclic at all as is the round-robin TDMA policy. We also propose a low-complexity algorithm, which can be run by each user in a distributed manner, to construct the optimal nonstationary policy. Simulation results validate our analytical results and quantify the performance gains enabled by the proposed policies. | 10.1109/Allerton.2013.6736509 | Conference | Allerton | ||||
2013/01/01 00:00 | Strategic Information Dissemination and Link Formation in Social Networks | Y. Zhang, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6638668 | In this paper, we propose a novel game-theoretic framework for analyzing and understanding how strategic networks are formed endogenously, driven by the self-interested decisions of individual agents aiming to maximize their own utilities by trading-off the costs and benefits of forming links with other agents. We explicitly model and analyze the scenario in which agents benefit from disseminating their own information to other agents. We rigorously prove that the equilibria of strategic networks frequently exhibit a core-periphery structure, where there are only a few agents at the center (core) of the network while the majority of agents are at the periphery of the network and communicate with other agents via links maintained by the “core” agents, who play the role of “connectors” in the network. Also, we are able to determine under what conditions the strategic networks operating in equilibrium are minimally connected (i.e. there is a unique path between any two agents) and have short network diameters. These properties are commonly observed on the Internet and are important because they ensure the efficiency and robustness of the resulting equilibrium networks. However, none of these has been rigorously proven in a formal framework before. | 10.1109/ICASSP.2013.6638668 | Conference | IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) | | | |
2013/01/01 00:00 | Strategic Information Dissemination in Endogenous Networks | Y. Zhang, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6736550 | The self-interest of agents in strategic networks, i.e. networks where self-interested agents interact, leads to intrinsic incentive problems which impact the stability and efficiency of such networks. This paper proposes the first game-theoretic framework for analyzing and understanding how strategic networks are formed endogenously, driven by the self-interested decisions of individual agents aiming to maximize their own utilities by trading-off the costs of forming links with other agents and the benefits of disseminating information to other agents. The proposed framework departs from the traditional research on strategic link formation in economics which postulates that agents only benefit by forming links to acquire the information produced by other agents. Given the agents' interests in information dissemination, our analysis is able to predict several important properties of the strategic networks (arising from the agents' strategic link formation) at equilibria. We rigorously prove that, in equilibrium, strategic networks frequently exhibit a core-periphery structure that is commonly observed on the Internet. In such core-periphery networks there are only a few agents at the center (core) of the network while the majority of agents are at the periphery of the network and communicate with other agents via links maintained by the “core” agents, who play the role of “connectors” in the network. Also, the proposed framework can be used to determine under what conditions the strategic networks operating in equilibrium are minimally connected (i.e. there is a unique path between any two agents) and have short network diameters. These properties are important because they ensure the efficiency and robustness of the resulting equilibrium networks. | 10.1109/Allerton.2013.6736550 | Conference | Allerton | | | |
2013/01/01 00:00 | Strategic Networks: Information Dissemination and Link Formation Among Self-interested Agents | Y. Zhang, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6517115 | This paper presents the first study of the endogenous formation of networks by strategic, self-interested agents who benefit from producing and disseminating information. This work departs from previous works on network formation (especially in the economics literature) which assume that agents benefit only by acquiring information produced by other agents. The strategic production and dissemination of information have striking consequences. We show first that the network structure that emerges (in equilibrium) typically displays a core-periphery structure, with the few agents at the core playing the role of "connectors", creating and maintaining links to the agents at the periphery. We then determine conditions under which the networks that emerge are minimally connected and have short network diameters (properties that are important for efficiency). Finally, we show that the number of agents who produce information and the total amount of information produced in the network grow at the same rate as the agent population; this is in stark contrast to the "law of the few" that had been established in previous works which do not consider information dissemination. | 10.1109/JSAC.2013.130613 | Journal | IEEE Journal on Selected Areas in Communications | Multi-agent learning | Communications and Networks, Game Theory and Applications, Networks | |
2013/01/01 00:00 | Tiered Billing Scheme for Residential Load Scheduling with Bidirectional Energy Trading | Kim, S. Ren, M. van der Schaar, J.-W. Lee | 2013 | https://ieeexplore.ieee.org/document/6562889 | Future generation smart grids will allow customers to trade energy bidirectionally. Specifically, each customer will be able to not only buy energy from the aggregator during its peak hours but also sell its surplus energy during its off-peak hours. In these emerging energy trading markets, a key component will be the deployment of effective energy billing schemes which consider the customers' residential load scheduling. In this paper, we consider a residential load scheduling problem with bidirectional energy trading. Compared with the previous work, in which customers are assumed to be obedient and agree to maximize the social welfare of the smart grid system, in this paper, we consider a non-collaborative approach, where consumers are self-interested. We model the energy scheduling problem as a non-cooperative game, where each customer determines its load scheduling and energy trading to maximize its own profit. In order to resolve the unfairness between heavy and light customers, we propose a novel tiered billing scheme that can control the electricity rates for customers according to their different energy consumption levels. We also propose a distributed energy scheduling algorithm that converges to the unique Nash equilibrium of the studied non-cooperative game. Through numerical results, we study the impact of the proposed tiered billing scheme on the selfish customers' behavior and on their incentives to participate in the energy trading market. | 10.1109/INFCOMW.2013.6562889 | Conference | IEEE International Conference on Computer Communications (INFOCOM) Workshop on Smart Data Pricing (SDP) | | | |
2013/01/01 00:00 | Token System Design for Autonomic Wireless Relay Networks | J. Xu, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6544193 | This paper proposes a novel framework for incentivizing self-interested transceivers operating in autonomic wireless networks to provide relaying services to other transceivers in exchange for tokens. Tokens represent a simple internal currency which can be used by the transceivers in a network to exchange services. Our emphasis in this paper is on developing optimal designs for the token system, which maximize the system efficiency, i.e. the probability that the relay transmission will be executed by transceivers whenever they are requested to provide such services. Particularly, we prove that the efficiency of the relay network heavily depends on issuing the proper amount of tokens rather than an arbitrary amount. First, we study the transceivers' optimal strategies (i.e. the strategies that maximize the transceivers' own utilities) using the formalism of repeated games. We prove that these strategies exhibit a simple threshold structure. We also prove that the threshold is unique given transmission costs. Second, we determine the optimal token amount which needs to be introduced in the relay system to maximize the overall relay network efficiency. This amount needs to be neither too small (since too small an amount leads to a low relaying service request probability) nor too large (since too large an amount leads to a low relaying service provision probability) and depends on the threshold strategy that the self-interested transceivers adopt. We subsequently develop an efficient algorithm which is able to determine, depending on the network characteristics, the threshold to be implemented by the optimal strategies and the optimal token amount. Finally, simulation results show the effectiveness of our token system design in providing incentives for cooperation among self-interested relays in autonomic wireless relay networks. | 10.1109/TCOMM.2013.061013.120777 | Journal | IEEE Transactions on Communications | Communications and Networks | | |
2013/01/01 00:00 | Utility-based Server Management Strategy in Cloud Networks | E. Choi, S. Song, H. Kim, J. Hong, H. Park, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6825031 | Numerous real-time applications are currently deployed over mobile networks - ranging from multimedia streaming, to real-time online games, to video conferences/chat, to real-time stock exchanges, etc. Supporting such applications is challenging because they have very stringent Quality of Service (QoS) requirements in terms of both throughput and delay. To address this challenge, in this paper, we propose to assist such mobile applications by a cloud-based network environment, which consists of multiple servers and clients (users). In cloud networks, servers can create multiple replicas of popular content in order to provide the needed QoS for the various service requests of the users. However, in order to efficiently provide services to the potentially large number of service requests, it is essential for servers to strategically respond to the requests from users. Since the strategic response involves the resource management of the servers, we design a strategy which explicitly considers the service requests (e.g., service types, data priorities, delay constraints) as well as the servers' resource usages (e.g., current loads of servers, contributions of servers). Since the servers are strategic, they will aim to maximize their own utilities. Simulation results verify that the proposed approach enables servers to manage their resources more efficiently compared to existing approaches, thereby providing prioritized data processing, operating in a desired range, and also leading to improved performance for the users. | 10.1109/GLOCOMW.2013.6825031 | Conference | IEEE Global Communications Conference (GLOBECOM) Workshop on Cloud Computing Systems, Networks, and Applications (CCSNA) | | | |
2013/01/01 00:00 | Winning the Lottery: Learning Perfect Coordination with Minimal Feedback | W. R. Zame, J. Xu, M. van der Schaar | 2013 | https://ieeexplore.ieee.org/document/6507338 | Coordination is a central problem whenever stations (or nodes or users) share resources across a network. In the absence of coordination, there will be collision, congestion or interference, with concomitant loss of performance. This paper proposes new protocols, which we call perfect coordination (PC) protocols, that solve the coordination problem. PC protocols are completely distributed (requiring neither central control nor the exchange of any control messages), fast (with speeds comparable to those of any existing protocols), fully efficient (achieving perfect coordination, with no collisions and no gaps) and require minimal feedback. PC protocols rely heavily on learning, exploiting the possibility to use both actions and silence as messages and the ability of stations to learn from their own histories while simultaneously enabling the learning of other stations. PC protocols can be formulated as finite automata and implemented using currently existing technology (e.g., wireless cards). Simulations show that, in a variety of deployment scenarios, PC protocols outperform existing state-of-the-art protocols, despite requiring much less feedback. | 10.1109/JSTSP.2013.2259465 | Journal | IEEE Journal of Selected Topics in Signal Processing | Multi-agent learning, Reinforcement learning | Communications and Networks, Game Theory and Applications | |
2012/01/01 00:00 | Analytical Modeling for Delay-Sensitive Video over WLAN | H. Bobarshad, M. van der Schaar, A. H. Aghvami, R. S. Dilmaghani, M. Shikh-Bahaei | 2012 | https://ieeexplore.ieee.org/document/6061963 | Delay-sensitive video transmission over IEEE 802.11 wireless local area networks (WLANs) is analyzed in a cross-layer optimization framework. The effect of the delay constraint on the quality of received packets is studied by analyzing the “expired-time packet discard rate”. Three analytical models are examined and it is shown that the M/M/1 model is quite an adequate model for analyzing delay-limited applications such as live video transmission over WLAN. The optimal MAC retry limit corresponding to the minimum “total packet loss rate” is derived by exploiting both mathematical analysis and NS-2 simulations. We show that there is an interaction between “packet overflow drop” and “expired-time packet discard” processes in the queue. Subsequently, by introducing the concept of virtual buffer size, we obtain the optimal buffer size in order to avoid “packet overflow drop”. We finally introduce a simple yet effective real-time algorithm for retry-limit adaptation over IEEE 802.11 MAC in order to maintain loss protection for delay-critical video traffic transmission, and show that the average link-layer throughput can be improved by using our adaptive scheme. | 10.1109/TMM.2011.2173477 | Journal | IEEE Transactions on Multimedia | Communications and Networks | | |
2012/01/01 00:00 | Collective Ratings for Online Labor Markets | Y. Zhang, M. van der Schaar | 2012 | https://ieeexplore.ieee.org/document/6483242 | In online labor markets, experts sell their expertise to buyers. Despite the success and the perceived promise of online labor markets, they face a serious practical challenge: providing appropriate incentives for experts to participate and exert effort to accurately (successfully) complete tasks. Personal rating schemes have been proposed to address this challenge: they provide differentiated reward/punishment to experts in order to incentivize them to cooperate (i.e. to do their best to complete tasks). However, when the transactions in a market are subject to errors, experts are frequently punished wrongly when personal rating schemes are deployed. This not only reduces the experts' incentives to cooperate, but also harms market performance measures such as the obtained social welfare and revenue. To mitigate the problem of wrong punishments, we develop a novel game-theoretic formalism based on collective ratings. We formalize an online labor market as a two-sided trading platform where buyers and experts interact repeatedly. The market designer's problem is to create a market policy that maximizes the market's revenue subject to the constraints imposed by the characteristics of the market and the incentives of the participants. We propose to organize such markets by dividing experts into groups for which a collective rating is created and maintained based on the buyers' aggregated feedback. We analyze how the group size and the adopted rating scheme affect the market's revenue and the social welfare of the participants in the market, and determine the optimal design of the market policy. We show that collective ratings are surprisingly more effective and more robust than personal ratings for a wide variety of online labor markets. | 10.1109/Allerton.2012.6483242 | Conference | Allerton | | | |
2012/01/01 00:00 | Congestion, Information, and Secret Information in Flow Networks | K. T. Phan, M. van der Schaar, W. R. Zame | 2012 | https://ieeexplore.ieee.org/document/6121929 | Some users of a communications network may have more information about traffic on the network than others do, and this fact may be secret. Such secret information would allow the possessor to tailor its own traffic to the traffic of others; this would help the secret information possessor or informed user and (might) harm other uninformed users. To quantitatively study the impact of secret information, we formulate a flow control game with incomplete information where users choose their flows in order to maximize their (expected) utilities given the distribution of the actions of others. In this environment, the natural baseline notion is Bayesian Nash Equilibrium (BNE); we establish the existence of BNE. Next, we assume that there is a user who knows the realized congestion created by other users, but that the presence of this informed user is not known by other uninformed users; thus, the informed user has secret information. For this environment, we define a new equilibrium concept: the Bayesian Nash Equilibrium with Secret Information (BNE-SI) and establish its existence. We establish rigorous estimates for the benefit (to the informed user) and harm (to the uninformed users) that result from secret information; both the benefit and the harm become smaller for large networks. Interestingly, simulations demonstrate that secret information may in fact benefit all users. Secret information may also harm uninformed users in particular scenarios. This analysis can be used as a starting point for securing communications networks, both from the network manager and the user's perspectives. | 10.1109/JSTSP.2011.2182496 | Journal | IEEE Journal of Selected Topics in Signal Processing | Communications and Networks | | |
2012/01/01 00:00 | Data Demand Dynamics in Wireless Communications Markets | S. Ren, M. van der Schaar | 2012 | https://ieeexplore.ieee.org/document/6093980 | In this paper, we focus on the users' aggregate data demand dynamics in a wireless communications market served by a monopolistic wireless service provider (WSP). Based on the equilibrium data demand, we optimize the WSP's data plans and long-term network capacity decisions to maximize its profit. First, by considering a market where only one data plan is offered, we show that there exists a unique equilibrium in the data demand dynamics regardless of the data plans, and that the convergence of data demand dynamics is subject to the network congestion cost, which is closely related to the WSP's long-term capacity decision. A sufficient condition on the network congestion cost indicates that the WSP needs to provide a sufficiently large network capacity to guarantee the convergence of data demand dynamics. We also propose a heuristic algorithm that progressively optimizes the WSP's data plan to maximize its equilibrium revenue. Next, we turn to a market where two different data plans are offered. It is shown that the existence of a unique equilibrium data demand depends on the data plans, and the convergence of data demand dynamics is still subject to the network congestion cost (and hence, the WSP's network capacity, too). We formalize the problem of optimizing the WSP's data plans and network capacities to maximize its profit. Finally, we discuss the scenario in which the data plans are offered by two competing WSPs and conduct extensive simulations to validate our analysis. | 10.1109/TSP.2011.2177826 | Journal | IEEE Transactions on Signal Processing | Communications and Networks | |||
2012/01/01 00:00 | Designing Incentives for Wireless Relay Networks Using Tokens | J. Xu, M. van der Schaar | 2012 | https://ieeexplore.ieee.org/document/6260451 | This paper proposes a novel system design for wireless relay networks formed of self-interested users that relies on token exchanges. Our emphasis in this paper is on developing optimal designs for token systems to be deployed in relay networks. The optimal designs aim to maximize the probability that the relay transmission will be executed by transceivers whenever they are requested to provide such services. We prove that the efficiency of the relay network heavily depends on issuing the optimal amount of tokens rather than an arbitrary amount. We formulate the design problem of the token system as a bi-level optimization problem. In the inner level optimization problem, we determine the transceivers' incentive-compatible strategies (i.e. the strategies that maximize the transceivers' own utilities). We prove that these strategies exhibit a simple threshold structure. The outer level problem determines the optimal token amount, which maximizes the overall relay network efficiency. We prove that the optimal amount of tokens needs to be neither too small nor too large and depends on the threshold that the self-interested transceivers adopt in the inner level problem. | Conference | International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt) | | | |
2012/01/01 00:00 | Distributed Spectrum Sharing Policies for Selfish Users with Imperfect Monitoring Ability | Y. Xiao, M. van der Schaar | 2012 | https://ieeexplore.ieee.org/document/6489093 | We develop a novel design framework for distributed spectrum sharing among secondary users (SUs), each one of which adjusts its power level to maximize its own payoff (e.g. throughput) while satisfying the interference temperature (IT) constraints imposed by primary users. Since the SUs can coexist in the system for a long time, we propose spectrum sharing policies that allow users to transmit in a time-division multiple-access (TDMA) fashion. In the presence of strong multi-user interference, our proposed TDMA policy outperforms existing spectrum sharing policies that dictate users to transmit at constant power levels simultaneously. Our proposed policy achieves Pareto optimality even when the SUs have limited and imperfect monitoring ability: they only observe whether the IT constraints are violated, and their observation is imperfect due to erroneous measurements. In addition, our policy is deviation-proof, such that the autonomous users will find it in their self-interests to follow the policy. The policy can be implemented by the users in a distributed manner. Simulation results validate our analytical results and quantify the performance gains enabled by the proposed spectrum sharing policies. | 10.1109/ACSSC.2012.6489093 | Conference | Asilomar Conference on Signals, Systems, and Computers | | | |
2012/01/01 00:00 | Dynamic Spectrum Sharing Among Repeatedly Interacting Selfish Users With Imperfect Monitoring | Y. Xiao, M. van der Schaar | 2012 | https://ieeexplore.ieee.org/document/6331680 | We develop a novel design framework for dynamic distributed spectrum sharing among secondary users (SUs), who adjust their power levels to compete for spectrum opportunities while satisfying the interference temperature (IT) constraints imposed by primary users. The considered interaction among the SUs is characterized by the following three unique features. First, the SUs are interacting with each other repeatedly and they can coexist in the system for a long time. Second, the SUs have limited and imperfect monitoring ability: they only observe whether the IT constraints are violated, and their observation is imperfect due to the erroneous measurements. Third, since the SUs are decentralized, they are selfish and aim to maximize their own long-term payoffs from utilizing the network rather than obeying the prescribed allocation of a centralized controller. To capture these unique features, we model the interaction of the SUs as a repeated game with imperfect monitoring. We first characterize the set of Pareto optimal operating points that can be achieved by deviation-proof spectrum sharing policies, which are policies that the selfish users find it in their interest to comply with. Next, for any given operating point in this set, we show how to construct a deviation-proof policy to achieve it. The constructed deviation-proof policy is amenable to distributed implementation, and allows users to transmit in a time-division multiple-access (TDMA) fashion. In the presence of strong multi-user interference, our policy outperforms existing spectrum sharing policies that dictate users to transmit at constant power levels simultaneously. Moreover, our policy can achieve Pareto optimality even when the SUs have limited and imperfect monitoring ability, as opposed to existing solutions based on repeated game models, which require perfect monitoring abilities. Simulation results validate our analytical results and quantify the performance gains enabled by the proposed spectrum sharing policies. | 10.1109/JSAC.2012.121105 | Journal | IEEE Journal on Selected Areas in Communications | Multi-agent learning | Communications and Networks | ||
2012/01/01 00:00 | Energy-Efficient Community Cloud for Real-Time Stream Mining | S. Ren, M. Van der Schaar | 2012 | https://ieeexplore.ieee.org/document/6425967 | Real-time stream mining such as surveillance and personal health monitoring is computation-intensive and prohibitive for mobile devices due to the hardware/computation constraints. To satisfy the growing demand for stream mining in mobile networks, we propose to employ a cloud-based stream mining system in which the mobile devices send via wireless links unclassified media streams to the cloud for classification. We focus on minimizing the classification-energy cost, defined as an affine combination of classification cost and energy consumption at the cloud, subject to an average stream mining delay constraint (which is important in real-time applications). To address the challenge of time-varying wireless channel conditions without a priori information about the channel statistics, we develop an online algorithm in which the cloud operator can adjust its resource provisioning on the fly and the mobile devices can adapt their transmission rates to the instantaneous channel conditions. It is proved that, at the expense of increasing the average stream mining delay, the online algorithm achieves a classification-energy cost that can be pushed arbitrarily close to the minimum cost achieved by the optimal offline algorithm. Extensive simulations are conducted to validate the analysis. | 10.1109/CDC.2012.6425967 | Conference | IEEE Conference on Decision and Control (CDC) | ||||
2012/01/01 00:00 | Energy-efficient Delay-critical Communication in Unknown Wireless Environments | N. Mastronarde, M. van der Schaar | 2012 | http://mmc.committees.comsoc.org/files/2016/04/E-Letter-November12.pdf | Delay-critical multimedia applications such as videoconferencing, surveillance, medical health monitoring, etc. often operate in dynamic wireless environments where they experience time-varying and a priori unknown channel conditions and traffic loads. We propose to learn the impact of these dynamics on the user’s utility using a novel class of online reinforcement learning techniques that do not require a priori specified models of these environments. | Other | IEEE Communications Society Multimedia Communications Technical Committee E-letter | |||||
2012/01/01 00:00 | Information Production and Link Formation in Social Computing Systems | Y. Zhang, M. van der Schaar | 2012 | https://ieeexplore.ieee.org/document/6354272 | Social computing provides a popular, cost-effective and scalable framework for building new engineering systems as well as improving the performance of numerous existing systems. However, the self-interest of the agents in such systems generates intrinsic incentive problems. This work analyzes these incentive problems from several points of view. First, we analyze the trade-offs (of each individual agent) between the costs and benefits of producing information personally and forming links to collect information (from other agents), and the strategic implications of these trade-offs. A central point of the analysis is that information is assumed to be heterogeneous (rather than homogeneous as in previous analyses) and agents value this heterogeneity. The analysis has implications for the topology that emerges endogenously. For large populations, the implication is that the topology is necessarily of a core-periphery type: hub agents (at the core of the network) produce and share most of the information, while spoke agents (at the periphery of the network) derive most of their information from hub agents, producing little of it themselves. As the population becomes larger, the number of hub agents and the total amount of information produced grow in proportion to the total population. Our conclusions had been conjectured for many social computing systems but had not been previously derived in any formal framework, and are in stark contradiction to the "law of the few" that had been established in previous work, under the assumption that information is homogeneous and part of the endowment of agents, rather than heterogeneous and produced. | 10.1109/JSAC.2012.121206 | Journal | IEEE Journal on Selected Areas in Communications | Multi-agent learning | Communications and Networks, Networks | |
2012/01/01 00:00 | Intervention in Power Control Games With Selfish Users | Y. Xiao, J. Park, M. van der Schaar | 2012 | https://ieeexplore.ieee.org/document/6093719 | We study the power control problem in single-hop wireless ad hoc networks with selfish users. Without incentive schemes, selfish users tend to transmit at their maximum power levels, causing excessive interference to each other. In this paper, we study a class of incentive schemes based on intervention to induce selfish users to transmit at desired power levels. In a power control scenario, an intervention scheme can be implemented by introducing an intervention device that can monitor the power levels of users and then transmit power to cause interference to users if necessary. Focusing on first-order intervention rules based on individual transmit powers, we derive conditions on the intervention rates and the power budget to achieve a desired outcome as a (unique) Nash equilibrium with intervention and propose a dynamic adjustment process to guide users and the intervention device to the desired outcome. We also analyze the effect of using aggregate receive power instead of individual transmit powers. Our results show that intervention schemes can be designed to achieve any positive power profile while using interference from the intervention device only as a threat. Lastly, simulation results are presented to illustrate the performance improvement from using intervention schemes and the theoretical results. | 10.1109/JSTSP.2011.2177811 | Journal | IEEE Journal of Selected Topics in Signal Processing | Communications and Networks | |||
2012/01/01 00:00 | Maximizing Profit on User Generated Content Platforms with Heterogeneous Participants | S. Ren, J. Park, M. van der Schaar | 2012 | https://ieeexplore.ieee.org/document/6195804 | In this paper, we consider a user-generated content platform monetized through advertising and managed by an intermediary. To maximize the intermediary's profit given the rational decision-making of content viewers and heterogeneous content producers, a payment scheme is proposed in which the intermediary can either tax or subsidize the content producers. First, we use a model with a representative content viewer to determine how the content viewers' attention is allocated across available content by solving a utility maximization problem. Then, by modeling the content producers as self-interested agents making independent production decisions, we show that there exists a unique equilibrium in the content production stage, and propose a best-response dynamics to model the decision-making process. Next, we study the intermediary's optimal payment based on decisions made by the representative content viewer and the content producers. In particular, by considering the well-known quality-adjusted Dixit-Stiglitz utility function for the representative content viewer, we derive explicitly the optimal payment maximizing the intermediary's profit and characterize analytical conditions under which the intermediary should tax or subsidize the content producers. Finally, we generalize the analysis by considering heterogeneity in terms of production costs among the content producers. | 10.1109/INFCOM.2012.6195804 | Conference | IEEE International Conference on Computer Communications (INFOCOM) | ||||
2012/01/01 00:00 | Mitigating Uncertainty in Stackelberg Games | S. Parsaeefard, M. Van der Schaar, A. Sharafat | 2012 | https://ieeexplore.ieee.org/document/6426583 | We present robust Stackelberg games with two types of players: leaders who have side information and can identify the actions of other players, and followers who do not have such information. In such games, a leader chooses its actions based on its side information, and a follower chooses its actions myopically based on its observations. However, in many cases, neither leaders nor followers can obtain accurate measurements, and there is a need to study the effect of uncertainty on the players' actions. In this paper, we introduce two types of robust equilibria for single-leader single-follower Stackelberg games, compare the performance of the robust game with that of the game that has exact side information, and validate our theoretical results by way of numerical calculations for the power control game in interference channels. | 10.1109/CDC.2012.6426583 | Conference | IEEE Conference on Decision and Control (CDC) | ||||
2012/01/01 00:00 | MOS-based Congestion Control for Conversational Services | O. Habachi, Y. Hu, M. van der Schaar, Y. Hayel, F. Wu | 2012 | https://ieeexplore.ieee.org/document/6248265 | Nowadays, multimedia applications and specifically streaming systems over wireless networks use the TCP transport protocol. Indeed, TCP can deal with practical issues such as firewalls and also deploys built-in retransmissions and congestion control mechanisms. We propose in this paper a Quality-centric Mean Opinion Score (MOS) based congestion control that determines an optimal congestion window updating policy for multimedia transmission. Unlike the standard congestion control algorithms, our approach defines a new Additive Increase Multiplicative Decrease (AIMD) algorithm given the multimedia application and the transmission characteristics. In order to get the optimal congestion policy in practice, the sender requires complete statistical knowledge of both multimedia traffic and the network environment, which may not be available in wireless systems. Hence, we propose in this paper a Partially Observable Markov Decision Process (POMDP) framework in order to determine an optimal congestion control policy which maximizes the long term expected Quality of Experience (QoE) of the receiver. Moreover, the computation of an optimal policy is usually time/process consuming and as wireless devices are capacity-limited, we consider optimal solutions based on temporal difference (TD-λ) online learning algorithms. Finally, we conduct practical experiments with our algorithm on a Microsoft Lync testbed with unidirectional and bidirectional communications over a wireless network. We observe that for both scenarios, our algorithm significantly improves the QoE compared to the standard AIMD congestion control mechanism. | 10.1109/JSAC.2012.120808 | Journal | IEEE Journal on Selected Areas in Communications | Communications and Networks | | |
2012/01/01 00:00 | Near-Optimal Deviation-Proof Medium Access Control Designs in Wireless Networks | K. T. Phan, J. Park, M. van der Schaar | 2012 | https://dl.acm.org/doi/10.1109/TNET.2011.2182359 | Distributed medium access control (MAC) protocols are essential for the proliferation of low-cost, decentralized wireless local area networks (WLANs). Most MAC protocols are designed with the presumption that nodes comply with prescribed rules. However, selfish nodes have natural motives to manipulate protocols in order to improve their own performance. This often degrades the performance of other nodes as well as that of the overall system. In this paper, we propose a class of protocols that limit the performance gain from selfish manipulation while incurring only a small efficiency loss. The proposed protocols are based on the idea of a review strategy, with which nodes collect signals about the actions of other nodes over a period of time, use a statistical test to infer whether or not other nodes are following the prescribed behavior, and trigger a punishment if a deviation is inferred. We consider the cases of private and public signals and provide analytical and numerical results to demonstrate the properties of the proposed protocols. | 10.1109/TNET.2011.2182359 | Journal | IEEE/ACM Transactions on Networking | Communications and Networks | |||
2012/01/01 00:00 | Online Learning in BitTorrent Systems | R. Izhak-Ratzin, H. Park, M. van der Schaar | 2012 | https://ieeexplore.ieee.org/document/6171168 | We propose a BitTorrent-like protocol based on an online learning (reinforcement learning) mechanism, which can replace the peer selection mechanisms in the regular BitTorrent protocol. We model the peers' interactions in the BitTorrent-like network as a repeated stochastic game, where the strategic behaviors of the peers are explicitly considered. A peer that applies the reinforcement learning (RL)-based mechanism uses the observations on the associated peers' statistical reciprocal behaviors to determine its best responses and estimate the corresponding impact on its expected utility. The policy determines the peer's resource reciprocations such that the peer can maximize its long-term performance. We have implemented the proposed mechanism and incorporated it into an existing BitTorrent client. Our experiments performed on a controlled Planetlab testbed confirm that the proposed protocol (1) promotes fairness and provides incentives to contribute resources, i.e., high-capacity peers improve their download completion time by up to 33 percent, (2) improves the system stability and robustness, i.e., reduces the peer selection fluctuations by 57 percent, and (3) discourages free-riding, i.e., peers reduce their uploads to free-riders by 64 percent as compared to the regular BitTorrent protocol. | 10.1109/TPDS.2012.90 | Journal | IEEE Transactions on Parallel and Distributed Systems | Reinforcement learning | Communications and Networks | |
2012/01/01 00:00 | Peer-to-Peer Multimedia Sharing based on Social Norms | Y. Zhang, M. van der Schaar | 2012 | https://www.sciencedirect.com/science/article/abs/pii/S092359651200032X | Designing incentive schemes for Peer-to-Peer (P2P) multimedia sharing applications, where the participating peers find it in their self-interest to contribute resources rather than to “free-ride”, is challenging due to the unique features exhibited by such networks: large populations of anonymous peers interacting infrequently, asymmetric interests of peers, network errors, multiple concurrent transactions, low-cost implementation requirements, etc. In this paper, to address these challenges, we design and rigorously analyze a new family of incentive protocols that utilizes social norms. In the proposed protocols, each peer maintains a reputation reflecting its past behaviors in the P2P system (i.e. whether or not the peer has followed the social strategy prescribed by the social norm), and the social norm rewards and punishes peers depending on their reputations. We first define the concept of a sustainable social norm, under which no peer has an incentive to deviate from the social strategy prescribed by the protocol. We then formulate the problem of designing optimal social norms, which selects the social norm that maximizes the network performance among all sustainable social norms. In particular, we prove that, given the P2P network and peers' characteristics, social norms can be designed such that it becomes in the self-interest of peers to contribute their contents to the network rather than to free-ride. We also investigate the impact of various punishment schemes on the social welfare, as well as how the optimal social norms should be designed if altruistic and malicious peers are active in the network. Our results show that optimal social norms are capable of deterring free-riding behaviors and providing significant improvements in the sharing efficiency of multimedia P2P networks. | 10.1016/j.image.2012.02.003 | Journal | Signal Processing: Image Communication | Multi-agent learning | Communications and Networks | |
2012/01/01 00:00 | Pricing and Investment for Online TV Content Platforms | S. Ren, M. van der Schaar | 2012 |