A first-of-its-kind COVID-19 analysis authored jointly by the van der Schaar Lab, the University of Cambridge’s Department of Medicine, and a group of Brazilian researchers has been published in Nature Scientific Reports (editor-in-chief: Richard White).
The goal of this research was to provide the scientific community and, in particular, the Brazilian authorities with a ranking of the most important social, health, and economic risk factors related to COVID-19.
Building on previous statistical research (published in The Lancet Global Health) on the importance of ethnicity and socioeconomic status in determining outcome, this new study applies machine learning techniques to the Brazilian SIVEP-Gripe respiratory infection surveillance dataset to study how demographic, patient, socioeconomic and organizational factors influence COVID-19 outcomes.
The authors concluded that socioeconomic and structural factors are as important as biological factors in determining COVID-19 outcomes in Brazil. Particularly important factors were: the state of residence and its development index; the distance to the hospital (especially for rural and less developed areas); the level of education; hospital funding model and strain. Ethnicity was also confirmed to be more important than comorbidities but less than the aforementioned factors.
This is the first known study of its kind to have been conducted in Brazil. Further details can be found below.
Comparing COVID-19 risk factors in Brazil using machine learning: the importance of socioeconomic, demographic and structural factors
Pedro Baqui, Valerio Marra, Ahmed M. Alaa, Ioana Bica, Ari Ercole, Mihaela van der Schaar
Nature Scientific Reports
Abstract
The COVID-19 pandemic continues to have a devastating impact on Brazil. Brazil’s social, health and economic crises are aggravated by strong societal inequities and persisting political disarray. This complex scenario motivates careful study of the clinical, socioeconomic, demographic and structural factors contributing to increased risk of mortality from SARS-CoV-2 in Brazil specifically.
We consider the Brazilian SIVEP-Gripe catalog, a very rich respiratory infection dataset which allows us to estimate the importance of several non-laboratorial and socio-geographic factors on COVID-19 mortality. We analyze the catalog using machine learning algorithms to account for likely complex interdependence between metrics. The XGBoost algorithm achieved excellent performance, producing an AUC-ROC of 0.813 (95% CI 0.810–0.817), and outperforming logistic regression.
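For readers unfamiliar with the metric, the following is a minimal sketch of how an AUC-ROC and a 95% confidence interval can be computed. It is not the authors' code: the data are synthetic, the function names `auc_roc` and `bootstrap_ci` are hypothetical, and the percentile bootstrap is an assumption, since the text above does not say how the interval was obtained.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC via the rank-sum (Mann-Whitney U) formulation.

    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC-ROC needs at least one example of each class")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC-ROC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[k] for k in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc_roc(ys, [scores[k] for k in idx]))
    stats.sort()
    lower = stats[round(alpha / 2 * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example: positives score about one standard deviation higher.
rng = random.Random(42)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(500)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point = auc_roc(labels, scores)
low, high = bootstrap_ci(labels, scores, n_boot=500)
print(f"AUC-ROC {point:.3f} (95% CI {low:.3f}-{high:.3f})")
```

An AUC-ROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two outcomes. Comparing two models in this way means computing the same statistic for each on the same held-out data.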
Using our model we found that, in Brazil, socioeconomic, geographical and structural factors are more important than individual comorbidities. Particularly important factors were: the state of residence and its development index; the distance to the hospital (especially for rural and less developed areas); the level of education; hospital funding model and strain. Ethnicity is also confirmed to be more important than comorbidities but less than the aforementioned factors.
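The text above does not say how factor importance was measured. One generic, model-agnostic way to rank factors is permutation importance: shuffle one feature at a time and record how much the AUC-ROC drops. The sketch below illustrates that idea only and should not be read as the authors' method. The feature names, the fixed scoring function `toy_model`, and the data are all hypothetical.

```python
import random


def auc_roc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def toy_model(row):
    # Hypothetical stand-in for a trained classifier; the weights are made up.
    return 2.0 * row["distance_to_hospital"] + 1.0 * row["age"] + 0.0 * row["noise"]


def permutation_importance(model, rows, labels, features, n_repeats=10, seed=0):
    """Mean drop in AUC-ROC when each feature is shuffled across patients."""
    rng = random.Random(seed)
    baseline = auc_roc(labels, [model(r) for r in rows])
    importance = {}
    for feature in features:
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[feature] for r in rows]
            rng.shuffle(shuffled)
            # Copy each row, replacing only the feature under test.
            permuted = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled)]
            drops.append(baseline - auc_roc(labels, [model(r) for r in permuted]))
        importance[feature] = sum(drops) / n_repeats
    return importance


# Synthetic cohort: the outcome depends on two features plus random noise.
rng = random.Random(1)
features = ["distance_to_hospital", "age", "noise"]
rows = [{f: rng.gauss(0.0, 1.0) for f in features} for _ in range(300)]
labels = [
    1 if 2.0 * r["distance_to_hospital"] + r["age"] + rng.gauss(0.0, 1.0) > 0 else 0
    for r in rows
]

importance = permutation_importance(toy_model, rows, labels, features)
for name, drop in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean AUC-ROC drop {drop:.3f}")
```

A feature the model ignores, such as `noise` here, produces no drop at all, while a feature the model relies on heavily produces a large one. Correlated features can share or mask importance under this scheme, which is one reason rankings like the one above need careful interpretation.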
In conclusion, socioeconomic and structural factors are as important as biological factors in determining the outcome of COVID-19. This has important consequences for policy making, especially on vaccination/non-pharmacological preventative measures, hospital management and healthcare network organization.