During the first wave of the coronavirus pandemic in Italy, mortality rates followed very different trends across regions. Mobility, positivity rates, the availability of primary care and the size of potential infection hubs in schools, workplaces and hospitals are among the strongest statistical predictors of these trends. These are the main results of a study published in Scientific Reports by a group of Italian researchers working at Scuola Superiore Sant’Anna (Pisa), Pennsylvania State University (USA) and Université Laval (Quebec City, Canada).
During the first pandemic wave, the group began monitoring the epidemiological data released by the Italian authorities and tried to relate them to mobility data provided by Google and to publicly accessible data on a range of socio-economic, infrastructural and environmental factors. The group is composed largely of young statisticians and data scientists committed to developing techniques and algorithms in an area of statistics called “Functional Data Analysis”, which studies data in the form of curves and surfaces. It was therefore natural for them to apply these methods to the data, characterizing the epidemic curves and exploring the differences between Italian regions.
“Unfortunately, the quality of the data available to the scientific community,” says Francesca Chiaromonte, professor of statistics at the Scuola Superiore Sant’Anna and Penn State, in presenting the study, “is much lower than what would be necessary to conduct analyses capable of clearly guiding policies to combat the pandemic. This was true in the first half of 2020 and, even if some progress has been made, it remains true today.” The Covid-19 epidemic has brought to Italy an awareness of the limitations in the way authorities collect, process and make available the large amounts of data that could help researchers and policy makers understand complex phenomena and design effective responses.
“Epidemiological data are imperfect and imperfectly distributed, the easiest mobility data to find come to us from Google, and variables that effectively capture potentially relevant demographic, health, infrastructural and environmental aspects are not made readily available by central and local government authorities or by the statistical offices,” Francesca Chiaromonte emphasizes. She continues: “The problem is not that the data do not exist. The data exist, but there is a lack of mechanisms and platforms that make them available in a systematic, integrated and reliable way to researchers who would like to study them.”
Despite these limitations, using data available at the resolution of the Italian regions and applying sophisticated statistical techniques, the researchers who signed the study just published in Scientific Reports have identified some important and significant trends. “We have characterized heterogeneous and staggered epidemics in different areas of Italy, summarizing and quantifying what policy makers, scientists and citizens saw happen between February and April 2020,” comments Marzia Cremona, who obtained a doctorate in Mathematical Models and Methods for Engineering at the Politecnico di Milano, conducted a period of postdoctoral research at Penn State, and then became an assistant professor in Data Science at the Université Laval. “We have identified an extreme, ‘exponential’ epidemic trajectory in Lombardy and in the most affected regions of Northern Italy, and a more moderate, ‘flattened’ trajectory in the rest of the country,” Cremona continues. “In particular, Veneto falls into the second category: there the first positive cases appeared concurrently with those in Lombardy, but an aggressive testing strategy was immediately implemented.”
The study documented strong associations between Covid-19 mortality, mobility, and positivity rates. “These associations persist when models that control for other factors are used,” emphasizes Tobia Boschi, who holds a master’s degree in Mathematical Engineering from the Politecnico di Milano and is now a PhD student in Statistics at Penn State, “so our results, together with those of other studies in Italy and around the world, support the thesis that mobility plays a fundamental role in modulating epidemic curves, and that the positivity rate can be used to monitor the progress of the pandemic.”
The findings also suggest a significant role for factors such as distributed primary care, which appears to mitigate mortality, and the size of potential contagion hubs in hospitals, schools and workplaces, which may aggravate the epidemic. “Certainly these results need confirmation from higher-resolution data, but over the last year evidence has accumulated from several studies, and this could inform policy choices, for example by suggesting short- and medium-term investments to increase decentralized primary care, or strategies to reduce the number of students, patients and workers in the same environment,” notes Lorenzo Testa, a student of economics at the Scuola Superiore Sant’Anna and now a master’s degree student in Data Science and Business Informatics at the University of Pisa.
“We are already extending our study over a longer period of time, comparing different pandemic waves and testing which predictive factors seem to play a similar role and which play a different one,” adds Jacopo Di Iorio, who, after obtaining a PhD in Mathematical Models and Methods for Engineering at the Politecnico di Milano, conducted a year of postdoctoral research at Scuola Superiore Sant’Anna and will soon move to Penn State to continue his postdoctoral research. “Our work demonstrates how Functional Data Analysis techniques can offer original and useful perspectives when applied to this type of data, both Italian and from other parts of the world,” concludes Di Iorio.
As they continue their studies, these young researchers are committed to sharing with the scientific community the techniques, algorithms and data analysis procedures they have developed. “I am proud of the quality of the statistical and computational training they have received at excellent Italian universities, and of their desire to expand their horizons with further training and international collaborations,” says Francesca Chiaromonte. “At Scuola Superiore Sant’Anna, through the EMbeDS (Economics and Management in the era of Data Science) Department of Excellence that I coordinate, we are trying to create a community and provide resources for this new generation of Italian data and computational scientists. We give them data and space,” Chiaromonte concludes, “because there is a concrete possibility that they will be able to improve things.”