The social network has given French and foreign researchers access to large data sets on its users' movements, notably to help anticipate new epidemic peaks.
In early April, Facebook announced that it was making new data sets on population movements available to research teams around the world. Aggregated and anonymized, this data, derived from information recorded by the social network’s mobile application, should make it possible to better predict future epidemic peaks or to adjust public health policies, the social network said.
A month later, researchers at Paris Sciences et Lettres University (PSL), which brings together eleven universities and grandes écoles and is the recipient of this data in France, are cautiously optimistic. “We are starting to have results, but they are not yet fully publishable,” explains Jamal Atif, professor at Paris-Dauphine-PSL University and coordinator of the initiative, who adds:
“From this data, combined with health data, we believe that we can build models to better understand the spread of the pandemic, and have algorithms and models with greater predictive power than current models.”
Geolocation and overview
Facebook provides researchers, in France as in several other countries, with data on population movements. Because a large number of the social network's users have activated geolocation in the application, it has a fairly precise overview of how they move, whether on daily or exceptional journeys. Combined with other data, such as identified clusters or estimated risks of contamination, this information can help “predict” likely developments in the epidemic.
The precise effects of the lockdown are difficult to predict for lack of past points of comparison
The coronavirus pandemic, however, poses special challenges that require departing from the models used for other diseases. First, because “the most complex models, which will certainly allow the best predictions in the long term, require a lot of parameters [on the incubation period or the proportion of asymptomatic carriers, for example], and the sources of information on these points are not so numerous at the moment,” warns Olivier Cappé, CNRS research director at the ENS-PSL. Not to mention that the general lockdown of the population, in place for almost two months, is a historic first whose precise effects are difficult to predict for lack of points of comparison.
Useful as it is, the data on the movements of French people provided by Facebook is not a miracle tool, the researchers warn, and interpreting and using it requires great caution. “Can we use it to try to predict the progression of the epidemic? Yes, but the model will inevitably lag a little behind reality, because we do not really ‘see’ people when they are infected with the disease,” explains Mr. Cappé. “And as soon as we work on real data, we leave the realm of pure concepts: in real data, there are always things that are a little weird. You can never fully automate it; it is something that takes time.”
“Short-term projections”
Researchers at PSL University are therefore currently focusing on models of “short-term projections”, on a scale of a few weeks rather than a few months. “We do a lot of testing,” says Cappé. “We do not have a solution that we are sure is the right one, and we are trying several approaches.” And while the data made available by Facebook is very useful, combining it with other sources of information can only improve the models, the researchers say.
“It’s interesting that there are multiple teams doing things from different data; this is one of the ways you can detect potential biases in the data,” said Cappé. Facebook is not used uniformly by all age groups, and data from telephone operators, used by other research projects, is not equally precise from one department to another, depending on the level of coverage.
For the time being, Facebook data seems on the whole to confirm what Orange’s data showed about the movements of Ile-de-France residents, with the population of Paris estimated to have fallen by 20 to 30%. Researchers from PSL University hope to have useful information in the coming weeks “for Ile-de-France, where health data are a little more precise and were harmonized a little earlier than in other regions,” notes Atif. This information could be of particular interest to the public authorities in this very dense region, where occupancy rates in intensive care units are still particularly high and where the lockdown will be only very partially lifted from May 11.