There are more mobile phones than humans on earth. That presents a unique opportunity for big data and, more importantly, the insights from the data to be applied to greater social good. At this week’s PAPIs Connect—a predictive application programming interface (API) conference in Valencia, Spain—Nuria Oliver, the scientific director of Telefonica’s R&D program, spoke about how to adapt this data via machine learning.
Today, we touch on two of the situations she presented where big data and machine learning gave insight into how governments can better address crises, whether it’s a natural disaster or a disease outbreak.
Where Does All This Big Data Come From?
In this piece we aren’t talking about personalized data or even that which we’re offering via our social media accounts. Rather, this research focused on anonymized aggregated data that comes simply from us using our phones. A cell tower provides an approximate location which means it acts as a sort of encrypted sensor of activity that can range in breadth from a few hundred meters to a few kilometers in diameter.
From this, telecom companies can create social graphs that give fairly accurate estimates of mobility patterns in order to locate areas of need in case of emergency. There’s plenty of data gathered from calls and SMS text messages. Since smartphones and feature phones are equally capable of generating this data, this kind of study can be replicated in cities around the world, including those where feature phones are more prevalent.
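As a rough illustration of how anonymized tower-level records might be turned into mobility estimates, the Python sketch below counts movements between cell towers. The record layout (hashed user ID, timestamp, tower ID) and every name in it are assumptions made for this example, not a description of Telefonica's actual pipeline.

```python
from collections import Counter, defaultdict


def tower_flows(records):
    """Count tower-to-tower transitions across all users.

    records: iterable of (hashed_user_id, timestamp, tower_id) tuples.
    This layout is hypothetical and only serves the illustration.
    """
    by_user = defaultdict(list)
    for user, ts, tower in records:
        by_user[user].append((ts, tower))

    flows = Counter()
    for events in by_user.values():
        events.sort()  # chronological order for each user
        for (_, src), (_, dst) in zip(events, events[1:]):
            if src != dst:  # count only actual moves between towers
                flows[(src, dst)] += 1
    return flows


# Made-up sample records, not real data.
sample = [
    ("u1", 1, "A"), ("u1", 2, "B"), ("u1", 3, "B"),
    ("u2", 1, "A"), ("u2", 5, "B"),
    ("u3", 2, "C"), ("u3", 4, "A"),
]
flows = tower_flows(sample)
```

Because only aggregated counts per tower pair are kept, no individual trajectory is exposed in the result, which is in the spirit of the anonymized approach described above.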
When this mobility data is combined with public information like censuses, it can be put to use for social good, including the following situations highlighted by Oliver:
Urban planning: understanding flows of mobility. The data is more accurate than censuses, which may only happen at 10-12 year intervals
Official statistics: developing more up-to-date figures
Public health tools: understanding how pandemics are propagating and better organizing the response with limited resources
How Big Data is Changing Natural Disaster Response
In autumn 2009, a state in Mexico was experiencing serious flooding. Telefonica partnered with United Nations Global Pulse to analyze the impact of these floods, focusing on two questions.
First, could mobile data help identify the areas affected by floods? Data was aggregated from three sources:
Combined mobile data for geolocation
Census data — for representativeness of mobile data
Precipitation data from the government, for a timeline of both the weather and the government’s notifications
Then, to establish a baseline, user behavior at Christmastime was compared to behavior during the floods, since both events involved heightened mobile phone usage. Whereas usage rose everywhere over Christmas, it jumped only in the affected regions during the flooding.
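One simple way to picture this kind of comparison is to score each region's call volume against its own normal days. The sketch below uses a z-score for that purpose; this is a stand-in technique chosen for illustration, not the method the researchers reported, and the region names, counts and threshold are all invented.

```python
from statistics import mean, stdev


def flag_anomalous_regions(baseline, event, threshold=3.0):
    """Return regions whose usage on one day is far above their norm.

    baseline: {region: [call counts on ordinary days]}  (hypothetical)
    event:    {region: call count on the day of interest}
    """
    flagged = {}
    for region, history in baseline.items():
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # no variation in the history, z-score undefined
        z = (event[region] - mu) / sigma
        if z >= threshold:
            flagged[region] = round(z, 2)
    return flagged


# Invented numbers for two imaginary regions.
baseline = {
    "riverside": [100, 104, 98, 102, 96],
    "hilltop": [200, 210, 190, 205, 195],
}
flood_day = {"riverside": 180, "hilltop": 204}
christmas_day = {"riverside": 170, "hilltop": 330}

flood_flags = flag_anomalous_regions(baseline, flood_day)
christmas_flags = flag_anomalous_regions(baseline, christmas_day)
```

In this toy data, the holiday lifts usage in both regions while the flood day stands out only in one of them, mirroring the contrast described above.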
The second question was whether the government’s warning had been effective. The data indicated that it wasn’t: the jump in phone usage didn’t occur at the time of the warning, but hours later, when the flooding was rampant. From this, it became clear that the Mexican government and the regional government had to rethink their ways of reaching out to citizens.
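The timing argument can be sketched in a few lines: find the first hour in which usage clearly exceeds its normal level and compare it to the hour of the warning. The hourly counts, the doubling threshold and the function name below are assumptions for illustration only.

```python
def first_spike_hour(hourly_calls, baseline_level, factor=2.0):
    """Return the index of the first hour whose call count reaches
    factor * baseline_level, or None if no hour does."""
    for hour, calls in enumerate(hourly_calls):
        if calls >= factor * baseline_level:
            return hour
    return None


# Invented hourly call counts; the warning is assumed to go out at hour 2.
hourly = [50, 52, 49, 55, 51, 53, 120, 160]
warning_hour = 2

spike = first_spike_hour(hourly, baseline_level=50)
lag = None if spike is None else spike - warning_hour
```

A positive lag of several hours, as in this made-up series, would suggest that people reacted to the event itself rather than to the warning.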
How Machine Learning Could Slow Disease Outbreak
Spring 2009 saw the first H1N1, or swine flu, outbreak in Mexico. As it spread, the government’s response changed. First, there was a simple warning to avoid crowds. Then the Mexican government closed schools, museums and other public institutions; a pro soccer game still went ahead, but no one was allowed to attend it. Finally, after the World Health Organization raised the H1N1 outbreak to its highest level of risk, pandemic, the federal government shut down all non-emergency services in order to restrain mobility with an eye toward containing the pandemic.
To establish a baseline, Telefonica compared five months of mobile phone records from the outbreak to the same period of the previous year. They found that the first-stage recommendation didn’t change behavior. “When the government just issues a recommendation, we don’t think it affects us, just our neighbors,” Oliver said. At the second stage, mobility fell by 80 percent. The third stage didn’t reduce mobility as much, but that comparison covered the holiday period from May 1 (Labour Day) to Cinco de Mayo.
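The arithmetic behind a figure like "80 percent" is a relative change against the baseline. A minimal sketch, with trip counts invented purely to show the calculation:

```python
def mobility_reduction(baseline_trips, observed_trips):
    """Percentage drop in observed mobility relative to the baseline."""
    if baseline_trips <= 0:
        raise ValueError("baseline_trips must be positive")
    return 100.0 * (baseline_trips - observed_trips) / baseline_trips


# Hypothetical (baseline, observed) trip counts per response stage.
stages = {
    "recommendation": (1000, 1000),
    "closures": (1000, 200),
}
reductions = {
    name: mobility_reduction(base, seen)
    for name, (base, seen) in stages.items()
}
```

The choice of baseline matters: if the baseline days are themselves unusual, such as holidays, the resulting percentage is harder to interpret.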
The other part of the research, which has been replicated in situations like Ebola, tested whether the governmental measures actually had an impact on the spread of this influenza, by bringing together data from the cell towers with a disease model at its different stages of propagation. The study found a ten percent decrease in the spread and a 40-hour delay, which is a significant coup for medical service preparedness.
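The article does not say which disease model was used. To show the general idea of why reduced mobility can both shrink and delay an outbreak, here is a textbook SIR (susceptible, infected, recovered) sketch in which a lower contact rate stands in for reduced mobility. All parameter values are invented, and the output is not meant to reproduce the study's figures.

```python
def simulate_sir(beta, gamma=0.2, population=1_000_000,
                 infected0=10, days=300):
    """Discrete-time SIR model with one-day steps.

    beta:  daily contact (transmission) rate, lowered here as a
           stand-in for reduced mobility (an assumption)
    gamma: daily recovery rate
    Returns the number of infected people at the start of each day.
    """
    s = float(population - infected0)
    i = float(infected0)
    infected_by_day = []
    for _ in range(days):
        infected_by_day.append(i)
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
    return infected_by_day


def peak(series):
    """Return (day, value) of the maximum of a series."""
    day = max(range(len(series)), key=series.__getitem__)
    return day, series[day]


base_day, base_peak = peak(simulate_sir(beta=0.5))
reduced_day, reduced_peak = peak(simulate_sir(beta=0.4))
```

With the lower contact rate, the peak arrives later and is smaller, which is the qualitative effect that buys time for medical services.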
Even this non-specific generalized mobile data was enough to create significant clustering maps of mobility so lessons can be learned for the next disaster.
Where to Now?
These projects took years because they were the first of their kind and they came with the particular challenge of forging those first public-private partnerships. The next logical step is to replicate this learning in other countries to make disaster outreach more efficient. Big data and machine learning for social good are moving toward real-time data, for uses like sending the Red Cross into the most affected areas, and toward mixing this data with social media in the same way that Facebook and Twitter have been leveraged to mark people as safe and to promote safe havens during recent emergencies.
While this article discusses just a handful of real-world cases, the potential of data for social good is limitless. Have you heard of another impressive use case where data is improving lives when people most need it? Tell us in the comments below or tweet to @ProgrammableWeb and @JKRiggins so we can continue to highlight #tech4good!