Unique in the crowd: The privacy bounds of human mobility

2013 study in Scientific Reports demonstrating that small amounts of geographical data from cell phones can be used to identify individuals.

Internet commerce is increasingly based on the exploitation of “anonymized” consumer data collected by third parties and used by all manner of businesses and institutions. In the era of “Big Data” — the computer algorithmic analysis of huge amounts of information — the degree of anonymity anyone can expect on Internet-based telecommunications platforms has rapidly diminished, and substantial questions have arisen about the true state of Internet privacy.

The movements and locations of individuals have traditionally been regarded as one of the most sensitive areas of privacy, and companies such as Google and Apple are increasingly collecting such data. In the wake of revelations that the National Security Agency (NSA) has accessed information from major Internet companies (including Google, Microsoft, Facebook, Skype, Apple and Yahoo), a debate has begun to unfold over how revealing such information can be. How important might small bits of data or “metadata” be, from phone numbers and GPS tracking data to even just the location “pings” recorded by cellular telecommunications towers?

A 2013 study published in Scientific Reports, a journal from the publisher of Nature, “Unique in the Crowd: The Privacy Bounds of Human Mobility,” is one of the latest research efforts to show how humans can be tracked and identified based on databases that, in principle, contain anonymous data. Researchers from MIT, Harvard and Université Catholique de Louvain in Belgium analyze what they call “mobility traces,” or data that can “approximate [the] whereabouts of individuals and can be used to reconstruct individuals’ movements across space and time.” They point out that “a simply anonymized dataset does not contain name, home address, phone number or other obvious identifier. Yet if an individual’s patterns are unique enough, outside information can be used to link the data back to an individual.” The study performs an analysis of 15 months of mobile phone data relating to about 1.5 million individuals in a small European country during 2006-2007.

The study’s findings include:

  • When an individual’s location is recorded hourly at the resolution of mobile antennas, four data points are enough to uniquely identify 95% of individuals. This is because the movements of human beings are highly idiosyncratic, and thus leave unique traces that can be analyzed with precision.
  • Overall, the researchers find that the “uniqueness of human mobility traces is high and that mobility datasets are likely to be re-identifiable using information only on a few outside locations.”
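
To make the idea of trace uniqueness concrete, the following is a minimal Python sketch, not the authors’ code. It assumes each trace is a set of (antenna, hour) pairs and estimates the share of people whose trace is the only one containing a handful of points drawn at random from it. The function name `unicity`, the data layout and the toy dataset are all hypothetical, chosen only for illustration.

```python
import random


def unicity(traces, p, seed=0):
    """Estimate the fraction of users who are uniquely identified by
    p points drawn at random from their own trace.

    traces: dict mapping a user id to a set of (antenna_id, hour) tuples
            (a hypothetical layout, assumed for this sketch).
    """
    rng = random.Random(seed)
    tested = 0
    unique = 0
    for user, points in traces.items():
        if len(points) < p:
            continue  # not enough points to draw a sample of size p
        tested += 1
        # Draw p known points, as an outside observer might learn them.
        sample = set(rng.sample(sorted(points), p))
        # Find every trace in the dataset that contains all sampled points.
        matches = [u for u, other in traces.items() if sample <= other]
        if matches == [user]:
            unique += 1
    return unique / tested if tested else 0.0


# Toy dataset: users "a" and "b" share two of their three points.
traces = {
    "a": {(1, 8), (2, 9), (3, 18)},
    "b": {(1, 8), (2, 9), (4, 18)},
    "c": {(5, 8), (6, 9), (7, 18)},
}

print(unicity(traces, 3))  # 1.0: three known points single out every user
print(unicity(traces, 1))  # lower or equal: one point may match "a" and "b"
```

In this toy example, knowing all three points identifies every user, while a single point may be shared by two people. The study reports the same kind of measurement on real data, where four points were enough for 95% of individuals.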

“These results should inform future thinking,” the authors write, “in the collection, use, and protection of mobility data. Going forward, the importance of location data will only increase and knowing the bounds of individual’s privacy will be crucial in the design of both future policies and information technologies.”

Other research also notes that mobility traces are highly identifiable, requiring increasingly careful thinking about privacy policy. Examples include the 2009 paper “Location-Sharing Technologies: Privacy Risks and Controls”; “Anonymization of Location Data Does Not Work: A Large-Scale Measurement Study,” a 2011 study from industry researchers; and the 2009 paper “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization.”
