Making Data Anonymous Not Enough to Protect Consumer Privacy

What data can be used to determine an individual’s identity? This question has long befuddled experts who aim to protect anonymity, because the techniques used to strip identities from data sometimes fail. A graduate student at Carnegie Mellon, for example, was able to identify the medical history of then-Massachusetts governor William Weld from insurance records (1).

Image: Data anonymity. Individual consumers are very likely to be re-identified from anonymized data. (Source: Wikimedia Commons)

Recently, Yves-Alexandre de Montjoye, a computer security researcher at MIT, and his colleagues were able to identify individual people from a set of “anonymized” credit card data.

Their research analyzed credit card transaction information, a form of metadata, from over one million shoppers. Although names, addresses, and other information directly linked to card owners were not part of the data, de Montjoye and his team could identify 90 percent of individuals if they knew the date and location of just four of their transactions. In addition, the research showed that more affluent consumers are more likely to be re-identified than those of lower income, and women are more likely to be re-identified than men (1, 2).
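The idea of singling out a shopper from a handful of known transactions can be illustrated with a small sketch. This is not the researchers' code or their exact procedure: the function name `unicity`, the toy `traces` data, and the exhaustive enumeration of every combination of known points are all assumptions made for illustration. On a real data set of a million shoppers one would sample rather than enumerate.

```python
from itertools import combinations

# Hypothetical toy data: each shopper's trace is a set of (day, shop) points.
traces = {
    "a": {(1, "bakery"), (2, "cafe"), (3, "gym")},
    "b": {(1, "bakery"), (2, "cafe"), (4, "cinema")},
    "c": {(1, "bakery"), (5, "bookshop"), (6, "cafe")},
}

def unicity(traces, p):
    """Share of p-point subsets of a trace that match exactly one shopper."""
    total = 0
    unique = 0
    for points in traces.values():
        # Every way an outsider could know p of this shopper's transactions.
        for known in combinations(sorted(points), p):
            known = set(known)
            # Count the shoppers whose trace contains all of the known points.
            matches = sum(1 for other in traces.values() if known <= other)
            total += 1
            if matches == 1:
                unique += 1
    return unique / total if total else 0.0

for p in (1, 2, 3):
    print(p, round(unicity(traces, p), 2))
```

On this toy data the share of uniquely identifying subsets rises from 4/9 with one known point to 7/9 with two and to 1 with three, which mirrors the qualitative finding that a few date-and-location points are enough to single most people out.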

De Montjoye and his colleagues identified three factors that contribute to re-identifying an individual: date, location, and price. They showed that making the data coarser lowers the chance of re-identification. They did this by specifying the date only to within 15 days, giving the location only to within a cluster of 350 stores, and widening the ranges into which prices are grouped. However, the team demonstrated that, even at this lower resolution, individuals could still be identified when more transaction information was used.
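Coarsening of this kind can be sketched in a few lines. The function `coarsen`, the shop names, the cluster labels, and the price boundaries below are hypothetical and chosen only for illustration. The 15-day window is the one figure taken from the article.

```python
from bisect import bisect_right

# Hypothetical mapping from individual shops to larger clusters of shops.
shop_cluster = {
    "bakery_north": "cluster_1",
    "bakery_south": "cluster_1",
    "cinema": "cluster_2",
}

# Hypothetical price boundaries: below 5, 5 to 20, 20 to 100, 100 and above.
price_edges = [5, 20, 100]

def coarsen(record, day_window=15):
    """Reduce a (day, shop, price) record to (time bin, cluster, price band)."""
    day, shop, price = record
    return (
        day // day_window,                 # date known only to within the window
        shop_cluster[shop],                # location known only to the cluster
        bisect_right(price_edges, price),  # price known only to its band
    )

alice = (3, "bakery_north", 4.50)
bob = (11, "bakery_south", 3.20)
carol = (40, "cinema", 12.00)

for name, record in (("alice", alice), ("bob", bob), ("carol", carol)):
    print(name, coarsen(record))
```

The two bakery purchases, which differ in day, shop, and price, collapse to the same coarse record and can no longer be told apart from a single point. The cinema purchase stays distinct, which is one way to see why supplying more points can still single a person out.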

These new results, along with previous research demonstrating that individuals can be identified from metadata, have led some scientists to conclude that metadata cannot be both useful and completely anonymous (1). If this is indeed the case, then strengthening the laws that govern the release of metadata would help minimize the risk that consumers are identified for illegal purposes.


1. Deng B (2015). People identified through credit-card use alone. Nature. doi:10.1038/nature.2015.16817

2. de Montjoye YA, Radaelli L, Singh VK, Pentland A (2015). Unique in the shopping mall: On the reidentifiability of credit card metadata. Science, 347, 536-539. doi:10.1126/science.1256297
