Like everything else in security, anonymity systems shouldn’t be fielded before being subjected to adversarial attacks. We all know that it’s folly to implement a cryptographic system before it’s rigorously attacked; why should we expect anonymity systems to be any different? And, like everything else in security, anonymity is a trade-off. There are benefits, and there are corresponding risks. -/-
What the University of Texas researchers demonstrate is that this process (de-anonymization) isn’t hard, and doesn’t require a lot of data. <..>
With only eight movie ratings (of which two may be completely wrong), and dates that may be up to two weeks in error, they can uniquely identify 99 percent of the records in the dataset. After that, all they need is a little bit of identifiable data: from the IMDb (Internet Movie Database), from your blog, from anywhere. The moral is that it takes only a small named database for someone to pry the anonymity off a much larger anonymous database.
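The matching idea can be illustrated with a toy sketch. This is not the researchers' actual algorithm, only a simplified stand-in: it assumes each anonymous record maps a movie title to a (rating, date) pair, treats a known rating as a hit when the rating is equal and the dates are within two weeks, and tolerates a fixed number of misses. The names `matches`, `link`, `anonymous_db`, and `aux`, and all the data, are invented for illustration.

```python
from datetime import date

def matches(aux, record, max_days=14, max_misses=2):
    """True if at most max_misses of the known ratings fail to match the record."""
    misses = 0
    for movie, (rating, when) in aux.items():
        hit = record.get(movie)
        # A miss: movie absent, rating differs, or date is off by more than max_days.
        if hit is None or hit[0] != rating or abs((hit[1] - when).days) > max_days:
            misses += 1
    return misses <= max_misses

def link(aux, anonymous_db, **kw):
    """Return the IDs of all anonymous records consistent with the auxiliary data."""
    return [rid for rid, rec in anonymous_db.items() if matches(aux, rec, **kw)]

# Hypothetical "anonymous" dataset: record ID -> {movie: (rating, date)}.
anonymous_db = {
    "u1": {"A": (5, date(2005, 1, 3)), "B": (2, date(2005, 2, 10)), "C": (4, date(2005, 3, 1))},
    "u2": {"A": (5, date(2005, 1, 5)), "B": (4, date(2005, 6, 1)), "D": (1, date(2005, 3, 2))},
    "u3": {"C": (4, date(2005, 3, 1)), "D": (3, date(2005, 4, 4))},
}

# Hypothetical public ratings tied to a name; the rating for "C" is deliberately wrong.
aux = {"A": (5, date(2005, 1, 10)), "B": (2, date(2005, 2, 1)), "C": (1, date(2005, 3, 5))}

candidates = link(aux, anonymous_db, max_misses=1)
print(candidates)
```

If exactly one candidate comes back, the named auxiliary data has singled out one "anonymous" record, even though one of the known ratings was wrong and the dates were only approximate.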
Other research reaches the same conclusion. Using public anonymous data from the 1990 census, Latanya Sweeney found that 87 percent of the population in the United States, 216 million of 248 million, could likely be uniquely identified by their five-digit ZIP code, combined with their gender and date of birth. About half of the U.S. population is likely identifiable by gender, date of birth and the city, town or municipality in which the person resides. Expanding the geographic scope to an entire county reduces that to a still-significant 18 percent. “In general,” the researchers wrote, “few characteristics are needed to uniquely identify a person.”
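The kind of measurement behind these figures can be sketched in a few lines: count how many people share each combination of attributes, then see what fraction of people sit alone in their group. The sketch below is an assumption-laden illustration, not the study's method; the function name `unique_fraction`, the field names, and the four toy rows are all made up.

```python
from collections import Counter

def unique_fraction(rows, keys):
    """Fraction of rows whose combination of `keys` values appears exactly once."""
    combos = [tuple(r[k] for k in keys) for r in rows]
    counts = Counter(combos)
    return sum(1 for c in combos if counts[c] == 1) / len(rows)

# Hypothetical toy population; real studies use census-scale data.
rows = [
    {"zip": "78701", "gender": "F", "dob": "1970-01-02", "county": "Travis"},
    {"zip": "78701", "gender": "F", "dob": "1970-01-02", "county": "Travis"},
    {"zip": "78702", "gender": "M", "dob": "1970-01-02", "county": "Travis"},
    {"zip": "78703", "gender": "M", "dob": "1970-01-02", "county": "Travis"},
]

by_zip = unique_fraction(rows, ["zip", "gender", "dob"])
by_county = unique_fraction(rows, ["county", "gender", "dob"])
print(by_zip, by_county)

# Sanity check on the figure quoted above: 216 million of 248 million.
print(round(216 / 248 * 100))
```

In this toy population the finer-grained ZIP code isolates more people than the county does, which mirrors the direction of the drop from 87 percent to 18 percent reported above.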