Do anonymous data exist?

The possibility of identifying individuals in presumably anonymous data has received ample attention under the names “re-identification” and “de-anonymization”. Such attacks have been widely successful, and sophisticated techniques have been developed. Overviews of techniques and well-known cases are given, for example, by Mark Lennox[1], Natasha Lomas[2], Rocher et al.[3] and Dwork et al.[4].

Some kinds of data have been found to be very difficult to anonymize. Most prominently, this holds for location data[5], where even a generalization to country level may not be sufficient[6]. Moreover, reducing the identification potential of data requires transformations that reduce the level of detail and the truthfulness of the data. This raises the question of whether successfully anonymized data are still fit for the purposes of processing.

Many scholars have concluded that anonymous data that are still useful likely do not exist. This was most prominently voiced by Ohm, who expresses doubt about the existence of anonymous data in a legal context. He states: “This mistake pervades nearly every information privacy law, regulation, and debate, yet regulators and legal scholars have paid it scant attention”[7]. From a more technical point of view, Cynthia Dwork, a co-inventor of differential privacy, has coined the phrase “de-identified data isn’t” (i.e., it either isn’t de-identified or isn’t useful data)[8].

