This section describes the GDPR’s concept of pseudonymization and how to implement it. Pseudonymization is a manner of processing personal data. Its aim is to render identification impossible within the controlled environment of a processing activity (or a set or subset of such activities) carried out by a controller or joint controllers. Among other things, this requires that the data be rendered pseudonymous such that they no longer permit direct identification of data subjects.

Pseudonymization contrasts with anonymization, which renders both direct and indirect identification impossible in all possible environments.

To clarify the concept of pseudonymization, the following analysis provides a detailed conceptual framework with a precise technical interpretation. For this purpose, it defines precise (technical) meanings for the terms used in the GDPR and, where necessary, introduces additional concepts and distinctions. The terminology is intended to be compatible with the GDPR; in particular, redefining terms with a meaning different from that of the GDPR has been avoided.

Pseudonymization in a nutshell:

Considering that (identified) personal data can be seen as consisting of both a “who” part and a “what” part, pseudonymization is a manner of processing that strictly separates the two parts such that

  • the processing is limited to the “what” part and
  • the “who” part is separated and protected such that it cannot be used for the identification of data subjects.

Starting from identified data, the separation of “who” and “what” is achieved by data pseudonymization. Data pseudonymization is a transformation[1] of data. It is distinct from pseudonymization (as defined in the GDPR), which is a manner of processing that acts on (already) pseudonymized data.
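As an illustration only, the data pseudonymization transformation can be sketched as follows. The field names, the choice of which fields count as directly identifying, and the function name `pseudonymize_data` are assumptions made for this example; they do not come from the GDPR. The sketch handles direct identifiers only and says nothing about indirect identification through the remaining fields.

```python
import secrets

# Hypothetical choice: which fields are directly identifying depends on the
# actual data set and must be decided case by case.
IDENTIFYING_FIELDS = ("name", "email")


def pseudonymize_data(records):
    """Split identified records into a "what" part and a "who" part.

    Returns a pair (pseudonymized, additional_info):
      pseudonymized   -- list of dicts with a pseudonym plus the "what" fields
      additional_info -- dict mapping each pseudonym to the "who" fields
    """
    pseudonymized = []
    additional_info = {}
    for record in records:
        # Random pseudonym, not derived from the identity itself.
        pseudonym = secrets.token_hex(16)
        who = {k: v for k, v in record.items() if k in IDENTIFYING_FIELDS}
        what = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        additional_info[pseudonym] = who
        pseudonymized.append({"pseudonym": pseudonym, **what})
    return pseudonymized, additional_info


# Made-up example records.
records = [
    {"name": "Alice Example", "email": "alice@example.org", "diagnosis": "asthma"},
    {"name": "Bob Example", "email": "bob@example.org", "diagnosis": "diabetes"},
]
what_part, who_part = pseudonymize_data(records)
```

In this sketch, `what_part` is the data on which pseudonymized processing would operate, while `who_part` is the additional information that would have to be kept separately and protected.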

In the realm of pseudonymization, any identification is prohibited. The GDPR explicitly foresees the possibility of re-identification, but rendering the data identified again in this way leaves the realm of pseudonymization and enters that of processing identified data.
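A minimal sketch of this boundary, under assumed data structures: a pseudonymized record carries a hypothetical `pseudonym` key, and the additional information is a mapping from pseudonyms to identifying fields. The function name `re_identify` and all values are illustrative only.

```python
def re_identify(pseudonymized_record, additional_info):
    """Rejoin a "what" record with its "who" part.

    The result is identified personal data again, so anything done with it
    is processing of identified data, not pseudonymized processing.
    """
    who = additional_info[pseudonymized_record["pseudonym"]]
    what = {k: v for k, v in pseudonymized_record.items() if k != "pseudonym"}
    return {**who, **what}


# Made-up example values.
additional_info = {"p-001": {"name": "Alice Example"}}
pseudonymized_record = {"pseudonym": "p-001", "diagnosis": "asthma"}

identified = re_identify(pseudonymized_record, additional_info)
```

The point of the sketch is that re-identification is only possible with access to `additional_info`; processing that stays within pseudonymization never calls such a function.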

The risk inherent in identified data is usually higher than the sum of the risks inherent in the pseudonymized data (“what”) and the additional information (“who”). This is evident when considering that, after the separation, each part on its own reveals less: the additional information only tells who is in the data set, while the pseudonymized data only tell that an unknown entity has certain properties (“what”).




[1] This transformation also constitutes processing according to Art. 4(2) GDPR but is not part of the processing that constitutes pseudonymization according to Art. 4(5) GDPR. Note that the literature often does not make the distinction between data pseudonymization and pseudonymization; the present document makes this distinction explicit for conceptual clarity.

