Categories of data collected through social media

Different types of data can be gathered through social media, both personal and non-personal. Whether data qualify as non-personal is of great legal significance, and therefore for these guidelines, because non-personal data fall outside the GDPR and are governed instead by Regulation (EU) 2018/1807. In practice, the line between the two types is blurring: data analysis technology increasingly allows greater processing capacity and the extrapolation of results to groups (group privacy). For example, profiles are becoming increasingly accurate even when they are not linked to any specific individual and therefore do not constitute personal data.

Data are personal when they can directly or indirectly identify a person, in particular where the cost and time required for such identification are not excessive^[1]. However, this classification is not easy to apply in practice. To begin with, some data that seem anonymous at first sight can be de-anonymized^[2] (see the subsection “Identification, pseudonymization and anonymization” in the section “Main Concepts” of the General part of these Guidelines). Furthermore, personal data as a legal concept is expansive: the volume of data produced and the capacity to process and analyze them are constantly growing, which reduces the cost and time needed to identify a person from any given data set (personal or non-personal)^[3].

With all this in mind, one must conclude that, in the case of social networks, the processing of personal data is generally the rule. This is particularly true because users commonly log in with a set of personal data. It is quite possible that (1) much of these data are not strictly necessary to log in, so that collecting them does not comply with the data minimization principle (Art. 5.1.c GDPR), or that (2) the data are used for purposes that go beyond the mere login, which breaches the purpose limitation principle (Art. 5.1.b GDPR). Finally, personal profiling can reach a high level of accuracy irrespective of the type of data used to produce such profiles. The following precautions should therefore be taken:

The controllers should assume by default that they are processing personal data and act accordingly.
- This assumption should be set aside only if the data to be used and the data inferred by the controller are entirely non-personal (e.g. weather data). In these cases, the controllers must document this in the records of processing.
- If the data to be processed relates to deceased persons or legal entities, precautions must be taken to prevent these data from being linked to natural persons (e.g. relatives of deceased persons or natural persons linked to legal entities).
- If the data to be processed relates to deceased persons, national data processing rules must also be taken into account, since data from deceased persons are not personal data according to the GDPR.
A level of granularity in profiling should be defined that sufficiently protects the privacy of individuals who could potentially be linked to such profiles.
Protocols should be developed to prevent or reduce the possibility of re-identifying users whose data have been processed for profiling. They shall include a legally binding commitment not to attempt such re-identification and the adoption of measures to prevent accidental re-identification.
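
As an illustration of the granularity point above, the following minimal Python sketch checks whether a chosen set of profile attributes leaves any group so small that its members could be singled out. It is only a sketch under stated assumptions: the threshold `k`, the function name `small_groups` and the field names (`age_band`, `region`, `interest`) are hypothetical and do not come from these Guidelines, and a minimum group size is just one possible way of expressing a granularity rule. It is not in itself a guarantee of anonymity.

```python
from collections import Counter

def small_groups(records, keys, k):
    """Return the attribute combinations shared by fewer than k records.

    records: list of dicts (hypothetical profile records)
    keys:    attribute names that define the profile granularity
    k:       assumed minimum acceptable group size
    """
    counts = Counter(tuple(r[key] for key in keys) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}

# Hypothetical records; the field names are illustrative only.
profiles = [
    {"age_band": "20-29", "region": "North", "interest": "sports"},
    {"age_band": "20-29", "region": "North", "interest": "sports"},
    {"age_band": "20-29", "region": "North", "interest": "music"},
    {"age_band": "30-39", "region": "South", "interest": "sports"},
    {"age_band": "40-49", "region": "South", "interest": "music"},
]

# Fine-grained profiling: several combinations describe a single person.
fine = small_groups(profiles, ["age_band", "region", "interest"], k=2)

# Coarser profiling: every group contains at least k records.
coarse = small_groups(profiles, ["region"], k=2)
```

In this toy data set the fine-grained profile leaves three combinations below the threshold, while profiling by region alone leaves none. A controller could run such a check before releasing or using profiles and coarsen the attributes until no group falls below the chosen size.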

In addition to the initial distinction between personal and non-personal data, it should be considered, within personal data, whether special categories of personal data are concerned. This distinction is important because the conditions for processing differ for special categories of data (Article 9 GDPR).

Finally, a remark on derived or inferred data is in order. There has been some controversy as to whether derived data, and especially personal profiles, should be considered intellectual property. Irrespective of this, it should be recalled that, according to Article 4(1) GDPR, such data are personal data to the extent that they relate to an identified or identifiable person. It should be added that inferences relating to what Article 9 GDPR considers special categories of data can be drawn from ordinary personal data, or even from non-personal data combined with other personal data (group privacy)^[4]. To the extent that these inferences relate to an identified or identifiable person, they should be treated as special categories of data, irrespective of whether they are regarded as objects of intellectual property.
References

¹See Rec. 26 GDPR: “To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used”.

²See Rec. 26 GDPR: “Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person”.

³See in general G. Comandé (ed.), Encyclopedia of Data Science and Law, Edward Elgar, 2021 (forthcoming); G. Comandé – G. Malgieri, “Sensitive-by-distance: quasi-health data in the algorithmic era” (2017), in Information & Communications Technology Law, Vol. 26, Iss. 3, pp. 229-249; G. Comandé – G. Schneider, “Regulatory Challenges of Data Mining Practices: The Case of the Never-ending Lifecycles of ‘Health Data’” (2018), in European Journal of Health Law, Vol. 25, Iss. 3, pp. 284-307.

⁴See in general Taylor, L., Floridi, L., van der Sloot, B. (eds.) (2017), Group Privacy: new challenges of data technologies, Dordrecht, Springer.