Data processing for purposes of archiving in the public interest, scientific or historical research purposes or statistical purposes
Pilar Nicolás Jiménez[1] (UPV/EHU) and Mikel Recuero Linares (UPV/EHU)

This part of the Guidelines was reviewed by Rossana Ducato.

This part of the Guidelines has been reviewed and validated by Marko Sijan, Senior Advisor Specialist (HR DPA).


As the European Data Protection Supervisor (EDPS) highlighted, “the European Commission has defined the objectives of the EU’s research and innovation policies to be ‘opening up the innovation process to people with experience in fields other than academia and science’, ‘spreading knowledge as soon as it is available using digital and collaborative technology’ and ‘promoting international cooperation in the research community’”.[2] These purposes are not in conflict with data protection. Indeed, data protection rules should not be an obstacle to freedom of science pursuant to Article 13 of the Charter of Fundamental Rights of the EU (CFREU). Rather, these rights and freedoms must be carefully assessed and balanced, resulting in an outcome which respects the essence of both.[3]

Indeed, the intention behind our current data protection legislation is to harmonize data processing with scientific research purposes.[4] This intention is clearly linked to the objective of achieving a European Research Area set out in Article 179(1) of the Treaty on the Functioning of the European Union (TFEU). In line with this, the General Data Protection Regulation (GDPR) has introduced a new framework aimed at enabling data processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes, which goes beyond the framework provided by Directive 95/46/EC.[5] The core of this new framework is Article 89 of the GDPR, which is complemented by many other references throughout the text. These can be found both in the recitals, which provide the decisive criteria for interpreting the GDPR, and in some specific provisions.[6] On the basis of those recitals, some preliminary ideas should be highlighted.

First, Recital 157 states that by coupling information from registries, which may combine different types of data on large numbers of individuals, researchers can obtain “new knowledge of great value with regard to widespread medical conditions such as cardiovascular disease, cancer and depression”. As a consequence, “research results can be enhanced, as they draw on a larger population”. These tools can contribute to improving research policies and, consequently, the population’s quality of life. These benefits make the processing of data for these purposes by researchers reasonable, provided that the rights of the data subjects are guaranteed. This establishes a conception of research as an activity that pursues a social benefit in the short, medium or long term. That benefit is understood very broadly (improvement of the quality of life), yet the activity is at the same time limited to this specific purpose. Furthermore, Recital 159 specifies that “to meet the specificities of processing personal data for scientific research purposes, specific conditions should apply in particular as regards the publication or otherwise disclosure of personal data in the context of scientific research purposes”.

The second issue to be addressed is the requirement that consent be specific in order to be valid, which raises particular difficulties when the purpose of the processing is scientific research. Article 4(11) of the GDPR states that consent “means any freely given, specific, informed and unambiguous indication of the data subject’s wishes by which he or she, by a statement or by a clear affirmative action, signifies agreement to the processing of personal data relating to him or her”. However, Recital 33 acknowledges that “it is often not possible to fully identify the purpose of personal data processing for scientific research purposes at the time of data collection”.

Indeed, it is common that, during a project, approaches not initially foreseen emerge or that, upon completion of the project, its conclusions open doors to other related projects. Furthermore, researchers and teams are often specialized in an area or line of research developed from specific projects, and the data may remain useful or necessary for long periods of time.[7] In response, institutional models, such as biobanks, have emerged that function as intermediaries between data subjects and researchers. The purpose of collecting these data is to store them until they are required, without knowing at the outset which research project or projects will process them. In view of this reality, Recital 33 states that “data subjects should be allowed to give their consent to certain areas of scientific research”, and adds that “data subjects should have the opportunity to give their consent only to certain areas of research or parts of research projects to the extent allowed by the intended purpose”. Consent may therefore be given with varying degrees of breadth, provided that this is, as the recital recalls, “in keeping with recognized ethical standards for scientific research”.

A third point that deserves attention is contained in Recital 50, which refers to the so-called compatibility of purposes[8], i.e., the “processing of personal data for purposes other than those for which the personal data were initially collected”. The term applies where personal data intended to be used for research were initially collected or processed for a different purpose, but can legitimately be processed for further (compatible) purposes. In addition, further processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes is considered ex lege to be a compatible lawful processing operation. This means that neither the consent of the data subject nor any other legal basis is required for this further purpose, under the conditions to be described later. This option is of utmost importance for scientific research because it can facilitate access to a huge amount of data without the need to re-contact the data subjects.

Finally, it is necessary to mention Recital 53, which takes up the GDPR’s aim of establishing harmonized conditions for the processing of special categories of personal data for health-related purposes (in particular, in the context of the management of health or social care services and systems). Furthermore, it states that “Union or Member State law should provide for specific and suitable measures so as to protect the fundamental rights and the personal data of natural persons”, while declaring that “Member States should be allowed to maintain or introduce further conditions, including limitations, with regard to the processing of genetic data, biometric data or data concerning health.” However, any such measures “should not hamper the free flow of personal data within the Union when those conditions apply to cross-border processing of such data”.



