Lawfulness is an essential principle of data protection. It implies that controllers shall ensure that they have a legal basis for processing personal data. If this is not the case, the processing must not be carried out. The legal bases for processing, including for special categories of data, are described in Article 6 and Article 9 of the GDPR. In the case of AI, the legal bases usually invoked to justify processing are: consent; legitimate interest; contractual necessity; and legal obligation or vital interest. Processing in the public interest can also be a legal ground, but we will not focus on it here, since we address this topic widely in the “Data protection and scientific research” section in the “Main Concepts” section of Part II of these Guidelines. Therefore, we will focus on the four legal grounds listed.
a) Consent

Data processing is often based on consent provided by the data subjects. However, consent does not fit well with the essential nature of most AI developments, for one simple reason: consent is, by nature, linked to a well-defined, concrete purpose. In the case of AI, the use of big data and the aggregation, sharing or repurposing actions that are often performed create a scenario that does not fit the underlying principles of the concept of consent and the purpose limitation principle (see “Purpose limitation principle” within Part II section “Principles” of these Guidelines).
Consent can be a useful legal basis for data processing for AI development, especially if controllers have a direct relationship with the subjects who provide the data to be used for training, validating and deploying the model. For instance, if the AI tool aims to provide diagnoses of pneumonia, and physicians obtain data from patients at their healthcare institution, consent might serve well as a legal basis for processing. However, if processing involves a complex AI tool that may make further uses of the data (e.g. profiling and automated decision-making might happen inadvertently, data are likely to be inferred during processing, such inferred data can be used for various purposes, etc.), it is difficult to see how a single consent could justify all such processing. To this end, controllers must pay close attention to the guidelines on consent provided by the Article 29 Working Party.
Within the framework of scientific research (see the “Data protection and scientific research” section in the “Main Concepts” section of Part II of these Guidelines), the GDPR provides a specific derogation from the usual attributes of consent, allowing controllers to make use of broad consent as a legal basis for processing. Broad consent must be understood in connection with Recital 33 of the GDPR, which states that it “is often not possible to fully identify the purpose of personal data processing for scientific research purposes at the time of data collection. Therefore, data subjects should be allowed to give their consent to certain areas of scientific research when in keeping with recognized ethical standards for scientific research.”
However, broad consent is not a kind of blanket consent, nor is it equivalent to open consent. It is an exceptional tool that is only acceptable if several conditions are met. If broad consent is used for special categories of data, controllers should ensure that their national regulation allows for this. They should also be aware of the safeguards that should be implemented. Proportionality between the aim of the research and the use of special categories of data must be guaranteed. Furthermore, controllers must check whether their Member State’s regulations introduce further conditions or limitations on the processing of genetic, biometric and health data, since the GDPR allows Member States to do so.
Furthermore, whenever broad consent is used to achieve the research purpose, there are some essential measures that should be considered to compensate for the abstract definition of the research purposes. Adherence to the recognized ethical standards of scientific research, according to Recital 33 of the GDPR, seems particularly relevant to this purpose.
|Box 3: Broad consent and additional safeguards
The German DPA recently listed some additional safeguards to be implemented in the case of broad consent. These are:
1. Safeguards to ensure transparency:
2. Safeguards to build trust:
3. Security safeguards:
In any case, research participants must be given the possibility to withdraw their consent, to opt in or out of certain research and parts of research, and to be assured that their rights are safeguarded by adherence to the ethical standards of scientific research. Sometimes, this might harm the AI solution or oblige controllers to perform complex actions. Therefore, controllers should consider whether alternative legal grounds could serve them better to develop the tool while respecting the law.
In summary, controllers should be cautious when using consent as a legal ground to justify data processing, since consent does not relieve them of their responsibilities regarding the fairness, necessity and proportionality of the processing. Furthermore, in the case of AI using big data, it is often hard to show that consent fulfils all the necessary requirements: freely given, specific, informed and unambiguous, and a clear affirmative act on the part of the data subject. In general, the more AI developers want to do with the data, the more difficult it is to ensure that consent is genuinely specific and informed. All of this should be considered when selecting consent as a legal ground for data processing.
|Box 4. Consent as a legal basis: the OkCupid case
In 2016, a group of Danish researchers published a dataset of about 70,000 users. These data had been obtained from the online dating site OkCupid and included data categories such as usernames, age, gender, location, what kind of relationship (or sex) the data subjects were interested in, their personality traits, etc.
The researchers considered that the mere fact that these data were publicly available (on the users’ dating profiles) served as legal grounds for further processing. This is an excellent example of the terrible consequences of the argument that “the data is already public”. Data subjects had their personal data processed, and very sensitive information exposed to the public, without their consent.
Unfortunately, this conflation of public and open data is still too widespread. Researchers should be aware that consent provided for one concrete processing operation does not serve as legal grounds for further processing, and that ‘publicly available’ is not equivalent to ‘open data’; that is, data and content that can be freely used, modified and shared by anyone for any purpose, as defined by the Open Data Institute.
|Checklist: consent as a legal basis

☐ The controllers have checked that consent is the most appropriate legal basis for processing.
☐ The controllers request the consent of the interested parties in a free, specific, informed and unequivocal manner.
☐ Broad consent is used only when it is difficult or impossible to fully identify, at the time of collection, how the data will be processed in the future.
☐ Broad consent used for processing of special categories of data is compatible with national regulations.
☐ Where broad consent is used, the data subjects are given the opportunity to withdraw their consent and to choose whether or not to participate in certain research and parts of it.
☐ Controllers have a direct relationship with the subjects who provide the data to be used for training, validation and deployment of the AI model.
☐ There is no power imbalance between controllers and data subjects.
☐ The controllers ask people to positively opt in.
☐ The controllers do not use pre-ticked boxes or any other type of default consent.
☐ The controllers use clear, plain language that is easy to understand.
☐ The controllers specify why they want the data and what they are going to do with it.
☐ The controllers give separate distinct (‘granular’) options to consent separately to different purposes and types of processing.
☐ The controllers tell individuals they can withdraw their consent and how to do so.
☐ The controllers ensure that individuals can refuse to consent without detriment.
☐ The controllers avoid making consent a precondition of a service.
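The granularity, affirmative opt-in and withdrawal requirements in the checklist above can be made concrete with a small sketch of a consent record. This is purely illustrative: all names are hypothetical, and neither the GDPR nor any supervisory authority prescribes a particular data structure.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional

# Hypothetical sketch of a granular consent record; field and method
# names are illustrative only.

@dataclass
class ConsentRecord:
    subject_id: str
    # One entry per processing purpose ('granular' consent): the value is
    # the timestamp of the affirmative opt-in, or None once withdrawn.
    purposes: Dict[str, Optional[datetime]] = field(default_factory=dict)

    def opt_in(self, purpose: str) -> None:
        # Consent must be a clear affirmative act: nothing is pre-ticked,
        # so a purpose only appears when the subject actively opts in.
        self.purposes[purpose] = datetime.now(timezone.utc)

    def withdraw(self, purpose: str) -> None:
        # Withdrawal must be possible at any time, per purpose.
        if purpose in self.purposes:
            self.purposes[purpose] = None

    def has_consent(self, purpose: str) -> bool:
        # No default consent: absent or withdrawn purposes mean no consent.
        return self.purposes.get(purpose) is not None

# Usage: each purpose is consented to (and withdrawable) separately.
record = ConsentRecord(subject_id="patient-001")
record.opt_in("model_training")      # affirmative, purpose-specific opt-in
record.withdraw("model_training")    # withdrawal leaves the record auditable
```

The design choice worth noting is that withdrawal keeps the purpose key in the record rather than deleting it, so the controller retains evidence of when consent was given and that it was later withdrawn.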
b) Legitimate interest
The use of legitimate interest as a legal ground for processing for AI development is applicable, provided that the result of the balancing test justifies it (see “Legitimate interest and balancing test” within Part II section “Main actions and tools” of these Guidelines). This may imply defining the objective of the AI’s processing at the outset, and ensuring that the original purpose of the processing is re-evaluated if the AI system provides an unexpected result, so that either the legitimate interests pursued can be identified, or that valid consent can be collected from individuals. The balancing test should be adequately documented in the records of processing. However, in some cases, legitimate interest might not serve well for AI processing purposes. For example, if controllers plan to gather a considerable amount of personal data ‘just in case’, they should not consider legitimate interest as a legal ground for the data processing, since the balancing between the need for processing and the possible impacts of the processing on people would hardly justify it.
|Checklist: legitimate interest as a legal basis
☐ The controllers have checked that legitimate interest is the most appropriate basis.
☐ The controllers understand their responsibility to protect individuals’ interests.
☐ The controllers keep a record of the decisions made and the reasoning behind them, to ensure that they can justify their decision.
☐ The controllers have identified the relevant legitimate interests.
☐ The controllers have checked that the processing is necessary and there is no less intrusive way to achieve the same result.
☐ The controllers have done a balancing test and are confident that the individual’s interests do not override those legitimate interests.
☐ The controllers only use individuals’ data in ways they would reasonably expect, unless the controllers have a very good reason.
☐ The controllers are not using people’s data in ways they would find intrusive, or which could cause them harm, unless the controllers have a very good reason.
☐ If the controllers process children’s data, they take extra care to make sure they protect the children’s interests.
☐ The controllers have considered safeguards to reduce the impact, where possible.
☐ The controllers have considered whether they can offer an opt out.
☐ The controllers have considered whether they also need to conduct a DPIA.
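The record-keeping items in the checklist above (documenting the identified interest, the necessity check, the balancing test and its safeguards) could be captured in a simple assessment record. The sketch below is a hypothetical illustration; the field names are not prescribed by the GDPR or any supervisory authority.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of how a controller might document a legitimate
# interest assessment; all names are illustrative.

@dataclass
class LegitimateInterestAssessment:
    legitimate_interest: str          # the interest pursued, identified up front
    necessity_justification: str      # why no less intrusive means exists
    risks_to_individuals: List[str] = field(default_factory=list)
    safeguards: List[str] = field(default_factory=list)
    opt_out_offered: bool = False
    dpia_needed: bool = False

    def balancing_summary(self) -> str:
        # The record does not decide the outcome; it documents the
        # reasoning so the controller can justify the decision later.
        return (f"Interest: {self.legitimate_interest}; "
                f"risks: {len(self.risks_to_individuals)}; "
                f"safeguards: {len(self.safeguards)}; "
                f"opt-out: {self.opt_out_offered}")
```

A record like this would sit alongside, not replace, the substantive balancing test: its value is that the decisions and the reasoning behind them remain retrievable when the controller must justify them.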
c) Performance of a contract
Performance of a contract to which the data subject is party, or taking steps at the request of the data subject prior to entering into a contract, might serve as a legal ground for processing if using AI is objectively necessary for either of these purposes. This could be the case for developers who hire subjects in order to make use of their personal data in the training stage of the system. It could also be the case that a controller who provides a service to interested third parties that includes the AI solution uses the data of these subjects in the framework of the service contract. However, this legal ground should not be used for different purposes (such as system improvement or similar), according to the principle of purpose limitation (see “Purpose limitation principle” within Part II section “Principles” of these Guidelines), since data used to perform the contract are not necessary for those alternative aims. Thus, controllers can process data under the umbrella of this legal basis only insofar as the data are objectively necessary for the performance of the contract, but not for other purposes. To sum up, it is hard to see how the performance of a contract might serve as a legal basis for AI research and innovation.
d) Legal obligation or vital interest
According to Article 6(1)(d) of the GDPR, data can be processed if it is “necessary in order to protect the vital interests of the data subject or of another natural person”. Equally, processing is lawful if it is “necessary for compliance with a legal obligation to which the controller is subject” (Article 6(1)(c)). In the case of special categories of data, there are alternative legal grounds for processing, as expressed in Article 9(2). It is, at present, difficult to imagine a single case where any of these bases could provide a legal ground for training an AI system, even though revisions of existing regulations at national and European level may change this in the future. In any case, for the training of potentially life-saving AI systems, it would be better to rely on other legal bases, such as consent or public interest.
|Box 5. Examples of vital interest as a legal ground for data processing by an AI tool
Imagine that, during the COVID-19 pandemic, an organization develops an AI tool able to diagnose the disease using radiology. In such cases, data pertaining to patients could be processed on the basis of vital interest, as stated by Article 9(2)(c) of the GDPR. However, alternative legal grounds, such as substantial public interest (Article 9(2)(g) or (i)), might be more appropriate.
AEPD (2020) Adecuación al RGPD de tratamientos que incorporan Inteligencia Artificial: Una introducción. Agencia Española de Protección de Datos, Madrid, p.20. Available at: www.aepd.es/sites/default/files/2020-02/adecuacion-rgpd-ia.pdf
Article 29 Working Party (2014) Opinion 06/2014 on the notion of legitimate interests of the data controller under Article 7 of Directive 95/46/EC. European Commission, Brussels. Available at: www.dataprotection.ro/servlet/ViewDocument?id=1086
CIPL (2020) Artificial intelligence and data protection. How the GDPR regulates AI. Centre for Information Policy Leadership, Washington, DC / Brussels / London. Available at: www.informationpolicycentre.com/uploads/5/7/1/0/57104281/cipl-hunton_andrews_kurth_legal_note_-_how_gdpr_regulates_ai__12_march_2020_.pdf
EDPB (2019) Guidelines 2/2019 on the processing of personal data under Article 6(1)(b) GDPR in the context of the provision of online services to data subjects. European Data Protection Board, Brussels. Available at: https://edpb.europa.eu/sites/edpb/files/consultation/edpb_draft_guidelines-art_6-1-b-final_public_consultation_version_en.pdf
EDPB (2020) Guidelines 05/2020 on consent under Regulation 2016/679 Version 1.1 Adopted on 4 May 2020. Available at: https://edpb.europa.eu/sites/edpb/files/files/file1/edpb_guidelines_202005_consent_en.pdf
EDPS (2017) Necessity toolkit. European Data Protection Supervisor, Brussels. Available at: https://edps.europa.eu/data-protection/our-work/publications/papers/necessity-toolkit_en
Further reading about legitimate interest, with practical cases and several references to rulings by the Court of Justice of the European Union, can be found in the following documents.
Future of Privacy Forum (no date) Processing personal data on the basis of legitimate interests under the GDPR. European Judicial Training Network, Brussels. Available at: www.ejtn.eu/PageFiles/17861/Deciphering_Legitimate_Interests_Under_the_GDPR%20(1).pdf
ICO (no date) How do we apply legitimate interests in practice? Information Commissioner’s Office, Wilmslow. Available at: https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/legitimate-interests/how-do-we-apply-legitimate-interests-in-practice/
ICO (no date) Lawful basis for processing. Information Commissioner’s Office, Wilmslow. Available at: https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/lawful-basis-for-processing/
Kuyumdzhieva, A. (2018) ‘Ethical challenges in the digital era: focus on medical research’, pp. 45-62 in: Koporc, Z. (ed.) Ethics and integrity in health and life sciences research. Emerald, Bingley.
Norwegian Data Protection Authority (2018) Artificial intelligence and privacy. Norwegian Data Protection Authority, Oslo. Available at: https://iapp.org/media/pdf/resource_center/ai-and-privacy.pdf
1. AEPD (2020) Adecuación al RGPD de tratamientos que incorporan Inteligencia Artificial: Una introducción. Agencia Española de Protección de Datos, Madrid, p.20. Available at: www.aepd.es/sites/default/files/2020-02/adecuacion-rgpd-ia.pdf (accessed 15 May 2020).
2. International Bioethics Committee (2017) Report of the IBC on big data and health, p.20. UNESCO. Available at: http://unesdoc.unesco.org/images/0024/002487/248724e.pdf (accessed 13 March 2020).
3. ICO (no date) How do we apply legitimate interests in practice? Information Commissioner’s Office, Wilmslow. Available at: https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/legitimate-interests/how-do-we-apply-legitimate-interests-in-practice/ (accessed 15 May 2020). Furthermore, the assessment of the nature of this relationship must include an investigation of the balance of power between the data subject and the data controller.
4. Article 29 Working Party (2018) Guidelines on consent under Regulation 2016/679. European Commission, Brussels, p.29. Available at: https://ec.europa.eu/newsroom/article29/item-detail.cfm?item_id=623051 (accessed 5 May 2020).
5. DSK (2019) Beschluss der 97. Konferenz der unabhängigen Datenschutzaufsichtsbehörden des Bundes und der Länder zu Auslegung des Begriffs „bestimmte Bereiche wissenschaftlicher Forschung“ im Erwägungsgrund 33 der DS-GVO, 3 April 2019. Available at: www.datenschutzkonferenz-online.de/media/dskb/20190405_auslegung_bestimmte_bereiche_wiss_forschung.pdf (accessed 20 May 2020). An English summary of the measures can be consulted at: www.technologylawdispatch.com/2019/04/privacy-data-protection/german-dpas-publish-resolution-on-concept-of-broad-consent-and-the-interpretation-of-certain-areas-of-scientific-research/
6. Kuyumdzhieva, A. (2018) ‘Ethical challenges in the digital era: focus on medical research’, pp.45-62 in: Koporc, Z. (ed.) Ethics and integrity in health and life sciences research. Emerald, Bingley.
7. Article 29 Working Party (2018) Guidelines on consent under Regulation 2016/679. WP259. European Commission, Brussels, p.3. Available at: https://ec.europa.eu/newsroom/article29/item-detail.cfm?item_id=623051 (accessed 15 May 2020).
10. CIPL (2020) Artificial intelligence and data protection. How the GDPR regulates AI. Centre for Information Policy Leadership, Washington, DC/Brussels/London, p.5. Available at: www.informationpolicycentre.com/uploads/5/7/1/0/57104281/cipl-hunton_andrews_kurth_legal_note_-_how_gdpr_regulates_ai__12_march_2020_.pdf (accessed 15 May 2020).
11. AEPD (2020) Adecuación al RGPD de tratamientos que incorporan Inteligencia Artificial: Una introducción. Agencia Española de Protección de Datos, Madrid, p.22. Available at: www.aepd.es/sites/default/files/2020-02/adecuacion-rgpd-ia.pdf (accessed 15 May 2020).
12. Ibid., p.20.
13. Article 29 Data Protection Working Party (2014) Opinion 06/2014 on the notion of legitimate interests of the data controller under Article 7 of Directive 95/46/EC. European Commission, Brussels, pp.16-17. Available at: https://ec.europa.eu/justice/article-29/documentation/opinion-recommendation/files/2014/wp217_en.pdf (accessed 16 May 2020).
14. EDPB (2019) Guidelines 2/2019 on the processing of personal data under Article 6(1)(b) GDPR in the context of the provision of online services to data subjects. European Data Protection Board, Brussels, p.14. Available at: https://edpb.europa.eu/sites/edpb/files/consultation/edpb_draft_guidelines-art_6-1-b-final_public_consultation_version_en.pdf (accessed 15 May 2020).
15. Article 29 Working Party (2014) Opinion 06/2014 on the notion of legitimate interests of the data controller under Article 7 of Directive 95/46/EC. European Commission, Brussels, p.20. Available at: https://ec.europa.eu/justice/article-29/documentation/opinion-recommendation/files/2014/wp217_en.pdf (accessed 15 May 2020).