Resolving biases

There are different strategies that can help to avoid or correct biases. When creating the databases that will be used to build an AI model, controllers should make every effort to avoid unbalanced or erroneous data. Identifiable and discriminatory biases should be removed at the dataset-building stage where possible.[1]

If the bias originates in the training dataset, the controller should select adequate data for the training phase, so that the resulting model does not produce incorrect or discriminatory results.[2] An AI model must “be trained using relevant and correct data and it must learn which data to emphasize. The model must not emphasize information relating to racial or ethnic origin, political opinion, religion or belief, trade union membership, genetic status, health status or sexual orientation if this would lead to arbitrary discriminatory treatment (emphasis added).”[3]
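By way of illustration only, the following Python sketch shows one part of this requirement: excluding special-category attributes from the features used to train a model. The table and the column names (ethnic_origin, health_status, outcome) are hypothetical, and, as the comments note, dropping such columns is not sufficient on its own, because the remaining features may act as proxies for them.

```python
import pandas as pd

# Hypothetical training table; the column names are illustrative only.
training_data = pd.DataFrame({
    "age": [34, 51, 29],
    "postcode": ["1012", "2034", "3056"],
    "ethnic_origin": ["A", "B", "A"],        # special-category data (Article 9 GDPR)
    "health_status": ["none", "chronic", "none"],
    "outcome": [1, 0, 1],
})

# Columns the model must not learn from directly.
SPECIAL_CATEGORIES = ["ethnic_origin", "health_status"]

features = training_data.drop(columns=SPECIAL_CATEGORIES + ["outcome"])
labels = training_data["outcome"]

# Caveat: dropping these columns does not, by itself, prevent discrimination,
# because remaining features (e.g. postcode) can act as proxies for them.
```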

Furthermore, people with disabilities should be included when sourcing data to build models, and in testing, to create a more inclusive and robust system. If this process is performed adequately, the bias will probably vanish. For example, in the racial bias in health algorithms case study (see Box 11), it was possible to reformulate the algorithm (in this case, so that it no longer used costs as a proxy for needs) and eliminate racial bias in predicting who needed extra care. Indeed, changing the indicator for health from predicted costs to the number of chronic medical conditions increased the percentage of black patients receiving better healthcare from 17% to 46%. This is an excellent example of increasing fairness by reformulating an algorithm.
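The following Python sketch illustrates, on invented toy data, the kind of reformulation described above: the same patients are ranked for extra care first by predicted cost and then by the number of chronic conditions, and the share of black patients flagged under each target is compared. The figures and field names are purely illustrative and do not reproduce the case study’s actual data or model.

```python
# Toy data: (race, predicted_cost, chronic_conditions). Entirely invented.
patients = [
    ("black", 2000, 4),
    ("black", 1800, 5),
    ("white", 5200, 2),
    ("white", 4800, 1),
    ("white", 3000, 3),
    ("black", 2500, 3),
]

def share_black(flagged):
    """Proportion of flagged patients who are black."""
    return sum(1 for race, *_ in flagged if race == "black") / len(flagged)

k = 3  # number of patients flagged for extra care

# Target 1: predicted cost as a proxy for need.
by_cost = sorted(patients, key=lambda p: p[1], reverse=True)[:k]
# Target 2: number of chronic conditions as a more direct indicator of need.
by_conditions = sorted(patients, key=lambda p: p[2], reverse=True)[:k]

print("share of black patients flagged (cost proxy):        ", share_black(by_cost))
print("share of black patients flagged (chronic conditions):", share_black(by_conditions))
```

On this toy data the cost-based target flags no black patients, while the condition-based target flags two out of three; the real case study reported a comparable shift, from 17% to 46%.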

However, controllers should always keep in mind that what makes fighting biases particularly complex is that selecting a dataset involves making decisions and choices which may, at times, be made almost unconsciously. By contrast, coding a traditional, deterministic algorithm is always a deliberate operation. Indeed, humans are always the intelligence behind a development, even when it is embedded in algorithms that we think are neutral. Whoever builds a dataset is, to some extent, building it in their own image, reflecting their own worldview and values or, at the very least, the values more or less inherent in the data gathered from the past.[4]

In light of this, it is important that the teams in charge of selecting the data to be integrated into a dataset comprise people who reflect the diversity that the AI development is expected to show. At present, this is a major challenge. In terms of gender, for example, women comprise only 15% of AI research staff at Facebook and 10% at Google, and there is no public data on trans workers or other gender minorities. In terms of race, the gap is even starker: only 2.5% of Google’s workforce is black, while Facebook and Microsoft are each at 4%.[5] Controllers should make every effort to ensure that their teams better reflect diversity and should use data that accurately reflects it.

In summary, algorithm development processes should always include careful monitoring of possible biases. Internal and external reviews should pay special attention to this issue. Datasets built for validation purposes should be carefully selected to ensure an adequate incorporation of data pertaining to subjects from different sectors of society, in terms of age, race, gender, disability, etc. Fortunately, there are many technical tools devoted to eradicating biases in AI models.[6] The IEEE P7003™ Standard for Algorithmic Bias Considerations is particularly interesting at the moment.[7]
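As a minimal illustration of such monitoring, the following Python sketch checks whether a validation dataset adequately represents different groups before it is used. The dataset, the demographic columns and the 20% threshold are hypothetical assumptions; in practice, the relevant attributes and acceptable shares must be defined case by case, and any demographic data used for testing must itself be processed lawfully.

```python
import pandas as pd

# Hypothetical validation set; demographic columns are assumed to be
# lawfully collected for testing purposes only.
validation = pd.DataFrame({
    "gender":     ["male", "male", "male", "male", "male", "female"],
    "age_band":   ["18-30", "31-50", "51+", "18-30", "31-50", "51+"],
    "disability": [False, False, False, False, False, True],
})

MIN_SHARE = 0.20  # illustrative threshold for flagging under-representation

for column in ["gender", "age_band", "disability"]:
    shares = validation[column].value_counts(normalize=True)
    print(f"\n{column}:")
    print(shares.to_string())
    under_represented = shares[shares < MIN_SHARE]
    if not under_represented.empty:
        print(f"warning: under-represented groups in '{column}': "
              f"{list(under_represented.index)}")
```

A check like this only detects groups that are present but scarce; groups missing from the dataset altogether must be identified by comparing it against the intended population of the system.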

However, none of them offers a magical solution, or ‘silver bullet’, applicable to all types of algorithms. In most cases, the right solution will depend on the multiple variables involved in the algorithm. Controllers should aim to eradicate biases as far as possible, and be honest about the final results of their efforts. If biases are uncovered, the AI solution should be trained again. If unfair biases cannot be erased from the model, its deployment should not proceed.

Checklist: bias

☐ The controller has established a strategy or a set of procedures to avoid creating or reinforcing unfair bias in the AI system, both regarding the use of input data and for the algorithm design.

☐ The controller assesses and acknowledges the possible limitations stemming from the composition of the datasets used.

☐ The controller has considered the diversity and representativeness of the data used.

☐ The controller has tested for specific populations or problematic use cases.

☐ The controller has used the available technical tools to improve their understanding of the data, the model and its performance.

☐ The controller has put in place processes to test and monitor for potential biases during the development, deployment and use phases of the AI system.

☐ The controller has implemented a mechanism that allows others to flag issues related to bias, discrimination or poor performance of the AI system.

☐ The controller has established clear steps and ways of communicating on how and to whom such issues can be raised.

☐ The controller has considered people who may be indirectly affected by the AI system, in addition to the (end-)users.

☐ The controller has assessed whether there is any possible decision variability that can occur under the same conditions.

☐ In case of variability, the controller has established a measurement or assessment mechanism of the potential impact of such variability on fundamental rights.

☐ The controller has implemented a quantitative analysis or metrics to measure and test the applied definition of fairness (a minimal example of such a metric is sketched after this checklist).

☐ The controller has established mechanisms to ensure fairness in the AI systems, and has considered other potential mechanisms.
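As an illustration of the quantitative analysis referred to in the checklist, the following Python sketch computes one common fairness metric, the demographic parity difference, i.e. the gap in favourable-decision rates between two groups. The groups, decisions and tolerance threshold are invented; the appropriate metric and threshold depend on the definition of fairness the controller has actually adopted.

```python
# Toy decision log: (group, decision), where 1 is the favourable outcome.
decisions = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0),
]

def favourable_rate(group):
    """Share of favourable decisions received by the given group."""
    outcomes = [d for g, d in decisions if g == group]
    return sum(outcomes) / len(outcomes)

rate_a = favourable_rate("group_a")
rate_b = favourable_rate("group_b")
parity_gap = abs(rate_a - rate_b)

print(f"favourable-decision rate, group_a: {rate_a:.2f}")
print(f"favourable-decision rate, group_b: {rate_b:.2f}")
print(f"demographic parity difference:     {parity_gap:.2f}")

TOLERANCE = 0.10  # illustrative; the acceptable gap is context-dependent
if parity_gap > TOLERANCE:
    print("warning: the applied definition of fairness is not met for these groups")
```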

Additional information

CNIL (2017) How can humans keep the upper hand? The ethical matters raised by algorithms and artificial intelligence. Commission Nationale de l’Informatique et des Libertés, Paris. Available at: www.cnil.fr/sites/default/files/atoms/files/cnil_rapport_ai_gb_web.pdf

EDPB (2019) Guidelines 4/2019 on Article 25 Data Protection by Design and by Default (version for public consultation). European Data Protection Board, Brussels. Available at: https://edpb.europa.eu/our-work-tools/public-consultations-art-704/2019/guidelines-42019-article-25-data-protection-design_es

ICO (2020) AI auditing framework: draft guidance for consultation. Information Commissioner’s Office, Wilmslow. Available at: https://ico.org.uk/media/about-the-ico/consultations/2617219/guidance-on-the-ai-auditing-framework-draft-for-consultation.pdf

Mittelstadt, B. and Floridi, L. (2016) ‘The ethics of big data: current and foreseeable issues in biomedical contexts’, Science and Engineering Ethics 22(2): 303-341.

Norwegian Data Protection Authority (2018) Artificial intelligence and privacy. Norwegian Data Protection Authority, Oslo. Available at: https://iapp.org/media/pdf/resource_center/ai-and-privacy.pdf

West, S.M., Whittaker, M. and Crawford, K. (2019) Discriminating systems: gender, race and power in AI. AI Now Institute, New York, p.3. Available at: https://ainowinstitute.org/discriminatingsystems.html

References


[1] Recital 71 of the GDPR.

[2] For a definition of direct and indirect discrimination, see, for instance, Article 2 of Council Directive 2000/78/EC of 27 November 2000, establishing a general framework for equal treatment in employment and occupation. See also Article 21 of the Charter of Fundamental Rights of the EU.

[3] Norwegian Data Protection Authority (2018) Artificial intelligence and privacy. Norwegian Data Protection Authority, Oslo, p. 16. Available at: https://iapp.org/media/pdf/resource_center/ai-and-privacy.pdf (accessed 15 May 2020).

[4] CNIL (2017) How can humans keep the upper hand? The ethical matters raised by algorithms and artificial intelligence. Commission Nationale de l’Informatique et des Libertés, Paris, p. 34. Available at: www.cnil.fr/sites/default/files/atoms/files/cnil_rapport_ai_gb_web.pdf (accessed 15 May 2020).

[5] West, S.M., Whittaker, M. and Crawford, K. (2019) Discriminating systems: gender, race and power in AI. AI Now Institute, New York, p. 3. Available at: https://ainowinstitute.org/discriminatingsystems.html (accessed 15 May 2020).

[6] ICO (2020) AI auditing framework: draft guidance for consultation. Information Commissioner’s Office, Wilmslow, pp. 55-56. Available at: https://ico.org.uk/media/about-the-ico/consultations/2617219/guidance-on-the-ai-auditing-framework-draft-for-consultation.pdf (accessed 15 May 2020).

[7] See: https://ethicsinaction.ieee.org/ (accessed 17 May 2020).
