Evaluation (validation)

Description

“Before proceeding to final deployment of the model built by the data analyst, it is important to more thoroughly evaluate the model and review the model’s construction to be certain it properly achieves the business objectives. Here it is critical to determine if some important business issue has not been sufficiently considered. At the end of this phase, the project leader then should decide exactly how to use the data mining results. The key steps here are the evaluation of results, the process review, and the determination of next steps.”[1]

This phase involves several tasks that raise important data protection issues. Overall, you must:

  • Evaluate the results of the model, for instance, whether or not it is accurate. To this end, the AI developer might test it in real-world conditions.
  • Review the process. You shall review the data mining engagement to determine whether any important factor or task has been overlooked. This includes quality assurance issues.

Main actions that need to be addressed

Processes of dynamic validation

The validation of any processing that includes an AI component must be done in conditions that reflect the actual environment in which the processing is intended to be deployed. Thus, if you know in advance where the AI tool will be used, you should adapt the validation process to that environment. For instance, if the tool will be deployed in Italy, you should validate it with data obtained from the Italian population or, if that is not possible, a similar population. Otherwise, the results might be utterly incorrect. In any case, you should inform any prospective user of the conditions under which validation was performed.

Moreover, the validation process requires periodic review if conditions change or if there is a suspicion that the solution itself may have been altered. For instance, if the algorithm has been fed with data from elderly people, you should assess whether or not this changes its accuracy for a young population. You must also make sure that the validation documentation accurately reflects the conditions under which the algorithm was validated.

In order to reach this aim, validation should include all components of an AI tool, including data, pre-trained models, environments and the behavior of the system as a whole. Validation should also be performed as soon as possible. Overall, it must be ensured that the outputs or actions are consistent with the results of the preceding processes, comparing them to the previously defined policies to ensure that they are not violated.[2] Validation sometimes requires gathering new personal data. In other cases, controllers use data for purposes other than the original ones. In all these cases, controllers should ensure compliance with the GDPR (see the “Purpose limitation” section in the “Principles” chapter and “Data protection and scientific research” in “Concepts”).
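
For illustration, the kind of subgroup check described above could look like the following minimal Python sketch. It assumes a trained scikit-learn-style classifier and a labelled validation set; the model, column names and age bands are hypothetical:

    # Minimal sketch: compare model accuracy across population subgroups
    # (e.g. age bands) to detect gaps between the population used for
    # validation and the intended deployment population.
    import pandas as pd
    from sklearn.metrics import accuracy_score

    def validate_by_subgroup(model, df: pd.DataFrame, feature_cols,
                             label_col, group_col):
        """Return per-subgroup accuracy so a drop in any group is visible."""
        results = {}
        for group, subset in df.groupby(group_col):
            preds = model.predict(subset[feature_cols])
            results[group] = accuracy_score(subset[label_col], preds)
        return results

    # Hypothetical usage: 'age_band' distinguishes elderly and young patients.
    # scores = validate_by_subgroup(model, validation_df,
    #                               feature_cols=["crp", "spo2", "temp"],
    #                               label_col="covid_positive",
    #                               group_col="age_band")

A large accuracy gap between groups signals that the tool must be revalidated before being deployed to a population that differs from the one used during training.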

Deleting unnecessary datasets

Quite often, the validation and training processes are linked: if validation recommends improvements to the model, training should be performed again. In principle, once the AI development has finally been completed, the training stage of the AI tool is over. At that moment, you should remove the dataset used for this purpose, unless there is a lawful need to keep it for refining or evaluating the system, or for other purposes compatible with those for which the data were collected, in accordance with the conditions of Article 6(4) of the GDPR (see the “Define adequate data storage policies” section in this document). However, you should always bear in mind that deleting personal data can work against the need to maintain the accuracy of tools based on real-time self-learning algorithms: if a mistake is found, you will probably need to recall the data previously used in the training stage. In the event that data subjects request the deletion of their data, you shall adopt a case-by-case approach, taking into account any limitations to this right provided by the Regulation (see Art. 17(3)).[3]
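
Purely as an illustration, the retention decision sketched above could be encoded as follows in Python. The enumerated grounds and field names are assumptions made for the example; in practice this is a legal assessment, not a boolean in code:

    # Illustrative sketch of a post-training retention decision.
    from dataclasses import dataclass

    @dataclass
    class TrainingDataset:
        training_completed: bool      # has the training stage ended?
        needed_for_refinement: bool   # kept to refine/evaluate the system
        compatible_purpose: bool      # Art. 6(4) GDPR compatible purpose
        erasure_requested: bool       # data subject request (Art. 17)
        erasure_exempt: bool          # an Art. 17(3) limitation applies

    def may_delete(ds: TrainingDataset) -> bool:
        """True if no documented ground justifies keeping the dataset."""
        if ds.erasure_requested and not ds.erasure_exempt:
            return True   # the right to erasure prevails
        if not ds.training_completed:
            return False  # still needed for training
        return not (ds.needed_for_refinement or ds.compatible_purpose)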

Performing external audit of data processing

Since the risks of the system you are developing are high, an audit of the system by an independent third party must be considered. A variety of audits can be used: they might be internal or external, and they might cover the final product only or also earlier, less evolved prototypes. They can be considered a form of monitoring or a transparency tool.

In terms of legal compliance, AI tools must be audited to verify whether they process personal data in accordance with the provisions of the GDPR, considering the wide range of issues that might be related to that processing. The High-Level Expert Group on AI stated that “testing processes should be designed and performed by as diverse group of people as possible. Multiple metrics should be developed to cover the categories that are being tested for different perspectives. Adversarial testing by trusted and diverse “red teams” deliberately attempting to “break” the system to find vulnerabilities, and “bug bounties” that incentivize outsiders to detect and responsibly report system errors and weaknesses, can be considered.”[4] However, there are good reasons to be skeptical about the capability of an auditor to check the functioning of a machine learning system.
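
As an example of the kind of adversarial probing the guidelines mention, the following minimal Python sketch perturbs validation inputs with small random noise and measures how often the model's prediction flips; the model, noise scale and trial count are hypothetical:

    # Minimal "red team"-style robustness probe: add small random noise
    # to validation inputs and count how often predictions change.
    import numpy as np

    def prediction_flip_rate(model, X, noise_scale=0.01, trials=100, seed=0):
        """Fraction of (sample, trial) pairs where noise flips the prediction."""
        rng = np.random.default_rng(seed)
        baseline = model.predict(X)
        flips = 0
        for _ in range(trials):
            noisy = X + rng.normal(0.0, noise_scale, size=X.shape)
            flips += int(np.sum(model.predict(noisy) != baseline))
        return flips / (trials * len(X))

A high flip rate under tiny perturbations suggests a brittleness that an auditor should flag before deployment.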

This is why it is sensible to focus on the items included in the AEPD's recommended checklist: it is more straightforward to focus on the measures implemented to avoid bias, obscurity, hidden profiling, etc., and on the adequate use of tools such as the DPIA, which can be performed multiple times, than to try to gain an in-depth understanding of the functioning of a complex algorithm. Implementing adequate data protection policies from the first stages of the tool's lifecycle is the best way to avoid data protection issues.

Ensuring compliance with legal framework for medical devices

Before deploying your device, you should make sure that you have adequately followed the regulations governing the development of medical devices. A Clinical Evaluation and a Performance Evaluation should also be carried out. The Guidance on Clinical Evaluation (MDR) / Performance Evaluation (IVDR) of Medical Device Software (https://ec.europa.eu/docsroom/documents/40323) could be an excellent tool for this purpose.

Informing health care workers participating in development about possible issues

It is often the case that AI mechanisms are validated by comparing their performance with that of humans, in this case health care professionals. This can surreptitiously turn their participation into an evaluation of their own professional ability: if the success rates of some professionals are compared with those of others, some of them may feel that they are being covertly tested. It is very important to try to avoid this effect. If it cannot be avoided, participants should be warned and should accept it.


References


[1] Colin Shearer, The CRISP-DM Model: The New Blueprint for Data Mining, p. 17.

[2] High-Level Expert Group on AI, Ethics guidelines for trustworthy AI, 2019, p. 22. At: https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai

[3] AEPD, Adecuación al RGPD de tratamientos que incorporan Inteligencia Artificial. Una introducción, 2020, p. 26. At: https://www.aepd.es/sites/default/files/2020-02/adecuacion-rgpd-ia.pdf

[4] High-Level Expert Group on AI, Ethics guidelines for trustworthy AI, 2019, p. 22. At: https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai. Accessed 15 May 2020.
