Annex II. Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness

Annex II^[1]

Inception

1. What is the health question relating to patient benefit?

2. What evidence is there that the development of the algorithm was informed by best practices in clinical research and epidemiological study design?

Study

1. When and how should patients be involved in data collection, analysis, deployment, and use?

2. Are the data suitable to answer the clinical question, that is, do they capture the relevant real-world heterogeneity, and are they of sufficient detail and quality?

3. Does the validation methodology reflect the real-world constraints and operational procedures associated with data collection and storage?

4. What computational and software resources are required for the task, and are the available resources sufficient to tackle this problem?

Statistical methods

1. Are the reported performance metrics relevant for the clinical context in which the model will be used?

2. Is the ML/AI algorithm compared to the current best technology, and against other appropriate baselines?

3. Is the reported gain in statistical performance with the ML/AI algorithm justified in the context of any trade-offs?
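
As an illustration of the first two questions in this group, the following minimal sketch computes threshold-based metrics that clinicians commonly reason with (sensitivity, specificity, positive and negative predictive value) for a model and for a comparison baseline. It is not taken from the cited paper: the function names, the `ClinicalMetrics` container, and the label lists are hypothetical, and the labels are made-up values for demonstration only. Which metrics are relevant depends on the clinical context.

```python
from dataclasses import dataclass


@dataclass
class ClinicalMetrics:
    sensitivity: float  # proportion of true cases that are detected
    specificity: float  # proportion of non-cases correctly ruled out
    ppv: float          # positive predictive value
    npv: float          # negative predictive value


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def clinical_metrics(y_true, y_pred) -> ClinicalMetrics:
    """Compute metrics from binary labels, where 1 means condition present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return ClinicalMetrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


# Hypothetical, made-up labels: not data from any study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]     # ML/AI model under evaluation
baseline_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # current best technology

model = clinical_metrics(y_true, model_pred)
baseline = clinical_metrics(y_true, baseline_pred)

print(f"model:    sensitivity={model.sensitivity:.2f} specificity={model.specificity:.2f}")
print(f"baseline: sensitivity={baseline.sensitivity:.2f} specificity={baseline.specificity:.2f}")
```

In this toy example the model detects more cases than the baseline but produces a false positive that the baseline avoids, which is the kind of trade-off the third question asks authors to justify.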

Reproducibility

1. On what basis are data accessible to other researchers?

2. Are the code, software, and all other relevant parts of the prediction modeling pipeline available to others to facilitate replicability?

3. Is there organizational transparency about the flow of data and results?

Impact evaluation

1. Are the results generalizable to settings beyond where the system was developed (that is, results reproducibility/external validity)?

2. Does the model create or exacerbate inequities in healthcare by age, sex, ethnicity, or other protected characteristics?

3. What evidence is there that clinicians and patients find the model and its output (reasonably) interpretable?

4. How will evidence of real-world model effectiveness in the proposed clinical setting be generated, and how will unintended consequences be prevented?
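
One possible way to start on the second question in this group is to report a performance metric separately for each subgroup and look at the spread. The sketch below does this for sensitivity. It is an assumption-laden illustration, not a method from the cited paper: the function names, the group labels "A" and "B", and all values are hypothetical, and a real audit would also need confidence intervals and adequate sample sizes per subgroup.

```python
from collections import defaultdict


def sensitivity_by_group(y_true, y_pred, groups):
    """Return {group: sensitivity} for binary labels, where 1 means condition present.

    Groups with no positive cases are omitted, since sensitivity is undefined for them.
    """
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 1:
                true_positives[group] += 1
    return {g: true_positives[g] / positives[g] for g in positives}


def largest_gap(per_group):
    """Difference between the best and worst performing subgroup."""
    values = list(per_group.values())
    return max(values) - min(values)


# Hypothetical, made-up values: not data from any study.
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0]
groups = ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"]

per_group = sensitivity_by_group(y_true, y_pred, groups)
gap = largest_gap(per_group)

for group, value in sorted(per_group.items()):
    print(f"group {group}: sensitivity={value:.2f}")
print(f"largest gap: {gap:.2f}")
```

A large gap between subgroups does not by itself establish inequity, but it indicates where further investigation is needed before deployment.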

Implementation

1. How is the model being regularly reassessed, and updated as data quality and clinical practice change (that is, post-deployment monitoring)?

2. Is the ML/AI model cost effective to build, implement, and maintain?

3. How will the potential financial benefits be distributed if the ML/AI model is commercialized?

4. How have the regulatory requirements for accreditation/approval been addressed?

References

[1] Vollmer, S. et al. (2020) ‘Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness’, BMJ, 368, l6927. http://dx.doi.org/10.1136/bmj.l6927
