Clinical decision support tools (DSTs), powered by Artificial Intelligence (AI), promise to improve clinicians' diagnostic and treatment decision-making. However, no AI model is always correct. DSTs must enable clinicians to validate each AI suggestion, convincing them to take the correct suggestions while rejecting its errors. While prior work often tried to do so by explaining AI's inner workings or performance, we chose a different approach: We investigated how clinicians validated each other's suggestions in practice (often by referencing scientific literature) and designed a new DST that embraces these naturalistic interactions. This design uses GPT-3 to draw literature evidence that shows the AI suggestions' robustness and applicability (or the lack thereof). A prototyping study with clinicians from three disease areas proved this approach promising. Clinicians' interactions with the prototype also revealed new design and research opportunities around (1) harnessing the complementary strengths of literature-based and predictive decision supports; (2) mitigating risks of de-skilling clinicians; and (3) offering low-data decision support with literature.
https://doi.org/10.1145/3544548.3581393
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2023.acm.org/)