Health artificial intelligence (AI) is often developed in high-stakes, data-scarce contexts where both clinical validity and patient comprehension are critical; yet rigorous, multi-level evaluation of explanations in real-world patient-facing settings remains challenging. To enhance patient understanding and trust, we propose a practical blueprint for designing and evaluating medically aligned, patient-centered explanations (MAP-X). We instantiate this blueprint in MAP-X, a system that employs a large language model (LLM) with retrieval-augmented generation (RAG) to translate clinical assessments into an understandable, patient-facing interface. We conducted a three-phase evaluation following a multi-level validation framework: a functional evaluation of faithfulness, a clinician evaluation of workflow suitability, and a patient evaluation of perceived understanding and trust. In the patient study, MAP-X showed higher reported trust and a positive trend in explanation satisfaction, and interviews suggested that patients gained a clearer understanding of their assessment results. Overall, MAP-X produced clinically relevant explanations with reasonable faithfulness and usability, suggesting potential to support clinical adoption, although clinician oversight remains necessary.
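To make the retrieval-augmented translation step described above concrete, the following is a minimal sketch of how a clinical assessment could be grounded in retrieved knowledge snippets before being rewritten for a patient. The knowledge snippets, prompt wording, the word-count similarity stand-in, and the `call_llm` stub are illustrative assumptions, not the MAP-X implementation reported in the paper.

```python
# Minimal sketch of a retrieval-augmented explanation step (illustrative only).
from collections import Counter
import math

# Hypothetical knowledge snippets; a real system would use a curated clinical source.
KNOWLEDGE_BASE = [
    "An ejection fraction below 40% may indicate reduced heart pumping function.",
    "HbA1c reflects average blood sugar over roughly three months.",
    "Elevated LDL cholesterol is associated with higher cardiovascular risk.",
]

def similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts (a stand-in for an embedding model)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k knowledge snippets most similar to the clinical assessment."""
    return sorted(KNOWLEDGE_BASE, key=lambda s: similarity(query, s), reverse=True)[:k]

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call; swap in a real model client in practice."""
    return "(generated patient-facing explanation)"

def explain(assessment: str) -> str:
    """Ground the prompt in retrieved snippets, then ask the LLM to rewrite
    the clinical assessment in plain, patient-centered language."""
    context = "\n".join(retrieve(assessment))
    prompt = (
        "Using only the context below, explain this assessment to a patient "
        f"in plain language.\n\nContext:\n{context}\n\nAssessment: {assessment}"
    )
    return call_llm(prompt)

print(explain("LVEF measured at 35%, consistent with reduced systolic function."))
```

In this sketch, retrieval constrains the prompt so the generated explanation stays grounded in vetted content, which is one common way to pursue the faithfulness evaluated in the functional phase.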