Large language models (LLMs) are increasingly being deployed in healthcare, influencing diagnostic reasoning and clinical workflows. However, evidence of how clinicians actually engage with these systems (how they prompt, constrain, and verify output) remains scarce, particularly in low- and middle-income countries (LMICs). We conducted a mixed-methods study with physicians in Pakistan: (1) logging their interactions while they solved expert-designed clinical vignettes with optional LLM assistance, and (2) interviewing 12 participants about generative-AI-supported diagnosis. Findings highlight diverse prompting strategies, from role assignment to cautious scaffolding, alongside consistent insistence on human oversight. Interviews reveal pragmatic enthusiasm for LLMs as a “second brain” in resource-constrained settings, tempered by skepticism about reliability, privacy, and patient trust. This study contributes evidence of physician-LLM interaction patterns in an LMIC context, a taxonomy of prompting strategies and oversight mechanisms, and design implications for responsible AI integration in healthcare workflows.
ACM CHI Conference on Human Factors in Computing Systems