As misinformation proliferates online, large language models (LLMs) have been proposed as a promising tool for accelerating fact-checking workflows. While LLMs demonstrate strong performance on tasks such as text annotation, their ability to generate fact-checking reports remains uncertain. To investigate how media experts evaluate LLM-generated fact-checking reports, we conducted a 2 (Source: human vs. LLM) × 2 (Disclosure of Source: yes vs. no) between-subjects online experiment with media professionals (N=274). Our analyses reveal that experts perceived LLM-generated reports as significantly less useful than human-written ones, and this difference widened when participants were unaware of the source. However, LLM-generated fact-checking reports were rated as accurate and as logical as human-authored ones. Party affiliation also predicted perceived logicalness. Our findings advance the understanding of how experts evaluate LLM-generated content in the context of misinformation, offering theoretical contributions to HCI and communication research as well as practical implications for the field.
ACM CHI Conference on Human Factors in Computing Systems