Human-LLM Collaborative Annotation Through Effective Verification of LLM Labels

Abstract

Large language models (LLMs) have shown remarkable performance across various natural language processing (NLP) tasks, indicating their significant potential as data annotators. Although LLM-generated annotations are cheaper and faster to obtain, they are often erroneous for complex or domain-specific tasks and may introduce bias relative to human annotations. Therefore, instead of completely replacing human annotators with LLMs, we need to leverage the strengths of both LLMs and humans to ensure the accuracy and reliability of annotations. This paper presents a multi-step human-LLM collaborative approach where (1) LLMs generate labels and provide explanations, (2) a verifier assesses the quality of LLM-generated labels, and (3) human annotators re-annotate a subset of labels with lower verification scores. To facilitate human-LLM collaboration, we make use of LLMs' ability to rationalize their decisions. LLM-generated explanations can provide additional information to the verifier model as well as help humans better understand LLM labels. We demonstrate that our verifier is able to identify potentially incorrect LLM labels for human re-annotation. Furthermore, we investigate the impact of presenting LLM labels and explanations on human re-annotation through crowdsourced studies.
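
The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Annotation` class, the `collaborate` function, the `verifier` and `human_annotate` callables, and the fixed `budget` for re-annotation are all hypothetical names introduced here, and the abstract does not say how the verifier is built or how the re-annotated subset is sized.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Annotation:
    """One item with its step-1 output (hypothetical structure)."""
    text: str
    llm_label: str            # label produced by the LLM
    llm_explanation: str      # explanation produced by the LLM
    verifier_score: float = 0.0
    final_label: Optional[str] = None


def collaborate(
    annotations: List[Annotation],
    verifier: Callable[[Annotation], float],
    human_annotate: Callable[[Annotation], str],
    budget: int,
) -> List[Annotation]:
    """Score LLM labels, then send the lowest-scored ones to humans.

    `budget` (an assumption of this sketch) is the number of items
    that human annotators are asked to re-annotate.
    """
    # Step 2: the verifier scores each LLM label; by default the
    # LLM label is kept as the final label.
    for ann in annotations:
        ann.verifier_score = verifier(ann)
        ann.final_label = ann.llm_label

    # Step 3: humans re-annotate the items with the lowest scores.
    ranked = sorted(annotations, key=lambda ann: ann.verifier_score)
    for ann in ranked[: max(0, budget)]:
        ann.final_label = human_annotate(ann)

    return annotations
```

In this sketch the human callable receives the whole `Annotation`, so it can see the LLM label and explanation; whether showing them helps or biases re-annotation is the question the paper studies through crowdsourced experiments.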

Authors
Xinru Wang
Purdue University, West Lafayette, Indiana, United States
Hannah Kim
Megagon Labs, Mountain View, California, United States
Sajjadur Rahman
Megagon Labs, Mountain View, California, United States
Kushan Mitra
Megagon Labs, Mountain View, California, United States
Zhengjie Miao
Megagon Labs, Mountain View, California, United States
Paper URL

doi.org/10.1145/3613904.3641960

Conference: CHI 2024

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2024.acm.org/)

Session: Evaluating AI Technologies A

321
5 presentations
2024-05-15 01:00:00
2024-05-15 02:20:00