Understanding Human-side Impact of Sampling Image Batches in Subjective Attribute Labeling

Abstract

As the application areas of image-based classifiers diversify, capturing human annotators' subjective responses during data acquisition is becoming crucial. In such scenarios, however, eliciting responses from human labelers reliably and cost-efficiently remains a significant challenge. To bridge this gap, we seek to understand how different sequencing strategies in batch image labeling affect human annotators' labeling performance. In particular, we developed three sequencing strategies: (1) uncertainty-based labeling (UL), which prioritizes images that a classifier predicts with the highest uncertainty; (2) certainty-based labeling (CL), the reverse of UL, which presents images with the highest prediction probability first; and (3) random, a baseline that randomly chooses the set of images forming each batch. Although UL and CL select the images to be surfaced from a classifier's point of view, we hypothesized that human annotators' perception and labeling performance may vary with the sequencing strategy. In our study, participants perceived different levels of cognitive load across the three conditions (CL the easiest and UL the hardest), and we found a trade-off between labeling reliability (CL and UL more reliable than random) and task efficiency (UL the most efficient and CL the least efficient). Based on the results, we discuss design implications for data scientists who may consider applying different sequencing strategies when collecting image labels at scale for subjective tasks, and we suggest possible areas for future research.
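
The three sequencing strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sequence_batch`, its parameters, and the use of the top class probability as the confidence score are assumptions made for illustration, since the abstract does not specify how uncertainty was measured.

```python
import random


def sequence_batch(probs, strategy, batch_size, seed=0):
    """Return the indices of the images to surface in one labeling batch.

    probs: one list of class probabilities per image (hypothetical classifier output).
    strategy: "UL" (uncertainty-based), "CL" (certainty-based), or "random".
    """
    # Assumption: confidence is the highest class probability for each image.
    confidence = [max(p) for p in probs]
    indices = list(range(len(probs)))

    if strategy == "UL":
        # Least confident (most uncertain) images first.
        indices.sort(key=lambda i: confidence[i])
    elif strategy == "CL":
        # Most confident images first.
        indices.sort(key=lambda i: -confidence[i])
    elif strategy == "random":
        # Baseline: shuffle with a fixed seed for reproducibility.
        random.Random(seed).shuffle(indices)
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return indices[:batch_size]


# Hypothetical predictions for four images on a binary attribute.
example_probs = [[0.9, 0.1], [0.55, 0.45], [0.7, 0.3], [0.6, 0.4]]
ul_batch = sequence_batch(example_probs, "UL", batch_size=2)
cl_batch = sequence_batch(example_probs, "CL", batch_size=2)
random_batch = sequence_batch(example_probs, "random", batch_size=2)
```

Here UL and CL differ only in sort direction, which mirrors the abstract's description of CL as the reverse of UL; other uncertainty measures, such as prediction entropy, could be substituted for the confidence score.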

Authors
Sungsoo Ray Hong
George Mason University, Fairfax, Virginia, United States
Chaeyeon Chung
KAIST, Daejeon, Korea, Republic of
Jung Soo Lee
KAIST, Daejeon, Korea, Republic of
Kyungmin Park
Shinhan Bank, Seoul, Korea, Republic of
Junsoo Lee
KAIST, Daejeon, Korea, Republic of
Minjae Kim
NCSOFT, Seongnam-si, Korea, Republic of
Mookyung Song
NCSOFT, Seongnam-si, Korea, Republic of
Yeonwoo Kim
NCSOFT, Seoul, Korea, Republic of
Jaegul Choo
KAIST, Daejeon, Korea, Republic of
Paper URL

https://doi.org/10.1145/3476037

Video

Conference: CSCW2021

The 24th ACM Conference on Computer-Supported Cooperative Work and Social Computing

Session: Data Work and AI

Papers Room B
8 presentations
2021-10-27 22:30:00
2021-10-28 00:00:00