As the application areas of image-based classifiers diversify, capturing human annotators' subjective responses during data acquisition is becoming crucial. In such scenarios, however, eliciting responses from human labelers in a reliable and cost-efficient manner remains a significant challenge. To bridge this gap, we seek to understand how different sequencing strategies in batch image labeling affect human annotators' labeling performance. In particular, we developed three sequencing strategies: (1) uncertainty-based labeling (UL), which prioritizes the images a classifier predicts with the highest uncertainty; (2) certainty-based labeling (CL), the reverse of UL, which presents the images with the highest prediction probability first; and (3) random, a baseline approach that chooses images at random when forming batches. Although UL and CL select the images to be surfaced from the classifier's point of view, we hypothesized that human annotators' perception and labeling performance would vary across the sequencing strategies. In our study, we found that participants perceived different levels of cognitive load across the three conditions (CL the easiest and UL the hardest), and we observed a trade-off between labeling reliability (CL and UL more reliable than random) and task efficiency (UL the most efficient and CL the least efficient). Based on these results, we discuss design implications for data scientists who may apply different sequencing strategies when collecting image labels at scale for subjective tasks. Finally, we suggest possible directions for future research.
https://doi.org/10.1145/3476037
The 24th ACM Conference on Computer-Supported Cooperative Work and Social Computing
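To make the three sequencing strategies concrete, the sketch below shows one way they could be implemented. It is an illustrative example, not the paper's implementation: the abstract does not specify the uncertainty measure, so this sketch assumes the top predicted-class probability as the confidence score, and the function name `order_batch` is hypothetical.

```python
import numpy as np

def order_batch(prob_matrix, strategy, rng=None):
    """Order image indices for labeling from per-image class probabilities.

    prob_matrix: (n_images, n_classes) array of classifier prediction probabilities.
    strategy: "UL" (most uncertain first), "CL" (most certain first), or "random".
    """
    # Assumed confidence score: probability of the top predicted class per image.
    confidence = prob_matrix.max(axis=1)
    if strategy == "UL":
        # Uncertainty-based labeling: least confident images surfaced first.
        return np.argsort(confidence)
    if strategy == "CL":
        # Certainty-based labeling: most confident images surfaced first.
        return np.argsort(confidence)[::-1]
    if strategy == "random":
        # Baseline: random ordering when forming batches.
        rng = rng or np.random.default_rng()
        return rng.permutation(len(prob_matrix))
    raise ValueError(f"unknown strategy: {strategy}")
```

Under this assumption, UL and CL differ only in the direction of the sort over the classifier's confidence, while the random condition ignores the classifier entirely.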