OneLabeler: A Flexible System for Building Data Labeling Tools

Abstract

Labeled datasets are essential for supervised machine learning. Various data labeling tools have been built to collect labels in different usage scenarios. However, developing labeling tools is time-consuming, costly, and demands software development expertise. In this paper, we propose a conceptual framework for data labeling, and OneLabeler, a system based on that framework that supports easy building of labeling tools for diverse usage scenarios. The framework consists of common modules and states in labeling tools, summarized through coding of existing tools. OneLabeler supports configuration and composition of common software modules through visual programming to build data labeling tools. A module can be a human, machine, or mixed computation procedure in data labeling. We demonstrate the expressiveness and utility of the system through ten example labeling tools built with OneLabeler. A user study with developers provides evidence that OneLabeler supports efficient building of diverse data labeling tools.

Authors
Yu Zhang
University of Oxford, Oxford, United Kingdom
Yun Wang
Microsoft Research Asia, Beijing, China
Haidong Zhang
Microsoft Research Asia, Beijing, China
Bin Zhu
Microsoft Research Asia, Beijing, China
Siming Chen
Fudan University, Shanghai, China
Dongmei Zhang
Microsoft Research Asia, Beijing, China
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3517612

Conference: CHI 2022

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)

Session: Interacting with Data

291
5 presentations
2022-05-03 18:00:00
2022-05-03 19:15:00