Case Studies on the Motivation and Performance of Contributors Who Verify and Maintain In-Flux Tabular Datasets

Abstract

The life cycle of a peer-produced dataset follows the phases of growth, maturity, and decline. Paying crowdworkers is a proven method to collect and organize information into structured tables. However, these tabular representations may contain inaccuracies due to errors or data changing over time. Thus, the maturation phase of a dataset can benefit from additional human examination. One method to improve accuracy is to recruit additional paid crowdworkers to verify and correct errors. An alternative method relies on unpaid contributors who collectively edit the dataset during regular use. We describe two case studies that examine different strategies for human verification and maintenance of in-flux tabular datasets. The first case study examines traditional micro-task verification strategies with paid crowdworkers, while the second examines long-term maintenance strategies with unpaid contributions from non-crowdworkers. Among the paid verification strategies, two produced more accurate corrections at a lower cost per accurate correction: redundant data collection followed by final verification from a trusted crowdworker, and allowing crowdworkers to review any data freely. In the unpaid maintenance strategies, contributors provided more accurate corrections when asked to review data matching their interests. This research identifies considerations and future approaches to collectively improving the accuracy and longevity of tabular information.

Authors
Shaun Wallace
Brown University, Providence, Rhode Island, United States
Alexandra Papoutsaki
Pomona College, Claremont, California, United States
Neilly H. Tan
University of Washington, Seattle, Washington, United States
Hua Guo
Twitter Inc., San Francisco, California, United States
Jeff Huang
Brown University, Providence, Rhode Island, United States
Paper URL

https://doi.org/10.1145/3479592

Conference: CSCW2021

The 24th ACM Conference on Computer-Supported Cooperative Work and Social Computing

Session: Crowds and Data Work

Papers Room D
8 presentations
2021-10-25 23:00:00
2021-10-26 00:30:00