Data is fundamental to AI/ML models. This paper investigates the work practices of data annotation as performed in industry in India. Previous human-centred investigations have largely focused on annotators’ subjectivity, bias and efficiency. We present a wider perspective on data annotation: following a grounded approach, we conducted three sets of interviews with 25 annotators, 10 industry experts and 12 ML/AI practitioners. Our results show that the work of annotators is dictated by the interests, priorities and values of others above their station. We contend that data annotation is more than a technical task: it is a systematic exercise of power through organizational structure and practice. We propose a set of implications for cultivating and encouraging better practice, to balance the tension between the need for high-quality data at low cost and the annotators’ aspirations for well-being, career prospects, and active participation in building the AI dream.
https://dl.acm.org/doi/abs/10.1145/3491102.3502121
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)