Data-driven approaches that form the foundation of advancements in machine learning (ML) are powered in large part by human infrastructures that enable the collection of large datasets. We study the movement of data through multiple stages of data processing in the context of public health in India, examining the data work performed by frontline health workers, data stewards, and ML developers. We conducted interviews with these stakeholders to understand their varied perspectives on valuing data across stages, working with data to attain this value, and challenges arising throughout. We discuss the tensions in valuing and how they might be addressed, as we emphasize the need for improved transparency and accountability when data are transformed from one stage of processing to the next.
https://dl.acm.org/doi/abs/10.1145/3491102.3501868
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)