Data Work Across Contexts and Disciplines

Conference name
CSCW2021
Orienting, Framing, Bridging, Magic, and Counseling: How Data Scientists Navigate the Outer Loop of Client Collaborations in Industry and Academia
Abstract

Data scientists often collaborate with clients to analyze data to meet a client’s needs. What does the end-to-end workflow of a data scientist’s collaboration with clients look like throughout the lifetime of a project? To investigate this question, we interviewed ten data scientists (5 female, 4 male, 1 non-binary) in diverse roles across industry and academia. We discovered that they work with clients in a six-stage outer-loop workflow, which involves 1) laying groundwork by building trust before a project begins, 2) orienting to the constraints of the client’s environment, 3) collaboratively framing the problem, 4) bridging the gap between data science and domain expertise, 5) the inner loop of technical data analysis work, 6) counseling to help clients emotionally cope with analysis results. This novel outer-loop workflow contributes to CSCW by expanding the notion of what collaboration means in data science beyond the widely-known inner-loop technical workflow stages of acquiring, cleaning, analyzing, modeling, and visualizing data. We conclude by discussing the implications of our findings for data science education, parallels to design work, and unmet needs for tool development.

Authors
Sean Kross
The University of California San Diego, La Jolla, California, United States
Philip Guo
UC San Diego, La Jolla, California, United States
Paper URL

https://doi.org/10.1145/3476052

Becoming Interdisciplinary: Fostering Critical Engagement With Disaster Data
Abstract

Information systems such as mapping platforms, algorithms, and databases are a central component of how society responds to the threats posed by disasters. However, these systems have come under increasing criticism in recent years for prioritizing technical disciplines over insights from the humanities and social science and failing to adequately incorporate the perspectives of at-risk or affected communities. This paper describes a unique month-long workshop that convened interdisciplinary experts to collaborate on projects related to flood data. In addition to findings about the practical accomplishment of interdisciplinary collaboration, we offer three interrelated contributions. First, we position interdisciplinarity as a critical practice and offer a detailed example of how we staged this process. We then discuss the benefits to interdisciplinarity of expanding the range of temporal logics normally deployed in design workshops. Finally, we reflect on approaches to evaluating the event’s contributions toward sustained critique and reform of expert practice.

Authors
Robert Soden
David Lallemant
Nanyang Technological University, Singapore, Singapore
Perrine Hamel
Nanyang Technological University, Singapore, Singapore
Karen Barns
ARUP, San Francisco, California, United States
Paper URL

https://doi.org/10.1145/3449242

Towards Supporting Data-Driven Practices in Stroke Telerehabilitation Technology
Abstract

Telerehabilitation technology has the potential to support the work of patients and clinicians by collecting and displaying patients' data to inform, motivate, and support decision-making. However, few studies have investigated data-driven practices in telerehabilitation. In this qualitative study, we conducted interviews and a focus group with the use of data visualization probes to investigate the experience of stroke survivors and healthcare providers with game-based telerehabilitation involving physical and occupational therapy. We find that participants saw potential value in the data to support their work; however, they experienced challenges when interpreting data to arrive at meaningful insights and actionable information. Further, patients' personal relationships with their goals and data stand in contrast with clinicians' more matter-of-fact perspectives. Informed by these results, we discuss implications for telerehabilitation technology design.

Authors
Clara Caldeira
Mayara Costa Figueiredo
University of California, Irvine, Irvine, California, United States
Lucy Dodakian
UC Irvine, Irvine, California, United States
Cleidson de Souza
UFPA, Belém, Pará, Brazil
Steven C. Cramer
UC Los Angeles, Los Angeles, California, United States
Yunan Chen
Paper URL

https://doi.org/10.1145/3449099

Video
Enabling collaborative data science development with the Ballet framework
Abstract

While the open-source software development model has led to successful large-scale collaborations in building software systems, data science projects are frequently developed by individuals or small teams. We describe challenges to scaling data science collaborations and present a conceptual framework and ML programming model to address them. We instantiate these ideas in Ballet, a lightweight framework for collaborative, open-source data science through a focus on feature engineering, and an accompanying cloud-based development environment. Using our framework, collaborators incrementally propose feature definitions to a repository which are each subjected to an ML performance evaluation and can be automatically merged into an executable feature engineering pipeline. We leverage Ballet to conduct a case study analysis of an income prediction problem with 27 collaborators, and discuss implications for future designers of collaborative projects.

Authors
Micah J. Smith
MIT, Cambridge, Massachusetts, United States
Jürgen Cito
TU Wien, Vienna, Austria
Kelvin Lu
MIT, Cambridge, Massachusetts, United States
Kalyan Veeramachaneni
MIT, Cambridge, Massachusetts, United States
Paper URL

https://doi.org/10.1145/3479575

Video
Dr.Aid: Supporting Data-governance Rule Compliance for Decentralized Collaboration in an Automated Way
Abstract

Collaboration across institutional boundaries is widespread and increasing today. It depends on federations sharing data that often have governance rules or external regulations restricting their use. However, the handling of data governance rules (a.k.a. data-use policies) remains manual, time-consuming and error-prone, limiting the rate at which collaborations can form and respond to challenges and opportunities, inhibiting citizen science and reducing data providers' trust in compliance. Using an automated system to facilitate compliance handling substantially reduces the time needed for such non-mission work, thereby accelerating collaboration and improving productivity. We present a framework, Dr.Aid, that helps individuals, organisations and federations comply with data rules, using automation to track which rules are applicable as data is passed between processes and as derived data is generated. It encodes data-governance rules using a formal language and performs reasoning on multi-input-multi-output data-flow graphs in decentralised contexts. We test its power and utility by working with users performing cyclone tracking and earthquake modelling to support mitigation and emergency response. We query standard provenance traces to detach Dr.Aid from details of the tools and systems they are using, as these inevitably vary across members of a federation and through time. We evaluate the model in three aspects by encoding real-life data-use policies from diverse fields, showing its capability for real-world usage and its advantage over traditional frameworks. We argue that this approach will lead to more agile, more productive and more trustworthy collaborations and show that the approach can be adopted incrementally. This, in turn, will allow more appropriate data policies to emerge, opening up new forms of collaboration.

Authors
Rui Zhao
University of Edinburgh, Edinburgh, United Kingdom
Malcolm Atkinson
University of Edinburgh, Edinburgh, United Kingdom
Petros Papapanagiotou
University of Edinburgh, Edinburgh, United Kingdom
Federica Magnoni
INGV, Rome, Italy
Jacques Fleuriot
University of Edinburgh, Edinburgh, United Kingdom
Paper URL

https://doi.org/10.1145/3479604

Video
Data Work in Education: Enacting and Negotiating Care and Control in Teachers' Use of Data-Driven Classroom Surveillance Technology
Abstract

Today, teachers have been increasingly relying on data-driven technologies to track and monitor student behavior data for classroom management. Drawing insights from interviews with 20 K-8 teachers, this paper unpacks how teachers enacted both care and control through their data work in collecting, interpreting, and using student behavior data. In this process, teachers found themselves subject to surveilling gazes from parents, school administration, and students. As a result, teachers had to manipulate the student behavior data to navigate the balance between presenting a professional image to surveillants and enacting care/control that they deemed appropriate. This paper identifies and unpacks two nuanced forms of teachers' data work that have been understudied in CSCW: 1) data work as recontextualizing meanings and 2) data work as resisting surveillance. We discuss teachers' struggle over (in)visibility and their negotiation of autonomy and subjectivity in these two forms of data work. We highlight the importance of foregrounding and making space for data workers' (in our case, teachers') resistance and negotiation of autonomy in light of datafication.

Authors
Alex Jiahong Lu
University of Michigan, Ann Arbor, Michigan, United States
Tawanna R. Dillahunt
University of Michigan, Ann Arbor, Michigan, United States
Gabriela Marcu
University of Michigan, Ann Arbor, Michigan, United States
Mark S. Ackerman
University of Michigan, Ann Arbor, Michigan, United States
Paper URL

https://doi.org/10.1145/3479596

Data Work and Decision Making in Emergency Medical Services: A Distributed Cognition Perspective
Abstract

Emergency medical services (EMS) teams are first responders providing urgent medical care to severely ill or injured patients in the field. Despite their criticality, EMS work is one of the very few medical domains with limited technical support. This paper describes a study conducted to examine technology opportunities for supporting EMS data work and decision-making. We transcribed and analyzed 25 simulation videos. Using the distributed cognition framework, we examined EMS teams’ work practices that support information acquisition and sharing. Our results showed that EMS teams leveraged various mechanisms (e.g., verbal communication and external cognitive aids) to distribute cognitive labor in managing, collecting, and using patient data. However, we observed a set of prominent challenges in EMS data work, including lack of detailed documentation in real time, situation recall issues, situation awareness problems, and challenges in decision making and communication. Based on the results, we discuss implications for technology opportunities to support rapid information acquisition, integration, and sharing in time-critical, high-risk medical settings.

Authors
Zhan Zhang
Pace University, New York, New York, United States
Karen Joy
Pace University, New York, New York, United States
Pradeepti Upadhyayula
Pace University, New York, New York, United States
Mustafa Ozkaynak
University of Colorado | Anschutz Medical Campus, Aurora, Colorado, United States
Richard Harris
Pace University, New York, New York, United States
Kathleen Adelgais
University of Colorado | Anschutz Medical Campus, Aurora, Colorado, United States
Paper URL

https://doi.org/10.1145/3479500

Video
Tensions and Mitigations: Understanding Concerns and Values around Smartphone Data Collection for Public Health Emergencies
Abstract

Smartphones increasingly serve as the source for, or to aggregate, a considerable amount of data that can be relevant in public health emergencies. Hence the sharing and utilisation of mobile health data, for example to help control the spread of communicable diseases, has become a relevant issue, with the COVID-19 pandemic adding a sudden urgency mirrored in debates around contact tracing apps. Building on exploratory work that indicated user perceptions and values around consent, and the notion that smartphones and mobile health data can be perceived as elements of self-embodiment, we present an online study comparing three scenarios of representative diseases undertaken during the first wave lockdown in the UK. Using a mixed-methods analysis of responses from 86 participants, we identify tensions and mitigations in user values and from those present the description of four characteristic user-groups that can inform considerations for design and development activities in this space.

Authors
Colin Watson
Newcastle University, Newcastle upon Tyne, United Kingdom
Ridita Ali
Newcastle University, Newcastle, United Kingdom
Jan David Smeddinck
Newcastle University, Newcastle upon Tyne, United Kingdom
Paper URL

https://doi.org/10.1145/3476071

Video