Computational AI Development and Explanation

[A] Paper Room 02, 2021-05-13 17:00:00~2021-05-13 19:00:00 / [B] Paper Room 02, 2021-05-14 01:00:00~2021-05-14 03:00:00 / [C] Paper Room 02, 2021-05-14 09:00:00~2021-05-14 11:00:00

Conference Name
CHI 2021
Assessing the Impact of Automated Suggestions on Decision Making: Domain Experts Mediate Model Errors but Take Less Initiative
Abstract

Automated decision support can accelerate tedious tasks as users can focus their attention where it is needed most. However, a key concern is whether users overly trust or cede agency to automation. In this paper, we investigate the effects of introducing automation to annotating clinical texts--a multi-step, error-prone task of identifying clinical concepts (e.g., procedures) in medical notes, and mapping them to labels in a large ontology. We consider two forms of decision aid: recommending which labels to map concepts to, and pre-populating annotation suggestions. Through laboratory studies, we find that 18 clinicians generally build intuition of when to rely on automation and when to exercise their own judgement. However, when presented with fully pre-populated suggestions, these expert users exhibit less agency: accepting improper mentions, and taking less initiative in creating additional annotations. Our findings inform how systems and algorithms should be designed to mitigate the observed issues.
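
To make the two decision aids concrete, here is a minimal, hypothetical sketch (not the authors' system; the mini-ontology, label IDs, and similarity scoring are invented) of ranking candidate ontology labels for a concept mention and surfacing them either as ranked recommendations or as a single pre-populated suggestion:

```python
# Hypothetical sketch: rank ontology labels for a clinical concept mention and
# present them either as recommendations or as a pre-populated suggestion.
from difflib import SequenceMatcher

# Invented mini-ontology: label id -> preferred term.
ONTOLOGY = {
    "L001": "screening mammography",
    "L002": "mammography",
    "L003": "colonoscopy",
}

def rank_labels(mention, top_k=3):
    """Score each label by crude string similarity to the mention text."""
    scored = [
        (label_id, term, SequenceMatcher(None, mention.lower(), term).ratio())
        for label_id, term in ONTOLOGY.items()
    ]
    return sorted(scored, key=lambda item: item[2], reverse=True)[:top_k]

def decision_aid(mention, mode="recommend"):
    ranked = rank_labels(mention)
    if mode == "recommend":      # show a ranked list; the annotator chooses
        return {"mention": mention, "candidates": ranked}
    if mode == "prepopulate":    # auto-fill the top label; the annotator confirms
        return {"mention": mention, "suggested": ranked[0]}
    raise ValueError(f"unknown mode: {mode}")

print(decision_aid("mammogram", mode="recommend"))
print(decision_aid("mammogram", mode="prepopulate"))
```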

Authors
Ariel Levy
MIT, Cambridge, Massachusetts, United States
Monica Agrawal
MIT, Cambridge, Massachusetts, United States
Arvind Satyanarayan
MIT, Cambridge, Massachusetts, United States
David Sontag
MIT, Cambridge, Massachusetts, United States
DOI

10.1145/3411764.3445522

Paper URL

https://doi.org/10.1145/3411764.3445522

Video
EventAnchor: Reducing Human Interactions in Event Annotation of Racket Sports Videos
Abstract

The popularity of racket sports (e.g., tennis and table tennis) leads to high demands for data analysis, such as notational analysis, of player performance. While sports videos offer many benefits for such analysis, retrieving accurate information from sports videos can be challenging. In this paper, we propose EventAnchor, a data analysis framework to facilitate interactive annotation of racket sports videos with the support of computer vision algorithms. Our approach uses machine learning models in computer vision to help users acquire essential events from videos (e.g., a serve, the ball bouncing on the court) and offers users a set of interactive tools for data annotation. An evaluation study of a table tennis annotation system built on this framework shows significant improvements in user performance, both on simple annotation tasks involving objects of interest and on complex annotation tasks requiring domain knowledge.
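
As a rough illustration of the human-in-the-loop idea (assumptions only; this is not the EventAnchor implementation), a vision model's event detections could be triaged by confidence, with the low-confidence ones routed to the annotator's interactive tools:

```python
# Hypothetical sketch: split model-detected events into auto-accepted ones and
# low-confidence ones queued for interactive correction by the annotator.
from dataclasses import dataclass

@dataclass
class EventDetection:
    frame: int          # frame index in the video
    kind: str           # e.g. "serve" or "bounce"
    confidence: float   # model confidence in [0, 1]

def triage(detections, threshold=0.8):
    """Keep confident detections; queue the rest for manual review."""
    accepted = [d for d in detections if d.confidence >= threshold]
    to_review = [d for d in detections if d.confidence < threshold]
    return accepted, to_review

detections = [
    EventDetection(120, "serve", 0.95),
    EventDetection(134, "bounce", 0.62),
    EventDetection(151, "bounce", 0.88),
]
auto, manual = triage(detections)
print(f"auto-accepted: {len(auto)}, needs review: {len(manual)}")
```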

Authors
Dazhen Deng
Zhejiang University, Hangzhou, Zhejiang, China
Jiang Wu
Zhejiang University, Hangzhou, Zhejiang, China
Jiachen Wang
Zhejiang University, Hangzhou, Zhejiang, China
Yihong Wu
Zhejiang University, Hangzhou, China
Xiao Xie
Zhejiang University, Hangzhou, Zhejiang, China
Zheng Zhou
Department of Sport Science, College of Education, Hangzhou, Zhejiang, China
Hui Zhang
Zhejiang University, Hangzhou, Zhejiang, China
Xiaolong (Luke) Zhang
Penn State, University Park, Pennsylvania, United States
Yingcai Wu
Zhejiang University, Hangzhou, Zhejiang, China
DOI

10.1145/3411764.3445431

Paper URL

https://doi.org/10.1145/3411764.3445431

Video
Data-Centric Explanations: Explaining Training Data of Machine Learning Systems to Promote Transparency
Abstract

Training datasets fundamentally impact the performance of machine learning (ML) systems. Any biases introduced during training (implicit or explicit) are often reflected in the system’s behaviors leading to questions about fairness and loss of trust in the system. Yet, information on training data is rarely communicated to stakeholders. In this work, we explore the concept of data-centric explanations for ML systems that describe the training data to end-users. Through a formative study, we investigate the potential utility of such an approach, including the information about training data that participants find most compelling. In a second study, we investigate reactions to our explanations across four different system scenarios. Our results suggest that data-centric explanations have the potential to impact how users judge the trustworthiness of a system and to assist users in assessing fairness. We discuss the implications of our findings for designing explanations to support users’ perceptions of ML systems.
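
A minimal sketch of what a data-centric explanation might surface, assuming a small tabular training set (the dataset and fields below are invented for illustration):

```python
# Hypothetical sketch: summarize a training set's size, label balance, and the
# coverage of a descriptive field, as content for a data-centric explanation.
from collections import Counter

training_rows = [
    {"label": "approved", "region": "urban"},
    {"label": "denied",   "region": "rural"},
    {"label": "approved", "region": "urban"},
    {"label": "approved", "region": "urban"},
]

def describe_training_data(rows):
    """Compute simple descriptive statistics suitable for end-user display."""
    n = len(rows)
    labels = Counter(r["label"] for r in rows)
    regions = Counter(r["region"] for r in rows)
    return {
        "size": n,
        "label_balance": {k: round(v / n, 2) for k, v in labels.items()},
        "region_coverage": {k: round(v / n, 2) for k, v in regions.items()},
    }

print(describe_training_data(training_rows))
```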

Authors
Ariful Islam Anik
University of Manitoba, Winnipeg, Manitoba, Canada
Andrea Bunt
University of Manitoba, Winnipeg, Manitoba, Canada
DOI

10.1145/3411764.3445736

Paper URL

https://doi.org/10.1145/3411764.3445736

Video
Method for Exploring Generative Adversarial Networks (GANs) via Automatically Generated Image Galleries
Abstract

Generative Adversarial Networks (GANs) can automatically generate quality images from learned model parameters. However, it remains challenging to explore and objectively assess the quality of all possible images generated using a GAN. Currently, model creators evaluate their GANs via tedious visual examination of generated images sampled from narrow prior probability distributions on model parameters. Here, we introduce an interactive method to explore and sample quality images from GANs. Our first two user studies showed that participants can use the tool to explore a GAN and select quality images. Our third user study showed that images sampled from a posterior probability distribution using a Markov Chain Monte Carlo (MCMC) method on parameters of images collected in our first study resulted in on average higher quality and more diverse images than existing baselines. Our work enables principled qualitative GAN exploration and evaluation.
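
The sampling idea can be sketched as follows, with invented stand-ins for the generator and for a quality model learned from user-selected images; a real implementation would run the chain over an actual GAN's latent space:

```python
# Hypothetical sketch: Metropolis-Hastings sampling over latent parameters,
# where the unnormalized posterior favors latents whose generated output a
# quality model (stand-in here) rates highly, under a standard normal prior.
import numpy as np

rng = np.random.default_rng(0)

def generate(z):
    """Stand-in for G(z); a real GAN would return an image."""
    return np.tanh(z)

def quality_score(image):
    """Stand-in for a quality model fit to user-selected images."""
    return float(np.exp(-np.sum((image - 0.5) ** 2)))

def mcmc_sample(dim=8, steps=2000, step_size=0.2):
    z = rng.normal(size=dim)
    current = quality_score(generate(z)) * float(np.exp(-0.5 * z @ z))
    samples = []
    for _ in range(steps):
        proposal = z + step_size * rng.normal(size=dim)
        proposed = quality_score(generate(proposal)) * float(np.exp(-0.5 * proposal @ proposal))
        if rng.random() < proposed / current:   # Metropolis acceptance rule
            z, current = proposal, proposed
        samples.append(z.copy())
    return np.array(samples)

samples = mcmc_sample()
print("chain shape:", samples.shape)
```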

Authors
Enhao Zhang
University of Michigan, Ann Arbor, Michigan, United States
Nikola Banovic
University of Michigan, Ann Arbor, Michigan, United States
DOI

10.1145/3411764.3445714

Paper URL

https://doi.org/10.1145/3411764.3445714

Video
Player-AI Interaction: What Neural Network Games Reveal About AI as Play
Abstract

The advent of artificial intelligence (AI) and machine learning (ML) brings human-AI interaction to the forefront of HCI research. This paper argues that games are an ideal domain for studying and experimenting with how humans interact with AI. Through a systematic survey of neural network games (n = 38), we identified the dominant interaction metaphors and AI interaction patterns in these games. In addition, we applied existing human-AI interaction guidelines to further shed light on player-AI interaction in the context of AI-infused systems. Our core finding is that AI as play can expand current notions of human-AI interaction, which are predominantly productivity-based. In particular, our work suggests that game and UX designers should consider flow to structure the learning curve of human-AI interaction, incorporate discovery-based learning to let players play around with the AI and observe the consequences, and offer users an invitation to play to explore new forms of human-AI interaction.

Authors
Jichen Zhu
Drexel University, Philadelphia, Pennsylvania, United States
Jennifer Villareale
Drexel University Westphal, Philadelphia, Pennsylvania, United States
Nithesh Javvaji
Northeastern University, Boston, Massachusetts, United States
Sebastian Risi
IT University of Copenhagen, Copenhagen, Denmark
Mathias Löwe
IT University of Copenhagen, Copenhagen, Denmark
Rush Weigelt
Drexel University, Philadelphia, Pennsylvania, United States
Casper Harteveld
Northeastern University, Boston, Massachusetts, United States
DOI

10.1145/3411764.3445307

Paper URL

https://doi.org/10.1145/3411764.3445307

Video
Human Reliance on Machine Learning Models When Performance Feedback is Limited: Heuristics and Risks
Abstract

This paper addresses an under-explored problem of AI-assisted decision-making: when objective performance information of the machine learning model underlying a decision aid is absent or scarce, how do people decide their reliance on the model? Through three randomized experiments, we explore the heuristics people may use to adjust their reliance on machine learning models when performance feedback is limited. We find that the level of agreement between people and a model on decision-making tasks that people have high confidence in significantly affects reliance on the model if people receive no information about the model's performance, but this impact will change after aggregate-level model performance information becomes available. Furthermore, the influence of high confidence human-model agreement on people's reliance on a model is moderated by people's confidence in cases where they disagree with the model. We discuss potential risks of these heuristics, and provide design implications on promoting appropriate reliance on AI.
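
A tiny numeric sketch (invented data) of the agreement heuristic described above: with no accuracy feedback available, the person gauges the model by how often it agrees with them on cases they feel highly confident about:

```python
# Hypothetical sketch: compute agreement between a person and a model on the
# person's high-confidence cases, as a proxy for model competence.
cases = [
    # (human_choice, human_confidence, model_choice)
    ("A", 0.95, "A"),
    ("B", 0.90, "B"),
    ("A", 0.92, "B"),
    ("B", 0.55, "A"),   # low-confidence case: ignored by the heuristic
]

def high_confidence_agreement(cases, threshold=0.8):
    confident = [(h, m) for h, c, m in cases if c >= threshold]
    if not confident:
        return None
    return sum(h == m for h, m in confident) / len(confident)

# A person following this heuristic might rely on the model more when this
# agreement rate is high, absent any stated accuracy for the model.
print(f"agreement on high-confidence cases: {high_confidence_agreement(cases):.2f}")
```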

Authors
Zhuoran Lu
Purdue University, West Lafayette, Indiana, United States
Ming Yin
Purdue University, West Lafayette, Indiana, United States
DOI

10.1145/3411764.3445562

Paper URL

https://doi.org/10.1145/3411764.3445562

Video
AutoDS: Towards Human-Centered Automation of Data Science
Abstract

Data science (DS) projects often follow a lifecycle that consists of laborious tasks for data scientists and domain experts (e.g., data exploration, model training). Only recently have machine learning (ML) researchers developed promising automation techniques to aid data workers in these tasks. This paper introduces AutoDS, an automated machine learning (AutoML) system that aims to leverage the latest ML automation techniques to support data science projects. Data workers only need to upload their dataset; the system can then automatically suggest ML configurations, preprocess data, select algorithms, and train models. These suggestions are presented to the user via a web-based graphical user interface and a notebook-based programming user interface. We studied AutoDS with 30 professional data scientists, in which one group used AutoDS and the other did not, each completing a data science project. As expected, AutoDS improves productivity; yet surprisingly, we find that the models produced by the AutoDS group have higher quality and fewer errors, but lower human confidence scores. We reflect on the findings by presenting design implications for incorporating automation techniques into human work in the data science lifecycle.
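
This is not the AutoDS system itself, but a minimal scikit-learn sketch of the kind of search an AutoML backend performs on an uploaded dataset: try several algorithm configurations, cross-validate each, and return the best-scoring fitted model:

```python
# Hypothetical sketch of an AutoML-style model search (stand-in dataset and
# candidate configurations), using scikit-learn.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)   # stand-in for the user's uploaded dataset

candidates = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=50, random_state=0),
    RandomForestClassifier(n_estimators=200, random_state=0),
]

def auto_select(X, y, candidates):
    """Cross-validate each candidate configuration and keep the best one."""
    scored = [(cross_val_score(m, X, y, cv=5).mean(), m) for m in candidates]
    best_score, best_model = max(scored, key=lambda t: t[0])
    return best_model.fit(X, y), best_score

model, score = auto_select(X, y, candidates)
print(type(model).__name__, round(score, 3))
```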

Authors
Dakuo Wang
IBM Research, Cambridge, Massachusetts, United States
Josh Andres
IBM Research Australia, Melbourne, Victoria, Australia
Justin D. Weisz
IBM Research AI, Yorktown Heights, New York, United States
Erick Oduor
IBM Research Africa, Nairobi, Nairobi, Kenya
Casey Dugan
IBM Research, Cambridge, Massachusetts, United States
DOI

10.1145/3411764.3445526

Paper URL

https://doi.org/10.1145/3411764.3445526

Video
Evaluating the Interpretability of Generative Models by Interactive Reconstruction
Abstract

For machine learning models to be most useful in numerous sociotechnical systems, many have argued that they must be human-interpretable. However, despite increasing interest in interpretability, there remains no firm consensus on how to measure it. This is especially true in representation learning, where interpretability research has focused on "disentanglement" measures only applicable to synthetic datasets and not grounded in human factors. We introduce a task to quantify the human-interpretability of generative model representations, where users interactively modify representations to reconstruct target instances. On synthetic datasets, we find performance on this task much more reliably differentiates entangled and disentangled models than baseline approaches. On a real dataset, we find it differentiates between representation learning methods widely believed but never shown to produce more or less interpretable models. In both cases, we ran small-scale think-aloud studies and large-scale experiments on Amazon Mechanical Turk to confirm that our qualitative and quantitative results agreed.
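
A simplified sketch of how the reconstruction task could be scored, using an invented linear "decoder" in place of a real generative model: the user adjusts representation dimensions, the model decodes them, and interpretability is proxied by how closely the decoded output matches the target after a fixed interaction budget:

```python
# Hypothetical sketch: reconstruction error between a user-adjusted latent code
# and a target, decoded through a stand-in linear decoder.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 16))          # stand-in decoder weights

def decode(z):
    """Stand-in for the generative model's decoder."""
    return np.tanh(z @ W)

def reconstruction_error(user_z, target_z):
    """Lower error after a fixed interaction budget suggests a representation
    that is easier for people to navigate."""
    return float(np.mean((decode(user_z) - decode(target_z)) ** 2))

target = rng.normal(size=4)
user_attempt = target + 0.3 * rng.normal(size=4)   # the user's slider settings
print(f"reconstruction error: {reconstruction_error(user_attempt, target):.4f}")
```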

Award
Honorable Mention
Authors
Andrew Ross
Harvard University, Cambridge, Massachusetts, United States
Nina Chen
Harvard College, Cambridge, Massachusetts, United States
Elisa Zhao Hang
Harvard College, Cambridge, Massachusetts, United States
Elena L. Glassman
Harvard University, Cambridge, Massachusetts, United States
Finale Doshi-Velez
Harvard University, Cambridge, Massachusetts, United States
DOI

10.1145/3411764.3445296

Paper URL

https://doi.org/10.1145/3411764.3445296

Video
Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance
Abstract

Many researchers motivate explainable AI with studies showing that human-AI team performance on decision-making tasks improves when the AI explains its recommendations. However, prior studies observed improvements from explanations only when the AI, alone, outperformed both the human and the best team. Can explanations help lead to complementary performance, where team accuracy is higher than either the human or the AI working solo? We conduct mixed-method user studies on three datasets, where an AI with accuracy comparable to humans helps participants solve a task (explaining itself in some conditions). While we observed complementary improvements from AI augmentation, they were not increased by explanations. Rather, explanations increased the chance that humans will accept the AI's recommendation, regardless of its correctness. Our result poses new challenges for human-centered AI: Can we develop explanatory approaches that encourage appropriate trust in AI, and therefore help generate (or improve) complementary performance?
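
For clarity, the complementarity criterion the abstract refers to can be written as a one-line check (the numbers below are invented):

```python
# Hypothetical sketch: complementary performance means the team's accuracy
# exceeds both the human working alone and the AI working alone.
def is_complementary(team_acc, human_acc, ai_acc):
    return team_acc > max(human_acc, ai_acc)

# The team beats the human but merely matches the AI, so the whole does not
# yet exceed its parts; in the second case it does.
print(is_complementary(team_acc=0.84, human_acc=0.80, ai_acc=0.84))  # False
print(is_complementary(team_acc=0.87, human_acc=0.80, ai_acc=0.84))  # True
```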

Authors
Gagan Bansal
University of Washington, Seattle, Washington, United States
Tongshuang Wu
University of Washington, Seattle, Washington, United States
Joyce Zhou
University of Washington, Seattle, Washington, United States
Raymond Fok
University of Washington, Seattle, Washington, United States
Besmira Nushi
Microsoft Research, Redmond, Washington, United States
Ece Kamar
Microsoft Research, Redmond, Washington, United States
Marco Tulio Ribeiro
Microsoft Research, Redmond, Washington, United States
Daniel Weld
University of Washington, Seattle, Washington, United States
DOI

10.1145/3411764.3445717

Paper URL

https://doi.org/10.1145/3411764.3445717

Video
Expanding Explainability: Towards Social Transparency in AI systems
Abstract

As AI-powered systems increasingly mediate consequential decision-making, their explainability is critical for end-users to take informed and accountable actions. Explanations in human-human interactions are socially-situated. AI systems are often socio-organizationally embedded. However, Explainable AI (XAI) approaches have been predominantly algorithm-centered. We take a developmental step towards socially-situated XAI by introducing and exploring Social Transparency (ST), a sociotechnically informed perspective that incorporates the socio-organizational context into explaining AI-mediated decision-making. To explore ST conceptually, we conducted interviews with 29 AI users and practitioners grounded in a speculative design scenario. We suggested constitutive design elements of ST and developed a conceptual framework to unpack ST’s effect and implications at the technical, decision-making, and organizational level. The framework showcases how ST can potentially calibrate trust in AI, improve decision-making, facilitate organizational collective actions, and cultivate holistic explainability. Our work contributes to the discourse of Human-Centered XAI by expanding the design space of XAI.

Award
Honorable Mention
Authors
Upol Ehsan
Georgia Institute of Technology, Atlanta, Georgia, United States
Q. Vera Liao
IBM Research, Yorktown Heights, New York, United States
Michael Muller
IBM Research, Cambridge, Massachusetts, United States
Mark O. Riedl
Georgia Tech, Atlanta, Georgia, United States
Justin D. Weisz
IBM Research AI, Yorktown Heights, New York, United States
DOI

10.1145/3411764.3445188

Paper URL

https://doi.org/10.1145/3411764.3445188

Video
Beyond Expertise and Roles: A Framework to Characterize the Stakeholders of Interpretable Machine Learning and their Needs
Abstract

To ensure accountability and mitigate harm, it is critical that diverse stakeholders can interrogate black-box automated systems and find information that is understandable, relevant, and useful to them. In this paper, we eschew prior expertise- and role-based categorizations of interpretability stakeholders in favor of a more granular framework that decouples stakeholders' knowledge from their interpretability needs. We characterize stakeholders by their formal, instrumental, and personal knowledge and how it manifests in the contexts of machine learning, the data domain, and the general milieu. We additionally distill a hierarchical typology of stakeholder needs that distinguishes higher-level domain goals from lower-level interpretability tasks. In assessing the descriptive, evaluative, and generative powers of our framework, we find our more nuanced treatment of stakeholders reveals gaps and opportunities in the interpretability literature, adds precision to the design and comparison of user studies, and facilitates a more reflexive approach to conducting this research.

Authors
Harini Suresh
MIT, Cambridge, Massachusetts, United States
Steven R. Gomez
MIT, Lexington, Massachusetts, United States
Kevin K. Nam
MIT, Lexington, Massachusetts, United States
Arvind Satyanarayan
MIT, Cambridge, Massachusetts, United States
DOI

10.1145/3411764.3445088

Paper URL

https://doi.org/10.1145/3411764.3445088

Video
Whither AutoML? Understanding the Role of Automation in Machine Learning Workflows
Abstract

Efforts to make machine learning more widely accessible have led to a rapid increase in AutoML tools that aim to automate the process of training and deploying machine learning. To understand how AutoML tools are used in practice today, we performed a qualitative study with participants ranging from novice hobbyists to industry researchers who use AutoML tools. We present insights into the benefits and deficiencies of existing tools, as well as the respective roles of the human and automation in ML workflows. Finally, we discuss design implications for the future of AutoML tool development. We argue that instead of full automation being the ultimate goal of AutoML, designers of these tools should focus on supporting a partnership between the user and the AutoML tool. This means that a range of AutoML tools will need to be developed to support varying user goals such as simplicity, reproducibility, and reliability.

Authors
Doris Xin
University of California, Berkeley, Berkeley, California, United States
Eva Yiwei Wu
UC Berkeley, Berkeley, California, United States
Doris Jung-Lin Lee
University of California, Berkeley, Berkeley, California, United States
Niloufar Salehi
UC Berkeley, Berkeley, California, United States
Aditya Parameswaran
UC Berkeley, Berkeley, California, United States
DOI

10.1145/3411764.3445306

Paper URL

https://doi.org/10.1145/3411764.3445306

Video