Visualization for AI/ML

Conference
CHI 2023
ESCAPE: Countering Systematic Errors from Machine's Blind Spots via Interactive Visual Analysis
Abstract

Classification models learn to generalize the associations between data samples and their target classes. However, researchers have increasingly observed that machine learning practice easily leads to systematic errors in AI applications, a phenomenon referred to as "AI blindspots." Such blindspots arise when a model is trained on samples (e.g., for cat/dog classification) in which important patterns (e.g., black cats) are missing, or in which peripheral or undesirable patterns (e.g., dogs with a grass background) mislead the model towards a certain class. Even more sophisticated techniques cannot be guaranteed to capture, reason about, and prevent these spurious associations. In this work, we propose ESCAPE, a visual analytic system that promotes a human-in-the-loop workflow for countering systematic errors. By allowing human users to easily inspect spurious associations, the system helps users spontaneously recognize concepts associated with misclassifications and evaluate mitigation strategies that can reduce biased associations. We also propose two statistical approaches: relative concept association, to better quantify the associations between a concept and instances, and a debiasing method, to mitigate spurious associations. We demonstrate the utility of the proposed ESCAPE system and statistical measures through extensive evaluation, including quantitative experiments, usage scenarios, expert interviews, and controlled user experiments.

Authors
Yongsu Ahn
University of Pittsburgh, Pittsburgh, Pennsylvania, United States
Yu-Ru Lin
University of Pittsburgh, Pittsburgh, Pennsylvania, United States
Panpan Xu
Amazon AWS, Santa Clara, California, United States
Zeng Dai
Bosch Research, Sunnyvale, California, United States
Paper URL

https://doi.org/10.1145/3544548.3581373

Angler: Helping Machine Translation Practitioners Prioritize Model Improvements
Abstract

Machine learning (ML) models can fail in unexpected ways in the real world, but not all model failures are equal. With finite time and resources, ML practitioners are forced to prioritize their model debugging and improvement efforts. Through interviews with 13 ML practitioners at Apple, we found that practitioners construct small targeted test sets to estimate an error's nature, scope, and impact on users. We built on this insight in a case study with machine translation models, and developed Angler, an interactive visual analytics tool to help practitioners prioritize model improvements. In a user study with 7 machine translation experts, we used Angler to understand prioritization practices when the input space is infinite, and obtaining reliable signals of model quality is expensive. Our study revealed that participants could form more interesting and user-focused hypotheses for prioritization by analyzing quantitative summary statistics and qualitatively assessing data by reading sentences.

Authors
Samantha Robertson
University of California, Berkeley, Berkeley, California, United States
Zijie J. Wang
Georgia Tech, Atlanta, Georgia, United States
Dominik Moritz
Apple, Pittsburgh, Pennsylvania, United States
Mary Beth Kery
Apple Inc., Pittsburgh, Pennsylvania, United States
Fred Hohman
Apple, Seattle, Washington, United States
Paper URL

https://doi.org/10.1145/3544548.3580790

Subjective Probability Correction for Uncertainty Representations
Abstract

We propose a new approach to uncertainty communication: we keep the uncertainty representation fixed, but adjust the distribution displayed to compensate for biases in people’s subjective probability in decision-making. To do so, we adopt a linear-in-probit model of subjective probability and derive two corrections to a Normal distribution based on the model’s intercept and slope: one correcting all right-tailed probabilities, and the other preserving the mode and one focal probability. We then conduct two experiments on U.S. demographically-representative samples. We show participants hypothetical U.S. Senate election forecasts as text or a histogram and elicit their subjective probabilities using a betting task. The first experiment estimates the linear-in-probit intercepts and slopes, and confirms the biases in participants’ subjective probabilities. The second, preregistered follow-up shows participants the bias-corrected forecast distributions. We find the corrections substantially improve participants’ decision quality by reducing the integrated absolute error of their subjective probabilities compared to the true probabilities. These corrections can be generalized to any univariate probability or confidence distribution, giving them broad applicability. Our preprint, code, data, and preregistration are available at https://doi.org/10.17605/osf.io/kcwxm.
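
The first correction mentioned in the abstract can be illustrated with a short sketch. It is not the authors' code: it assumes the linear-in-probit model has the form probit(subjective p) = intercept + slope * probit(displayed p) with a positive slope, and the function names and the numeric intercept, slope, and forecast values are hypothetical. Under that assumption, displaying Normal(mu - intercept * sigma, slope * sigma) instead of Normal(mu, sigma) makes every perceived right-tailed probability equal the true one.

```python
from statistics import NormalDist

STD = NormalDist()  # standard Normal: inv_cdf is the probit, cdf is its inverse


def subjective(p, intercept, slope):
    """Assumed model: probit(subjective p) = intercept + slope * probit(p)."""
    return STD.cdf(intercept + slope * STD.inv_cdf(p))


def corrected_normal(mu, sigma, intercept, slope):
    """Normal to display so that perceived right tails match Normal(mu, sigma).

    Follows from solving intercept + slope * z_shown = z_true for z_shown,
    where z = (mean - x) / sd. Requires slope > 0.
    """
    return NormalDist(mu - intercept * sigma, slope * sigma)


def right_tail(dist, x):
    """P(X > x) under dist."""
    return 1.0 - dist.cdf(x)


# Hypothetical forecast (vote share in percent) and hypothetical model parameters.
true_dist = NormalDist(52.0, 3.0)
a, b = 0.2, 0.6
shown = corrected_normal(52.0, 3.0, a, b)

for x in (48.0, 50.0, 55.0):
    perceived = subjective(right_tail(shown, x), a, b)
    print(x, round(right_tail(true_dist, x), 4), round(perceived, 4))
```

The last two printed columns should agree for each threshold, which is what "correcting all right-tailed probabilities" means in this sketch. The second correction in the abstract (preserving the mode and one focal probability) is not covered here.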

Award
Honorable Mention
Authors
Fumeng Yang
Northwestern University, Evanston, Illinois, United States
Maryam Hedayati
Northwestern University, Evanston, Illinois, United States
Matthew Kay
Northwestern University, Chicago, Illinois, United States
Paper URL

https://doi.org/10.1145/3544548.3580998

Drava: Aligning Human Concepts with Machine Learning Latent Dimensions for the Visual Exploration of Small Multiples
Abstract

Latent vectors extracted by machine learning (ML) are widely used in data exploration (e.g., t-SNE) but suffer from a lack of interpretability. While previous studies employed disentangled representation learning (DRL) to enable more interpretable exploration, they often overlooked the potential mismatches between the concepts of humans and the semantic dimensions learned by DRL. To address this issue, we propose Drava, a visual analytics system that supports users in 1) relating the concepts of humans with the semantic dimensions of DRL and identifying mismatches, 2) providing feedback to minimize the mismatches, and 3) obtaining data insights from concept-driven exploration. Drava provides a set of visualizations and interactions based on visual piles to help users understand and refine concepts and conduct concept-driven exploration. Meanwhile, Drava employs a concept adaptor model to fine-tune the semantic dimensions of DRL based on user refinement. The usefulness of Drava is demonstrated through application scenarios and experimental validation.

Authors
Qianwen Wang
Harvard Medical School, Boston, Massachusetts, United States
Sehi L'Yi
Harvard Medical School, Boston, Massachusetts, United States
Nils Gehlenborg
Harvard Medical School, Boston, Massachusetts, United States
Paper URL

https://doi.org/10.1145/3544548.3581127

GAM Coach: Towards Interactive and User-centered Algorithmic Recourse
Abstract

Machine learning (ML) recourse techniques are increasingly used in high-stakes domains, providing end users with actions to alter ML predictions, but they assume ML developers understand what input variables can be changed. However, a recourse plan's actionability is subjective and unlikely to match developers' expectations completely. We present GAM Coach, a novel open-source system that adapts integer linear programming to generate customizable counterfactual explanations for Generalized Additive Models (GAMs), and leverages interactive visualizations to enable end users to iteratively generate recourse plans meeting their needs. A quantitative user study with 41 participants shows our tool is usable and useful, and users prefer personalized recourse plans over generic plans. Through a log analysis, we explore how users discover satisfactory recourse plans, and provide empirical evidence that transparency can lead to more opportunities for everyday users to discover counterintuitive patterns in ML models. GAM Coach is available at: https://poloclub.github.io/gam-coach/.
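
To make the idea of a customizable recourse plan on a GAM concrete, here is a toy sketch. It is not GAM Coach's implementation: it replaces integer linear programming with exhaustive search, and the shape functions, feature names, cost measure (number of bin steps moved), and the `locked` parameter are all hypothetical. It relies only on the fact that a GAM's score is an intercept plus a sum of per-feature contributions.

```python
from itertools import product

# Hypothetical toy GAM: score = INTERCEPT + sum of per-feature bin contributions.
INTERCEPT = -1.0
SHAPE = {
    "income": [-0.8, 0.1, 0.9],   # contribution of bins 0, 1, 2
    "debt": [0.7, 0.0, -0.9],
    "history": [-0.5, 0.4],
}


def score(bins):
    """Additive score; a positive score stands for the favorable prediction."""
    return INTERCEPT + sum(SHAPE[f][b] for f, b in bins.items())


def recourse(current, locked=()):
    """Cheapest bin assignment with a positive score (exhaustive search, not ILP).

    `locked` lists features the user says cannot change. Returns (cost, plan),
    or None if no plan flips the prediction.
    """
    features = list(SHAPE)
    best = None
    for combo in product(*(range(len(SHAPE[f])) for f in features)):
        plan = dict(zip(features, combo))
        if any(plan[f] != current[f] for f in locked):
            continue  # respects the user's constraints
        if score(plan) <= 0:
            continue  # prediction not flipped
        cost = sum(abs(plan[f] - current[f]) for f in features)
        if best is None or cost < best[0]:
            best = (cost, plan)
    return best


current = {"income": 0, "debt": 0, "history": 0}
print(recourse(current))
print(recourse(current, locked=("history",)))
```

Locking a feature changes which plan is returned, which mirrors the abstract's point that actionability depends on the end user's own constraints. Exhaustive search only works for a handful of features and bins, which is one reason a real system would turn to an optimization solver.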

Authors
Zijie J. Wang
Georgia Tech, Atlanta, Georgia, United States
Jennifer Wortman Vaughan
Microsoft Research, New York, New York, United States
Rich Caruana
Microsoft Research, Redmond, Washington, United States
Duen Horng Chau
Georgia Tech, Atlanta, Georgia, United States
Paper URL

https://doi.org/10.1145/3544548.3580816

Tracing and Visualizing Human-ML/AI Collaborative Processes through Artifacts of Data Work
Abstract

Automated Machine Learning (AutoML) technology can lower barriers in data work yet still requires human intervention to be functional. However, the complex and collaborative process resulting from humans and machines trading off work makes it difficult to trace what was done, by whom (or what), and when. In this research, we construct a taxonomy of data work artifacts that captures AutoML and human processes. We present a rigorous methodology for its creation and discuss its transferability to the visual design process. We operationalize the taxonomy through the development of AutoML Trace, an interactive visual sketch showing both the context and temporality of human-ML/AI collaboration in data work. Finally, we demonstrate the utility of our approach via a usage scenario with an enterprise software development team. Collectively, our research process and findings explore challenges and fruitful avenues for developing data visualization tools that interrogate the sociotechnical relationships in automated data work.

Award
Honorable Mention
Authors
Jen Rogers
Scientific Computing and Imaging Institute, Salt Lake City, Utah, United States
Anamaria Crisan
Tableau Research, Seattle, Washington, United States
Paper URL

https://doi.org/10.1145/3544548.3580819
