Angler: Helping Machine Translation Practitioners Prioritize Model Improvements

Abstract

Machine learning (ML) models can fail in unexpected ways in the real world, but not all model failures are equal. With finite time and resources, ML practitioners are forced to prioritize their model debugging and improvement efforts. Through interviews with 13 ML practitioners at Apple, we found that practitioners construct small targeted test sets to estimate an error's nature, scope, and impact on users. We built on this insight in a case study with machine translation models, and developed Angler, an interactive visual analytics tool to help practitioners prioritize model improvements. In a user study with 7 machine translation experts, we used Angler to understand prioritization practices when the input space is infinite, and obtaining reliable signals of model quality is expensive. Our study revealed that participants could form more interesting and user-focused hypotheses for prioritization by analyzing quantitative summary statistics and qualitatively assessing data by reading sentences.

Authors
Samantha Robertson
University of California, Berkeley, Berkeley, California, United States
Zijie J. Wang
Georgia Tech, Atlanta, Georgia, United States
Dominik Moritz
Apple, Pittsburgh, Pennsylvania, United States
Mary Beth Kery
Apple Inc., Pittsburgh, Pennsylvania, United States
Fred Hohman
Apple, Seattle, Washington, United States
Paper URL

https://doi.org/10.1145/3544548.3580790

Conference: CHI 2023

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2023.acm.org/)

Session: Visualization for AI/ML

Room X11+X12
6 presentations
2023-04-25 01:35:00 – 2023-04-25 03:00:00