Shared Interest: Measuring Human-AI Alignment to Identify Recurring Patterns in Model Behavior

Abstract

Saliency methods --- techniques to identify the importance of input features on a model's output --- are a common step in understanding neural network behavior. However, interpreting saliency requires tedious manual inspection to identify and aggregate patterns in model behavior, resulting in ad hoc or cherry-picked analysis. To address these concerns, we present Shared Interest: metrics for comparing model reasoning (via saliency) to human reasoning (via ground truth annotations). By providing quantitative descriptors, Shared Interest enables ranking, sorting, and aggregating inputs, thereby facilitating large-scale systematic analysis of model behavior. We use Shared Interest to identify eight recurring patterns in model behavior, such as cases where contextual features or a subset of ground truth features are most important to the model. Working with representative real-world users, we show how Shared Interest can be used to decide if a model is trustworthy, uncover issues missed in manual analyses, and enable interactive probing.
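To make the idea of comparing model reasoning to human reasoning concrete, the following is a minimal illustrative sketch (not the authors' released code) of a Shared Interest–style comparison between a saliency map and a ground-truth annotation mask. The set-overlap scores, their names, and the saliency thresholding step are assumptions made here for illustration; the paper defines its metrics precisely in the full text.

```python
import numpy as np

def shared_interest_scores(saliency, ground_truth, threshold=0.5):
    """Compare a saliency map against a ground-truth annotation mask.

    saliency: float array in [0, 1], per-feature importance (e.g., per pixel).
    ground_truth: boolean array, True where a human annotated the feature as relevant.
    threshold: cutoff used to binarize the saliency map (an assumption for this sketch).
    Returns set-overlap scores between the two binary masks.
    """
    s = saliency >= threshold          # features the model treats as important
    g = ground_truth.astype(bool)      # features humans marked as important

    intersection = np.logical_and(s, g).sum()
    union = np.logical_or(s, g).sum()

    return {
        # overall agreement between model evidence and human evidence
        "iou": intersection / union if union else 0.0,
        # fraction of human-annotated features the saliency covers
        "ground_truth_coverage": intersection / g.sum() if g.sum() else 0.0,
        # fraction of salient features that fall inside the annotation
        "saliency_coverage": intersection / s.sum() if s.sum() else 0.0,
    }

# Example: score one input; scores like these can be used to rank, sort,
# and aggregate inputs when looking for recurring patterns in model behavior.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    saliency = rng.random((8, 8))
    ground_truth = np.zeros((8, 8), dtype=bool)
    ground_truth[2:6, 2:6] = True
    print(shared_interest_scores(saliency, ground_truth))
```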

Award
Honorable Mention
Authors
Angie Boggust
Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Benjamin Hoover
IBM Research AI, Cambridge, Massachusetts, United States
Arvind Satyanarayan
MIT, Cambridge, Massachusetts, United States
Hendrik Strobelt
IBM Research AI, Cambridge, Massachusetts, United States
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3501965

Video

Conference: CHI 2022

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)

Session: Intelligent Systems, Human-AI Collaboration

5 presentations
2022-05-04 01:15:00 – 2022-05-04 02:30:00