Divisi: Interactive Search and Visualization for Scalable Exploratory Subgroup Analysis

Abstract

Analyzing data subgroups is a common data science task to build intuition about a dataset and identify areas to improve model performance. However, subgroup analysis is prohibitively difficult in datasets with many features, and existing tools limit unexpected discoveries by relying on user-defined or static subgroups. We propose exploratory subgroup analysis as a set of tasks in which practitioners discover, evaluate, and curate interesting subgroups to build understanding about datasets and models. To support these tasks we introduce Divisi, an interactive notebook-based tool underpinned by a fast approximate subgroup discovery algorithm. Divisi's interface allows data scientists to interactively re-rank and refine subgroups and to visualize their overlap and coverage in the novel Subgroup Map. Through a think-aloud study with 13 practitioners, we find that Divisi can help uncover surprising patterns in data features and their interactions, and that it encourages more thorough exploration of subtypes in complex data.

Authors
Venkatesh Sivaraman
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Zexuan Li
University of Michigan, Ann Arbor, Michigan, United States
Adam Perer
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
DOI

10.1145/3706598.3713103

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713103

Conference: CHI 2025

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

Session: Playing with Data

Annex Hall F206
7 presentations
2025-04-30 20:10:00
2025-04-30 21:40:00