rTisane: Externalizing conceptual models for data analysis increases engagement with domain knowledge and improves statistical model quality

Abstract

Statistical models should accurately reflect analysts’ domain knowledge about variables and their relationships. While recent tools let analysts express these assumptions and use them to produce a resulting statistical model, it remains unclear what analysts want to express and how externalization impacts statistical model quality. This paper addresses these gaps. We first conduct an exploratory study of analysts using a domain-specific language (DSL) to express conceptual models. We observe a preference for detailing how variables relate and a desire to allow, and then later resolve, ambiguity in their conceptual models. We leverage these findings to develop rTisane, a DSL for expressing conceptual models augmented with an interactive disambiguation process. In a controlled evaluation, we find that analysts reconsidered their assumptions, self-reported externalizing their assumptions accurately, and maintained analysis intent with rTisane. Additionally, rTisane enabled some analysts to author statistical models they were unable to specify manually. For others, rTisane resulted in models that better fit the data or enabled iterative improvement.

Award
Best Paper
Authors
Eunice Jun
University of California, Los Angeles, Los Angeles, California, United States
Edward Misback
University of Washington, Seattle, Washington, United States
Jeffrey Heer
University of Washington, Seattle, Washington, United States
Rene Just
University of Washington, Seattle, Washington, United States
Paper URL

doi.org/10.1145/3613904.3642267

Conference: CHI 2024

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2024.acm.org/)

Session: Working with Data A

318B
5 presentations
2024-05-13 20:00:00
2024-05-13 21:20:00