Code Code Evolution: Understanding How People Change Data Science Notebooks Over Time

Sensemaking is the iterative process of identifying, extracting, and explaining insights from data, where each iteration is referred to as the "sensemaking loop." However, little is known about how sensemaking behavior evolves from exploration to explanation during this process. This gap limits our ability to understand the full scope of sensemaking, which in turn inhibits the design of tools that support the process. We contribute the first mixed-methods approach to characterize how sensemaking evolves within computational notebooks. We study 2,574 Jupyter notebooks mined from GitHub by identifying data science notebooks that have undergone significant iterations, presenting a regression model that automatically characterizes sensemaking activity, and using this regression model to calculate and analyze shifts in activity across GitHub versions. Our results show that notebook authors participate in various sensemaking tasks over time, such as annotation, branching analysis, and documentation. We use our insights to recommend extensions to current notebook environments.

University of Maryland, College Park, Maryland, United States

University of Maryland, College Park, College Park, Maryland, United States

University of Washington, Seattle, Washington, United States

https://doi.org/10.1145/3544548.3580997

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2023.acm.org/)

Hall F

6 presentations

Start: 2023-04-25 23:30:00

End: 2023-04-26 00:55:00

Award: Honorable Mention

Conference: CHI 2023

Session: Working with Data
