AutoDS: Towards Human-Centered Automation of Data Science

Abstract

Data science (DS) projects often follow a \textit{lifecycle} that consists of laborious \textit{tasks} for data scientists and domain experts (e.g., data exploration and model training). Only recently have machine learning (ML) researchers developed promising automation techniques to aid data workers in these tasks. This paper introduces \textbf{AutoDS}, an automated machine learning (AutoML) system that aims to leverage the latest ML automation techniques to support data science projects. Data workers only need to upload their dataset; the system can then automatically suggest ML configurations, preprocess data, select an algorithm, and train the model. These suggestions are presented to the user via a web-based graphical user interface and a notebook-based programming user interface. We studied AutoDS with 30 professional data scientists, where one group used AutoDS and the other did not, to complete a data science project. As expected, AutoDS improves productivity; yet surprisingly, we find that the models produced by the AutoDS group have \textbf{higher quality} and \textbf{fewer errors}, but \textbf{lower human confidence scores}. We reflect on the findings by presenting design implications for incorporating automation techniques into human work in the data science lifecycle.

Authors
Dakuo Wang
IBM Research, Cambridge, Massachusetts, United States
Josh Andres
IBM Research Australia, Melbourne, Victoria, Australia
Justin D. Weisz
IBM Research AI, Yorktown Heights, New York, United States
Erick Oduor
IBM Research Africa, Nairobi, Nairobi, Kenya
Casey Dugan
IBM Research, Cambridge, Massachusetts, United States
DOI

10.1145/3411764.3445526

Paper URL

https://doi.org/10.1145/3411764.3445526

Conference: CHI 2021

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2021.acm.org/)

Session: Computational AI Development and Explanation

[B] Paper Room 02, 2021-05-14 01:00:00~2021-05-14 03:00:00 / [C] Paper Room 02, 2021-05-14 09:00:00~2021-05-14 11:00:00 / [A] Paper Room 02, 2021-05-13 17:00:00~2021-05-13 19:00:00
12 presentations