WebUI: A Dataset for Enhancing Visual UI Understanding with Web Semantics

Modeling user interfaces (UIs) from visual information allows systems to make inferences about the functionality and semantics needed to support use cases in accessibility, app automation, and testing. Current datasets for training machine learning models are limited in size due to the costly and time-consuming process of manually collecting and annotating UIs. We crawled the web to construct WebUI, a large dataset of 400,000 rendered web pages associated with automatically extracted metadata. We analyze the composition of WebUI and show that while automatically extracted data is noisy, most examples meet basic criteria for visual UI modeling. We apply several strategies for incorporating semantics found in web pages to improve the performance of visual UI understanding models in the mobile domain, where less labeled data is available, on three tasks: (i) element detection, (ii) screen classification, and (iii) screen similarity.

Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

Wellesley College, Wellesley, Massachusetts, United States

Grinnell College, Grinnell, Iowa, United States

Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

Snooty Bird LLC, San Diego, California, United States

Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

https://doi.org/10.1145/3544548.3581158

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2023.acm.org/)

Hall C

6 presentations

Start time: 2023-04-25 18:00:00

End time: 2023-04-25 19:30:00

Award: Honorable Mention

Conference: CHI 2023

Session: GUIs, Gaze, and Gesture-based Interaction
