Automatically Generating and Improving Voice Command Interface from Operation Sequences on Smartphones

Abstract

Using voice commands to automate smartphone tasks (e.g., making a video call) can effectively augment the interactivity of numerous mobile apps. However, creating voice command interfaces requires a tremendous amount of effort in labeling and compiling graphical user interface (GUI) and utterance data. In this paper, we propose AutoVCI, a novel approach to automatically generate a voice command interface (VCI) from smartphone operation sequences. The generated voice command interface has two distinct features. First, it automatically maps a voice command to GUI operations and fills in parameters accordingly, leveraging GUI data instead of a corpus or hand-written rules. Second, it launches a complementary Q&A dialogue to confirm the user's intention in case of ambiguity. In addition, the generated voice command interface can learn and evolve from user interactions: it accumulates historical command-understanding results to annotate the user's input and improve its semantic understanding ability. We implemented this approach on Android devices and conducted a two-phase user study with 16 and 67 participants, respectively. The experimental results demonstrate the practical feasibility of AutoVCI.
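To make the abstract's first feature concrete, the sketch below illustrates the general idea of mapping an utterance to a recorded operation sequence with parameter slot filling, and falling back to a confirmation dialogue when the match is ambiguous. This is a minimal, hypothetical illustration; all class and method names are invented here and do not come from the AutoVCI implementation.

```python
from dataclasses import dataclass, field

@dataclass
class OperationSequence:
    # Hypothetical record of a demonstrated task: a command template
    # (with {slots}) and the ordered GUI actions that realize it.
    template: str                      # e.g. "video call {contact}"
    actions: list = field(default_factory=list)

class VoiceCommandInterface:
    """Toy sketch: match an utterance against recorded templates,
    fill parameter slots, and flag ambiguity for a Q&A dialogue."""

    def __init__(self):
        self.sequences = []

    def record(self, template, actions):
        self.sequences.append(OperationSequence(template, actions))

    def match(self, utterance):
        words = utterance.lower().split()
        candidates = []
        for seq in self.sequences:
            t_words = seq.template.lower().split()
            if len(t_words) != len(words):
                continue
            slots, ok = {}, True
            for t, w in zip(t_words, words):
                if t.startswith("{") and t.endswith("}"):
                    slots[t[1:-1]] = w     # fill a parameter slot
                elif t != w:
                    ok = False
                    break
            if ok:
                candidates.append((seq, slots))
        if len(candidates) == 1:
            return candidates[0]           # unambiguous: execute directly
        # zero or multiple matches: would trigger a confirmation dialogue
        return None

vci = VoiceCommandInterface()
vci.record("video call {contact}", ["open_app", "tap_contact", "tap_video"])
result = vci.match("video call alice")
```

Here `match` returns the matched sequence together with the filled slots (`{"contact": "alice"}`); a real system would resolve slots against GUI data (e.g., the contact list) rather than raw words.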

Authors
Lihang Pan
Tsinghua University, Beijing, China
Chun Yu
Tsinghua University, Beijing, China
JiaHui Li
Tsinghua University, Beijing, China
Tian Huang
Tsinghua University, Beijing, China
Xiaojun Bi
Stony Brook University, Stony Brook, New York, United States
Yuanchun Shi
Tsinghua University, Beijing, China
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3517459

Conference: CHI 2022

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)

Session: AI: Content Generation

288–289
4 items in this session
2022-05-04, 09:00–10:15