Voice interaction has long been envisioned as enabling users to transform physical interaction into hands-free, such as allowing fine-grained control of instructional videos without physically disengaging from the task at hand. While significant engineering advances have brought us closer to this ideal, we do not fully understand the user requirements for voice interactions that should be supported in such contexts. This paper presents an ecologically-valid wizard-of-oz elicitation study exploring realistic user requirements for an ideal instructional video playback control while cooking. Through the analysis of the issued commands and performed actions during this non-linear and complex task, we identify (1) patterns of command formulation, (2) challenges for design, and (3) how task and voice-based commands are interwoven in real-life. We discuss implications for the design and research of voice interactions for navigating instructional videos while performing complex tasks.
https://dl.acm.org/doi/abs/10.1145/3491102.3502036
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)