Voice, Conversation and Design

Conference name
CHI 2022
Unpacking Practitioners’ Attitudes Towards Codifications of Design Knowledge for Voice User Interfaces
Abstract

Recent HCI research has sought to develop guidelines (‘heuristics’, ‘best practices’, ‘principles’ and so on) for voice user interfaces (VUI) to aid both practitioners and researchers in improving the quality of VUI-based design. However, limited research is available on how such design knowledge is conceptualised and used by industry practitioners. We present a small interview-based study conducted with 9 experienced VUI industry practitioners. Their concerns range from terminological challenges associated with VUI design knowledge, the role of codifications of such knowledge like design guidelines alongside their practical design work, through to their views on the value of ‘harmonisation’ of VUI design knowledge. Given the complex, albeit preliminary, picture that emerges, we argue for HCI’s deeper consideration of how design knowledge meshes with the contingencies of practice, so that VUI design knowledge, such as design guidelines developed in HCI, delivers the most potential value for industry practice.

Authors
Krishika Haresh Khemani
University of Nottingham, Nottingham, United Kingdom
Stuart Reeves
University of Nottingham, Nottingham, Nottinghamshire, United Kingdom
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3517623

Video
Designing for Speech Practice Systems: How Do User-Controlled Voice Manipulation and Model Speakers Impact Self-Perceptions of Voice?
Abstract

Can you speak the way you desire without feeling the pressure to conform to standards of speaking? In this study, we investigated the impact of user-controlled voice manipulation and listening to recordings of model speakers on self-perceptions of voice and speech. Quantitative analysis showed that there was a significant improvement in the perceived confidence of tone by listening to model speakers, but there were no significant improvements due to voice manipulation. Qualitative analysis of interviews revealed that participants responded positively to the visual and auditory feedback provided by the voice manipulation software. The participants also evaluated the quality of model speakers to decide whether or not they wanted to refer to them for speech practice. Based on the results of these analyses, we summarized the design implications for a speech practice system that would allow further investigation of the impact of the system on self-perceptions of speech performance.

Authors
Lisa Orii
University of Washington, Seattle, Washington, United States
Nami Ogawa
The University of Tokyo, Tokyo, Japan
Yuji Hatada
The University of Tokyo, Tokyo, Japan
Takuji Narumi
The University of Tokyo, Tokyo, Japan
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3502093

Video
Great Chain of Agents: The Role of Metaphorical Representation of Agents in Conversational Crowdsourcing
Abstract

Conversational agents are being widely adopted across several domains to serve a variety of purposes ranging from providing intelligent assistance to companionship. Recent literature has shown that users develop intuitive folk theories and a metaphorical understanding of conversational agents (CAs) due to the lack of a mental model of the agents. However, investigation of metaphorical agent representation in the HCI community has mainly focused on the human level, despite non-human metaphors for agents being prevalent in the real world. We adopted Lakoff and Turner's ‘Great Chain of Being’ framework to systematically investigate the impact of using non-human metaphors to represent conversational agents on worker engagement in crowdsourcing marketplaces. We designed a text-based conversational agent that assists crowd workers in task execution. Through a between-subjects experimental study (N=341), we explored how different human and non-human metaphors affect worker engagement, the perceived cognitive load of workers, intrinsic motivation, and their trust in the agents. Our findings bridge the gap of how users experience CAs with non-human metaphors in the context of conversational crowdsourcing.

Authors
Ji-Youn Jung
Delft University of Technology, Delft, Netherlands
Sihang Qiu
Delft University of Technology, Delft, Netherlands
Alessandro Bozzon
Delft University of Technology, Delft, Netherlands
Ujwal Gadiraju
Delft University of Technology, Delft, Netherlands
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3517653

Video
"Rewind to the Jiggling Meat Part": Understanding Voice Control of Instructional Videos in Everyday Tasks
Abstract

Voice interaction has long been envisioned as a way to make physical interaction hands-free, such as allowing fine-grained control of instructional videos without physically disengaging from the task at hand. While significant engineering advances have brought us closer to this ideal, we do not fully understand the user requirements for voice interactions that should be supported in such contexts. This paper presents an ecologically valid Wizard-of-Oz elicitation study exploring realistic user requirements for an ideal instructional video playback control while cooking. Through the analysis of the issued commands and performed actions during this non-linear and complex task, we identify (1) patterns of command formulation, (2) challenges for design, and (3) how task and voice-based commands are interwoven in real life. We discuss implications for the design and research of voice interactions for navigating instructional videos while performing complex tasks.

Authors
Yaxi Zhao
University of Toronto, Toronto, Ontario, Canada
Razan Jaber
Stockholm University, Stockholm, Sweden
Donald McMillan
Stockholm University, Stockholm, Sweden
Cosmin Munteanu
University of Toronto Mississauga, Mississauga, Ontario, Canada
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3502036

Video
"A Voice that Suits the Situation": Understanding the Needs and Challenges for Supporting End-User Voice Customization
Abstract

Although there is a potential demand for customizing voices, most customization is limited to the visual appearance of a figure (e.g., avatars). To better understand the users' needs, we first conducted an online survey with 104 participants. Then we conducted semi-structured interviews with 14 participants, using a prototype, to identify design considerations for supporting voice customization. The results show that there is a desire for voice customization, especially for non-face-to-face conversations with someone unfamiliar. In addition, the findings revealed that different voices are favored for different contexts, from a better version of one's own voice for improving delivery to a completely different voice for securing identity. As future work, we plan to extend this study by investigating voice synthesis techniques for end-users who wish to design their own voices for various contexts.

Authors
Hyeon Jeong Byeon
Ewha Womans University, Seoul, Korea, Republic of
Chaerin Lee
Ewha Womans University, Seoul, Korea, Republic of
Jeemin Lee
Ewha Womans University, Seoul, Korea, Republic of
Uran Oh
Ewha Womans University, Seoul, Korea, Republic of
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3501856

Video