Meetings, Chats, and Speech

https://doi.org/10.1145/3411764.3445504

There are many situations where using personal devices is not socially acceptable, or where nearby people present a privacy risk. For these situations, we explore the concept of hidden interaction techniques through two prototype applications. HiddenHaptics allows users to receive information through vibrotactile cues on a smartphone, and HideWrite allows users to write text messages by drawing on a dimmed smartwatch screen. We conducted three user studies to investigate whether, and how, these techniques can be used without being exposed. Our primary findings are (1) users can effectively hide their interactions while attending to a social situation, (2) users seek to interact when another person is speaking, and they also tend to hide the interaction using their body or furniture, and (3) users can sufficiently focus on the social situation despite their interaction, whereas non-users feel that observing the user hinders their ability to focus on the social activity.

LMU Munich, Munich, Germany

Wellesley College, Wellesley, Massachusetts, United States

Bundeswehr University Munich, Munich, Germany

LMU Munich, Munich, Germany

https://doi.org/10.1145/3411764.3445640

Speech is now common in daily interactions with our devices, thanks to voice user interfaces (VUIs) like Alexa. Despite their seeming ubiquity, designs often do not match users’ expectations. Science fiction, which is known to influence design of new technologies, has included VUIs for decades. Star Trek: The Next Generation is a prime example of how people envisioned ideal VUIs. Understanding how current VUIs live up to Star Trek’s utopian technologies reveals mismatches between current designs and user expectations, as informed by popular fiction. Combining conversational analysis and VUI user analysis, we study voice interactions with the Enterprise’s computer and compare them to current interactions. Independent of futuristic computing power, we find key design-based differences: Star Trek interactions are brief and functional, not conversational; they are highly multimodal and context-driven; and there is often no spoken computer response. From this, we suggest paths to better align VUIs with user expectations.

University of Toronto, Toronto, Ontario, Canada

University of Toronto Mississauga, Mississauga, Ontario, Canada

https://doi.org/10.1145/3411764.3445409

Voice assistants are fundamentally changing the way we access information. However, voice assistants still leverage little about the web beyond simple search results. We introduce Firefox Voice, a novel voice assistant built on the open web ecosystem with an aim to expand access to information available via voice. Firefox Voice is a browser extension that enables users to use their voice to perform actions such as setting timers, navigating the web, and reading a webpage’s content aloud. Through an iterative development process and use by over 12,000 active users, we find that users see voice as a way to accomplish certain browsing tasks efficiently, but struggle with discovering functionality and frequently discontinue use. We conclude by describing how Firefox Voice enables the development of novel, open web-powered voice-driven experiences.

Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

University of Tennessee, Knoxville, Knoxville, Tennessee, United States

University of Central Florida, Orlando, Florida, United States

Mozilla, Mountain View, California, United States

Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

Mozilla, Mountain View, California, United States

https://doi.org/10.1145/3411764.3445430

Silent speech input converts non-acoustic features like tongue and lip movements into text. It has been demonstrated as a promising input method on mobile devices and has been explored for a variety of audiences and contexts where the acoustic signal is unavailable (e.g., people with speech disorders) or unreliable (e.g., noisy environment). Though the method shows promise, very little is known about people's perceptions regarding using it. In this work, first, we conduct two user studies to explore users' attitudes towards the method with a particular focus on social acceptance and error tolerance. Results show that people perceive silent speech as more socially acceptable than speech input and are willing to tolerate more errors with it to uphold privacy and security. We then conduct a third study to identify a suitable method for providing real-time feedback on silent speech input. Results show users find an abstract feedback method effective and significantly more private and secure than a commonly used video feedback method.

University of California, Merced, Merced, California, United States

University of British Columbia, Kelowna, British Columbia, Canada

University of California, Merced, Merced, California, United States

https://doi.org/10.1145/3411764.3445615

Video-conferencing is essential for many companies, but its limitations in conveying social cues can lead to ineffective meetings. We present MeetingCoach, an intelligent post-meeting feedback dashboard that summarizes contextual and behavioral meeting information. Through an exploratory survey (N=120), we identified important signals (e.g., turn taking, sentiment) and used these insights to create a wireframe dashboard. The design was evaluated with in situ participants (N=16) who helped identify the components they would prefer in a post-meeting dashboard. After recording video-conferencing meetings of eight teams over four weeks, we developed an AI system to quantify the meeting features and created personalized dashboards for each participant. Through interviews and surveys (N=23), we found that reviewing the dashboard helped improve attendees' awareness of meeting dynamics, with implications for improved effectiveness and inclusivity. Based on our findings, we provide suggestions for future feedback system designs of video-conferencing meetings.

University of Rochester, Rochester, New York, United States

Microsoft, Seattle, Washington, United States

Microsoft, Redmond, Washington, United States

Microsoft Research, Redmond, Washington, United States

Microsoft Research, Cambridge, Massachusetts, United States

Microsoft Research, Cambridge, United Kingdom

Microsoft Research, Barcelona, Spain

Microsoft Research, Redmond, Washington, United States

https://doi.org/10.1145/3411764.3445729

Virtual environments (VEs) can create collaborative and social spaces, which are increasingly important in the face of remote work and travel reduction. Recent advances, such as more open and widely available platforms, create new possibilities to observe and analyse interaction in VEs. Using a custom instrumented build of Mozilla Hubs to measure position and orientation, we conducted an academic workshop to facilitate a range of typical workshop activities. We analysed social interactions during a keynote, small group breakouts, and informal networking/hallway conversations. Our mixed-methods approach combined environment logging, observations, and semi-structured interviews. The results demonstrate how small and large spaces influenced group formation, shared attention, and personal space, where smaller rooms facilitated more cohesive groups while larger rooms made small group formation challenging but personal space more flexible. Beyond our findings, we show how the combination of data and insights can fuel collaborative spaces' design and deliver more effective virtual workshops.

University of Glasgow, Glasgow, United Kingdom

Centrum Wiskunde & Informatica, Amsterdam, Netherlands

BBC Research & Development, London, United Kingdom

CWI, Amsterdam, Netherlands

https://doi.org/10.1145/3411764.3445536

We present a dialogue elicitation study to assess how users envision conversations with a perfect voice assistant (VA). In an online survey, N=205 participants were prompted with everyday scenarios, and wrote the lines of both user and VA in dialogues that they imagined as perfect. We analysed the dialogues with text analytics and qualitative analysis, including number of words and turns, social aspects of conversation, implied VA capabilities, and the influence of user personality. The majority envisioned dialogues with a VA that is interactive and not purely functional; it is smart, proactive, and has knowledge about the user. Attitudes diverged regarding the assistant's role as well as its expression of humour and opinions. An exploratory analysis suggested a relationship with personality for these aspects, but correlations were low overall. We discuss implications for research and design of future VAs, underlining the vision of enabling conversational UIs, rather than single command "Q&As".

LMU Munich, Munich, Germany

University of Bayreuth, Bayreuth, Germany

LMU Munich, Munich, Germany

University College Dublin, Dublin, Ireland

LMU Munich, Munich, Germany

https://doi.org/10.1145/3411764.3445401

The use of virtual reality (VR) to simulate confrontational human behaviour has significant potential for use in training, where the recreation of uncomfortable feelings may help users to prepare for challenging real-life situations. In this paper we present a user study (n=68) in which participants experienced simulated confrontational behaviour performed by a virtual character either in immersive VR, or on a 2D display. Participants reported a higher elevation in anxiety in VR, which correlated positively with a perceived sense of physical space. Character believability was influenced negatively by visual elements of the simulation, and positively by behavioural elements, which complements findings from previous work. We recommend the use of VR for simulations of confrontational behaviour, where a realistic emotional response is part of the intended experience. We also discuss incorporation of domain knowledge of human behaviours, and carefully crafted motion-captured sequences, to increase users' sense of believability.

University of Lincoln, Lincoln, United Kingdom

University of Lincoln, Lincoln, Lincolnshire, United Kingdom

Lincoln University, Lincoln, Lincolnshire, United Kingdom

University of Lincoln, Lincoln, United Kingdom

KU Leuven, Leuven, Belgium

University of Lincoln, Lincoln, Lincolnshire, United Kingdom

University of Lincoln, Lincoln, United Kingdom

University of the West of Scotland, Glasgow, United Kingdom

https://doi.org/10.1145/3411764.3445445

Bringing positive experiences to users is one of the key goals when designing conversational agents (CAs). Yet we still lack an understanding of users’ underlying needs to achieve positive experiences and how to support them in design. This research first applies Self-Determination Theory in an interview study to explore how users’ needs of competence, autonomy and relatedness could be supported or undermined in CA experiences. Ten guidelines are then derived from the interview findings. The key findings demonstrate that: competence is affected by users’ knowledge of the CA capabilities and effectiveness of the conversation; autonomy is influenced by flexibility of the conversation, personalisation of the experiences, and control over user data; regarding relatedness, users still have concerns over integrating social features into CAs. The guidelines recommend how to inform users about the system capabilities, design effective and socially appropriate conversations, and support increased system intelligence, customisation, and data transparency.

Imperial College London, London, United Kingdom
