The ability to monitor audience reactions is critical when delivering presentations. However, current videoconferencing platforms offer limited solutions to support this. This work leverages recent advances in affect sensing to capture and facilitate communication of relevant audience signals. Using an exploratory survey (N=175), we assessed the most relevant audience responses such as confusion, engagement, and head-nods. We then implemented AffectiveSpotlight, a Microsoft Teams bot that analyzes facial responses and head gestures of audience members and dynamically spotlights the most expressive ones. In a within-subjects study with 14 groups (N=117), we observed that the system made presenters significantly more aware of their audience, speak for a longer period of time, and self-assess the quality of their talk more similarly to the audience members, compared to two control conditions (randomly-selected spotlight and default platform UI). We provide design recommendations for future affective interfaces for online presentations based on feedback from the study.
https://doi.org/10.1145/3411764.3445235
There are many situations where using personal devices is not socially acceptable, or where nearby people present a privacy risk. For these situations, we explore the concept of hidden interaction techniques through two prototype applications. HiddenHaptics allows users to receive information through vibrotactile cues on a smartphone, and HideWrite allows users to write text messages by drawing on a dimmed smartwatch screen. We conducted three user studies to investigate whether, and how, these techniques can be used without being exposed. Our primary findings are (1) users can effectively hide their interactions while attending to a social situation, (2) users seek to interact when another person is speaking, and they also tend to hide the interaction using their body or furniture, and (3) users can sufficiently focus on the social situation despite their interaction, whereas non-users feel that observing the user hinders their ability to focus on the social activity.
https://doi.org/10.1145/3411764.3445504
Speech is now common in daily interactions with our devices, thanks to voice user interfaces (VUIs) like Alexa. Despite their seeming ubiquity, designs often do not match users’ expectations. Science fiction, which is known to influence design of new technologies, has included VUIs for decades. Star Trek: The Next Generation is a prime example of how people envisioned ideal VUIs. Understanding how current VUIs live up to Star Trek’s utopian technologies reveals mismatches between current designs and user expectations, as informed by popular fiction. Combining conversational analysis and VUI user analysis, we study voice interactions with the Enterprise’s computer and compare them to current interactions. Independent of futuristic computing power, we find key design-based differences: Star Trek interactions are brief and functional, not conversational, they are highly multimodal and context-driven, and there is often no spoken computer response. From this, we suggest paths to better align VUIs with user expectations.
https://doi.org/10.1145/3411764.3445640
Voice assistants are fundamentally changing the way we access information. However, voice assistants still leverage little about the web beyond simple search results. We introduce Firefox Voice, a novel voice assistant built on the open web ecosystem with an aim to expand access to information available via voice. Firefox Voice is a browser extension that enables users to use their voice to perform actions such as setting timers, navigating the web, and reading a webpage’s content aloud. Through an iterative development process and use by over 12,000 active users, we find that users see voice as a way to accomplish certain browsing tasks efficiently, but struggle with discovering functionality and frequently discontinue use. We conclude by describing how Firefox Voice enables the development of novel, open web-powered voice-driven experiences.
https://doi.org/10.1145/3411764.3445409
Silent speech input converts non-acoustic features like tongue and lip movements into text. It has been demonstrated as a promising input method on mobile devices and has been explored for a variety of audiences and contexts where the acoustic signal is unavailable (e.g., people with speech disorders) or unreliable (e.g., noisy environment). Though the method shows promise, very little is known about people's perceptions regarding using it. In this work, first, we conduct two user studies to explore users' attitudes towards the method with a particular focus on social acceptance and error tolerance. Results show that people perceive silent speech as more socially acceptable than speech input and are willing to tolerate more errors with it to uphold privacy and security. We then conduct a third study to identify a suitable method for providing real-time feedback on silent speech input. Results show users find an abstract feedback method effective and significantly more private and secure than a commonly used video feedback method.
https://doi.org/10.1145/3411764.3445430
Video-conferencing is essential for many companies, but its limitations in conveying social cues can lead to ineffective meetings. We present MeetingCoach, an intelligent post-meeting feedback dashboard that summarizes contextual and behavioral meeting information. Through an exploratory survey (N=120), we identified important signals (e.g., turn taking, sentiment) and used these insights to create a wireframe dashboard. The design was evaluated with in situ participants (N=16) who helped identify the components they would prefer in a post-meeting dashboard. After recording video-conferencing meetings of eight teams over four weeks, we developed an AI system to quantify the meeting features and created personalized dashboards for each participant. Through interviews and surveys (N=23), we found that reviewing the dashboard helped improve attendees' awareness of meeting dynamics, with implications for improved effectiveness and inclusivity. Based on our findings, we provide suggestions for future feedback system designs of video-conferencing meetings.
https://doi.org/10.1145/3411764.3445615
Virtual environments (VEs) can create collaborative and social spaces, which are increasingly important in the face of remote work and travel reduction. Recent advances, such as more open and widely available platforms, create new possibilities to observe and analyse interaction in VEs. Using a custom instrumented build of Mozilla Hubs to measure position and orientation, we conducted an academic workshop to facilitate a range of typical workshop activities. We analysed social interactions during a keynote, small group breakouts, and informal networking/hallway conversations. Our mixed-methods approach combined environment logging, observations, and semi-structured interviews. The results demonstrate how small and large spaces influenced group formation, shared attention, and personal space, where smaller rooms facilitated more cohesive groups while larger rooms made small group formation challenging but personal space more flexible. Beyond our findings, we show how the combination of data and insights can fuel collaborative spaces' design and deliver more effective virtual workshops.
https://doi.org/10.1145/3411764.3445729
We present a dialogue elicitation study to assess how users envision conversations with a perfect voice assistant (VA). In an online survey, N=205 participants were prompted with everyday scenarios, and wrote the lines of both user and VA in dialogues that they imagined as perfect. We analysed the dialogues with text analytics and qualitative analysis, including number of words and turns, social aspects of conversation, implied VA capabilities, and the influence of user personality. The majority envisioned dialogues with a VA that is interactive and not purely functional; it is smart, proactive, and has knowledge about the user. Attitudes diverged regarding the assistant's role as well as its expression of humour and opinions. An exploratory analysis suggested a relationship with personality for these aspects, but correlations were low overall. We discuss implications for research and design of future VAs, underlining the vision of enabling conversational UIs, rather than single-command "Q&As".
https://doi.org/10.1145/3411764.3445536
The use of virtual reality (VR) to simulate confrontational human behaviour has significant potential for use in training, where the recreation of uncomfortable feelings may help users to prepare for challenging real-life situations. In this paper we present a user study (n=68) in which participants experienced simulated confrontational behaviour performed by a virtual character either in immersive VR, or on a 2D display. Participants reported a higher elevation in anxiety in VR, which correlated positively with a perceived sense of physical space. Character believability was influenced negatively by visual elements of the simulation, and positively by behavioural elements, which complements findings from previous work. We recommend the use of VR for simulations of confrontational behaviour, where a realistic emotional response is part of the intended experience. We also discuss incorporation of domain knowledge of human behaviours, and carefully crafted motion-captured sequences, to increase users' sense of believability.
https://doi.org/10.1145/3411764.3445401
Bringing positive experiences to users is one of the key goals when designing conversational agents (CAs). Yet we still lack an understanding of users’ underlying needs to achieve positive experiences and how to support them in design. This research first applies Self-Determination Theory in an interview study to explore how users’ needs of competence, autonomy and relatedness could be supported or undermined in CA experiences. Ten guidelines are then derived from the interview findings. The key findings demonstrate that: competence is affected by users’ knowledge of the CA capabilities and effectiveness of the conversation; autonomy is influenced by flexibility of the conversation, personalisation of the experiences, and control over user data; regarding relatedness, users still have concerns over integrating social features into CAs. The guidelines recommend how to inform users about the system capabilities, design effective and socially appropriate conversations, and support increased system intelligence, customisation, and data transparency.
https://doi.org/10.1145/3411764.3445445
Although social support is important for health and well-being, many young people are hesitant to reach out for support. The emerging uptake of chatbots for social and emotional purposes entails opportunities and concerns regarding non-human agents as sources of social support. To explore this, we invited 16 participants (16–21 years) to use and reflect on chatbots as sources of social support. Our participants first interacted with a chatbot for mental health (Woebot) for two weeks. Next, they participated in individual in-depth interviews. As part of the interview session, they were presented with a chatbot prototype providing information to young people. Two months later, the participants reported on their continued use of Woebot. Our findings provide in-depth knowledge about how young people may experience various types of social support—appraisal, informational, emotional, and instrumental support—from chatbots. We summarize implications for theory, practice, and future research.
https://doi.org/10.1145/3411764.3445318