Featured Papers

Showing up to the top 30 papers in each category

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2024.acm.org/)

34
SplitBody: Reducing Mental Workload while Multitasking via Muscle Stimulation
Romain Nith (University of Chicago, Chicago, Illinois, United States); Yun Ho (University of Chicago, Chicago, Illinois, United States); Pedro Lopes (University of Chicago, Chicago, Illinois, United States)
Techniques like electrical muscle stimulation (EMS) offer promise in assisting physical tasks by automating movements, e.g., shaking a spray-can or tapping a button. However, existing actuation systems improve the performance of a task that users are already focusing on (e.g., users are already focused on using the spray-can). Instead, we investigate whether these interactive-actuation systems (e.g., EMS) offer any benefits if they automate a task that happens in the background of the user's focus. Thus, we explored whether automating a repetitive movement via EMS would reduce mental workload while users perform parallel tasks (e.g., focusing on writing an essay while EMS stirs a pot of soup). In our study, participants performed a cognitively-demanding multitask aided by EMS (SplitBody condition) or performed by themselves (baseline). We found that with SplitBody performance increased (35% on both tasks, 18% on the non-EMS-automated task), physical-demand decreased (31%), and mental-workload decreased (26%).
23
Tagnoo: Enabling Smart Room-Scale Environments with RFID-Augmented Plywood
Yuning Su (Simon Fraser University, Burnaby, British Columbia, Canada); Tingyu Zhang (Simon Fraser University, Burnaby, British Columbia, Canada); Jiuen Feng (University of Science and Technology of China, Hefei, Anhui, China); Yonghao Shi (Simon Fraser University, Burnaby, British Columbia, Canada); Xing-Dong Yang (Simon Fraser University, Burnaby, British Columbia, Canada); Te-Yen Wu (Florida State University, Tallahassee, Florida, United States)
Tagnoo is a computational plywood augmented with RFID tags, aimed at empowering woodworkers to effortlessly create room-scale smart environments. Unlike existing solutions, Tagnoo does not necessitate technical expertise or disrupt established woodworking routines. This battery-free and cost-effective solution seamlessly integrates computation capabilities into plywood, while preserving its original appearance and functionality. In this paper, we explore various parameters that can influence Tagnoo's sensing performance and woodworking compatibility through a series of experiments. Additionally, we demonstrate the construction of a small office environment, comprising a desk, chair, shelf, and floor, all crafted by an experienced woodworker using conventional tools such as a table saw and screws while adhering to established construction workflows. Our evaluation confirms that the smart environment can accurately recognize 18 daily objects and user activities, such as a user sitting on the floor or a glass lunchbox placed on the desk, with over 90% accuracy.
22
MOSion: Gaze Guidance with Motion-triggered Visual Cues by Mosaic Patterns
Arisa Kohtani (Tokyo Institute of Technology, Tokyo, Japan); Shio Miyafuji (Tokyo Institute of Technology, Tokyo, Japan); Keishiro Uragaki (Aoyama Gakuin University, Tokyo, Japan); Hidetaka Katsuyama (Tokyo Institute of Technology, Tokyo, Japan); Hideki Koike (Tokyo Institute of Technology, Tokyo, Japan)
We propose a gaze-guiding method called MOSion to adjust the guiding strength reacted to observers’ motion based on a high-speed projector and the afterimage effect in the human vision system. Our method decomposes the target area into mosaic patterns to embed visual cues in the perceived images. The patterns can only direct the attention of the moving observers to the target area. The stopping observer can see the original image with little distortion because of light integration in the visual perception. The pre-computation of the patterns provides the adaptive guiding effect without tracking devices and computational costs depending on the movements. The evaluation and the user study show that the mosaic decomposition enhances the perceived saliency with a few visual artifacts, especially in moving conditions. Our method embedded in white lights works in various situations such as planar posters, advertisements, and curved objects.
20
DirectGPT: A Direct Manipulation Interface to Interact with Large Language Models
Damien Masson (University of Waterloo, Waterloo, Ontario, Canada); Sylvain Malacria (Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, Lille, France); Géry Casiez (Univ. Lille, CNRS, Inria, Centrale Lille, UMR 9189 CRIStAL, Lille, France); Daniel Vogel (University of Waterloo, Waterloo, Ontario, Canada)
We characterize and demonstrate how the principles of direct manipulation can improve interaction with large language models. This includes: continuous representation of generated objects of interest; reuse of prompt syntax in a toolbar of commands; manipulable outputs to compose or control the effect of prompts; and undo mechanisms. This idea is exemplified in DirectGPT, a user interface layer on top of ChatGPT that works by transforming direct manipulation actions to engineered prompts. A study shows participants were 50% faster and relied on 50% fewer and 72% shorter prompts to edit text, code, and vector images compared to baseline ChatGPT. Our work contributes a validated approach to integrate LLMs into traditional software using direct manipulation. Data, code, and demo available at https://osf.io/3wt6s.
17
MAF: Exploring Mobile Acoustic Field for Hand-to-Face Gesture Interactions
Yongjie Yang (University of Pittsburgh, Pittsburgh, Pennsylvania, United States); Tao Chen (University of Pittsburgh, Pittsburgh, Pennsylvania, United States); Yujing Huang (University of Pittsburgh, Pittsburgh, Pennsylvania, United States); Xiuzhen Guo (Zhejiang University, Hangzhou, China); Longfei Shangguan (University of Pittsburgh, Pittsburgh, Pennsylvania, United States)
We present MAF, a novel acoustic sensing approach that leverages the commodity hardware in bone conduction earphones for hand-to-face gesture interactions. Briefly, by shining audio signals with bone conduction earphones, we observe that these signals not only propagate along the surface of the human face but also dissipate into the air, creating an acoustic field that envelops the individual’s head. We conduct benchmark studies to understand how various hand-to-face gestures and human factors influence this acoustic field. Building on the insights gained from these initial studies, we then propose a deep neural network combined with signal preprocessing techniques. This combination empowers MAF to effectively detect, segment, and subsequently recognize a variety of hand-to-face gestures, whether in close contact with the face or above it. Our comprehensive evaluation based on 22 participants demonstrates that MAF achieves an average gesture recognition accuracy of 92% across ten different gestures tailored to users' preferences.
17
Using Low-frequency Sound to Create Non-contact Sensations On and In the Body
Waseem Hassan (University of Copenhagen, Copenhagen, Denmark); Asier Marzo (Universidad Publica de Navarra, Pamplona, Navarre, Spain); Kasper Hornbæk (University of Copenhagen, Copenhagen, Denmark)
This paper proposes a method for generating non-contact sensations using low-frequency sound waves without requiring user instrumentation. This method leverages the fundamental acoustic response of a confined space to produce predictable pressure spatial distributions at low frequencies, called modes. These modes can be used to produce sensations either throughout the body, in localized areas of the body, or within the body. We first validate the location and strength of the modes simulated by acoustic modeling. Next, a perceptual study is conducted to show how different frequencies produce qualitatively different sensations across and within the participants' bodies. The low-frequency sound offers a new way of delivering non-contact sensations throughout the body. The results indicate a high accuracy for predicting sensations at specific body locations.
17
Using the Visual Language of Comics to Alter Sensations in Augmented Reality
Arpit Bhatia (University of Copenhagen, Copenhagen, Denmark); Henning Pohl (Aalborg University, Aalborg, Denmark); Teresa Hirzle (University of Copenhagen, Copenhagen, Denmark); Hasti Seifi (Arizona State University, Tempe, Arizona, United States); Kasper Hornbæk (University of Copenhagen, Copenhagen, Denmark)
Augmented Reality (AR) excels at altering what we see but non-visual sensations are difficult to augment. To augment non-visual sensations in AR, we draw on the visual language of comic books. Synthesizing comic studies, we create a design space describing how to use comic elements (e.g., onomatopoeia) to depict non-visual sensations (e.g., hearing). To demonstrate this design space, we built eight demos, such as speed lines to make a user think they are faster and smell lines to make a scent seem stronger. We evaluate these elements in a qualitative user study (N=20) where participants performed everyday tasks with comic elements added as augmentations. All participants stated feeling a change in perception for at least one sensation, with perceived changes detected by between four participants (touch) and 15 participants (hearing). The elements also had positive effects on emotion and user experience, even when participants did not feel changes in perception.
16
(Un)making AI Magic: A Design Taxonomy
Maria Luce Lupetti (Delft University of Technology, Delft, Netherlands); Dave Murray-Rust (TU Delft, Delft, Zuid Holland, Netherlands)
This paper examines the role that enchantment plays in the design of AI things by constructing a taxonomy of design approaches that increase or decrease the perception of magic and enchantment. We start from the design discourse surrounding recent developments in AI technologies, highlighting specific interaction qualities such as algorithmic uncertainties and errors and articulating relations to the rhetoric of magic and supernatural thinking. Through analyzing and reflecting upon 52 students' design projects from two editions of a Master course in design and AI, we identify seven design principles and unpack the effects of each in terms of enchantment and disenchantment. We conclude by articulating ways in which this taxonomy can be approached and appropriated by design/HCI practitioners, especially to support exploration and reflexivity.
16
CharacterMeet: Supporting Creative Writers' Entire Story Character Construction Processes Through Conversation with LLM-Powered Chatbot Avatars
Hua Xuan Qin (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China); Shan Jin (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China); Ze Gao (Hong Kong University of Science and Technology, Hong Kong, Hong Kong, China); Mingming Fan (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China); Pan Hui (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China)
Support for story character construction is as essential as characters are for stories. Building upon past research on early character construction stages, we explore how conversation with chatbot avatars embodying characters powered by more recent technologies could support the entire character construction process for creative writing. Through a user study (N=14) with creative writers, we examine thinking and usage patterns of CharacterMeet, a prototype system allowing writers to progressively manifest characters through conversation while customizing context, character appearance, voice, and background image. We discover that CharacterMeet facilitates iterative character construction. Specifically, participants, including those with more linear usual approaches, alternated between writing and personalized exploration through visualization of ideas on CharacterMeet while visuals and audio enhanced immersion. Our findings support research on iterative creative processes and the growing potential of personalizable generative AI creativity support tools. We present design implications for leveraging chatbot avatars in the creative writing process.
16
MARingBA: Music-Adaptive Ringtones for Blended Audio Notification Delivery
Alexander Wang (Carnegie Mellon University, Pittsburgh, Pennsylvania, United States); Yi Fei Cheng (Carnegie Mellon University, Pittsburgh, Pennsylvania, United States); David Lindlbauer (Carnegie Mellon University, Pittsburgh, Pennsylvania, United States)
Audio notifications provide users with an efficient way to access information beyond their current focus of attention. Current notification delivery methods, like phone ringtones, are primarily optimized for high noticeability, enhancing situational awareness in some scenarios but causing disruption and annoyance in others. In this work, we build on the observation that music listening is now a commonplace practice and present MARingBA, a novel approach that blends ringtones into background music to modulate their noticeability. We contribute a design space exploration of music-adaptive manipulation parameters, including beat matching, key matching, and timbre modifications, to tailor ringtones to different songs. Through two studies, we demonstrate that MARingBA supports content creators in authoring audio notifications that fit low, medium, and high levels of urgency and noticeability. Additionally, end users prefer music-adaptive audio notifications over conventional delivery methods, such as volume fading.
15
Towards an Eye-Brain-Computer Interface: Combining Gaze with the Stimulus-Preceding Negativity for Target Selections in XR
G S Rajshekar Reddy (University of Colorado Boulder, Boulder, Colorado, United States); Michael J. Proulx (Meta Reality Labs Research, Redmond, Washington, United States); Leanne Hirshfield (University of Colorado, Boulder, Colorado, United States); Anthony Ries (DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland, United States)
Gaze-assisted interaction techniques enable intuitive selections without requiring manual pointing but can result in unintended selections, known as Midas touch. A confirmation trigger eliminates this issue but requires additional physical and conscious user effort. Brain-computer interfaces (BCIs), particularly passive BCIs harnessing anticipatory potentials such as the Stimulus-Preceding Negativity (SPN) - evoked when users anticipate a forthcoming stimulus - present an effortless implicit solution for selection confirmation. Within a VR context, our research uniquely demonstrates that SPN has the potential to decode intent towards the visually focused target. We reinforce the scientific understanding of its mechanism by addressing a confounding factor - we demonstrate that the SPN is driven by the user's intent to select the target, not by the stimulus feedback itself. Furthermore, we examine the effect of familiarly placed targets, finding that SPN may be evoked quicker as users acclimatize to target locations; a key insight for everyday BCIs.
15
Unlocking Understanding: An Investigation of Multimodal Communication in Virtual Reality Collaboration
Ryan Ghamandi (University of Central Florida, Orlando, Florida, United States); Ravi Kiran Kattoju (University of Central Florida, Orlando, Florida, United States); Yahya Hmaiti (University of Central Florida, Orlando, Florida, United States); Mykola Maslych (University of Central Florida, Orlando, Florida, United States); Eugene Matthew Taranta (University of Central Florida, Orlando, Florida, United States); Ryan P. McMahan (University of Central Florida, Orlando, Florida, United States); Joseph LaViola (University of Central Florida, Orlando, Florida, United States)
Communication in collaboration, especially synchronous, remote communication, is crucial to the success of task-specific goals. Insufficient or excessive forms of communication may lead to detrimental effects on task performance while increasing mental fatigue. However, identifying which combinations of communication modalities provide the most efficient transfer of information in collaborative settings will greatly improve collaboration. To investigate this, we developed a remote, synchronous, asymmetric VR collaborative assembly task application, where users play the role of either mentor or mentee, and were exposed to different combinations of three communication modalities: voice, gestures, and gaze. Through task-based experiments with 25 pairs of participants (50 individuals), we evaluated quantitative and qualitative data and found that gaze did not differ significantly from multiple combinations of communication modalities. Our qualitative results indicate that mentees experienced more difficulty and frustration in completing tasks than mentors, with both types of users preferring all three modalities to be present.
15
Cooking With Agents: Designing Context-aware Voice Interaction
Razan Jaber (Stockholm University, Stockholm, Sweden); Sabrina Zhong (University College London, London, United Kingdom); Sanna Kuoppamäki (KTH Royal Institute of Technology, Stockholm, Sweden); Aida Hosseini (KTH Royal Institute of Technology, Stockholm, Sweden); Iona Gessinger (University College Dublin, Dublin, Ireland); Duncan P. Brumby (University College London, London, United Kingdom); Benjamin R. Cowan (University College Dublin, Dublin, Ireland); Donald McMillan (Stockholm University, Stockholm, Sweden)
Voice Agents (VAs) are touted as being able to help users in complex tasks such as cooking and interacting as a conversational partner to provide information and advice while the task is ongoing. Through conversation analysis of 7 cooking sessions with a commercial VA, we identify challenges caused by a lack of contextual awareness leading to irrelevant responses, misinterpretation of requests, and information overload. Informed by this, we evaluated 16 cooking sessions with a wizard-led context-aware VA. We observed more fluent interaction between humans and agents, including more complex requests, explicit grounding within utterances, and complex social responses. We discuss reasons for this, the potential for personalisation, and the division of labour in VA communication and proactivity. Then, we discuss the recent advances in generative models and the VAs interaction challenges. We propose limited context awareness in VAs as a step toward explainable, explorable conversational interfaces.
15
Gaze on the Go: Effect of Spatial Reference Frame on Visual Target Acquisition During Physical Locomotion in Extended Reality
Pavel Manakhov (Aarhus University, Aarhus, Denmark); Ludwig Sidenmark (University of Toronto, Toronto, Ontario, Canada); Ken Pfeuffer (Aarhus University, Aarhus, Denmark); Hans Gellersen (Lancaster University, Lancaster, United Kingdom)
Spatial interaction relies on fast and accurate visual acquisition. In this work, we analyse how visual acquisition and tracking of targets presented in a head-mounted display is affected by the user moving linearly at walking and jogging paces. We study four reference frames in which targets can be presented: Head and World where targets are affixed relative to the head and environment, respectively; HeadDelay where targets are presented in the head coordinate system but follow head movement with a delay, and novel Path where targets remain at fixed distance in front of the user, in the direction of their movement. Results of our study in virtual reality demonstrate that the more stable the target is relative to the environment, the faster and more precise it can be fixated. The results have practical significance as head-mounted displays enable interaction during mobility, and in particular when eye tracking is considered as input.
15
Outplay Your Weaker Self: A Mixed-Methods Study on Gamification to Overcome Procrastination in Academia
Jeanine Kirchner-Krath (Friedrich-Alexander-Universität Erlangen-Nuremberg, Nuremberg, Germany); Manuel Schmidt-Kraepelin (Institute of Applied Informatics and Formal Description Methods, Karlsruhe, Germany); Sofia Schöbel (Information Systems, Osnabrück, Germany); Mathias Ullrich (University of Koblenz, Koblenz, Germany); Ali Sunyaev (Karlsruhe Institute of Technology, Karlsruhe, Germany); Harald F. O. von Korflesch (University of Koblenz, Koblenz, Germany)
Procrastination is the deliberate postponing of tasks knowing that it will have negative consequences in the future. Despite the potentially serious impact on mental and physical health, research has just started to explore the potential of information systems to help students combat procrastination. Specifically, while existing learning systems increasingly employ elements of game design to transform learning into an enjoyable and purposeful adventure, little is known about the effects of gameful approaches to overcome procrastination in academic settings. This study advances knowledge on gamification to counter procrastination by conducting a mixed-methods study among higher education students. Our results shed light on usage patterns and outcomes of gamification on self-efficacy, self-control, and procrastination behaviors. The findings contribute to theory by providing a better understanding of the potential of gamification to tackle procrastination. Practitioners are supported by implications on how to design gamified learning systems to support learners in self-organized work.
14
Volumetric Hybrid Workspaces: Interactions with Objects in Remote and Co-located Telepresence
Andrew Irlitti (University of Melbourne, Melbourne, Australia); Mesut Latifoglu (The University of Melbourne, Melbourne, Australia); Thuong Hoang (Deakin University, Geelong, Australia); Brandon Victor Syiem (Queensland University of Technology, Brisbane, Queensland, Australia); Frank Vetere (The University of Melbourne, Melbourne, Australia)
Volumetric telepresence aims to create a shared space, allowing people in local and remote settings to collaborate seamlessly. Prior telepresence examples typically have asymmetrical designs, with volumetric capture in one location and objects in one format. In this paper, we present a volumetric telepresence mixed reality system that supports real-time, symmetrical, multi-user, partially distributed interactions, using objects in multiple formats, across multiple locations. We align two volumetric environments around a common spatial feature to create a shared workspace for remote and co-located people using objects in three formats: physical, virtual, and volumetric. We conducted a study with 18 participants over 6 sessions, evaluating how telepresence workspaces support spatial coordination and hybrid communication for co-located and remote users undertaking collaborative tasks. Our findings demonstrate the successful integration of remote spaces, effective use of proxemics and deixis to support negotiation, and strategies to manage interactivity in hybrid workspaces.
14
Spatial Gaze Markers: Supporting Effective Task Switching in Augmented Reality
Mathias N. Lystbæk (Aarhus University, Aarhus, Denmark); Ken Pfeuffer (Aarhus University, Aarhus, Denmark); Tobias Langlotz (University of Otago, Dunedin, New Zealand); Jens Emil Sloth Grønbæk (Aarhus University, Aarhus, Denmark); Hans Gellersen (Lancaster University, Lancaster, United Kingdom)
Task switching can occur frequently in daily routines with physical activity. In this paper, we introduce Spatial Gaze Markers, an augmented reality tool to support users in immediately returning to the last point of interest after an attention shift. The tool is task-agnostic, using only eye-tracking information to infer distinct points of visual attention and to mark the corresponding area in the physical environment. We present a user study that evaluates the effectiveness of Spatial Gaze Markers in simulated physical repair and inspection tasks against a no-marker baseline. The results give insights into how Spatial Gaze Markers affect user performance, task load, and experience of users with varying levels of task type and distractions. Our work is relevant to assist physical workers with simple AR techniques and render task switching faster with less effort.
14
TypeDance: Creating Semantic Typographic Logos from Image through Personalized Generation
Shishi Xiao (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China); Liangwei Wang (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China); Xiaojuan Ma (Hong Kong University of Science and Technology, Hong Kong, Hong Kong); Wei Zeng (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong, China)
Semantic typographic logos harmoniously blend typeface and imagery to represent semantic concepts while maintaining legibility. Conventional methods using spatial composition and shape substitution are hindered by the conflicting requirement for achieving seamless spatial fusion between geometrically dissimilar typefaces and semantics. While recent advances made AI generation of semantic typography possible, the end-to-end approaches exclude designer involvement and disregard personalized design. This paper presents TypeDance, an AI-assisted tool incorporating design rationales with the generative model for personalized semantic typographic logo design. It leverages combinable design priors extracted from uploaded image exemplars and supports type-imagery mapping at various structural granularity, achieving diverse aesthetic designs with flexible control. Additionally, we instantiate a comprehensive design workflow in TypeDance, including ideation, selection, generation, evaluation, and iteration. A two-task user evaluation, including imitation and creation, confirmed the usability of TypeDance in design across different usage scenarios.
14
Flicker Augmentations: Rapid Brightness Modulation for Real-World Visual Guidance using Augmented Reality
Jonathan Sutton (University of Copenhagen, Copenhagen, Denmark); Tobias Langlotz (University of Otago, Dunedin, New Zealand); Alexander Plopski (TU Graz, Graz, Austria); Kasper Hornbæk (University of Copenhagen, Copenhagen, Denmark)
Providing attention guidance, such as assisting in search tasks, is a prominent use for Augmented Reality. Typically, this is achieved by graphically overlaying geometrical shapes such as arrows. However, providing visual guidance can cause side effects such as attention tunnelling or scene occlusions, and introduce additional visual clutter. Alternatively, visual guidance can adjust saliency but this comes with different challenges such as hardware requirements and environment dependent parameters. In this work we advocate for using flicker as an alternative for real-world guidance using Augmented Reality. We provide evidence for the effectiveness of flicker from two user studies. The first compared flicker against alternative approaches in a highly controlled setting, demonstrating efficacy (N = 28). The second investigated flicker in a practical task, demonstrating feasibility with higher ecological validity (N = 20). Finally, our discussion highlights the opportunities and challenges when using flicker to provide real-world visual guidance using Augmented Reality.
14
Stick&Slip: Altering Fingerpad Friction via Liquid Coatings
Alex Mazursky (University of Chicago, Chicago, Illinois, United States); Jacob Serfaty (University of Chicago, Chicago, Illinois, United States); Pedro Lopes (University of Chicago, Chicago, Illinois, United States)
We present Stick&Slip, a novel approach that alters friction between the fingerpad & surfaces by depositing liquid droplets that coat the fingerpad. The liquid coating modifies the finger’s coefficient of friction, allowing users to feel surfaces up to ±60% more slippery or sticky. We selected our fluids to rapidly evaporate so that the surface returns to its original friction. Unlike traditional friction-feedback, such as electroadhesion or vibration, our approach: (1) alters friction on a wide range of surfaces and geometries, making it possible to modulate nearly any non-absorbent surface; (2) scales to many objects without requiring instrumenting the target surfaces (e.g., with conductive electrode coatings or vibromotors); and (3) both in/decreases friction via a single device. We identified nine liquids and characterized their practicality by measuring evaporation rates, etc. To illustrate the applicability of our approach, we demonstrate how it enables friction in virtual/mixed-reality or, even, while using everyday objects/tools.
14
GazePointAR: A Context-Aware Multimodal Voice Assistant for Pronoun Disambiguation in Wearable Augmented Reality
Jaewook Lee (University of Washington, Seattle, Washington, United States); Jun Wang (University of Washington, Seattle, Washington, United States); Elizabeth Brown (University of Washington, Seattle, Washington, United States); Liam Chu (University of Washington, Seattle, Washington, United States); Sebastian S. Rodriguez (University of Illinois at Urbana-Champaign, Urbana, Illinois, United States); Jon E. Froehlich (University of Washington, Seattle, Washington, United States)
Voice assistants (VAs) like Siri and Alexa are transforming human-computer interaction; however, they lack awareness of users' spatiotemporal context, resulting in limited performance and unnatural dialogue. We introduce GazePointAR, a fully-functional context-aware VA for wearable augmented reality that leverages eye gaze, pointing gestures, and conversation history to disambiguate speech queries. With GazePointAR, users can ask "what's over there?" or "how do I solve this math problem?" simply by looking and/or pointing. We evaluated GazePointAR in a three-part lab study (N=12): (1) comparing GazePointAR to two commercial systems, (2) examining GazePointAR's pronoun disambiguation across three tasks; (3) and an open-ended phase where participants could suggest and try their own context-sensitive queries. Participants appreciated the naturalness and human-like nature of pronoun-driven queries, although sometimes pronoun use was counter-intuitive. We then iterated on GazePointAR and conducted a first-person diary study examining how GazePointAR performs in-the-wild. We conclude by enumerating limitations and design considerations for future context-aware VAs.
14
Jigsaw: Authoring Immersive Storytelling Experiences with Augmented Reality and Internet of Things
Lei Zhang (University of Michigan, Ann Arbor, Michigan, United States); Daekun Kim (University of Waterloo, Waterloo, Ontario, Canada); Youjean Cho (University of Washington, Seattle, Washington, United States); Ava Robinson (Northwestern University, Evanston, Illinois, United States); Yu Jiang Tham (Snap Inc., Seattle, Washington, United States); Rajan Vaish (Snap Inc., Santa Monica, California, United States); Andrés Monroy-Hernández (Princeton University, Princeton, New Jersey, United States)
Augmented Reality (AR) presents new opportunities for immersive storytelling. However, this immersiveness faces two main hurdles. First, AR's immersive quality is often confined to visual elements, such as pixels on a screen. Second, crafting immersive narratives is complex and generally beyond the reach of amateurs due to the need for advanced technical skills. We introduce Jigsaw, a system that empowers beginners to both experience and craft immersive stories, blending virtual and physical elements. Jigsaw uniquely combines mobile AR with readily available Internet-of-things (IoT) devices. We conducted a qualitative study with 20 participants to assess Jigsaw's effectiveness in both consuming and creating immersive narratives. The results were promising: participants not only successfully created their own immersive stories but also found the playback of three such stories deeply engaging. However, sensory overload emerged as a significant challenge in these experiences. We discuss design trade-offs and considerations for future endeavors in immersive storytelling involving AR and IoT.
14
FocusFlow: 3D Gaze-Depth Interaction in Virtual Reality Leveraging Active Visual Depth Manipulation
Chenyang Zhang (University of Illinois at Urbana-Champaign, Champaign, Illinois, United States)Tiansu Chen (University of Illinois at Urbana-Champaign, Urbana, Illinois, United States)Eric Shaffer (University of Illinois at Urbana-Champaign, Urbana, Illinois, United States)Elahe Soltanaghai (University of Illinois Urbana-Champaign, Urbana, Illinois, United States)
Gaze interaction presents a promising avenue in Virtual Reality (VR) due to its intuitive and efficient user experience. Yet, the depth control inherent in our visual system remains underutilized in current methods. In this study, we introduce FocusFlow, a hands-free interaction method that capitalizes on human visual depth perception within the 3D scenes of Virtual Reality. We first develop a binocular visual depth detection algorithm to understand eye input characteristics. We then propose a layer-based user interface and introduce the concept of "Virtual Window" that offers an intuitive and robust gaze-depth VR interaction, despite the limited accuracy and precision of visual depth at greater distances. Finally, to help novice users actively manipulate their visual depth, we propose two learning strategies that use different visual cues to help users master visual depth control. Our user studies with 24 participants demonstrate the usability of our proposed virtual window concept as a gaze-depth interaction method. In addition, our findings reveal that the user experience can be enhanced through an effective learning process with adaptive visual cues, helping users to develop muscle memory for this brand-new input mechanism. We conclude the paper by discussing potential future research topics of gaze-depth interaction.
14
Look Once to Hear: Target Speech Hearing with Noisy Examples
Bandhav Veluri (University of Washington, Seattle, Washington, United States)Malek Itani (University of Washington, Seattle, Washington, United States)Tuochao Chen (Computer Science and Engineering, Seattle, Washington, United States)Takuya Yoshioka (IEEE, Redmond, Washington, United States)Shyamnath Gollakota (University of Washington, Seattle, Washington, United States)
In crowded settings, the human brain can focus on speech from a target speaker, given prior knowledge of how they sound. We introduce a novel intelligent hearable system that achieves this capability, enabling target speech hearing that ignores all interfering speech and noise and keeps only the target speaker. A naive approach is to require a clean speech example to enroll the target speaker. This, however, is not well aligned with the hearable application domain, since obtaining a clean example is challenging in real-world scenarios, creating a unique user interface problem. We present the first enrollment interface where the wearer looks at the target speaker for a few seconds to capture a single, short, highly noisy, binaural example of the target speaker. This noisy example is used for enrollment and subsequent speech extraction in the presence of interfering speakers and noise. Our system achieves a signal quality improvement of 7.01 dB using less than 5 seconds of noisy enrollment audio and can process 8 ms audio chunks in 6.24 ms on an embedded CPU. Our user studies demonstrate generalization to real-world static and mobile speakers in previously unseen indoor and outdoor multipath environments. Finally, our enrollment interface for noisy examples does not cause performance degradation compared to clean examples, while being convenient and user-friendly. More broadly, this paper takes an important step towards enhancing human auditory perception with artificial intelligence.
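The latency figures quoted in the abstract above can be turned into a real-time factor with a few lines of arithmetic. The sketch below is only an illustration: the two constants come from the abstract, while the function names and the notion of "headroom" are assumptions introduced here, not part of the authors' system.

```python
# Figures quoted in the abstract; everything else here is illustrative.
CHUNK_MS = 8.0      # duration of each audio chunk
PROCESS_MS = 6.24   # reported processing time per chunk on an embedded CPU


def real_time_factor(process_ms: float, chunk_ms: float) -> float:
    """Processing time divided by audio duration; below 1.0 keeps up with real time."""
    return process_ms / chunk_ms


def headroom_ms(process_ms: float, chunk_ms: float) -> float:
    """Spare time left per chunk before the next chunk arrives (hypothetical helper)."""
    return chunk_ms - process_ms


rtf = real_time_factor(PROCESS_MS, CHUNK_MS)
slack = headroom_ms(PROCESS_MS, CHUNK_MS)
print(f"real-time factor: {rtf:.2f}")
print(f"headroom per chunk: {slack:.2f} ms")
```

A real-time factor below 1.0 means each chunk is processed before the next one is due, which is what makes streaming use plausible on the reported hardware.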
13
The Effects of Perceived AI Use On Content Perceptions
Irene Rae (Google, Madison, Wisconsin, United States)
There is a potential future where the content created by a human and an AI are indistinguishable. In this future, if you can't tell the difference, does it matter? We conducted a 3 (Assigned creator: human, human with AI assistance, AI) by 4 (Context: news, travel, health, and jokes) mixed-design experiment where participants evaluated human-written content that was presented as created by a human, a human with AI assistance, or an AI. We found that participants felt more negatively about the content creator and were less satisfied when they thought AI was used, but assigned creator had no effect on content judgments. We also identified five interpretations for how participants thought AI use affected the content creation process. Our work suggests that informing users about AI use may not have the intended effect of helping consumers make content judgments and may instead damage the relationship between creators and followers.
13
Sweating the Details: Emotion Recognition and the Influence of Physical Exertion in Virtual Reality Exergaming
Dominic Potts (University of Bath, Bath, United Kingdom)Zoe Broad (University of Bath, Bath, United Kingdom)Tarini Sehgal (University of Bath, Bath, United Kingdom)Joseph Hartley (University of Bath, Bath, United Kingdom)Eamonn O'Neill (University of Bath, Bath, United Kingdom)Crescent Jicol (University of Bath, Bath, United Kingdom)Christopher Clarke (University of Bath, Bath, United Kingdom)Christof Lutteroth (University of Bath, Bath, United Kingdom)
There is great potential for adapting Virtual Reality (VR) exergames based on a user's affective state. However, physical activity and VR interfere with physiological sensors, making affect recognition challenging. We conducted a study (n=72) in which users experienced four emotion inducing VR exergaming environments (happiness, sadness, stress and calmness) at three different levels of exertion (low, medium, high). We collected physiological measures through pupillometry, electrodermal activity, heart rate, and facial tracking, as well as subjective affect ratings. Our validated virtual environments, data, and analyses are openly available. We found that the level of exertion influences the way affect can be recognised, as well as affect itself. Furthermore, our results highlight the importance of data cleaning to account for environmental and interpersonal factors interfering with physiological measures. The results shed light on the relationships between physiological measures and affective states and inform design choices about sensors and data cleaning approaches for affective VR.
13
Blended Whiteboard: Physicality and Reconfigurability in Remote Mixed Reality Collaboration
Jens Emil Sloth Grønbæk (Aarhus University, Aarhus, Denmark)Juan Sánchez Esquivel (Aarhus University, Aarhus, Denmark)Germán Leiva (Aarhus University, Aarhus, Denmark)Eduardo Velloso (University of Melbourne, Melbourne, Victoria, Australia)Hans Gellersen (Lancaster University, Lancaster, United Kingdom)Ken Pfeuffer (Aarhus University, Aarhus, Denmark)
The whiteboard is essential for collaborative work. To preserve its physicality in remote collaboration, Mixed Reality (MR) can blend real whiteboards across distributed spaces. Going beyond reality, MR can further enable interactions like panning and zooming in a virtually reconfigurable infinite whiteboard. However, this reconfigurability conflicts with the sense of physicality. To address this tension, we introduce Blended Whiteboard, a remote collaborative MR system enabling reconfigurable surface blending across distributed physical whiteboards. Blended Whiteboard supports a unique collaboration style, where users can sketch on their local whiteboards but also reconfigure the blended space to facilitate transitions between loosely and tightly coupled work. We describe design principles inspired by proxemics; supporting users in changing between facing each other and being side-by-side, and switching between navigating the whiteboard synchronously and independently. Our work shows exciting benefits and challenges of combining physicality and reconfigurability in the design of distributed MR whiteboards.
13
PANDALens: Towards AI-Assisted In-Context Writing on OHMD During Travels
Runze Cai (National University of Singapore, Singapore, Singapore)Nuwan Janaka (National University of Singapore, Singapore, Singapore)Yang Chen (National University of Singapore, Singapore, Singapore)Lucia Wang (Massachusetts Institute of Technology, Cambridge, Massachusetts, United States)Shengdong Zhao (National University of Singapore, Singapore, Singapore)Can Liu (City University of Hong Kong, Hong Kong, China)
While effective for recording and sharing experiences, traditional in-context writing tools are relatively passive and unintelligent, serving more as instruments than companions. This reduces primary task (e.g., travel) enjoyment and hinders high-quality writing. Through a formative study and iterative development, we introduce PANDALens, a Proactive AI Narrative Documentation Assistant built on an Optical See-Through Head Mounted Display that supports personalized documentation in everyday activities. PANDALens observes multimodal contextual information from user behaviors and environment to confirm interests and elicit contemplation, and employs Large Language Models to transform such multimodal information into coherent narratives with significantly reduced user effort. A real-world travel scenario comparing PANDALens with a smartphone alternative confirmed its effectiveness in improving writing quality and travel enjoyment while minimizing user effort. Accordingly, we propose design guidelines for AI-assisted in-context writing, highlighting the potential of transforming such tools into intelligent companions.
12
AirPush: A Pneumatic Wearable Haptic Device Providing Multi-Dimensional Force Feedback on a Fingertip
Yuxin Ma (Southern University of Science and Technology, Shenzhen, China)Tianze Xie (Southern University of Science and Technology, Shenzhen, China)Peng Zhang (Southern University of Science and Technology, Shenzhen, China)Hwan Kim (Sungshin Women's University, Seoul, Korea, Republic of)Seungwoo Je (SUSTech, Shenzhen, China)
Finger wearable haptic devices enrich virtual reality experiences by offering haptic feedback corresponding to the virtual environment. However, despite the effectiveness of current finger wearable haptic devices in delivering haptic feedback, many are constrained in their ability to provide force feedback across a diverse range of directions or to sustain it. Therefore, we present AirPush, a finger wearable haptic device capable of generating continuously adjustable force feedback in multiple directions using compressed air. To evaluate its usability, we conducted a technical evaluation and four user studies: (1) we obtained the user's perceptual thresholds of angles under different directions on horizontal and vertical planes, (2) in perception studies, we found that users can identify five different magnitudes of force and eight different motions when using AirPush, and (3) using it in VR applications, we confirmed that users felt a greater sense of realism and immersion when using AirPush than with the HTC VIVE Controller or AirPush with a fixed nozzle.
12
Apple’s Knowledge Navigator: Why Doesn’t that Conversational Agent Exist Yet?
Amanda K. Newendorp (Iowa State University, Ames, Iowa, United States)Mohammadamin Sanaei (Iowa State University, Ames, Iowa, United States)Arthur J. Perron (Iowa State University, Ames, Iowa, United States)Hila Sabouni (Iowa State University, Ames, Iowa, United States)Nikoo Javadpour (Iowa State University, Ames, Iowa, United States)Maddie Sells (Iowa State University, Ames, Iowa, United States)Katherine Nelson (Iowa State University, Ames, Iowa, United States)Michael Dorneich (Iowa State University, Ames, Iowa, United States)Stephen B. Gilbert (Iowa State University, Ames, Iowa, United States)
Apple’s 1987 Knowledge Navigator video contains a vision of a sophisticated digital personal assistant, but the natural human-agent conversational dialog shown does not currently exist. To investigate why, the authors analyzed the video using three theoretical frameworks: the DiCoT framework, the HAT Game Analysis framework, and the Flows of Power framework. These were used to codify the human-agent interactions and classify the agent’s capabilities. While some barriers to creating such agents are technological, other barriers arise from privacy, social and situational factors, trust, and the financial business case. The social roles and asymmetric interactions of the human and agent are discussed in the broader context of HAT research, along with the need for a new term for these agents that does not rely on a human social relationship metaphor. This research offers designers of conversational agents a research roadmap to build more highly capable and trusted non-human teammates.
12
My Voice as a Daily Reminder: Self-Voice Alarm for Daily Goal Achievement
Jieun Kim (Cornell University, Ithaca, New York, United States)Hayeon Song (Sungkyunkwan University, Seoul, Korea, Republic of)
Sticking to daily plans is essential for achieving life goals but challenging in reality. This study presents a self-voice alarm as a novel daily goal reminder. Based on the strong literature on the psychological effects of self-voice, we developed a voice alarm system that reminds users of daily tasks to support their consistent task completion. Over the course of 14 days, participants (N = 63) were asked to complete daily vocabulary tasks when reminded by an alarm (i.e., self-voice vs. other-voice vs. beep sound alarm). The self-voice alarm elicited higher alertness and uncomfortable feelings while fostering more days of task completion and repetition compared to the beep sound alarm. Both self-voice and other-voice alarms increased users’ perceived usefulness of the alarm system. Leveraging both quantitative and qualitative approaches, we provide a practical guideline for designing voice alarm systems that will foster users’ behavioral changes to achieve daily goals.
12
MouseRing: Always-available Touchpad Interaction with IMU Rings
Xiyuan Shen (Tsinghua University, Beijing, China)Chun Yu (Tsinghua University, Beijing, China)Xutong Wang (Tsinghua University, Beijing, China)Chen Liang (Tsinghua University, Beijing, Beijing, China)Haozhan Chen (Tsinghua University, Beijing, China)Yuanchun Shi (Tsinghua University, Beijing, China)
Tracking fine-grained finger movements with IMUs for continuous 2D-cursor control poses significant challenges due to limited sensing capabilities. Our findings suggest that finger-motion patterns and the inherent structure of joints provide beneficial physical knowledge, which led us to enhance motion perception accuracy by integrating physical priors into ML models. We propose MouseRing, a novel ring-shaped IMU device that enables continuous finger-sliding on unmodified physical surfaces like a touchpad. A motion dataset was created using infrared cameras, touchpads, and IMUs. We then identified several useful physical constraints, such as joint co-planarity, rigid constraints, and velocity consistency. These principles help refine the finger-tracking predictions from an RNN model. By incorporating touch state detection as a cursor movement switch, we achieved precise cursor control. In a Fitts’ Law study, MouseRing demonstrated input efficiency comparable to touchpads. In real-world applications, MouseRing ensured robust, efficient input and good usability across various surfaces and body postures.
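For readers unfamiliar with the Fitts' Law study mentioned in the abstract above, the following minimal sketch shows how pointing difficulty and throughput are commonly computed using the Shannon formulation, ID = log2(D / W + 1). It is not the authors' analysis: the function names and the trial values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Shannon formulation of Fitts' law: ID = log2(D / W + 1), in bits."""
    return math.log2(distance / width + 1.0)


def throughput(distance: float, width: float, movement_time_s: float) -> float:
    """Bits per second for a single trial: ID / MT."""
    return index_of_difficulty(distance, width) / movement_time_s


# Hypothetical trial, NOT data from the paper:
# a 210 mm movement to a 30 mm wide target completed in 0.75 s.
id_bits = index_of_difficulty(210.0, 30.0)
tp = throughput(210.0, 30.0, 0.75)
print(f"ID = {id_bits:.2f} bits, throughput = {tp:.2f} bits/s")
```

Comparing throughput across devices on the same set of distance and width conditions is one common way to support a claim such as "input efficiency comparable to touchpads".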
12
Me, My Health, and My Watch: How Children with ADHD Understand Smartwatch Health Data
Elizabeth Ankrah (University of California, Irvine, Irvine, California, United States)Franceli L. Cibrian (Chapman University, Orange, California, United States)Lucas M. Silva (University of California, Irvine, Irvine, California, United States)Arya Tavakoulnia (University of California Irvine, Irvine, California, United States)Jesus Armando Beltran (UCI, Irvine, California, United States)Sabrina Schuck (University of California Irvine, Irvine, California, United States)Kimberley D. Lakes (University of California Riverside, Riverside, California, United States)Gillian R. Hayes (University of California, Irvine, Irvine, California, United States)
Children with ADHD can experience a wide variety of challenges related to self-regulation, which can lead to poor educational, health, and wellness outcomes. Technological interventions, such as mobile and wearable health systems, can support data collection and reflection about health status. However, little is known about how children with ADHD interpret such data. We conducted a deployment study with 10 children, aged 10 to 15, for six weeks, during which they used a smartwatch in their homes. Results from observations and interviews during this study indicate that children with ADHD can interpret their own health data, particularly in the moment. However, as children with ADHD develop more autonomy, smartwatch systems may require alternatives for data reflection that are interpretable and actionable for them. This work contributes to the scholarly discourse around health data visualization, particularly in considering implications for the design of health technologies for children with ADHD.
12
"Waves Push Me to Slumberland": Reducing Pre-Sleep Stress through Spatio-Temporal Tactile Displaying of Music.
Hui Zhang (Hunan University, Changsha, China)Ruixiao Zheng (Hunan University, Changsha, China)Shirao Yang (Hunan University, Changsha, China)Wanyi Wei (Hunan University, Changsha, China)Huafeng Shan (Keeson, Jiaxing, China)Jianwei Zhang (Keeson, Jiaxing, China)
Despite the fact that spatio-temporal patterns of vibration, characterized as rhythmic compositions of tactile content, have exhibited an ability to elicit specific emotional responses and enhance the emotion conveyed by music, limited research has explored their underlying mechanism in regulating emotional states within the pre-sleep context. Aiming to investigate whether synergistic spatio-temporal tactile displaying of music can facilitate relaxation before sleep, we developed 16 vibration patterns and an audio-tactile prototype for presenting an ambient experience in a pre-sleep scenario. The stress-reducing effects were further evaluated and compared via a user experiment. The results showed that the spatio-temporal tactile display of music significantly reduced stress and positively influenced users' emotional states before sleep. Furthermore, our study highlights the therapeutic potential of incorporating quantitative and adjustable spatio-temporal parameters correlated with subjective psychophysical perceptions in the audio-tactile experience for stress management.
12
Waiting Time Perceptions for Faster Count-downs/ups Are More Sensitive Than Slower Ones: Experimental Investigation and Its Application
Takanori Komatsu (Meiji University, Tokyo, Japan)Chenxi Xie (Meiji University, Tokyo, Japan)Seiji Yamada (National Institute of Informatics, Tokyo, Japan)
Countdowns and count-ups are useful displays that explicitly show how long users should wait and the current processing state of a given task. Most countdowns and count-ups change their digit exactly once per second, and most users implicitly assume that they do. However, no studies have investigated how users perceive wait times with these displays, or considered using them, in light of this assumption, to make users perceive the wait as shorter than the time that actually passes. To clarify these issues, we first investigated how users perceive countdowns "from 3/5/10 to 0" and count-ups "from 0 to 3/5/10" with interval lengths from 800 to 1200 msec (Experiment 1). Next, on the basis of the results of Experiment 1, we explored a novel method for presenting countdowns to make users perceive the wait time as being shorter than the actual wait time (Experiment 2) and investigated whether such countdowns can be used in realistic applications (Experiment 3). As a result, we found that countdowns and count-ups that were "from 250 msec shorter to 10% longer" than 3, 5, or 10 sec were perceived as 3, 5, or 10 sec, respectively, and that countdowns "from 5 to 0" (5 sec long) that first displayed extremely short intervals were perceived as being shorter than their actual length. Finally, we confirmed the applicability and effectiveness of such displays in a realistic application. Thus, we strongly argue that these findings could become indispensable knowledge for researchers in this field seeking to reduce users' cognitive load during wait times.
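The tolerance band reported in the abstract above ("from 250 msec shorter to 10% longer") can be made concrete with a small calculation. This sketch is one reading of that sentence, not the authors' analysis: it assumes a countdown "from N to 0" consists of N equal intervals, and the function names are hypothetical.

```python
def perceived_as_nominal_window(nominal_s: float) -> tuple[float, float]:
    """Band of actual durations the abstract says are perceived as the nominal one:
    from 250 msec shorter to 10% longer."""
    return nominal_s - 0.25, nominal_s * 1.10


def total_duration_s(steps: int, interval_ms: float) -> float:
    """Actual length of a count with `steps` digit changes (assumed equal intervals)."""
    return steps * interval_ms / 1000.0


def within_window(steps: int, interval_ms: float) -> bool:
    """True if a count of `steps` digits falls inside the band for `steps` seconds."""
    lower, upper = perceived_as_nominal_window(float(steps))
    return lower <= total_duration_s(steps, interval_ms) <= upper


# Interval lengths spanning the range tested in Experiment 1 (800 to 1200 msec).
for interval_ms in (800, 1000, 1200):
    actual = total_duration_s(5, interval_ms)
    print(f"5 -> 0 at {interval_ms} ms: {actual:.1f} s, in band: {within_window(5, interval_ms)}")
```

Under these assumptions, a five-digit count is read as "5 seconds" for actual lengths of roughly 4.75 to 5.5 seconds, so the extremes of the tested interval range fall outside the band.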
12
How Gaze Visualization Facilitates Initiation of Informal Communication in 3D Virtual Spaces
Junko Ichino (Tokyo City University, Yokohama, Japan)Masahiro Ide (Tokyo City University, Yokohama, Japan)Takehito Yoshiki (TIS Inc., Shinjuku, Tokyo, Japan)Hitomi Yokoyama (Okayama University of Science, Okayama, Japan)Hirotoshi Asano (Kogakuin University, Shinjyuku, Tokyo, Japan)Hideo Miyachi (Tokyo City University, Yokohama, Japan)Daisuke Okabe (Tokyo City University, Yokohama, Kanagawa, Japan)
This study explores how gaze visualization in virtual spaces facilitates the initiation of informal communication. Three styles of gaze cue visualization (arrow, bubbles, and miniature avatar) with two types of gaze behavior (one-sided gaze and joint gaze) were evaluated. 96 participants used either a non-visualized gaze cue or one of the three visualized gaze cues. The results showed that all visualized gaze cues facilitated the initiation of informal communication more effectively than the non-visualized gaze cue. For one-sided gaze, overall, bubbles had more positive effects on the gaze receiver’s behaviors and experiences than the other two visualized gaze cues, although the only statistically significant difference was in the verbal reaction rates. For joint gaze, all three visualized gaze cues had positive effects on the receiver’s behaviors and experiences. The design implications of the gaze visualization and the confederate-based evaluation method contribute to research on informal communication and social virtual reality.
12
Fragmented Moments, Balanced Choices: How Do People Make Use of Their Waiting Time?
Jian Zheng (University of Maryland, College Park, Maryland, United States)Ge Gao (University of Maryland, College Park, Maryland, United States)
Everyone spends some time waiting every day. HCI research has developed tools for boosting productivity while waiting. However, little is known about how people naturally spend their waiting time. We conducted an experience sampling study with 21 working adults who used a mobile app to report their daily waiting time activities over two weeks. The aim of this study is to understand the activities people do while waiting and the effect of situational factors. We found that participants spent about 60% of their waiting time on leisure activities, 20% on productive activities, and 20% on maintenance activities. These choices are sensitive to situational factors, including accessible device, location, and certain routines of the day. Our study complements previous ones by demonstrating that people purpose waiting time for various goals beyond productivity and to maintain work-life balance. Our findings shed light on future empirical research and system design for time management.
12
EyeEcho: Continuous and Low-power Facial Expression Tracking on Glasses
Ke Li (Cornell University, Ithaca, New York, United States)Ruidong Zhang (Cornell University, Ithaca, New York, United States)Siyuan Chen (Cornell University, Ithaca, New York, United States)Boao Chen (Cornell University, Ithaca, New York, United States)Mose Sakashita (Cornell University, Ithaca, New York, United States)Francois Guimbretiere (Cornell University, Ithaca, New York, United States)Cheng Zhang (Cornell, Ithaca, New York, United States)
In this paper, we introduce EyeEcho, a minimally-obtrusive acoustic sensing system designed to enable glasses to continuously monitor facial expressions. It utilizes two pairs of speakers and microphones mounted on glasses, to emit encoded inaudible acoustic signals directed towards the face, capturing subtle skin deformations associated with facial expressions. The reflected signals are processed through a customized machine-learning pipeline to estimate full facial movements. EyeEcho samples at 83.3 Hz with a relatively low power consumption of 167 mW. Our user study involving 12 participants demonstrates that, with just four minutes of training data, EyeEcho achieves highly accurate tracking performance across different real-world scenarios, including sitting, walking, and after remounting the devices. Additionally, a semi-in-the-wild study involving 10 participants further validates EyeEcho's performance in naturalistic scenarios while participants engage in various daily activities. Finally, we showcase EyeEcho's potential to be deployed on a commercial-off-the-shelf (COTS) smartphone, offering real-time facial expression tracking.
11
AI-Assisted Causal Pathway Diagram for Human-Centered Design
Ruican Zhong (Human Centered Design and Engineering, University of Washington, Seattle, Washington, United States)Donghoon Shin (University of Washington, Seattle, Washington, United States)Rosemary Meza (Kaiser Permanente Washington Health Research Institute, Seattle, Washington, United States)Predrag Klasnja (University of Michigan, Ann Arbor, Michigan, United States)Lucas Colusso (Microsoft, Seattle, Washington, United States)Gary Hsieh (University of Washington, Seattle, Washington, United States)
This paper explores the integration of causal pathway diagrams (CPD) into human-centered design (HCD), investigating how these diagrams can enhance the early stages of the design process. A dedicated CPD plugin for the online collaborative whiteboard platform Miro was developed to streamline diagram creation and offer real-time AI-driven guidance. Through a user study with designers (N=20), we found that CPD's branching and its emphasis on causal connections supported both divergent and convergent processes during design. CPD can also facilitate communication among stakeholders. Additionally, we found our plugin significantly reduces designers' cognitive workload and increases their creativity during brainstorming, highlighting the implications of AI-assisted tools in supporting creative work and evidence-based designs.
11
Embodied Tentacle: Mapping Design to Control of Non-Analogous Body Parts with the Human Body
Shuto Takashita (The University of Tokyo, Tokyo, Japan)Ken Arai (The University of Tokyo, Tokyo, Japan)Hiroto Saito (The University of Tokyo, Tokyo, Japan)Michiteru Kitazaki (Toyohashi University of Technology, Toyohashi, Japan)Masahiko Inami (The University of Tokyo, Tokyo, Japan)
Manipulating a non-humanoid body using a mapping approach that translates human body activity into different structural movements enables users to perform tasks that are difficult with their innate bodies. However, a key challenge is how to design an effective mapping to control non-analogous body parts with the human body. To address this challenge, we designed an articulated virtual arm and investigated the effect of mapping methods on a user's manipulation experience. Specifically, we developed an unbranched 12-joint virtual arm with an octopus-like appearance. Using this arm, we conducted a user study to compare the effects of several mapping methods with different arrangements on task performance and subjective evaluations of embodiment and user preference. As a result, we identified three important factors in mapping: "Visual and Configurational Similarity", "Kinematics Suitability for the User", and "Correspondence with Everyday Actions." Based on these findings, we discuss a mapping design for non-humanoid body manipulation.
11
MindfulDiary: Harnessing Large Language Model to Support Psychiatric Patients' Journaling
Taewan Kim (KAIST, Daejeon, Korea, Republic of)Seolyeong Bae (Gwangju Institute of Science and Technology, Gwangju, Korea, Republic of)Hyun AH Kim (NAVER Cloud, Gyeonggi-do, Korea, Republic of)Su-woo Lee (Wonkwang University Hospital, Iksan-si, Korea, Republic of)Hwajung Hong (KAIST, Daejeon, Korea, Republic of)Chanmo Yang (Wonkwang University Hospital, Wonkwang University, Iksan, Jeonbuk, Korea, Republic of)Young-Ho Kim (NAVER AI Lab, Seongnam, Gyeonggi, Korea, Republic of)
Large Language Models (LLMs) offer promising opportunities in mental health domains, although their inherent complexity and low controllability elicit concern regarding their applicability in clinical settings. We present MindfulDiary, an LLM-driven journaling app that helps psychiatric patients document daily experiences through conversation. Designed in collaboration with mental health professionals, MindfulDiary takes a state-based approach to safely comply with the experts' guidelines while carrying on free-form conversations. Through a four-week field study involving 28 patients with major depressive disorder and five psychiatrists, we examined how MindfulDiary facilitates patients' journaling practice and clinical care. The study revealed that MindfulDiary supported patients in consistently enriching their daily records and helped clinicians better empathize with their patients through an understanding of their thoughts and daily contexts. Drawing on these findings, we discuss the implications of leveraging LLMs in the mental health domain, bridging the technical feasibility and their integration into clinical settings.
11
VeeR: Exploring the Feasibility of Deliberately Designing VR Motion that Diverges from Mundane, Everyday Physical Motion to Create More Entertaining VR Experiences
Pin Chun Lu (National Taiwan University, Taipei, Taiwan)Che Wei Wang (National Taiwan University, Taipei, Taiwan)Yu Lun Hsu (National Taiwan University, Taipei, Taiwan)Alvaro Lopez (National Taiwan University, Taipei, Taiwan)Ching-Yi Tsai (National Taiwan University, Taipei, Taiwan)Chiao-Ju Chang (National Taiwan University, Taipei, Taiwan)Wei Tian Mireille Tan (University of Illinois Urbana-Champaign, Champaign, Illinois, United States)Li-Chun Lu (National Taiwan University, Taipei, Taiwan)Mike Y. Chen (National Taiwan University, Taipei, Taiwan)
This paper explores the feasibility of deliberately designing VR motion that diverges from users’ physical movements to turn mundane, everyday transportation motion (e.g., metros, trains, and cars) into more entertaining VR motion experiences, in contrast to prior car-based VR approaches that synchronize VR motion to physical car movement exactly. To gain insight into users’ preferences for veering rate and veering direction for turning (left/right) and pitching (up/down) during the three phases of acceleration (accelerating, cruising, and decelerating), we conducted a formative, perceptual study (n=24) followed by a VR experience evaluation (n=18), all conducted on metro trains moving in a mundane, straight-line motion. Results showed that participants preferred relatively high veering rates, and preferred pitching upward during acceleration and downward during deceleration. Furthermore, while veering decreased comfort as expected, it significantly enhanced immersion (p<.01) and entertainment (p<.001) and the overall experience, with comfort being considered, was preferred by 89% of participants.
11
Listening to the Voices: Describing Ethical Caveats of Conversational User Interfaces According to Experts and Frequent Users
Thomas Mildner (University of Bremen, Bremen, Germany)Orla Cooney (University College Dublin, Dublin, Ireland)Anna-Maria Meck (BMW Group, Munich, Germany)Marion Bartl (University College Dublin, Dublin, Ireland)Gian-Luca Savino (University of St. Gallen, St. Gallen, Switzerland)Philip R. Doyle (HMD Research, Dublin, Ireland)Diego Garaialde (University College Dublin, Dublin, Ireland)Leigh Clark (Bold Insight, UK, London, United Kingdom)John Sloan (University College Dublin, Dublin, Ireland)Nina Wenig (University of Bremen, Bremen, Germany)Rainer Malaka (University of Bremen, Bremen, Germany)Jasmin Niess (University of Oslo, Oslo, Norway)
Advances in natural language processing and understanding have led to a rapid growth in the popularity of conversational user interfaces (CUIs). While CUIs introduce novel benefits, they also yield risks that may exploit people's trust. Although research looking at unethical design deployed through graphical user interfaces (GUIs) established a thorough taxonomy of so-called dark patterns, there is a need for an equally in-depth understanding in the context of CUIs. Addressing this gap, we interviewed 27 participants from three cohorts: researchers, practitioners, and frequent users of CUIs. Applying thematic analysis, we develop five themes reflecting each cohort's insights about ethical design challenges and introduce the CUI Expectation Cycle, bridging system capabilities and user expectations while respecting each theme's ethical caveats. This research aims to inform future work to consider ethical constraints while adopting a human-centred approach.
11
Understanding Choice Independence and Error Types in Human-AI Collaboration
Alexander Erlei (University of Goettingen, Goettingen, Germany)Abhinav Sharma (Indian Institute of Information Technology Guwahati, Guwahati, Assam, India)Ujwal Gadiraju (Delft University of Technology, Delft, Netherlands)
The ability to make appropriate delegation decisions is an important prerequisite of effective human-AI collaboration. Recent work, however, has shown that people struggle to evaluate AI systems in the presence of forecasting errors, falling well short of relying on AI systems appropriately. We use a pre-registered crowdsourcing study (N=611) to extend this literature by two underexplored crucial features of human-AI decision-making: choice independence and error type. Subjects in our study repeatedly complete two prediction tasks and choose which predictions they want to delegate to an AI system. For one task, subjects receive a decision heuristic that allows them to make informed and relatively accurate predictions. The second task is substantially harder to solve, and subjects must come up with their own decision rule. We systematically vary the AI system's performance such that it either provides the best possible prediction for both tasks or only for one of the two. Our results demonstrate that people systematically violate choice independence by taking the AI's performance in an unrelated second task into account. Humans who delegate predictions to a superior AI in their own expertise domain significantly reduce appropriate reliance when the model makes systematic errors in a complementary expertise domain. In contrast, humans who delegate predictions to a superior AI in a complementary expertise domain significantly increase appropriate reliance when the model systematically errs in the human expertise domain. Furthermore, we show that humans differentiate between error types and that this effect is conditional on the considered expertise domain. This is the first empirical exploration of choice independence and error types in the context of human-AI collaboration. Our results have broad and important implications for the future design, deployment, and appropriate application of AI systems.
11
Metaphors in Voice User Interfaces: A Slippery Fish
Smit Desai (University of Illinois, Urbana-Champaign, Champaign, Illinois, United States)Michael Bernard Twidale (University of Illinois at Urbana-Champaign, Urbana, Illinois, United States)
We explore a range of different metaphors used for Voice User Interfaces (VUIs) by designers, end-users, manufacturers, and researchers using a novel framework derived from semi-structured interviews and a literature review. We focus less on the well-established idea of metaphors as a way for interface designers to help novice users learn how to interact with novel technology, and more on other ways metaphors can be used. We find that metaphors people use are contextually fluid, can change with the mode of conversation, and can reveal differences in how people perceive VUIs compared to other devices. Not all metaphors are helpful, and some may be offensive. Analyzing this broader class of metaphors can help understand, and perhaps even predict, problems. Metaphor analysis can be a low-cost tool to inspire design creativity and facilitate complex discussions about sociotechnical issues, enabling us to spot potential opportunities and problems in the situated use of technologies.
11
Elastica: Adaptive Live Augmented Presentations with Elastic Mappings Across Modalities
Yining Cao (University of California, San Diego, San Diego, California, United States)Rubaiat Habib Kazi (Adobe Research, Seattle, Washington, United States)Li-Yi Wei (Adobe Research, San Jose, California, United States)Deepali Aneja (Adobe Research, Seattle, Washington, United States)Haijun Xia (University of California, San Diego, San Diego, California, United States)
Augmented presentations offer compelling storytelling by combining speech content, gestural performance, and animated graphics in a congruent manner. The expressiveness of these presentations stems from the harmonious coordination of spoken words and graphic elements, complemented by smooth animations aligned with the presenter's gestures. However, achieving such desired congruence in a live presentation poses significant challenges due to the unpredictability and imprecision inherent in presenters' real-time actions. Existing methods either leveraged rigid mapping without predefined states or required the presenters to conform to predefined animations. We introduce adaptive presentations that dynamically adjust predefined graphic animations to real-time speech and gestures. Our approach leverages script following and motion warping to establish elastic mappings that generate runtime graphic parameters coordinating speech, gesture, and predefined animation state. Our evaluation demonstrated that the proposed adaptive presentation can effectively mitigate undesired visual artifacts caused by performance deviations and enhance the expressiveness of resulting presentations.
11
RELIC: Investigating Large Language Model Responses using Self-Consistency
Furui Cheng (ETH Zürich, Zürich, Switzerland)Vilém Zouhar (ETH Zurich, Zurich, Switzerland)Simran Arora (Stanford University, Stanford, California, United States)Mrinmaya Sachan (ETH Zurich, Zurich, Switzerland)Hendrik Strobelt (IBM Research AI, Cambridge, Massachusetts, United States)Mennatallah El-Assady (ETH Zürich, Zürich, Switzerland)
Large Language Models (LLMs) are notorious for blending fact with fiction and generating non-factual content, known as hallucinations. To address this challenge, we propose an interactive system that helps users gain insight into the reliability of the generated text. Our approach is based on the idea that the self-consistency of multiple samples generated by the same LLM relates to its confidence in individual claims in the generated texts. Using this idea, we design RELIC, an interactive system that enables users to investigate and verify semantic-level variations in multiple long-form responses. This allows users to recognize potentially inaccurate information in the generated text and make necessary corrections. From a user study with ten participants, we demonstrate that our approach helps users better verify the reliability of the generated text. We further summarize the design implications and lessons learned from this research for future studies of reliable human-LLM interactions.
11
Augmented Reality Cues Facilitate Task Resumption after Interruptions in Computer-Based and Physical Tasks
Kilian L. Bahnsen (Julius-Maximilians-Universität Würzburg, Würzburg, Germany)Lucas Tiemann (Julius-Maximilians-Universität Würzburg, Würzburg, Germany)Lucas Plabst (Julius-Maximilians-University Würzburg, Würzburg, Germany)Tobias Grundgeiger (Julius-Maximilians-Universität Würzburg, Würzburg, Germany)
Many work domains include numerous interruptions, which can contribute to errors. We investigated the potential of augmented reality (AR) cues to facilitate primary task resumption after interruptions of varying lengths. Experiment 1 (N = 83) involved a computer-based primary task with a red AR arrow at the to-be-resumed task step, which was placed via a gesture by the participants or automatically. Compared to no cue, both cues significantly reduced the resumption lag (i.e., the time between the end of the interruption and the resumption of the primary task) following long but not short interruptions. Experiment 2 (N = 38) involved a tangible sorting task, utilizing only the automatic cue. The AR cue facilitated task resumption compared to no cue after both short and long interruptions. We demonstrate the potential of AR cues in mitigating the negative effects of interruptions and make suggestions for integrating AR technologies for task resumption.
11
Talaria: Interactively Optimizing Machine Learning Models for Efficient Inference
Fred Hohman (Apple, Seattle, Washington, United States)Chaoqun Wang (Apple, Beijing, China)Jinmook Lee (Apple, Cupertino, California, United States)Jochen Görtler (Independent Researcher, Walldorf, Germany)Dominik Moritz (Apple, Pittsburgh, Pennsylvania, United States)Jeffrey P. Bigham (Apple, Pittsburgh, Pennsylvania, United States)Zhile Ren (Apple, Seattle, Washington, United States)Cecile Foret (Apple, Cupertino, California, United States)Qi Shan (Apple Inc, Seattle, Washington, United States)Xiaoyi Zhang (Apple Inc, Seattle, Washington, United States)
On-device machine learning (ML) moves computation from the cloud to personal devices, protecting user privacy and enabling intelligent user experiences. However, fitting models on devices with limited resources presents a major technical challenge: practitioners need to optimize models and balance hardware metrics such as model size, latency, and power. To help practitioners create efficient ML models, we designed and developed Talaria: a model visualization and optimization system. Talaria enables practitioners to compile models to hardware, interactively visualize model statistics, and simulate optimizations to test the impact on inference metrics. Since its internal deployment two years ago, we have evaluated Talaria using three methodologies: (1) a log analysis highlighting its growth of 800+ practitioners submitting 3,600+ models; (2) a usability survey with 26 users assessing the utility of 20 Talaria features; and (3) a qualitative interview with the 7 most active users about their experience using Talaria.
11
A Living Framework for Understanding Cooperative Games
Pedro Pais (LASIGE, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal, Lisboa, Portugal)David Gonçalves (Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal)Daniel Reis (Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal)João Cadete Nunes Godinho (LASIGE, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal, Lisboa, Portugal)João Filipe Morais (LASIGE, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal, Lisboa, Portugal)Manuel Piçarra (Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal)Pedro Trindade (LASIGE, Faculdade de Ciências, Universidade de Lisboa, Lisbon, Portugal, Lisbon, Portugal)Dmitry Alexandrovsky (KIT, Karlsruhe, Germany)Kathrin Gerling (KIT, Karlsruhe, Germany)João Guerreiro (Universidade de Lisboa, Lisbon, Portugal)André Rodrigues (Universidade de Lisboa, Lisboa, Portugal)
Playing cooperative games is recognised as a positive social activity. Yet, we have limited means to rigorously define or communicate the structures that govern these experiences, hindering attempts at consolidating knowledge and limiting the potential of design efforts. In this work, we introduce the Living Framework for Cooperative Games (LFCG), a framework derived from a multi-step systematic analysis of 129 cooperative games with contributions of eleven researchers. We describe how LFCG can be used as a tool for analyses and ideation, and as a shared language for describing a game’s design. LFCG is published as a web application to facilitate use and appropriation. It supports the creation, dissemination and aggregation of game reports and specifications; and enables stakeholders to extend and publish custom versions. Lastly, we discuss using a research-driven approach for formalising game structures and the advantages of community contributions for consolidation and reach.