Advances in emerging technologies, such as on-body mechanical actuators and electrical muscle stimulation, have allowed computers to take control over our bodies. This presents opportunities as well as challenges, raising fundamental questions about agency and the role of our body when interacting with technology. To advance this research field as a whole, we brought together expert perspectives in a week-long seminar to articulate the grand challenges in designing computers’ control over our bodies. These grand challenges span technical, design, user, and ethical aspects. By articulating them, we aim to initiate a research agenda that positions bodily control not only as a technical feature but as a central, experiential, and ethical concern for future human–computer interaction endeavors.
Unlike visual and auditory media, physical sensations are difficult to create and capture, limiting the availability of diverse haptic content.
Converting common media formats like video into haptics offers a promising solution, but existing video-to-haptics methods depend on specific video characteristics, such as camera motion or predefined actions, and rely on spatial haptic hardware (e.g., a motion chair or haptic vest).
We introduce HapticLens, an interactive method for creating haptics from video, supported by an open-source GUI and two vision algorithms.
Our method works with arbitrary video content, detects subtle motion, and requires only a single vibrotactile actuator.
We evaluate HapticLens through technical experiments and a study with 22 participants.
Results demonstrate that HapticLens supports interactive vibration design, with designers reporting high satisfaction with its usability and with the overall quality and relevance of the resulting haptic signals. This work broadens the accessibility of video-driven haptics, offering a practical method to create and experience tactile content.
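As a minimal sketch of the motion-driven, single-actuator idea described above, per-frame motion energy can be mapped to a vibration amplitude envelope. The abstract does not specify HapticLens's two vision algorithms, so the dense optical flow (Farneback) used here is an assumed stand-in, not the authors' method.

```python
# Sketch: map per-frame optical-flow magnitude to a vibration envelope
# for a single vibrotactile actuator. Illustrative assumption, not HapticLens.
import cv2
import numpy as np

def motion_to_vibration_envelope(video_path: str) -> np.ndarray:
    """Return one vibration amplitude in [0, 1] per video frame."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        raise ValueError("could not read video")
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    amplitudes = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        # Mean flow magnitude as a crude per-frame "motion energy".
        amplitudes.append(np.linalg.norm(flow, axis=2).mean())
        prev_gray = gray
    cap.release()
    env = np.asarray(amplitudes)
    return env / (env.max() + 1e-8)  # normalized amplitude envelope
```

The resulting envelope could then modulate a fixed-frequency carrier (e.g., a sine burst) driving the single actuator.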
Haptic feedback contributes to immersive virtual reality (VR) experiences. However, designing such feedback at scale for all objects within a VR scene remains time-consuming. We present Scene2Hap, an LLM-centered system that automatically designs object-level vibrotactile feedback for entire VR scenes based on the objects' semantic attributes and physical context. Scene2Hap employs a multimodal large language model to estimate each object’s semantics and physical context, including its material properties and vibration behavior, from multimodal information in the VR scene. These estimated attributes are then used to generate or retrieve audio signals, subsequently converted into plausible vibrotactile signals. For more realistic spatial haptic rendering, Scene2Hap estimates vibration propagation and attenuation from vibration sources to neighboring objects, considering the estimated material properties and spatial relationships of virtual objects in the scene. Three user studies confirm that Scene2Hap successfully estimates the vibration-related semantics and physical context of VR scenes and produces realistic vibrotactile signals.
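To make the propagation step above concrete, the sketch below applies a distance- and material-dependent attenuation to a source vibration before rendering it on a neighboring object. The exponential decay model and the material coefficients are illustrative assumptions, not Scene2Hap's actual formulation.

```python
# Sketch: attenuate a vibrotactile signal as it propagates from a source
# object to a neighbor, using an assumed exponential decay per material.
import numpy as np

# Hypothetical per-material damping coefficients (higher = faster decay).
DAMPING = {"metal": 0.5, "wood": 1.5, "fabric": 4.0}

def attenuate(source_signal: np.ndarray, distance_m: float, material: str) -> np.ndarray:
    """Scale a source vibration for an object at the given distance."""
    gain = np.exp(-DAMPING.get(material, 2.0) * distance_m)
    return source_signal * gain

# Example: a 180 Hz knock on a metal table, felt 0.4 m away on the same surface.
knock = np.sin(2 * np.pi * 180 * np.linspace(0, 0.05, 2205))
felt = attenuate(knock, distance_m=0.4, material="metal")
```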
Force cues contribute significantly to the perception of material properties. Building on this principle, we propose a novel method that represents material properties using a pseudo-attraction force. Our method generates this perceived force in response to user motion, with a compact interface that produces asymmetric vibrations. This force sensation induces a perceived weight shift (pseudo-weight shift), creating the perception of internal dynamics to convey the physical presence of a virtual object. System evaluations confirmed that the method produces a sensation equivalent to the inertial force of a 17.7 g mass. Furthermore, a large-scale user study and a psychophysical experiment revealed that our method enables parametric control of perceived material properties, particularly viscosity, by modulating the vibration profile. This approach demonstrates that perceived force effectively substitutes for physical force, enabling vivid material representation through a compact interface and simple design. This expands the design space for expressive handheld haptics.
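Asymmetric vibration is commonly realized as a brief, strong pulse in one direction followed by a longer, weaker return, which the skin integrates into a net pulling sensation. The sketch below generates such a waveform; the pulse shape, frequency, and asymmetry ratio are illustrative assumptions rather than the paper's actual vibration profile.

```python
# Sketch: one cycle of an asymmetric drive signal for pseudo-force rendering.
# Parameters are illustrative, not the published profile.
import numpy as np

def asymmetric_cycle(fs: int = 8000, freq: float = 75.0, asymmetry: float = 0.25) -> np.ndarray:
    """Short strong positive phase, long weak negative phase, zero-mean overall."""
    n = int(fs / freq)
    n_strong = max(1, int(n * asymmetry))
    strong = np.hanning(n_strong)                                    # sharp positive pulse
    weak = -np.hanning(n - n_strong) * (n_strong / (n - n_strong))   # slow return
    cycle = np.concatenate([strong, weak])
    return cycle - cycle.mean()                                      # keep actuator drift-free

signal = np.tile(asymmetric_cycle(), 75)  # ~1 s of pseudo-force toward one side
```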
People who are blind or have low vision regularly use their hands to interact with the physical world to gain access to objects' shape, size, weight, and texture. However, many rich visual features remain inaccessible through touch alone, making it difficult to distinguish similar objects, interpret visual affordances, and form a complete understanding of objects. In this work, we present TouchScribe, a system that augments hand-object interactions with automated live visual descriptions. We trained a custom egocentric hand-interaction model to recognize both common gestures (e.g., grab to inspect, hold side-by-side to compare) and gestures unique to blind people (e.g., point to explore color, or swipe to read available text). Furthermore, TouchScribe provides real-time, adaptive feedback based on hand movement, ranging from hand-interaction states to object labels and visual details. Our user study and technical evaluations demonstrate that TouchScribe can provide rich and useful descriptions to support object understanding. Finally, we discuss the implications of making live visual descriptions responsive to users' physical reach.
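One way to read the gesture-to-description mapping above is as a simple dispatch from recognized hand-interaction events to feedback content. The gesture names below follow the abstract; the feedback policy itself is a hypothetical stand-in for TouchScribe's actual behavior.

```python
# Sketch: dispatch recognized gestures to the kind of description spoken.
# The mapping is an illustrative assumption, not TouchScribe's real policy.
from enum import Enum, auto

class Gesture(Enum):
    GRAB = auto()               # inspect an object
    HOLD_SIDE_BY_SIDE = auto()  # compare two objects
    POINT = auto()              # explore color at the fingertip
    SWIPE = auto()              # read visible text

FEEDBACK = {
    Gesture.GRAB: "object label, then shape and visual details",
    Gesture.HOLD_SIDE_BY_SIDE: "differences between the two held objects",
    Gesture.POINT: "color under the fingertip",
    Gesture.SWIPE: "any text on the touched surface",
}

def describe(gesture: Gesture) -> str:
    return FEEDBACK.get(gesture, "brief object label only")
```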
Smart rings can serve as a wearable platform for off-device control of nearby devices. However, their thin bands leave only a small touch surface on the ring, limiting the expressivity and accuracy of touch input. To address this, we investigate the effectiveness of ring shapes that use beveled surfaces, in addition to the flat outer surface, as input surfaces. The distinct angles of the three surfaces constrain touches to the intended surface, improving accuracy. A first study showed that with band widths of 6 mm or less, flat rings achieved low accuracy (63.8%) in distinguishing between left and right edge touches, whereas beveled and rounded rings achieved high accuracy (90.0% and 93.3%, respectively). In a second study of nine touch gestures on 6-mm rings, beveled rings outperformed flat rings, achieving 92.9% (sighted) and 91.8% (eyes-free) accuracy with a false positive rate (FPR) of at most 1%. Through these investigations, we identify beveled surfaces as a promising approach to expressive, precise touch input on smart rings.
Environmental sounds like footsteps, keyboard typing, or a dog barking carry rich information and emotional context, making them valuable for designing haptics in user applications. Existing audio-to-vibration methods, however, rely on signal-processing rules tuned for music or games and often fail to generalize across diverse sounds. To address this, we first investigated user perception of four existing audio-to-haptic algorithms and then created a data-driven model for environmental sounds. In Study 1, 34 participants rated vibrations generated by the four algorithms for 1,000 sounds, revealing no consistent algorithm preferences. Using this dataset, we trained Sound2Hap, a CNN-based autoencoder, to generate perceptually meaningful vibrations from diverse sounds with low latency. In Study 2, 15 participants rated its output higher than signal-processing baselines on both audio-vibration match and the Haptic Experience Index (HXI), finding it more harmonious with diverse sounds. This work demonstrates a perceptually validated approach to audio-haptic translation, broadening the reach of sound-driven haptics.
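For readers unfamiliar with the architecture class, the sketch below shows a small 1-D convolutional autoencoder that maps a chunk of audio waveform to a vibrotactile drive signal, in the spirit of Sound2Hap. The layer sizes, strides, and single-channel output are illustrative assumptions, not the published model.

```python
# Sketch: a 1-D conv autoencoder from audio waveform to vibration waveform.
# Architecture details are assumed for illustration, not Sound2Hap's own.
import torch
import torch.nn as nn

class AudioToVibration(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4, padding=4), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(32, 16, kernel_size=8, stride=4, padding=2), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, kernel_size=8, stride=4, padding=2),
            nn.Tanh(),  # actuator drive signal in [-1, 1]
        )

    def forward(self, audio: torch.Tensor) -> torch.Tensor:
        # audio: (batch, 1, samples); output length matches when samples % 16 == 0
        return self.decoder(self.encoder(audio))

model = AudioToVibration()
vibration = model(torch.randn(2, 1, 4096))  # ~0.1 s chunks at 44.1 kHz
```

Processing short, fixed-length chunks like this is one way to keep latency low enough for interactive playback.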