KeySense: LLM-Powered Hands-Down, Ten-Finger Typing on Commodity Touchscreens
Description

Existing touchscreen software keyboards prevent users from resting their hands, forcing slow and fatiguing index-finger tapping (“chicken typing”) instead of familiar hands-down ten-finger typing. We present KeySense, a purely software solution that preserves physical-keyboard motor skills. KeySense isolates intentional taps from resting-finger noise using cognitive–motor timing patterns, and then uses a fine-tuned LLM decoder to turn the resulting noisy letter sequence into the intended word. In controlled component tests, this decoder substantially outperforms two statistical baselines (top-1 accuracy 84.8% vs 75.7% and 79.3%). A 12-participant study shows clear ergonomic and performance benefits: compared with a conventional hover-style keyboard, users rated KeySense as markedly less physically demanding (NASA-TLX median 1.5 vs 4.0) and, after brief practice, typed significantly faster (28.3 vs 26.2 WPM, p < 0.01). These results indicate that KeySense enables accurate, efficient, and comfortable ten-finger text entry on commodity touchscreens without any extra hardware.
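
Below is a minimal Python sketch of the two-stage pipeline the abstract describes: a timing-based filter that discards long resting-finger contacts, followed by a word decoder over the surviving taps. The dwell threshold, the toy vocabulary, and the string-similarity decoder (a stand-in for the paper's fine-tuned LLM) are illustrative assumptions, not details from the paper.

import difflib

DWELL_MS = 150  # assumed: contacts held longer than this are resting fingers

def filter_intentional_taps(events):
    """Keep short, discrete contacts; drop long resting-finger touches.

    events: list of (char, down_ms, up_ms) touch records.
    """
    return [c for c, down, up in events if (up - down) < DWELL_MS]

def decode_word(noisy_letters, vocabulary):
    """Map a noisy letter sequence to the closest vocabulary word.

    A stand-in for the fine-tuned LLM decoder: rank candidates by
    string similarity to the filtered tap sequence.
    """
    noisy = "".join(noisy_letters)
    return max(vocabulary,
               key=lambda w: difflib.SequenceMatcher(None, noisy, w).ratio())

# Usage: the long 'j' contact from a resting finger is filtered out.
events = [("h", 0, 60), ("j", 10, 900), ("e", 120, 180),
          ("l", 250, 310), ("l", 400, 455), ("o", 560, 620)]
taps = filter_intentional_taps(events)                 # ['h', 'e', 'l', 'l', 'o']
print(decode_word(taps, ["hello", "help", "hollow"]))  # hello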

DancingBox: A Lightweight MoCap System for Character Animation from Physical Proxies
Description

Creating compelling 3D character animations typically requires either expert use of professional software or expensive motion capture systems operated by skilled actors. We present DancingBox, a lightweight, vision-based system that makes motion capture accessible to novices by reimagining the process as digital puppetry. Instead of tracking precise human motions, DancingBox captures the approximate movements of everyday objects manipulated by users with a single webcam. These coarse proxy motions are then refined into realistic character animations by conditioning a generative motion model on bounding-box representations, enriched with human motion priors learned from large-scale datasets. To overcome the lack of paired proxy–animation data, we synthesize training pairs by converting existing motion capture sequences into proxy representations. A user study demonstrates that DancingBox enables intuitive and creative character animation using diverse proxies, from plush toys to bananas, lowering the barrier to entry for novice animators.
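
The following is a minimal Python sketch of the paired-data synthesis step the abstract describes: converting an existing motion-capture sequence into a coarse bounding-box proxy representation, yielding one (proxy, animation) training pair. The array shapes and per-frame axis-aligned box layout are illustrative assumptions.

import numpy as np

def motion_to_proxy_boxes(joints):
    """Reduce full-body motion to per-frame axis-aligned bounding boxes.

    joints: (T, J, 3) array of 3D joint positions over T frames.
    returns: (T, 6) array of [min_xyz, max_xyz] boxes -- the coarse proxy
    signal a generative motion model would be conditioned on.
    """
    lo = joints.min(axis=1)   # (T, 3) per-frame minimum corner
    hi = joints.max(axis=1)   # (T, 3) per-frame maximum corner
    return np.concatenate([lo, hi], axis=1)

# Usage: a 120-frame clip with 24 joints becomes a sequence of 120 boxes,
# forming one (proxy, animation) training pair.
mocap = np.random.rand(120, 24, 3)
boxes = motion_to_proxy_boxes(mocap)
print(boxes.shape)  # (120, 6)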

ActivitySeeker: Towards Collaborative Personalized Human Activity Discovery and Recognition on Smartphones
Description

Smartphones provide an attractive yet challenging platform for human activity recognition (HAR). They are ubiquitous, but they also limit the input of HAR systems to a single IMU. These systems are further challenged by the inherent diversity of human activities and by varying phone placement on the user's body. As a result, traditional smartphone HAR systems offer limited personalization or impose a high user burden. We propose ActivitySeeker, a personalized smartphone HAR system that combines self-supervised activity discovery with low-burden user interaction to collaboratively label IMU data and adapt HAR models to individual users on-device through transfer learning. We evaluated ActivitySeeker through simulated online learning and in-the-wild user experiments, where it discovered 95.5% of personal activity types and achieved high recognition accuracy (93.3%) while maintaining a positive user experience. By leveraging the synergy between user and smartphone, ActivitySeeker opens up new possibilities for HAR-based applications such as fitness, health, and personalized recommendation.
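
Here is a minimal Python sketch of the discover-then-ask loop the abstract implies, assuming scikit-learn is available: windowed IMU features are clustered into candidate activities, and one representative window per cluster is surfaced to the user for labeling. The feature representation, cluster count, and k-means clustering are illustrative assumptions; the paper's self-supervised encoder is not reproduced.

import numpy as np
from sklearn.cluster import KMeans

def discover_activities(imu_windows, n_clusters=5):
    """Cluster IMU windows into candidate personal activities.

    imu_windows: (N, D) array of per-window features.
    returns: cluster labels and the index of each cluster's most central
    window, to be labeled by the user with minimal burden.
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = km.fit_predict(imu_windows)
    reps = [int(np.argmin(np.linalg.norm(imu_windows - c, axis=1)))
            for c in km.cluster_centers_]
    return labels, reps

# Usage: ask "what were you doing here?" only for the few representative
# windows, then adapt the on-device HAR model with the propagated labels.
windows = np.random.rand(200, 32)
labels, reps = discover_activities(windows)
print(len(set(labels)), reps)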

HiSync: Spatio-Temporally Aligning Hand Motion from Wearable IMU and On-Robot Camera for Command Source Identification in Long-Range HRI
Description

Long-range Human-Robot Interaction (HRI) remains underexplored. Within it, Command Source Identification (CSI) – determining who issued a command – is especially challenging due to multi-user and distance-induced sensor ambiguity. We introduce HiSync, an optical-inertial fusion framework that treats hand motion as a binding cue by aligning optical flow from a robot-mounted camera with signals from a hand-worn IMU. We first elicit a user-defined (N=12) gesture set and collect a multimodal command-gesture dataset (N=38) in long-range, multi-user HRI scenarios. HiSync then extracts frequency-domain hand-motion features from both camera and IMU data, and a learned CSINet denoises IMU readings, temporally aligns the modalities, and performs distance-aware multi-window fusion to compute the cross-modal similarity of subtle, natural gestures, enabling robust CSI. In three-person scenes at distances up to 34 m, HiSync achieves 92.32% CSI accuracy, outperforming the prior SOTA by 48.44%. HiSync is also validated in a real-robot deployment. By making CSI reliable and natural, HiSync provides a practical primitive and design guidance for public-space HRI.
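
Below is a minimal Python sketch of cross-modal command-source matching in the spirit of the abstract: frequency-domain hand-motion signatures from the camera's optical flow are compared against each candidate's wrist IMU, and the command is attributed to the best match. The plain FFT features, window length, and cosine similarity are illustrative assumptions; the paper's learned CSINet denoising, temporal alignment, and multi-window fusion are not reproduced.

import numpy as np

def motion_spectrum(signal, n_bins=32):
    """Magnitude spectrum of a 1D motion-energy signal, L2-normalized."""
    spec = np.abs(np.fft.rfft(signal - signal.mean()))[:n_bins]
    return spec / (np.linalg.norm(spec) + 1e-8)

def identify_source(flow_energy, imu_energies):
    """Return the index of the IMU stream most similar to the camera cue.

    flow_energy: (T,) optical-flow magnitude near the gesturing hand.
    imu_energies: list of (T,) acceleration-magnitude streams, one per user.
    """
    cam = motion_spectrum(flow_energy)
    sims = [float(cam @ motion_spectrum(s)) for s in imu_energies]
    return int(np.argmax(sims)), sims

# Usage: user 1's 2 Hz wave matches the camera's 2 Hz flow signature.
t = np.linspace(0, 4, 200)
flow = np.sin(2 * np.pi * 2 * t)
imus = [np.sin(2 * np.pi * 1 * t),
        np.sin(2 * np.pi * 2 * t) + 0.1 * np.random.randn(200)]
print(identify_source(flow, imus))  # (1, [...])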

Modelling Visuo-Haptic Perception Change in Size Estimation Tasks
Description

Tangible interactions involve multiple sensory cues, enabling the accurate perception of object properties such as size. Research has shown, however, that if we decouple these cues (for example, by altering the visual cue), the resulting discrepancies present new opportunities for interaction. Perception over time, though, relies not only on momentary sensory cues but also on a priori beliefs about the object, implying a continual update cycle. This cycle is poorly understood, and its impact on interaction remains unknown. We study (N=80) visuo-haptic perception of size over time and (a) reveal how perception drifts, (b) examine the effects of visual priming and dead reckoning, and (c) present a model of visuo-haptic perception as a cyclical, self-adjusting system. Our work has a direct impact on illusory perception in VR, but it also sheds light on how our visual and haptic systems cooperate and diverge.
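
The following is a minimal Python sketch of perception as a cyclical, self-adjusting system, as the abstract characterizes it: on each trial, the size belief is updated by reliability-weighted visual and haptic cues, so the estimate drifts when the cues are decoupled. The cue weights and update rate are illustrative assumptions, not the paper's fitted parameters.

def update_belief(belief, visual_cue, haptic_cue,
                  w_vision=0.5, w_haptics=0.3, rate=0.4):
    """One cycle: fuse cues by reliability, nudge the prior toward them."""
    w_prior = 1.0 - w_vision - w_haptics
    fused = w_prior * belief + w_vision * visual_cue + w_haptics * haptic_cue
    return belief + rate * (fused - belief)

# Usage: the visual cue says 12 cm, the haptic cue says 10 cm; the belief
# drifts from its 10 cm prior toward a vision-dominated compromise.
belief = 10.0
for trial in range(10):
    belief = update_belief(belief, visual_cue=12.0, haptic_cue=10.0)
print(round(belief, 2))  # approaches 11.25 under these assumed weights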

Steering through a Dynamically Varying Path
Description

Existing research on steering often focuses on static paths. In real-world systems, however, such as lasso-selection interfaces and video games, there is no guarantee that a path will maintain its initial width and distance during steering. We thus explore steering movements on a dynamically varying path that widens or narrows. To this end, we empirically studied the impact of several task parameters – including the changed path width, path-occupancy duration, and position – on steering through the dynamically varying path. We found that steering movements and performance differed from those in the previous static-path steering task in terms of movement time and error rate. Additionally, we tested various previously extended models and found poor fits (mean $R^2_{adjusted}=0.53$) compared with our presented models (mean $R^2_{adjusted}=0.92$). We believe our model will assist GUI and game designers in evaluating their applications.
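
For context, here is a minimal Python sketch of a steering-law prediction for a path whose width varies along its length, using the classic Accot–Zhai integral form $ID = \int ds / W(s)$ with $MT = a + b \cdot ID$. This is the standard formulation, not necessarily the extended model presented in the paper; the linear width profile and the a, b coefficients are illustrative assumptions.

import numpy as np

def steering_id(path_xy, widths):
    """Integrate ds / W(s) along a polyline with per-vertex widths."""
    seg = np.diff(path_xy, axis=0)
    ds = np.linalg.norm(seg, axis=1)          # segment lengths
    w_mid = 0.5 * (widths[:-1] + widths[1:])  # width at segment midpoints
    return float(np.sum(ds / w_mid))

def predict_mt(path_xy, widths, a=0.1, b=0.08):
    """Steering law MT = a + b * ID, with illustrative a, b (seconds)."""
    return a + b * steering_id(path_xy, widths)

# Usage: a straight 400 px tunnel narrowing from 40 px to 20 px yields a
# larger ID (and longer predicted MT) than one that stays 40 px wide.
x = np.linspace(0, 400, 101)
path = np.stack([x, np.zeros_like(x)], axis=1)
print(predict_mt(path, np.linspace(40, 20, 101)))  # narrowing path
print(predict_mt(path, np.full(101, 40.0)))        # static path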

Log2Motion: Biomechanical Motion Synthesis from Touch Logs
Description

Touch data from mobile devices are collected at scale but reveal little about the interactions that produce them. While biomechanical simulations can illuminate motor-control processes, they have not yet been developed for touch interactions. To close this gap, we propose a novel computational problem: synthesizing plausible motion directly from logs. Our key insight is a reinforcement-learning-driven musculoskeletal forward simulation that generates biomechanically plausible motion sequences consistent with the events recorded in touch logs. We achieve this by integrating a software emulator into a physics simulator, allowing biomechanical models to manipulate real applications in real time. Log2Motion produces rich syntheses of user movements from touch logs, including estimates of motion, speed, accuracy, and effort. We assess the plausibility of the generated movements by comparing against human data from a motion-capture study and prior findings, and we demonstrate Log2Motion on a large-scale dataset. Biomechanical motion synthesis provides a new way to understand log data, illuminating the ergonomics and motor control underlying touch interactions.
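
Below is a minimal Python sketch of the kind of objective an RL-driven musculoskeletal simulation could optimize to stay consistent with a touch log: each rollout's simulated touch-down events are scored on how closely they match the logged positions and timestamps. The Gaussian tolerances and in-order event pairing are illustrative assumptions, not the paper's reward design.

import numpy as np

def log_consistency_reward(sim_touches, logged_touches,
                           sigma_pos=20.0, sigma_t=0.15):
    """Gaussian agreement between simulated and logged touch events.

    Each touch is (x_px, y_px, t_s); events are compared in log order.
    """
    reward = 0.0
    for (sx, sy, st), (lx, ly, lt) in zip(sim_touches, logged_touches):
        pos_err = np.hypot(sx - lx, sy - ly)           # pixel distance
        reward += np.exp(-(pos_err / sigma_pos) ** 2) * \
                  np.exp(-((st - lt) / sigma_t) ** 2)  # timing agreement
    return reward / max(len(logged_touches), 1)

# Usage: a rollout landing a few pixels and tens of milliseconds off each
# logged tap still earns most of the reward, steering the policy toward
# motions consistent with the log.
log = [(120, 340, 0.50), (410, 600, 1.10)]
sim = [(126, 335, 0.54), (402, 606, 1.13)]
print(round(log_consistency_reward(sim, log), 3))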
