Music to My Ears

Conference Name
CHI 2026
Sensing Your Vocals: Exploring the Activity of Vocal Cord Muscles for Pitch Assessment Using Electromyography and Ultrasonography
Abstract

Vocal training is difficult because the muscles that control pitch, resonance, and phonation are internal and invisible to learners. This paper investigates how Electromyography (EMG) and ultrasonic imaging (UI) can make these muscles observable for training purposes. We report three studies. First, we analyze EMG and UI data from 16 singers (beginners, experienced, and professionals), revealing differences in muscle control proficiency among the three groups. Second, we use the collected data to create a system that visualizes an expert's muscle activity as a reference. This system is tested in a user study with 12 novices, showing that EMG highlighted muscle activation nuances, while UI provided insights into vocal cord length and dynamics. Third, to compare our approach to traditional methods (audio analysis and coach instructions), we conducted a focus group study with 15 experienced singers. Our results suggest that EMG is promising for improving vocal skill development and enhancing feedback systems. We conclude the paper with a detailed comparison of the analyzed modalities (EMG, UI, and traditional methods), resulting in recommendations for improving vocal muscle training systems.

Award
Honorable Mention
Authors
Kanyu Chen
Graduate School of Media Design, Yokohama, Kanagawa, Japan
Rebecca Panskus
Ruhr University Bochum, Bochum, Germany
Erwin Wu
Tokyo Institute of Technology, Tokyo, Japan
Yichen Peng
Institute of Science Tokyo, Tokyo, Japan
Daichi Saito
Tokyo Institute of Technology, Tokyo, Japan
Emiko Kamiyama
Graduate School of Media Design, Tokyo, Japan
Ruiteng Li
Waseda University, Tokyo, Japan
Chen-Chieh Liao
Institute of Science Tokyo, Tokyo, Japan
Karola Marky
Ruhr University Bochum, Bochum, Germany
Kato Akira
Keio University, Yokohama, Kanagawa, Japan
Hideki Koike
Tokyo Institute of Technology, Tokyo, Japan
Kai Kunze
Keio University Graduate School of Media Design, Yokohama, Japan
PianoBand: A Multimodal Wristband Interface for Portable Piano Interaction
Abstract

Traditional pianos are inherently non-portable, restricting everyday accessibility and on-demand creativity. Existing portable alternatives, largely vision-based with external cameras, suffer from limited range, occlusion, and unreliable contact detection. We present PianoBand, a wrist-worn system integrating an IMU, a miniature under-wrist RGB camera, and a printed keyboard sheet augmented with fiducial markers for reliable key mapping on any flat surface. Powered by a lightweight real-time IMU–vision pipeline, PianoBand enables high-fidelity piano interaction, supporting single notes, multi-finger chords, flexible fingering, dynamic velocity, and preliminary articulation techniques. Technical evaluation showed robust tap detection (over 99% accuracy) and accurate fingertip localization (8.90-pixel error), enabling precise note mapping. A comparative user study (N=15) further evaluated system performance, reporting high note accuracy, comparable to roll-up pianos and outperforming an XR piano, along with high ratings for portability, expressivity, and extensibility. Expert interviews highlighted broad application opportunities for piano-based experiences and music creation, suggesting future design directions.

Authors
Zhaoguo Wang
Tsinghua University, Beijing, China
Ziyuan Li
Tsinghua University, Beijing, China
Chentao Li
Department of Automation, Tsinghua University, Beijing, China
Zihang Ao
Tsinghua University, Beijing, China
Jianjiang Feng
Tsinghua University, Beijing, China
Jie Zhou
Department of Automation, BNRist, Tsinghua University, Beijing, China
ReTouche: Embodied Representations for Self-Guided Piano Learning
Abstract

Learning to play music is an embodied process, but traditional tools like scores and recordings overlook the role of gesture. While video tutorials offer visual cues, they remain detached from the instrument. We present ReTouche, an interactive system that projects synchronized notes and hand gestures directly onto the actuated keys of a player piano. The system includes a pipeline for adapting publicly available overhead-view YouTube videos and supports interactions for video-based learning, such as sectional practice, layered guidance, and self-recording. We evaluate ReTouche through a comparative structured observation study with YouTube-based self-learning (n=18), a two-week autoethnography study (n=3), and a focus group with professional piano teachers (n=4). Our findings show that embodied representations can ground self-guided piano learning by anchoring gesture, sound, and action within the instrument. Learners appropriated these representations to develop strategies and sustain motivation, while teachers saw potential for integrating ReTouche as a complement to conventional pedagogy.

Authors
Paul-Peter Arslan
Institute for Future Technologies, De Vinci Higher Education, Courbevoie, Île-de-France, France
Hayoun Noh
University of Oxford, Oxford, United Kingdom
Mariana Aki Tamashiro
Ecole Supérieure d'Ingénieurs Léonard de Vinci, Courbevoie, France
Louis Badr
De Vinci Higher Education, Research Center, Courbevoie, France
Brianna Lebrun
Institute for Future Technologies, De Vinci Higher Education, Courbevoie, France
Pavel Lindberg
INSERM, Paris, France
Hiroshi Ishii
MIT Media Lab, Cambridge, Massachusetts, United States
Xiao Xiao
De Vinci Higher Education, Courbevoie, France
MusicScaffold: Bridging Machine Efficiency and Human Growth in Adolescent Creative Education through Generative AI
Abstract

Adolescence is marked by strong creative impulses but limited strategies for structured expression, often leading to frustration or disengagement. While generative AI lowers technical barriers and delivers efficient outputs, its role in fostering adolescents’ expressive growth has been overlooked. We propose MusicScaffold, an adolescent-centered framework that transforms classical AI roles from broad conceptualizations into stage-specific, actionable developmental scaffolds designed to make expressive strategies transparent and learnable and to support adolescents in mastering creative expression. In a four-week study with middle school students (ages 12–14), MusicScaffold enhanced cognitive specificity, behavioral regulation, and affective autonomy in music creation. By reframing generative AI as a scaffold rather than a generator, this work bridges the machine efficiency of generative systems with human growth in adolescent creativity education.

Authors
Zhejing Hu
The Hong Kong Polytechnic University, Hong Kong, Hong Kong
Yan Liu
The Hong Kong Polytechnic University, Hong Kong, Hong Kong
Zhi Zhang
The Hong Kong Polytechnic University, Hong Kong, Hong Kong
Gong Chen
FireTorch Partners, Hong Kong, Hong Kong
Bruce X.B. Yu
Zhejiang University, Hangzhou, Zhejiang, China
Junxian Li
Shenzhen International Foundation College, Shenzhen, China
Jiannong Cao
The Hong Kong Polytechnic University, Hong Kong, China
VidTune: Creating Video Soundtracks with Generative Music and Video-Based Thumbnails
Abstract

Music shapes the tone of videos, yet creators find it hard to find soundtracks that match their video's mood and narrative. Recent text-to-music models let creators generate music from text prompts, but our formative study (N=8) shows creators struggle to construct diverse prompts, quickly review and compare tracks, and understand their impact on the video. We present VidTune, a system that supports soundtrack creation by generating diverse music options from a creator’s prompt and producing contextual thumbnails for rapid review. VidTune extracts representative video subjects to ground thumbnails in context, maps each track’s valence and energy onto visual cues like color and brightness, and depicts prominent genres and instruments. Creators can refine tracks with natural language edits, which VidTune expands into new generations. In a controlled user study (N=12) and an exploratory case study (N=6), participants found VidTune helpful for efficiently reviewing and comparing music options and described the process as playful and enriching.

Authors
Mina Huh
University of Texas at Austin, Austin, Texas, United States
C. Ailie Fraser
Adobe Research, Seattle, Washington, United States
Ding Li
Adobe Research, Seattle, Washington, United States
Mira Dontcheva
Adobe Research, Seattle, Washington, United States
Bryan Wang
Adobe, Seattle, Washington, United States
A Design Space for Live Music Agents
Abstract

Live music provides a uniquely rich setting for studying creativity and interaction due to its spontaneous nature. The pursuit of live music agents---intelligent systems supporting real-time music performance and interaction---has captivated researchers across HCI, AI, and computer music for decades, and recent advancements in AI suggest unprecedented opportunities to evolve their design. However, the interdisciplinary nature of music has led to fragmented development across research communities, hindering effective communication and collaborative progress. In this work, we bring together perspectives from these diverse fields to map the current landscape of live music agents. Based on our analysis of 184 systems across both academic literature and video, we develop a comprehensive design space that categorizes dimensions spanning usage contexts, interactions, technologies, and ecosystems. By highlighting trends and gaps in live music agents, our design space offers researchers, designers, and musicians a structured lens to understand existing systems and shape future directions in real-time human-AI music co-creation. We release our annotated systems as a living artifact at https://live-music-agents.github.io.

Authors
Yewon Kim
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Stephen Brade
Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Alexander Wang
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
David Zhou
University of Illinois Urbana-Champaign, Urbana, Illinois, United States
Haven Kim
University of California San Diego, La Jolla, California, United States
Bill Wang
University of California San Diego, La Jolla, California, United States
Sung-Ju Lee
KAIST, Daejeon, Korea, Republic of
Hugo F. Flores Garcia
Northwestern University, Evanston, Illinois, United States
Anna Huang
Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Chris Donahue
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
FretFlow: Adaptive Haptics for Rhythm and Articulation in Guitar Learning
Abstract

Rhythm and articulation are essential for expressive guitar performance. Existing tools provide basic beat cues, whereas beginners often struggle to align with these cues when playing complex techniques, such as strumming and muting. Informed by a formative study with five instructors and grounded in embodied learning theories, we present FretFlow, a haptic vest-based tool that simulates common instructional practices to guide learners through physical interactions like tapping. The key to FretFlow is its design space that maps rhythmic and articulation patterns in various playing techniques to distinct haptic patterns, enabling authoring of haptic scores. FretFlow further dynamically adapts haptic intensity based on learners' real-time performance accuracy, accompanied by multimodal guidance across haptic, visual, and audio channels. We iteratively refined haptic designs across two rounds with 46 participants, followed by a two-week user study with 20 beginners. Results show that FretFlow improves learners’ rhythmic accuracy and expressive performance.

Authors
Xin Shu
Newcastle University, Newcastle upon Tyne, United Kingdom
Lei Shi
Newcastle University, Newcastle upon Tyne, United Kingdom
Yiran Lin
McGill University, Montreal, Quebec, Canada
Yan Zeng
Stuart Weitzman School of Design, Philadelphia, Pennsylvania, United States
Tingting Luo
Virginia Polytechnic Institute and State University, Blacksburg, Virginia, United States
Justice Ou
UIUC, Champaign, Illinois, United States
Mohamad Eid
New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
Xinhuan Shu
Newcastle University, Newcastle upon Tyne, United Kingdom