SoundStager: Interactive Design of Story-Driven GenAI Soundscapes for Video

Abstract

Sound effects (SFX) are critical to video storytelling: they immerse viewers, direct attention, and shape emotion. However, crafting an effective soundscape is difficult: creators must decide how to source, place, layer, and mix sounds to support the narrative. Generative text-to-SFX tools enable users to create custom sounds, but creators often struggle to describe sounds with words and lack control over individual stems in premixed outputs. We propose SoundStager, an AI-assisted tool for designing generative soundscapes for video. SoundStager analyzes the video narrative to create layered audio scenes (of keynote, signal, soundmark, and archetypal sounds) and supports iterative refinement through a combination of conversational and analog controls. SoundStager’s design was informed by formative studies with six professional sound designers, six video creators, and insights from sound design literature. Our user evaluation with twelve video creators shows that SoundStager enables users to quickly create satisfactory soundscapes while retaining creative control.

Authors
Suhyeon Yoo
University of Toronto, Toronto, Ontario, Canada
Adolfo Hernandez Santisteban
Adobe, Seattle, Washington, United States
Prem Seetharaman
Adobe Research, Sacramento, California, United States
Justin Salamon
Adobe Research, San Francisco, California, United States
Oriol Nieto
Adobe Research, San Francisco, California, United States
Anh Truong
Adobe Research, New York, New York, United States

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Designing Creative GenAI Tools

P1 - Room 115
7 presentations
2026-04-15, 18:00–19:30