Vistoria: A Multimodal System to Support Fictional Story Writing through Instrumental Image-Text Co-Editing

Abstract

Humans think visually: we remember in images, dream in pictures, and use visual metaphors to communicate. Yet most creative writing tools remain text-centric, limiting how writers plan and translate ideas. We present Vistoria, a system for synchronized image-text co-editing in fictional story writing. A formative Wizard-of-Oz co-design study with 10 story writers revealed how sketches, images, and text serve as essential elements for ideation and organization. Drawing on theories of Instrumental Interaction, Vistoria introduces four instrumental operations (Lasso, Collage, Perspective Shift, and Filter) that enable seamless narrative exploration across modalities. A controlled study with 12 participants shows that co-editing enhances expressiveness, immersion, and collaboration, opening space for writers to follow divergent story directions and craft more vivid, detailed narratives. While multimodality increased cognitive demand, participants reported a stronger sense of ownership and agency. These findings demonstrate how multimodal co-editing expands creative potential by balancing abstraction and concreteness in narrative development.

Authors
Kexue Fu
City University of Hong Kong, Hong Kong SAR, China
Jingfei Huang
Harvard University, Cambridge, Massachusetts, United States
Long Ling
Tongji University, Shanghai, China
Sumin Hong
University of Notre Dame, Notre Dame, Indiana, United States
Yihang ZUO
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong, China
RAY LC
City University of Hong Kong, Hong Kong, Hong Kong
Toby Jia-Jun Li
University of Notre Dame, Notre Dame, Indiana, United States

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Designing Creative GenAI Tools

P1 - Room 115
7 presentations
2026-04-15 18:00:00
2026-04-15 19:30:00