3. Generating Visuals

Conference Name
UIST 2024
ShadowMagic: Designing Human-AI Collaborative Support for Comic Professionals’ Shadowing
Abstract

Shadowing allows artists to convey the realistic volume and emotion of characters in comic colorization. While AI technologies have the potential to improve professionals’ shadowing experience, current practice is manual and time-consuming. To understand how we can improve their shadowing experience, we conducted interviews with 5 professionals. We found that professionals’ level of engagement can vary depending on semantics, such as characters’ faces or hair. We also found they spent time on shadow “landscaping”, deciding where to place large shadow regions to create a realistic volumetric presentation, with the final results varying dramatically depending on their “staging” and “attention guiding” needs. We discovered they would accept AI suggestions for less engaging semantic parts or for landscaping, while needing the capability to adjust details. Based on our observations, we developed ShadowMagic, which (1) generates AI-driven shadows based on commonly used light directions, (2) enables users to selectively choose results depending on semantics, and (3) allows users to complete shadow areas themselves for further refinement. Through a summative evaluation with 5 professionals, we found that they were significantly more satisfied with our AI-driven results compared to a baseline. We also found that ShadowMagic’s “step by step” workflow helps participants more easily adopt AI-driven results. We conclude by discussing design implications.
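To make the abstract’s step-by-step workflow concrete, here is a minimal illustrative Python sketch of generating shadow suggestions per light direction and letting the artist accept them per semantic region. Every name in it (generate_shadows, SemanticRegion, the region labels) is a hypothetical stand-in, not the authors’ code.

```python
# A minimal, illustrative sketch of the selective-acceptance workflow described
# above; not the authors' implementation. All names are hypothetical stand-ins.
from dataclasses import dataclass
from typing import Optional

LIGHT_DIRECTIONS = ["top-left", "top-right", "front", "back"]  # commonly used directions

@dataclass
class SemanticRegion:
    name: str                        # e.g., "face", "hair", "clothing"
    ai_shadow: Optional[str] = None  # accepted AI suggestion, if any
    manual: bool = False             # artist will shade this region by hand

def generate_shadows(line_art: str, direction: str) -> dict:
    """Stand-in for the AI model: one shadow layer per semantic region."""
    return {r: f"{line_art}:{r}:{direction}" for r in ("face", "hair", "clothing")}

def shadow_magic(line_art: str, direction: str, accept: set) -> list:
    """Generate suggestions, selectively accept them, leave the rest manual."""
    suggestions = generate_shadows(line_art, direction)
    return [
        SemanticRegion(name, ai_shadow=layer) if name in accept
        else SemanticRegion(name, manual=True)
        for name, layer in suggestions.items()
    ]

# Example: accept the AI shadow for clothing; shade face and hair by hand.
print(shadow_magic("panel_01", "top-left", accept={"clothing"}))
```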

Authors
Amrita Ganguly
George Mason University, Fairfax, Virginia, United States
Chuan Yan
George Mason University, Centreville, Virginia, United States
John Joon Young Chung
Midjourney, San Francisco, California, United States
Tong Steven Sun
George Mason University, Fairfax, Virginia, United States
Kiheon Yoon
Pusan National University, Pusan, Republic of Korea
Yotam Gingold
George Mason University, Fairfax, Virginia, United States
Sungsoo Ray Hong
George Mason University, Fairfax, Virginia, United States
Paper URL

https://doi.org/10.1145/3654777.3676332

What's the Game, then? Opportunities and Challenges for Runtime Behavior Generation
Abstract

Procedural content generation (PCG), the process of creating game components algorithmically rather than manually, has been a common tool of game development for decades. Recent advances in large language models (LLMs) enable the generation of game behaviors based on player input at runtime. Such code generation brings with it the possibility of entirely new gameplay interactions that may be difficult to integrate with typical game development workflows. We explore these implications through GROMIT, a novel LLM-based runtime behavior generation system for Unity. When triggered by a player action, GROMIT generates a relevant behavior, which is compiled without developer intervention and incorporated into the game. We create three demonstration scenarios with GROMIT to investigate how such a technology might be used in game development. In a system evaluation, we find that our implementation is able to produce behaviors that result in significant downstream impacts on gameplay. We then conduct an interview study with n=13 game developers using GROMIT as a probe to elicit their current opinions on runtime behavior generation tools, and enumerate the specific themes curtailing the wider use of such tools. We find that the main themes of concern are quality considerations, community expectations, and fit with developer workflows, and that several of the subthemes are unique to runtime behavior generation specifically. We outline a future work agenda to address these concerns, including the need for additional guardrail systems for behavior generation.
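As a rough illustration of the runtime loop the abstract describes (the real system compiles C# inside Unity), the hedged Python sketch below shows a behavior being generated from a player trigger and loaded without developer intervention. llm_complete is a hypothetical stand-in for an actual model call, and exec-ing model output here is purely illustrative, which is exactly the kind of risk the paper’s guardrail discussion targets.

```python
# A minimal Python sketch of the runtime loop described above. GROMIT itself
# targets Unity and compiles C#; llm_complete is a hypothetical stand-in for a
# real LLM call, and exec-ing model output unsandboxed is shown only to make
# the guardrail concern tangible.

def llm_complete(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: returns source code for a behavior."""
    return (
        "def behavior(game_state):\n"
        "    game_state['score'] = game_state.get('score', 0) + 10\n"
        "    return game_state\n"
    )

def generate_behavior(trigger: str):
    """On a player action: prompt, 'compile', and register a new behavior."""
    prompt = f"Write a Python function behavior(game_state) reacting to: {trigger}"
    source = llm_complete(prompt)
    namespace = {}
    exec(source, namespace)  # loaded without developer intervention
    return namespace["behavior"]

state = {"score": 0}
behavior = generate_behavior("player picked up a glowing orb")
print(behavior(state))  # the generated behavior now affects gameplay state
```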

Award
Best Paper
Authors
Nicholas Jennings
University of California, Berkeley, Berkeley, California, United States
Han Wang
University of California, Berkeley, Berkeley, California, United States
Isabel Li
University of California, Berkeley, Berkeley, California, United States
James Smith
University of California, Berkeley, Berkeley, California, United States
Bjoern Hartmann
University of California, Berkeley, Berkeley, California, United States
Paper URL

https://doi.org/10.1145/3654777.3676358

StyleFactory: Towards Better Style Alignment in Image Creation through Style-Strength-Based Control and Evaluation
Abstract

Generative AI models have been widely used for image creation. However, generating images that are well aligned with users' personal styles in aesthetic features (e.g., color and texture) can be challenging because of poor style expression and interpretation between humans and models. Through a formative study, we observed that participants showed a clear subjective perception of the desired style and variations in its strength, which directly inspired us to develop style-strength-based control and evaluation. Building on this, we present StyleFactory, an interactive system that helps users achieve style alignment. Our interface enables users to rank images based on their strengths in the desired style and visualizes the strength distribution of other images in that style from the model's perspective. In this way, users can evaluate the understanding gap between themselves and the model, and define well-aligned personal styles for image creation through targeted iterations. Our technical evaluation and user study demonstrate that StyleFactory accurately generates images in specific styles, effectively facilitates style alignment in the image creation workflow, stimulates creativity, and enhances the user experience in human-AI interactions.
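The toy Python sketch below illustrates one plausible reading of style-strength-based evaluation: score images against a style anchor from the model’s perspective, then measure how far the user’s ranking departs from the model’s. embed() and the gap metric are assumptions for illustration, not the paper’s method.

```python
# A toy sketch of style-strength-based evaluation as one could read it from the
# abstract; not the authors' method. embed() is a hypothetical stand-in for an
# image encoder, and the gap metric is an assumed illustration.
import math

def embed(image_id: str) -> list:
    """Stand-in image encoder: toy 4-d pseudo-embedding per image id."""
    s = sum(ord(c) for c in image_id)
    return [(s * k % 97) / 97.0 for k in (3, 5, 7, 11)]

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-9)

def model_ranking(images, style_anchor):
    """The model's view: order images by similarity to the desired-style anchor."""
    anchor = embed(style_anchor)
    return sorted(images, key=lambda img: cosine(embed(img), anchor), reverse=True)

def understanding_gap(user_rank, model_rank) -> float:
    """Mean absolute rank displacement between user and model (0 = aligned)."""
    pos = {img: i for i, img in enumerate(model_rank)}
    return sum(abs(i - pos[img]) for i, img in enumerate(user_rank)) / len(user_rank)

images = ["img_a", "img_b", "img_c", "img_d"]
user_rank = ["img_c", "img_a", "img_d", "img_b"]  # the user's strength ordering
print(understanding_gap(user_rank, model_ranking(images, "watercolor_ref")))
```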

Authors
Mingxu Zhou
Zhejiang University, Hangzhou, Zhejiang, China
Dengming Zhang
Zhejiang University, Hangzhou, Zhejiang, China
Weitao You
Zhejiang University, Hangzhou, Zhejiang, China
Ziqi Yu
Zhejiang University, Hangzhou, Zhejiang, China
Yifei Wu
Zhejiang University, Hangzhou, Zhejiang, China
Chenghao Pan
Zhejiang University, Hangzhou, Zhejiang, China
Huiting Liu
Zhejiang University, Hangzhou, Zhejiang, China
Tianyu Lao
Zhejiang University, Hangzhou, Zhejiang, China
Pei Chen
Zhejiang University, Hangzhou, Zhejiang, China
Paper URL

https://doi.org/10.1145/3654777.3676370

AutoSpark: Supporting Automobile Appearance Design Ideation with Kansei Engineering and Generative AI
Abstract

Rapid creation of novel product appearance designs that align with consumer emotional requirements poses a significant challenge. Text-to-image models, with their excellent image generation capabilities, have demonstrated potential in providing inspiration to designers. However, in practical applications designers still encounter issues, including aligning emotional needs, expressing design intentions, and comprehending generated outcomes. To address these challenges, we introduce AutoSpark, an interactive system that integrates Kansei Engineering and generative AI to provide creativity support for designers in creating automobile appearance designs that meet emotional needs. AutoSpark employs a Kansei Engineering engine powered by generative AI and a semantic network to assist designers in emotional need alignment, design intention expression, and prompt crafting. It also facilitates designers' understanding and iteration of generated results through fine-grained image-image similarity comparisons and text-image relevance assessments. The design-thinking map within its interface aids in managing the design process. Our user study indicates that AutoSpark effectively aids designers in producing designs that are more aligned with emotional needs and of higher quality compared to a baseline system, while also enhancing the designers' experience in the human-AI co-creation process.
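To give a concrete, hypothetical flavor of a Kansei Engineering pipeline like the one described, the sketch below expands emotional (Kansei) words into design attributes via a toy semantic network, crafts a text-to-image prompt, and scores text-image relevance. None of these tables or functions come from the paper; they are stand-ins for its generative-AI components.

```python
# A toy sketch of a Kansei-word-to-prompt pipeline in the spirit of the system
# described above; not the authors' engine. The semantic network and scoring
# are hypothetical stand-ins.

# Toy semantic network: Kansei (emotional) words -> concrete design attributes.
SEMANTIC_NETWORK = {
    "sporty":  ["low stance", "sharp character lines", "large air intakes"],
    "elegant": ["flowing roofline", "slim LED headlights", "chrome accents"],
    "rugged":  ["high ground clearance", "squared wheel arches", "skid plates"],
}

def craft_prompt(kansei_words: list) -> str:
    """Expand emotional needs into a text-to-image prompt via the network."""
    attributes = [a for w in kansei_words for a in SEMANTIC_NETWORK.get(w, [])]
    return "automobile exterior design, " + ", ".join(attributes)

def text_image_relevance(prompt: str, image_tags: set) -> float:
    """Stand-in relevance check: fraction of prompt attributes visible in the image."""
    attrs = {a.strip() for a in prompt.split(",")[1:]}
    return len(attrs & image_tags) / max(len(attrs), 1)

prompt = craft_prompt(["sporty", "elegant"])
generated_image_tags = {"low stance", "flowing roofline", "chrome accents"}
print(prompt)
print(text_image_relevance(prompt, generated_image_tags))  # aids iteration on results
```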

Authors
Liuqing Chen
Zhejiang University, Hangzhou, China
Qianzhi Jing
Zhejiang University, Hangzhou, China
Yixin Tsang
Zhejiang University, Hangzhou, China
Qianyi Wang
Zhejiang University, Ningbo, China
Ruocong Liu
Geely Holding Group, Shanghai, China
Duowei Xia
Zhejiang University, Hangzhou, China
Yunzhan Zhou
Zhejiang University, Ningbo, China
Lingyun Sun
Zhejiang University, Hangzhou, China
Paper URL

https://doi.org/10.1145/3654777.3676337
