3. Generating Visuals

Conference Name
UIST 2024
ShadowMagic: Designing Human-AI Collaborative Support for Comic Professionals’ Shadowing
Abstract

Shadowing allows artists to convey the realistic volume and emotion of characters in comic colorization. While AI technologies have the potential to improve professionals’ shadowing experience, current practice is manual and time-consuming. To understand how we can improve their shadowing experience, we conducted interviews with 5 professionals. We found that professionals’ level of engagement can vary depending on semantics, such as characters’ faces or hair. We also found they spent time on shadow “landscaping”, deciding where to place large shadow regions to create a realistic volumetric presentation, with the final results varying dramatically depending on their “staging” and “attention guiding” needs. We discovered they would accept AI suggestions for less engaging semantic parts or for landscaping, while needing the capability to adjust details. Based on our observations, we developed ShadowMagic, which (1) generates AI-driven shadows based on commonly used light directions, (2) enables users to selectively choose results depending on semantics, and (3) allows users to complete shadow areas themselves for further refinement. Through a summative evaluation with 5 professionals, we found that they were significantly more satisfied with our AI-driven results compared to a baseline. We also found that ShadowMagic’s “step by step” workflow helps participants more easily adopt AI-driven results. We conclude by discussing design implications.
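To make the abstract’s step-by-step workflow concrete, here is a minimal illustrative Python sketch of generating shadow suggestions per light direction and letting the artist accept them per semantic region. Every name in it (generate_shadows, SemanticRegion, the region labels) is a hypothetical stand-in, not the authors’ code.

```python
# A minimal, illustrative sketch of the selective-acceptance workflow described
# above; not the authors' implementation. All names are hypothetical stand-ins.
from dataclasses import dataclass
from typing import Optional

LIGHT_DIRECTIONS = ["top-left", "top-right", "front", "back"]  # commonly used directions

@dataclass
class SemanticRegion:
    name: str                        # e.g., "face", "hair", "clothing"
    ai_shadow: Optional[str] = None  # accepted AI suggestion, if any
    manual: bool = False             # artist will shade this region by hand

def generate_shadows(line_art: str, direction: str) -> dict:
    """Stand-in for the AI model: one shadow layer per semantic region."""
    return {r: f"{line_art}:{r}:{direction}" for r in ("face", "hair", "clothing")}

def shadow_magic(line_art: str, direction: str, accept: set) -> list:
    """Generate suggestions, selectively accept them, leave the rest manual."""
    suggestions = generate_shadows(line_art, direction)
    return [
        SemanticRegion(name, ai_shadow=layer) if name in accept
        else SemanticRegion(name, manual=True)
        for name, layer in suggestions.items()
    ]

# Example: accept the AI shadow for clothing; shade face and hair by hand.
print(shadow_magic("panel_01", "top-left", accept={"clothing"}))
```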

Authors
Amrita Ganguly
George Mason University, Fairfax, Virginia, United States
Chuan Yan
George Mason University, Centreville, Virginia, United States
John Joon Young Chung
Midjourney, San Francisco, California, United States
Tong Steven Sun
George Mason University, Fairfax, Virginia, United States
Kiheon Yoon
Pusan National University, Pusan, Republic of Korea
Yotam Gingold
George Mason University, Fairfax, Virginia, United States
Sungsoo Ray Hong
George Mason University, Fairfax, Virginia, United States
Paper URL

https://doi.org/10.1145/3654777.3676332

What's the Game, then? Opportunities and Challenges for Runtime Behavior Generation
Abstract

Procedural content generation (PCG), the process of creating game components algorithmically rather than manually, has been a common tool of game development for decades. Recent advances in large language models (LLMs) enable the generation of game behaviors based on player input at runtime. Such code generation brings with it the possibility of entirely new gameplay interactions that may be difficult to integrate with typical game development workflows. We explore these implications through GROMIT, a novel LLM-based runtime behavior generation system for Unity. When triggered by a player action, GROMIT generates a relevant behavior, which is compiled without developer intervention and incorporated into the game. We create three demonstration scenarios with GROMIT to investigate how such a technology might be used in game development. In a system evaluation, we find that our implementation is able to produce behaviors that result in significant downstream impacts on gameplay. We then conduct an interview study with n=13 game developers using GROMIT as a probe to elicit their current opinions on runtime behavior generation tools, and enumerate the specific themes curtailing the wider use of such tools. We find that the main themes of concern are quality considerations, community expectations, and fit with developer workflows, and that several of the subthemes are unique to runtime behavior generation specifically. We outline a future work agenda to address these concerns, including the need for additional guardrail systems for behavior generation.
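As a rough illustration of the runtime loop the abstract describes (the real system compiles C# inside Unity), the hedged Python sketch below shows a behavior being generated from a player trigger and loaded without developer intervention. llm_complete is a hypothetical stand-in for an actual model call, and exec-ing model output here is purely illustrative, which is exactly the kind of risk the paper’s guardrail discussion targets.

```python
# A minimal Python sketch of the runtime loop described above. GROMIT itself
# targets Unity and compiles C#; llm_complete is a hypothetical stand-in for a
# real LLM call, and exec-ing model output unsandboxed is shown only to make
# the guardrail concern tangible.

def llm_complete(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: returns source code for a behavior."""
    return (
        "def behavior(game_state):\n"
        "    game_state['score'] = game_state.get('score', 0) + 10\n"
        "    return game_state\n"
    )

def generate_behavior(trigger: str):
    """On a player action: prompt, 'compile', and register a new behavior."""
    prompt = f"Write a Python function behavior(game_state) reacting to: {trigger}"
    source = llm_complete(prompt)
    namespace = {}
    exec(source, namespace)  # loaded without developer intervention
    return namespace["behavior"]

state = {"score": 0}
behavior = generate_behavior("player picked up a glowing orb")
print(behavior(state))  # the generated behavior now affects gameplay state
```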

Award
Best Paper
Authors
Nicholas Jennings
University of California, Berkeley, Berkeley, California, United States
Han Wang
University of California, Berkeley, Berkeley, California, United States
Isabel Li
University of California, Berkeley, Berkeley, California, United States
James Smith
University of California, Berkeley, Berkeley, California, United States
Bjoern Hartmann
University of California, Berkeley, Berkeley, California, United States
Paper URL

https://doi.org/10.1145/3654777.3676358

StyleFactory: Towards Better Style Alignment in Image Creation through Style-Strength-Based Control and Evaluation
Abstract

Generative AI models have been widely used for image creation. However, generating images that are well aligned with users' personal styles in aesthetic features (e.g., color and texture) can be challenging because of poor style expression and interpretation between humans and models. Through a formative study, we observed that participants showed a clear subjective perception of the desired style and variations in its strength, which directly inspired us to develop style-strength-based control and evaluation. Building on this, we present StyleFactory, an interactive system that helps users achieve style alignment. Our interface enables users to rank images based on their strengths in the desired style and visualizes the strength distribution of other images in that style from the model's perspective. In this way, users can evaluate the understanding gap between themselves and the model, and define well-aligned personal styles for image creation through targeted iterations. Our technical evaluation and user study demonstrate that StyleFactory accurately generates images in specific styles, effectively facilitates style alignment in the image creation workflow, stimulates creativity, and enhances the user experience in human-AI interactions.
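The toy Python sketch below illustrates one plausible reading of style-strength-based evaluation: score images against a style anchor from the model’s perspective, then measure how far the user’s ranking departs from the model’s. embed() and the gap metric are assumptions for illustration, not the paper’s method.

```python
# A toy sketch of style-strength-based evaluation as one could read it from the
# abstract; not the authors' method. embed() is a hypothetical stand-in for an
# image encoder, and the gap metric is an assumed illustration.
import math

def embed(image_id: str) -> list:
    """Stand-in image encoder: toy 4-d pseudo-embedding per image id."""
    s = sum(ord(c) for c in image_id)
    return [(s * k % 97) / 97.0 for k in (3, 5, 7, 11)]

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-9)

def model_ranking(images, style_anchor):
    """The model's view: order images by similarity to the desired-style anchor."""
    anchor = embed(style_anchor)
    return sorted(images, key=lambda img: cosine(embed(img), anchor), reverse=True)

def understanding_gap(user_rank, model_rank) -> float:
    """Mean absolute rank displacement between user and model (0 = aligned)."""
    pos = {img: i for i, img in enumerate(model_rank)}
    return sum(abs(i - pos[img]) for i, img in enumerate(user_rank)) / len(user_rank)

images = ["img_a", "img_b", "img_c", "img_d"]
user_rank = ["img_c", "img_a", "img_d", "img_b"]  # the user's strength ordering
print(understanding_gap(user_rank, model_ranking(images, "watercolor_ref")))
```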

Authors
Mingxu Zhou
Zhejiang University, Hangzhou, Zhejiang, China
Dengming Zhang
Zhejiang University, Hangzhou, Zhejiang, China
Weitao You
Zhejiang University, Hangzhou, Zhejiang, China
Ziqi Yu
Zhejiang University, Hangzhou, Zhejiang, China
Yifei Wu
Zhejiang University, Hangzhou, Zhejiang, China
Chenghao Pan
Zhejiang University, Hangzhou, Zhejiang, China
Huiting Liu
Zhejiang University, Hangzhou, Zhejiang, China
Tianyu Lao
Zhejiang University, Hangzhou, Zhejiang, China
Pei Chen
Zhejiang University, Hangzhou, Zhejiang, China
Paper URL

https://doi.org/10.1145/3654777.3676370

AutoSpark: Supporting Automobile Appearance Design Ideation with Kansei Engineering and Generative AI
Abstract

Rapid creation of novel product appearance designs that align with consumer emotional requirements poses a significant challenge. Text-to-image models, with their excellent image generation capabilities, have demonstrated potential in providing inspiration to designers. However, in practical applications designers still encounter issues, including aligning emotional needs, expressing design intentions, and comprehending generated outcomes. To address these challenges, we introduce AutoSpark, an interactive system that integrates Kansei Engineering and generative AI to provide creativity support for designers in creating automobile appearance designs that meet emotional needs. AutoSpark employs a Kansei Engineering engine powered by generative AI and a semantic network to assist designers in emotional need alignment, design intention expression, and prompt crafting. It also facilitates designers' understanding and iteration of generated results through fine-grained image-image similarity comparisons and text-image relevance assessments. The design-thinking map within its interface aids in managing the design process. Our user study indicates that AutoSpark effectively aids designers in producing designs that are more aligned with emotional needs and of higher quality compared to a baseline system, while also enhancing the designers' experience in the human-AI co-creation process.
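To give a concrete, hypothetical flavor of a Kansei Engineering pipeline like the one described, the sketch below expands emotional (Kansei) words into design attributes via a toy semantic network, crafts a text-to-image prompt, and scores text-image relevance. None of these tables or functions come from the paper; they are stand-ins for its generative-AI components.

```python
# A toy sketch of a Kansei-word-to-prompt pipeline in the spirit of the system
# described above; not the authors' engine. The semantic network and scoring
# are hypothetical stand-ins.

# Toy semantic network: Kansei (emotional) words -> concrete design attributes.
SEMANTIC_NETWORK = {
    "sporty":  ["low stance", "sharp character lines", "large air intakes"],
    "elegant": ["flowing roofline", "slim LED headlights", "chrome accents"],
    "rugged":  ["high ground clearance", "squared wheel arches", "skid plates"],
}

def craft_prompt(kansei_words: list) -> str:
    """Expand emotional needs into a text-to-image prompt via the network."""
    attributes = [a for w in kansei_words for a in SEMANTIC_NETWORK.get(w, [])]
    return "automobile exterior design, " + ", ".join(attributes)

def text_image_relevance(prompt: str, image_tags: set) -> float:
    """Stand-in relevance check: fraction of prompt attributes visible in the image."""
    attrs = {a.strip() for a in prompt.split(",")[1:]}
    return len(attrs & image_tags) / max(len(attrs), 1)

prompt = craft_prompt(["sporty", "elegant"])
generated_image_tags = {"low stance", "flowing roofline", "chrome accents"}
print(prompt)
print(text_image_relevance(prompt, generated_image_tags))  # aids iteration on results
```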

Authors
Liuqing Chen
Zhejiang University, Hangzhou, China
Qianzhi Jing
Zhejiang University, Hangzhou, China
Yixin Tsang
Zhejiang University, Hangzhou, China
Qianyi Wang
Zhejiang University, Ningbo, China
Ruocong Liu
Geely Holding Group, Shanghai, China
Duowei Xia
Zhejiang University, Hangzhou, China
Yunzhan Zhou
Zhejiang University, Ningbo, China
Lingyun Sun
Zhejiang University, Hangzhou, China
Paper URL

https://doi.org/10.1145/3654777.3676337
