SketchFlex: Facilitating Spatial-Semantic Coherence in Text-to-Image Generation with Region-Based Sketches

Abstract

Text-to-image models can generate visually appealing images from text descriptions. Efforts have been devoted to improving model control through prompt tuning and spatial conditioning. However, our formative study highlights the challenges non-expert users face in crafting appropriate prompts and specifying fine-grained spatial conditions (e.g., depth or canny references) to generate semantically cohesive images, especially when multiple objects are involved. In response, we introduce SketchFlex, an interactive system designed to improve the flexibility of spatially conditioned image generation using rough region sketches. The system automatically infers rational prompt descriptions within a semantic space enriched by crowd-sourced object attributes and relationships. Additionally, SketchFlex refines users' rough sketches into canny-based shape anchors, ensuring generation quality and alignment with user intentions. Experimental results demonstrate that SketchFlex achieves more cohesive image generation than end-to-end models, while significantly reducing cognitive load and better matching user intentions compared to a region-based generation baseline.
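To illustrate the canny-based spatial conditioning the abstract refers to, the minimal sketch below shows one plausible way to turn a rough sketch into a canny edge map and use it as a shape anchor with an off-the-shelf ControlNet pipeline. This is an illustration only, not the authors' SketchFlex implementation; the model checkpoints, helper function, file names, and parameters are assumptions.

```python
# Hypothetical sketch of canny-conditioned generation (NOT the SketchFlex code):
# convert a rough region sketch into a canny edge map, then use it as a
# spatial condition via a ControlNet-augmented Stable Diffusion pipeline.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

def sketch_to_canny(sketch: Image.Image, low: int = 100, high: int = 200) -> Image.Image:
    """Turn a rough grayscale sketch into a 3-channel canny edge map ("shape anchor")."""
    edges = cv2.Canny(np.array(sketch.convert("L")), low, high)
    return Image.fromarray(np.stack([edges] * 3, axis=-1))

# Assumed public checkpoints; swap in whatever base model / ControlNet you use.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

anchor = sketch_to_canny(Image.open("rough_region_sketch.png"))  # hypothetical input file
image = pipe(
    prompt="a red apple on a wooden table, soft lighting",  # stands in for an inferred, enriched prompt
    image=anchor,                                           # canny edge map as the spatial condition
    num_inference_steps=30,
).images[0]
image.save("generated.png")
```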

Award
Honorable Mention
Authors
Haichuan Lin
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong, China
Yilin Ye
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong, China
Jiazhi Xia
Central South University, Changsha, China
Wei Zeng
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong, China
DOI

10.1145/3706598.3713801

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713801

Conference: CHI 2025

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

Session: Image and AI

Room: G303
7 presentations
2025-04-28 23:10:00 – 2025-04-29 00:40:00