AdaptiveSliders: User-aligned Semantic Slider-based Editing of Text-to-Image Model Output

Abstract

Precise editing of text-to-image model outputs remains challenging. Slider-based editing is a recent approach wherein the image’s semantic attributes are manipulated via sliders. However, it has significant user-centric issues. First, slider variations are often inconsistent across the sliding range. Second, the optimal slider range is unpredictable, with default values often being too large or small depending on the prompt and attribute. Third, manipulating one attribute can unintentionally alter others due to the complex entanglement of latent spaces. We introduce AdaptiveSliders, a tool that addresses these challenges by adapting to the specific attributes and prompts, generating consistent slider variations and optimal bounds while minimizing unintended changes. AdaptiveSliders also suggests initial attributes and generates initial images more aligned with prompt semantics. Through three validation studies and one end-to-end user study, we demonstrate that AdaptiveSliders significantly improves user control and experience, enabling semantic slider-based editing aligned with user needs and expectations.

Authors
Rahul Jain
Purdue University, West Lafayette, Indiana, United States
Amit Goel
Fujitsu Consulting India, Noida, Uttar Pradesh, India
Koichiro Niinuma
Fujitsu Research of America, Pittsburgh, Pennsylvania, United States
Aakar Gupta
Fujitsu Research America, Redmond, Washington, United States
DOI

10.1145/3706598.3714292

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714292

Conference: CHI 2025

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

Session: Image and AI

G303
7 presentations
2025-04-28 23:10:00
2025-04-29 00:40:00