IntentTuner: An Interactive Framework for Integrating Human Intentions in Fine-tuning Text-to-Image Generative Models

Fine-tuning adapts text-to-image generative models to novel concepts (e.g., styles and portraits), empowering users to create customized content. Recent work on fine-tuning focuses on reducing training data and computational overhead but neglects alignment with user intentions, particularly in the manual curation of multi-modal training data and in intent-oriented evaluation. Informed by a formative study with fine-tuning practitioners to understand user intentions, we propose IntentTuner, an interactive framework that incorporates human intentions throughout each phase of the fine-tuning workflow. IntentTuner enables users to articulate training intentions with image exemplars and textual descriptions, automatically converting them into effective data augmentation strategies. Furthermore, IntentTuner introduces novel metrics to measure user intent alignment, allowing intent-aware monitoring and evaluation of model training. Application exemplars and user studies demonstrate that IntentTuner streamlines fine-tuning, reducing cognitive effort and yielding superior models compared to the common baseline tool.

The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong, China

https://doi.org/10.1145/3613904.3642165

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2024.acm.org/)

318B

5 presentations

Start time: 2024-05-13 23:00:00

End time: 2024-05-14 00:20:00

Conference: CHI 2024

Session: Creativity: Visualizations and AI