Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models

Abstract

Text-to-image generative models have demonstrated remarkable capabilities in generating high-quality images based on textual prompts. However, crafting prompts that accurately capture the user's creative intent remains challenging. It often involves laborious trial-and-error procedures to ensure that the model interprets the prompts in alignment with the user's intention. To address these challenges, we present Promptify, an interactive system that supports prompt exploration and refinement for text-to-image generative models. Promptify utilizes a suggestion engine powered by large language models to help users quickly explore and craft diverse prompts. Our interface allows users to organize the generated images flexibly, and based on their preferences, Promptify suggests potential changes to the original prompt. This feedback loop enables users to iteratively refine their prompts and enhance desired features while avoiding unwanted ones. Our user study shows that Promptify effectively facilitates the text-to-image workflow, allowing users to create visually appealing images on their first attempt while requiring significantly less cognitive load than a widely-used baseline tool.

Authors
Stephen Brade
University of Toronto, Toronto, Ontario, Canada
Bryan Wang
University of Toronto, Toronto, Ontario, Canada
Mauricio Sousa
University of Toronto, Toronto, Ontario, Canada
Sageev Oore
Dalhousie University, Halifax, Nova Scotia, Canada
Tovi Grossman
University of Toronto, Toronto, Ontario, Canada
Paper URL

https://doi.org/10.1145/3586183.3606725

Conference: UIST 2023

ACM Symposium on User Interface Software and Technology

Session: Words and Visuals: Authoring Tools for Text and Images

Gold Room
6 presentations
2023-11-01 19:50:00
2023-11-01 21:10:00