Precise editing of text-to-image model outputs remains challenging. Slider-based editing is a recent approach wherein the image’s semantic attributes are manipulated via sliders. However, it has significant user-centric issues. First, slider variations are often inconsistent across the sliding range. Second, the optimal slider range is unpredictable, with default values often being too large or small depending on the prompt and attribute. Third, manipulating one attribute can unintentionally alter others due to the complex entanglement of latent spaces. We introduce AdaptiveSliders, a tool that addresses these challenges by adapting to the specific attributes and prompts, generating consistent slider variations and optimal bounds while minimizing unintended changes. AdaptiveSliders also suggests initial attributes and generates initial images more aligned with prompt semantics. Through three validation studies and one end-to-end user study, we demonstrate that AdaptiveSliders significantly improves user control and experience, enabling semantic slider-based editing aligned with user needs and expectations.
https://dl.acm.org/doi/10.1145/3706598.3714292
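For readers unfamiliar with the slider-based editing paradigm the abstract critiques, the following minimal Python sketch emulates a single semantic slider by linearly interpolating between two CLIP prompt embeddings in a Stable Diffusion pipeline (Hugging Face diffusers) under a fixed seed. The checkpoint, prompts, and interpolation scheme are illustrative assumptions; this is not the AdaptiveSliders method.

```python
# Hedged sketch: slider-style attribute editing by linear interpolation
# between two prompt embeddings (NOT the AdaptiveSliders method).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def embed(prompt: str) -> torch.Tensor:
    """Encode a prompt with the pipeline's CLIP text encoder."""
    tokens = pipe.tokenizer(
        prompt, padding="max_length",
        max_length=pipe.tokenizer.model_max_length,
        truncation=True, return_tensors="pt",
    ).input_ids.to(pipe.device)
    with torch.no_grad():
        return pipe.text_encoder(tokens)[0]

neutral = embed("a portrait photo of a person")            # slider = 0
attribute = embed("a portrait photo of a smiling person")  # slider = 1

for slider in [0.0, 0.25, 0.5, 0.75, 1.0]:
    # Same seed at every step so only the attribute strength changes.
    generator = torch.Generator(device="cuda").manual_seed(42)
    prompt_embeds = (1 - slider) * neutral + slider * attribute
    image = pipe(prompt_embeds=prompt_embeds, generator=generator,
                 num_inference_steps=30).images[0]
    image.save(f"slider_{slider:.2f}.png")
```

This naive interpolation illustrates the very issues the paper targets: the effective range of the scalar and the consistency of changes across it depend heavily on the prompt and attribute.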
The emergence of AI Image Generators (AIGs) has transformed image creation, making it more accessible to generate customized images from simple text prompts. While HCI research has explored the applications of text-to-image generation, the role of AIGs in the visual content creation workflow remains relatively underexplored. To address this, we conducted in-depth interviews with 26 end users who had experience across 14 different AIGs and investigated users' adoption and perceptions of AIGs and AI's role throughout the entire workflow. Key factors examined include user goals, initial vision, tool integration, and decision-making. Our results indicated that functional goals often drive cross-tool integration to achieve desired outcomes, whereas in recreationally motivated use cases, the use of AIGs shapes the social implications of image sharing. We concluded with four distinct use cases, each highlighting how AIGs are integrated at different stages of the creative process based on varying user goals and visions.
https://dl.acm.org/doi/10.1145/3706598.3713227
Existing approaches to color-concept association typically rely on query-based image referencing and color extraction from the retrieved references. However, these approaches are effective only for common concepts and are vulnerable to unstable image referencing and varying image conditions. Our formative study with designers underscores the need for primary-accent color compositions and context-dependent colors (e.g., 'clear' vs. 'polluted' sky) in design. In response, we introduce a generative approach for mining semantically resonant colors that leverages images generated by text-to-image models. Our insight is that contemporary text-to-image models can reproduce visual patterns learned from large-scale real-world data. The framework comprises three stages: concept instancing produces generative samples using diffusion models, text-guided image segmentation identifies concept-relevant regions within the image, and color association extracts primary colors accompanied by accent colors. Quantitative comparisons with expert designs validate our approach's effectiveness, and we demonstrate its applicability through cases in various design scenarios and a gallery.
https://dl.acm.org/doi/10.1145/3706598.3713418
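The three-stage framework can be pictured with the hedged Python sketch below: a diffusion model instances the concept, CLIPSeg performs text-guided segmentation, and k-means clustering extracts a primary color plus accent colors. The specific models (Stable Diffusion v1.5, CLIPSeg), the mask threshold, and k = 5 are assumptions for illustration, not the paper's exact implementation.

```python
# Hedged sketch of a generate -> segment -> extract-colors pipeline,
# loosely mirroring the three stages described in the abstract.
import numpy as np
import torch
from diffusers import StableDiffusionPipeline
from sklearn.cluster import KMeans
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

# Stage 1: concept instancing -- sample an image for the concept.
concept, region = "polluted sky over a city", "sky"
sd = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = sd(f"a photo of {concept}", num_inference_steps=30).images[0]

# Stage 2: text-guided segmentation -- keep only the concept-relevant region.
processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
segmenter = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")
inputs = processor(text=[region], images=[image], return_tensors="pt")
with torch.no_grad():
    logits = segmenter(**inputs).logits
mask = torch.sigmoid(logits).squeeze().numpy() > 0.4   # (352, 352) boolean mask

# Stage 3: color association -- primary color plus accent colors.
pixels = np.asarray(image.resize(mask.shape[::-1]))[mask]  # RGB pixels in the region
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(pixels)
order = np.argsort(-np.bincount(kmeans.labels_))           # largest cluster first
palette = kmeans.cluster_centers_[order].astype(int)
print("primary:", palette[0], "accents:", palette[1:].tolist())
```

Sampling several generations per concept and aggregating their palettes would better approximate the paper's idea of mining stable, semantically resonant colors rather than trusting a single image.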
Text-to-image models can generate visually appealing images from text descriptions. Efforts have been devoted to improving model control with prompt tuning and spatial conditioning. However, our formative study highlights the challenges non-expert users face in crafting appropriate prompts and specifying fine-grained spatial conditions (e.g., depth or canny references) to generate semantically cohesive images, especially when multiple objects are involved. In response, we introduce SketchFlex, an interactive system designed to improve the flexibility of spatially conditioned image generation using rough region sketches. The system automatically infers user prompts with rational descriptions within a semantic space enriched by crowd-sourced object attributes and relationships. Additionally, SketchFlex refines users' rough sketches into canny-based shape anchors, ensuring generation quality and alignment with user intentions. Experimental results demonstrate that SketchFlex achieves more cohesive image generations than end-to-end models, while significantly reducing cognitive load and better matching user intentions compared to a region-based generation baseline.
https://dl.acm.org/doi/10.1145/3706598.3713801
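As a rough analogue of turning a drawing into a "canny-based shape anchor" for spatially conditioned generation, the sketch below runs OpenCV's Canny detector over a rough sketch and feeds the result to a ControlNet-conditioned Stable Diffusion pipeline. The file name, prompt, checkpoints, and conditioning scale are assumptions; SketchFlex's prompt inference and region handling are not reproduced here.

```python
# Hedged sketch: canny-conditioned generation with ControlNet -- a minimal
# analogue of a "canny-based shape anchor", not the SketchFlex system.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Assumed input: a rough user sketch saved as "rough_sketch.png".
sketch = cv2.imread("rough_sketch.png")
gray = cv2.cvtColor(sketch, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)                         # shape anchor
anchor = Image.fromarray(np.stack([edges] * 3, axis=-1))  # 3-channel condition image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# The prompt stands in for SketchFlex's automatically inferred description.
image = pipe(
    "a cozy reading corner with an armchair and a floor lamp",
    image=anchor,
    num_inference_steps=30,
    controlnet_conditioning_scale=0.8,  # how strictly to follow the anchor
).images[0]
image.save("conditioned_result.png")
```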
Cameras are increasingly augmented with computational processing, producing images that blur the line between documenting reality and creative expression. The rise of text-to-image models has redefined the concept of imagery, sparking ethical and philosophical debates. This paper presents the findings of a qualitative study that employed a provocative prototype 'camera', the A(I)Cam, to engage creative practitioners directly in these discussions. Developed using a Research-through-Design (RtD) approach, the tangible prototype generates and instantly prints AI-created images. A(I)Cam facilitated reflection among creative practitioners (N=15) on their experiences with AI-driven tools and the broader implications for their future practices. We examine the shifts in perspective that emerged from engaging with this embodied form of generative AI (genAI), challenging traditional text-based interaction paradigms, and inviting new modes of creative exploration and reflection. In addition, we offer insights from the RtD project, highlighting the integration of genAI tools into the industrial design process.
https://dl.acm.org/doi/10.1145/3706598.3713529
Visual blends combine elements from two distinct visual concepts into a single, integrated image, with the goal of conveying ideas through imaginative and often thought-provoking visuals. Communicating abstract concepts through visual blends poses a series of conceptual and technical challenges. To address these challenges, we introduce Creative Blends, an AI-assisted design system that leverages metaphors to visually symbolize abstract concepts by blending disparate objects. Our method harnesses commonsense knowledge bases and large language models to align designers’ conceptual intent with expressive concrete objects. Additionally, we employ generative text-to-image techniques to blend visual elements through their overlapping attributes. A user study (N=24) demonstrated that our approach reduces participants’ cognitive load, fosters creativity, and enhances the metaphorical richness of visual blend ideation. We explore the potential of our method to expand visual blends to include multiple object blending and discuss the insights gained from designing with generative AI.
https://dl.acm.org/doi/10.1145/3706598.3713683
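A hedged Python sketch of the two-step idea, using an LLM to map an abstract concept to concrete objects and a shared attribute, then requesting a blended image, is shown below. The model names, prompts, and response format are assumptions, and the commonsense knowledge base component of Creative Blends is omitted; this is not the paper's system.

```python
# Hedged sketch: LLM-assisted metaphor mapping followed by a blended image
# request (illustrative only; not the Creative Blends implementation).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
abstract_concept = "burnout"

# Step 1: map the abstract concept to concrete objects and a shared attribute.
chat = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": (
            f"List two concrete, visually distinct objects that metaphorically "
            f"represent '{abstract_concept}', and one visual attribute they share. "
            "Answer exactly as: object1; object2; shared attribute."
        ),
    }],
)
# Naive parse of the reply; a real system would validate the response format.
obj1, obj2, shared = [s.strip() for s in chat.choices[0].message.content.split(";")[:3]]

# Step 2: blend the two objects through their overlapping attribute.
result = client.images.generate(
    model="dall-e-3",
    prompt=(
        f"A single integrated image that blends a {obj1} and a {obj2} "
        f"through their shared {shared}, as a visual metaphor for {abstract_concept}."
    ),
    size="1024x1024",
    n=1,
)
print(result.data[0].url)
```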
Image-generative AI provides new opportunities to transform personal data into alternative visual forms. In this paper, we illustrate the potential of AI-generated images in facilitating meaningful engagement with personal data. In a formative autobiographical design study, we explored the design and use of AI-generated images derived from personal data. Informed by this study, we designed a web-based application as a probe that represents personal data through generative images, using OpenAI's GPT-4 model and DALL-E 3. We then conducted a 21-day diary study and interviews using the probe with 16 participants to investigate users' in-depth experiences with AI-generated images in everyday life. Our findings reveal new qualities of experience in users' engagement with data, highlighting how participants constructed personal meaning from their data through imagination and speculation on AI-generated images. We conclude by discussing the potential and concerns of leveraging image-generative AI for personal data meaning-making.
https://dl.acm.org/doi/10.1145/3706598.3713722
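To make the probe's pipeline concrete, here is a hedged Python sketch that turns a small, invented personal-data entry into a scene description with GPT-4 and renders it with DALL-E 3 via the OpenAI API, mirroring the models named in the abstract. The data fields and prompts are illustrative assumptions, not the study's probe.

```python
# Hedged sketch: personal data -> GPT-4 scene description -> DALL-E 3 image.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative personal-data entry (e.g., from a diary or fitness tracker).
entry = {"date": "2024-05-12", "steps": 11240, "mood": "calm", "weather": "light rain"}

description = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": (
            "Write one evocative sentence describing an imaginary scene that "
            f"reflects this day, without showing any numbers or text: {entry}"
        ),
    }],
).choices[0].message.content

image = client.images.generate(
    model="dall-e-3", prompt=description, size="1024x1024", n=1
)
print(description)
print(image.data[0].url)
```

The indirection through a generated scene description, rather than plotting the data, is what leaves room for the imagination and speculation the findings describe.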