Optimization with/for AI

Conference name
CHI 2025
FontCraft: Multimodal Font Design Using Interactive Bayesian Optimization
Abstract

Creating new fonts requires a lot of human effort and professional typographic knowledge. Despite the rapid advancements of automatic font generation models, existing methods require users to prepare pre-designed characters with target styles using font-editing software, which poses a problem for non-expert users. To address this limitation, we propose FontCraft, a system that enables font generation without relying on pre-designed characters. Our approach integrates the exploration of a font-style latent space with human-in-the-loop preferential Bayesian optimization and multimodal references, facilitating efficient exploration and enhancing user control. Moreover, FontCraft allows users to revisit previous designs, retracting their earlier choices in the preferential Bayesian optimization process. Once users finish editing the style of a selected character, they can propagate it to the remaining characters and further refine them as needed. The system then generates a complete outline font in OpenType format. We evaluated the effectiveness of FontCraft through a user study comparing it to a baseline interface. Results from both quantitative and qualitative evaluations demonstrate that FontCraft enables non-expert users to design fonts efficiently.

Authors
Yuki Tatsukawa
The University of Tokyo, Tokyo, Japan
I-Chao Shen
Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
Mustafa Doga Dogan
Adobe Research, Basel, Switzerland
Anran Qi
Inria, Sophia Antipolis, France
Yuki Koyama
National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan
Ariel Shamir
Reichman University, Herzliya, Israel
Takeo Igarashi
The University of Tokyo, Tokyo, Japan
DOI

10.1145/3706598.3713863

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713863

OptiSub: Optimizing Video Subtitle Presentation for Varied Display and Font Sizes via Speech Pause-Driven Chunking
Abstract

Viewers desire to watch video content with subtitles in various font sizes according to their viewing environment and personal preferences. Unfortunately, because a chunk of the subtitle—a segment of the text corpus displayed on the screen at once—is typically constructed based on one specific font size, text truncation or awkward line breaks can occur when different font sizes are utilized. While existing methods address this problem by reconstructing subtitle chunks based on maximum character counts, they overlook synchronization of the display time with the content, often causing misaligned text. We introduce OptiSub, a fully automated method that optimizes subtitle segmentation to fit any user-specified font size while ensuring synchronization with the content. Our method leverages the timing of speech pauses within the video for synchronization. Experimental results, including a user study comparing OptiSub with previous methods, demonstrate its effectiveness and practicality across diverse font sizes and input videos.

Authors
Dawon Lee
KAIST, Daejeon, Republic of Korea
Jongwoo Choi
KAIST, Daejeon, Republic of Korea
Junyong Noh
KAIST, Daejeon, Republic of Korea
DOI

10.1145/3706598.3714199

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714199

Artificial Intimacy: Exploring Normativity and Personalization through Fine-tuning LLM Chatbots
Abstract

Fine-tuning Large Language Models (LLMs) is one response to the critique of LLMs being biased, erasing diversity, and raising ethical concerns. The Artificial Intimacy project employs artistic methods, taking personalization of chatbots to an extreme by fine-tuning LLMs on individual social media data. We find that regular GPT-3 chatbots attempt to circumvent value-laden content through flagging prompts and producing generic non-answers with variable success. While the transactional nature of such output allowed participants to make sense of responses with less personification, fine-tuned models presented value-laden, normative, and familiar personalities, resulting in strong personification as a way of making sense of the interactions. This mimicry of emotional connection resulted in a sense of artificial intimacy creating expectations for reciprocity and consideration that the models cannot express by design. As the commercialization of interactions with chatbots continues, we discuss the ethics of such emotional manipulation and its implications for personalization of LLMs.

Authors
Mirabelle Jones
University of Copenhagen, Copenhagen, Denmark
Nastasia Griffioen
University of Twente, Enschede, Netherlands
Christina Neumayer
Department of Communication, University of Copenhagen, Denmark
Irina Shklovski
University of Copenhagen, Copenhagen, Denmark
DOI

10.1145/3706598.3713728

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713728

Characterizing Photorealism and Artifacts in Diffusion Model-Generated Images
Abstract

Diffusion model-generated images can appear indistinguishable from authentic photographs, but these images often contain artifacts and implausibilities that reveal their AI-generated provenance. Given the challenge to public trust in media posed by photorealistic AI-generated images, we conducted a large-scale experiment measuring human detection accuracy on 450 diffusion-model generated images and 149 real images. Based on collecting 749,828 observations and 34,675 comments from 50,444 participants, we find that scene complexity of an image, artifact types within an image, display time of an image, and human curation of AI-generated images all play significant roles in how accurately people distinguish real from AI-generated images. Additionally, we propose a taxonomy characterizing artifacts often appearing in images generated by diffusion models. Our empirical observations and taxonomy offer nuanced insights into the capabilities and limitations of diffusion models to generate photorealistic images in 2024.

Authors
Negar Kamali
Northwestern University, Evanston, Illinois, United States
Karyn Nakamura
Northwestern University, Evanston, Illinois, United States
Aakriti Kumar
Northwestern University, Evanston, Illinois, United States
Angelos Chatzimparmpas
Utrecht University, Utrecht, Netherlands
Jessica Hullman
Northwestern University, Evanston, Illinois, United States
Matthew Groh
Northwestern University, Evanston, Illinois, United States
DOI

10.1145/3706598.3713962

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713962

Continual Human-in-the-Loop Optimization
Abstract

Optimal input settings vary across users due to differences in motor abilities and personal preferences, which are typically addressed by manual tuning or calibration. Although human-in-the-loop optimization has the potential to identify optimal settings during use, it is rarely applied due to its long optimization process. A more efficient approach would continually leverage data from previous users to accelerate optimization, exploiting shared traits while adapting to individual characteristics. We introduce the concept of Continual Human-in-the-Loop Optimization and a Bayesian optimization-based method that leverages a Bayesian-neural-network surrogate model to capture population-level characteristics while adapting to new users. We propose a generative replay strategy to mitigate catastrophic forgetting. We demonstrate our method by optimizing virtual reality keyboard parameters for text entry using direct touch, showing reduced adaptation times with a growing user base. Our method opens the door for next-generation personalized input systems that improve with accumulated experience.

Award
Honorable Mention
Authors
Yi-Chi Liao
ETH Zürich, Zürich, Switzerland
Paul Streli
ETH Zürich, Zürich, Switzerland
Zhipeng Li
ETH Zürich, Zürich, Switzerland
Christoph Gebhardt
ETH Zürich, Zürich, Switzerland
Christian Holz
ETH Zürich, Zürich, Switzerland
DOI

10.1145/3706598.3713603

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713603

Coordination Mechanisms in AI Development: Practitioner Experiences on Integrating UX Activities
Abstract

Software development relies on collaboration and alignment between a variety of roles, including software developers and user experience designers. The increasing focus on artificial intelligence in today's development projects has given rise to new challenges in this collaboration. We extend previous work on the process of designing human-AI systems by analysing collaborative practices between UX designers and AI developers through Mintzberg's theory on coordination mechanisms. We conducted 15 in-depth interviews with UX designers and AI developers currently working on AI projects. We contribute by identifying how coordination mechanisms impact the UX design process when developing AI systems, inter-team (a)symmetries in power relations, and a growing need for tools and cross-disciplinary knowledge to support these collaborative efforts. In particular, we outline the risks of coordinating AI development work through the standardisation of output and skills in separately organised UX and AI development teams.

Award
Honorable Mention
Authors
Anders Bruun
Computer Science, Aalborg University, Aalborg Oest, Denmark
Niels van Berkel
Aalborg University, Aalborg, Denmark
Dimitrios Raptis
Aalborg University, Aalborg, Denmark
Effie L-C. Law
Durham University, Durham, United Kingdom
DOI

10.1145/3706598.3713200

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713200

Exploring Reduced Feature Sets for American Sign Language Dictionaries
Abstract

There is currently no easy way to look up signs in sign language. Feature-based dictionaries help overcome this challenge by enabling users to look up a sign by inputting descriptive visual features, such as handshape and movement. However, feature-based dictionaries are typically cumbersome, including large numbers of complex features that the user must sort through. In this work, we explore simplifying the set of features used in feature-based American Sign Language (ASL) dictionaries. We present two studies: 1) a simulation study focused on lookup accuracy for various reduced feature sets, and 2) a user study focused on understanding human preferences between feature sets. Our results suggest that it is possible to dramatically reduce the number of features needed to search for signs without significantly impacting the accuracy of search results, and that smaller feature sets can improve the user experience in some cases.

Award
Honorable Mention
Authors
Ben Kosa
University of Wisconsin-Madison, Madison, Wisconsin, United States
Aashaka Desai
University of Washington, Seattle, Washington, United States
Alex X. Lu
Microsoft Research, Cambridge, Massachusetts, United States
Richard E. Ladner
University of Washington, Seattle, Washington, United States
Danielle Bragg
Microsoft Research, Cambridge, Massachusetts, United States
DOI

10.1145/3706598.3714118

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714118
