Adapting an interface requires taking into account both the positive and negative effects that changes may have on the user. A carelessly picked adaptation may impose high costs on the user – for example, due to surprise or relearning effort – or prematurely "trap" the process in a suboptimal design. However, effects on users are hard to predict as they depend on factors that are latent and evolve over the course of interaction. We propose a novel approach for adaptive user interfaces that yields a conservative adaptation policy: it finds beneficial changes when they exist and avoids changes when there are none. Our model-based reinforcement learning method plans sequences of adaptations and consults predictive HCI models to estimate their effects. We present empirical and simulation results from the case of adaptive menus, showing that the method outperforms both a non-adaptive and a frequency-based policy.
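For orientation, a frequency-based menu policy of the kind used as a baseline above can be sketched as follows. The abstract does not define the baseline, so this is one common reading (reorder items by click count) and not the authors' implementation; the function name `frequency_based_menu` and the sample data are hypothetical.

```python
from collections import Counter


def frequency_based_menu(items, click_log):
    """Reorder menu items by descending click count (assumed baseline).

    Items with equal counts keep their original relative order,
    because sorted() is stable.
    """
    counts = Counter(click_log)
    return sorted(items, key=lambda item: -counts[item])


# Hypothetical menu and click history, for illustration only.
menu = ["Open", "Save", "Export", "Print"]
clicks = ["Export", "Save", "Export", "Print", "Export", "Save"]
print(frequency_based_menu(menu, clicks))
```

A policy like this changes the layout whenever the counts change, which is the kind of behavior a conservative policy would weigh against surprise and relearning costs.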
Visual blends are an advanced graphic design technique to seamlessly integrate two objects into one. Existing tools help novices create prototypes of blends, but it is unclear how they would improve them to be higher fidelity. To help novices, we aim to add structure to the iterative improvement process. We introduce a method for improving prototypes that uses secondary design dimensions to explore a structured design space. This method is grounded in the cognitive principles of human visual object recognition. We present VisiFit – a computational design system that uses this method to enable novice graphic designers to improve blends with computationally generated options they can select, adjust, and chain together. Our evaluation shows novices can substantially improve 76% of blends in under 4 minutes. We discuss how the method can be generalized to other blending problems, and how computational tools can support novices by enabling them to explore a structured design space quickly and efficiently.
Visual design provides the backdrop to most of our interactions over the Internet, but has not received as much analytical attention as textual content. Combining computational with qualitative approaches, we investigate the growing concern that visual design of the World Wide Web has homogenized over the past decade. By applying computer vision techniques to a large dataset of representative website images from 2003 to 2019, we show that designs have become significantly more similar since 2007, especially for page layouts, where the average distance between sites decreased by over 30%. Synthesizing interviews with 11 experienced web design professionals with our computational analyses, we discuss causes of this homogenization, including overlap in source code and libraries, color scheme standardization, and support for mobile devices. Our results seek to motivate future discussion of the factors that influence designers and their implications for the future trajectory of web design.
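The "average distance between sites" can be read as a mean pairwise distance over per-site feature vectors. The sketch below assumes Euclidean distance and made-up two-dimensional features; the abstract does not state which features or distance metric were used, and `mean_pairwise_distance` is a hypothetical helper.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(vectors):
    """Average Euclidean distance over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


# Made-up layout features for three sites in two different years.
earlier = [(0, 0), (3, 4), (6, 8)]
later = [(0, 0), (3, 0), (0, 4)]

before = mean_pairwise_distance(earlier)
after = mean_pairwise_distance(later)
# A negative relative change means the sites moved closer together.
print((after - before) / before)
```

A smaller mean in the later year indicates that the sampled designs have become more similar under the chosen features.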
Recent research on creativity support tools (CST) adopts artificial intelligence (AI) that leverages big data and computational capabilities to facilitate creative work. Our work aims to articulate the role of AI in supporting creativity with a case study of an AI-based CST in fashion design based on theoretical groundings. We developed AI models by externalizing three cognitive operations (extending, constraining, and blending) that are associated with divergent and convergent thinking. We present FashionQ, an AI-based CST that has three interactive visualization tools (StyleQ, TrendQ, and MergeQ). Through interviews and a user study with 20 fashion design professionals (10 participants for the interviews and 10 for the user study), we demonstrate the effectiveness of FashionQ in facilitating divergent and convergent thinking and identify opportunities and challenges of incorporating AI in the ideation process. Our findings highlight the role and use of AI in each cognitive operation based on professionals’ expertise and suggest future implications of AI-based CST development.
Advertising posters are a commonly used form of information presentation to promote a product. Producing advertising posters often demands considerable time and effort from designers, who are confronted with abundant choices of design elements and layouts. This paper presents Vinci, an intelligent system that supports the automatic generation of advertising posters. Given the user-specified product image and taglines, Vinci uses a deep generative model to match the product image with a set of design elements and layouts for generating an aesthetic poster. The system also integrates online editing-feedback that supports users in editing the posters and updating the generated results with their design preference. Through a series of user studies and a Turing test, we found that Vinci can generate posters as good as those created by human designers and that the online editing-feedback improves the efficiency of poster modification.
Representing the semantics of GUI screens and components is crucial to data-driven computational methods for modeling user-GUI interactions and mining GUI designs. Existing GUI semantic representations are limited to encoding either the textual content, the visual design and layout patterns, or the app contexts. Many representation techniques also require significant manual data annotation efforts. This paper presents Screen2Vec, a new self-supervised technique for generating representations in embedding vectors of GUI screens and components that encode all of the above GUI features without requiring manual annotation using the context of user interaction traces. Screen2Vec is inspired by the word embedding method Word2Vec, but uses a new two-layer pipeline informed by the structure of GUIs and interaction traces and incorporates screen- and app-specific metadata. Through several sample downstream tasks, we demonstrate Screen2Vec's key useful properties: representing between-screen similarity through nearest neighbors, composability, and capability to represent user tasks.
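Nearest-neighbor lookup over screen embeddings, the first property listed above, can be illustrated with cosine similarity. The sketch assumes the embeddings are already computed; the screen names and three-dimensional vectors are invented for illustration and do not come from Screen2Vec, and cosine similarity is an assumed choice of metric.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity of two nonzero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(query, embeddings, k=2):
    """Return the k screen names most similar to `query`, best first."""
    scored = [
        (name, cosine_similarity(embeddings[query], vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:k]]


# Invented embeddings; real ones would have many more dimensions.
screens = {
    "login": [1.0, 0.0, 0.1],
    "signup": [0.9, 0.1, 0.2],
    "settings": [0.0, 1.0, 0.0],
    "checkout": [0.1, 0.2, 1.0],
}
print(nearest_neighbors("login", screens, k=2))
```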
What breathes life into an embodied agent or avatar? While body motions such as facial expressions, speech and gestures have been well studied, relatively little attention has been paid to subtle changes due to underlying physiology. We argue that subtle pulse signals are important for creating more lifelike and less disconcerting avatars. We propose a method for animating blood flow patterns, based on a data-driven physiological model that can be used to directly augment the appearance of synthetic avatars and photo-realistic faces. While the changes are difficult for participants to "see", they select faces with blood flow as more anthropomorphic and animated significantly more often than faces without blood flow. Furthermore, by manipulating the frequency of the heart rate in the underlying signal we can change the perceived arousal of the character.
We propose a method that generates a virtual camera layout of a 3D animation scene by following the cinematic intention of a reference video. From a reference video, cinematic features such as the start frame, end frame, framing, camera movement, and the visual features of the subjects are extracted automatically. The extracted information is used to generate the virtual camera layout, which resembles the camera layout of the reference video. Our method handles human characters as well as stylized characters with body proportions different from those of humans. We demonstrate the effectiveness of our approach with various reference videos and 3D animation scenes. The user evaluation results show that the generated layouts are comparable to layouts created by the artist, allowing us to assert that our method can provide effective assistance to both novice and professional users when positioning a virtual camera.
Showing ads delivers revenue for online content distributors, but ad exposure can compromise user experience and cause user fatigue and frustration. Correctly balancing ads with other content is imperative. Currently, ad allocation relies primarily on demographics and inferred user interests, which are treated as static features and can be privacy-intrusive. This paper uses person-centric and momentary context features to understand optimal ad timing. In a quasi-experimental study on a three-month longitudinal dataset of 100K Snapchat users, we find that ad timing influences ad effectiveness. We draw insights on the relationship between ad effectiveness and momentary behaviors such as duration, interactivity, and interaction diversity. We simulate ad reallocation, finding that our study-driven insights lead to greater value for the platform. This work advances our understanding of ad consumption and bears implications for designing responsible ad allocation systems, improving both user and platform outcomes. We discuss privacy-preserving components and ethical implications of our work.
Recent advances in deep generative neural networks have made it possible for artificial intelligence to actively collaborate with human beings in co-creating novel content (e.g. music, art). While substantial research focuses on (individual) human-AI collaborations, comparatively less research examines how AI can play a role in human-human collaborations during co-creation. In a qualitative lab study, we observed 30 participants (15 pairs) compose a musical phrase in pairs, both with and without AI. Our findings reveal that AI may play important roles in influencing human social dynamics during creativity, including: 1) implicitly seeding a common ground at the start of collaboration, 2) acting as a psychological safety net in creative risk-taking, 3) providing a force for group progress, 4) mitigating interpersonal stalling and friction, and 5) altering users' collaborative and creative roles. This work contributes to the future of generative AI in social creativity by providing implications for how AI could enrich, impede, or alter creative social dynamics in the years to come.
People increasingly rely on Artificial Intelligence (AI) based systems to aid decision-making in various domains and often face a choice between alternative systems. We explored the effects of users' perception of AI systems' warmth (perceived intent) and competence (perceived ability) on their choices. In a series of studies, we manipulated AI systems' warmth and competence levels. We show that, similar to the judgments of other people, there is often primacy for warmth over competence. Specifically, when faced with a choice between a high-competence system and a high-warmth system, more participants preferred the high-warmth system. Moreover, the precedence of warmth persisted even when the high-warmth system was overtly deficient in its competence compared to an alternative high competence-low warmth system. The current research proposes that it may be vital for AI systems designers to consider and communicate the system's warmth characteristics to its potential users.
Designing intelligent interactive text entry systems often relies on factors that are difficult to estimate or assess using traditional HCI design and evaluation methods. We introduce a complementary approach by adapting function structure models from engineering design. We extend their use by extracting controllable and uncontrollable parameters from function structure models and visualizing their impact using envelope analysis. Function structure models allow designers to understand a system in terms of its functions and flows between functions and decouple functions from function carriers. Envelope analysis allows the designer to further study how parameters affect variables of interest, for example, accuracy, keystroke savings and other dependent variables. We provide examples of function structure models and illustrate a complete envelope analysis by investigating a parameterized function structure model of predictive text entry. We discuss the implications of this design approach for both text entry system design and for critique of system contributions.
Digital text has become one of the primary ways of exchanging knowledge, but text needs to be rendered to a screen to be read. We present AdaptiFont, a human-in-the-loop system that is aimed at interactively increasing readability of text displayed on a monitor. To this end, we first learn a generative font space with non-negative matrix factorization from a set of classic fonts. In this space we generate new TrueType fonts through active learning, render texts with the new font, and measure individual users’ reading speed. Bayesian optimization sequentially generates new fonts on the fly to progressively increase individuals’ reading speed. The results of a user study show that this adaptive font generation system finds regions in the font space corresponding to high reading speeds, that these fonts significantly increase participants’ reading speed, and that the fonts found differ significantly across individual readers.
To create aesthetically pleasing aerial footage, the correct framing of camera targets is crucial. However, current quadrotor camera tools do not consider the 3D extent of actual camera targets in their optimization schemes and simply interpolate between keyframes when generating a trajectory. This can yield videos with aesthetically unpleasing target framing. In this paper, we propose an optimization formulation that optimizes the quadrotor camera pose such that targets are positioned at desirable screen locations according to videographic compositional rules and are entirely visible throughout a shot. Camera targets are identified using a semi-automatic pipeline which leverages a deep-learning-based visual saliency model. A large-scale perceptual study (N≈500) shows that our method enables users to produce shots with a target framing that is closer to what they intended to create and that is as or more aesthetically pleasing than with the previous state of the art.
According to psychology research, emotional induction has positive implications in many domains such as therapy and education. Our aim in this paper was to manipulate the Regulatory Focus Theory to assess its impact on the induction of regulatory-focus-related emotions in children in a pretend play scenario with a social robot. The Regulatory Focus Theory suggests that people follow one of two paradigms while attempting to achieve a goal: seeking gains (promotion focus, associated with feelings of happiness) or avoiding losses (prevention focus, associated with feelings of fear). We conducted a study with 69 school children in two different conditions (promotion vs. prevention). We succeeded in inducing feelings of happiness in the promotion condition and found a resulting positive effect of the induction on children's social engagement with the robot. We also discuss the important implications of these results for both education and child-robot interaction.