Featured Papers

Showing up to the top 30 papers in each category.

ACM Symposium on User Interface Software and Technology

DrawTalking: Building Interactive Worlds by Sketching and Speaking
Karl Toby Rosenberg (New York University, New York, New York, United States), Rubaiat Habib Kazi (Adobe Research, Seattle, Washington, United States), Li-Yi Wei (Adobe Research, San Jose, California, United States), Haijun Xia (University of California, San Diego, San Diego, California, United States), Ken Perlin (New York University, New York, New York, United States)
We introduce DrawTalking, an approach to building and controlling interactive worlds by sketching and speaking while telling stories. It emphasizes user control and flexibility, and gives programming-like capability without requiring code. An early open-ended study with our prototype shows that the mechanics resonate and are applicable to many creative-exploratory use cases, with the potential to inspire and inform research in future natural interfaces for creative exploration and authoring.
OptiBasePen: Mobile Base+Pen Input on Passive Surfaces by Sensing Relative Base Motion Plus Close-Range Pen Position
Andreas Rene Fender (University of Stuttgart, Stuttgart, Germany), Mohamed Kari (University of Duisburg-Essen, Essen, Germany)
Digital pen input devices based on absolute pen position sensing, such as Wacom Pens, support high-fidelity pen input. However, they require specialized sensing surfaces like drawing tablets, which can have a large desk footprint, constrain the possible input area, and limit mobility. In contrast, digital pens with integrated relative sensing enable mobile use on passive surfaces, but suffer from motion artifacts or require surface contact at all times, deviating from natural pen affordances. We present OptiBasePen, a device for mobile pen input on ordinary surfaces. Our prototype consists of two parts: the "base" on which the hand rests and the pen for fine-grained input. The base features a high-precision mouse sensor to sense its own relative motion, and two infrared image sensors to track the absolute pen tip position within the base's frame of reference. This enables pen input on ordinary surfaces without external cameras while also avoiding drift from pen micro-movements. In this work, we present our prototype as well as the general base+pen concept, which combines relative and absolute sensing.
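To make the base+pen concept more concrete, the following minimal sketch (not the authors' implementation) shows one way the two sensing streams could be fused in 2D: relative motion deltas from the base are integrated into a surface-frame pose, and the pen tip, sensed within the base's frame of reference, is transformed into surface coordinates. Class and variable names are assumptions for illustration.

```python
import math

class BasePenFusion:
    """Minimal 2D sketch: fuse relative base motion with pen position
    sensed in the base's local frame (illustrative only)."""

    def __init__(self):
        self.base_x, self.base_y = 0.0, 0.0   # base position on the surface
        self.base_theta = 0.0                 # base orientation (radians)

    def update_base(self, dx, dy, dtheta):
        """Integrate relative motion deltas reported by the base sensor."""
        # Deltas arrive in the base's own frame, so rotate them first.
        c, s = math.cos(self.base_theta), math.sin(self.base_theta)
        self.base_x += c * dx - s * dy
        self.base_y += s * dx + c * dy
        self.base_theta += dtheta

    def pen_on_surface(self, pen_local_x, pen_local_y):
        """Map the pen tip (absolute within the base frame) to surface coords."""
        c, s = math.cos(self.base_theta), math.sin(self.base_theta)
        return (self.base_x + c * pen_local_x - s * pen_local_y,
                self.base_y + s * pen_local_x + c * pen_local_y)

fusion = BasePenFusion()
fusion.update_base(dx=5.0, dy=0.0, dtheta=0.1)   # hand slides the base
print(fusion.pen_on_surface(2.0, 1.5))           # fine-grained pen input
```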
PortalInk: 2.5D Visual Storytelling with SVG Parallax and Waypoint Transitions
Tongyu Zhou (Brown University, Providence, Rhode Island, United States), Joshua Kong Yang (Brown University, Providence, Rhode Island, United States), Vivian Hsinyueh Chan (UC Berkeley, Berkeley, California, United States), Ji Won Chung (Brown University, Providence, Rhode Island, United States), Jeff Huang (Brown University, Providence, Rhode Island, United States)
Efforts to expand the authoring of visual stories beyond the 2D canvas have commonly mapped flat imagery to 3D scenes or objects. This translation requires spatial reasoning, as artists must think in two spaces. We propose PortalInk, a tool for artists to craft and export 2.5D graphical stories while remaining in 2D space by using SVG transitions. This is achieved via a parallax effect that generates a sense of depth that can be further explored using pan and zoom interactions. Any canvas position can be saved and linked to in a closed drawn stroke, or "portal," allowing the artist to create spatially discontinuous, or even infinitely looping visual trajectories. We provide three case studies and a gallery to demonstrate how artists can naturally incorporate these interactions to craft immersive comics, as well as re-purpose them to support use cases beyond drawing such as animation, slide-based presentations, web design, and digital journalism.
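As a rough illustration of the parallax idea, independent of PortalInk's actual SVG implementation, the sketch below computes per-layer translations: layers tagged with greater depth translate less during a pan, which produces the sense of depth the paper describes. The depth encoding and function names are assumptions.

```python
def parallax_offsets(layer_depths, pan_x, pan_y, zoom=1.0):
    """Toy parallax: each layer's on-screen translation scales with its depth.

    depth = 0.0 -> foreground layer, moves fully with the pan;
    depth = 1.0 -> farthest layer, barely moves. (Illustrative only.)
    """
    offsets = []
    for depth in layer_depths:
        factor = (1.0 - depth) * zoom
        offsets.append((pan_x * factor, pan_y * factor))
    return offsets

# Three layers at increasing depth; panning 40 px right, 10 px down.
print(parallax_offsets([0.0, 0.5, 0.9], pan_x=40, pan_y=10))
```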
An Interactive System for Supporting Creative Exploration of Cinematic Composition Designs
Rui He (The Hong Kong Polytechnic University, Hong Kong, Hong Kong), Huaxin Wei (The Hong Kong Polytechnic University, Kowloon, Hong Kong), Ying Cao (ShanghaiTech University, Shanghai, China)
Designing cinematic compositions, which involves moving cameras through a scene, is essential yet challenging in filmmaking. Machinima filmmaking provides real-time virtual environments for exploring different compositions flexibly and efficiently. However, producing high-quality cinematic compositions in such environments still requires significant cinematography skills and creativity. This paper presents Cinemassist, a tool designed to support and enhance this creative process by generating a variety of cinematic composition proposals at both keyframe and scene levels, which users can incorporate into their workflows and achieve more creative results. At the crux of our system is a deep generative model trained on real movie data, which can generate plausible, diverse camera poses conditioned on 3D animations and additional input semantics. Our model enables an interactive cinematic composition design workflow where users can co-design with the model by being inspired by model-generated suggestions while having control over the generation process. Our user study and expert rating find Cinemassist can facilitate the design process for users of different backgrounds and enhance the design quality especially for users with animation expertise, demonstrating its potential as an invaluable tool in the context of digital filmmaking.
AniCraft: Crafting Everyday Objects as Physical Proxies for Prototyping 3D Character Animation in Mixed Reality
Boyu Li (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China), Linping Yuan (The Hong Kong University of Science and Technology, Hong Kong, Hong Kong), Zhe Yan (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China), Qianxi Liu (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China), Yulin Shen (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China), Zeyu Wang (The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China)
We introduce AniCraft, a mixed reality system for prototyping 3D character animation using physical proxies crafted from everyday objects. Unlike existing methods that require specialized equipment to support the use of physical proxies, AniCraft only requires affordable markers, webcams, and daily accessible objects and materials. AniCraft allows creators to prototype character animations through three key stages: selection of virtual characters, fabrication of physical proxies, and manipulation of these proxies to animate the characters. This authoring workflow is underpinned by diverse physical proxies, manipulation types, and mapping strategies, which ease the process of posing virtual characters and mapping user interactions with physical proxies to animated movements of virtual characters. We provide a range of cases and potential applications to demonstrate how diverse physical proxies can inspire user creativity. User experiments show that our system can outperform traditional animation methods for rapid prototyping. Furthermore, we provide insights into the benefits and usage patterns of different materials, which lead to design implications for future research.
Embrogami: Shape-Changing Textiles with Machine Embroidery
Yu Jiang (Saarland University, Saarbrücken, Germany), Alice C. Haynes (Saarland Informatics Campus, Saarbrücken, Germany), Narjes Pourjafarian (Cornell University, Ithaca, New York, United States), Jan Borchers (RWTH Aachen University, Aachen, Germany), Jürgen Steimle (Saarland University, Saarland Informatics Campus, Saarbrücken, Germany)
Machine embroidery is a versatile technique for creating custom and entirely fabric-based patterns on thin and conformable textile surfaces. However, existing machine-embroidered surfaces remain static, limiting the interactions they can support. We introduce Embrogami, an approach for fabricating textile structures with versatile shape-changing behaviors. Inspired by origami, we leverage machine embroidery to form finger-tip-scale mountain-and-valley structures on textiles with customized shapes, bistable or elastic behaviors, and modular composition. The structures can be actuated by the user or the system to modify the local textile surface topology, creating interactive elements like toggles and sliders or textile shape displays with an ultra-thin, flexible, and integrated form factor. We provide a dedicated software tool and report results of technical experiments to allow users to flexibly design, fabricate, and deploy customized Embrogami structures. With four application cases, we showcase Embrogami’s potential to create functional and flexible shape-changing textiles with diverse visuo-tactile feedback.
Palmrest+: Expanding Laptop Input Space with Shear Force on Palm-Resting Area
Jisu Yim (KAIST, Daejeon, Korea, Republic of), Seoyeon Bae (KAIST, Daejeon, Korea, Republic of), Taejun Kim (School of Computing, KAIST, Daejeon, Korea, Republic of), Sunbum Kim (School of Computing, KAIST, Daejeon, Korea, Republic of), Geehyuk Lee (School of Computing, KAIST, Daejeon, Korea, Republic of)
The palmrest area of laptops has potential as an additional input space, given its consistent palm contact during keyboard interaction. We propose Palmrest+, which leverages shear force exerted on the palmrest area. We suggest two input techniques: Palmrest Shortcut, for instant shortcut execution, and Palmrest Joystick, for continuous value input. These allow seamless and subtle input amidst keyboard typing. An evaluation of Palmrest Shortcut against conventional keyboard shortcuts revealed faster performance for applying shear force in a unimanual and bimanual manner, with a significant reduction in gaze shifting. Additionally, an assessment of Palmrest Joystick against the laptop touchpad demonstrated comparable performance in selecting one- and two-dimensional targets with low-precision pointing, i.e., for short distances and large target sizes. Maximal hand displacement decreased significantly for both Palmrest Shortcut and Palmrest Joystick compared to conventional methods. These findings verify the feasibility and effectiveness of the palmrest area as an additional input space on laptops, promising enhanced typing-related interaction experiences.
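The two techniques map a shear-force vector to either a discrete shortcut or a continuous joystick value. The sketch below is a minimal, hypothetical mapping (not the authors' implementation): a dead zone suppresses incidental palm force, a firm directional push triggers a shortcut, and intermediate forces are normalized into a joystick value. All thresholds are assumed.

```python
import math

DEADZONE = 0.5        # assumed force threshold (N) below which input is ignored
SHORTCUT_FORCE = 2.0  # assumed force needed to trigger a discrete shortcut

def shear_to_joystick(fx, fy, max_force=4.0):
    """Map a shear-force vector (N) to a normalized 2D joystick value."""
    magnitude = math.hypot(fx, fy)
    if magnitude < DEADZONE:
        return (0.0, 0.0)
    scale = min((magnitude - DEADZONE) / (max_force - DEADZONE), 1.0)
    return (fx / magnitude * scale, fy / magnitude * scale)

def shear_to_shortcut(fx, fy):
    """Map a strong shear impulse to one of four directional shortcuts."""
    if math.hypot(fx, fy) < SHORTCUT_FORCE:
        return None
    if abs(fx) > abs(fy):
        return "right" if fx > 0 else "left"
    return "up" if fy > 0 else "down"

print(shear_to_joystick(1.8, 0.4))   # subtle continuous input while typing
print(shear_to_shortcut(3.1, -0.2))  # firm push triggers the 'right' shortcut
```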
ScriptViz: A Visualization Tool to Aid Scriptwriting based on a Large Movie Database
Anyi Rao (Stanford University, Stanford, California, United States), Jean-Peïc Chou (Stanford University, Stanford, California, United States), Maneesh Agrawala (Stanford University, Stanford, California, United States)
Scriptwriters usually rely on their mental visualization to create a vivid story by using their imagination to see, feel, and experience the scenes they are writing. Besides mental visualization, they often refer to existing images or scenes in movies and analyze the visual elements to create a certain mood or atmosphere. In this paper, we develop a new tool, ScriptViz, to provide external visualization based on a large movie database for the screenwriting process. It retrieves reference visuals on the fly based on scripts’ text and dialogue from a large movie database. The tool provides two types of control on visual elements that enable writers to 1) see exactly what they want with fixed visual elements and 2) see variances in uncertain elements. User evaluation among 15 scriptwriters shows that ScriptViz is able to present scriptwriters with consistent yet diverse visual possibilities, aligning closely with their scripts and helping their creation.
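The paper retrieves reference visuals on the fly from a movie database based on script text. One common way to sketch such retrieval, which may differ from ScriptViz's actual method, is embedding-based nearest-neighbor search; the toy example below uses cosine similarity over assumed precomputed shot embeddings.

```python
import numpy as np

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def retrieve_reference_shots(script_embedding, shot_embeddings, shot_ids, k=3):
    """Return the k movie shots whose embeddings best match the script text."""
    scores = [cosine_similarity(script_embedding, e) for e in shot_embeddings]
    ranked = sorted(zip(shot_ids, scores), key=lambda p: p[1], reverse=True)
    return ranked[:k]

# Toy data: four shots with random 8-d embeddings and one query embedding.
rng = np.random.default_rng(0)
shots = rng.normal(size=(4, 8))
query = rng.normal(size=8)
print(retrieve_reference_shots(query, shots, ["shot_a", "shot_b", "shot_c", "shot_d"]))
```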
Lumina: A Software Tool for Fostering Creativity in Designing Chinese Shadow Puppets
Zhihao Yao (Tsinghua University, Beijing, Beijing, China), Yao Lu (Tsinghua University, Beijing, China), Qirui Sun (Tsinghua University, Beijing, China), Shiqing Lyu (Tsinghua University, Beijing, China), Hanxuan Li (The Future Laboratory, Beijing, China), Xing-Dong Yang (Simon Fraser University, Burnaby, British Columbia, Canada), Xuezhu Wang (Tsinghua University, Beijing, China), Guanhong Liu (Tongji University, Shanghai, China), Haipeng Mi (Tsinghua University, Beijing, China)
Shadow puppetry, a culturally rich storytelling art, faces challenges transitioning to the digital realm. Creators in the early design phase struggle with crafting intricate patterns, textures, and basic animations while adhering to stylistic conventions, which hinders creativity, especially for novices. This paper presents Lumina, a tool to facilitate the early stage of Chinese shadow puppet design. Lumina provides contour templates, animations, scene editing tools, and machine-generated traditional puppet patterns. These features liberate creators from tedious tasks, allowing them to focus on the creative process. Developed based on a formative study with puppet creators, the web-based Lumina enables wide dissemination. An evaluation with 18 participants demonstrated Lumina's effectiveness and ease of use, with participants successfully creating designs spanning traditional themes to contemporary and science-fiction concepts.
Chromaticity Gradient Mapping for Interactive Control of Color Contrast in Images and Video
Ruyu Yan (Princeton University, Princeton, New Jersey, United States), Jiatian Sun (Cornell University, Ithaca, New York, United States), Abe Davis (Cornell University, Ithaca, New York, United States)
We present a novel perceptually-motivated interactive tool for using color contrast to enhance details represented in the lightness channel of images and video. Our method lets users adjust the perceived contrast of different details by manipulating local chromaticity while preserving the original lightness of individual pixels. Inspired by the use of similar chromaticity mappings in painting, our tool effectively offers contrast along a user-selected gradient of chromaticities as additional bandwidth for representing and enhancing different details in an image. We provide an interface for our tool that closely resembles the familiar design of tonal contrast curve controls that are available in most professional image editing software. We show that our tool is effective for enhancing the perceived contrast of details without altering lightness in an image and present many examples of effects that can be achieved with our method on both images and video.
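A minimal sketch of the underlying idea, not the authors' algorithm: work in CIELAB, keep each pixel's lightness L* fixed, and shift chromaticity (a*, b*) along a chosen direction in proportion to how far the pixel's lightness deviates from the mean, so details gain color contrast without any lightness change. The weighting scheme and strength parameter are assumptions.

```python
import numpy as np
from skimage import color

def enhance_with_chromaticity(rgb, direction=(1.0, 0.0), strength=15.0):
    """Shift chromaticity (a*, b*) along a chosen direction while keeping
    the lightness channel L* of every pixel unchanged (illustrative only)."""
    lab = color.rgb2lab(rgb)
    L = lab[..., 0]
    weight = (L - L.mean()) / 100.0           # signed, roughly in [-1, 1]
    da, db = direction
    lab[..., 1] += strength * weight * da     # a* channel
    lab[..., 2] += strength * weight * db     # b* channel
    return np.clip(color.lab2rgb(lab), 0.0, 1.0)

# Toy 2x2 gray gradient; lightness stays the same, chromaticity now varies.
img = np.stack([np.array([[0.2, 0.4], [0.6, 0.8]])] * 3, axis=-1)
print(enhance_with_chromaticity(img))
```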
Story-Driven: Exploring the Impact of Providing Real-time Context Information on Automated Storytelling
Jan Henry Belz (Porsche AG, Stuttgart, Baden-Württemberg, Germany), Lina Madlin Weilke (Porsche AG, Stuttgart, Germany), Anton Winter (Porsche AG, Stuttgart, Germany), Philipp Hallgarten (Porsche AG, Stuttgart, Baden-Württemberg, Germany), Enrico Rukzio (University of Ulm, Ulm, Germany), Tobias Grosse-Puppendahl (Porsche AG, Stuttgart, Germany)
Stories have long captivated the human imagination with narratives that enrich our lives. Traditional storytelling methods are often static and not designed to adapt to the listener’s environment, which is full of dynamic changes. For instance, people often listen to stories in the form of podcasts or audiobooks while traveling in a car. Yet, conventional in-car storytelling systems do not embrace the adaptive potential of this space. The advent of generative AI is the key to creating content that is not just personalized but also responsive to the changing parameters of the environment. We introduce a novel system for interactive, real-time story narration that leverages environment and user context in correspondence with estimated arrival times to adjust the generated story continuously. Through two comprehensive real-world studies with a total of 30 participants in a vehicle, we assess the user experience, level of immersion, and perception of the environment provided by the prototype. Participants' feedback shows a significant improvement over traditional storytelling and highlights the importance of context information for generative storytelling systems.
Touchscreen-based Hand Tracking for Remote Whiteboard Interaction
Xinshuang Liu (University of California, San Diego, San Diego, California, United States), Yizhong Zhang (Microsoft Research Asia, Beijing, China), Xin Tong (Microsoft Research Asia, Beijing, China)
In whiteboard-based remote communication, the seamless integration of drawn content and hand-screen interactions is essential for an immersive user experience. Previous methods either require bulky device setups for capturing hand gestures or fail to accurately track hand poses from capacitive images. In this paper, we present a real-time method for precisely tracking the 3D poses of both hands from capacitive video frames. To this end, we develop a deep neural network to identify hands and infer hand joint positions from capacitive frames, and then recover 3D hand poses from the hand-joint positions via a constrained inverse kinematic solver. Additionally, we design a device setup for capturing high-quality hand-screen interaction data and obtain a more accurate, synchronized capacitive video and hand pose dataset. Our method improves the accuracy and stability of 3D hand tracking for capacitive frames while maintaining a compact device setup for remote communication. We validate our scheme design and its superior performance on 3D hand pose tracking and demonstrate the effectiveness of our method in whiteboard-based remote communication.
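As a hedged illustration of the first stage only, a small PyTorch model can map a capacitive frame to per-hand joint positions; the layer sizes below are arbitrary assumptions, this is not the authors' architecture, and the constrained inverse-kinematics stage that recovers 3D poses is omitted.

```python
import torch
import torch.nn as nn

NUM_JOINTS = 21  # assumed number of joints per hand

class CapacitiveJointNet(nn.Module):
    """Tiny illustrative CNN: capacitive frame -> 2D joint positions."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.head = nn.Linear(32 * 4 * 4, NUM_JOINTS * 2)

    def forward(self, frame):
        x = self.features(frame)
        x = torch.flatten(x, start_dim=1)
        return self.head(x).view(-1, NUM_JOINTS, 2)

# One fake 36x64 capacitive frame -> 21 predicted (x, y) joint positions.
net = CapacitiveJointNet()
print(net(torch.randn(1, 1, 36, 64)).shape)  # torch.Size([1, 21, 2])
```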
Patchview: LLM-powered Worldbuilding with Generative Dust and Magnet Visualization
John Joon Young Chung (Midjourney, San Francisco, California, United States), Max Kreminski (Midjourney, San Francisco, California, United States)
Large language models (LLMs) can help writers build story worlds by generating world elements, such as factions, characters, and locations. However, making sense of many generated elements can be overwhelming. Moreover, if the user wants to precisely control aspects of generated elements that are difficult to specify verbally, prompting alone may be insufficient. We introduce Patchview, a customizable LLM-powered system that visually aids worldbuilding by allowing users to interact with story concepts and elements through the physical metaphor of magnets and dust. Elements in Patchview are visually dragged closer to concepts with high relevance, facilitating sensemaking. The user can also steer the generation with verbally elusive concepts by indicating the desired position of the element between concepts. When the user disagrees with the LLM's visualization and generation, they can correct those by repositioning the element. These corrections can be used to align the LLM's future behaviors to the user's perception. With a user study, we show that Patchview supports the sensemaking of world elements and steering of element generation, facilitating exploration during the worldbuilding process. Patchview provides insights on how customizable visual representation can help sensemake, steer, and align generative AI model behaviors with the user's intentions.
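The magnet-and-dust metaphor can be read as relevance-weighted positioning: an element sits at the weighted average of the concept positions it is most relevant to, and conversely, a user-chosen position can be read back as desired relevance weights for steering generation. The sketch below is an illustrative simplification, not Patchview's implementation.

```python
import numpy as np

def element_position(concept_positions, relevances):
    """Place an element at the relevance-weighted average of concept
    ('magnet') positions, so it drifts toward the concepts it matches best."""
    pos = np.asarray(concept_positions, dtype=float)   # shape (n_concepts, 2)
    w = np.asarray(relevances, dtype=float)
    w = w / w.sum()
    return (w[:, None] * pos).sum(axis=0)

def position_to_relevance(concept_positions, element_pos):
    """Inverse reading used for steering: closer to a magnet -> higher weight."""
    pos = np.asarray(concept_positions, dtype=float)
    d = np.linalg.norm(pos - np.asarray(element_pos, dtype=float), axis=1)
    w = 1.0 / (d + 1e-6)
    return w / w.sum()

magnets = [(0, 0), (10, 0), (5, 8)]   # e.g. 'heroic', 'corrupt', 'ancient'
print(element_position(magnets, [0.7, 0.1, 0.2]))
print(position_to_relevance(magnets, (2, 1)))
```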
Block and Detail: Scaffolding Sketch-to-Image Generation
Vishnu Sarukkai (Stanford University, Stanford, California, United States), Lu Yuan (Stanford University, Stanford, California, United States), Mia Tang (Stanford University, Stanford, California, United States), Maneesh Agrawala (Stanford University, Stanford, California, United States), Kayvon Fatahalian (Stanford University, Stanford, California, United States)
We introduce a novel sketch-to-image tool that aligns with the iterative refinement process of artists. Our tool lets users sketch blocking strokes to coarsely represent the placement and form of objects and detail strokes to refine their shape and silhouettes. We develop a two-pass algorithm for generating high-fidelity images from such sketches at any point in the iterative process. In the first pass we use a ControlNet to generate an image that strictly follows all the strokes (blocking and detail) and in the second pass we add variation by renoising regions surrounding blocking strokes. We also present a dataset generation scheme that, when used to train a ControlNet architecture, allows regions that do not contain strokes to be interpreted as not-yet-specified regions rather than empty space. We show that this partial-sketch-aware ControlNet can generate coherent elements from partial sketches that only contain a small number of strokes. The high-fidelity images produced by our approach serve as scaffolds that can help the user adjust the shape and proportions of objects or add additional elements to the composition. We demonstrate the effectiveness of our approach with a variety of examples and evaluative comparisons. Quantitatively, novice viewers prefer the quality of images from our algorithm over a baseline Scribble ControlNet for 82% of the pairs and found our images had less distortion in 80% of the pairs.
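For the second pass, the key operation is deciding which regions surrounding blocking strokes should be renoised for variation. A minimal sketch of that masking step follows; the diffusion passes themselves are omitted and the dilation radius is an assumption.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def variation_mask(blocking_stroke_mask, radius=8):
    """Mark regions surrounding blocking strokes for re-noising in a second
    generation pass (illustrative; the diffusion step itself is omitted)."""
    structure = np.ones((2 * radius + 1, 2 * radius + 1), dtype=bool)
    return binary_dilation(blocking_stroke_mask.astype(bool), structure=structure)

# Toy 32x32 canvas with a short blocking stroke down the middle.
canvas = np.zeros((32, 32), dtype=bool)
canvas[10:22, 16] = True
mask = variation_mask(canvas, radius=4)
print(mask.sum(), "pixels would receive added noise and be regenerated")
```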
Bluefish: Composing Diagrams with Declarative Relations
Josh M. Pollock (Massachusetts Institute of Technology, Cambridge, Massachusetts, United States), Catherine Mei (Massachusetts Institute of Technology, Cambridge, Massachusetts, United States), Grace Huang (Massachusetts Institute of Technology, Cambridge, Massachusetts, United States), Elliot Evans (N/A, Ottawa, Ontario, Canada), Daniel Jackson (MIT, Cambridge, Massachusetts, United States), Arvind Satyanarayan (MIT, Cambridge, Massachusetts, United States)
Diagrams are essential tools for problem-solving and communication as they externalize conceptual structures using spatial relationships. But when picking a diagramming framework, users are faced with a dilemma. They can either use a highly expressive but low-level toolkit, whose API does not match their domain-specific concepts, or select a high-level typology, which offers a recognizable vocabulary but supports a limited range of diagrams. To address this gap, we introduce Bluefish: a diagramming framework inspired by component-based user interface (UI) libraries. Bluefish lets users create diagrams using relations: declarative, composable, and extensible diagram fragments that relax the concept of a UI component. Unlike a component, a relation does not have sole ownership over its children nor does it need to fully specify their layout. To render diagrams, Bluefish extends a traditional tree-based scenegraph to a compound graph that captures both hierarchical and adjacent relationships between nodes. To evaluate our system, we construct a diverse example gallery covering many domains including mathematics, physics, computer science, and even cooking. We show that Bluefish's relations are effective declarative primitives for diagrams. Bluefish is open source, and we aim to shape it into both a usable tool and a research platform.
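Bluefish itself is not a Python library, but the compound-graph idea can be sketched as a toy data structure: nodes carry hierarchical containment edges plus adjacency edges contributed by relations that do not own their children. Names below are illustrative assumptions, not the real API.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A diagram element that may take part in several relations."""
    name: str

@dataclass
class CompoundGraph:
    """Toy compound graph: hierarchical containment edges plus adjacency
    edges contributed by relations that do not own their children."""
    nodes: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)   # parent -> [child, ...]
    adjacent: list = field(default_factory=list)   # (a, b, relation) tuples

    def add(self, node):
        self.nodes[node.name] = node
        return node

    def contain(self, parent, child):
        self.children.setdefault(parent.name, []).append(child.name)

    def relate(self, a, b, relation):
        # A relation like 'arrow' or 'align' links nodes it does not own.
        self.adjacent.append((a.name, b.name, relation))

g = CompoundGraph()
box, label, other = g.add(Node("box")), g.add(Node("label")), g.add(Node("other"))
g.contain(box, label)             # hierarchical: the box owns its label
g.relate(label, other, "arrow")   # adjacency: an arrow relates two nodes
print(g.children, g.adjacent)
```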
SeamPose: Repurposing Seams as Capacitive Sensors in a Shirt for Upper-Body Pose Tracking
Tianhong Catherine Yu (Cornell University, Ithaca, New York, United States), Mary Zhang (Cornell University, Ithaca, New York, United States), Peter He (Cornell University, Ithaca, New York, United States), Chi-Jung Lee (Cornell University, Ithaca, New York, United States), Cassidy Cheesman (Cornell University Bowers CIS, Ithaca, New York, United States), Saif Mahmud (Cornell University, Ithaca, New York, United States), Ruidong Zhang (Cornell University, Ithaca, New York, United States), Francois Guimbretiere (Cornell University, Ithaca, New York, United States), Cheng Zhang (Cornell University, Ithaca, New York, United States)
Seams are areas of overlapping fabric formed by stitching two or more pieces of fabric together in the cut-and-sew apparel manufacturing process. In SeamPose, we repurposed seams as capacitive sensors in a shirt for continuous upper-body pose estimation. Compared to previous all-textile motion-capturing garments that place electrodes on the clothing surface, our solution leverages existing seams inside a shirt by machine-sewing insulated conductive threads over them. The invisibility and placement of the seams allow the sensing shirt to look and wear like a conventional shirt while providing pose-tracking capabilities. To validate this approach, we implemented a proof-of-concept untethered shirt with 8 capacitive sensing seams. With a 12-participant user study, our customized deep-learning pipeline accurately estimates the relative (to the pelvis) upper-body 3D joint positions with a mean per-joint position error (MPJPE) of 6.0 cm. SeamPose represents a step towards the unobtrusive integration of smart clothing for everyday pose estimation.
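The reported accuracy uses mean per-joint position error (MPJPE), i.e., the average Euclidean distance between predicted and ground-truth pelvis-relative joint positions. A small sketch of that metric follows, using toy data rather than the study's results.

```python
import numpy as np

def mpjpe(pred_joints, true_joints):
    """Mean per-joint position error: average Euclidean distance (same units
    as the inputs) between predicted and ground-truth 3D joint positions."""
    pred = np.asarray(pred_joints, dtype=float)
    true = np.asarray(true_joints, dtype=float)
    return np.linalg.norm(pred - true, axis=-1).mean()

# Toy example: 5 pelvis-relative joints, predictions off by a few cm.
rng = np.random.default_rng(0)
truth = rng.normal(scale=20.0, size=(5, 3))              # cm
prediction = truth + rng.normal(scale=3.0, size=(5, 3))
print(f"MPJPE: {mpjpe(prediction, truth):.1f} cm")
```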
Augmented Physics: Creating Interactive and Embedded Physics Simulations from Static Textbook Diagrams
Aditya Gunturu (University of Calgary, Calgary, Alberta, Canada), Yi Wen (City University of Hong Kong, Hong Kong, Hong Kong), Nandi Zhang (University of Calgary, Calgary, Alberta, Canada), Jarin Thundathil (University of Calgary, Calgary, Alberta, Canada), Rubaiat Habib Kazi (Adobe Research, Seattle, Washington, United States), Ryo Suzuki (University of Calgary, Calgary, Alberta, Canada)
We introduce Augmented Physics, a machine learning-integrated authoring tool designed for creating embedded interactive physics simulations from static textbook diagrams. Leveraging recent advancements in computer vision, such as Segment Anything and Multi-modal LLMs, our web-based system enables users to semi-automatically extract diagrams from physics textbooks and generate interactive simulations based on the extracted content. These interactive diagrams are seamlessly integrated into scanned textbook pages, facilitating interactive and personalized learning experiences across various physics concepts, such as optics, circuits, and kinematics. Drawing from an elicitation study with seven physics instructors, we explore four key augmentation strategies: 1) augmented experiments, 2) animated diagrams, 3) bi-directional binding, and 4) parameter visualization. We evaluate our system through technical evaluation, a usability study (N=12), and expert interviews (N=12). Study findings suggest that our system can facilitate more engaging and personalized learning experiences in physics education.
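As a hedged illustration of the kind of embedded simulation such a tool might bind to a kinematics diagram, the toy sketch below recomputes a projectile trajectory whenever a bound parameter (here, the launch angle) changes; it is not generated by or taken from Augmented Physics.

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def trajectory(speed, angle_deg, steps=50):
    """Projectile-motion trajectory for a kinematics diagram (toy example)."""
    angle = np.radians(angle_deg)
    t_flight = 2 * speed * np.sin(angle) / G
    t = np.linspace(0.0, t_flight, steps)
    x = speed * np.cos(angle) * t
    y = speed * np.sin(angle) * t - 0.5 * G * t ** 2
    return x, y

# Parameter binding in miniature: editing the launch angle immediately
# regenerates the trajectory shown alongside the diagram.
for angle in (30, 45, 60):
    x, y = trajectory(speed=12.0, angle_deg=angle)
    print(f"angle {angle:2d} deg -> range {x[-1]:5.1f} m, peak {y.max():4.1f} m")
```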