Large Language Models

Conference Name
CHI 2024
Model Compression in Practice: Lessons Learned from Practitioners Creating On-device Machine Learning Experiences
Abstract

On-device machine learning (ML) promises to improve the privacy, responsiveness, and proliferation of new, intelligent user experiences by moving ML computation onto everyday personal devices. However, today's large ML models must be drastically compressed to run efficiently on-device, a hurdle that requires deep, yet currently niche expertise. To engage the broader human-centered ML community in on-device ML experiences, we present the results from an interview study with 30 experts at Apple that specialize in producing efficient models. We compile tacit knowledge that experts have developed through practical experience with model compression across different hardware platforms. Our findings offer pragmatic considerations missing from prior work, covering the design process, trade-offs, and technical strategies that go into creating efficient models. Finally, we distill design recommendations for tooling to help ease the difficulty of this work and bring on-device ML into more widespread practice.

Authors
Fred Hohman
Apple, Seattle, Washington, United States
Mary Beth Kery
Apple Inc., Pittsburgh, Pennsylvania, United States
Donghao Ren
Apple, Seattle, Washington, United States
Dominik Moritz
Apple, Pittsburgh, Pennsylvania, United States
Paper URL

doi.org/10.1145/3613904.3642109

Video
Talaria: Interactively Optimizing Machine Learning Models for Efficient Inference
Abstract

On-device machine learning (ML) moves computation from the cloud to personal devices, protecting user privacy and enabling intelligent user experiences. However, fitting models on devices with limited resources presents a major technical challenge: practitioners need to optimize models and balance hardware metrics such as model size, latency, and power. To help practitioners create efficient ML models, we designed and developed Talaria: a model visualization and optimization system. Talaria enables practitioners to compile models to hardware, interactively visualize model statistics, and simulate optimizations to test the impact on inference metrics. Since its internal deployment two years ago, we have evaluated Talaria using three methodologies: (1) a log analysis highlighting its growth of 800+ practitioners submitting 3,600+ models; (2) a usability survey with 26 users assessing the utility of 20 Talaria features; and (3) a qualitative interview with the 7 most active users about their experience using Talaria.
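
The abstract's "simulate optimizations" step can be illustrated with a minimal sketch (not Talaria's actual implementation): a back-of-the-envelope estimate of how weight quantization changes model size, the kind of what-if metric such a tool can surface before committing to an optimization. The function name and parameters here are illustrative assumptions.

```python
def estimated_size_mb(num_params: int, bits_per_weight: int) -> float:
    """Rough model size in MB if every weight is stored at the
    given bit width (ignores metadata, activations, and overhead)."""
    return num_params * bits_per_weight / 8 / 1e6

# What-if comparison for a hypothetical 25M-parameter model:
fp16_size = estimated_size_mb(25_000_000, 16)  # float16 baseline
int8_size = estimated_size_mb(25_000_000, 8)   # simulated 8-bit quantization
print(f"float16: {fp16_size:.1f} MB, int8: {int8_size:.1f} MB")
```

Real tools also model latency and power per hardware target, which cannot be estimated this simply; this sketch only captures the size trade-off.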

Award
Honorable Mention
Authors
Fred Hohman
Apple, Seattle, Washington, United States
Chaoqun Wang
Apple, Beijing, China
Jinmook Lee
Apple, Cupertino, California, United States
Jochen Görtler
Independent Researcher, Walldorf, Germany
Dominik Moritz
Apple, Pittsburgh, Pennsylvania, United States
Jeffrey P. Bigham
Apple, Pittsburgh, Pennsylvania, United States
Zhile Ren
Apple, Seattle, Washington, United States
Cecile Foret
Apple, Cupertino, California, United States
Qi Shan
Apple Inc, Seattle, Washington, United States
Xiaoyi Zhang
Apple Inc, Seattle, Washington, United States
Paper URL

doi.org/10.1145/3613904.3642628

Video
Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation
Abstract

Thanks to their generative capabilities, large language models (LLMs) have become an invaluable tool for creative processes. These models have the capacity to produce hundreds or even thousands of visual and textual outputs, offering abundant inspiration for creative endeavors. But are we harnessing their full potential? We argue that current interaction paradigms fall short, guiding users towards rapid convergence on a limited set of ideas, rather than empowering them to explore the vast latent design space in generative models. To address this limitation, we propose a framework that facilitates the structured generation of design space in which users can seamlessly explore, evaluate, and synthesize a multitude of responses. We demonstrate the feasibility and usefulness of this framework through the design and development of an interactive system, Luminate, and a user study with 14 professional writers. Our work advances how we interact with LLMs for creative tasks, introducing a way to harness the creative potential of LLMs.

Authors
Sangho Suh
University of California, San Diego, San Diego, California, United States
Meng Chen
University of Notre Dame, Notre Dame, Indiana, United States
Bryan Min
University of California San Diego, La Jolla, California, United States
Toby Jia-Jun Li
University of Notre Dame, Notre Dame, Indiana, United States
Haijun Xia
University of California, San Diego, San Diego, California, United States
Paper URL

doi.org/10.1145/3613904.3642400

Video
Narrating Fitness: Leveraging Large Language Models for Reflective Fitness Tracker Data Interpretation
Abstract

While fitness trackers generate and present quantitative data, past research suggests that users often conceptualise their wellbeing in qualitative terms. This discrepancy between numeric data and personal wellbeing perception may limit the effectiveness of personal informatics tools in encouraging meaningful engagement with one's wellbeing. In this work, we aim to bridge the gap between raw numeric metrics and users' qualitative perceptions of wellbeing. In an online survey with n = 273 participants, we used step data from fitness trackers and compared three presentation formats: standard charts, qualitative descriptions generated by an LLM (Large Language Model), and a combination of both. Our findings reveal that users experienced more reflection, focused attention and reward when presented with the generated qualitative data compared to the standard charts alone. Our work demonstrates how automatically generated data descriptions can effectively complement numeric fitness data, fostering a richer, more reflective engagement with personal wellbeing information.
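
The abstract describes turning raw step counts into LLM-generated qualitative descriptions. A minimal sketch of that pipeline's first step is building a reflective prompt from the numeric data; the function name and prompt wording below are assumptions for illustration, not the study's actual materials.

```python
def step_summary_prompt(steps_per_day: list[int]) -> str:
    """Build a prompt asking an LLM for a qualitative, reflective
    description of a week of step counts (hypothetical wording)."""
    avg = sum(steps_per_day) / len(steps_per_day)
    raw = ", ".join(str(s) for s in steps_per_day)
    return (
        f"My daily step counts this week were: {raw}. "
        f"My average was {avg:.0f} steps per day. "
        "Describe this week in two or three qualitative, reflective "
        "sentences, without repeating the raw numbers."
    )

print(step_summary_prompt([8000, 12000, 5000, 9500, 11000, 3000, 7500]))
```

The prompt would then be sent to an LLM; the study compared the resulting descriptions against standard charts and a combined presentation.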

Award
Honorable Mention
Authors
Konstantin R. Strömel
Osnabrück University, Osnabrück, Germany
Stanislas Henry
ENSEIRB-MATMECA Bordeaux, Bordeaux, France
Tim Johansson
Chalmers University of Technology, Gothenburg, Sweden
Jasmin Niess
University of Oslo, Oslo, Norway
Paweł W. Woźniak
Chalmers University of Technology, Gothenburg, Sweden
Paper URL

doi.org/10.1145/3613904.3642032

Video
RELIC: Investigating Large Language Model Responses using Self-Consistency
Abstract

Large Language Models (LLMs) are notorious for blending fact with fiction and generating non-factual content, known as hallucinations. To address this challenge, we propose an interactive system that helps users gain insight into the reliability of the generated text. Our approach is based on the idea that the self-consistency of multiple samples generated by the same LLM relates to its confidence in individual claims in the generated texts. Using this idea, we design RELIC, an interactive system that enables users to investigate and verify semantic-level variations in multiple long-form responses. This allows users to recognize potentially inaccurate information in the generated text and make necessary corrections. From a user study with ten participants, we demonstrate that our approach helps users better verify the reliability of the generated text. We further summarize the design implications and lessons learned from this research for future studies of reliable human-LLM interactions.
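
The self-consistency idea in the abstract can be sketched concretely: sample several responses from the same LLM, then score each claim in a target response by how often the other samples support it, so low-support claims are flagged for verification. This is a simplified illustration with exact string matching, not RELIC's semantic-level comparison; the function name and toy data are assumptions.

```python
def self_consistency_scores(target_claims: list[str],
                            other_samples: list[set[str]]) -> dict[str, float]:
    """Score each claim by the fraction of other sampled responses
    that also contain it; low scores suggest low model confidence."""
    n = len(other_samples)
    return {claim: sum(claim in sample for sample in other_samples) / n
            for claim in target_claims}

# Toy example: claims as normalized strings extracted from three samples.
samples = [
    {"paris is the capital of france", "the eiffel tower opened in 1889"},
    {"paris is the capital of france", "the eiffel tower opened in 1890"},
    {"paris is the capital of france", "the eiffel tower opened in 1889"},
]
target = ["paris is the capital of france", "the eiffel tower opened in 1890"]
print(self_consistency_scores(target, samples))
```

Here the capital claim appears in every sample (score 1.0) while the 1890 date appears in only one of three, marking it as a candidate for user verification; RELIC surfaces such variation interactively over long-form responses.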

Authors
Furui Cheng
ETH Zürich, Zürich, Switzerland
Vilém Zouhar
ETH Zurich, Zurich, Switzerland
Simran Arora
Stanford University, Stanford, California, United States
Mrinmaya Sachan
ETH Zurich, Zurich, Switzerland
Hendrik Strobelt
IBM Research AI, Cambridge, Massachusetts, United States
Mennatallah El-Assady
ETH Zürich, Zürich, Switzerland
Paper URL

doi.org/10.1145/3613904.3641904

Video