Look at me

Paper session

Conference name
CHI 2020
Enhancing Mobile Voice Assistants with WorldGaze
Abstract

Contemporary voice assistants require that objects of interest be specified in spoken commands. Of course, users are often looking directly at the object or place of interest – fine-grained, contextual information that is currently unused. We present WorldGaze, a software-only method for smartphones that provides the real-world gaze location of a user that voice agents can utilize for rapid, natural, and precise interactions. We achieve this by simultaneously opening the front and rear cameras of a smartphone. The front-facing camera is used to track the head in 3D, including estimating its direction vector. As the geometry of the front and back cameras is fixed and known, we can raycast the head vector into the 3D world scene as captured by the rear-facing camera. This allows the user to intuitively define an object or region of interest using their head gaze. We started our investigations with a qualitative exploration of competing methods, before developing a functional, real-time implementation. We conclude with an evaluation that shows WorldGaze can be quick and accurate, opening new multimodal gaze+voice interactions for mobile voice agents.
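
The raycasting step can be pictured with a minimal Python sketch (our own illustration, not the authors' implementation). It assumes the front camera yields a 3D head position and a unit gaze direction, that the fixed front-to-rear camera transform is known from the device geometry, and that the rear-camera scene is approximated by labeled object centers with bounding radii; all numbers below are hypothetical.

```python
# Minimal sketch of the head-gaze raycasting idea described above (not the
# authors' code). Assumes: head position and unit gaze direction in the
# front-camera frame, a known fixed front-to-rear transform (R, t), and a
# rear-camera scene approximated by labeled bounding spheres.

import numpy as np

# Hypothetical fixed extrinsics: 180-degree rotation about the x-axis plus a ~1 cm offset.
R_front_to_rear = np.diag([1.0, -1.0, -1.0])
t_front_to_rear = np.array([0.0, 0.01, 0.0])   # meters

def to_rear_frame(p, R=R_front_to_rear, t=t_front_to_rear):
    """Map a point from the front-camera frame to the rear-camera frame."""
    return R @ p + t

def raycast_head_gaze(head_pos, head_dir, scene_objects):
    """Return the scene object whose bounding sphere the head-gaze ray hits first.

    head_pos, head_dir: 3D head position and unit gaze direction (front-camera frame).
    scene_objects: list of (label, center_xyz, radius) in the rear-camera frame.
    """
    origin = to_rear_frame(head_pos)
    direction = R_front_to_rear @ head_dir        # rotate the direction only (no translation)
    direction = direction / np.linalg.norm(direction)

    best = None
    for label, center, radius in scene_objects:
        oc = np.asarray(center) - origin
        t_closest = float(oc @ direction)         # distance along the ray to the closest point
        if t_closest < 0:
            continue                              # object lies behind the user
        miss_dist = np.linalg.norm(oc - t_closest * direction)
        if miss_dist <= radius and (best is None or t_closest < best[0]):
            best = (t_closest, label)
    return best[1] if best else None

# Example: the user looks roughly straight ahead; two objects sit in the rear view.
scene = [("coffee machine", (0.1, 0.0, 1.2), 0.15), ("door", (1.0, 0.0, 2.0), 0.4)]
print(raycast_head_gaze(np.array([0.0, 0.0, 0.3]),
                        np.array([0.0, 0.0, -1.0]), scene))
```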

Keywords
WorldGaze
interaction techniques
mobile interaction
Authors
Sven Mayer
Carnegie Mellon University, Pittsburgh, PA, USA
Gierad Laput
Apple Inc. & Carnegie Mellon University, Cupertino, CA, USA
Chris Harrison
Carnegie Mellon University, Pittsburgh, PA, USA
DOI

10.1145/3313831.3376479

Paper URL

https://doi.org/10.1145/3313831.3376479

Video
GazeConduits: Calibration-Free Cross-Device Collaboration through Gaze and Touch
Abstract

We present GazeConduits, a calibration-free ad-hoc mobile interaction concept that enables users to collaboratively interact with tablets, other users, and content in a cross-device setting using gaze and touch input. GazeConduits leverages recently introduced smartphone capabilities to detect facial features and estimate users' gaze directions. To join a collaborative setting, users place one or more tablets onto a shared table and position their phone in the center, which then tracks users present as well as their gaze direction to determine the tablets they look at. We present a series of techniques using GazeConduits for collaborative interaction across mobile devices for content selection and manipulation. Our evaluation with 20 simultaneous tablets on a table shows that GazeConduits can reliably identify which tablet or collaborator a user is looking at.
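
The tablet-selection step can be sketched as an angular matching problem (an illustration under our own assumptions, not the paper's code): given the central phone's estimate of a user's position and gaze direction on the table plane, and the known tablet positions, choose the tablet with the smallest bearing error within a threshold. The layout and threshold below are made up for the example.

```python
# Minimal sketch (not the authors' implementation) of picking the tablet a
# user is looking at from a gaze direction estimated by the central phone.

import math

def pick_target(gaze_origin, gaze_dir_deg, tablets, max_error_deg=15.0):
    """Return the tablet a user is most likely looking at, or None.

    gaze_origin: (x, y) user position on the table plane (meters).
    gaze_dir_deg: estimated gaze direction in degrees (table coordinates).
    tablets: dict of name -> (x, y) tablet center position.
    """
    best_name, best_error = None, max_error_deg
    for name, (tx, ty) in tablets.items():
        bearing = math.degrees(math.atan2(ty - gaze_origin[1], tx - gaze_origin[0]))
        error = abs((gaze_dir_deg - bearing + 180) % 360 - 180)  # wrap to [-180, 180]
        if error < best_error:
            best_name, best_error = name, error
    return best_name

# Example: a user sits at (0.0, -0.5) and looks up and to the right at ~60 degrees.
tablets = {"A": (-0.3, 0.2), "B": (0.3, 0.2), "C": (0.0, 0.4)}
print(pick_target((0.0, -0.5), 60.0, tablets))  # prints "B"
```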

Keywords
Cross-device interaction
gaze input
touch input
Authors
Simon Voelker
RWTH Aachen University, Aachen, Germany
Sebastian Hueber
RWTH Aachen University, Aachen, Germany
Christian Holz
ETH Zürich, Zürich, Switzerland
Christian Remy
Aarhus University, Aarhus, Denmark
Nicolai Marquardt
University College London, London, United Kingdom
DOI

10.1145/3313831.3376578

Paper URL

https://doi.org/10.1145/3313831.3376578

Video
TurkEyes: A Web-Based Toolbox for Crowdsourcing Attention Data
Abstract

Eye movements provide insight into what parts of an image a viewer finds most salient, interesting, or relevant to the task at hand. Unfortunately, eye tracking data, a commonly-used proxy for attention, is cumbersome to collect. Here we explore an alternative: a comprehensive web-based toolbox for crowdsourcing visual attention. We draw from four main classes of attention-capturing methodologies in the literature. ZoomMaps is a novel zoom-based interface that captures viewing on a mobile phone. CodeCharts is a self-reporting methodology that records points of interest at precise viewing durations. ImportAnnots is an "annotation" tool for selecting important image regions, and cursor-based BubbleView lets viewers click to deblur a small area. We compare these methodologies using a common analysis framework in order to develop appropriate use cases for each interface. This toolbox and our analyses provide a blueprint for how to gather attention data at scale without an eye tracker.
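
One way to picture the common analysis framework (a sketch under our own assumptions, not the toolbox's code) is to convert the point samples each interface yields, such as BubbleView clicks or CodeCharts responses, into a normalized attention heatmap that can be compared across interfaces.

```python
# Minimal sketch (an assumption, not the toolbox's code): aggregate the (x, y)
# attention samples produced by any of the interfaces into a blurred,
# normalized heatmap over the image.

import numpy as np

def attention_heatmap(points, width, height, sigma=20.0):
    """Sum a Gaussian bump per sample and normalize the result to [0, 1]."""
    yy, xx = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width), dtype=float)
    for x, y in points:
        heat += np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2 * sigma ** 2))
    return heat / heat.max() if heat.max() > 0 else heat

# Example: clicks from a hypothetical BubbleView session on a 200x150 image.
clicks = [(50, 40), (55, 45), (160, 100)]
heatmap = attention_heatmap(clicks, width=200, height=150)
print(heatmap.shape, round(float(heatmap.max()), 2))
```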

Keywords
Eye tracking
attention
crowdsourcing
interaction techniques
Authors
Anelise Newman
Massachusetts Institute of Technology, Cambridge, MA, USA
Barry McNamara
Massachusetts Institute of Technology, Cambridge, MA, USA
Camilo Fosco
Massachusetts Institute of Technology, Cambridge, MA, USA
Yun Bin Zhang
Harvard University, Cambridge, MA, USA
Pat Sukhum
Harvard University, Cambridge, MA, USA
Matthew Tancik
University of California, Berkeley, Berkeley, CA, USA
Nam Wook Kim
Boston College, Chestnut Hill, MA, USA
Zoya Bylinskii
Adobe Research, Cambridge, MA, USA
DOI

10.1145/3313831.3376799

Paper URL

https://doi.org/10.1145/3313831.3376799

Designing an Eyes-Reduced Document Skimming App for Situational Impairments
Abstract

Listening to text using read-aloud applications is a popular way for people to consume content when their visual attention is situationally impaired (e.g., commuting, walking, tired eyes). However, due to the linear nature of audio, such apps do not support skimming – a non-linear, rapid form of reading – essential for quickly grasping the gist and organization of difficult texts, like academic or professional documents. To support auditory skimming for situational impairments, we (1) identified the user needs and challenges in auditory skimming through a formative study (N=20), (2) derived the concept of "eyes-reduced" skimming that blends auditory and visual modes of reading, inspired by how participants mixed visual and non-visual interactions, (3) generated a set of design guidelines for eyes-reduced skimming, and (4) designed and evaluated a novel audio skimming app that embodies the guidelines. Our in-situ preliminary observation study (N=6) suggested that participants were positive about our design and were able to auditorily skim documents. We discuss design implications for eyes-reduced reading, read-aloud apps, and text-to-speech engines.
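
Purely as an illustration of auditory skimming (the paper's app is built from its formative study and guidelines, not from this heuristic), one simple strategy is to speak only the first sentence of each paragraph so a listener can grasp the gist without hearing the full linear text. The sample text and the downstream text-to-speech engine here are assumptions.

```python
# Illustrative sketch only (not the paper's app): produce a skim script by
# keeping the first sentence of each paragraph, to be handed to a TTS engine.

def skim_script(document_text):
    """Return the first sentence of each non-empty paragraph."""
    script = []
    for paragraph in document_text.split("\n\n"):
        paragraph = paragraph.strip()
        if paragraph:
            first_sentence = paragraph.split(". ")[0].rstrip(".") + "."
            script.append(first_sentence)
    return script

doc = ("Skimming is a rapid, non-linear form of reading. It helps readers get the gist.\n\n"
       "Audio is linear. Read-aloud apps therefore make skimming hard.")
for line in skim_script(doc):
    print(line)  # each line would be sent to a text-to-speech engine
```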

Keywords
Situational impairments
skim reading
eyes-free
eyes-reduced
text-to-speech
mobile device
accessibility
design guidelines
interactive prototype
Authors
Taslim Arefin Khan
University of British Columbia, Vancouver, BC, Canada
Dongwook Yoon
University of British Columbia, Vancouver, BC, Canada
Joanna McGrenere
University of British Columbia, Vancouver, BC, Canada
DOI

10.1145/3313831.3376641

Paper URL

https://doi.org/10.1145/3313831.3376641

A View on the Viewer: Gaze-Adaptive Captions for Videos
Abstract

Subtitles play a crucial role in cross-lingual distribution of multimedia content and help communicate information where auditory content is not feasible (loud environments, hearing impairments, unknown languages). Established methods utilize text at the bottom of the screen, which may distract from the video. Alternative techniques place captions closer to related content (e.g., faces) but are not applicable to arbitrary videos such as documentaries. Hence, we propose to leverage live gaze as an indirect input method to adapt captions to individual viewing behavior. We implemented two gaze-adaptive methods and compared them in a user study (n=54) to traditional captions and audio-only videos. The results show that viewers with less experience with captions prefer our gaze-adaptive methods as they assist them in reading. Furthermore, gaze distributions resulting from our methods are closer to natural viewing behavior compared to the traditional approach. Based on these results, we provide design implications for gaze-adaptive captions.
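
One gaze-adaptive placement strategy can be sketched as follows (a minimal illustration assuming gaze samples in screen coordinates, not the authors' two methods): smooth the live gaze point to suppress jitter and keep the caption box just below it, clamped so it never leaves the frame.

```python
# Minimal sketch (an assumption, not the paper's methods) of gaze-adaptive
# caption placement: smooth gaze samples and anchor the caption below them.

def smooth(prev, new, alpha=0.2):
    """Exponential smoothing of gaze samples to avoid jittery caption motion."""
    return (prev[0] + alpha * (new[0] - prev[0]),
            prev[1] + alpha * (new[1] - prev[1]))

def caption_position(gaze, screen_w, screen_h, box_w=400, box_h=60, offset=40):
    """Top-left corner of a caption box placed just below the smoothed gaze point."""
    x = min(max(gaze[0] - box_w / 2, 0), screen_w - box_w)
    y = min(max(gaze[1] + offset, 0), screen_h - box_h)
    return int(x), int(y)

# Example: a few raw gaze samples on a 1920x1080 video frame.
gaze = (960.0, 540.0)
for sample in [(980, 550), (1000, 560), (1020, 570)]:
    gaze = smooth(gaze, sample)
    print(caption_position(gaze, 1920, 1080))
```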

Award
Honorable Mention
Keywords
Eye Tracking
Gaze Input
Gaze-Responsive Display
Multimedia
Video Captions
Subtitles
Authors
Kuno Kurzhals
ETH Zürich, Zürich, Switzerland
Fabian Göbel
ETH Zürich, Zürich, Switzerland
Katrin Angerbauer
University of Stuttgart, Stuttgart, Germany
Michael Sedlmair
University of Stuttgart, Stuttgart, Germany
Martin Raubal
ETH Zürich, Zürich, Switzerland
DOI

10.1145/3313831.3376266

Paper URL

https://doi.org/10.1145/3313831.3376266

Video