Smartphones conveniently place large information spaces in the palms of our hands. While research has shown that larger screens positively affect spatial memory, workload, and user experience, smartphones remain fairly compact for the sake of device ergonomics and portability. Thus, we investigate the use of hybrid user interfaces to virtually increase the available display size by complementing the smartphone with an augmented reality head-worn display. We thereby combine the benefits of familiar touch interaction with the near-infinite visual display space afforded by augmented reality. To better understand the potential of virtually-extended displays and the possible issues of splitting the user's visual attention between two screens (real and virtual), we conducted a within-subjects experiment with 24 participants completing navigation tasks using different virtually-augmented display sizes. Our findings reveal that a desktop monitor size represents a "sweet spot" for extending smartphones with augmented reality, informing the design of hybrid user interfaces.
https://doi.org/10.1145/3544548.3581438
In this paper, we propose \emph{Squeez'In}, a smartphone technique that enables private authentication by holding and squeezing the phone in a unique pattern. We first explored the design space of practical squeezing gestures for authentication by analyzing participants' self-designed gestures and squeezing behavior. Results showed that varying-length gestures with two levels of touch pressure and duration were the most natural and unambiguous. We then implemented \emph{Squeez'In} on an off-the-shelf capacitive sensing smartphone, employing an SVM-GBDT model to recognize gestures and user-specific behavioral patterns, achieving 99.3\% accuracy and a 0.93 F1-score when tested on 21 users. A subsequent 14-day study validated the memorability and long-term stability of \emph{Squeez'In}. In the usability evaluation, \emph{Squeez'In} achieved significantly faster authentication and higher user preference in terms of privacy and security compared with gesture and PIN code baselines.
https://doi.org/10.1145/3544548.3581419
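To make the recognition step above more concrete, here is a minimal sketch of one way an SVM-GBDT style pipeline could be assembled with scikit-learn: a gradient-boosted classifier embeds hand-crafted squeeze features (per-segment pressure and duration), and an SVM classifies in that embedded space. The feature layout, class count, and two-stage composition are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of an SVM-over-GBDT pipeline
# for squeeze-gesture classification. Feature names and shapes are
# illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy data: each sample is a fixed-length feature vector summarizing a
# squeeze sequence (per-segment touch pressure and hold duration).
X = rng.normal(size=(200, 8))          # 200 samples, 8 hand-crafted features
y = rng.integers(0, 4, size=200)       # 4 example gesture classes

# Stage 1: GBDT learns nonlinear feature interactions; its per-class
# probabilities serve as a compact embedding of the squeeze behavior.
gbdt = GradientBoostingClassifier(n_estimators=50).fit(X, y)
X_emb = gbdt.predict_proba(X)

# Stage 2: an SVM separates gestures (or users) in the embedded space.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X_emb, y)
print("train accuracy:", svm.score(X_emb, y))
```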
Despite the potential of Virtual Reality as a general-purpose computing platform, current systems are tailored to stationary settings that support expansive mid-air interaction. In mobile scenarios, however, the physical space surrounding the user may be prohibitively small for spatial interaction in VR with classical controllers. In this paper, we present HandyCast, a smartphone-based input technique that enables full-range 3D input with two virtual hands in VR while requiring little physical space, allowing users to operate large virtual environments in mobile settings. HandyCast defines a pose-and-touch transfer function that fuses the phone's position and orientation with touch input to derive two individual 3D hand positions. Holding their phone like a gamepad, users can thus move and turn it to independently control their virtual hands. Touch input using the thumbs fine-tunes the respective virtual hand position and controls object selection. We evaluated HandyCast in three studies, comparing its performance with that of Go-Go, a classic bimanual controller technique. In our open-space study, participants required significantly less physical motion using HandyCast with no decrease in completion time or body ownership. In our space-constrained study, participants achieved significantly faster completion times, smaller interaction volumes, and shorter path lengths with HandyCast compared to Go-Go. In our technical evaluation, HandyCast's fully standalone inside-out 6D tracking incurred no decrease in completion time compared to an outside-in tracking baseline.
https://doi.org/10.1145/3544548.3580677
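The pose-and-touch transfer function described above lends itself to a short sketch: the phone's 6-DoF pose coarsely places two virtual hands, and each thumb's touch offset fine-tunes its hand. The offsets, gains, and axis mapping below are illustrative assumptions rather than the paper's calibrated function.

```python
# Minimal sketch of a pose-and-touch transfer function in the spirit of
# HandyCast: the phone's pose places two virtual hands, and each thumb's
# touch offset fine-tunes its hand. Constants are illustrative assumptions.
import numpy as np
from scipy.spatial.transform import Rotation as R

def virtual_hand_positions(phone_pos, phone_quat, touch_left, touch_right,
                           base_offset=0.15, touch_gain=0.5):
    """phone_pos: (3,) meters; phone_quat: (x, y, z, w);
    touch_*: (2,) normalized thumb offsets in [-1, 1] on the screen."""
    rot = R.from_quat(phone_quat)
    hands = []
    for side, touch in ((-1.0, touch_left), (+1.0, touch_right)):
        # Coarse placement: a fixed lateral offset in the phone's frame.
        local = np.array([side * base_offset, 0.0, 0.0])
        # Fine-tuning: map the thumb's 2D touch offset onto the phone's
        # horizontal and depth axes, scaled by a gain.
        local += touch_gain * np.array([touch[0] * 0.05, 0.0, -touch[1] * 0.05])
        hands.append(phone_pos + rot.apply(local))
    return hands

left, right = virtual_hand_positions(
    phone_pos=np.zeros(3),
    phone_quat=[0, 0, 0, 1],
    touch_left=np.array([0.2, -0.4]),
    touch_right=np.array([-0.1, 0.6]),
)
print(left, right)
```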
An omni-directional (360°) camera captures the entire viewing sphere surrounding its optical center. Such cameras are increasingly used to create highly immersive content and viewing experiences. When such a camera is held by a user, the view includes the user's hand grip, fingers, body pose, face, and the surrounding environment, providing a comprehensive understanding of the visual world and context around the device. This capability opens up numerous possibilities for rich mobile input sensing. In OmniSense, we explore the broad input design space for mobile devices with a built-in omni-directional camera and categorize it into three sensing pillars: i) near device, ii) around device, and iii) surrounding device. We also explore potential use cases and applications that leverage these sensing capabilities to address user needs, and develop a working system that puts these concepts into action. We studied the system in a technical evaluation and a preliminary user study to gain initial feedback and insights. Collectively, these techniques illustrate how a single, omni-purpose sensor on a mobile device affords many compelling ways to enable expressive input, while also supporting a broad range of novel applications that improve the mobile interaction experience.
https://doi.org/10.1145/3544548.3580747
Tracking body pose on the go could have powerful uses in fitness, mobile gaming, context-aware virtual assistants, and rehabilitation. However, users are unlikely to buy and wear special suits or sensor arrays to achieve this end. Instead, in this work, we explore the feasibility of estimating body pose using IMUs already present in devices that many users own --- namely smartphones, smartwatches, and earbuds. This approach has several challenges, including noisy data from low-cost commodity IMUs and the fact that the set of instrumentation points on a user's body is both sparse and in flux. Our pipeline receives whatever subset of IMU data is available, potentially from just a single device, and produces a best-guess pose. To evaluate our model, we created the IMUPoser Dataset, collected from 10 participants wearing or holding off-the-shelf consumer devices across a variety of activity contexts. We provide a comprehensive evaluation of our system, benchmarking it on both our own and existing IMU datasets.
https://doi.org/10.1145/3544548.3581392
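One straightforward way to realize a pipeline that receives whatever subset of IMU data is available is to reserve a fixed input slot per candidate device and zero-fill absent ones before each frame reaches the pose model. The sketch below illustrates that idea; the device list, feature layout, and zero-filling strategy are assumptions for illustration, not the authors' code.

```python
# Minimal sketch (an assumption, not the authors' implementation) of
# building a fixed-size input frame from whichever IMU streams happen
# to be present; missing devices are simply zero-filled.
import numpy as np

# Candidate instrumentation points and their per-frame IMU features
# (e.g. a flattened 3x3 orientation matrix + 3-axis acceleration = 12).
DEVICE_SLOTS = ["phone_pocket", "watch_wrist", "earbud_head"]
FEATURES_PER_DEVICE = 12

def assemble_frame(available):
    """Build a fixed-size input frame from whichever devices reported data."""
    frame = np.zeros(len(DEVICE_SLOTS) * FEATURES_PER_DEVICE, dtype=np.float32)
    for i, name in enumerate(DEVICE_SLOTS):
        if name in available:
            start = i * FEATURES_PER_DEVICE
            frame[start:start + FEATURES_PER_DEVICE] = available[name]
    return frame

# Only a watch is present in this frame; the other slots stay zero.
frame = assemble_frame({"watch_wrist": np.random.randn(12).astype(np.float32)})
print(frame.shape)  # (36,) -> fed to a sequence model that regresses full-body pose
```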
Enabling smartphones to perceive the region of interest (ROI) and target object from the user's first-person perspective can support diverse spatial interactions. In this paper, we propose a novel ROI input method and a target selection method for smartphones that utilize user-perspective phone occlusion. Turning the phone into a real-world physical cursor benefits from proprioception, removes the constraint of a camera preview, and allows users to rapidly and accurately select the target object. Our method also provides a resizable and rotatable rectangular ROI to disambiguate dense targets. We implemented a prototype system that locates the user's iris with the front camera while estimating the rectangular area occluded by the phone with the rear camera, followed by a target prediction algorithm based on a distance-weighted Jaccard index. We analyzed the behavioral models of using our method and evaluated the prototype's pointing accuracy and usability. Results showed that our method is well accepted by users for its convenience, accuracy, and efficiency.
https://doi.org/10.1145/3544548.3580696
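As a rough illustration of how a distance-weighted Jaccard index could score candidate targets against the phone-occluded ROI, the sketch below combines a standard intersection-over-union with a Gaussian penalty on the distance between box centers. The Gaussian weighting and its sigma parameter are assumptions; the paper's exact formulation may differ.

```python
# Minimal sketch of a distance-weighted Jaccard (IoU) score for picking
# the target object inside the phone-occluded ROI. The weighting function
# is an illustrative assumption, not the paper's formulation.
import numpy as np

def jaccard(box_a, box_b):
    """Boxes as (x0, y0, x1, y1); returns intersection-over-union."""
    ix0, iy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix1, iy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def distance_weighted_jaccard(roi, candidate, sigma=50.0):
    """Down-weight candidates whose center lies far from the ROI center."""
    c_roi = np.array([(roi[0] + roi[2]) / 2, (roi[1] + roi[3]) / 2])
    c_cand = np.array([(candidate[0] + candidate[2]) / 2, (candidate[1] + candidate[3]) / 2])
    weight = np.exp(-np.linalg.norm(c_roi - c_cand) ** 2 / (2 * sigma ** 2))
    return weight * jaccard(roi, candidate)

roi = (100, 100, 220, 260)
candidates = {"mug": (110, 120, 200, 240), "lamp": (400, 80, 520, 300)}
best = max(candidates, key=lambda k: distance_weighted_jaccard(roi, candidates[k]))
print(best)  # "mug"
```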