Sensing

https://dl.acm.org/doi/abs/10.1145/3491102.3502015

Speech is inappropriate in many situations, limiting when voice control can be used. Most unvoiced speech text entry systems cannot be used on the go because of movement artifacts. Using a dental retainer with capacitive touch sensors, SilentSpeller tracks tongue movement, enabling users to type by spelling words without voicing. SilentSpeller achieves an average 97% character accuracy in offline isolated word testing on a 1164-word dictionary. Walking has little effect on accuracy; average offline character accuracy was roughly equivalent on 107 phrases entered while walking (97.5%) or seated (96.5%). To demonstrate extensibility, the system was tested on 100 unseen words, yielding an average accuracy of 94%. Live text entry speeds for seven participants averaged 37 words per minute at 87% accuracy. Comparing silent spelling to current practice suggests that SilentSpeller may be a viable alternative for silent mobile text entry.
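The abstract reports character accuracy but does not say how it is computed. A common convention in text entry evaluation, assumed here and not confirmed by the source, is one minus the Levenshtein edit distance divided by the reference length. The function names below are hypothetical and serve only as an illustration of that convention.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # delete from reference
                curr[j - 1] + 1,                  # insert into reference
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed metric: 1 - edit_distance / len(reference), floored at 0."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return max(0.0, 1.0 - edit_distance(reference, hypothesis) / len(reference))


if __name__ == "__main__":
    # One substituted character out of five.
    print(character_accuracy("hello", "hallo"))
```

Under this convention, a recognized word with one wrong letter out of five scores 80%, so a 97% average implies roughly three character errors per hundred reference characters.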

The University of Tokyo, Bunkyo, Tokyo, Japan

Georgia Institute of Technology, Atlanta, Georgia, United States

University of Washington, Seattle, Washington, United States

Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

The University of Tokyo, Tokyo, Japan

Google Inc., Mountain View, California, United States

The University of Tokyo, Tokyo, Japan

Georgia Institute of Technology, Atlanta, Georgia, United States

https://dl.acm.org/doi/abs/10.1145/3491102.3517440

Sensing how a user is holding a smartphone enables adaptive user interfaces, such as those that automatically switch the displayed content and the position of graphical user interface (GUI) components according to how the phone is being held. We propose ReflecTouch, a novel method for detecting how a smartphone is being held by capturing images of the smartphone screen reflected on the cornea with a built-in front camera. In these images, the areas where the user places their fingers on the screen appear as shadows, which makes it possible to estimate the grasp posture. Since most smartphones have a front camera, this method can be used regardless of the device model; in addition, no additional sensor or hardware is required. In data collection experiments covering six different grasp postures, the proposed method achieved a classification accuracy of 85%.

Keio University, Yokohama City, Japan

Yahoo Japan Corporation, Tokyo, Japan

Tokyo University of Technology, Tokyo, Japan

Keio University, Yokohama City, Japan

https://dl.acm.org/doi/abs/10.1145/3491102.3517698

Face orientation can often indicate users’ intended interaction target. In this paper, we propose FaceOri, a novel face tracking technique based on acoustic ranging using earphones. FaceOri can leverage the speaker on a commodity device to emit an ultrasonic chirp, which is picked up by the set of microphones on the user’s earphone, and then processed to calculate the distance from each microphone to the device. These measurements are used to derive the user’s face orientation and distance with respect to the device. We conduct a ground truth comparison and user study to evaluate FaceOri’s performance. The results show that the system can determine whether the user is oriented toward the device with 93.5% accuracy within a range of 1.5 meters. Furthermore, FaceOri can continuously track the user’s head orientation with a median absolute error of 10.9 mm in distance, 3.7° in yaw, and 5.8° in pitch. FaceOri can allow for convenient hands-free control of devices and produce more intelligent context-aware interaction.
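To illustrate how per-microphone distances could yield an orientation estimate, here is a minimal sketch under a simplified far-field model with one microphone per ear. It is not FaceOri's actual algorithm, which the abstract does not describe. The function names, the 0.15 m ear spacing, and the sign convention (positive yaw when the left microphone is farther from the device) are all assumptions made for this example.

```python
import math

# Approximate speed of sound in air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_S = 343.0


def tof_to_distance_m(time_of_flight_s: float) -> float:
    """Convert a one-way acoustic time of flight into a distance."""
    return SPEED_OF_SOUND_M_S * time_of_flight_s


def estimate_yaw_deg(d_left_m: float, d_right_m: float,
                     ear_spacing_m: float = 0.15) -> float:
    """Far-field approximation: d_left - d_right = spacing * sin(yaw).

    ear_spacing_m is a hypothetical default, not a value from the paper.
    """
    ratio = (d_left_m - d_right_m) / ear_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


def estimate_distance_m(d_left_m: float, d_right_m: float) -> float:
    """Approximate head-to-device distance as the mean of both ranges."""
    return (d_left_m + d_right_m) / 2.0


if __name__ == "__main__":
    # Left microphone 7.5 cm farther away than the right one.
    print(estimate_yaw_deg(0.575, 0.500))
    print(estimate_distance_m(0.575, 0.500))
```

Two microphones on a horizontal axis only constrain yaw in this model. Estimating pitch as well, as the abstract reports, would need additional microphones or geometry that this sketch does not attempt.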

Tsinghua University, Beijing, China

University of Washington, Seattle, Washington, United States

Tsinghua University, Beijing, China

University of Washington, Seattle, Washington, United States

Tsinghua University, Beijing, China

https://dl.acm.org/doi/abs/10.1145/3491102.3502079

Many interactive systems are susceptible to misinterpreting the user's input actions or gestures. Interpretation errors are common when systems gather a series of signals from the user and then attempt to interpret the user's intention based on those signals (e.g., gesture identification from a touchscreen, camera, or body-worn electrodes), and previous work has shown that interpretation error can cause significant problems for learning new input commands. Error-reduction strategies from telecommunications, such as repeating a command or increasing the length of the input while reducing its expressiveness, could improve these input mechanisms, but little is known about whether longer command sequences will cause problems for users (e.g., increased effort or reduced learning). We tested performance, learning, and perceived effort in a crowd-sourced study where participants learned and used input mechanisms with different error-reduction techniques. We found that error-reduction techniques are feasible, can outperform error-prone ordinary input, and do not negatively affect learning or perceived effort.
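One of the strategies the abstract names, repeating a command, corresponds to a repetition code decoded by majority vote. The sketch below shows that general idea only. It is not the mechanism used in the study, and the function names and the assumption that recognition errors are independent across attempts are introduced here for illustration.

```python
from collections import Counter
from math import comb
from typing import Optional, Sequence


def decode_repeated(observations: Sequence[str]) -> Optional[str]:
    """Return the most frequent recognized command, or None on a tie."""
    if not observations:
        return None
    ranked = Counter(observations).most_common()
    # A tie for first place means the vote is inconclusive.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def no_strict_majority_prob(per_attempt_error: float, attempts: int = 3) -> float:
    """Probability that correct recognitions do NOT form a strict majority.

    Assumes each attempt fails independently with the same probability.
    For attempts=3 this equals the probability that the vote fails to
    return the intended command.
    """
    p = per_attempt_error
    needed = attempts // 2 + 1  # correct recognitions for a strict majority
    return sum(
        comb(attempts, k) * (1 - p) ** k * p ** (attempts - k)
        for k in range(needed)
    )


if __name__ == "__main__":
    print(decode_repeated(["swipe", "tap", "swipe"]))
    print(no_strict_majority_prob(0.10, attempts=3))
```

With a 10% per-attempt error rate, three repetitions bring the failure probability down to about 2.8% under these assumptions, at the cost of a command that takes three times as long to enter. That cost is the kind of trade-off the study examines.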

University of Saskatchewan, Saskatoon, Saskatchewan, Canada

University of Canterbury, Christchurch, New Zealand