Speech recognition is unreliable in noisy places, compromises privacy and security when around strangers, and inaccessible to people with speech disorders. Lip reading can mitigate many of these challenges but the existing silent speech recognizers for lip reading are error prone. Developing new recognizers and acquiring new datasets is impractical for many since it requires enormous amount of time, effort, and other resources. To address these, first, we develop LipType, an optimized version of LipNet for improved speed and accuracy. We then develop an independent repair model that processes video input for poor lighting conditions, when applicable, and corrects potential errors in output for increased accuracy. We tested this model with both LipType and other speech and silent speech recognizers to demonstrate its effectiveness.
https://doi.org/10.1145/3411764.3445565
In this paper, we propose Auth+Track, a novel authentication model that aims to reduce redundant authentication in everyday smartphone usage. By sparse authentication and continuous tracking of user's status, Auth+Track eliminates the "gap" authentication between fragmented sessions and enables "Authentication Free when User is Around". To instantiate Auth+Track model, we present PanoTrack, a valid implementation that integrates body and near field hand information for user tracking. We install a fisheye camera on the top of the phone to achieve panoramic vision that can capture both user's body and on-screen hand. Based on the captured video stream, we develops an algorithm pipeline to extract all the key features for user tracking, including body keypoints and their temporal and spacial association, near field hand status and features for user identity assignment. By analyzing system performance and user experience in real-life scenarios, we demonstrate that our system outperforms existing solutions.
https://doi.org/10.1145/3411764.3445624
We present ElectroRing, a wearable ring-based input device that reliably detects both onset and release of a subtle finger pinch, and more generally, contact of the fingertip with the user's skin. ElectroRing addresses a common problem in ubiquitous touch interfaces, where subtle touch gestures with little movement or force are not detected by a wearable camera or IMU. ElectroRing's active electrical sensing approach provides a step-function-like change in the raw signal, for both touch and release events, which can be easily detected using only basic signal processing techniques. Notably, ElectroRing requires no second point of instrumentation, but only the ring itself, which sets it apart from existing electrical touch detection methods. We built three demo applications to highlight the effectiveness of our approach when combined with a simple IMU-based 2D tracking system.
https://doi.org/10.1145/3411764.3445094
We present Project Tasca, a pocket-based textile sensor that detects user input and recognizes everyday objects that a user carries in the pockets of a pair of pants (e.g., keys, coins, electronic devices, or plastic items). By creating a new fabric-based sensor capable of detecting in-pocket touch and pressure, and recognizing metallic, non-metallic, and tagged objects inside the pocket, we enable a rich variety of subtle, eyes-free, and always-available input, as well as context-driven interactions in wearable scenarios. We developed our prototype by integrating four distinct types of sensing methods, namely: inductive sensing, capacitive sensing, resistive sensing, and NFC in a multi-layer fabric structure into the form factor of a jeans pocket. Through a ten-participant study, we evaluated the performance of our prototype across 11 common objects including hands, 8 force gestures, and 30 NFC tag placements. We yielded 92.3% personal cross-validation accuracy for object recognition, 96.4% accuracy for gesture recognition, and 100% accuracy for detecting NFC tags at close distance . We concluded by demonstrating the interactions enabled by our pocket-based sensor in several applications.
https://doi.org/10.1145/3411764.3445712
Gestures are a promising candidate of a input modality for ambient computing where conventional input modalities such as touchscreen are not available. Existing works have focused on gesture recognition using image sensors. However, its cost, high battery consumption, and privacy concerns made it challenging as an always-on solution. This paper introduces an efficient gesture recognition technique using a miniaturized 60GHz radar sensor. The technique recognizes four directional swipes and omni swipe using a radar chip (6.5×5.0[mm]) integrated into a mobile phone. We developed a convolutional neural network model efficient enough for battery-powered and computationally constrained processors. Its model size and interference time is less than 1/5000 compared to an existing gesture recognition technique using radar. Our evaluations with large scale datasets consisting of 558,000 gesture samples and 3,920,000 negative samples demonstrated our algorithm’s efficiency, robustness, and readiness to be deployed outside of research laboratories.
https://doi.org/10.1145/3411764.3445367
We propose a novel modality for active biometric authentication: electrical muscle stimulation (EMS). To explore this, we engineered an interactive system, which we call ElectricAuth, that stimulates the user’s forearm muscles with a sequence of electrical impulses (i.e., EMS challenge) and measures the user’s involuntary finger movements (i.e., response to the challenge). ElectricAuth leverages EMS’s intersubject variability, where the same electrical stimulation results in different movements in different users because everybody’s physiology is unique (e.g., differences in bone and muscular structure, skin resistance and composition, etc.). As such, ElectricAuth allows users to login without memorizing passwords or PINs. ElectricAuth’s challenge-response structure makes it secure against data breaches and replay attacks, a major vulnerability facing today’s biometrics such as facial recognition and fingerprints. Furthermore, ElectricAuth never reuses the same challenge twice in authentications – in just one second of stimulation it encodes one of 68M possible challenges. In our user studies, we found that ElectricAuth resists: (1) impersonation attacks (false acceptance rate: 0.17% at 5% false rejection rate); (2) replay attacks (false acceptance rate: 0.00% at 5% false rejection rate); and, (3) synthesis attacks (false acceptance rates: 0.2-2.5%). Our longitudinal study also shows that ElectricAuth produces consistent results over time and across different humidity and muscle conditions.
https://doi.org/10.1145/3411764.3445441
We present BackTrack, a trackpad placed on the back of a smartphone to track fine-grained finger motions. Our system has a small form factor, with all the circuits encapsulated in a thin layer attached to a phone case. It can be used with any off-the-shelf smartphone, requiring no power supply or modification of the operating systems. BackTrack simply extends the finger tracking area of the front screen, without interrupting the use of the front screen. It also provides a switch to prevent unintentional touch on the trackpad. All these features are enabled by a battery-free capacitive circuit, part of which is a transparent, thin-film conductor coated on a thin glass and attached to the front screen. To ensure accurate and robust tracking, the capacitive circuits are carefully designed. Our design is based on a circuit model of capacitive touchscreens, justified through both physics-based finite-element simulation and controlled laboratory experiments. We conduct user studies to evaluate the performance of using BackTrack. We also demonstrate its use in a number of smartphone applications.
https://doi.org/10.1145/3411764.3445374
Wake-up-free techniques (e.g., Raise-to-Speak) are important for improving the voice input experience. We present ProxiMic, a close-to-mic (within 5 cm) speech sensing technique using only one microphone. With ProxiMic, a user keeps a microphone-embedded device close to the mouth and speaks directly to the device without wake-up phrases or button presses. To detect close-to-mic speech, we use the feature from pop noise observed when a user speaks and blows air onto the microphone. Sound input is first passed through a low-pass adaptive threshold filter, then analyzed by a CNN which detects subtle close-to-mic features (mainly pop noise). Our two-stage algorithm can achieve 94.1% activation recall, 12.3 False Accepts per Week per User (FAWU) with 68 KB memory size, which can run at 352 fps on the smartphone. The user study shows that ProxiMic is efficient, user-friendly, and practical.
https://doi.org/10.1145/3411764.3445687
We present Pose-on-the-Go, a full-body pose estimation system that uses sensors already found in today’s smartphones. This stands in contrast to prior systems, which require worn or external sensors. We achieve this result via extensive sensor fusion, leveraging a phone’s front and rear cameras, the user-facing depth camera, touchscreen, and IMU. Even still, we are missing data about a user’s body (e.g., angle of the elbow joint), and so we use inverse kinematics to estimate and animate probable body poses. We provide a detailed evaluation of our system, benchmarking it against a professional-grade Vicon tracking system. We conclude with a series of demonstration applications that underscore the unique potential of our approach, which could be enabled on many modern smartphones with a simple software update.
https://doi.org/10.1145/3411764.3445582
We present FaceSight, a computer vision-based hand-to-face gesture sensing technique for AR glasses. FaceSight fixes an infrared camera onto the bridge of AR glasses to provide extra sensing capability of the lower face and hand behaviors. We obtained 21 hand-to-face gestures and demonstrated the potential interaction benefits through five AR applications. We designed and implemented an algorithm pipeline that segments facial regions, detects hand-face contact (f1 score: 98.36%), and trains convolutional neural network (CNN) models to classify the hand-to-face gestures. The input features include gesture recognition, nose deformation estimation, and continuous fingertip movement. Our algorithm achieves classification accuracy of all gestures at 83.06%, proved by the data of 10 users. Due to the compact form factor and rich gestures, we recognize FaceSight as a practical solution to augment input capability of AR glasses in the future.
https://doi.org/10.1145/3411764.3445484
Handheld controllers are an essential part of VR systems. Modern sensing techniques enable them to track users' finger movements to support natural interaction using hands. The sensing techniques, however, often fail to precisely determine whether two fingertips touch each other, which is important for the robust detection of a pinch gesture. To address this problem, we propose AtaTouch, which is a novel, robust sensing technique for detecting the closure of a finger pinch. It utilizes a change in the coupled impedance of an antenna and human fingers when the thumb and finger form a loop. We implemented a prototype controller in which AtaTouch detects the finger pinch of the grabbing hand. A user test with the prototype showed a finger-touch detection accuracy of 96.4%. Another user test with the scenarios of moving virtual blocks demonstrated low object-drop rate (2.75%) and false-pinch rate (4.40%). The results and feedback from the participants support the robustness and sensitivity of AtaTouch.
https://doi.org/10.1145/3411764.3445442
Capacitive touchscreens are near-ubiquitous in today's touch-driven devices, such as smartphones and tablets. By using rows and columns of electrodes, specialized touch controllers are able to capture a 2D image of capacitance at the surface of a screen. For over a decade, capacitive "pixels" have been around 4 millimeters in size – a surprisingly low resolution that precludes a wide range of interesting applications. In this paper, we show how super-resolution techniques, long used in fields such as biology and astronomy, can be applied to capacitive touchscreen data. By integrating data from many frames, our software-only process is able to resolve geometric details finer than the original sensor resolution. This opens the door to passive tangibles with higher-density fiducials and also recognition of every-day metal objects, such as keys and coins. We built several applications to illustrate the potential of our approach and report the findings of a multipart evaluation.
https://doi.org/10.1145/3411764.3445703
In this paper, we propose a reconfigurable electrode, RElectrode, using a microfluidic technique that can change the geometry and material properties of the electrode to satisfy the needs for sensing a variety of different types of user input through touch/touchless gestures, pressure, temperature, and distinguish between different types of objects or liquids. Unlike the existing approaches, which depend on specific-shaped electrode for particular sensing (e.g., coil for inductive sensing), RElectrode enables capacity, inductance, resistance/pressure, temperature, pH sensings all in a single package. We demonstrate the design and fabrication of the microfluidic structure of our RElectrode, evaluate its sensing performance through several studies, and provide some unique applications. RElectrode demonstrates technical feasibility and application values of integrating physical and biochemical properties of microfluidics into novel sensing interfaces.
https://doi.org/10.1145/3411764.3445652