Bayesian hierarchical models are probabilistic models with hierarchical structure that use Bayesian methods for inference. In this paper, we extend Fitts' law into a Bayesian hierarchical pointing model and compare it with typical pooled pointing models (i.e., treating all observations as one pool) and individual pointing models (i.e., building a separate model for each user). The Bayesian hierarchical pointing models outperform pooled and individual pointing models in predicting both the distribution and the mean of pointing movement time, especially when the training data are sparse. Our investigation also shows that both noninformative and weakly informative priors are adequate for modeling pointing actions, although the weakly informative prior performs slightly better than the noninformative prior when the training data size is small. Overall, we conclude that the expected advantages of Bayesian hierarchical models hold for pointing tasks. Bayesian hierarchical modeling should be adopted as a more principled and effective approach to building pointing models than the current common practice in HCI of using pooled or individual models.
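As a rough illustration of what a hierarchical extension of Fitts' law can look like, the sketch below gives each user their own intercept and slope drawn from population-level distributions, written in PyMC. The prior choices, variable names, and synthetic data are placeholders for illustration, not the paper's exact specification.

```python
# Minimal sketch of a hierarchical Fitts' law model in PyMC (illustrative only).
# Assumptions: `mt` holds movement times, `index_of_difficulty` holds
# log2(D/W + 1) per trial, and `user_idx` maps each trial to a user.
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n_users, n_trials = 5, 200
user_idx = rng.integers(0, n_users, n_trials)
index_of_difficulty = rng.uniform(1.0, 6.0, n_trials)
mt = 0.2 + 0.15 * index_of_difficulty + rng.normal(0, 0.05, n_trials)  # synthetic data

with pm.Model() as hierarchical_fitts:
    # Population-level (weakly informative) priors shared across users.
    mu_a = pm.Normal("mu_a", mu=0.0, sigma=1.0)
    mu_b = pm.Normal("mu_b", mu=0.0, sigma=1.0)
    sigma_a = pm.HalfNormal("sigma_a", sigma=0.5)
    sigma_b = pm.HalfNormal("sigma_b", sigma=0.5)

    # Per-user intercepts and slopes drawn from the population distributions.
    a = pm.Normal("a", mu=mu_a, sigma=sigma_a, shape=n_users)
    b = pm.Normal("b", mu=mu_b, sigma=sigma_b, shape=n_users)

    # Likelihood: MT = a_user + b_user * ID + noise.
    sigma = pm.HalfNormal("sigma", sigma=0.5)
    mu_mt = a[user_idx] + b[user_idx] * index_of_difficulty
    pm.Normal("mt_obs", mu=mu_mt, sigma=sigma, observed=mt)

    trace = pm.sample(1000, tune=1000, target_accept=0.9)
```

Because per-user parameters are partially pooled toward the population means, sparse users borrow statistical strength from the rest of the data, which is the mechanism behind the advantage reported for small training sets.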
The accurate and personalized estimation of task difficulty provides many opportunities for optimizing user experience. However, user diversity makes such difficulty estimation hard, in that empirical measurements from some user sample do not necessarily generalize to others. In this paper, we contribute a new approach for personalized difficulty estimation of game levels, borrowing methods from content recommendation. Using factorization machines (FM) on a large dataset from a commercial puzzle game, we are able to predict difficulty as the number of attempts a player requires to pass future game levels, based on observed attempt counts from earlier levels and levels played by others. In addition to performance and scalability, FMs offer the benefit that the learned latent variable model can be used to study the characteristics of both players and game levels that contribute to difficulty. We compare the approach to a simple non-personalized baseline and a personalized prediction using Random Forests. Our results suggest that FMs are a promising tool enabling game designers to both optimize player experience and learn more about their players and the game.
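For reference, the prediction rule of a second-order factorization machine can be written compactly; the NumPy sketch below uses the standard O(kn) identity for the pairwise term. The feature encoding (one-hot player and level indicators) and the toy parameters are assumptions for illustration, not the trained model from the paper.

```python
# Sketch of the second-order factorization machine (FM) prediction equation.
# Assumption: x encodes (player id, level id, optional side features) and
# w0, w, V are parameters of an already trained FM.
import numpy as np

def fm_predict(x: np.ndarray, w0: float, w: np.ndarray, V: np.ndarray) -> float:
    """y = w0 + <w, x> + sum_{i<j} <V_i, V_j> x_i x_j."""
    linear = w0 + w @ x
    # O(k*n) identity for the pairwise term: 0.5 * sum_f ((Vx)_f^2 - (V^2 x^2)_f)
    interactions = 0.5 * np.sum((x @ V) ** 2 - (x ** 2) @ (V ** 2))
    return float(linear + interactions)

# Toy usage: 3 features, latent dimension k = 2.
x = np.array([1.0, 0.0, 1.0])                          # e.g., one-hot player and level
w0, w = 0.1, np.array([0.3, -0.2, 0.5])
V = np.array([[0.1, 0.2], [0.0, -0.1], [0.3, 0.1]])    # one latent vector per feature
print(fm_predict(x, w0, w, V))                         # predicted attempt-count score
```

The learned latent vectors in V are what allow inspecting which player and level characteristics drive predicted difficulty.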
There is a growing interest in adopting Deep Learning (DL) given its superior performance in many domains. However, modern DL frameworks such as TensorFlow often come with a steep learning curve. In this work, we propose INTENT, an interactive system that infers user intent and generates corresponding TensorFlow code on behalf of users. INTENT helps users understand and validate the semantics of generated code by rendering individual tensor transformation steps with intermediate results and element-wise data provenance. Users can further guide INTENT by marking certain TensorFlow operators as desired or undesired, or by directly manipulating the generated code. A within-subjects user study with 18 participants shows that users finish programming tasks in TensorFlow more successfully, and in only half the time, compared with a variant of INTENT that has no interaction or visualization support.
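To make "individual tensor transformation steps with intermediate results" concrete, the snippet below shows the kind of step-by-step TensorFlow code and per-step inspection such a tool could surface. The task (row-normalizing and flattening a matrix) is a made-up example, not one from the study.

```python
# Illustrative only: stepwise TensorFlow transformations with intermediate results.
import tensorflow as tf

x = tf.constant([[1., 2., 3.], [4., 5., 6.]])        # step 0: input, shape (2, 3)

row_sums = tf.reduce_sum(x, axis=1, keepdims=True)   # step 1: per-row sums, shape (2, 1)
normalized = x / row_sums                             # step 2: row-normalize, shape (2, 3)
flattened = tf.reshape(normalized, [-1])              # step 3: flatten, shape (6,)

for name, t in [("row_sums", row_sums), ("normalized", normalized), ("flattened", flattened)]:
    print(name, t.shape, t.numpy())                   # inspect each intermediate result
```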
Forward biomechanical simulation in HCI holds great promise as a tool for evaluation, design, and engineering of user interfaces. Although reinforcement learning (RL) has been used to simulate biomechanics in interaction, prior work has relied on unrealistic assumptions about the control problem involved, which limits the plausibility of emerging policies. These assumptions include direct torque actuation as opposed to muscle-based control; direct, privileged access to the external environment, instead of imperfect sensory observations; and lack of interaction with physical input devices. In this paper, we present a new approach for learning muscle-actuated control policies based on perceptual feedback in interaction tasks with physical input devices. This allows modelling of more realistic interaction tasks with cognitively plausible visuomotor control. We show that our simulated user model successfully learns a variety of tasks representing different interaction methods, and that the model exhibits characteristic movement regularities observed in studies of pointing. We provide an open-source implementation which can be extended with further biomechanical models, perception models, and interactive environments.
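The control problem described above can be sketched as a reinforcement-learning environment whose actions are muscle activations and whose observations are noisy perceptual signals rather than privileged state. The Gymnasium-style skeleton below is a placeholder illustrating that structure; the environment name, dynamics, and observation contents are assumptions, not the paper's released implementation.

```python
# Sketch of the control-loop structure: muscle activations in, noisy perceptual
# observations out. Dynamics here are a stand-in, not a biomechanical simulation.
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class MuscleActuatedPointingEnv(gym.Env):
    """Hypothetical environment: actions are muscle activations in [0, 1]."""

    def __init__(self, n_muscles: int = 6):
        self.action_space = spaces.Box(0.0, 1.0, shape=(n_muscles,), dtype=np.float32)
        # Observation: e.g., visually perceived cursor and target positions (noisy).
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(4,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self._state = self.np_random.uniform(-1, 1, size=4).astype(np.float32)
        return self._observe(), {}

    def step(self, action):
        # Placeholder dynamics: a real model would run the biomechanical simulation here.
        self._state[:2] += 0.05 * (action[:2] - action[2:4])
        reward = -float(np.linalg.norm(self._state[:2] - self._state[2:]))
        return self._observe(), reward, False, False, {}

    def _observe(self):
        # Imperfect sensory observation: add visual noise instead of exposing true state.
        return self._state + self.np_random.normal(0, 0.01, size=4).astype(np.float32)

env = MuscleActuatedPointingEnv()
obs, _ = env.reset(seed=0)
for _ in range(5):
    obs, reward, terminated, truncated, _ = env.step(env.action_space.sample())
```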
Interactions based on automatic speech recognition (ASR) have become widely used, with speech input increasingly employed to create documents. However, because there is no easy way in speech to distinguish between commands being issued and text to be entered, misrecognitions are difficult to identify and correct, and documents need to be edited and corrected manually. Entering symbols and commands is also challenging because they may be misrecognized as text letters. To address these problems, this study proposes a speech interaction method called DualVoice, in which commands are entered in a whispered voice and letters in a normal voice. The proposed method requires no specialized hardware beyond a regular microphone, enabling completely hands-free interaction. The method can be used in a wide range of situations where speech recognition is already available, ranging from text input to mobile and wearable computing. Two neural networks were designed in this study, one for discriminating normal speech from whispered speech, and the other for recognizing whispered speech. A prototype text input system was then developed to show how normal and whispered voices can be used for speech-based text input. Other potential applications of DualVoice are also discussed.
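As a rough sketch of the first of the two networks, a normal-versus-whisper discriminator could be built as a small convolutional classifier over log-mel spectrogram patches; the feature choice, architecture, and sizes below are assumptions for illustration, not DualVoice's actual design.

```python
# Minimal sketch of a normal-vs-whisper discriminator over log-mel spectrogram
# patches (illustrative assumption, not the paper's network).
import tensorflow as tf

def build_discriminator(n_mels: int = 64, n_frames: int = 100) -> tf.keras.Model:
    inputs = tf.keras.Input(shape=(n_mels, n_frames, 1))          # log-mel spectrogram patch
    x = tf.keras.layers.Conv2D(16, 3, activation="relu")(inputs)
    x = tf.keras.layers.MaxPooling2D()(x)
    x = tf.keras.layers.Conv2D(32, 3, activation="relu")(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)   # P(whispered)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_discriminator()
model.summary()
```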
It is important for photographers to have the best possible lighting configuration at the time of shooting; otherwise, they need to post-process the images, which may introduce artifacts and degrade quality. Photographers therefore often struggle to find the best possible lighting configuration by manipulating lighting devices, including light sources and modifiers, in a trial-and-error manner. In this paper, we propose a novel computational framework to support photographers in this process. The framework assumes that every lighting device is programmable; that is, its adjustable parameters (e.g., orientation, intensity, and color temperature) can be set by a program. With our framework, photographers do not need to learn how the parameter values affect the resulting lighting, and do not even need to devise a strategy for the trial-and-error process; instead, they need only concentrate on judging which of the lighting configurations suggested by the system is more desirable. The framework is enabled by our novel photographer-in-the-loop Bayesian optimization, which is sample-efficient (i.e., it requires only a small number of evaluation steps) and which can also be guided by a rough painting of the desired lighting, if one is provided. We demonstrate how the framework works in both simulated virtual environments and a physical environment, suggesting that it can find pleasing lighting configurations quickly, in around 10 iterations. Our user study suggests that, compared with the traditional manual lighting workflow, the framework lets photographers concentrate on the look of the captured images rather than on the parameters.
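The overall loop structure (the system proposes a lighting configuration, the photographer evaluates the resulting photo, and the surrogate model is updated) can be sketched with scikit-optimize's ask/tell interface. This is a simplification: the parameter space, the scalar rating standing in for the photographer's preferential feedback, and the stub objective are all assumptions, not the paper's preferential Bayesian optimization formulation.

```python
# Simplified sketch of a photographer-in-the-loop optimization loop using
# scikit-optimize's ask/tell interface; a scalar rating stands in for the
# paper's preferential feedback (assumption for illustration).
from skopt import Optimizer
from skopt.space import Real

# Hypothetical adjustable parameters of one programmable light.
space = [Real(0.0, 360.0, name="pan"),
         Real(-90.0, 90.0, name="tilt"),
         Real(0.0, 1.0, name="intensity"),
         Real(2700.0, 6500.0, name="color_temperature")]

opt = Optimizer(space, base_estimator="GP", acq_func="EI")

def photographer_rating(params) -> float:
    """Placeholder for the human evaluation step: apply `params` to the lights,
    capture a photo, and return how undesirable the result is (lower is better)."""
    pan, tilt, intensity, color_temperature = params
    return abs(intensity - 0.7) + abs(color_temperature - 5000.0) / 1000.0  # stub

for _ in range(10):                 # roughly the ~10 iterations reported above
    candidate = opt.ask()           # system suggests a lighting configuration
    score = photographer_rating(candidate)
    opt.tell(candidate, score)      # surrogate model is updated with the feedback

print(opt.get_result().x)           # best configuration found so far
```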