Locating a target based on auditory and visual cues—such as finding a car in a crowded parking lot or identifying a speaker in a virtual meeting—requires balancing effort, time, and accuracy under uncertainty. Existing models of audiovisual search often treat perception and action in isolation, overlooking how people adaptively coordinate movement and sensory strategies. We present Sensonaut, a computational model of embodied audiovisual search. The core assumption is that people deploy their body and sensory systems in ways they believe will most efficiently improve their chances of locating a target, trading off time and effort under perceptual constraints. Our model formulates this as a resource-rational decision-making problem under partial observability. We validate the model against newly collected human data, showing that it reproduces the adaptive scaling of search time and effort under task complexity, occlusion, and distraction, as well as characteristic human errors. Our simulation of human-like resource-rational search informs the design of audiovisual interfaces that minimize search cost and cognitive load.
Throughput is a widely used performance metric, combining speed and accuracy into a single measure, while reducing the effect of subjective speed–accuracy trade-offs. Despite its wide application in 2D steering tasks, its direct extension to 3D presents unique challenges since 3D trajectories exhibit higher variability, and perceptual–motor factors undermine existing formulations. Consequently, throughput has not been systematically adopted for evaluating steering in 3D virtual environments. In this paper, using a controlled virtual reality user study with a ring-and-wire task, we introduce and validate a novel throughput formulation for 3D steering that computes effective width from the bivariate standard deviation of the trajectory. Our results show that this formulation provides smoother throughput values across subjective speed–accuracy differences and improves model fit compared to traditional approaches. This work advances our theoretical understanding of the steering law in 3D contexts, provides researchers and practitioners with a robust evaluation method, and establishes a foundation for future studies of complex 3D trajectory interactions.
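The effective-width idea in this abstract can be illustrated with a minimal sketch. It assumes trajectory deviations are measured orthogonally to the ideal path and uses the conventional 4.133 factor (96% inclusion from the 1D Fitts'-law tradition); the function names and this exact bivariate formulation are assumptions for illustration, not the paper's published method.

```python
import numpy as np

def effective_width_bivariate(deviations_xy, k=4.133):
    """Effective width from the bivariate standard deviation of
    trajectory deviations around the ideal path (illustrative sketch).
    deviations_xy: (n, 2) array of orthogonal deviations per sample."""
    d = np.asarray(deviations_xy, dtype=float)
    d = d - d.mean(axis=0)                            # centre on the mean path
    sigma_biv = np.sqrt((d ** 2).sum(axis=1).mean())  # bivariate SD
    return k * sigma_biv

def steering_throughput(path_length, effective_w, movement_time):
    """Throughput (bits/s) as the steering-law index of difficulty
    ID_e = A / W_e divided by movement time."""
    return (path_length / effective_w) / movement_time
```

Using the observed spread rather than the nominal tunnel width is what normalizes away subjective speed–accuracy preferences: a cautious participant yields a small effective width and a fast one a large width, moving both toward comparable throughput.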
This paper presents an N-ary Gaussian Model for predicting endpoint distributions in pointing tasks across task scenarios. Built on the foundational principles of the Ternary Gaussian model series, our framework allows researchers to define parameter constraints and automatically refine model combinations, eliminating the need to derive predefined equations from data analysis. We use the Bayesian Information Criterion (BIC) for model selection, ensuring simplicity while maintaining predictive accuracy. We conducted a comparative analysis against published baselines across 7 diverse datasets covering 1D, 2D, and 3D tasks, different input modalities, different display devices, and time-constrained scenarios, demonstrating the robustness and generalization of the N-ary Gaussian Model. The model offers an automated solution for modeling pointing uncertainty and is the first to incorporate output-device, input-modality, and temporal-constraint factors into spatial pointing-uncertainty modeling.
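The BIC-driven selection described above can be sketched minimally as follows. The helper names and the candidate-tuple representation are hypothetical, not the paper's implementation; the criterion itself is the standard BIC, which penalizes parameter count by log sample size.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion: lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood

def select_model(candidates, n_samples):
    """Pick the candidate model variant with the lowest BIC.
    candidates: list of (name, log_likelihood, n_params) tuples."""
    return min(candidates, key=lambda c: bic(c[1], c[2], n_samples))[0]
```

For example, a 6-parameter variant must improve log-likelihood by at least 2·ln(n) over a 2-parameter variant to be preferred, which is how the framework keeps refined model combinations simple.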
Assistive interfaces, such as recommendation engines, adaptive systems, and intelligent assistants, span diverse methods and disciplines but lack a shared conceptual foundation.
This paper models assistance as sequential decision-making under uncertainty between two agents: the user and the assistant.
The formalism allows casting assistance as an optimization problem and offers a rich yet principled vocabulary for understanding the dynamics of assistance.
Drawing on Partially Observable Stochastic Games (POSGs) and related models, we: (1) motivate multi-agent over single-agent formulations; (2) adapt POSGs to HCI and clarify their tractability through reductions; (3) propose a two-agent sequential model that unambiguously defines concepts such as adaptation, augmentation, and delegation; (4) illustrate applicability through domain problems and examples; and (5) offer a supporting implementation via a library. These results warrant greater attention to decision theory as a principled yet actionable approach to assistive interfaces.
General-purpose LLMs pose misinformation risks for development and policy experts because they lack the epistemic humility to produce verifiable outputs. We present AVA (AI + Verified Analysis), a GenAI platform built on a curated library of over 4,000 World Bank Reports with multilingual capabilities. AVA’s multi-agent pipeline enables users to query the library and receive evidence-based syntheses. It operationalizes epistemic humility through two mechanisms: citation verifiability (tracing claims to sources) and reasoned abstention (declining unsupported queries with justification and redirection). We conducted an in-the-wild evaluation with over 2,200 individuals from heterogeneous organisations and roles in 116 countries, via log analysis, surveys, and 20 interviews. Difference-in-Differences estimates associate sustained engagement with 2.4–3.9 hours saved weekly. Qualitatively, participants used AVA as a specialized “evidence engine”; reasoned abstention clarified scope boundaries, and trust was calibrated through institutional provenance and page-anchored citations. We contribute design guidelines for specialized AI and articulate a vision for “ecosystem-aware” Humble AI.
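The Difference-in-Differences estimator mentioned above, in its canonical 2x2 form, subtracts the control group's change from the treated group's change. The sketch below is illustrative only; the numbers are invented and unrelated to the reported 2.4–3.9 hour result, and the study's actual specification may include covariates.

```python
def diff_in_diff(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Canonical 2x2 Difference-in-Differences estimate:
    (treated change) minus (control change). Under the parallel-trends
    assumption, this isolates the effect attributable to treatment."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Hypothetical example: treated users improve by 4 units, controls by 1,
# so the estimated treatment effect is 3.
effect = diff_in_diff(treat_pre=10.0, treat_post=14.0,
                      ctrl_pre=10.0, ctrl_post=11.0)
```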
We present AutoOptimization, a novel multi-objective optimization framework for adapting user interfaces. From a user’s verbal preferences for changing a UI, our framework guides a prioritization-based Pareto frontier search over candidate layouts. It selects suitable objective functions for UI placement while simultaneously parameterizing them according to the user's instructions to define the optimization problem. A solver then generates a series of optimal UI layouts, which our framework validates against the user's instructions to adapt the UI with the final solution. Our approach thus overcomes the previous need for manual inspection of layouts and the use of population averages for objective parameters. We integrate a Vision-Language Model whose reasoning capabilities allow our framework to focus on the Pareto optimization, prioritize results, and validate outcomes. We evaluate each step of our framework inside a Mixed Reality use case and demonstrate that AutoOptimization effectively increases the usability of UI adaptation schemes.
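A Pareto frontier search over candidate layouts rests on non-dominated filtering, which can be sketched as follows. This is a generic minimal illustration (minimization objectives assumed, hypothetical function names); the paper's prioritization logic and VLM validation are not reproduced here.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective (minimization)
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the non-dominated subset of candidate objective vectors,
    e.g. (reach_cost, visual_clutter) scores for candidate UI layouts."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]
```

A prioritization step would then rank the surviving frontier points, for instance by weights inferred from the user's verbal preferences, before validation selects the final layout.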
Deciding which idea is worth prototyping is a central concern in iterative design. A prototype should be produced when the expected improvement is high and the cost is low. However, this is hard to decide, because costs can vary drastically: a simple parameter tweak may take seconds, while fabricating hardware consumes material and energy. Such asymmetries can discourage a designer from exploring the design space. In this paper, we present an extension of cost-aware Bayesian optimization to account for diverse prototyping costs. The method builds on the power of Bayesian optimization and requires only a minimal modification to the acquisition function. The key idea is to use designer-estimated costs to guide sampling toward more cost-effective prototypes. In technical evaluations, the method achieved comparable utility to a cost-agnostic baseline while requiring only approximately 70 percent of the cost; under strict budgets, it outperformed the baseline threefold. A within-subjects study with 12 participants in a realistic joystick design task demonstrated similar benefits. These results show that accounting for prototyping costs can make Bayesian optimization more compatible with real-world design projects.
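The "minimal modification to the acquisition function" could, for example, scale expected improvement by designer-estimated cost. The sketch below shows one common cost-aware variant (EI per unit cost) as an assumption for illustration, not necessarily the paper's exact formulation.

```python
import math

def expected_improvement(mu, sigma, best):
    """Standard expected improvement for maximization under a
    Gaussian posterior with mean mu and standard deviation sigma."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

def cost_aware_ei(mu, sigma, best, estimated_cost):
    """Hypothetical cost-scaled acquisition: improvement per unit of
    designer-estimated prototyping cost (e.g. seconds for a parameter
    tweak vs. hours for hardware fabrication)."""
    return expected_improvement(mu, sigma, best) / max(estimated_cost, 1e-9)
```

Dividing by estimated cost steers sampling toward cheap prototypes with comparable expected gain, which matches the abstract's finding that similar utility was reached at roughly 70 percent of the cost.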