Disambiguating distal target selection in dense and occluded virtual environments remains a challenge for Virtual Reality (VR) interaction design. Although raycasting is widely used for selecting distant objects, it defaults to the first intersected target, forcing users into a separate disambiguation phase that can disrupt presence, increase cognitive load, and slow interaction. We introduce VoiceRay, a voice-based target selection technique that lets users specify the ordinal position of the intended target along the ray (e.g., “second object”) without altering the scene or requiring additional input from the user. In a study with 24 participants, we compared VoiceRay against five existing techniques: AlphaCursor, LassoGrid, RayCursor, BubbleRay, and Raycasting. Results showed that VoiceRay significantly decreased selection time, maintained presence, increased usability, and reduced cognitive load. These findings demonstrate that voice-based interaction offers an effective, easy-to-use alternative for resolving 3D selection ambiguity in dense and occluded VR environments.
ACM CHI Conference on Human Factors in Computing Systems
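
To make the ordinal selection idea in the abstract concrete, the following is a minimal sketch, not VoiceRay's actual implementation, which the abstract does not detail. It assumes that speech recognition has already produced a text transcript, that scene objects can be approximated by bounding spheres, and that ordinals are counted from the object nearest the ray origin. All names here (`Sphere`, `ORDINALS`, `ray_hit_distance`, `select_by_ordinal`) are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Sphere:
    """Hypothetical stand-in for a selectable scene object (bounding sphere)."""
    name: str
    center: tuple[float, float, float]
    radius: float


# Assumed vocabulary: spoken ordinal word -> 1-based rank along the ray.
ORDINALS = {"first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5}


def ray_hit_distance(origin, direction, sphere):
    """Return the distance along the ray to the sphere's nearest hit, or None."""
    # Normalize the direction so the result is a true distance.
    length = math.sqrt(sum(c * c for c in direction))
    d = tuple(c / length for c in direction)
    oc = tuple(o - c for o, c in zip(origin, sphere.center))
    b = sum(x * y for x, y in zip(oc, d))
    c = sum(x * x for x in oc) - sphere.radius ** 2
    disc = b * b - c
    if disc < 0:
        return None  # the ray misses the sphere
    root = math.sqrt(disc)
    t = -b - root
    if t < 0:
        t = -b + root  # ray origin lies inside the sphere
    return t if t >= 0 else None


def select_by_ordinal(origin, direction, objects, utterance):
    """Pick the n-th object intersected by the ray, where n comes from speech."""
    words = utterance.lower().split()
    rank = next((n for word, n in ORDINALS.items() if word in words), None)
    if rank is None:
        return None  # no ordinal word recognized in the transcript
    hits = []
    for obj in objects:
        t = ray_hit_distance(origin, direction, obj)
        if t is not None:
            hits.append((t, obj))
    hits.sort(key=lambda hit: hit[0])  # nearest intersection first
    return hits[rank - 1][1] if rank <= len(hits) else None


# Three objects stacked along the ray, plus one off to the side.
scene = [
    Sphere("C", (0.0, 0.0, 9.0), 0.5),
    Sphere("A", (0.0, 0.0, 2.0), 0.5),
    Sphere("B", (0.0, 0.0, 5.0), 0.5),
    Sphere("D", (5.0, 0.0, 3.0), 0.5),
]
target = select_by_ordinal((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), scene, "second object")
print(target.name if target else None)
```

Plain raycasting corresponds to always taking the first entry of the sorted hit list; the sketch differs only in letting the spoken ordinal choose the index, which is why the scene itself never has to change. Returning `None` when the rank exceeds the number of hits is one possible design choice; a real system might instead ask the user to repeat the command.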