Towards Building Condition-Based Cross-Modality Intention-Aware Human-AI Cooperation under VR Environment

To address the challenges of identifying user intent and forming relevant information presentations and recommendations in VR environments, we propose a condition-based multi-modal human-AI cooperation framework. Its key elements are intent tuples (intent, condition, intent prompt, action prompt) and a 2-Large-Language-Models (2-LLMs) architecture. The design uses "condition" as the core element to describe tasks, dynamically match user interactions with intentions, and enable the generation of tailored multi-modal AI responses. The 2-LLMs architecture separates intent detection from action generation, which shortens the prompts and helps generate appropriate responses. We implemented a VR-based intelligent furniture purchasing system based on the proposed framework and conducted a three-phase comparative user study. The results demonstrate the system's advantages in time efficiency and accuracy, intention conveyance, effective product acquisition, and user satisfaction and cooperation preference. Our framework provides a promising approach towards personalized and efficient user experiences in VR.
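The intent tuple and the split between the two LLMs can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names the four tuple fields but not their representation, so the `IntentTuple` class, the predicate form of `condition`, the use of conditions to pre-filter candidates, the prompt wording, and the stub models below are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical types: any text-in, text-out model can stand in for an LLM.
Interaction = Dict[str, str]   # e.g. {"gaze": "sofa_A", "speech": "..."}
LLM = Callable[[str], str]


@dataclass
class IntentTuple:
    """The four fields named in the abstract; their types are assumed."""
    intent: str
    condition: Callable[[Interaction], bool]  # assumed to be a predicate
    intent_prompt: str                        # used by the intent-detection LLM
    action_prompt: str                        # used by the action-generation LLM


def detect_intent(interaction: Interaction, tuples: List[IntentTuple],
                  intent_llm: LLM) -> Optional[IntentTuple]:
    """LLM 1: choose an intent among the tuples whose condition holds."""
    candidates = [t for t in tuples if t.condition(interaction)]
    if not candidates:
        return None
    # Only matching intents enter the prompt, which keeps it short.
    prompt = "\n".join(f"{t.intent}: {t.intent_prompt}" for t in candidates)
    prompt += f"\nUser interaction: {interaction}\nAnswer with one intent name."
    answer = intent_llm(prompt).strip()
    return {t.intent: t for t in candidates}.get(answer)


def generate_action(interaction: Interaction, matched: IntentTuple,
                    action_llm: LLM) -> str:
    """LLM 2: sees only the action prompt of the detected intent."""
    return action_llm(f"{matched.action_prompt}\nUser interaction: {interaction}")


# Stub models so the sketch runs without any external service.
def stub_intent_llm(prompt: str) -> str:
    return "compare_products" if "compare_products" in prompt else "unknown"


def stub_action_llm(prompt: str) -> str:
    return "SHOW_COMPARISON_PANEL"


TUPLES = [
    IntentTuple("compare_products",
                lambda i: "compare" in i.get("speech", ""),
                "The user wants to compare two pieces of furniture.",
                "Produce a side-by-side comparison panel."),
    IntentTuple("ask_price",
                lambda i: "price" in i.get("speech", ""),
                "The user wants to know the price of an item.",
                "State the price of the item the user is looking at."),
]

interaction = {"gaze": "sofa_A", "speech": "compare this with the grey one"}
matched = detect_intent(interaction, TUPLES, stub_intent_llm)
response = generate_action(interaction, matched, stub_action_llm) if matched else ""
```

In this sketch each model receives only the prompt text relevant to its own role, which is one way to read the abstract's claim that separating the roles shortens the prompts.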

Xi'an Jiaotong University, Xi'an, China

https://doi.org/10.1145/3613904.3642360

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2024.acm.org/)

Remote Sessions

14 presentations

Start: 2024-05-13 18:00:00

End: 2024-05-14 02:20:00
