Towards Building Condition-Based Cross-Modality Intention-Aware Human-AI Cooperation under VR Environment

Abstract

To address the challenges of identifying user intent and presenting relevant information and recommendations in VR environments, we propose an innovative condition-based multi-modal human-AI cooperation framework. It centers on intent tuples (intent, condition, intent prompt, action prompt) and a 2-Large-Language-Models (2-LLMs) architecture. The design uses "condition" as the core construct to describe tasks, dynamically match user interactions with intentions, and generate tailored multi-modal AI responses. The 2-LLMs architecture separates the roles of intent detection and action generation, shortening prompts and helping each model produce appropriate responses. We implemented a VR-based intelligent furniture-purchasing system on the proposed framework and conducted a three-phase comparative user study. The results demonstrate the system's advantages in time efficiency and accuracy, intention conveyance, effective product acquisition, and user satisfaction and cooperation preference. Our framework offers a promising approach to personalized and efficient user experiences in VR.
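
As a rough illustration of the mechanism the abstract describes, the sketch below shows how condition-keyed intent tuples and the role split between two LLMs might fit together. It is not the authors' implementation: all names (IntentTuple, call_llm, detect_intent, generate_action) are hypothetical, the LLM backend is left as a stub, and the real system consumes multi-modal VR interactions rather than plain text.

```python
from dataclasses import dataclass

# Hypothetical sketch only; the paper does not publish code. It illustrates
# the (intent, condition, intent prompt, action prompt) tuple and the
# separation of intent detection from action generation across two LLMs.

@dataclass
class IntentTuple:
    intent: str         # e.g. "compare two sofas"
    condition: str      # condition describing when this intent applies
    intent_prompt: str  # short hint for the intent-detection LLM
    action_prompt: str  # instructions for the action-generation LLM

def call_llm(prompt: str) -> str:
    """Stub standing in for any chat-completion backend."""
    raise NotImplementedError("plug in an actual LLM API here")

def detect_intent(interaction: str, tuples: list[IntentTuple]) -> IntentTuple:
    """LLM #1: match the user's interaction (serialized to text here)
    against the registered conditions and return the matching tuple."""
    menu = "\n".join(
        f"{i}. intent={t.intent}; condition={t.condition}; hint={t.intent_prompt}"
        for i, t in enumerate(tuples)
    )
    reply = call_llm(
        "Answer only with the number of the intent whose condition the "
        f"interaction satisfies.\n\nInteraction: {interaction}\n\n{menu}"
    )
    return tuples[int(reply.strip())]

def generate_action(interaction: str, matched: IntentTuple) -> str:
    """LLM #2: generate the tailored response for the matched intent,
    seeing only that tuple's action prompt rather than the full menu."""
    return call_llm(
        f"{matched.action_prompt}\n\nUser interaction: {interaction}\n"
        "Respond with the content to present in the VR scene."
    )
```

In this reading, the second call carries only the matched tuple's action prompt, which is how the role split keeps each prompt short, the benefit the abstract attributes to the 2-LLMs design.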

Authors
Ziyao He
Xi'an Jiaotong University, Xi'an, China
Shiyuan Li
Xi'an Jiaotong University, Xi'an, China
Yunpeng Song
Xi'an Jiaotong University, Xi'an, China
Zhongmin Cai
Xi'an Jiaotong University, Xi'an, China
Paper URL

doi.org/10.1145/3613904.3642360

Conference: CHI 2024

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2024.acm.org/)

Session: Remote Presentations: Highlight on AI

Remote Sessions
14 presentations
2024-05-13 18:00:00 – 2024-05-14 02:20:00