Embodiment can enhance conversational agents, such as increasing their perceived presence. This is typically achieved through visual representations of a virtual body; however, visual modalities are not always available, such as when users interact with agents using headphones or display-less glasses. In this work, we explore auditory embodiment. By introducing auditory cues of bodily presence — through spatially localized voice and situated Foley audio from environmental interactions — we investigate how audio alone can convey embodiment and influence perceptions of a conversational agent. We conducted a 2 (spatialization: monaural vs. spatialized) × 2 (Foley: none vs. Foley) within-subjects study, where participants (n=24) engaged in conversations with agents. Our results show that spatialization and Foley increase co-presence, but reduce users’ perceptions of the agent’s attention and other social attributes.
ACM CHI Conference on Human Factors in Computing Systems