In this paper, we explore how expressive auditory gestures added to the speech of a pedagogical agent influence the human-agent relationship and learning outcomes. In a between-subjects experiment, 41 participants assumed the role of a tutor teaching a voice-based agent. Throughout the interaction, the agent used either expressive interjections (e.g., "yay", "hmm", "oh"), brief expressive musical executions, or no auditory gestures at all (control condition). Overall, the results indicate that both types of gestures can positively affect the interaction; in particular, interjections can significantly increase feelings of emotional rapport with the agent and enhance learners' motivation. We discuss the implications of our findings, as our work adds to the understanding of conversational agent design and can be useful for education as well as other domains in which dialogue systems are used.
https://dl.acm.org/doi/abs/10.1145/3491102.3517599
Though recent technological advances have enabled note-taking through different modalities (e.g., keyboard, digital ink, voice), there is still a lack of understanding of the effect of modality choice on learning. In this paper, we compared two note-taking input modalities—keyboard and voice—to study their effects on participants' understanding of learning content. We conducted a study with 60 participants in which they were asked to take notes using voice or keyboard on two independent digital text passages while also making a judgment about their performance on an upcoming test. We built mixed-effects models to examine the effect of the note-taking modality on learners' text comprehension, the content of their notes, and their meta-comprehension judgments. Our findings suggest that taking notes using voice leads to a higher conceptual understanding of the text when compared to typing the notes. We also found that using voice triggers generative processes that result in learners taking more elaborate and comprehensive notes. The findings of the study imply that note-taking tools designed for digital learning environments could incorporate voice as an input modality to promote effective note-taking and a higher conceptual understanding of the text.
https://dl.acm.org/doi/abs/10.1145/3491102.3501974
Avatar customization is known to positively affect crucial outcomes in numerous domains. However, it is unknown whether audial customization can confer the same benefits as visual customization. We conducted a preregistered 2 x 2 (visual choice vs. visual assignment x audial choice vs. audial assignment) study in a Java programming game. Participants with visual choice experienced higher avatar identification and autonomy. Participants with audial choice experienced higher avatar identification and autonomy, but only within the group of participants who had visual choice available. Visual choice led to an increase in time spent, and indirectly led to increases in intrinsic motivation, immersion, time spent, future play motivation, and likelihood of game recommendation. Audial choice moderated the majority of these effects. Our results suggest that audial customization plays an important enhancing role vis-à-vis visual customization. However, audial customization appears to have a weaker effect compared to visual customization. We discuss the implications for avatar customization more generally across digital applications.
https://dl.acm.org/doi/abs/10.1145/3491102.3501848
Video programs are important, accessible educational resources for young children, especially those from under-resourced backgrounds. These programs' potential can be amplified if children are allowed to socially interact with media characters while watching. This paper presents the design and empirical investigation of interactive science-focused videos in which the main character, powered by a conversational agent, engaged in contingent conversation with children by asking them questions and providing responsive feedback. We found that children actively interacted with the media character in the conversational videos and that their parents spontaneously provided support in the process. We also found that children who watched the conversational video performed better on an immediate, episode-specific science assessment than their peers who watched the broadcast, non-interactive version of the same episode. Several design implications are discussed for using conversational technologies to better support children's active learning and parents' involvement in video watching.
https://dl.acm.org/doi/abs/10.1145/3491102.3502050
Smart speakers are increasingly being adopted by families in the U.S. In many cases, a smart speaker is shared by family members, which can make the sense of ownership uncertain. Through a diary- and interview-based study with 20 Asian Indian parents and teenagers living in the U.S., this study uses thematic analysis to highlight various aspects of smart speakers that support or hinder the fulfillment of the psychological needs – self-efficacy, self-identity, territoriality, autonomy, and accountability and responsibility – that foster a sense of ownership. The paper also discusses the experiences of a cultural group that is not well represented in prior HCI work, thereby adding useful nuance and knowledge about inclusivity to the study of smart-speaker technology in HCI. Finally, it contributes six actionable design guidelines, including guidelines for maintaining conversational context across different interactions and for fostering a sense of ownership among users who share smart speakers through the use of territorial markers.
https://dl.acm.org/doi/abs/10.1145/3491102.3517680