Early counting forms a critical foundation for numeracy, involving coordination of visual representations, verbal number words, and physical actions such as pointing. Designing effective technologies for young children therefore requires careful calibration of multimodal features. This study investigated how three levels of demonstration paired with a voice assistant, namely static (baseline: image+voice), animated (animation+voice), and interactive (touch+animation+voice), influence counting-related understanding and engagement in 2–4-year-olds. We developed a tablet-based counting game and conducted a within-subjects study with 32 children. Results showed that the animated demonstration improved cardinal number word understanding over both the baseline and the interactive demonstration. Analyses of verbal counting engagement showed that concurrent touch demands increased cognitive load, limiting children’s counting aloud. These findings suggest that more interactivity does not always yield better outcomes for young learners. We contribute empirical evidence and design guidance: voice+animation supports early counting, while touch interactivity should be lightweight and age-appropriate. These contributions inform the design of multimodal voice-assisted applications.
ACM CHI Conference on Human Factors in Computing Systems