Digital images remain largely inaccessible to blind or visually impaired (BVI) people because alt-text rarely conveys how materials feel or sound. We augment material images with multimodal vibrotactile patterns and evaluate four generation pipelines: AP1 prompts with a one-shot example; AP2 prompts to audio, then to a pattern; AP3 converts a real finger–material recording to a pattern; and AP4 draws patterns from a public haptic database. A custom vibrotactile tablet played the patterns on 10 material images (e.g., wood, stone, glass). Eight BVI participants explored each image with all four patterns and ranked the best match. Think-aloud feedback highlighted five themes: Theme 1, realism (rough/grainy for wood and stone; smooth/steady for glass); Theme 2, distinctiveness (separable cues; uniform buzzes were criticized); Theme 3, personal associations; Theme 4, effort and calibration (faint or noisy patterns; intensity tuning); and Theme 5, preferences and suggestions. AP3 felt most authentic; AI-generated patterns aided clarity but seemed stylized. Exploratory rankings (n=8; AP3 median 3/4, AI medians 2/4) support hybrid, user-tunable pipelines for accessible material perception.
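The abstract does not specify how AP3 turns a finger–material recording into an actuator-ready pattern. A minimal, hypothetical sketch of one common approach is shown below: bandpass the audio to a range a vibrotactile actuator can render, extract the amplitude envelope, and resample it to the actuator's update rate. The function name, the 50–500 Hz band, and the 1 kHz output rate are illustrative assumptions, not the paper's method.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt, hilbert, resample

def recording_to_pattern(wav_path, band=(50.0, 500.0), out_rate=1000):
    """Convert a finger-material audio recording into a 1-D vibration
    amplitude pattern in [0, 1] for a vibrotactile actuator.
    (Hypothetical sketch; band and out_rate are assumed values.)"""
    rate, audio = wavfile.read(wav_path)
    if audio.ndim > 1:                       # mix stereo down to mono
        audio = audio.mean(axis=1)
    audio = audio.astype(np.float64)
    peak = np.max(np.abs(audio))
    audio /= peak if peak > 0 else 1.0       # normalize to [-1, 1]

    # Keep only the band a typical LRA/voice-coil actuator can render.
    sos = butter(4, band, btype="bandpass", fs=rate, output="sos")
    filtered = sosfiltfilt(sos, audio)

    # Amplitude envelope via the analytic signal.
    envelope = np.abs(hilbert(filtered))

    # Downsample the envelope to the actuator's update rate.
    n_out = int(len(envelope) * out_rate / rate)
    pattern = resample(envelope, n_out)
    top = pattern.max()
    return np.clip(pattern / (top if top > 0 else 1.0), 0.0, 1.0)
```

The resulting array could drive per-frame actuator intensity as the finger moves over the image; a real pipeline would also need spatial mapping from touch position to material region, which this sketch omits.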
ACM CHI Conference on Human Factors in Computing Systems