Modeling visual search not only offers an opportunity to predict the usability of an interface before testing it on real users but also advances scientific understanding of human behavior. In this work, we first conduct a set of analyses on a large-scale dataset of visual search tasks on realistic webpages. We then present a deep neural network that learns to predict the scannability of webpage content, i.e., how easy it is for a user to find a specific target. Our model leverages both heuristic-based features such as target size and unstructured features such as raw image pixels. This approach allows us to model complex interactions that might be involved in a realistic visual search task, which cannot be achieved by traditional analytical models. We analyze the model's behavior to offer insights into how the salience map it learns aligns with human intuition.
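For illustration, the sketch below shows one way a model combining heuristic-based features with raw image pixels could be structured; the layer sizes, feature names, and fusion strategy are assumptions for this sketch only, not the architecture reported in the paper.

```python
# Illustrative sketch only: a two-stream network that fuses a small CNN over the
# raw webpage screenshot with an MLP over heuristic features (e.g., target size)
# to predict a single scannability score. All hyperparameters are assumptions.
import torch
import torch.nn as nn

class ScannabilitySketch(nn.Module):
    def __init__(self, num_heuristic_features: int = 4):
        super().__init__()
        # Encode unstructured features: raw pixels of the webpage screenshot.
        self.pixel_encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Encode heuristic-based features such as target size or position.
        self.heuristic_encoder = nn.Sequential(
            nn.Linear(num_heuristic_features, 16), nn.ReLU(),
        )
        # Fuse both streams and regress a scalar scannability score.
        self.head = nn.Sequential(
            nn.Linear(32 + 16, 32), nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, screenshot: torch.Tensor, heuristics: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.pixel_encoder(screenshot),
                           self.heuristic_encoder(heuristics)], dim=-1)
        return self.head(fused)

# Usage with dummy inputs: a batch of 2 screenshots and 4 heuristic features each.
model = ScannabilitySketch(num_heuristic_features=4)
score = model(torch.randn(2, 3, 256, 256), torch.randn(2, 4))
print(score.shape)  # torch.Size([2, 1])
```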