Games have become a popular way of collecting human subject data, based on the premise that they are more engaging than surveys or experiments but generate equally valid data. However, this premise has not been empirically tested. In response, we designed a game for eliciting linguistic data following Intrinsic Elicitation – a design approach aiming to minimise validity threats in data collection games – and compared it to an equivalent linguistics experiment as a control. In a preregistered study and replication (n=96 and n=136), using two different ways of operationalising accuracy, the game generated substantially more enjoyment (d=.70, .73) and substantially less accurate data (d=-.68, -.40) – though still more accurate than random responding. We conclude that, for certain data types, data collection games may present a serious trade-off between participant enjoyment and data quality. We also identify possible causes of lower data quality for future research, reflect on our design approach, and urge games HCI researchers to use careful controls where appropriate.
https://dl.acm.org/doi/abs/10.1145/3491102.3502025
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)