When modeling passive data to infer individual mental wellbeing, a common source of ground truth is self-reports. But these tend to represent the psychological facet of mental states, which might not align with the physiological facet of that state. Our paper demonstrates that when what people ``feel'' differs from what people ``say they feel'', we witness a semantic gap that limits predictions. We show that predicting mental wellbeing with passive data (offline sensors or online social media) is related to how the ground-truth is measured (objective arousal or self-report). Features with psycho-social signals (e.g., language) were better at predicting self-reported anxiety and stress. Conversely, features with behavioral signals (e.g., sleep), were better at predicting stressful arousal. Regardless of the source of ground truth, integrating both signals boosted prediction. To reduce the semantic gap, we provide recommendations to evaluate ground truth measures and adopt parsimonious sensing.
https://dl.acm.org/doi/abs/10.1145/3491102.3502037
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)