Mental workload critically affects well-being and performance in safety-critical systems. While machine learning models for mental workload prediction often leverage physiological indicators, interpretability and error analysis are frequently overlooked. This study develops robust models for workload prediction that emphasize interpretability and analyzes common misclassifications to elucidate key mechanisms. Respiratory and cardiac signals from 30 participants, as well as oculomotor signals from 17 participants, captured under varying task demands were used. Five models of varying interpretability were validated with optimized hyperparameters and preprocessing. A logistic regression and a decision tree were selected to distinguish between two and three workload levels, respectively. On unseen test data, they achieved F1-scores of 90.5% (accuracy: 92.2%) and 72.0% (accuracy: 72.3%). Performance varied across scenarios and individuals. Findings show that transparent, efficient models combined with appropriate preprocessing can compete with black-box approaches, with implications for safety-critical applications where interpretability, trust, and computational efficiency are essential.
ACM CHI Conference on Human Factors in Computing Systems
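The transparent modeling approach described above can be sketched as a minimal scikit-learn pipeline: a scaler plus logistic regression evaluated with F1-score and accuracy on held-out data. All feature names and data below are synthetic placeholders (the study's physiological signals are not reproduced here), so this is an illustrative assumption-laden sketch, not the authors' implementation.

```python
# Hypothetical sketch of an interpretable workload classifier:
# preprocessing + logistic regression, evaluated with F1 and accuracy.
# Data is synthetic; real inputs would be respiratory/cardiac features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300
# Four synthetic stand-ins for physiological features (e.g. breathing rate, HRV)
X = rng.normal(size=(n, 4))
# Binary "low vs. high workload" label with some noise, so there is signal to learn
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Scaling keeps the logistic-regression coefficients comparable across
# features, which is what makes the model readable per physiological input.
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)

print(f"F1: {f1_score(y_te, pred):.3f}, accuracy: {accuracy_score(y_te, pred):.3f}")
# Per-feature coefficients are directly inspectable, unlike a black-box model
print(clf.named_steps["logisticregression"].coef_)
```

Because the coefficients are exposed directly, misclassified cases can be traced back to individual feature contributions, which is the kind of error analysis the abstract emphasizes.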