Adaptive instruction for online education can increase learning gains and decrease the work required of learners, instructors, and course designers. Reinforcement Learning (RL) is a promising tool for developing instructional policies, as RL models can learn complex relationships between course activities, learner actions, and educational outcomes. This paper demonstrates the first RL model to schedule educational activities in real time for a large online course through active learning. Our model learns to assign a sequence of course activities while maximizing learning gains and minimizing the number of items assigned. Using a controlled experiment with over 1,000 learners, we investigate how this scheduling policy affects learning gains, dropout rates, and qualitative learner feedback. Compared with a linear assignment condition, our model produces better learning gains using fewer educational activities. Compared with a self-directed condition, it produces similar learning gains using fewer educational activities and with lower dropout rates.
https://doi.org/10.1145/3313831.3376518
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2020.acm.org/)