Advances in artificial intelligence (AI) have made it possible to generate courseware and formative practice questions from textbooks. Courseware applies a learn-by-doing approach by integrating formative practice with text, a method shown to increase learning gains for students. By using AI for automatic question generation, the learn-by-doing method of courseware can be made available for nearly any textbook subject. Because the generated questions are a primary learning feature in this environment, it is necessary to ensure that they function as well for students as questions written by humans. In this paper, we use student data from AI-generated psychology courseware used in an online course at the University of Central Florida. The courseware contains both generated and human-authored questions, allowing for a unique comparison of question engagement, difficulty, and persistence using student data from a natural learning context. The evaluation of quality metrics is critical in automatic question generation research, yet on its own it does not fully capture students’ experience. Student perception is a meaningful qualitative metric, as perceptions can inform behavior and decisions. Therefore, student perceptions of the courseware and questions were also solicited via survey. Combining question data analysis with student perception feedback gives a more comprehensive evaluation of the quality of AI-generated questions used in a natural learning context.