Abstract

Automatic question generation has emerged as an effective and efficient way to incorporate formative practice into electronic textbooks at scale. This advance, however, introduces new challenges in ensuring the quality of the generated questions. Traditionally, low-quality questions have been identified by analyzing student responses, but it would be preferable to filter out substandard questions before they ever reach students. In this study, we present preliminary findings on a promising technique that leverages a large language model (LLM) to identify potentially low-quality questions. Our hypothesis is that questions an LLM fails to answer correctly may contain quality issues, particularly since LLMs generally outperform students on automatically generated questions. On a dataset of questions from an open-source textbook, our method identified nearly 30% of the questions that had been rejected through analysis of student answer data. These results suggest that LLMs can be a valuable tool for improving the quality control process for automatically generated questions.
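
As an illustration of the flagging idea described above, the following minimal sketch shows one way such a pre-delivery filter might look. This is not the study's implementation: the ask_llm callable, the GeneratedQuestion fields, and the exact-match grading rule are hypothetical placeholders standing in for whatever LLM interface and question format a given textbook pipeline provides.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GeneratedQuestion:
    """Hypothetical representation of an automatically generated question."""
    stem: str           # question text shown to students
    options: List[str]  # answer choices, e.g. ["A. ...", "B. ...", ...]
    answer_key: str     # option letter the generator marked as correct


def flag_low_quality(
    questions: List[GeneratedQuestion],
    ask_llm: Callable[[str], str],
) -> List[GeneratedQuestion]:
    """Flag questions the LLM answers incorrectly as potentially low quality.

    ask_llm is assumed to take a prompt string and return the model's chosen
    option letter (e.g. "B"); wiring it to an actual LLM API is left to the
    caller.
    """
    flagged = []
    for q in questions:
        prompt = (
            "Answer the following multiple-choice question with a single "
            "option letter.\n\n"
            f"{q.stem}\n" + "\n".join(q.options)
        )
        model_answer = ask_llm(prompt).strip().upper()[:1]
        # If the LLM disagrees with the answer key, the question (or its key)
        # may have a quality problem worth human review before release.
        if model_answer != q.answer_key.strip().upper()[:1]:
            flagged.append(q)
    return flagged
```

In this sketch, a question flagged by the filter would be routed for human review rather than discarded outright, mirroring the paper's framing of the LLM as a quality-control aid rather than a final arbiter.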