| Title: |
Enhancing LLM-Based Short Answer Grading with Retrieval-Augmented Generation |
| Language: |
English |
| Authors: |
Yucheng Chu; Peng He; Hang Li; Haoyu Han; Kaiqi Yang; Yu Xue; Tingting Li; Yasemin Copur-Gencturk; Joseph Krajcik; Jiliang Tang |
| Source: |
International Educational Data Mining Society. 2025. |
| Availability: |
International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: https://educationaldatamining.org/conferences/ |
| Peer Reviewed: |
Y |
| Page Count: |
7 |
| Publication Date: |
2025 |
| Sponsoring Agency: |
National Science Foundation (NSF), Division of Research on Learning in Formal and Informal Settings (DRL) |
| Contract Number: |
2446701; 1813760 |
| Document Type: |
Speeches/Meeting Papers; Reports - Research |
| Education Level: |
Junior High Schools; Middle Schools; Secondary Education |
| Descriptors: |
Artificial Intelligence; Science Education; Technology Uses in Education; Natural Language Processing; Grading; Evaluation Methods; Automation; Accuracy; Knowledge Level; Middle School Students |
| Abstract: |
Short answer assessment is a vital component of science education, allowing evaluation of students' complex three-dimensional understanding. Large language models (LLMs) that possess human-like ability in linguistic tasks are increasingly popular in assisting human graders to reduce their workload. However, LLMs' limitations in domain knowledge restrict their understanding of task-specific requirements and hinder their ability to achieve satisfactory performance. Retrieval-augmented generation (RAG) emerges as a promising solution by enabling LLMs to access relevant domain-specific knowledge during assessment. In this work, we propose an adaptive RAG framework for automated grading that dynamically retrieves and incorporates domain-specific knowledge based on the question and student answer context. Our approach combines semantic search and curated educational sources to retrieve valuable reference materials. Experimental results on a science education dataset demonstrate that our system achieves an improvement in grading accuracy compared to baseline LLM approaches. The findings suggest that RAG-enhanced grading systems can serve as reliable support with efficient performance gains. [For the complete proceedings, see ED675583.] |
| Abstractor: |
As Provided |
| Entry Date: |
2025 |
| Accession Number: |
ED675678 |
| Database: |
ERIC |