Improving Automated Long Answer Grading:
Accurately grading long-form student answers remains a challenging task in education. The RiceChem dataset, introduced in this study, is derived from real student responses to long-answer questions in a chemistry course, and its responses have significantly higher word counts than those in traditional automated short answer grading (ASAG) datasets. Grading each response against a rubric, with a model that uses MNLI (the Multi-Genre Natural Language Inference corpus) for transfer learning, yields more effective and nuanced assessments.
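To make the rubric-based idea concrete, the following is a minimal sketch of one way such grading could be structured: the student response is treated as a premise, each rubric item as a hypothesis, and an entailment scorer decides whether the item is satisfied. This is an illustration under stated assumptions, not the study's implementation. The names `grade_response`, `RubricResult`, and `toy_overlap_scorer` are hypothetical, the 0.5 threshold is arbitrary, the example answer and rubric items are made up rather than taken from RiceChem, and the word-overlap scorer is only a placeholder for a model trained on MNLI.

```python
from dataclasses import dataclass
from typing import Callable, List

# An entailment scorer maps (premise, hypothesis) to a score in [0, 1].
# In practice this would wrap an NLI model; here it is left pluggable.
EntailmentScorer = Callable[[str, str], float]


@dataclass
class RubricResult:
    rubric_item: str
    score: float
    satisfied: bool


def toy_overlap_scorer(premise: str, hypothesis: str) -> float:
    """Placeholder scorer (assumption): fraction of hypothesis words
    that also appear in the premise. This is NOT an NLI model; it only
    lets the sketch run without downloading anything."""
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().split()
    if not hypothesis_words:
        return 0.0
    hits = sum(1 for word in hypothesis_words if word in premise_words)
    return hits / len(hypothesis_words)


def grade_response(
    response: str,
    rubric: List[str],
    scorer: EntailmentScorer = toy_overlap_scorer,
    threshold: float = 0.5,  # arbitrary cut-off chosen for illustration
) -> List[RubricResult]:
    """Score the response against each rubric item independently."""
    results = []
    for item in rubric:
        score = scorer(response, item)
        results.append(RubricResult(item, score, score >= threshold))
    return results


def total_points(results: List[RubricResult]) -> int:
    """Count satisfied rubric items (one point each, by assumption)."""
    return sum(1 for result in results if result.satisfied)


# Made-up example, not drawn from RiceChem.
example_response = (
    "ionization energy increases across a period "
    "because nuclear charge increases"
)
example_rubric = [
    "nuclear charge increases",
    "electron shielding decreases sharply",
]
example_results = grade_response(example_response, example_rubric)
```

Scoring each rubric item separately gives per-criterion feedback rather than a single holistic grade. Replacing `toy_overlap_scorer` with a function that returns an NLI model's entailment probability would leave the rest of the sketch unchanged.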
Beyond advancing grading methodologies, this study opens new avenues for research in educational assessment, especially for long-answer formats that demand detailed analysis.