Scott SI Digest
Large Language Models
Educational Tools
Visual Analytics

Interpreting LLMs in Adaptive Educational Tools

iScore, an interactive visual analytics tool, marks a step forward in applying Large Language Models (LLMs) to education, particularly the automatic scoring of summary writing. The tool lets learning engineers upload, score, and compare multiple summaries, revise them iteratively while tracking how the LLM's scores change, and visualize model weights at different levels of abstraction.

  • iScore emerged from a user-centered design process with learning engineers who deploy summary-scoring LLMs.
  • The tool addresses LLM interpretability challenges such as aggregating large text inputs and tracking score provenance.
  • A month-long deployment with learning engineers showed a 3% improvement in scoring accuracy.
  • It builds trust in LLMs by allowing engineers to understand and evaluate their models during deployment.
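To make the score-and-compare workflow concrete, here is a minimal sketch of how one might score two revisions of a summary with an LLM-based scorer and inspect attention weights for visualization. This is an illustrative assumption, not iScore's actual implementation: the model path is a placeholder, and a single-output regression head is assumed for the quality score.

```python
"""Hedged sketch: score two summary revisions with a fine-tuned LLM scorer
and compare the results. Model name, scoring head, and attention inspection
are assumptions for illustration, not iScore's documented internals."""

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical fine-tuned summary-scoring model (placeholder path).
MODEL_NAME = "path/to/summary-scoring-model"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, output_attentions=True
)
model.eval()


def score_summary(source_text: str, summary: str) -> tuple[float, torch.Tensor]:
    """Return a scalar quality score and final-layer attention weights."""
    inputs = tokenizer(source_text, summary, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Assumes a single regression head producing one quality score.
    score = outputs.logits.squeeze().item()
    # Final-layer attention, averaged over heads, as raw material for
    # weight visualizations at different levels of abstraction.
    attention = outputs.attentions[-1].mean(dim=1).squeeze(0)
    return score, attention


source = "..."    # text being summarized
draft_v1 = "..."  # learner's first summary draft
draft_v2 = "..."  # revised summary

score_v1, _ = score_summary(source, draft_v1)
score_v2, attn = score_summary(source, draft_v2)
print(f"score change after revision: {score_v2 - score_v1:+.3f}")
```

In this sketch, tracking score provenance across revisions reduces to comparing scores before and after an edit, while the attention tensor stands in for the kind of model-weight view a tool like iScore can surface.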

In my view, iScore is significant because it directly tackles the transparency and trust issues surrounding LLMs in educational settings. Its focus on LLM interpretability could pave the way for other specialized tools for assessing and refining AI applications in education and beyond. Check out more about this work at arXiv:2403.04760v1.

Personalized AI news from scientific papers.