2025: Testing Feedback Quality of Athena for Learning Management Systems

Bachelor's theses

Student
Aleks Petrov

Supervisor(s)

Advisor(s)

Abstract

Athena is a modular system that supports (semi-)automated assessment of exercises in Artemis. It provides tutors with immediate feedback on student submissions for programming, modeling, and text exercises. The modules are powered by large language models (LLMs). However, there are currently no systematic tests that verify whether these modules work as expected or whether feedback quality remains consistent over time, especially when dependencies such as LLM versions change.

The primary objective of this thesis is to design and implement a comprehensive testing strategy that ensures the correctness and reliability of Athena’s feedback modules, with a focus on the LLM-based modules. This includes mock tests that verify basic system functionality without depending on live LLM output, and integration tests that assess generated feedback using semantic similarity metrics. The goal is to support long-term maintainability and confidence in feedback quality through systematic mock-based unit tests and regression monitoring.
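
The two kinds of tests described above could look roughly like the following minimal sketch. It is an illustration under stated assumptions, not Athena's actual code: `generate_feedback`, the `complete` method on the injected client, and the threshold value are hypothetical, and a bag-of-words cosine similarity stands in for the embedding-based semantic similarity metric a real test suite would more likely use.

```python
import math
import re
from collections import Counter
from unittest.mock import Mock

# Hypothetical threshold; a real suite would calibrate it on reference data.
SIMILARITY_THRESHOLD = 0.6


def generate_feedback(llm_client, submission: str) -> str:
    """Hypothetical feedback module that delegates to an injected LLM client."""
    prompt = f"Give feedback on this submission:\n{submission}"
    return llm_client.complete(prompt).strip()


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity, a cheap stand-in for an embedding metric."""
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(va[token] * vb[token] for token in va)
    norm_a = math.sqrt(sum(count * count for count in va.values()))
    norm_b = math.sqrt(sum(count * count for count in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def test_module_with_mocked_llm():
    # Mock test: no real LLM is called, so only the module's own logic is checked.
    llm = Mock()
    llm.complete.return_value = "  The loop never terminates.  "
    feedback = generate_feedback(llm, "while True: pass")
    assert feedback == "The loop never terminates."
    llm.complete.assert_called_once()
    # The submission must be embedded in the prompt sent to the client.
    assert "while True: pass" in llm.complete.call_args.args[0]


def test_feedback_regression():
    # Regression check: newly generated feedback must stay close to a stored reference.
    reference = "The loop never terminates because the condition is always true."
    candidate = "The loop never terminates since its condition is always true."
    assert cosine_similarity(reference, candidate) >= SIMILARITY_THRESHOLD
```

Both functions follow the `test_` naming convention, so a runner such as pytest would collect them. Injecting the client as a parameter is what makes the mock test possible, and keeping the threshold in one constant makes it easy to tighten or relax when an LLM version changes.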