Title: Modelling Computer-Based Formative Assessment Data With Graph Neural
Networks

Authors: Benjamin Garzón Jimenez De Cisneros, Lisi Qarkaxhija, Vincenzo Perri,
Ingo Scholtes, Martin J. Tomasik

Abstract:
Computer-based formative assessment (CBFA) systems are software tools designed
to facilitate data collection and performance evaluation in the classroom,
with the aim of providing feedback and supporting instructional decisions.
CBFA systems make it possible to acquire large-scale datasets that can be
used to study academic abilities and their development in everyday settings,
as opposed to the less ecologically valid conditions found in standardised
assessments.

In the present work, we model a dataset obtained from the MINDSTEPS CBFA
system, which serves a population of tens of thousands of students in
Northwestern Switzerland. The system includes an item bank covering topics and
competences over several years of mandatory schooling, from grade 3 to grade
9. The items span four school subjects: mathematics, German, English and
French, with a further sub-categorisation in competence domains (e.g., German
grammar or German reading). The dataset analysed contains over 20 million
responses from approximately 89,000 students to approximately 18,000
different items, forming a large and highly sparse student-item response
matrix.
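
To make the sparsity concrete, the following Python sketch estimates the
fraction of student-item pairs with an observed response. It uses the rounded
figures quoted above as stand-ins for the exact counts, which are not given
here, and the variable names are purely illustrative.

```python
# Rounded figures from the text; the exact counts are not stated, so the
# result is only an order-of-magnitude estimate.
n_responses = 20_000_000  # "over 20 million" responses
n_students = 89_000       # approximate number of students
n_items = 18_000          # approximate number of items

# Density = observed cells / all possible student-item cells.
# Assumption: each response fills a distinct student-item cell.
density = n_responses / (n_students * n_items)
print(f"density ~ {density:.2%}")
```

Under these assumptions only about one student-item pair in a hundred is
observed, which is why predicting the remaining entries is the central task.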

A natural representation in this scenario is a graph in which nodes stand for
students or items and an edge between a student and an item represents a
particular response of the student to that particular item (edge label:
correct/incorrect). The task that concerns us is to learn to predict
unobserved edge labels from observed ones. For this purpose, we resort to
graph neural networks (GNNs), a class of machine learning methods recently
developed to model graph-structured data for node-level or edge-level
prediction tasks. GNNs constitute a more expressive alternative to other
techniques that are typically used for test scoring (e.g., item response
theory), which may not be flexible enough to capture the complexity of
large-scale datasets. The specific model we use consists of an encoder module
with two graph convolutional layers followed by a decoder module which outputs
the probability of a correct response. Nodes and edges are represented as
embeddings in a multidimensional space, and the model can also incorporate
features at the student (e.g., gender, mother tongue), item (e.g., competence
domain) and response (e.g., age when responding) levels. After fitting the
GNN, the learned item embeddings recover properties of the school curriculum:
item embeddings of competence domains that belong to the same subject tend to
cluster together, as do the embeddings of sub-competences within a competence
domain. The main source of variation in these item embeddings corresponds to
item difficulty. In addition, the dimension accounting for most of the variation
in edge embeddings shows a common age progression across subjects, revealing
the increase in ability over time. The model parameters, which capture both
the structure of the academic curriculum and the evolution of abilities, can
thus be used to inform curriculum development in a data-driven manner and
examine learning trajectories. We conclude by discussing advantages and
disadvantages of the proposed approach with respect to more established
alternatives.
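
As an illustration of the general idea, the following dependency-free Python
sketch builds a toy student-item graph and runs an untrained forward pass
through a two-layer encoder and a decoder. It is not the model described
above: the mean-aggregation convolution, the dot-product decoder, the
embedding size and all data values are assumptions chosen for brevity.
Training, node and edge features, and the use of edge labels during message
passing are omitted.

```python
import math
import random

# Toy responses: (student, item, correct). All values are made up.
responses = [
    ("s1", "i1", 1), ("s1", "i2", 0),
    ("s2", "i1", 1), ("s2", "i3", 1),
    ("s3", "i2", 0), ("s3", "i3", 1),
]

# Bipartite graph as an adjacency list over student and item nodes.
neighbours = {}
for student, item, _ in responses:
    neighbours.setdefault(student, []).append(item)
    neighbours.setdefault(item, []).append(student)

DIM = 4  # embedding size (arbitrary choice for this sketch)
rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.gauss(0, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def conv_layer(h, w_self, w_neigh, activate):
    """One mean-aggregation graph convolution, optionally followed by ReLU."""
    out = {}
    for node, nbrs in neighbours.items():
        mean = [sum(h[n][k] for n in nbrs) / len(nbrs) for k in range(DIM)]
        z = [a + b for a, b in zip(matvec(w_self, h[node]),
                                   matvec(w_neigh, mean))]
        out[node] = [max(0.0, x) for x in z] if activate else z
    return out

# Randomly initialised (untrained) input embeddings and layer weights.
h0 = {node: [rng.gauss(0, 1) for _ in range(DIM)] for node in neighbours}
layers = [(rand_matrix(DIM, DIM), rand_matrix(DIM, DIM)) for _ in range(2)]

def encode(h):
    """Encoder: two convolutions, with ReLU after the first one only."""
    for depth, (w_self, w_neigh) in enumerate(layers):
        h = conv_layer(h, w_self, w_neigh, activate=depth < len(layers) - 1)
    return h

def decode(h, student, item):
    """Decoder: dot product of the two embeddings mapped to a probability."""
    score = sum(a * b for a, b in zip(h[student], h[item]))
    return 1.0 / (1.0 + math.exp(-score))

h = encode(h0)
p = decode(h, "s1", "i3")  # a student-item pair with no observed response
print(f"P(correct) = {p:.3f}")
```

In practice the parameters would be fitted by minimising a classification
loss on the observed edge labels, most likely with a dedicated GNN library
rather than plain Python.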