Research

Evidence behind the platform

Everything we build is grounded in research on how oral assessment reveals understanding — and how it doesn't. We publish our findings openly and hold ourselves accountable to the evidence.

What we ask

Does oral assessment reveal understanding that an essay alone hides?

How we test

Do AI scores fall within the range of scores given by a panel of experienced teachers?

What we watch

Do conversation scores differ across language background, disability, or socioeconomic indicators?

What we share

Every rubric, every prompt, every model decision — open to partner schools.

Working Papers

What we are studying

Oral Assessment as an Integrity Signal: Evidence from AI-Mediated Viva Sessions

Viva Research · Internal working paper

We examine how conversation response patterns differ between students who submitted AI-generated work and students who wrote their own, across secondary and tertiary institutions. We document the signals our scoring picks up on, the rates we observe, and where the method breaks down. The paper has been submitted to partner schools for review.

Authenticity Scoring: Designing AI Metrics That Teachers Trust

Viva Research · Internal working paper

A grounded study of how teachers evaluate authenticity in student speech. We map the behavioural signals teachers use — vocabulary register shifts, hedging, over-specificity — into a trainable rubric validated against expert human raters.

Does the Conversation Change the Student? Longitudinal Effects of Oral Assessment

Viva Research · Pilot data

A follow-up on students who completed Viva conversations across multiple assignment cycles. Early evidence suggests that knowing an oral conversation is coming changes how students engage with source material before submission.

How We Work

Our research principles

Transparent methods

We document every scoring rubric, every prompt, every model decision in our methodology appendix — available to partner schools on request.

Teacher calibration

AI scores are calibrated against a panel of experienced teachers, and the calibration is updated when teacher-AI disagreement exceeds defined thresholds.
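To make the idea of a disagreement threshold concrete, here is a minimal sketch of one way such a check could look. It is illustrative only, not a description of Viva's actual pipeline: the function names, the choice of mean absolute gap as the disagreement measure, and the default threshold of 0.5 rubric points are all assumptions.

```python
from statistics import mean


def panel_disagreement(ai_scores, teacher_panel_scores):
    """Mean absolute gap between each AI score and the panel's mean score.

    ai_scores: one AI score per student response.
    teacher_panel_scores: one list of teacher scores per student response.
    Both names and the disagreement measure are hypothetical.
    """
    if len(ai_scores) != len(teacher_panel_scores):
        raise ValueError("each AI score needs a matching list of teacher scores")
    if not ai_scores:
        raise ValueError("at least one scored response is required")
    gaps = [
        abs(ai - mean(panel))
        for ai, panel in zip(ai_scores, teacher_panel_scores)
    ]
    return mean(gaps)


def needs_recalibration(ai_scores, teacher_panel_scores, threshold=0.5):
    """True when average teacher-AI disagreement exceeds the threshold.

    The default threshold is an assumed placeholder, not a published value.
    """
    return panel_disagreement(ai_scores, teacher_panel_scores) > threshold
```

A single average can hide disagreement that is concentrated in a few responses, so a fuller check would likely also look at the distribution of gaps and at agreement among the teachers themselves.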

Adverse impact monitoring

We actively test for score disparities across language background, disability status, and socioeconomic indicators. If we find them, we investigate them.
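As a rough illustration of what a disparity check might involve, the sketch below compares mean scores across the groups of one attribute and flags the attribute when the largest gap exceeds a tolerance. Everything here is an assumption for illustration: the record layout, the field names, the use of a raw mean gap, and the tolerance of 0.3. A real analysis would also need significance testing and controls for confounding factors.

```python
from collections import defaultdict
from statistics import mean


def group_means(records, attribute):
    """Mean score per group for one attribute (e.g. a language-background field).

    records: list of dicts with a "score" key plus the attribute key.
    The record layout is hypothetical.
    """
    by_group = defaultdict(list)
    for record in records:
        by_group[record[attribute]].append(record["score"])
    return {group: mean(scores) for group, scores in by_group.items()}


def largest_gap(records, attribute):
    """Difference between the highest and lowest group mean (0.0 if no data)."""
    means = group_means(records, attribute)
    if not means:
        return 0.0
    return max(means.values()) - min(means.values())


def flag_for_investigation(records, attribute, tolerance=0.3):
    """True when the largest group gap exceeds the assumed tolerance."""
    return largest_gap(records, attribute) > tolerance
```

A flag from a check like this is a prompt to investigate, not a finding in itself: small groups can produce large gaps by chance.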

No closed-loop grading

Viva scores are advisory signals. Final grade decisions remain with the teacher. We are explicit about this in every interface.

Partner with our research team

We collaborate with universities, school districts, and independent researchers working on assessment, integrity, and AI in education.

Reach out →