Students using CodeGrade to receive instant automated feedback on R programming assignments at Wageningen University
May 6, 2025

Automatic Submission Testing for R Programming: A Project from Wageningen University

In 30 seconds...

What if grading R assignments could be faster, more consistent, and actually help students learn better? At Wageningen University, Dr. Dainius Masiliūnas put this to the test—literally—by building an automated submission system with CodeGrade.

If you’ve ever felt overwhelmed trying to give students meaningful feedback in a programming course, especially as class sizes grow, you’re not alone. Timely, consistent grading is tough to maintain when you’re juggling dozens or even hundreds of submissions. That’s exactly the challenge Dr. Dainius Masiliūnas from Wageningen University set out to address in a recent research project on automatic student submission testing using CodeGrade.

This wasn’t just an experiment in convenience. It was a deep dive into how automated testing could change the way we teach programming, from how fast students get feedback to how reliably we catch bugs, and even how we design courses for long-term sustainability. If you’ve been considering integrating more automation into your course, the findings here might give you a helpful roadmap.

Why the Research Happened

Programming education depends heavily on feedback loops. But as Dr. Masiliūnas saw in his own Geoscripting course, manually grading every submission simply doesn’t scale. There were long delays between student work and instructor feedback, inconsistent evaluations between graders, and major time sinks just to set up grading environments—particularly for assignments in R.

The goal of the research was to find out whether automatic testing, set up through CodeGrade, could solve these bottlenecks while still keeping students engaged and learning effectively.

Setting Up Smart Autotesting for R

The research team started by extending CodeGrade’s autotest features—which work well for Python—to support R. That turned out to be more complex than expected. A typical R environment setup involves installing about fifteen packages, which originally took around twenty minutes per virtual machine.

To speed things up, they made use of r2u, a package repository that hosts pre-built binaries of CRAN packages and their dependencies in one place. Once this was in place, environment setup dropped to less than thirty seconds. Every time a student pushed their code, a fresh VM loaded instantly, ran the test suite, and delivered feedback through CodeGrade, with no manual setup required.
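To give a sense of what this looks like in practice, here is a minimal sketch of an environment setup step, assuming the VM image already has the r2u apt repository and its bspm bridge enabled so that install.packages() resolves to pre-built binaries. The package list is illustrative, not the course's actual dependency set.

```r
# Illustrative setup script for a test VM. Assumes the image is configured
# with the r2u apt repository and the bspm bridge, so install.packages()
# is served by apt as binaries instead of compiling from source.

# Hypothetical package list for a geoscripting-style course
pkgs <- c("terra", "sf", "dplyr", "ggplot2", "testthat")

# With binary installs this completes in seconds rather than minutes.
install.packages(pkgs)

# Fail fast if anything is missing before the student's test suite runs.
missing <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing) > 0) {
  stop("Packages failed to install: ", paste(missing, collapse = ", "))
}
```

In a real CodeGrade configuration a step like this would run once as part of the assignment's setup phase, so each student submission only pays the cost of running the tests themselves.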


How Students Reacted

With automated feedback now arriving seconds after each submission, students started resubmitting more often—over 70% pushed code three times or more per assignment. That means students were using the feedback actively, not just submitting once and moving on. The autotests became part of the learning process, helping them catch mistakes early and experiment with improvements.

Importantly, these tests weren’t just about checking for correct output. They could also catch edge cases, like functions that relied on global variables or failed with unexpected inputs—things that human graders sometimes miss, especially at scale.
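As an illustration of how such checks can be expressed, here is a minimal sketch using the testthat package; the function name area_ratio and the specific expectations are hypothetical and not taken from the actual course tests.

```r
library(testthat)

# Stand-in for a correct student submission; in a real autotest the
# student's own file would be sourced instead of defining this here.
area_ratio <- function(a, b) {
  if (is.na(a) || is.na(b)) stop("inputs must not be NA")
  if (b == 0) stop("denominator must be non-zero")
  a / b
}

test_that("area_ratio returns the expected value", {
  expect_equal(area_ratio(10, 4), 2.5)
})

test_that("area_ratio does not rely on global variables", {
  # Re-bind the function's enclosing environment to one that only sees
  # base R, so any hidden reference to a global object (e.g. a data frame
  # loaded earlier in the student's session) triggers an error here.
  f <- area_ratio
  environment(f) <- new.env(parent = baseenv())
  expect_equal(f(10, 4), 2.5)
})

test_that("area_ratio rejects unexpected inputs", {
  # Division by zero or missing values should raise an error rather than
  # silently returning Inf or NA.
  expect_error(area_ratio(10, 0))
  expect_error(area_ratio(NA, 4))
})
```

Running the student's function in an environment stripped of globals is one way to surface hidden dependencies that a quick visual review of hundreds of submissions would likely miss.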

What It Means for You

For instructors, the biggest gains were consistency, speed, and long-term sustainability. Once the test environment was built, grading routine submissions became nearly automatic. That gave instructors more time to focus on edge cases, provide deeper feedback where it mattered, or even redesign assignments without worrying about rebuilding grading scripts from scratch.

The fact that students saw exactly what tests they passed or failed also added a level of transparency that improved the learning experience. Everyone was held to the same standards, and there were fewer surprises when grades came back.

For now, the ideal model seems to be a hybrid: use automatic submission testing for quick feedback and routine checks, and layer on human grading for more complex or creative elements.

What’s Next?

Dr. Masiliūnas is now looking to expand the system to support other languages like Python and JavaScript in addition to R. The team is also exploring ways to build in more flexible grading logic.

This research highlights not just a time-saving technique but a shift in how we think about teaching programming at scale. With the right setup, tools like CodeGrade can help you create a more interactive, consistent, and scalable learning environment, one where students don't have to wait days for the feedback they need to grow.

A huge thank you to Dr. Masiliūnas for sharing his research! You can read more about this project here.
