Grading JUnit 5 Java assignments using JaCoCo for computer science education

April 1, 2021

Testing the Tests: Autograding student unit tests in Java assignments (using JUnit code coverage)

In 30 seconds...

Writing effective unit tests is a skill all computer scientists will have to learn;
But how do you grade JUnit unit tests? This blog shows the best way to autograde student JUnit unit tests for Java assignments and gives a step-by-step guide to automatically assessing JUnit tests in CodeGrade, using code coverage;
What is code coverage? It measures the percentage of source code covered by the written unit tests.

All computer scientists will have to learn how to write effective unit tests at some point in their (academic) career. Almost all computer science degrees that we have come across so far teach their students how to do this. Sometimes this is already done in an introduction to programming in Java course, to lay a good foundation, but often we see this taking place in more advanced Java Software Development courses.

For this guide, I have researched the best way to autograde student JUnit unit tests for Java assignments and will explain step by step how you can use code coverage to automatically assess JUnit tests in CodeGrade. This guide focuses on testing JUnit tests, but the theory and principles are also useful for other programming languages and unit testing frameworks. We have covered how to do this for Python in another guide.

Unit Test assessment metrics

There are multiple ways to assess unit tests effectively, of which code coverage and mutation testing are the two most common. They serve different purposes and differ in setup complexity. For this guide, I will stick to code coverage, which should be sufficient for most educational purposes, but I do want to briefly mention how effective (albeit harder to set up) mutation testing can be.

Code Coverage is the most common metric to assess unit test quality. It very simply measures what percentage of source code is covered by the unit tests that you have written. Different metrics can be used to calculate this, for instance the percentage of subroutines or the percentage of instructions of the code that the tests cover.
Mutation Testing is a more advanced metric to assess unit test quality that goes beyond simple code coverage. During mutation testing, a tool makes minor changes to your code (known as mutations) that should break it. It then checks whether each mutation makes your unit tests fail (this is what we want if our unit tests are correct and complete) or not (meaning that our unit tests are incorrect or incomplete).
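To make the difference between the two metrics concrete, here is a minimal Python sketch (the same idea carries over to Java). The function `is_adult`, its hand-written mutant and both tests are hypothetical examples, not output from a real mutation testing tool. Both tests execute every line of the function, so they score the same on code coverage, but only the test that checks the boundary value notices the mutation.

```python
def is_adult(age):
    """Original implementation under test (hypothetical example)."""
    return age >= 18


def is_adult_mutant(age):
    """Hand-written mutant: '>=' has been changed to '>'."""
    return age > 18


def weak_test(impl):
    """Runs every line of impl, but never checks the boundary value."""
    return impl(30) is True and impl(5) is False


def strong_test(impl):
    """Same checks as weak_test, plus the boundary value 18."""
    return weak_test(impl) and impl(18) is True


def mutant_killed(test):
    """A mutant is 'killed' when the test passes on the original
    implementation but fails on the mutant."""
    return test(is_adult) and not test(is_adult_mutant)


print("weak test kills mutant:", mutant_killed(weak_test))
print("strong test kills mutant:", mutant_killed(strong_test))
```

A real mutation testing tool generates many such mutants automatically and reports the fraction that the test suite kills.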

For most educational purposes and introductory Java assignments, code coverage is sufficient for testing the students’ JUnit tests, as the main learning goal is to teach students the good practice of writing unit tests that cover each line or function of their code. Using CodeGrade’s continuous feedback, we can very effectively motivate our students with instant feedback to go for a 100% code coverage score. However, for more advanced software engineering courses, you may want to consider mutation testing, which measures not only how many lines we cover, but also how well these lines are actually checked by our tests. This metric can be somewhat off-putting for beginning students, but is very useful in more advanced courses.

Code Coverage Assessment for Java using JaCoCo

Setting up Code Coverage in CodeGrade’s autograder is very straightforward. The resulting percentage can be simply converted to the points we give our students.
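As a rough illustration of that conversion, the sketch below turns covered and missed counts into a score between 0 and 1. The function name `coverage_score` and the `target` parameter are assumptions made for this example and are not part of CodeGrade or JaCoCo. With `target` below 1.0, students already earn the full score once they reach that coverage level.

```python
def coverage_score(covered, missed, target=1.0):
    """Return a score between 0 and 1 for the given coverage counts.

    `target` is a hypothetical threshold: reaching it earns the full
    score, and anything below it scales linearly.
    """
    total = covered + missed
    if total == 0:
        return 0.0  # nothing was measured, so award no points
    ratio = covered / total
    return min(ratio / target, 1.0)


# 80 of 100 instructions covered, full score only at 100% coverage.
print(coverage_score(80, 20))
# Same coverage, but the full score is already awarded at 80%.
print(coverage_score(80, 20, target=0.8))
```

Keeping the target at 100% is the simplest choice and matches the idea of motivating students to cover all of their code.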

JUnit, one of the most common unit testing frameworks for Java and supported out of the box by CodeGrade, works together very well with JaCoCo (Java Code Coverage). JaCoCo is a lightweight and flexible code coverage library that works well with tools like Ant and Maven, but also simply from the command line in CodeGrade AutoTest. Ant and Maven are supported in CodeGrade, but for this example we will stick to the CLI.

We can download the required JaCoCo .jar files (jacococli.jar and jacocoagent.jar) from https://www.eclemma.org/jacoco/ and upload them as fixtures to CodeGrade’s autograder.

As we run JaCoCo on top of JUnit 5 tests, we first need to install cg-junit5 in the Global Setup script using `cg-junit5 install`. In the per-student setup script, we can then compile the student's Java files and tests using `cg-junit5 compile *.java`.

Next up, we need to run our unit tests with JaCoCo injected. This is done using the Unit Test step and the command:

-!- CODE language-console -!-cg-junit5 run --java-args="-javaagent:$FIXTURES/jacocoagent.jar" -- -c TestReadSudoku

This runs the JUnit 5 tests with the JaCoCo agent attached. The test classes are selected using the `-c` flag; in this case, that is only `TestReadSudoku`.

Now, all that is left to do is use the JaCoCo command line interface to generate our report. The easiest way to do this is to set up a folder structure that is straightforward for JaCoCo to read. In our example, we created three folders, `src`, `tests` and `compiled`, and moved all the student’s files to the correct folders (i.e. the `.java` source files to `src`, all regular `.class` files to `compiled` and the compiled tests to `tests`). I created a small bash script which I run in AutoTest to do this:

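-!- CODE language-bash -!-#!/bin/bash
# Create one folder per file type so JaCoCo can find everything.
mkdir src tests compiled
# Student source files.
mv Board.java Cell.java SudokuSolver.java src
# Compiled classes whose coverage we measure.
mv Board.class Cell.class SudokuSolver.class compiled
# Compiled tests are kept apart so they do not count towards coverage.
mv TestReadSudoku.class tests
# The test source is not needed for the report.
rm TestReadSudoku.java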

We can now run JaCoCo to generate our report in a Run Program step using:

-!- CODE language-console -!-java -jar $FIXTURES/jacococli.jar report jacoco.exec --classfiles compiled --sourcefiles src --xml cov.xml

This runs the JaCoCo CLI, calls the report function and uses the `jacoco.exec` file (generated in the Unit Test step) to generate this report. It then finds all files it needs in the folders we created so neatly and finally generates the output XML in `cov.xml`.

In a subsequent Capture Points test, we can now capture the code coverage using this simple Python script, which we uploaded as a fixture:

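-!- CODE language-python -!-import xml.etree.ElementTree as ET

# Parse the XML report written by the JaCoCo CLI (`--xml cov.xml`).
tree = ET.parse('cov.xml')
root = tree.getroot()
# Select the report-level instruction counter by type rather than by position.
instructions = root.find("counter[@type='INSTRUCTION']").attrib
covered = int(instructions.get('covered'))
total = covered + int(instructions.get('missed'))
# Print the covered fraction (between 0 and 1) for the Capture Points step.
print(covered / total)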
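JaCoCo's XML report contains one report-level `counter` element per metric (instructions, branches, lines, methods and so on), so the same approach can be used to grade on a different metric. The sketch below reads all of them into a dictionary. The embedded `SAMPLE_REPORT` is a made-up, heavily shortened stand-in for a real report, and the helper name `coverage_by_type` is an assumption for this example.

```python
import xml.etree.ElementTree as ET

# Made-up, shortened stand-in for a JaCoCo XML report (hypothetical numbers).
SAMPLE_REPORT = """<report name="example">
  <sessioninfo id="session-1" start="0" dump="1"/>
  <package name="sudoku">
    <counter type="INSTRUCTION" missed="20" covered="80"/>
  </package>
  <counter type="INSTRUCTION" missed="20" covered="80"/>
  <counter type="BRANCH" missed="3" covered="9"/>
  <counter type="LINE" missed="5" covered="15"/>
  <counter type="METHOD" missed="1" covered="4"/>
</report>"""


def coverage_by_type(root):
    """Map each report-level counter type to its covered fraction."""
    result = {}
    # "counter" matches direct children of the root only, so the
    # per-package counters nested deeper in the tree are skipped.
    for counter in root.findall("counter"):
        covered = int(counter.get("covered"))
        missed = int(counter.get("missed"))
        total = covered + missed
        result[counter.get("type")] = covered / total if total else 0.0
    return result


ratios = coverage_by_type(ET.fromstring(SAMPLE_REPORT))
for counter_type, ratio in ratios.items():
    print(f"{counter_type}: {ratio:.0%}")
```

To grade on branch coverage instead of instruction coverage, for example, the script could print `ratios["BRANCH"]`.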

Full JaCoCo AutoTest category (including 2 optional steps)

Manually grading the code coverage report

Tools like JaCoCo often provide the option to generate HTML code coverage reports. These reports are not only very useful for manual grading by a teacher, but also interesting for students to open and investigate. In addition to automatically parsing the code coverage percentages, we can save this code coverage report to the `$AT_OUTPUT` directory so that it is visible in the CodeGrade Code Viewer for grading.

In JaCoCo, we can use the following command (very similar to the one discussed above) to generate an HTML report and save it to the `$AT_OUTPUT` directory in CodeGrade:

-!- CODE language-console -!-java -jar $FIXTURES/jacococli.jar report jacoco.exec --classfiles compiled --sourcefiles src --html $AT_OUTPUT

JaCoCo HTML code coverage report rendered in the CodeGrade interface

Teaching unit testing for Java effectively

The tools explained in this guide will help you set up effective automatic grading for your Java assignments in which students have to create unit tests themselves. With CodeGrade’s continuous feedback, students can submit their code and unit tests many times and improve them after getting their instant feedback.

You can make this process even more effective by adding custom descriptions to your tests. Many of the steps we covered use relatively long and hard-to-understand commands, which can be rather confusing for our students. Recently, we have added the possibility to customize test descriptions in CodeGrade, which allows you to rename your tests to something that will help your students understand them better.

I hope this guide has helped you set up assignments in which you grade unit tests written by students. As I mentioned, these metrics are applicable to many other programming languages, not just Java. Would you like to learn more about code coverage or mutation testing in CodeGrade? Or would you like to apply this to another language? Feel free to email me at support@codegrade.com and I’d be more than happy to help you out!