Autograder for R scripts and data science assignments
February 25, 2021

Autograding R data science assignments in CodeGrade

Besides regular Computer Science courses, CodeGrade is also well suited to, and often used in, Data Science degrees. One of the most common technologies for data analysis is the programming language R, which is used in many courses taught with CodeGrade, for instance the advanced Geo-information Sciences course at Wageningen University.

All requirements to get R to run in CodeGrade’s AutoTest are therefore automatically pre-installed, making it easy to set up autograding for your R assignments in CodeGrade. In this guide, I will go over setting this up step by step and explain how you can further customize your R automatic tests.

Setting up a basic R assignment

Setting up autograding for any R programming assignment is easy in CodeGrade. As mentioned before, R is one of the programming languages installed on CodeGrade by default, which means that you do not need to do any further configuration!

We simply use Rscript to run student R scripts and then use CodeGrade’s Input / Output tests to check whether the student's code produced the correct output. I will go over calling and grading specific functions in R, and grading entire R scripts, in CodeGrade. For both examples we will create automatic tests for an assignment in which a student has to print the first x numbers of the Fibonacci sequence.

Autograding specific functions in R

One of the easiest ways to check R functions is using the I/O Step in CodeGrade. Using Rscript, we can call a specific function with different input arguments and check if the output matches with our expected output. If it does, we award the student with a point.

For our Fibonacci assignment, our students are tasked with creating a function called fibonacci that takes the length of the sequence it has to print as its only argument. In our I/O test, we can now run Rscript with some arguments:

Rscript -e "source('fibonacci.R')" -e

With the -e flag we can pass expressions for Rscript to execute. The first one sources the fibonacci.R script the student has handed in. The trailing -e is not a typo: the input argument of each input and output pair is appended to it, which in this case is "fibonacci(5)". This call prints the output, which we then compare with our expected output, set to [1] 1 1 2 3 5.
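For reference, a submission that would pass this test could look like the sketch below. It is only an illustration, not a model solution: the function name fibonacci and its single argument come from the assignment, while the loop-based implementation and the choice to start the sequence at 1, 1 are assumptions made to match the expected output above.

```r
# fibonacci.R: hypothetical student submission (illustrative only)
fibonacci <- function(n) {
  fib <- numeric(n)
  for (i in seq_len(n)) {
    # The first two numbers are 1; every later one is the sum of the previous two.
    fib[i] <- if (i <= 2) 1 else fib[i - 1] + fib[i - 2]
  }
  # print() writes the vector in R's default format, e.g. "[1] 1 1 2 3 5" for n = 5.
  print(fib)
}
```

Because this file only defines the function, sourcing it prints nothing. All output comes from the call that is appended after the final -e.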

The above method is the easiest to set up, as it does not require creating a grading script and uploading that as a fixture. If you wish, you can of course write these two expressions in a simple R script, upload that as a fixture, and run it using Rscript testFibonacci.R. This makes it easier to add more tests or update existing ones. In both cases, it is recommended to add a custom description to your test to explain to your students exactly what you are grading.
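A grading script along these lines might look as follows. In its simplest form it would contain just the two expressions from the command above. The file-existence check, the separate environment and the helper name run_fibonacci_tests are assumptions added for illustration, as is the idea that fibonacci.R sits in the directory the script is run from.

```r
# testFibonacci.R: hypothetical grading script, uploaded as a fixture.
run_fibonacci_tests <- function(submission = "fibonacci.R") {
  if (!file.exists(submission)) {
    cat("Could not find ", submission, "\n", sep = "")
    return(invisible(FALSE))
  }
  # Load the student's definitions into their own environment so they
  # cannot overwrite anything defined in this grading script.
  env <- new.env()
  source(submission, local = env)
  # Each call prints its result; the I/O test compares the printed lines.
  env$fibonacci(5)
  env$fibonacci(10)
  invisible(TRUE)
}

run_fibonacci_tests()
```

Adding another test case is then a matter of adding one more call inside the function.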

CodeGrade I/O automatic R script function test


Autograding R scripts

It is also possible to autograde entire R scripts using CodeGrade, either using standard input or by specifying exactly what the scripts should produce (without input). In our example, we tell the students to create a script fibonacci.R that prints the first 10 numbers of the Fibonacci sequence.

Grading the correctness of this script can be done with a very simple I/O Test in CodeGrade again. This time, we just have to run Rscript fibonacci.R. This runs the R script the student has uploaded. Since we do not need to give any input, we can leave the input arguments and input fields empty and just define an expected output, which is [1] 1 1 2 3 5 8 13 21 34 55 in our case.
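As an illustration, a script submission for this variant could look like the sketch below. The implementation is an assumption; only the file name and the required output come from the assignment. One detail to keep in mind: R's print() pads every element to the width of the widest one, so a ten-element vector is printed with extra spaces before the single-digit numbers. Depending on how your I/O test is configured, you may need to enter the expected output with that spacing or use a comparison that ignores whitespace differences.

```r
# fibonacci.R: hypothetical script submission that takes no input
n <- 10
fib <- numeric(n)
fib[1:2] <- 1
for (i in 3:n) {
  # Each number is the sum of the two before it.
  fib[i] <- fib[i - 1] + fib[i - 2]
}
# Prints the vector on one line, right-aligned to a common width:
# [1]  1  1  2  3  5  8 13 21 34 55
print(fib)
```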

Installing R packages in CodeGrade

In the examples above we discussed a very basic R assignment, but most advanced data analysis assignments will require students to make use of the various useful R libraries available. It is possible to install these packages in CodeGrade very easily. Packages can be installed directly by the student in their uploaded R script or beforehand by you in the configuration of AutoTest.

For both of these methods, you first need to set up a directory for the user library (in R, its location is given by the R_LIBS_USER environment variable). Once this directory exists, you can install packages into this user library in the regular way (and you do not need to run code with sudo). You can upload the following script called setup.R as a fixture and run it in the Global Setup Script using Rscript $FIXTURES/setup.R. Optionally, you can also already install required packages in this script:


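UserLib <- Sys.getenv("R_LIBS_USER")  # path of the per-user package library
dir.create(UserLib, recursive=TRUE)   # create it, including parent directories

install.packages(c("tidyr"), lib=UserLib)  # optional: pre-install required packages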

After running at least the first two lines of the above script, students can also install R libraries directly in their scripts uploaded to CodeGrade using the regular install.packages("PACKAGE") line. Please note that if you want to allow students to install packages themselves using this method, you need to enable internet access for them in the categories you test their code in. Optionally, you may also increase the timeout of the steps, as installing R packages can take quite some time.

Both of these methods have their pros and cons. Letting your students install their own libraries allows for more freedom, but results in a longer wait before students get their automatic feedback (as we have to wait for the libraries to install). Installing these libraries up front moves that waiting time to the setup phase and removes it from most instant feedback runs. However, you will need to know exactly which libraries students will use.
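One way to soften this trade-off is to install a package only when it is not already available, so that anything pre-installed during setup is never downloaded again. The helper below is a minimal sketch of that idea. The name ensure_package and the CRAN mirror URL are assumptions for illustration and not part of CodeGrade, and the install branch still needs internet access.

```r
# Hypothetical helper: install a package into the user library only if missing.
ensure_package <- function(pkg, lib = Sys.getenv("R_LIBS_USER")) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    # Already installed (for example by the setup script): nothing to do.
    return(invisible(FALSE))
  }
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)
  # The mirror URL is an assumption; use whichever CRAN mirror you prefer.
  install.packages(pkg, lib = lib, repos = "https://cloud.r-project.org")
  invisible(TRUE)
}

# "stats" ships with R, so this call returns immediately without installing.
ensure_package("stats")
```

The function returns TRUE (invisibly) when it had to install something and FALSE otherwise, which can be handy for logging how often a run needed a download.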

Checking R code plagiarism in CodeGrade

With the increasing popularity of data science courses, it is also more and more important to check for plagiarism in R programming assignments. Checking plagiarism in R scripts is easy and straightforward using CodeGrade. After your students have handed in their files, simply navigate to the plagiarism checker tab and select R as your language. You now even get the option to include previous assignments or upload base code that you have provided to your students.
