Use Semgrep and CodeGrade to autograde the structure of code assignments in IT education

January 27, 2021

Autograding code structure using CodeGrade and Semgrep

In 30 seconds...

I often get requests from teachers who want to automatically assess the structure of their students’ code using CodeGrade, for assignments in courses ranging from Introduction to Programming to Data Structures and other more advanced courses. Especially for larger, more free-form assignments, automatically grading structure becomes difficult because there are so many possible solutions. However, much like how you would use a linter (a tool that checks code style against the rules in a style guide), we can automatically detect code structures in CodeGrade and add or deduct points for desired structures or bad practices in student code.

In this blog post, I will explain how you can very easily set up automatic structure testing for your CodeGrade assignments using a very handy tool called semgrep and give you some concrete examples you can use in your own CodeGrade assignments!

Real world scenarios

To better understand when you may want to automatically grade code structure and which problems we are solving in this blog, I will go over two real requests I got from computer science instructors:

For an Introduction to Programming in Python course, an instructor wants to effectively teach students the different types of loops. She has multiple assignments in the course, each focusing on a different loop. To force students to use a while-loop for one assignment and a for-loop for another, she wants to automatically detect these structures in the code and deduct points if a loop is missing or if the wrong type of loop was used.
For numerous programming courses, instructors want to enforce good coding practices that are not caught by traditional linters. For this, they want to deduct points for common bad “spaghetti code” practices. In our example, we will automatically deduct points for Java code with too many nested if-statements.

Of course, there are endless possibilities with the tools I explain in this blog and you are encouraged to translate the examples to fit your own assignments or programming languages.

Semgrep

Traditional linters, like pylint for Python or eslint for JavaScript, are easily used in CodeGrade and great for general, broad language standards, but not for specific code structure checks. Semgrep is a tool that can do static code analysis on the structure of code, based on very simple patterns you provide it. Originally designed to find security vulnerabilities in code, Semgrep is an open-source tool that was originally developed at Facebook and is now maintained by the software security company r2c. It supports many programming languages like Go, Java, JavaScript, Python and Ruby, with TypeScript, PHP and C currently being beta-tested. Semgrep can also be used for Jupyter Notebooks, after converting the notebook to Python code. Learn how to do that in our blog on grading Jupyter Notebooks.

Semgrep makes it surprisingly easy to perform more complex code analysis by allowing you to write rules in a human readable format. You can provide generic or language specific patterns, which are then found in the code. With its pattern syntax, you can find:

Equivalences: Matching code that means the same thing even though it looks different.
Wildcards / ellipsis (...): Matching any statement, expression or variable.
Metavariables ($X): Matching expressions when you do not know in advance what they will look like, but want the same expression to appear in multiple parts of your pattern.
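
To make the metavariable idea a bit more concrete: the Semgrep pattern `$X == $X` matches any comparison whose two sides are the same expression. The sketch below mimics that single pattern with Python's standard `ast` module. It is only an illustration of structural (rather than textual) matching, not how semgrep works internally, and `find_self_comparisons` is a name I made up for this example.

```python
import ast


def find_self_comparisons(source: str) -> list[int]:
    """Return the line numbers of comparisons such as `x == x`.

    This roughly imitates what the Semgrep pattern `$X == $X` would match.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Compare)
            and len(node.ops) == 1
            and isinstance(node.ops[0], ast.Eq)
            # ast.dump ignores whitespace and positions, so both sides are
            # compared by structure, not by their literal text.
            and ast.dump(node.left) == ast.dump(node.comparators[0])
        ):
            hits.append(node.lineno)
    return sorted(hits)


student_code = """
total = 3
if total == total:      # same expression on both sides
    print("always true")
if total == 4:          # different expressions
    print("sometimes true")
"""

print(find_self_comparisons(student_code))
```

Only the first comparison is reported, because only there does the same expression fill both `$X` slots.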

To make semgrep effective for educational purposes, we have created a wrapper script around semgrep in CodeGrade. This wrapper script makes it work beautifully in our Unit Test step, so that each individual rule you define will show up as one specific Unit Test that can either pass or fail. What’s more, we have added the necessary feature that allows for “positive matches”: patterns that we define and do expect to find (e.g. when we require students to use a for-loop). By default, semgrep treats every match of a pattern as an error. This wrapper script, called `cg-semgrep`, is automatically installed on AutoTest.
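
The pass/fail idea behind “positive matches” can be sketched in a few lines of Python. This is my own minimal illustration of the logic described above, not the actual `cg-semgrep` source; the function name, the rule ids and the match counts are all hypothetical.

```python
def rule_passes(match_expected: bool, match_count: int) -> bool:
    """A rule passes when 'was a match found?' equals 'was a match expected?'."""
    return (match_count > 0) == match_expected


# Hypothetical results for one submission:
# (rule id, whether a match is expected, number of matches found)
results = [
    ("must-use-for-loop", True, 2),         # expected and found: pass
    ("must-not-use-while-loop", False, 1),  # not expected but found: fail
]

for rule_id, expected, count in results:
    print(rule_id, "passed" if rule_passes(expected, count) else "failed")
```

With plain semgrep, only the second kind of rule exists: any match is a finding. The wrapper's extra field simply lets a rule flip that expectation.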

Semgrep patterns for education

After installing semgrep in the “Global setup script to run” of our AutoTest (using `cg-semgrep install`) and uploading the YAML rules file as a fixture, we can start to write the patterns we need for our assignments. Semgrep has a very nifty tool that lets us try and create patterns online: https://semgrep.dev/editor. Let’s do that by going over the requests above.

Detecting loops in Python

The first ruleset is for the assignment in which we want to detect a for-loop in the student code and make sure students do not use a while-loop.

```yaml
rules:
  - id: for-loop
    match-expected: true
    pattern: |
      for $EL in $LST:
          ...
    message: A for-loop was used
    severity: INFO
    languages:
      - python

  - id: no-while-loop
    match-expected: false
    pattern: |
      while $COND:
          ...
    message: No while-loop was used
    severity: INFO
    languages:
      - python
```

The file rules.yml contains two rules: for-loop and no-while-loop. I use the ellipsis (`...`) to match anything, and the metavariables `$EL` (element) and `$LST` (list) to capture the two parts of the for-loop. The names of these metavariables are irrelevant and could have been anything else. We defined and tested both patterns in semgrep's online editor (https://semgrep.dev/editor/), so that they simply match a for-loop and a while-loop, respectively. I have also given both rules understandable messages, as these are shown to the student as the name of the test. Finally, and most importantly, we have added the `match-expected` field (this is added by the CodeGrade wrapper script and cannot be tested in regular semgrep). By setting this field to `true` for the for-loop rule, we specify that we expect a match in order to pass that test.
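
To see what these two rules mean for a student, here is a hypothetical submission together with a rough stand-in for the checks. The stand-in uses Python's standard `ast` module instead of semgrep, so it only approximates the for-loop and no-while-loop rules; `loop_kinds` and the submission itself are made up for this example.

```python
import ast


def loop_kinds(source: str) -> set[str]:
    """Return which kinds of loop statements ('For', 'While') occur in the code."""
    tree = ast.parse(source)
    return {
        type(node).__name__
        for node in ast.walk(tree)
        if isinstance(node, (ast.For, ast.While))
    }


# A hypothetical student submission that sums a list with a for-loop.
submission = """
def total(numbers):
    result = 0
    for n in numbers:
        result += n
    return result
"""

kinds = loop_kinds(submission)
# for-loop rule: a match is expected, so finding a for-loop is a pass.
print("for-loop:", "pass" if "For" in kinds else "fail")
# no-while-loop rule: no match is expected, so finding a while-loop is a fail.
print("no-while-loop:", "pass" if "While" not in kinds else "fail")
```

A submission that solved the same task with `while` would flip both results, which is exactly the behaviour the instructor asked for.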

Semgrep results in a CodeGrade automatic test

Detecting spaghetti code

For our second request, we would like to automatically deduct points for common bad practices, and more concretely for one of the most common ones: too many nested if-statements. Luckily, structures like this are very easy to catch with semgrep in CodeGrade:

```yaml
rules:
  - id: nested-if
    match-expected: false
    pattern: |
      if (...) {
        ...
        if (...) {
          ...
          if (...) {
            ...
          }
        }
      }
    message: No more than 2 nested if-statements were used.
    severity: INFO
    languages:
      - java
```

We use the ellipsis (the `...` in the pattern) here, so that any code can be in the condition and inside the blocks of the if-statements. Because an if-statement can itself be part of that code, this pattern automatically catches code with more than three nested if-statements too.
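
If you want to translate this check to a Python assignment, the same idea can be sketched with Python's standard `ast` module. This is an analogous illustration under my own assumptions, not semgrep and not part of `cg-semgrep`: the names `IfDepth` and `max_if_depth` and the threshold of three are hypothetical. Like the Java pattern above, it only counts if-statements nested in the body of another if-statement.

```python
import ast


class IfDepth(ast.NodeVisitor):
    """Track how deeply if-statements are nested inside each other's bodies."""

    def __init__(self) -> None:
        self.depth = 0
        self.max_depth = 0

    def visit_If(self, node: ast.If) -> None:
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)
        for child in node.body:    # the "then" block is one level deeper
            self.visit(child)
        self.depth -= 1
        for child in node.orelse:  # elif/else branches stay at the current level
            self.visit(child)


def max_if_depth(source: str) -> int:
    """Return the deepest level of if-in-if nesting found in the code."""
    visitor = IfDepth()
    visitor.visit(ast.parse(source))
    return visitor.max_depth


# A hypothetical submission with three if-statements nested in each other.
submission = """
def classify(n):
    if n > 0:
        if n % 2 == 0:
            if n > 100:
                return "big even"
    return "other"
"""

depth = max_if_depth(submission)
# Mirrors match-expected: false, so reaching three levels fails the test.
print("nested-if:", "fail" if depth >= 3 else "pass")
```

Any deeper nesting also reaches a depth of at least three, which is the counterpart of the ellipsis in the Java pattern matching an if-statement that itself contains more if-statements.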

Adding semgrep to your CodeGrade assignment

CodeGrade AutoTest offers you endless possibilities and full flexibility for your autograding workflow. You can use the tools and tips explained in this blog in combination with I/O Testing or Unit Testing to check the functionality of student code, or together with Code Quality tests to assess the quality and style of student code. The rubrics in CodeGrade will make sure all these tests work together well and are clear to your students.

CodeGrade’s structure tests using semgrep can help you autograde desired or unwanted structures and snippets in your students’ code. Would you like to learn more about doing this in your assignment, or would you like to use this for a different purpose than explained in this blog? Feel free to email me at support@codegrade.com and I’d be more than happy to help out!