Autograding the structure of code assignments for CS Education and code bootcamps using CodeGrade and Semgrep without an AST.
Guides
April 4, 2022

Webinar: Autograding Code Structure using Semgrep

In 30 seconds...

In this webinar we discussed:

  • Why you should start autograding code structure;
  • What the tool Semgrep is;
  • How Semgrep works together with CodeGrade;
  • The basics of Semgrep patterns and rules;
  • Three step-by-step examples of autograding code structure in CodeGrade.

Learn all about it in this article or watch the webinar here!

In our latest webinar, we tell you everything you need to know about autograding code structure using Semgrep in CodeGrade, including many practical step-by-step examples! This webinar was part of our monthly CodeGrade Webinars series; it was recorded live on April 1st, 2022 and is now available on demand.

Semgrep and CodeGrade

Traditional linters, like pylint for Python or eslint for JavaScript, are easily used in CodeGrade and great for general, broad language standards, but not for specific code structure checks. Semgrep is a tool that can do static code analysis on the structure of code, based on very simple patterns you provide it. Originally designed to find security vulnerabilities in code, Semgrep is an open-source tool by the software security company r2c (originally developed at Facebook) that supports many programming languages like Go, Java, JavaScript, Python and Ruby, with languages like PHP and C currently being beta-tested.

With Semgrep, you can use simple YAML configuration files that include patterns to look for specific structures in code. In the webinar, Devin goes over the basics of these patterns and rule files. You can also find more information in Semgrep's official documentation here: https://semgrep.dev/docs/. Using these configuration files is far easier and more portable than writing your own script and parsing the AST (Abstract Syntax Tree) yourself each time you want to assess code structure.
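To make that comparison concrete, here is a minimal sketch of the hand-written alternative: a Python script that parses a submission with the standard `ast` module just to detect an import of one module. The function name `imports_module` is hypothetical and only serves as an illustration. It is not how Semgrep or CodeGrade work internally.

```python
import ast


def imports_module(source: str, module: str) -> bool:
    """Return True if `source` imports `module` or one of its submodules."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # Covers `import pandas` and `import pandas as pd`.
            for alias in node.names:
                if alias.name == module or alias.name.startswith(module + "."):
                    return True
        elif isinstance(node, ast.ImportFrom):
            # Covers `from pandas import DataFrame`.
            # Relative imports (level > 0) refer to local packages, so skip them.
            if node.level == 0 and node.module is not None:
                if node.module == module or node.module.startswith(module + "."):
                    return True
    return False


uses_pandas = imports_module("import pandas as pd\n", "pandas")
uses_math_only = imports_module("import math\n", "pandas")
```

Even this small check needs a tree walk and separate handling of two node types, and it would have to be rewritten for every new structure and every other programming language. A Semgrep rule expresses the same intent in a pattern of one line.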

Finally, as mentioned in the webinar, a great place to try out your patterns is using Semgrep's Playground, which can be found here: https://semgrep.dev/playground.

CodeGrade has built-in support for Semgrep in its Unit Test step and has made Semgrep into an education-ready tool. Specifically for education, we have added the `match-expected` field in the rule YAML, which you can use to look for both wanted and unwanted structures.
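One way to think about `match-expected` is sketched below. This is an assumption about the semantics based on the description above, not CodeGrade's actual implementation, and the function name `rule_passes` is hypothetical: a rule for a wanted structure passes when Semgrep finds at least one match, and a rule for an unwanted structure passes when it finds none.

```python
def rule_passes(match_expected: bool, match_count: int) -> bool:
    """Assumed semantics: a rule passes when 'a match was found' equals 'a match was expected'."""
    found = match_count > 0
    return found == match_expected


# Wanted structure (match-expected: true), found twice in the submission.
wanted_found = rule_passes(match_expected=True, match_count=2)

# Unwanted structure (match-expected: false), found once in the submission.
unwanted_found = rule_passes(match_expected=False, match_count=1)
```

Under this reading, `wanted_found` is a pass and `unwanted_found` is a fail.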

Step by Step Examples

Below, you can find the example YAML configuration files that we used for the three examples in the webinar:

Example 1, checking for imports:

```yaml
rules:
- id: pandas-import
  match-expected: false
  pattern: import pandas
  message: You are not allowed to use pandas in this simple assignment!
  severity: INFO
  languages:
    - python
```

The above file called `import.yml` can be uploaded as a fixture and used for your Python assignments right away.

Example 2, checking for for-loops:

In the webinar, something went wrong during the live demo of example 2. We later found out that this was caused by a bug in Semgrep rather than a typo. We are currently upgrading our Semgrep installation in the hope that this resolves it. The bug, however, was specific to Java, and the Python YAML configuration below works for the same purpose.

```yaml
rules:
- id: for-loop
  match-expected: true
  pattern: |
    for $EL in $LST:
        ...
  message: A for-loop was used
  severity: INFO
  languages:
    - python

- id: no-while-loop
  match-expected: false
  pattern: |
    while $COND:
        ...
  message: No while-loop was used
  severity: INFO
  languages:
    - python
```
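As an illustration, here is a small hypothetical student submission that contains the wanted structure and avoids the unwanted one. The function and variable names are made up for this sketch. In the `for-loop` pattern, the metavariables `$EL` and `$LST` stand for any loop variable and any iterable, and `...` stands for any loop body.

```python
def total_price(prices):
    """Sum a list of prices with a for-loop, as the assignment asks."""
    total = 0
    # This loop has the shape `for $EL in $LST:` with $EL = price and $LST = prices.
    for price in prices:
        total += price
    return total


order_total = total_price([2.5, 4.0, 1.5])
```

A submission that walked the list with `while index < len(prices):` instead would have the shape of the `no-while-loop` pattern, which is the structure that rule is meant to catch.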

Example 3, checking for function and variable names:

In my opinion, this is one of the easiest yet most effective use cases of Semgrep in CodeGrade. For any assignment that requires students to use specific names for your (unit) tests to work, you can add a Semgrep check before those tests to verify the naming. Students then get a clear and helpful message from Semgrep when they make a naming mistake, instead of a complicated error message from your (unit) test, which prevents confusion past the deadline.

```yaml
rules:
- id: function-name
  match-expected: true
  pattern: |
    def calculate_weight(...):
        ...
  message: You are using the function called calculate_weight().
  severity: INFO
  languages:
    - python

- id: variable-name
  match-expected: true
  pattern: bestsellers = $X
  message: You used the right variable name.
  severity: INFO
  languages:
    - python
```
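The sketch below shows the kind of failure these naming rules are meant to catch early. It is a simplified stand-in for a real unit test, and everything in it is hypothetical: the submission, the misspelled `calc_weight`, and the helper `run_unit_test`. The student used the right variable name but the wrong function name, so a test that calls `calculate_weight` fails with an `AttributeError` rather than a message about naming.

```python
import types

# Hypothetical student submission: the assignment asks for `calculate_weight`,
# but the student named the function `calc_weight`.
student_source = """
def calc_weight(mass, gravity=9.81):
    return mass * gravity

bestsellers = ["Book A", "Book B"]
"""

# Load the submission into a throwaway module object, roughly as an import would.
submission = types.ModuleType("submission")
exec(student_source, submission.__dict__)


def run_unit_test(module):
    """Mimic a unit test that calls the required function by its expected name."""
    try:
        return f"ok: {module.calculate_weight(10)}"
    except AttributeError as err:
        return f"unit test crashed: {err}"


result = run_unit_test(submission)
```

With the `function-name` rule placed before the unit tests, the student would see a naming message first, before ever reaching the error stored in `result`.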

Want to read more about Semgrep? You can also take a look at our Help Center article here!

Devin Hillenius

Co-founder
