Autograding the structure of code assignments for CS Education and code bootcamps using CodeGrade and Semgrep without an AST.
April 4, 2022

Webinar: Autograding Code Structure using Semgrep

In 30 seconds...

In this webinar we discussed:

  • Why you should start autograding code structure;
  • What the tool Semgrep is;
  • How Semgrep works together with CodeGrade;
  • The basics of Semgrep patterns and rules;
  • Three step-by-step examples of autograding code structure in CodeGrade.

Learn all about it in this article or watch the webinar here!

In our latest webinar, we cover everything you need to know about autograding code structure using Semgrep in CodeGrade, including many practical step-by-step examples. The webinar was part of our monthly CodeGrade Webinars series. It was recorded live on April 1st, 2022 and is now available on demand.

Semgrep and CodeGrade

Traditional linters, like pylint for Python or eslint for JavaScript, are easy to use in CodeGrade and work well for enforcing general, language-wide standards, but they are not suited to checks on specific code structure. Semgrep is a static analysis tool that checks the structure of code against simple patterns you provide. Originally designed to find security vulnerabilities in code, Semgrep is an open-source tool by the software security company r2c (originally developed at Facebook). It supports many programming languages, such as Go, Java, JavaScript, Python and Ruby, with languages like PHP and C currently being beta-tested.

With Semgrep, you write simple YAML configuration files containing patterns that look for specific structures in code. In the webinar, Devin goes over the basics of these patterns and rule files. You can also find more information in Semgrep's official documentation here: https://semgrep.dev/docs/. Using these configuration files is far easier and more portable than writing your own script and parsing the AST (Abstract Syntax Tree) yourself each time you want to assess code structure.
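
To make the comparison concrete, here is a minimal sketch of what "parsing the AST yourself" could look like for a single check (does a submission import pandas?), using Python's standard `ast` module. The helper name `imports_module` and the sample submission strings are hypothetical and only for illustration. This is not how Semgrep works internally, only the kind of hand-written script a one-line Semgrep pattern can replace.

```python
import ast


def imports_module(source: str, module: str) -> bool:
    """Return True if `source` contains an absolute import of `module`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # Covers: import pandas / import pandas as pd / import pandas.io
            if any(alias.name.split(".")[0] == module for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # Covers: from pandas import DataFrame
            # level == 0 skips relative imports such as "from .pandas import x"
            if node.level == 0 and node.module:
                if node.module.split(".")[0] == module:
                    return True
    return False


# Hypothetical submissions, passed as strings so nothing is actually imported.
print(imports_module("import pandas as pd\n", "pandas"))
print(imports_module("import math\n", "pandas"))
```

Every new structural check (loops, function names, and so on) would need another function like this one, which is the maintenance cost that a shared rule file avoids.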

Finally, as mentioned in the webinar, a great place to try out your patterns is using Semgrep's Playground, which can be found here: https://semgrep.dev/playground.

CodeGrade has built-in support for Semgrep in its Unit Test step and has made Semgrep into an education-ready tool. Specifically for education, we have added the `match-expected` field to the rule YAML, which you can use to look for both wanted and unwanted structures.
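
One way to read `match-expected` is as the expected outcome of the pattern search: a rule succeeds when what Semgrep finds agrees with what you said you expected. The sketch below encodes that reading. It is an assumption based on the description above, not CodeGrade's actual implementation, and the function name `rule_passes` is hypothetical.

```python
def rule_passes(pattern_found: bool, match_expected: bool) -> bool:
    """Assumed semantics: a rule passes when the search result equals the expectation."""
    return pattern_found == match_expected


# Wanted structure (match-expected: true): passes only if the pattern is found.
print(rule_passes(pattern_found=True, match_expected=True))
print(rule_passes(pattern_found=False, match_expected=True))

# Unwanted structure (match-expected: false): passes only if the pattern is absent.
print(rule_passes(pattern_found=True, match_expected=False))
print(rule_passes(pattern_found=False, match_expected=False))
```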


Step-by-Step Examples

Below, you can find the example YAML configuration files that we used for the three examples in the webinar:

Example 1, checking for imports:

-!- CODE language-yaml -!-rules:
- id: pandas-import
  match-expected: false
  pattern: import pandas
  message: You are not allowed to use pandas in this simple assignment!
  severity: INFO
  languages:
    - python

The above file called `import.yml` can be uploaded as a fixture and used for your Python assignments right away.

Example 2, checking for for-loops:

In the webinar, something went wrong during the live demo of example 2. We later found out that this was caused by a bug in Semgrep rather than a typo. We are currently upgrading our Semgrep installation in the hope that this resolves the issue. The bug was specific to Java, however, and the Python YAML configuration below works for the same purpose.

-!- CODE language-yaml -!-rules:
- id: for-loop
  match-expected: true
  pattern: |
    for $EL in $LST:
        ...
  message: A for-loop was used
  severity: INFO
  languages:
    - python

- id: no-while-loop
  match-expected: false
  pattern: |
    while $COND:
        ...
  message: No while-loop was used
  severity: INFO
  languages:
    - python
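
For illustration, here is a hypothetical student submission that the two rules above are meant to accept. It contains a `for` statement of the shape `for $EL in $LST:` (with `$EL` standing for `weight` and `$LST` for `weights`) and no `while` statement. The function name and data are made up for this sketch.

```python
def total_weight(weights):
    """Sum a list of weights using a for-loop, as the assignment requires."""
    total = 0
    for weight in weights:  # the structure the for-loop rule looks for
        total += weight
    return total


print(total_weight([2, 3, 5]))  # prints 10
```

A submission that reached the same result with `while index < len(weights):` would contain the unwanted structure targeted by the `no-while-loop` rule, and would lack the wanted structure targeted by the `for-loop` rule.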

Example 3, checking for function and variable names:

In my opinion, this is one of the easiest yet most effective use cases of Semgrep in CodeGrade. For any assignment that requires students to use specific names so that your (unit) tests work, you can add a Semgrep check before those tests to verify the naming. Students who make a naming mistake then get a clear and helpful message from Semgrep instead of a complicated error message from your (unit) test, which prevents confusion after the deadline.

-!- CODE language-yaml -!-rules:
- id: function-name
  match-expected: true
  pattern: |
    def calculate_weight(...):
        ...
  message: You are using the function called calculate_weight().
  severity: INFO
  languages:
    - python

- id: variable-name
  match-expected: true
  pattern: bestsellers = $X
  message: You used the right variable name.
  severity: INFO
  languages:
    - python
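
To show why the naming check matters, the sketch below pairs a hypothetical submission with a hypothetical instructor test that depends on the names `calculate_weight` and `bestsellers`. The function body, the book titles and the test cases are all invented for illustration. Only the two names come from the rules above.

```python
import unittest


# --- Hypothetical student submission ---
def calculate_weight(mass_kg, gravity=9.81):
    """Return the weight in newtons for a mass in kilograms."""
    return mass_kg * gravity


bestsellers = ["Dune", "Emma"]


# --- Hypothetical instructor test that relies on the exact names above ---
class TestSubmission(unittest.TestCase):
    def test_calculate_weight(self):
        self.assertAlmostEqual(calculate_weight(10), 98.1)

    def test_bestsellers(self):
        self.assertIsInstance(bestsellers, list)


# Run the tests explicitly so the script does not exit the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSubmission)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

If the student had named the function `calc_weight` instead, the first test would fail with a `NameError`, which says little to a beginner. Running the Semgrep naming rules first surfaces the same problem as a readable message.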

Want to read more about Semgrep? You can also take a look at our Help Center article here!

Devin Hillenius

Co-founder
