Autograding the structure of code assignments for CS Education and code bootcamps using CodeGrade and Semgrep without an AST.
April 4, 2022

Webinar: Autograding Code Structure using Semgrep

In 30 seconds...

In this webinar we discussed:

  • Why you should start autograding code structure;
  • What the tool Semgrep is;
  • How Semgrep works together with CodeGrade;
  • The basics of Semgrep patterns and rules;
  • Three step-by-step examples of autograding code structure in CodeGrade.

Learn all about it in this article or watch the webinar here!

In our latest webinar, we tell you everything you need to know about autograding code structure using Semgrep in CodeGrade, including many practical step-by-step examples! This webinar was part of our monthly CodeGrade Webinars series and was recorded live on April 1st, 2022. It is now available on demand.

Semgrep and CodeGrade

Traditional linters, like pylint for Python or eslint for JavaScript, are easily used in CodeGrade and great for general, broad language standards, but not for specific code structure checks. Semgrep is a tool that can do static code analysis on the structure of code, based on very simple patterns you provide it. Originally designed to find security vulnerabilities in code, Semgrep is an open-source tool by the software security company r2c (originally developed at Facebook) that supports many programming languages like Go, Java, JavaScript, Python and Ruby, with languages like PHP and C currently being beta-tested.

With Semgrep, you can use simple YAML configuration files that include patterns to look for specific structures in code. In the webinar, Devin goes over the basics of these patterns and rule files. You can also find more information in Semgrep's official documentation here: https://semgrep.dev/docs/. Using these configuration files is far easier and more portable than writing your own script and parsing the AST (Abstract Syntax Tree) yourself each time you want to assess code structure.
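To give a sense of the hand-written alternative, here is a minimal sketch of checking for a `pandas` import by walking the AST with Python's standard `ast` module. The `imported_modules` helper and the `SOURCE` sample are hypothetical names made up for this illustration; they are not part of CodeGrade or Semgrep.

```python
import ast

# Hypothetical student submission, inlined as a string for this sketch.
SOURCE = """
import pandas as pd
from math import sqrt

def mean(values):
    return sum(values) / len(values)
"""


def imported_modules(source):
    """Return the set of top-level module names imported in the source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b as c` records "a"
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # `from a.b import c` records "a"; relative imports without a
            # module name (`from . import x`) are skipped
            modules.add(node.module.split(".")[0])
    return modules


uses_pandas = "pandas" in imported_modules(SOURCE)
print(uses_pandas)
```

Even this small check needs two node types and some care around aliases and relative imports, and a script like this has to be rewritten for every new structure and every new language you want to check.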

Finally, as mentioned in the webinar, a great place to try out your patterns is Semgrep's Playground, which can be found here: https://semgrep.dev/playground.

CodeGrade has built-in support for Semgrep in its Unit Test step and has made Semgrep into an education-ready tool. Specifically for education, we have added the `match-expected` field to the rule YAML, which you can use to look for both wanted and unwanted structures.
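One way to picture `match-expected` is as a comparison between what a rule found and what you wanted it to find. The sketch below is only our reading of the description above, under the assumption that a rule counts as satisfied when "pattern matched" equals `match-expected`; the function `rule_passes` is a hypothetical name and this is not CodeGrade's actual implementation.

```python
def rule_passes(match_expected, match_count):
    """Assumed model: a rule passes when finding (or not finding)
    the pattern agrees with the rule's match-expected value."""
    found = match_count > 0
    return found == match_expected


# Wanted structure (match-expected: true): passes only if the pattern is found.
print(rule_passes(True, 2))
print(rule_passes(True, 0))

# Unwanted structure (match-expected: false): passes only if nothing is found.
print(rule_passes(False, 0))
print(rule_passes(False, 1))
```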


Step by Step Examples

Below, you can find the example YAML configuration files that we used for the three examples in the webinar:

Example 1, checking for imports:

-!- CODE language-yaml -!-
rules:
- id: pandas-import
  match-expected: false
  pattern: import pandas
  message: You are not allowed to use pandas in this simple assignment!
  severity: INFO
  languages:
    - python

The above file called `import.yml` can be uploaded as a fixture and used for your Python assignments right away.

Example 2, checking for for-loops:

In the webinar, something went wrong during the live demo of example 2. We later found out that this was caused by a bug in Semgrep, not by a typo. We are currently upgrading our Semgrep installation in the hope that this resolves it. The bug was specific to Java, however, and the Python YAML configuration below works for the same purpose.

-!- CODE language-yaml -!-
rules:
- id: for-loop
  match-expected: true
  pattern: |
    for $EL in $LST:
        ...
  message: A for-loop was used
  severity: INFO
  languages:
    - python

- id: no-while-loop
  match-expected: false
  pattern: |
    while $COND:
        ...
  message: No while-loop was used
  severity: INFO
  languages:
    - python
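As a quick illustration, the hypothetical submission below (the function name `total` is our own invention for this sketch) contains a `for` loop that the `for-loop` pattern should match, with `value` binding to `$EL` and `values` to `$LST`. It contains no `while` loop, so the `no-while-loop` pattern should find nothing.

```python
def total(values):
    """Sum a list of numbers using a for-loop."""
    result = 0
    for value in values:  # matched by `for $EL in $LST: ...`
        result += value
    return result


print(total([1, 2, 3]))  # prints 6
```

A submission that computed the same sum with `while i < len(values):` would return the same answer, which is exactly why an output-only test cannot tell the two apart and a structure check can.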

Example 3, checking for function and variable names:

In my opinion, this is one of the easiest yet most effective use cases of Semgrep in CodeGrade. In any assignment that requires students to use specific names for your (unit) tests to work, you can add a Semgrep check before those tests to verify that the naming is correct. As a result, students get a clear and helpful message from Semgrep when they make a naming mistake, instead of a complicated error message from your (unit) test, which prevents confusion after the deadline.

-!- CODE language-yaml -!-
rules:
- id: function-name
  match-expected: true
  pattern: |
    def calculate_weight(...):
        ...
  message: You are using the function called calculate_weight().
  severity: INFO
  languages:
    - python

- id: variable-name
  match-expected: true
  pattern: bestsellers = $X
  message: You used the right variable name.
  severity: INFO
  languages:
    - python
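For reference, a hypothetical submission that both patterns should match could look like the sketch below. The parameter names, the default value, and the list contents are made up for illustration; the rules only care that a function named `calculate_weight` is defined (with any parameters, thanks to `...`) and that something is assigned to `bestsellers`.

```python
def calculate_weight(mass_kg, gravity=9.81):
    """Return the weight in newtons for a mass in kilograms."""
    return mass_kg * gravity


# Any right-hand side binds to $X in `bestsellers = $X`.
bestsellers = ["Dune", "Emma", "Beloved"]

print(calculate_weight(2, gravity=10))
print(len(bestsellers))
```

A student who named these `calc_weight` or `best_sellers` instead would not match either pattern, so the naming problem can surface through the Semgrep step before the unit tests try to use the missing names.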

Want to read more about Semgrep? You can also take a look at our Help Center article here!

Devin Hillenius

Co-founder, Product Expert
Devin is co-founder and Product Expert at CodeGrade. During his Computer Science studies and his work as a TA at the University of Amsterdam, he developed CodeGrade together with his co-founders to make their lives easier. Devin supports instructors with their programming courses, focusing on both their pedagogical needs and innovative technical possibilities. He also hosts CodeGrade's monthly webinar.
