Automatically grading Jupyter (IPython) Notebooks for university courses
April 2, 2021

Grading Jupyter Notebooks, manually and automatically

In 30 seconds...

  • Jupyter Notebook is a commonly used tool in computer science and programming teaching. Students can create and share documents that are a combination of Python code and text;
  • But it is still awkward to grade! Most teachers download the notebook, open it in Jupyter, manually review it then submit it back to the LMS;
  • Manually grading does make sense for Notebooks, as many include written text and graphics. In CodeGrade we render and run the code portion of the notebook directly within our interface! Just click on a line and leave feedback;
  • It is also very easy to set up automatic testing: unit tests or simple I/O tests are a great way to assess students' IPython Notebooks automatically.

Jupyter Notebook, formerly known as IPython Notebook, is a fantastic and very commonly used tool in computer science and programming education. Jupyter Notebooks allow a student to create and share documents that are a combination of Python code and text, which can include equations, visualizations and narrative text. This combination makes it very powerful for the more applied computer science courses like data science, machine learning and computer graphics. 

Teaching with Jupyter Notebooks has become common practice because of all these advantages. However, grading Jupyter Notebooks (files ending with the `.ipynb` extension) is still very cumbersome. Most teachers simply download the submitted notebook, open it using the Jupyter application, manually review it, and then submit a grade back to the learning management system. This is far from practical or scalable. In this guide, we will explain how you can manually grade Jupyter Notebooks more effectively using CodeGrade and how you can set up automatic tests for them too.

Prefer to watch a webinar? All of the techniques mentioned in this blog can also be found in our webinar on Python and Jupyter Notebook autograding.

Manual grading

CodeGrade makes it very easy to automatically run and grade a Jupyter Notebook in our interface, but we will start with a short section on grading them manually. The main reason is that Jupyter Notebooks are particularly well suited to manual grading.

Jupyter Notebooks are most often chosen by instructors because they very intuitively combine written reports and code, which can interact and add to each other in a notebook. Moreover, in the types of courses where Jupyter Notebooks are most commonly used, student work consists predominantly of graphs, visualizations and other graphics. The code part of a notebook can be graded very effectively with automatic tests, but written text and graphics are usually graded manually.

CodeGrade makes this intuitive and highly efficient for you, as it renders and runs the Jupyter Notebook directly within our interface in your browser. This makes grading it manually extremely easy: just as you have come to expect from CodeGrade, you can simply click on any line in the notebook to leave feedback for the students. All of CodeGrade’s efficiency- and feedback-enhancing tools, such as feedback snippets, rubrics and grading management, are also available for Jupyter Notebooks. This is all available as a plugin to your learning management system (LMS) too, like Canvas, Blackboard and Open edX. We also discuss Jupyter Notebooks in Brightspace in a separate article.

Automatically running a Jupyter Notebook

When grading a Jupyter Notebook manually, it is good practice to first run the notebook. CodeGrade can do this automatically.

To understand why this is good practice, it helps to understand the inner workings of a Jupyter Notebook first. In essence, a notebook is a simple JSON file that stores its cells. Alongside the cells, their outputs are also saved in this JSON, meaning that students hand in a notebook in a certain state. That state does not necessarily match the latest version of the code, and the output may even have been altered manually. With automatic grading, tests check the correctness of the actual code, but with manual grading the visual results usually determine the grade. To make sure you are grading output that matches the submitted code, you can automatically run the notebook using CodeGrade AutoTest.
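To make the "notebook in a certain state" point concrete, here is a minimal Python sketch that inspects the saved state of a notebook. It assumes the standard notebook JSON layout (a top-level `cells` list, with `execution_count` and `outputs` on code cells). The function name `summarize_notebook` and the sample data are our own illustration, not part of CodeGrade.

```python
import json


def summarize_notebook(nb):
    """Summarize the saved state of a notebook given as a parsed JSON dict."""
    code_cells = [c for c in nb.get("cells", []) if c.get("cell_type") == "code"]
    counts = [c.get("execution_count") for c in code_cells]
    executed = [n for n in counts if n is not None]
    return {
        "code_cells": len(code_cells),
        # Code cells that carry stored output in the submitted file.
        "cells_with_saved_output": sum(1 for c in code_cells if c.get("outputs")),
        # Code cells that were never run before saving.
        "never_executed": counts.count(None),
        # True only if every code cell ran exactly once, in order, from a fresh kernel.
        "ran_top_to_bottom": executed == list(range(1, len(code_cells) + 1)),
    }


# Hand-written sample: the second code cell was run before the first one.
sample = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Report"]},
        {"cell_type": "code", "source": ["print(x)"], "execution_count": 2,
         "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}]},
        {"cell_type": "code", "source": ["x = 1"], "execution_count": 1, "outputs": []},
    ]
}

summary = summarize_notebook(sample)
print(json.dumps(summary, indent=2))
```

For a real submission you would load the file with `json.load` first. A `ran_top_to_bottom` value of `False` is only a hint that the stored output may not match the code, which is exactly why re-running the notebook before grading is worthwhile.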

For this, we will use CodeGrade’s AutoTest Output functionality. This allows us to generate output using AutoTest that can be displayed in our Code Viewer! CodeGrade has custom scripts that make this very easy, but you can also do it yourself with the pre-installed `jupyter` package:

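```shell
# Execute the student's notebook and write the result to the AutoTest output folder.
# --allow-errors keeps executing the remaining cells if one of them raises an error.
jupyter nbconvert --execute --to notebook \
   --output "$AT_OUTPUT/jupyter.ipynb" "$STUDENT/jupyter.ipynb" \
   --allow-errors
```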

By adding this command to a Run Program step in your AutoTest, you can generate the executed Jupyter Notebook in the AutoTest output folder and review it manually!
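Because `--allow-errors` lets execution continue past a failing cell, errors end up stored inside the executed notebook rather than stopping the run. As a rough sketch, assuming the standard notebook JSON layout in which a failing cell has an output with `output_type` set to `"error"`, a few lines of Python can list those cells before you start reviewing. The helper name `find_cell_errors` and the sample data are hypothetical, not a CodeGrade script.

```python
import json


def find_cell_errors(nb):
    """Return (cell_index, error_name, error_message) for every failing code cell."""
    errors = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            if output.get("output_type") == "error":
                errors.append((index, output.get("ename", ""), output.get("evalue", "")))
    return errors


def find_cell_errors_in_file(path):
    """Same check for a notebook file on disk, e.g. the executed output notebook."""
    with open(path, encoding="utf-8") as handle:
        return find_cell_errors(json.load(handle))


# Hand-written sample of an executed notebook in which the second cell failed.
executed = {
    "cells": [
        {"cell_type": "code", "source": ["x = 1"], "execution_count": 1, "outputs": []},
        {"cell_type": "code", "source": ["1 / 0"], "execution_count": 2,
         "outputs": [{"output_type": "error", "ename": "ZeroDivisionError",
                      "evalue": "division by zero", "traceback": []}]},
    ]
}

cell_errors = find_cell_errors(executed)
for index, name, message in cell_errors:
    print(f"cell {index}: {name}: {message}")
```

The cell index counts all cells, including text cells, so it corresponds to the position of the cell in the rendered notebook.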


The automatically run Jupyter Notebook in the Code Viewer


Start autograding your Jupyter Notebook assignments now with CodeGrade!
