Grading Jupyter Notebooks, manually and automatically
In 30 seconds...
Jupyter Notebook is a commonly used tool in computer science and programming teaching. Students can create and share documents that are a combination of Python code and text;
But it is still awkward to grade! Most teachers download the notebook, open it in Jupyter, manually review it then submit it back to the LMS;
Manually grading does make sense for Notebooks, as many include written text and graphics. In CodeGrade we render and run the code portion of the notebook directly within our interface! Just click on a line and leave feedback;
It is also very easy to set up automatic testing: unit testing or easy I/O tests are a great way to assess students’ IPython Notebooks automatically.
Jupyter Notebook, formerly known as IPython Notebook, is a fantastic and very commonly used tool in computer science and programming education. Jupyter Notebooks allow a student to create and share documents that are a combination of Python code and text, which can include equations, visualizations and narrative text. This combination makes it very powerful for the more applied computer science courses like data science, machine learning and computer graphics.
Teaching with Jupyter Notebooks has become common practice because of all these advantages. However, grading Jupyter Notebooks (files ending with the `.ipynb` extension) is still very cumbersome. Most teachers simply download the submitted notebook, open it using the Jupyter application, manually review it, and then submit a grade back to the learning management system. This is far from a practical or scalable solution. In this guide, we will explain how you can manually grade Jupyter Notebooks more effectively using CodeGrade and how you can even set up automatic tests for them too.
CodeGrade makes it very easy to automatically grade and run a Jupyter Notebook in our interface, but we will start with a short section on grading them manually. The main reason is that Jupyter Notebooks are particularly well suited to some form of manual grading.
Jupyter Notebooks are most often chosen by instructors because they very intuitively combine written reports and code, which can interact and add to each other in a notebook. Moreover, the types of courses Jupyter Notebooks are most commonly used for predominantly produce graphs, visualizations and other graphics. The code part of notebooks can be graded automatically very effectively, but written text and graphics are usually best graded manually.
CodeGrade makes this intuitive and highly efficient for you, as it renders and runs the Jupyter Notebook directly within our interface in your browser. This makes grading it manually extremely easy: just as you have come to expect from CodeGrade, you can simply click on any line in the notebook to leave feedback to the students. All of CodeGrade’s efficiency- and feedback-enhancing tools, think feedback snippets, rubrics and grading management, are also available for Jupyter Notebooks. This is all available as a plugin to your learning management system (LMS) too, like Canvas, Blackboard and Open edX. You can read an article here where we discuss Jupyter Notebooks in Brightspace.
Automatically running a Jupyter Notebook
When grading a Jupyter Notebook manually, it is good practice to first run the notebook. CodeGrade can do this automatically.
To understand why this is good practice, it helps to understand the inner workings of a Jupyter Notebook first. In essence, a notebook is a simple JSON file that stores its cells. Besides the cells themselves, the output of each cell is also saved in this JSON, meaning that students hand in a notebook in a certain state. That state is not necessarily the latest one, and the output may even have been altered manually. With automatic grading, tests check the correctness of the actual code, but with manual grading the visual results are often leading. To make sure you are grading the latest state, you can automatically run the notebook using CodeGrade AutoTest.
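To make the point about stored state concrete, here is a minimal sketch that reads a notebook as plain JSON and lists the output saved with each code cell. The tiny notebook and the helper name `stored_outputs` are made up for illustration; the field names follow the nbformat 4 layout. Note that the saved output (`3`) does not match what the code would print, which is exactly the kind of mismatch re-running the notebook removes.

```python
import json

# A tiny hand-written notebook in the nbformat 4 layout (illustrative only).
NOTEBOOK_JSON = """
{
  "nbformat": 4, "nbformat_minor": 5, "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Report"]},
    {"cell_type": "code", "metadata": {}, "execution_count": 7,
     "source": ["print(1 + 1)"],
     "outputs": [{"output_type": "stream", "name": "stdout", "text": ["3\\n"]}]}
  ]
}
"""


def stored_outputs(notebook):
    """Return (execution_count, saved_stream_text) for every code cell."""
    results = []
    for cell in notebook["cells"]:
        if cell["cell_type"] != "code":
            continue
        # Stream outputs keep their text as a string or a list of strings.
        text = "".join(
            "".join(output.get("text", []))
            for output in cell.get("outputs", [])
            if output.get("output_type") == "stream"
        )
        results.append((cell.get("execution_count"), text))
    return results


notebook = json.loads(NOTEBOOK_JSON)
for count, text in stored_outputs(notebook):
    # The saved text is whatever was in the file, not what the code produces now.
    print(f"In [{count}] saved output: {text!r}")
```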
For this, we will use CodeGrade’s AutoTest Output functionality, which lets AutoTest generate output that is then displayed in our Code Viewer! CodeGrade has custom scripts that make this very easy, but with the pre-installed `jupyter` package you can also do it yourself using:
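One possible way to do this yourself, sketched under assumptions: the `nbconvert` tool that comes with `jupyter` can re-run a notebook from top to bottom with its `--execute` flag. The file names and the helper `build_execute_command` below are hypothetical, and the step that hands the executed notebook to AutoTest Output is left out, as it depends on your CodeGrade setup.

```python
import subprocess
import sys


def build_execute_command(notebook_path, output_name):
    """Build an nbconvert command that re-runs a notebook from top to bottom."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",        # write the result as a notebook again
        "--execute",               # run every cell before writing
        "--output", output_name,   # name of the freshly executed copy
        notebook_path,
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 run_notebook.py submission.ipynb executed.ipynb
    subprocess.run(build_execute_command(sys.argv[1], sys.argv[2]), check=True)
```

On the command line this corresponds to `jupyter nbconvert --to notebook --execute submission.ipynb --output executed.ipynb`.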
Jupyter Notebooks are essentially a wrapper around Python code, and therefore very suitable for automatic grading using CodeGrade! By converting the notebook to a regular Python script, we can use all the easy autograding tools and options we normally use for Python (see this webinar on autograding Python code) on our Jupyter Notebooks too!
Luckily, the `jupyter` package has a function that allows us to convert a Jupyter Notebook to a valid Python script. This Python script is simply constructed by appending all code cells in your Jupyter Notebook. You can use the following line of code in a Run Program step to achieve this:
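In practice, the conversion is usually done with `jupyter nbconvert --to script your_notebook.ipynb` (the notebook name here is a placeholder). To show what "appending all code cells" amounts to, the sketch below does a simplified version by hand using only the standard library. The helper `notebook_to_script` and the example notebook are hypothetical, and the real converter does more, for instance handling IPython magics.

```python
import json


def notebook_to_script(notebook_json):
    """Concatenate the source of all code cells into one Python script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            source = cell["source"]
            # nbformat allows the source to be a string or a list of lines.
            chunks.append(source if isinstance(source, str) else "".join(source))
    return "\n\n".join(chunks) + "\n"


# A made-up notebook: one markdown cell and two code cells.
EXAMPLE = json.dumps({
    "nbformat": 4, "nbformat_minor": 5, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["## Exercise 1"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["def add(a, b):\n", "    return a + b"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["result = add(1, 5)"]},
    ],
})

script = notebook_to_script(EXAMPLE)
print(script)
```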
In the same AutoTest category as this line, you can now interact with the generated script like you would normally do. One way to do this easily is using the Input/Output Tests (I/O Tests). Before interacting with the script, you will have to import it. We will do this by opening the Python interpreter with the following command: `python3 -ic "import your_script"`. By making this the “Program to test” in your I/O Test, you will be able to interact with the script using stdin and stdout. For example, you can give `print(your_script.function(1, 5))` as the input and write the value you expect it to print as the expected output.
As the input is regular Python code that is fed to the Python interpreter, you can call functions, do arithmetic operations and print variables, as the example above shows.
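To try such an I/O test locally before setting it up, something like the sketch below can help. It is not how CodeGrade runs the test internally; it simply starts the same `-ic "import your_script"` interpreter as a subprocess and feeds it the input. The student script and the helper `run_io_test` are stand-ins invented for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for the script generated from a student's notebook.
STUDENT_SCRIPT = "def function(a, b):\n    return a + b\n"


def run_io_test(stdin_text):
    """Start `python -ic "import your_script"` and feed it stdin_text."""
    with tempfile.TemporaryDirectory() as workdir:
        Path(workdir, "your_script.py").write_text(STUDENT_SCRIPT)
        completed = subprocess.run(
            [sys.executable, "-i", "-c", "import your_script"],
            input=stdin_text,      # what the I/O test would type
            capture_output=True,   # keep stdout separate from the prompts
            text=True,
            cwd=workdir,           # so that `import your_script` resolves
            timeout=60,
        )
    return completed.stdout


actual = run_io_test("print(your_script.function(1, 5))\n")
expected = "6"
print("pass" if actual.strip() == expected else "fail")
```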
Importing Python code without printing
One thing to be aware of is that we are checking the stdout of the script, which is run completely when it is imported. As a result, students can clutter the output with additional print statements outside of functions. There are two ways to prevent this:
Importing the script with the stdout redirected. This can be done using this little code snippet, which you run via `python3 -i import_without_print.py`:
```python
from contextlib import redirect_stdout
from os import devnull

# Discard everything the script prints while it is being imported.
with open(devnull, 'w') as sink, redirect_stdout(sink):
    import jupyter  # <-- The name of your script
```
If you are providing Jupyter Notebook skeleton code to your students, you can make sure to add an `if __name__ == "__main__":` statement every time you print solutions. This way, the students can see their solutions while interacting with the notebook, but these solutions will not be printed when importing the code (using the script above or with a regular import).
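A skeleton cell following the second approach could look like the sketch below. The function `mean` is a made-up exercise; the point is only where the `print` call sits.

```python
def mean(values):
    """Student solution: average of a non-empty list of numbers."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Runs inside the notebook and when the script is executed directly,
    # but not when the converted script is imported for testing.
    print("mean([1, 2, 3]) =", mean([1, 2, 3]))
```

Inside a notebook `__name__` is `"__main__"`, so students still see the printed result, while an `import` of the converted script stays silent.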
Automatically grading using unit testing
Just like with any programming language, unit testing is a great way to assess students’ Jupyter Notebooks easily and automatically. The I/O tests already allow you to check individual functions and variables, but they have their limitations. Unit testing Jupyter Notebooks is especially useful if you want to assess and interpret more advanced data types (although NumPy arrays and pandas DataFrames work well with I/O tests too!), if you need unit testing functionality such as testing for exceptions or writing complex tests, or if you want to use randomized input/output testing.
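As an illustration of the kinds of checks mentioned above (exceptions and randomized inputs), here is a minimal sketch using Python's standard `unittest` module. It does not use `cg-jupyter-unit` or nbgrader. The function `safe_divide` is a hypothetical student function, defined inline so the example runs on its own; in a real setup you would import it from the converted script instead.

```python
import random
import unittest


# Stand-in for a function from the converted notebook; in a real setup this
# would be something like `from your_script import safe_divide`.
def safe_divide(a, b):
    if b == 0:
        raise ValueError("division by zero")
    return a / b


class TestSafeDivide(unittest.TestCase):
    def test_regular_values(self):
        self.assertAlmostEqual(safe_divide(1, 4), 0.25)

    def test_zero_divisor_raises(self):
        # Hard to express as an I/O test, easy as a unit test.
        with self.assertRaises(ValueError):
            safe_divide(1, 0)

    def test_random_inputs(self):
        rng = random.Random(42)  # fixed seed keeps the test reproducible
        for _ in range(100):
            a = rng.uniform(-100, 100)
            b = rng.uniform(1, 100)
            self.assertAlmostEqual(safe_divide(a, b) * b, a)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSafeDivide)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```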
A very common way to unit test Jupyter Notebooks is using a tool called nbgrader, a great yet complex tool that is often used in education. CodeGrade helps you bring the functionality of nbgrader to the cloud with our own unit testing framework `cg-jupyter-unit`, which is in open beta right now. Would you like to try out `cg-jupyter-unit` yourself? Please send an email to firstname.lastname@example.org and we’d be happy to get you started with it!
Jupyter Notebooks in CodeGrade
With more and more instructors using Jupyter Notebooks in education and academia, it is becoming increasingly important to streamline how they are graded. Their unique format and characteristics make it more challenging to grade them manually in an efficient way, or to autograde them at all. This guide provides tools to do both in CodeGrade. It is by no means exhaustive, but it provides all the information required to start autograding Jupyter Notebooks for almost all programming assignments. Would you like to learn more about grading Jupyter Notebooks or do you have any questions regarding this guide? I’d be more than happy to help you out via email@example.com.
Co-founder, Product Expert
Devin is co-founder and Product Expert at CodeGrade. During his Computer Science studies and his work as a TA at the University of Amsterdam, he developed CodeGrade together with his co-founders to make their lives easier. Devin supports instructors with their programming courses, focusing on both their pedagogical needs and innovative technical possibilities. He also hosts CodeGrade's monthly webinar.