Zulip Chat Archive

Stream: Lean for teaching

Topic: auto-grading


Matthew Ballard (Dec 03 2021 at 16:25):

I was scheduled to teach an undergraduate class based on the Hitchhiker's guide for the spring semester. Unfortunately, 8:30 am is attractive time slot for students. Who knew. It's been pushed back to fall '22.

One of things I wanted to do was set up GitHub actions for auto-grading of Lean assignments. Something reusable would be nice. It looks straightforward to write some simple checks on the file.

  1. Do people have any existing setups?
  2. Is there any value in hardcore locking things down?

Heather Macbeth (Dec 03 2021 at 16:33):

I would love to have this, too.

Heather Macbeth (Dec 03 2021 at 16:35):

What I have done before is have students email me links to their code in the web editor (like this, cc @Andrew Souther ), so that I just click on the link and can confirm it works.

Matthew Ballard (Dec 03 2021 at 16:37):

For my previous Lean course, I used copy-pasta from MS Teams. This semester I got familiar with GitHub Classroom. I only use it to auto-generate pdfs from the tex source in my grad course the moment.

Heather Macbeth (Dec 03 2021 at 16:39):

I've been thinking of using Gradescope for this (I really like their features for handwritten assignments), but it's for-pay.
https://help.gradescope.com/article/h7ztxl9164-instructor-assignment-types#programming_assignments

Matthew Ballard (Dec 03 2021 at 16:40):

It looks functionally equivalent to GH Classroom's offering. Docs

Matthew Ballard (Dec 03 2021 at 16:44):

Which is really just some shortcuts for GH actions

Heather Macbeth (Dec 03 2021 at 16:44):

I'm worried about expecting my math majors to understand git, remote authentication, etc ... do you know what the user experience is like? (Can they just point-and-click upload a file?)

Matthew Ballard (Dec 03 2021 at 16:46):

I took the position that knowing git is a job skill even for math PhDs and then forced them to use it. GH has made it pretty painless especially if they use the GUIs for MacOS/Win.

Matthew Ballard (Dec 03 2021 at 16:47):

I also had a homework 0 which was git/GH training. Here is the organization page for the class.

Matthew Ballard (Dec 03 2021 at 16:48):

The main features I most enjoy about GH Classroom are

  1. the actions integration
  2. autocreation of PRs for each submitted assignment. It is so much easier to comment directly on a line of a file.

But overall it is pretty bare bones.

Matthew Ballard (Dec 03 2021 at 16:51):

@Rob Lewis mentioned he had some existing tools.

Rob Lewis (Dec 03 2021 at 17:21):

Brown has a Gradescope subscription, so this semester I used that. My TAs tried to set up a simple autograder, but it was clunky, and they hadn't learned to metaprogram so it would have fallen on me to do it proplerly. I decided to save myself the effort and wait until next time I teach the class, when my TAs will know some metaprogramming and can do something more robust.

Rob Lewis (Dec 03 2021 at 17:23):

The problem is that (depending on how you structure the assignments) a check for "does this file compile" is both too strict and too weak. You may want to allow students to sorry certain lemmas, or if every problem is in the same file, you'd need extra code to distinguish in which problem the comilation error happens.

Rob Lewis (Dec 03 2021 at 17:24):

Conversely, you'd want to check that nobody defines an unsound axiom and uses that as a sorry that doesn't raise any warnings.

Rob Lewis (Dec 03 2021 at 17:25):

Really you want a metaprogram grading script that checks for a declaration with a certain name and type, and computes which axioms it uses. But even this isn't enough, if you have problems that ask students to state a theorem on their own -- there's probably a range of acceptable answers.

Rob Lewis (Dec 03 2021 at 17:28):

I have some scripts for a course based on Logic and Proofs. Here the assignment structure was much simpler: there were two Lean assignments, each with 20 .lean files, each containing one sorryed theorem. So the autograder would import all 20 files, then run a metaprogram to check for sorry-free proofs with the right type.

Rob Lewis (Dec 03 2021 at 17:29):

Of course students still found ways to break that, out of ~150 submissions I'd have 15 that removed namespaces, or proved auxiliary lemmas with the same name in multiple files outside of the problem namespace, or whatever.

Rob Lewis (Dec 03 2021 at 17:30):

And this wasn't set up on GitHub or Gradescope. We used CoCalc so I'd just download a folder with all the submissions and the script would output a spreadsheet I could upload to Canvas.

Matthew Ballard (Dec 03 2021 at 17:37):

Rob Lewis said:

Of course students still found ways to break that, out of ~150 submissions I'd have 15 that removed namespaces, or proved auxiliary lemmas with the same name in multiple files outside of the problem namespace, or whatever.

Presumably, students would self-remedy if given immediate feedback on the assignment?

Rob Lewis (Dec 03 2021 at 17:41):

For sure. This wasn't doable with the CoCalc setup, I didn't know how to distribute an autograder without exposing the solutions. Certainly possible on Gradescope, and I assume on GitHub.

Rob Lewis (Dec 03 2021 at 17:42):

Er, actually, I guess the autograder for the L&P course didn't need to know the solutions. I take it back

Johan Commelin (Dec 03 2021 at 17:47):

It should be possible to write an autograder for Fermat's Last Theorem without knowing the solution :wink:

Matthew Ballard (Dec 03 2021 at 17:47):

Rob Lewis said:

I have some scripts for a course based on Logic and Proofs. Here the assignment structure was much simpler: there were two Lean assignments, each with 20 .lean files, each containing one sorryed theorem. So the autograder would import all 20 files, then run a metaprogram to check for sorry-free proofs with the right type.

Other than perhaps checking for some dubious axioms and some more descriptive feedback, I didn't imagine more than this.

I have no idea how to handle assignments where students roll their own statements.

Matthew Ballard (Dec 03 2021 at 17:49):

Another answer to what computers can do for mathematicians: save them from reading non-sense proofs. If a student can both go to a blackboard and explain a statement and prove it in lean, I have decent confidence they can write a solid natural language proof.

Patrick Massot (Dec 03 2021 at 18:42):

Matthew Ballard said:

Another answer to what computers can do for mathematicians: save them from reading non-sense proofs. If a student can both go to a blackboard and explain a statement and prove it in lean, I have decent confidence they can write a solid natural language proof.

That is extremely optimistic in my experience. That's why I developed my controlled natural language tactics, in order to help students transfer their Lean proofs to paper.

Matthew Ballard (Dec 03 2021 at 18:44):

But they could stand at a board and explain things well?

Apurva Nakade (Dec 03 2021 at 20:39):

Patrick Massot said:

Matthew Ballard said:

Another answer to what computers can do for mathematicians: save them from reading non-sense proofs. If a student can both go to a blackboard and explain a statement and prove it in lean, I have decent confidence they can write a solid natural language proof.

That is extremely optimistic in my experience. That's why I developed my controlled natural language tactics, in order to help students transfer their Lean proofs to paper.

@Patrick Massot Are these controlled NL tactics a part of mathlib? I would love to have these for teaching students as well.

Johan Commelin (Dec 03 2021 at 20:41):

They are not part of mathlib atm

Johan Commelin (Dec 03 2021 at 20:41):

https://github.com/PatrickMassot/lean-bavard

Rob Lewis (Dec 03 2021 at 20:47):

I don't think they're even in English yet

Rob Lewis (Dec 03 2021 at 20:47):

But a repo with these tactics would be a great addition to https://leanprover-community.github.io/lean_projects.html

Rob Lewis (Dec 03 2021 at 20:47):

And a course repo could use that project as a dependency as well as mathlib

Apurva Nakade (Dec 03 2021 at 20:49):

Thanks a lot! Oh wow, these are a lot of tactics!!

I'll try and translate these to English over the winter break.

Apurva Nakade (Dec 03 2021 at 20:50):

Is there any way to use these in the lean web editor or something web-based that does not require a lean installation?

Johan Commelin (Dec 03 2021 at 20:51):

Rob Lewis said:

I don't think they're even in English yet

There is a partial English translation. I don't know if it is on github. But there is a talk on Youtube by Patrick from a month ago, where he does an english demo

Rob Lewis (Dec 03 2021 at 20:54):

Apurva Nakade said:

Is there any way to use these in the lean web editor or something web-based that does not require a lean installation?

I guess you'd probably have to set up your own web editor and change the configuration slightly -- maybe here? @Bryan Gin-ge Chen would know better

Bryan Gin-ge Chen (Dec 03 2021 at 21:21):

Yeah, that's right. The leanprover-community/lean-web-editor repo README hopefully has enough information to get you started, otherwise feel free to ping me here.

Apurva Nakade (Dec 03 2021 at 22:36):

Great! I'll start hacking and see how it goes...

Patrick Massot (Dec 04 2021 at 13:19):

I know I need to find time to work on this. But time is extremely scarce. I need to clean up things to make them easier to translate. And there are additional challenges with English because Lean already uses some English words that can't be reused.

Kevin Buzzard (Dec 04 2021 at 14:11):

One issue with the English clash is that mathematicians seem to use "apply" to mean so many different things. I think it's a really hard tactic to teach! Not because what it does is difficult, but because even if you know what it does, it is still a word which pops into your head every time you want to apply a theorem (which might be an iff statement) or apply a theorem to a hypothesis, or basically use any technique at all on any part of the local context by applying the technique.

Mac (Dec 04 2021 at 15:30):

Kevin Buzzard said:

you want to apply a theorem (which might be an iff statement) or apply a theorem to a hypothesis, or basically use any technique at all on any part of the local context by applying the technique.

I feel like it may be possible to generalize apply to these cases as well. For example, one could add an apply f at h to apply functions to local hypotheses. Also, an iff statement could potentially have a coe_fun instance that coerces it to mp for apply.

Johan Commelin (Dec 04 2021 at 15:40):

@Mac I've seen examples where new users wanted to apply and iff where they meant rw. I completely understand why they might want to use apply there, because that's the word you would use informally. I think that "rewrite" is almost never used in informal maths.

Mac (Dec 04 2021 at 15:44):

@Johan Commelin fair enough. In that case there is an unfortunate clash between functional terminology and proof terminology that does not have an easy answer. If one was writing more natural tactics it may be a good idea to rename apply to fapply, apply_fun, or apply function and use plain apply for rewrite/rw (or potentially simp).

Apurva Nakade (Dec 04 2021 at 22:46):

I agree with these points. I myself find it hard to use apply and I only ever use it retrospectively to compactify. Perhaps suffices is a replacement? rw and ext are words that don't translate easily into math. And I don't quite get the differences betweenlet set and have lol.

I still think it would be worthwhile to try and get NL tactics, even if they end up being clumsy and inefficient.

Apurva Nakade (Dec 04 2021 at 22:50):

I'll try to translate at least a few of Patrick's tactics. I really want to see some basic proof in Lean being more or less indistinguishable from a math proof on paper.

Patrick Massot (Dec 04 2021 at 23:06):

For teaching purposes, my experience is that the vagueness of the "apply" word is indeed extremely problematic (we have the same issue in French). And I also agree that rw doesn't translate well on paper. I'm not happy with my NL version of rw which doesn't sound natural (because nothing would).

Patrick Massot (Dec 04 2021 at 23:08):

About let and have, I sometimes wish Lean would distinguish between those. It's a trap that almost all beginners fall into. But for teaching purposes I'm very happy that Lean distinguishes them. I don't want my students to mix defining data and asserting a statement.


Last updated: Dec 20 2023 at 11:08 UTC