Zulip Chat Archive

Stream: general

Topic: Generalising typeclasses


Alex J. Best (Dec 15 2020 at 10:40):

I've been working on a tactic/linter to find declarations with typeclass arguments that are overly specific, in the sense that in the proof they are only ever used to produce other instances which are more general.
E.g. if I state the lemma

  lemma bad (G : Type*) [comm_group G] (n : ) (g : G) (h : g^(-n) = 1) : g^n = 1 :=
  begin
    rw [gpow_neg, inv_eq_one] at h,
    exact h,
  end

the tactic notices that the proof didn't actually use the commutative structure, only the [group G] instance found via comm_group.to_group. I hope that by making these sorts of changes we can end up with a more complete library (with less obvious small lemmas missing) with a lot less human effort than finding these generalisations by hand when they're needed, the nature of the process should be that proofs will never need fixing and so simply changing the typeclass assumptions will give us more general lemmas for free.
The potential downside is that in some cases we might end up making the typeclass system work harder than it needs to: If all the lemmas are stated in their most general form, rather than the form they are actually used in, then applying these lemmas could cause a long search every time they are applied, rather than just once at the time they are proved (e.g. if I am writing a theory of groups that nobody will ever use any of the lemmas in for semigroups/monoids then having [group G] as the assumption throughout would in theory typecheck faster than if I change half the lemmas to require semigroup and then only ever apply them with groups). I haven't really tested the impact of this yet and probably this is the sort of thing that would need to be decided on a case-by-case basis anyway.

This tactic still needs some refining to work when the implications between different typeclasses are more subtle (diamonds), to flag multiple generalisation steps at once, and to rule out more false positives. But the version I have already produces a large number of fairly interesting results, I've prepared an (unpolished) branch with a pretty random sample of some the declarations flagged by the tactic changed over at https://github.com/leanprover-community/mathlib/compare/generalising_typeclasses to give a sense of the sort of thing this is finding. Many of these are things like removing commutativity assumptions or weakening assumptions on being partial orders to preorders, fields to being rings, topological groups to just has_continuous_mul's.
For this branch I've only generalised lemmas rather than definitions as how general we should make definitions is an even trickier design question in my mind than for lemmas.

My question to the maintainers is, would you welcome a sequence of PR's making changes of the sort in the branch (say split up by folder / area to make each one easy to review)? It is quite possible that some of the changes will not be appropriate or desired, due to various design decisions in different areas of mathlib that I'm not aware of.
I'd like to keep making PR's as I improve the tactic, so as to spread the work of merging any changes out over time, and would of course like to PR the tactic itself when its complete, as I hope that like the other linters it will be useful to people as they are working on developing a new theory.

This sort of generalisation is an iterative process, as typeclasses are generalised in some lemmas, any dependent lemmas may also become generalisable in the same way so I'd rather start making PRs than end up with a branch massively diverged from mathlib as I'm tweaking things. I'd also rather get some changes made before the linter is done and PR'd to avoid ending up with a huge nolints file associated :smile:.

Gabriel Ebner (Dec 15 2020 at 10:47):

More interestingly, where is the tactic/linter code?

Alex J. Best (Dec 15 2020 at 10:49):

Its currently a complete mess, sorry! I'll try and clean it up and make it public as soon as it is actually readable.

Gabriel Ebner (Dec 15 2020 at 10:49):

Alex J. Best said:

The potential downside is that in some cases we might end up making the typeclass system work harder than it needs to: If all the lemmas are stated in their most general form, rather than the form they are actually used in, then applying these lemmas could cause a long search every time they are applied, rather than just once at the time they are proved.

This is mathlib, stating lemmas in maximal generality using type classes is an intentional design choice. (AKA, we already have this "issue" all over the place.)

Gabriel Ebner (Dec 15 2020 at 10:51):

Alex J. Best said:

For this branch I've only generalised lemmas rather than definitions as how general we should make definitions is an even trickier design question in my mind than for lemmas.

I think the answer is the same as for the unused arguments linter: put a nolint there if it's really, really what you want. But in general I'm suspicious of unnecessary type-class arguments (see also the discussion about whether nontrivial should have a has_zero argument).

Gabriel Ebner (Dec 15 2020 at 10:57):

Alex J. Best said:

This sort of generalisation is an iterative process, as typeclasses are generalised in some lemmas, any dependent lemmas may also become generalisable in the same way so I'd rather start making PRs than end up with a branch massively diverged from mathlib as I'm tweaking things. I'd also rather get some changes made before the linter is done and PR'd to avoid ending up with a huge nolints file associated :).

From a strategic point of view, what we've been doing with the other linters is exactly the opposite: PR a first weak version of the linter to mathlib and fix the linter warnings, and then improve the linter in follow-up PRs.
If you'd like to get the linter in first before fixing the issues, then changes to the nolints.txt file are also accepted (that's what it's for).

Kevin Buzzard (Dec 15 2020 at 10:58):

I am currently adding random weird typeclasses like canonical_linear_ordered_monoid or whatever; will the linter notice them when they appear and then start complaining about various other lemmas now having slightly the wrong assumptions?

Johan Commelin (Dec 15 2020 at 11:07):

Kevin, if I understand it correctly, then yes. That's what will happen.

Johan Commelin (Dec 15 2020 at 11:08):

@Alex J. Best This sounds great! Nice effort!

Eric Wieser (Dec 15 2020 at 11:09):

If that's a problem Kevin, we could always have a lint rule that automatically ignores new typeclasses by adding all violations it finds when the typeclass is first added to a nolints file; which would at least stop any new lemmas using the overly strict typeclass

Kevin Buzzard (Dec 15 2020 at 11:18):

I don't think it's a problem at all. In fact one of the problems with making new typeclasses is not knowing which lemmas have been proved for specialisations of those typeclasses which should really be proved in greater generality

Alex J. Best (Dec 15 2020 at 11:32):

Gabriel Ebner said:

This is mathlib, stating lemmas in maximal generality using type classes is an intentional design choice. (AKA, we already have this "issue" all over the place.)

Thanks, ok I will worry less about this then :smile:
Gabriel Ebner said:

From a strategic point of view, what we've been doing with the other linters is exactly the opposite: PR a first weak version of the linter to mathlib and fix the linter warnings, and then improve the linter in follow-up PRs.
If you'd like to get the linter in first before fixing the issues, then changes to the nolints.txt file are also accepted (that's what it's for).

I'm hoping to have the better version working fairly soon, and I'm hoping the output of that version will be much more useful. But if it's harder to get working than I'm hoping I'll try and PR a cleaned up basic version instead, at present there are a few too many false positives for it to be mathlib ready I think unfortunately.

Alex J. Best (Dec 15 2020 at 11:35):

Kevin Buzzard said:

I am currently adding random weird typeclasses like canonical_linear_ordered_monoid or whatever; will the linter notice them when they appear and then start complaining about various other lemmas now having slightly the wrong assumptions?

Ideally (in my mind) yes, as soon as a new class is added with the right instances to and from existing ones it should show up. It does depend a bit on the implementation though, how much searching is done and as always there is some trade off between how long it takes the linter to run and how effective it can be.

Kevin Buzzard (Dec 15 2020 at 11:42):

With something like this I would imagine there's an argument for running it once, being very effective and very slow, to see what kind of a mess we're in. This is a really cool linter! I'm pretty sure Johan would have loved it when he added groups with zero!

Floris van Doorn (Dec 15 2020 at 18:41):

This is a very cool. Writing this linter has been somewhere on my to-do list for a while, so it's really cool you actually wrote it! PRs with these changes are definitely welcome to mathlib.


Last updated: Dec 20 2023 at 11:08 UTC