Zulip Chat Archive

Stream: general

Topic: I love the linter


Kevin Buzzard (Mar 26 2020 at 12:15):

I've seen people going on about the linters, but I didn't ever realise it was as simple as just going to the bottom of your file and typing #lint. This should be better known! I've already learnt something new about simp by looking at my first linter output.

: -/
#print polynomial_derivation.map_C_mul /- Left-hand side simplifies from
  ⇑d (C k * f)
to
  C k * ⇑d f + f * ⇑d (C k)
using
  [polynomial_derivation.map_mul]
Try to change the left-hand side to the simplified term!
 -/

I particularly love the emoji in the top line. Here's the set-up:

import tactic
import data.polynomial

structure polynomial_derivation (R : Type) [comm_ring R] :=
(to_fun : polynomial R  polynomial R) -- Delta
(map_add' :  (f g : polynomial R), to_fun (f + g) = to_fun f + to_fun g) --Delta2
(map_C_mul' :  (k : R) (f : polynomial R), to_fun (polynomial.C k * f) = polynomial.C k * to_fun f) --Delta3
(map_mul' :  (f g : polynomial R), to_fun (f * g) = f * to_fun g + g * to_fun f)

open polynomial

namespace polynomial_derivation

variables {R : Type} [comm_ring R]

/-- think of a polynomial derivation as a function from R[X] to R[X] -/
instance : has_coe_to_fun (polynomial_derivation R) :=
{ F := λ _, polynomial R  polynomial R,
  coe := to_fun -- should say `to_fun` once the derivation is defined
}

@[simp]
theorem map_add (d : polynomial_derivation R) :  {f g : polynomial R}, d (f + g) = d f + d g := d.map_add'

@[simp]
def map_C_mul (d : polynomial_derivation R):  (k : R) (f : polynomial R), d (C k * f) = C k * d f := d.map_C_mul'

@[simp]
def map_mul (d : polynomial_derivation R):  (f g : polynomial R), d (f * g) = f * d g + g * d f := d.map_mul'

Kevin Buzzard (Mar 26 2020 at 12:16):

What did I do wrong?

Mario Carneiro (Mar 26 2020 at 12:17):

map_C_mul is a special case of map_mul

Mario Carneiro (Mar 26 2020 at 12:18):

You could set the priority higher, or you could prove that d (C k) = 0 and then you will get the same result without it

Johan Commelin (Mar 26 2020 at 12:19):

Prove map_C, tag that one as simp and remove simp from map_C_mul.

Mario Carneiro (Mar 26 2020 at 12:19):

also, shouldn't you be getting a lint for marking those theorems as def?

Kevin Buzzard (Mar 26 2020 at 12:21):

This is something Shing and I are working on. It's still a WIP but nearly done.

Kevin Buzzard (Mar 26 2020 at 12:21):

Mario Carneiro said:

map_C_mul is a special case of map_mul

Only in the presence of map_C

Kevin Buzzard (Mar 26 2020 at 12:21):

So have we got the definition of the structure wrong?

Mario Carneiro (Mar 26 2020 at 12:23):

no, the lhs of map_C_mul is an instantiation of the lhs of map_mul

Mario Carneiro (Mar 26 2020 at 12:24):

that means that given input d (C k * f) they will race to see which will rewrite first

Kevin Buzzard (Mar 26 2020 at 12:24):

Yes I understand that, but the right hand side isn't, right?

Mario Carneiro (Mar 26 2020 at 12:24):

in the absence of map_C, that means it's not confluent

Kevin Buzzard (Mar 26 2020 at 12:24):

I know who wants to win that race though

Mario Carneiro (Mar 26 2020 at 12:25):

then set the priority of the winner higher

Kevin Buzzard (Mar 26 2020 at 12:25):

I have no real idea what the word confluent means.

Kevin Buzzard (Mar 26 2020 at 12:25):

Should I be changing priorities of winners, or should be aiming for full confluence compliance?

Mario Carneiro (Mar 26 2020 at 12:25):

confluent means that if A rewrites to B and A also rewrites to C, then B and C will eventually reach some common point D

Kevin Buzzard (Mar 26 2020 at 12:26):

right -- so that was exactly what didn't happen when we made all the structure fields simp lemmas

Mario Carneiro (Mar 26 2020 at 12:26):

You can make it confluent by adding map_C as a simp lemma

Kevin Buzzard (Mar 26 2020 at 12:26):

and removing simp from map_C_mul I guess

Mario Carneiro (Mar 26 2020 at 12:27):

not even

Kevin Buzzard (Mar 26 2020 at 12:27):

Do I or not?

Mario Carneiro (Mar 26 2020 at 12:27):

confluence means that they can diverge but they come back together

Mario Carneiro (Mar 26 2020 at 12:27):

if you have "shortcut" simp lemmas this can cause a confluent rewrite system even though there is not a unique path

Kevin Buzzard (Mar 26 2020 at 12:28):

Do they come back together? This will be a 50-50 chance depending on whether it's add_zero or zero_add which is definitions, right?

Kevin Buzzard (Mar 26 2020 at 12:28):

Oh no, who cares about defeq here

Mario Carneiro (Mar 26 2020 at 12:28):

d (C k * f) = C k * d f + f * d (C k) = C k * d f + f * 0 = C k * d f + 0 = C k * d f

Kevin Buzzard (Mar 26 2020 at 12:29):

There is some other relation -- simp_rel, which is I guess related to defeq

Kevin Buzzard (Mar 26 2020 at 12:29):

Right -- I don't care whether it's add_zero or zero_add which is the definitional one, I just need to make sure they are both tagged with @[simp]

Mario Carneiro (Mar 26 2020 at 12:30):

right, this is not a defeq thing

Mario Carneiro (Mar 26 2020 at 12:30):

note also that simp_rel is not symmetric like defeq

Kevin Buzzard (Mar 26 2020 at 12:30):

But I want to encourage simp to go the short way and not do the calculation you posted.

Mario Carneiro (Mar 26 2020 at 12:30):

that's what priority is for

Mario Carneiro (Mar 26 2020 at 12:31):

or you can just throw them in a bag and let them duke it out

Kevin Buzzard (Mar 26 2020 at 12:31):

So where is the AI that analyses my confluent rewriting system and says "I suggest you put the following priorities on these simp lemmas: ..."?

Mario Carneiro (Mar 26 2020 at 12:31):

hey, at least we detected the split

Kevin Buzzard (Mar 26 2020 at 12:32):

so is this hard? I thought that would be exactly the sort of thing you were good at

Mario Carneiro (Mar 26 2020 at 12:32):

figuring out that it's confluent is hard enough (undecidable?)

Kevin Buzzard (Mar 26 2020 at 12:32):

I know but the system here is tiny

Mario Carneiro (Mar 26 2020 at 12:32):

the simp set is massive

Kevin Buzzard (Mar 26 2020 at 12:32):

I know I will not be rewriting lists here, polynomials are atoms

Kevin Buzzard (Mar 26 2020 at 12:33):

I know to ignore all \le simp lemmas

Mario Carneiro (Mar 26 2020 at 12:33):

if you analyze the simplified problem, in some cases, you can prove confluence

Kevin Buzzard (Mar 26 2020 at 12:33):

I just want them off. I want to say "look, this is about polynomial rings, so you can have all polynomial ring simp lemmas and that's it, no junk about filters"

Mario Carneiro (Mar 26 2020 at 12:33):

You can also nolint it and move on with your life

Kevin Buzzard (Mar 26 2020 at 12:33):

"I don't care if someone accidentally imported topological spaces"

Johan Commelin (Mar 26 2020 at 12:34):

This is the moment where Kenny posts his remark about simpsets.

Kevin Buzzard (Mar 26 2020 at 12:34):

I'm all ears. I really felt like I learnt something new about simp today.

Mario Carneiro (Mar 26 2020 at 12:34):

the topological spaces lemmas don't apply, but it can be hard to determine that in advance

Kevin Buzzard (Mar 26 2020 at 12:35):

Thanks @Shing Tak Lam . He wrote the proofs but I did the simp labelling.

Mario Carneiro (Mar 26 2020 at 12:35):

What if your polynomial ring also happens to be a topological space?

Mario Carneiro (Mar 26 2020 at 12:36):

Simp is pretty smart about looking only at the relevant lemmas to apply, but it has the simple job, it has a term already and is rewriting it. Analyzing the simp set as a whole is much harder

Mario Carneiro (Mar 26 2020 at 12:36):

and yes, this is an active area of CS research

Mario Carneiro (Mar 26 2020 at 12:38):

if you want to read a wikipedia article, try https://en.wikipedia.org/wiki/Knuth%E2%80%93Bendix_completion_algorithm

Kevin Buzzard (Mar 26 2020 at 12:44):

Where are the docs for #lint? I want to switch all but one of them off.

Rob Lewis (Mar 26 2020 at 12:45):

https://leanprover-community.github.io/mathlib_docs/commands.html#linting%20commands

Rob Lewis (Mar 26 2020 at 12:45):

#lint only doc_blame

Kevin Buzzard (Mar 26 2020 at 12:46):

I :heart: the docs

Mario Carneiro (Mar 26 2020 at 12:49):

where's doc_thank?

Kevin Buzzard (Mar 26 2020 at 12:49):

Yeah I could just link to that

Kevin Buzzard (Mar 26 2020 at 12:50):

rofl I couldn't find it in the docs because I didn't know #lint was a command :-)

Mario Carneiro (Mar 26 2020 at 12:52):

I see a feature request: go to definition for user commands

Mario Carneiro (Mar 26 2020 at 12:53):

also hover with doc string

Kevin Buzzard (Mar 26 2020 at 12:57):

Yeah, these were the exact two things I tried before I asked here.

Gabriel Ebner (Mar 26 2020 at 13:34):

Mario Carneiro said:

figuring out that it's confluent is hard enough (undecidable?)

By Newman's lemma, confluence follows from local confluence and termination. Checking termination is undecidable (in general), but if we assume termination then we only need to check for local confluence, which is straightforward.
I've implemented this linter, but it didn't make it into the maintenance paper. You can look at the results here: https://gist.github.com/gebner/6f72773b22b18f6e1d4e4f8532aedfe8

The big issues are:

1. The implementation is really slow. It takes 3 hours to run.
2. In some cases, it takes even longer. There is a timeout implemented using try_for (shown as "timeout" in the output). But the linter framework doesn't work well with linters that fail non-deterministically (as is the case with try_for).
3. I'm not sure how useful the results are. Should all non-confluent pairs be fixed?

Mario Carneiro (Mar 26 2020 at 13:39):

Not all non-confluent pairs should be fixed. This probably gets even more dicey if we have nolint confluence lemmas and you somehow have to check confluence of everything else

Mario Carneiro (Mar 26 2020 at 13:41):

I have been working under the assumption that simp is only a best effort rewrite system with no pretensions to confluence, so I never made any serious attempt to keep the overall system confluent or nearly so

Mario Carneiro (Mar 26 2020 at 13:42):

and more to the point, as a #lint, fixing a confluence bug is not necessarily a thing you can just do. I would rather have more simplification at the cost of confluence

Kevin Buzzard (Mar 26 2020 at 13:44):

OK so here is a development of group theory in Lean. I always felt like at the beginning I was going uphill proving mul_one because the axiom was one_mul. Is this anything to do with confluent rewriting? And how do I figure out what lemmas to tag as simp? I just tried randomly -- as long as they passed the basic criteria (like being an equality rather than an implication, and the right hand side being a bit simpler than the left hand side, and everything being in canonical form) they were in.

Mario Carneiro (Mar 26 2020 at 13:44):

But the linter framework doesn't work well with linters that fail non-deterministically (as is the case with try_for).

But... but... what's all that business about (deterministic) timeout then?

Gabriel Ebner (Mar 26 2020 at 13:46):

Mario Carneiro said:

and more to the point, as a #lint, fixing a confluence bug is not necessarily a thing you can just do.

Most of the time the solution is to add more simp lemmas. It only becomes a problem if the new simp lemmas then introduce new confluence bugs. :slight_smile:

Kevin Buzzard (Mar 26 2020 at 13:48):

Hey linter

/- UNUSED ARGUMENTS: -/
#print mygroup.group.inv_eq_of_mul_eq_one /- argument 5: (h : a * b = 1) -/

what are you talking about?

Mario Carneiro (Mar 26 2020 at 13:49):

one example of a known non-confluence in simp is the rules around coercion. ↑x < ↑y <-> x < y, but also ↑(x + y) = ↑x + ↑y so ↑(a + b) < ↑c can simplify to either a + b < c or ↑a + ↑b < ↑c

Kevin Buzzard (Mar 26 2020 at 13:49):

Oh I see, a sorried proof and it complains that I didn't use the hypotheses.

Gabriel Ebner (Mar 26 2020 at 13:49):

Mario Carneiro said:

But... but... what's all that business about (deterministic) timeout then?

The deterministic part is unfortunately a bit optimistic in my experience. It just means that the timeout is not affected by how fast your machine is (or how many other programs you have running). But the timeout still depends on the state of the caches, the result of mk_fresh_name, etc.
So the linter would probably be deterministic if you run it twice on the same mathlib revision. But a tiny change could cause the linter to fail on unrelated declarations.

Gabriel Ebner (Mar 26 2020 at 13:51):

Mario Carneiro said:

one example of a known non-confluence in simp is the rules around coercion. ↑x < ↑y <-> x < y, but also ↑(x + y) = ↑x + ↑y so ↑(a + b) < ↑c can simplify to either a + b < c or ↑a + ↑b < ↑c

That's a good one. For completion, you'd need to add ↑a + ↑b < ↑c <-> a + b < c and ↑a + ↑b + ↑c < ↑d <-> a + b + c < d and so on.

Kevin Buzzard (Mar 26 2020 at 13:55):

let's get started!

Kevin Buzzard (Mar 26 2020 at 13:55):

how many do you think we'll need in practice?

Sebastian Ullrich (Mar 26 2020 at 13:56):

Obviously it's in parentheses because it's not always true. Uh.

Gabriel Ebner (Mar 26 2020 at 13:58):

Kevin Buzzard said:

OK so here is a development of group theory in Lean. [...] Is this anything to do with confluent rewriting?

Yes. In term rewriting systems, completion refers to the process of taking a set of equations (such as your group axioms) and producing a confluent and terminating set of rewrite rules (lemmas tagged as simp) with the property that every equation following from the group axioms is provable by simp. (For groups this works, and you can also do this automatically.)
There is a paper by Kaliszyk and Sternagel where they try to automatically derive a simp set in this way for all equations in the HOL light library: https://easychair.org/publications/paper/h6b

Kevin Buzzard (Mar 26 2020 at 14:00):

So simp-labelling is an art? No wonder I'm no good at it. I had assumed it was a science.

Mario Carneiro (Mar 26 2020 at 14:02):

there is a reasonably objective success metric, and an algorithm that will work in a large number of practical cases, but sometimes you have to be clever. In that sense it's like theorem proving

Kevin Buzzard (Mar 26 2020 at 14:02):

Why did I feel like I was going uphill until I'd proved mul_one and mul_right_inv, and then everything became trivial?

Patrick Massot (Mar 26 2020 at 15:34):

@Kevin Buzzard , I strongly suggests your should read this paper.

Kevin Buzzard (Mar 26 2020 at 16:06):

Thanks Patrick. Hey, they complain about list.length_singleton on p8. This feels a bit like the difference between {x} and {y | y = x}. @Mario Carneiro was talking about redefining {x} but perhaps [x] is a bit different because I can only think of cons x nil for the definition.


Last updated: Dec 20 2023 at 11:08 UTC