Zulip Chat Archive

Stream: mathlib4

Topic: simp proofs depend on order in which lemmas are tried


Joachim Breitner (May 15 2024 at 13:09):

I noticed that simp would sometimes try more general lemmas before more specific, doing unnecessary work (lean4#4173), so I experimented with a different order (lean4#4158), but (unfortunately, but not unexpectedly) breaks a fair number of proofs. Even simp only proofs, which we consider to be a rather stable, are affected!

You can see some of the changes I had to in https://github.com/leanprover-community/mathlib4/compare/nightly-testing...lean-pr-testing-4158.

There are some interesting insights here. For example can reveal that a certain lemma in the default simp set really isn’t that useful (e.g. List.length_pos, un-simp’ing proposed in #4172).

It can also reveal proofs that really only through mostly by accident, for example

@@ -1118,7 +1118,9 @@ lemma mul_div_mul_comm_of_dvd_dvd (hba : b ∣ a) (hdc : d ∣ c) :
 lemma pow_mod (a b n : ℕ) : a ^ b % n = (a % n) ^ b % n := by
   induction b with
   | zero => rfl
-  | succ b ih => simp [Nat.pow_succ, Nat.mul_mod, ih]
+  | succ b ih =>
+    conv_lhs => rw [Nat.pow_succ, Nat.mul_mod, ih]
+    conv_rhs => rw [Nat.pow_succ, Nat.mul_mod, Nat.mod_mod]

where Nat.mul_mod is an inherently looping rewrite rules.

Similarly, in

 theorem zipWith_distrib_tail : (zipWith f l l').tail = zipWith f l.tail l'.tail := by
-  rw [← drop_one]; simp [zipWith_distrib_drop]
+  simp only [← drop_one, zipWith_distrib_drop]

the original proof only worked because simp did not happen to rewrite with drop_one, as it would otherwise just undo the rw before.

I am a bit unsure what to do with these observations, though.

I would be curious if lean4#4158 yields a noticable performance improvement. But having to fix all of mathlib for that alone is probably too much work. And since I expect this to show up in particular for structurally recursive definitions with a catch-all match statement, mathlib probably won’t provide any signal at all.

Another motivation to push through fixing this branch is to discover more “issues” with the existing mathlib proofs or the default simp setup. I probably won’t be able to do that, but maybe someone else is interested in this experiment? (If so, feel free to just use my branch).

And maybe, independent of this draft PR, there may be some take away here? If math values robust proofs, could and should we lint simp proofs for non-confluence issues? Exhaustively exploring all rewrites, when more than one applies, is likely far too slow and hard to do with the simp machinery. But I could imagine an option that makes simp reverse the order of all candidates, or even a set_option simp_shuffle_seed that lets simp deterministically shuffles all candidates (of the same priority), and then CI could ensure that all simp proofs in Mathlib are robust against changes of lemma order. Would this be useful enough, given that it isn’t a particularly cheap thing to lint for, and probably often annoying.

Johan Commelin (May 15 2024 at 13:12):

Interesting experiment! Do you have an indication of how much effort fixing the branch would be?

Johan Commelin (May 15 2024 at 13:12):

As in, which fraction of mathlib did you fix, and how long did it take you?

Johan Commelin (May 15 2024 at 13:14):

Aha, I just looked at your branch. You changed ~17 files, and they are all pretty low-level. So probably there will be a lot that needs to be fixed. :oops:

Joachim Breitner (May 15 2024 at 13:14):

I touched 14 files so far, it seems that 1163 files work now, and cache still tries to download 3316 files. So maybe a few dozen proofs to look at at that rate?

Johan Commelin (May 15 2024 at 13:16):

Ooh, ok. That's better than I expected. I thought maybe ~200 files would work now.

Matthew Ballard (May 15 2024 at 13:23):

Default to filling with sorry and only fix things that contain data. Then you can bench it.

Johan Commelin (May 15 2024 at 13:26):

But isn't sorry much faster than simp + rest of proof?

Joachim Breitner (May 15 2024 at 13:27):

I can bench the two sorried’s versions against each other, to compare the impact of a change to lean4, that’s true. Does the speedcenter accept mathlib versions with sorrys?

Matthew Ballard (May 15 2024 at 13:30):

Johan Commelin said:

But isn't sorry much faster than simp + rest of proof?

Yeah, but you can still often see the signal through this. Or as @Joachim Breitner suggested, you can normalize the comp.

Yaël Dillies (May 15 2024 at 13:44):

Joachim Breitner said:

It can also reveal proofs that really only through mostly by accident, for example [...]

I myself needed a better fix for this proof ages ago. Don't remember which PR of mine this is in. Probably the best way to find out is to merge them all :stuck_out_tongue:


Last updated: May 02 2025 at 03:31 UTC