Zulip Chat Archive

Stream: PR reviews

Topic: lean#211 don't unfold irred defs


Johan Commelin (May 04 2020 at 14:51):

Continuing discussion from https://leanprover.zulipchat.com/#narrow/stream/116395-maths/topic/witt.20vectors

I suppose this PR might be somewhat controversial.

  1. It seems that it breaks mathlib in ways where before we could get away with _ and let Lean figure it out, but now we need to supply data ourselves.
  2. It's not yet clear whether this will give a speedup of slowdown of the build.
  3. Without this PR, it seems that Witt vectors are "doomed".

Kevin Buzzard (May 04 2020 at 14:53):

Are any of the core Lean experts in a position to explain why the PR makes Lean more stupid, and whether anything can be done about this stupidity whilst still enabling the Witt vectors work to compile in a reasonable amount of time?

Patrick Massot (May 04 2020 at 14:54):

I'm very worried by everything I read on this topic. I have no idea what is a Witt vector, so I guess I don't care about them. I care very much about Lean not being stupid.

Kevin Buzzard (May 04 2020 at 14:54):

Witt vectors are just a completely normal piece of algebra.

Kevin Buzzard (May 04 2020 at 14:56):

They solve the following problem. Given the ring Z/pZ, we want to define Z/p^2Z as a set by (Z/pZ)^2, and now we have to figure out how to define addition on it, so we need functions which send (a,b) in Z/pZ to 0 if a+b<0 and 1 if a+b>=p (the "carry digit"). This problem is solved by Witt polynomials, and Witt vectors are just (Z/pZ)^n with this structure of addition and multiplication. It all generalises to a large class of perfect fields.

Kevin Buzzard (May 04 2020 at 14:56):

Mathematically there is absolutely nothing remarkable about it.

Kevin Buzzard (May 04 2020 at 14:57):

You can take some limit of them, and then construct the p-adic integers Z_p from its residue field Z/pZ

Kevin Buzzard (May 04 2020 at 14:57):

Doing it with other finite fields builds a really interesting (to an arithmetician) class of complete discrete valuation rings.

Johan Commelin (May 04 2020 at 14:58):

Over at the Witt vectors topic, they seem to discuss this PR.
Here, on the topic of some technical PR, people discuss Witt vectors
/me rolls on the floor

Kevin Buzzard (May 04 2020 at 14:59):

If you start with the p-adic numbers, take an algebraic closure, take the ring of integers, reduce it mod p, perfect it, and then take the Witt vectors and invert p, you have made Fontaine's field B_{dR}, which you (Patrick) once told me was a ridiculous goal (in terms of getting Antoine to formalise it ;-) ), but actually you can see how close we are :-)

Reid Barton (May 04 2020 at 14:59):

Did someone try fixing all the errors in mathlib yet, or are we just guessing that they look pretty innocuous?

Kevin Buzzard (May 04 2020 at 15:00):

The first one looked innocuous, so by induction we should be OK.

Kevin Buzzard (May 04 2020 at 15:00):

and induction usually works great in Lean

Johan Commelin (May 04 2020 at 15:03):

I've never built C++ Lean before.

Johan Commelin (May 04 2020 at 15:03):

Let me try to do that, and try to fix mathlib.

Bryan Gin-ge Chen (May 04 2020 at 15:04):

I'm looking at them now. I've been able to fix the non-category theory ones so far.

Bryan Gin-ge Chen (May 04 2020 at 15:04):

I'll push what I have to a branch shortly.

Kevin Buzzard (May 04 2020 at 15:04):

So did filling in the _ with x * y in the algebra/opposite file solve that first problem?

Bryan Gin-ge Chen (May 04 2020 at 15:05):

For some reason x * y didn't work (maybe more coercions were needed somewhere), but _ * _ did.

Reid Barton (May 04 2020 at 15:06):

How about another question: do we understand why these places broke? Or at least one of them?

Bryan Gin-ge Chen (May 04 2020 at 15:06):

I don't!

Reid Barton (May 04 2020 at 15:09):

Hmm, interesting. So this one looked fairly simple:

/mathlib/src/data/real/hyperreal.lean:591:11: error: "eliminator" elaborator type mismatch, term
  is_st_add
has type
  is_st ?m_1 ?m_2  is_st ?m_3 ?m_4  is_st (?m_1 + ?m_3) (?m_2 + ?m_4)
but is expected to have type
  infinitesimal x  infinitesimal y  infinitesimal (x + y)

Reid Barton (May 04 2020 at 15:09):

def infinitesimal (x : *) := is_st x 0

Reid Barton (May 04 2020 at 15:10):

Presumably this was ultimately relying on the fact that (0 : ℝ) + 0 = 0!

Reid Barton (May 04 2020 at 15:10):

lemma is_st_add {x y : *} {r s : } : is_st x r  is_st y s  is_st (x + y) (r + s)

Reid Barton (May 04 2020 at 15:11):

lemma infinitesimal_add {x y : *} :
  infinitesimal x  infinitesimal y  infinitesimal (x + y) :=
zero_add 0  is_st_add

Reid Barton (May 04 2020 at 15:11):

It's almost like @[irreducible] is an obscure feature that nobody really understands??

Johan Commelin (May 04 2020 at 15:12):

Certainly I don't

Gabriel Ebner (May 04 2020 at 15:14):

It doesn't help that Lean ≤3.10 unfolds irreducible definitions in lots of unexpected places. When checking whether two types are definitionally equal (e.g. when assigning metavariables). When inferring types. When checking that a type is a pi-type, etc.

Bryan Gin-ge Chen (May 04 2020 at 15:15):

Here's the branch: https://github.com/leanprover-community/mathlib/tree/lean_pr_211

After you build the PR branch from lean#211, use elan toolchain link local <path/to/lean/> and then the above branch should pick up the new version of Lean.

Gabriel Ebner (May 04 2020 at 15:18):

I feel a bit uncomfortable only changing the transparency in a single place. Should we change it to semireducible everywhere? This is how it is in Lean 4.

Johan Commelin (May 04 2020 at 15:19):

@Bryan Gin-ge Chen How do I undo that elan change when I'm done?

Reid Barton (May 04 2020 at 15:20):

Gabriel Ebner said:

I feel a bit uncomfortable only changing the transparency in a single place. Should we change it to semireducible everywhere? This is how it is in Lean 4.

We should at least try it I think.

Reid Barton (May 04 2020 at 15:21):

Some of these other errors, like the ones in geometry/manifold/real_instances.lean:217, are more mysterious

Bryan Gin-ge Chen (May 04 2020 at 15:21):

@Johan Commelin The elan command just creates a new toolchain which you should be able to remove with elan toolchain uninstall. It shouldn't affect anything else unless you've already got a toolchain named local.

Johan Commelin (May 04 2020 at 15:22):

Ok, so should I also change the .toml?

Johan Commelin (May 04 2020 at 15:23):

Ooh, maybe you already did.

Johan Commelin (May 04 2020 at 15:23):

(I'm at 70% of the Lean build) :fencing:

Reid Barton (May 04 2020 at 15:23):

The toml is changed.

Bryan Gin-ge Chen (May 04 2020 at 15:23):

I've updated the gist with the errors, marking the few I was able to fix. I won't be able to work on this until later tonight, so feel free to push arbitrary changes to either the Lean PR branch or the mathlib branch.

Gabriel Ebner (May 04 2020 at 15:24):

Another cool elan trick: if you don't want to change the toml, you can run elan override set .... in the mathlib directory where you want a different lean.

Reid Barton (May 04 2020 at 15:26):

In particular several of these errors don't seem to have anything irreducible around, which is confusing

Gabriel Ebner (May 04 2020 at 15:26):

Can you point me to one of them?

Reid Barton (May 04 2020 at 15:26):

Oh whoops I forgot op has @[irreducible]

Reid Barton (May 04 2020 at 15:26):

Let me look harder

Reid Barton (May 04 2020 at 15:27):

Maybe src/data/pfun.lean:206?

Reid Barton (May 04 2020 at 15:29):

Or the ones in src/data/padics/padic_norm.lean? This file does import data.real.cau_seq but it's not obvious to me why (it doesn't seem to use the real numbers; padic_norm itself takes rational values)

Johan Commelin (May 04 2020 at 15:30):

How can I check that I'm running the correct Lean?

Gabriel Ebner (May 04 2020 at 15:30):

lean --version in the correct directory. Then check the git revision it prints.

Johan Commelin (May 04 2020 at 15:30):

Aah, right, thanks!

Johan Commelin (May 04 2020 at 15:31):

It seems to be ok

Gabriel Ebner (May 04 2020 at 15:31):

elan show will also tell you what "active toolchain" it uses

Reid Barton (May 04 2020 at 15:31):

or the ones in src/set_theory/cofinality.lean: the set_theory directory doesn't contain irreducible and it doesn't import much else, I think

Johan Commelin (May 04 2020 at 15:31):

What is the best way to build mathlib and stop at the first error?

Reid Barton (May 04 2020 at 15:32):

I use leanpkg build 2>&1 | less for this

Johan Commelin (May 04 2020 at 15:32):

But that just collects errors, right? It doesn't kill the build

Johan Commelin (May 04 2020 at 15:32):

But maybe that's fine

Reid Barton (May 04 2020 at 15:34):

I think the remaining errors either definitely involve something irreducible (opposite, real) or it seems like too much effort to figure out (data.polynomial?)

Reid Barton (May 04 2020 at 15:36):

Though, I don't understand specifically why some of the ones that do involve irreducible things failed, yet.

Johan Commelin (May 04 2020 at 15:38):

@Gabriel Ebner Would you mind pushing your suggested semireducibility changes to a branch on top of lean#211?

Gabriel Ebner (May 04 2020 at 15:39):

I think I understand a bit what happens in pfun: right now we also unfold proofs of theorems.

Johan Commelin (May 04 2020 at 15:39):

Then we can also build that version of Lean, and see how much mathlib likes that.

Johan Commelin (May 04 2020 at 15:39):

I'll be back after dinner.

Gabriel Ebner (May 04 2020 at 15:39):

Pushed.

Johan Commelin (May 04 2020 at 15:39):

Pushed where?

Gabriel Ebner (May 04 2020 at 15:40):

To the type_context_fix branch?

Johan Commelin (May 04 2020 at 15:40):

Ooh, I thought we might want both versions next to each other. But maybe not (-;

Johan Commelin (May 05 2020 at 08:15):

I fixed the opposite stuff in categories

Johan Commelin (May 05 2020 at 09:11):

Fixed the hyperreals

Johan Commelin (May 05 2020 at 09:15):

@Gabriel Ebner Could this also be implemented via set_option standard_reducibility blabla?

Johan Commelin (May 05 2020 at 09:16):

Then I could just switch that option on in the witt vector file, and the rest of mathlib doesn't have to care.

Gabriel Ebner (May 05 2020 at 09:17):

I think it's the right default. Irreducible definitions should not be unfolded, that's the explicit intention behind marking them as such. Besides, that's how it's going to be in :four_leaf_clover:, so we might as well get used to it now.

Johan Commelin (May 05 2020 at 09:27):

Ok... I'll try to fix as much as I can.

Johan Commelin (May 05 2020 at 09:28):

Note that in one of my fixes I cheated: I locally marked the definition as semireducible. But in my opinion the file was building API for the definition, so then it should be fine.

Johan Commelin (May 05 2020 at 09:31):

But some of the failure doesn't seem to have anything to do with irreducible defs.

Johan Commelin (May 05 2020 at 09:32):

Ooh, wait. It does.

Johan Commelin (May 05 2020 at 09:32):

There was an op somewhere :expressionless:

Gabriel Ebner (May 05 2020 at 09:32):

Lemmas are also irreducible. :smile: In some cases, Lean was able to infer arguments by unifying proofs of lemmas...

Johan Commelin (May 05 2020 at 09:37):

It broke on refine blabla _ _...

Johan Commelin (May 05 2020 at 09:40):

simp no longer solves

((cones_hom F  cones_inv F).app X c).app limits.walking_parallel_pair.zero 
      limits.limit.π (functor.of_function F.obj) j =
    ((𝟙 (functor.cones (diagram F))).app X c).app limits.walking_parallel_pair.zero 
      limits.limit.π (functor.of_function F.obj) j

Johan Commelin (May 05 2020 at 09:41):

We probably need some help from @Scott Morrison

Gabriel Ebner (May 05 2020 at 09:41):

Which file is this in?

Johan Commelin (May 05 2020 at 09:43):

category_theory/limits/shapes/constructions/limits_of_products_and_equalizers.lean

Johan Commelin (May 05 2020 at 09:43):

It seems that convert is brutal enough to replace the more refined refine. But the problem just shifts to the next line.

Johan Commelin (May 05 2020 at 09:44):

And there even convert gives up.

Johan Commelin (May 05 2020 at 09:46):

Which is not a surprise, because we went from 2 goals to 11 goals...

Johan Commelin (May 05 2020 at 09:48):

Gabriel Ebner said:

Lemmas are also irreducible. :smile: In some cases, Lean was able to infer arguments by unifying proofs of lemmas...

So secretly we could actually pull of the mathematicians trick of "by the object constructed in the proof of Theorem 5.4"...

Johan Commelin (May 05 2020 at 09:49):

We could even do it without saying anything!

Sebastien Gouezel (May 05 2020 at 09:52):

That's not just a trick, that's a very dirty trick.

Gabriel Ebner (May 05 2020 at 10:05):

category_theory/limits/shapes/constructions/limits_of_products_and_equalizers.lean uses pi.lift c.app : X ⟶ ∏ F.obj to fill in a goal of type unop X ⟶ ∏ F.obj. I guess @Scott Morrison needs to fix this properly.

Johan Commelin (May 05 2020 at 10:05):

@Gabriel Ebner I see that you also just locally mark things reducible?

Johan Commelin (May 05 2020 at 10:06):

I just pushed a fix of geometry/manifold/real_instances. It's quite dirty again.

Johan Commelin (May 05 2020 at 10:06):

--- a/src/geometry/manifold/real_instances.lean
+++ b/src/geometry/manifold/real_instances.lean
@@ -213,8 +213,10 @@ def Icc_left_chart (x y : ℝ) [fact (x < y)] :
   end,
   open_target := begin
     have : is_open {z : ℝ | z < y - x} := is_open_Iio,
-    have : is_open {z : fin 1 → ℝ | z 0 < y - x} :=
-      (continuous_apply 0) _ this,
+    have : is_open {z : fin 1 → ℝ | z 0 < y - x},
+    { have key : _ → _ := @continuous_apply (fin 1) (λ _, ℝ) _ 0,
+      dsimp at key,
+      exact key _ this },
     exact continuous_subtype_val _ this
   end,
   continuous_to_fun := begin

Gabriel Ebner (May 05 2020 at 10:07):

If I don't see a better solution, then yes, I just mark them reducible.

Kenny Lau (May 05 2020 at 10:07):

wow, diff is a language?

Johan Commelin (May 05 2020 at 10:08):

Well, you can syntax highlight them... makes sense right?

Kenny Lau (May 05 2020 at 10:08):

it looks like magic

Johan Commelin (May 05 2020 at 10:08):

pygments doesn't do magic
— eastern wisdom

Johan Commelin (May 05 2020 at 10:13):

/me :fencing: and it's looking good so far

Johan Commelin (May 05 2020 at 10:18):

Fixed bochner integration

Gabriel Ebner (May 05 2020 at 10:22):

--- a/src/geometry/manifold/real_instances.lean
+++ b/src/geometry/manifold/real_instances.lean
@@ -213,10 +213,8 @@ def Icc_left_chart (x y : ℝ) [fact (x < y)] :
   end,
   open_target := begin
     have : is_open {z : ℝ | z < y - x} := is_open_Iio,
-    have : is_open {z : fin 1 → ℝ | z 0 < y - x},
-    { have key : _ → _ := @continuous_apply (fin 1) (λ _, ℝ) _ 0,
-      dsimp at key,
-      exact key _ this },
+    have : is_open {z : fin 1 → ℝ | z 0 < y - x} :=
+      @continuous_apply (fin 1) (λ _, ℝ) _ 0 _ this,
     exact continuous_subtype_val _ this
   end,
   continuous_to_fun := begin

Johan Commelin (May 05 2020 at 10:58):

@Gabriel Ebner I thought I tried that!

Johan Commelin (May 05 2020 at 10:58):

It seems that this stuff is subtle/fragile... :sad:

Johan Commelin (May 05 2020 at 11:16):

:tada: :fencing: :thumbs_up:
Mathlib builds for me.

Johan Commelin (May 05 2020 at 11:17):

I just pushed

Kevin Buzzard (May 05 2020 at 11:17):

Does this mean Witt vectors are back on?

Johan Commelin (May 05 2020 at 11:19):

The total number of changes aren't even too big:
https://github.com/leanprover-community/mathlib/compare/lean_pr_211?expand=1

Johan Commelin (May 05 2020 at 11:19):

Kevin Buzzard said:

Does this mean Witt vectors are back on?

Well, first we need to get lean#211 merged, and then released and then this PR has to be merged into mathlib.

Kevin Buzzard (May 05 2020 at 11:20):

But it means you are unblocked, right? :D

Reid Barton (May 05 2020 at 11:25):

Not yet though. Maybe you should formalize sober spaces while you wait.

Reid Barton (May 05 2020 at 11:27):

Soooo, what about also having Lean not unfold proofs of lemmas.

Reid Barton (May 05 2020 at 11:28):

Currently I'm forced to admit that definitional proof irrelevance is "an obscure feature that nobody really understands".

Reid Barton (May 05 2020 at 11:29):

Or, wait. Is this change fixing that too?

Patrick Massot (May 05 2020 at 11:34):

Do we have any performance benchmark here?

Johan Commelin (May 05 2020 at 11:38):

It didn't feel like Lean was slower, when compiling mathlib

Johan Commelin (May 05 2020 at 11:42):

I am now running time nice leanpkg build on this mathlib.
When I'm done, I'll also run it on regular mathlib.
Check back in ~ 4 hours (-;

Reid Barton (May 05 2020 at 11:49):

Another thing I am a bit confused about: In general, the proofs of lemmas may not be immediately available. If unification wants to unfold one of these proofs, does that mean it joins on the task computing the proof?

Johan Commelin (May 05 2020 at 12:33):

time nice leanpkg build ### building with lean#211
real    48m45.006s
user    584m25.837s
sys     7m46.827s

Johan Commelin (May 05 2020 at 12:34):

Starting the "classic" build now.

Johan Commelin (May 05 2020 at 13:20):

time nice leanpkg build ### building with lean-3.10.0
real    45m12.958s
user    571m15.583s
sys     7m42.585s

Johan Commelin (May 05 2020 at 13:21):

So the build times seem very similar. I have no idea how big the noise factor is.

Johan Commelin (May 05 2020 at 13:21):

But we do seem to get a tiny slowdown

Johan Commelin (May 05 2020 at 13:21):

I would have expected that lean is trying less stuff, more conservative. So would it have been crazy to actually see a slight speedup instead?

Mario Carneiro (May 05 2020 at 13:21):

I'm surprised it's not a speedup

Johan Commelin (May 05 2020 at 13:26):

2.3% ... are we willing to sacrifice that?

Johan Commelin (May 05 2020 at 13:29):

Maybe we should include witt vectors in the comparison?

Johan Commelin (May 05 2020 at 13:30):

But more seriously... is there anything we can say about the noise? Or whether or not this should have been a speedup?

Johan Commelin (May 05 2020 at 13:30):

@Sebastian Ullrich Do you have some wise words to say?

Gabriel Ebner (May 05 2020 at 13:31):

I bet if you build mathlib five times, you'll see a larger variance.

Gabriel Ebner (May 05 2020 at 13:31):

Also how did you build lean? Compiler version, build type (debug/release)?

Johan Commelin (May 05 2020 at 13:31):

Aaah, does that matter?

Gabriel Ebner (May 05 2020 at 13:32):

It matters about one or two percent, I would imagine.

Johan Commelin (May 05 2020 at 13:32):

I see. I just followed the build instructions. Is there some config file you would want to see?

Johan Commelin (May 05 2020 at 13:33):

If you want, I can run a number of builds tonight. But maybe we just decide that this PR is going to get merged anyway?

Gabriel Ebner (May 05 2020 at 13:33):

I vote for merge.

Sebastian Ullrich (May 05 2020 at 13:33):

Yeah, there really isn't any conceivable reason to defend the old behavior

Johan Commelin (May 05 2020 at 13:34):

Unless it would be significantly faster

Sebastian Ullrich (May 05 2020 at 13:34):

Well, then obviously something else must be broken to make it slower!

Sebastian Ullrich (May 05 2020 at 13:38):

Btw, you might already be aware of http://speedcenter.informatik.kit.edu/lean/. I can't really recommend this particular software to use for benchmarking other projects such as mathlib, but we have some students who are working on an actually usable alternative, so maybe soon...

Johan Commelin (May 05 2020 at 13:40):

I was not aware of it

Johan Commelin (May 05 2020 at 13:40):

So, thanks (-;

Johan Commelin (May 05 2020 at 13:44):

When the usable alternative is there, it would be cool to add these stats to the community website

Rob Lewis (May 05 2020 at 14:13):

I think we're long overdue for some kind of mathlib performance benchmarking. It's way too easy to introduce invisible slowdowns.

Patrick Massot (May 05 2020 at 14:14):

I think everybody agrees, but everybody seem to be measuring only noise

Rob Lewis (May 05 2020 at 14:15):

Presumably there's some kind of service we can pay for that will do this with dedicated resources, right?

Reid Barton (May 05 2020 at 14:40):

Certainly one can rent dedicated servers, though they are moderately expensive and/or require custom infrastructure (saving $ by not running the server 24/7 means more complexity managing the server).

Reid Barton (May 05 2020 at 14:41):

Not sure whether there are off-the-shelf services of this kind. Probably worth discussing with the Lean 4 team.

Reid Barton (May 05 2020 at 14:49):

Actually we are already at the point where a reasonably beefy machine would be running at a nontrivial fraction of capacity if it did nothing but build mathlib commits. I tried one of the Scaleway HC-BM1-S bare-metal instances ("2× Intel® Xeon Silver 4114", whatever that means; 20 cores) and it took about 40 minutes to build mathlib once, so if there are ~240 commits in a month that comes to 160 hours. One month is about 730 hours.

Reid Barton (May 05 2020 at 14:50):

An alternative: Pick some metric(s) that Lean can count while building, for example number of entries to is_def_eq_core or whatever, and use that as a proxy for compilation time which is not affected by the hardware.

Bryan Gin-ge Chen (May 05 2020 at 15:05):

I'm just coming back to this. It looks like Gabriel just merged lean#211 into master :tada: . Thanks to Gabriel and Johan for following up on this!

Out of curiosity, how did the changes in the "Semireducible everywhere!' commit (https://github.com/leanprover-community/lean/pull/211/commits/d9293317c8cd3b77a7f94b20f05f7fc4164467e1) affect the number of errors in the mathlib branch?

Johan Commelin (May 05 2020 at 15:06):

See https://github.com/leanprover-community/mathlib/compare/lean_pr_211?expand=1

Johan Commelin (May 05 2020 at 15:06):

I don't think it got worse

Bryan Gin-ge Chen (May 05 2020 at 15:09):

Ah, OK. Can we revert any of the fixes to mathlib that were made before that commit, or is the new Lean still unable to fill in as many underscores as Lean 3.10.0?

Johan Commelin (May 05 2020 at 15:14):

Reid Barton said:

Actually we are already at the point where a reasonably beefy machine would be running at a nontrivial fraction of capacity if it did nothing but build mathlib commits. I tried one of the Scaleway HC-BM1-S bare-metal instances ("2× Intel® Xeon Silver 4114", whatever that means; 20 cores) and it took about 40 minutes to build mathlib once, so if there are ~240 commits in a month that comes to 160 hours. One month is about 730 hours.

Maybe we can improve Scott's build bot a bit? I think he mentioned averaging the build times of the last 5 builds, for example.

Johan Commelin (May 05 2020 at 15:14):

Bryan Gin-ge Chen said:

Ah, OK. Can we revert any of the fixes to mathlib that were made before that commit, or is the new Lean still unable to fill in as many underscores as Lean 3.10.0?

I didn't try

Johan Commelin (May 05 2020 at 15:14):

Feel free to golf the PR, while I'm having dinner (-;

Bryan Gin-ge Chen (May 06 2020 at 00:11):

No luck reverting any of the earlier changes. Oh well.


Last updated: Dec 20 2023 at 11:08 UTC