Zulip Chat Archive
Stream: mathlib4
Topic: AdjoinRoot: gigantic terms take VS Code down
Kevin Buzzard (Jun 04 2023 at 15:25):
I was trying to make sense of why things are slow with AdjoinRoot
. The proof of docs4#AdjoinRoot.quotAdjoinRootEquivQuotPolynomialQuot_symm_mk_mk is just by rw [about 12 lemmas]
and right now in master we have set_option maxHeartbeats 300000 in
for this lemma. I thought it might be worth inspecting the full term, so I tried this:
import Mathlib.RingTheory.AdjoinRoot
noncomputable section
universe u v w
variable {R : Type u}
open Classical BigOperators Polynomial Ideal
namespace AdjoinRoot
open DoubleQuot
variable [CommRing R] (I : Ideal R) (f : R[X])
-- uncomment this to take your VS Code down
-- set_option pp.all true in
theorem quotAdjoinRootEquivQuotPolynomialQuot_symm_mk_mk' (p : R[X]) :
(quotAdjoinRootEquivQuotPolynomialQuot I f).symm
(Ideal.Quotient.mk (span ({f.map (Ideal.Quotient.mk I)} : Set (R ⧸ I)[X]))
(p.map (Ideal.Quotient.mk I))) =
Ideal.Quotient.mk (I.map (of f)) (mk f p) := by
sorry
end AdjoinRoot
end
but uncommenting set_option pp.all true
puts VS Code into some kind of catatonic state, where for a while the infoview doesn't display anything at all, and then finally it does display the theorem statement and the goal is so large that I cannot even inspect how large it is; the moment it is displayed (together with a very small scrollbar) VS Code becomes completely unresponsive.
Is this an indication of a problem? Or should Lean be able to handle gigantic terms which are too big to be displayed?
I don't think any of the 12 rewrites are in particular to blame; without pp.all each one is basically a bit slow, perhaps because the term is obscenely huge or perhaps for other reasons.
I managed to get command line Lean to output the term -- it's just over 37000 lines long. You can view it here.
Michael Stoll (Jun 04 2023 at 18:27):
There is a lot redundancy in this text -- gzip compresses it from 2955890 down to 54963 bytes (a factor of ~54). So I assume that internally, a lot of sharing will be going on. Still, to my naïve eyes, this looks bad. Is exponential blow-up of terms unavoidable in the long run?
Kevin Buzzard (Jun 04 2023 at 22:39):
Yes, this is really my question. I know nothing about programming, and I'm wondering why this proof is slow. Is the answer "because the term is 3 million characters long" or is the answer "it doesn't matter at all that the term is 3 million characters long when pretty printed because internally it is a completely reasonable size and the problem is somewhere else"?
Scott Morrison (Jun 05 2023 at 00:16):
For comparison, the corresponding theorem in mathlib3 has a goal that pretty prints to 767235 bytes (~1/4 the size), and compresses to 15000 bytes (pretty much the same ~51 compression ratio).
Scott Morrison (Jun 05 2023 at 00:18):
To be honest the goal type doesn't look that scary. It's a huge amount of duplication in pp.all
, but I think it is all going to be shared internally.
Scott Morrison (Jun 05 2023 at 00:25):
Understanding the difference between the mathlib3 and mathlib4 terms might be helpful, however.
Scott Morrison (Jun 05 2023 at 00:26):
The mathlib3 term is here.
Eric Wieser (Jun 05 2023 at 00:26):
How much of that 1/4 is due to not having stupidly long instance names?
Scott Morrison (Jun 05 2023 at 00:27):
Not much, I think. I renamed instHasQuotientIdealToSemiringToCommSemiring
and it makes little difference,.
Scott Morrison (Jun 05 2023 at 00:29):
It actually looks like a bigger cause of the inflation is lack of beta reduction on the mathlib4 side. There are lots of (fun (a : T) ↦ b) a
with large terms T
.
Scott Morrison (Jun 05 2023 at 00:32):
I am tempted to suggest that we refactor AdjoinRoot
, essentially getting rid of it almost entirely in favour of IsAdjoinRoot
. We could just have an opaque
definition for AdjoinRoot
if all the theory were developed immediately in terms of IsAdjoinRoot
.
Eric Wieser (Jun 05 2023 at 00:35):
I think if we do that we lose a bunch of scalar action structures
Last updated: Dec 20 2023 at 11:08 UTC