Zulip Chat Archive
Stream: general
Topic: performance improvements to ring
Mario Carneiro (Sep 02 2018 at 20:43):
@Rob Lewis The latest commit replaces some of the mk_app
applications used in ring
, and there was a huge performance gain. Now ring
will solve (x + y)^n = (x+y)^n
up to n = 60
before hitting the timeout, compared to n = 11
before
Mario Carneiro (Sep 02 2018 at 20:44):
example (ε : ℚ) : ε / 3 + ε / 3 + ε / 3 = ε := by ring
takes a fraction of a second now
Patrick Massot (Sep 02 2018 at 20:45):
Is this something we could learn from?
Mario Carneiro (Sep 02 2018 at 20:46):
mk_app
sucks
Mario Carneiro (Sep 02 2018 at 20:47):
There are still some mk_app
uses in ring, but the problems seem to be when some of the arguments are (large) proof terms, even if they aren't used in type inference
Mario Carneiro (Sep 02 2018 at 20:47):
I think it is typechecking the terms, which is a really bad idea since it gets done so often
Kevin Buzzard (Sep 02 2018 at 21:01):
good catch Mario. I have had a lot of success with ring in the past but I usually cleared denominators myself and then applied the tactic when there were no divisions left.
Mario Carneiro (Sep 02 2018 at 21:13):
the problem wasn't denominators per se. The major slowdown was in structural parts of the proof that are done in just about every proof. For example normalizing (x+y)^n
uses no division
Simon Hudon (Sep 02 2018 at 21:13):
I assume you're talking about tactic.mk_app
. What do you replace it with? I was told it was preferable to use it to to_expr
.
Mario Carneiro (Sep 02 2018 at 21:17):
mk_app
will build an application while performing typeclass inference and typechecking (or at least inferring the types) of all passed arguments. The alternative is just to build the expr yourself, using expr.app
or expr.mk_app
, which does no typechecking or inference. I thought the inference might be slow, which is why ring
uses an instance_cache
, but I think that this was not the problem. Now I just do the typeclass inference directly and then put the app together manually.
Mario Carneiro (Sep 02 2018 at 21:18):
meta def cache.mk_app (c : cache) (n inst : name) (l : list expr) : tactic expr := do m ← mk_instance ((expr.const inst [c.univ] : expr) c.α), return $ (@expr.const tt n [c.univ] c.α m).mk_app l
then
c.mk_app ``norm_num.subst_into_prod ``has_mul [l, e, tl, e, t, hl, hr, p₂],
Rob Lewis (Sep 02 2018 at 21:24):
That's awesome! I spent a little while profiling linarith
earlier today and was also seeing performance problems with mk_app
. It's not a straightforward story when to use it over to_expr
, and yeah, making expressions by hand is probably the way to go for critical applications.
Simon Hudon (Sep 02 2018 at 21:27):
It would be nice to have a more systematic approach to finding bottle necks. One way I could think of is to use Travis to publish metrics for every build.
Rob Lewis (Sep 02 2018 at 22:22):
@Mario Carneiro I'm about to go to bed, so I'm going to leave this here for you to think about if you want to, otherwise I'll investigate tomorrow. :slight_smile: Compare the two profiles here:
set_option profiler true constants (T : Type) (h : discrete_linear_ordered_field T) attribute [instance] h example (ε : T) : (41/42) * ε - (ε / 2 + ε / 3 + ε / 7) = 0 := by ring example (ε : ℚ) : (41/42) * ε - (ε / 2 + ε / 3 + ε / 7) = 0 := by ring
The second one takes twice as long and spends almost all its time in the final exact
. It's evaluating rational arithmetic in the kernel. The first one spends 20x longer in norm_num
but that's still better than kernel evaluation.
Mario Carneiro (Sep 02 2018 at 22:23):
My guess is that this is tactic.norm_num
's fault (the one built into core)
Rob Lewis (Sep 03 2018 at 11:50):
After chasing down a few false leads: it isn't tactic.norm_num
, although tactic.norm_num
is surprisingly sensitive to typeclass short circuits. These commands take roughly the same amount of time if you add enough short circuits for T
, otherwise the second takes 7x longer.
run_cmd prod.fst <$> norm_num `((5/60) + (1/70) : ℚ) >>= trace run_cmd prod.fst <$> norm_num `((5/60) + (1/70) : T) >>= trace
I find this a little surprising since norm_num
caches and pre-applies the instances, so it shouldn't be searching for any instance more than once. But this isn't what's causing the issue above.
Rob Lewis (Sep 03 2018 at 11:51):
When ring
solves (41/42)*ε - (ε / 2 + ε / 3 + ε / 7) = 0
, it actually proves (41/42)*ε + -(ε / 2 + ε / 3 + ε / 7) = 0
. Unifying these over ℚ
takes way longer than over T
for some reason.
Rob Lewis (Sep 03 2018 at 11:52):
There's an easy workaround: change the last lines of ring1
to
ntp ← infer_type p, tactic.change ntp ff, tactic.exact p
Rob Lewis (Sep 03 2018 at 11:54):
The kernel is much faster at checking that the proof from ring
has the expected type than exact
.
Mario Carneiro (Sep 03 2018 at 21:00):
Argh, this is so annoying. I inserted explicit change proof terms in the proof. I think it is fixed but keep me appraised if you find anything else.
Rob Lewis (Sep 04 2018 at 07:14):
Thanks! Will do.
Last updated: Dec 20 2023 at 11:08 UTC