Zulip Chat Archive

Stream: maths

Topic: ring is slow with polynomials


view this post on Zulip Kevin Buzzard (Apr 05 2020 at 21:22):

Is this just a fact of life for some reason:

import data.polynomial

set_option profiler true

open polynomial

example (R : Type) [comm_ring R] (x : R) :
  (1 + x^2 + x^4 + x^6) * (1 + x) = 1+x+x^2+x^3+x^4+x^5+x^6+x^7 :=
by ring -- 5 seconds

example (R : Type) [comm_ring R] :
  (1 + X^2 + X^4 + X^6 : polynomial R) * (1 + X) = 1+X+X^2+X^3+X^4+X^5+X^6+X^7 :=
by ring -- 18 seconds

view this post on Zulip Mario Carneiro (Apr 05 2020 at 21:30):

I suspect that the majority of the time is coming from the few times I "save some time" by calling lean builtins instead of hand rolling exprs using the constructor

view this post on Zulip Kevin Buzzard (Apr 05 2020 at 21:31):

My workaround is to prove the top one and deduce the bottom one with a rw.

view this post on Zulip Mario Carneiro (Apr 05 2020 at 21:33):

 8612ms    37.1%   tactic.mk_mapp
...
 3405ms    14.7%   tactic.mk_app
...
    5ms     0.0%   expr.app
    3ms     0.0%   tactic.ring.cache.cs_app

the ones on the top are the lean builtins, the ones on the bottom are the hand rolled ones

view this post on Zulip Mario Carneiro (Apr 05 2020 at 21:33):

there is something seriously wrong with the lean internals here

view this post on Zulip Mario Carneiro (Apr 05 2020 at 21:34):

it's almost certainly calling typeclass inference, and that's responsible for the time spent, but I don't understand why the typeclass cache isn't working

view this post on Zulip Mario Carneiro (Apr 05 2020 at 21:36):

Actually, it might not be a bad idea to do something like that inside ring, that is, generalize the type of the expression before starting the proof proper

view this post on Zulip Mario Carneiro (Apr 05 2020 at 21:36):

that way those monstrous typeclass arguments are not copied everywhere, but rather are passed in via a have

view this post on Zulip orlando (Apr 05 2020 at 21:47):

Hey Kevin, with my computer it's faster with the polynomial ? It's normal ? 15 for the first and 11 for the second :upside_down:

view this post on Zulip Ryan Lahfa (Apr 05 2020 at 22:09):

FWIW, on a Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz, got 7 seconds for the first and 28 seconds for the second.

Profile for the 2nd one:

elaboration: tactic execution took 28.9s
num. allocated objects:  7769
num. allocated closures: 10240
28887ms   100.0%   _interaction._lambda_2
28887ms   100.0%   scope_trace
28887ms   100.0%   tactic.interactive.ring1._lambda_3
28887ms   100.0%   tactic.interactive.ring1
28887ms   100.0%   tactic.step
28887ms   100.0%   tactic.istep
28887ms   100.0%   tactic.interactive.ring
28887ms   100.0%   _interaction
28887ms   100.0%   interaction_monad_orelse
28887ms   100.0%   tactic.istep._lambda_1
28887ms   100.0%   tactic.interactive.ring._lambda_1
28886ms   100.0%   tactic.ring.ring_m.run
28886ms   100.0%   interaction_monad.monad._lambda_9
28872ms    99.9%   tactic.ring.lift
13373ms    46.3%   tactic.norm_num
10911ms    37.8%   tactic.mk_mapp
 9164ms    31.7%   tactic.ring.eval_add._main._lambda_2
 7687ms    26.6%   tactic.ring.eval_pow._main._lambda_2
 7154ms    24.8%   norm_num.eval_pow
 5977ms    20.7%   norm_num.derive
 5977ms    20.7%   tactic.ext_simplify_core
 5861ms    20.3%   norm_num.derive._main._lambda_3
 4157ms    14.4%   tactic.mk_app
 1846ms     6.4%   expr.of_nat._lambda_1

Profile for the 1st:

elaboration: tactic execution took 7.45s
num. allocated objects:  7771
num. allocated closures: 10240
 7445ms   100.0%   interaction_monad_orelse
 7445ms   100.0%   tactic.interactive.ring._lambda_1
 7445ms   100.0%   tactic.istep._lambda_1
 7445ms   100.0%   _interaction
 7445ms   100.0%   tactic.interactive.ring
 7445ms   100.0%   tactic.istep
 7445ms   100.0%   _interaction._lambda_2
 7445ms   100.0%   tactic.step
 7445ms   100.0%   scope_trace
 7445ms   100.0%   tactic.interactive.ring1
 7445ms   100.0%   tactic.interactive.ring1._lambda_3
 7444ms   100.0%   interaction_monad.monad._lambda_9
 7444ms   100.0%   tactic.ring.ring_m.run
 7434ms    99.9%   tactic.ring.lift
 3567ms    47.9%   tactic.norm_num
 2348ms    31.5%   tactic.ring.eval_add._main._lambda_2
 2147ms    28.8%   tactic.ring.eval_pow._main._lambda_2
 2120ms    28.5%   tactic.mk_mapp
 2050ms    27.5%   norm_num.eval_pow
 1698ms    22.8%   tactic.ext_simplify_core
 1698ms    22.8%   norm_num.derive
 1593ms    21.4%   norm_num.derive._main._lambda_3
 1440ms    19.3%   tactic.mk_app

view this post on Zulip Anne Baanen (Apr 06 2020 at 07:55):

I don't want to sound too repetitive,, but you should use ring_exp here too :slight_smile:

view this post on Zulip Anne Baanen (Apr 06 2020 at 08:01):

I get approximately the same times as Kevin for ring, but 150ms and 500ms for ring_exp. I'm actually surprised that the difference is so large

view this post on Zulip Kevin Buzzard (Apr 06 2020 at 08:11):

Being repetitive is very helpful for me -- I have to hear something several times before it sinks in. Does ring_exp not use ring then??

view this post on Zulip Kevin Buzzard (Apr 06 2020 at 08:15):

@Anne Baanen you just shaved 10 seconds off a Codewars Kata I'm working on, which might move it from "too slow to work on the site" to "killer question" :-)

view this post on Zulip Anne Baanen (Apr 06 2020 at 08:20):

Kevin Buzzard said:

Being repetitive is very helpful for me -- I have to hear something several times before it sinks in. Does ring_exp not use ring then??

It doesn't! Basically ring only knows how to represent + and *, but can do so efficiently (using the Horner form), and ring_exp trades in the efficiency on simpler problems to also deal with ^

view this post on Zulip Yury G. Kudryashov (Apr 06 2020 at 08:28):

I like "trades in the efficiency" together with "runs much faster".

view this post on Zulip Sebastien Gouezel (Apr 06 2020 at 08:31):

If in practice ring_exp is most often faster on reasonable problems, shouldn't we rename ring to ring_horner and ring_exp to ring? @Mario Carneiro

view this post on Zulip Patrick Massot (Apr 06 2020 at 08:31):

Reading assignment for student Kevin: https://lean-forward.github.io/ring_exp/paper.pdf

view this post on Zulip Kevin Buzzard (Apr 06 2020 at 08:34):

This looks like some implementation issue to me

view this post on Zulip Kevin Buzzard (Apr 06 2020 at 08:34):

ring = ring_exp, right? By ext.

view this post on Zulip Kevin Buzzard (Apr 06 2020 at 08:37):

So should Grégoire-Mahboubi be called "proving equalities in a comm ring done wrong in coq"? :P (sorry Assia, I love your paper!)

view this post on Zulip Johan Commelin (Apr 06 2020 at 08:44):

Kevin Buzzard said:

ring = ring_exp, right? By ext.

No, I don't think so. ring can't deal with (a + b)^2, whereas ring_exp can.

view this post on Zulip Anne Baanen (Apr 06 2020 at 08:52):

I wouldn't say that ring_exp is always faster in practice: linarith uses ring internally, and linear expressions are exactly the simpler kind that ring is better at. The Horner form is optimal in theory (if you use the correct definition of "optimal")

view this post on Zulip Rob Lewis (Apr 06 2020 at 08:53):

IIRC, there are some situations where ring is faster than ring_exp. Anne and I did some investigation when they were working on ring_exp and there was a performance regression when we just replaced ring entirely.

(Yep, what Anne said!)

view this post on Zulip Rob Lewis (Apr 06 2020 at 08:54):

Of course implementation details in both can shift the performance either way for specific problems.

view this post on Zulip Rob Lewis (Apr 06 2020 at 08:54):

It's probably possible to backport some of the ring_exp optimizations to ring.

view this post on Zulip Anne Baanen (Apr 06 2020 at 08:55):

Johan Commelin said:

ring can't deal with (a + b)^2, whereas ring_exp can.

It can deal with that specific case, because it first rewrites (a + b)^2 to (a + b)*(a + b). You can replace the 2 with n : ℕ to get a real counterexample

view this post on Zulip Rob Lewis (Apr 06 2020 at 08:56):

But what I'm hearing from all these questions about ring and polynomials is that there are still type class issues with polynomials. The inefficiency in ring is probably that it's repeating unnecessary searches, but these searches are slow, and that's the root issue.

view this post on Zulip Kevin Buzzard (Apr 06 2020 at 08:58):

Patrick Massot said:

Reading assignment for student Kevin: https://lean-forward.github.io/ring_exp/paper.pdf

@Anne Baanen typo in table of p3: "ab+c=ab+aca^{b+c}=a^b+a^c"

view this post on Zulip Kevin Buzzard (Apr 06 2020 at 09:01):

section 4 you seem to say "we tried random identities and ring_exp was slower" -- but I'm trying identities that come up in practice and ring_exp is a lot quicker :-)

view this post on Zulip Anne Baanen (Apr 06 2020 at 09:05):

Kevin Buzzard said:

Anne Baanen typo in table of p3: "ab+c=ab+aca^{b+c}=a^b+a^c"

Well spotted! I already fixed it in the updated version of the paper, which should be available soon.

view this post on Zulip Anne Baanen (Apr 06 2020 at 09:11):

Indeed, it's not the most informative comparison, but it's objective and can be explained in a few sentences when there is a 5 page limit, so I think I will have to live with it appearing a bit slower :slight_smile:

view this post on Zulip Mario Carneiro (Apr 06 2020 at 11:04):

To repeat: the reason ring is slow here has little to do with ring, it has to do with the lean functions that ring calls that are slow when they should not be. Yes, it is possible to make ring faster by avoiding these functions, but this essentially means that you aren't allowed to use the elaborator at all, you have to build a fully formed expr by hand if you want to get performance and that's ridiculous. I'll do it, ring is already 99% implemented like this and the remaning 1% of lean is causing the problem, but this really should be fixed at a lower level

view this post on Zulip Jasmin Blanchette (Apr 06 2020 at 11:30):

But Anne, you will have 6 pages for the final version, right? Maybe you can also show some examples from this thread.

view this post on Zulip Kevin Buzzard (Apr 06 2020 at 11:31):

and this thread

view this post on Zulip Anne Baanen (Apr 06 2020 at 11:45):

Yeah, I want to add a few more positive examples. But space is still pretty tight, since there are some places that the reviewers pointed out that they want more explanation. And also Mario's objection that the same optimizations can be applied to ring.

view this post on Zulip Mario Carneiro (Apr 06 2020 at 14:26):

Okay, I did a bunch of optimization (which is to say, mk_app considered harmful) and managed to get it down to 11 s. There are a lot of places that implicitly assume that calling mk_app is not that bad, so it's not easy going and the result looks horrible, but that's what you get for trying to circumvent lean internals. I don't think this is likely to get much better until I rewrite tactic.norm_num.

elaboration: tactic compilation took 5.97ms
elaboration: tactic compilation took 4.2ms
elaboration: tactic execution took 3.59s
num. allocated objects:  7909
num. allocated closures: 11047
 3594ms   100.0%   tactic.interactive.ring1
 3594ms   100.0%   scope_trace
 3594ms   100.0%   tactic.interactive.ring1._lambda_3
 3594ms   100.0%   tactic.step
 3594ms   100.0%   _interaction._lambda_2
 3594ms   100.0%   tactic.istep
 3594ms   100.0%   tactic.interactive.ring
 3594ms   100.0%   _interaction
 3594ms   100.0%   interaction_monad_orelse
 3594ms   100.0%   tactic.istep._lambda_1
 3594ms   100.0%   tactic.interactive.ring._lambda_1
 3593ms   100.0%   tactic.ring.ring_m.run
 3590ms    99.9%   tactic.ring.ring_m.run._lambda_1
 3590ms    99.9%   interaction_monad.monad._lambda_9
 3590ms    99.9%   tactic.using_new_ref
 2770ms    77.1%   tactic.norm_num
 1559ms    43.4%   tactic.ring.eval_add._main._lambda_2
 1519ms    42.3%   tactic.ring.eval_pow._main._lambda_2
 1517ms    42.2%   norm_num.eval_pow
 1240ms    34.5%   tactic.ext_simplify_core
 1240ms    34.5%   norm_num.derive
 1166ms    32.4%   norm_num.derive._main._lambda_3
  603ms    16.8%   tactic.mk_app
  202ms     5.6%   tactic.ring.eval_add._main._lambda_21
  141ms     3.9%   expr.of_nat._lambda_1
  107ms     3.0%   tactic.ring.eval_add._main._lambda_20
   85ms     2.4%   tactic.mk_mapp
   78ms     2.2%   tactic.ring.eval_pow._main._lambda_19
   42ms     1.2%   tactic.ring.eval_mul._main._lambda_2
   28ms     0.8%   tactic.ring.eval_atom._lambda_2
   18ms     0.5%   tactic.ring.eval_add._main._lambda_14
   16ms     0.4%   state_t.monad._lambda_19
   15ms     0.4%   reader_t.monad._lambda_19
   15ms     0.4%   tactic.ring.eval._main._lambda_42
   12ms     0.3%   char.of_nat
    7ms     0.2%   interaction_monad.failed
    6ms     0.2%   tactic.failed
    6ms     0.2%   tactic.ring.eval_add._main._lambda_13
    4ms     0.1%   norm_num.derive._main._lambda_2
    4ms     0.1%   tactic.ring.eval_add
    3ms     0.1%   tactic.ring.horner_expr.xadd'
    3ms     0.1%   tactic.ring.eval_add._main._lambda_23
    3ms     0.1%   tactic.mk_instance
    3ms     0.1%   expr.app
    2ms     0.1%   tactic.ring.ring_m.mk_app._lambda_2
    2ms     0.1%   tactic.ring.eval_add._main._lambda_22
    2ms     0.1%   and.decidable
    2ms     0.1%   tactic.ring.eval._main._lambda_43
    2ms     0.1%   tactic.ring.eval_add._main._lambda_4
    2ms     0.1%   norm_num.eval_div_ext
    2ms     0.1%   expr.mk_app
    1ms     0.0%   tactic.ring.eval_atom._lambda_3
    1ms     0.0%   tactic.ring.eval_add._main._lambda_6
    1ms     0.0%   interaction_monad.fail
    1ms     0.0%   tactic.ring.eval
    1ms     0.0%   tactic.ring.eval_add._main._lambda_15
    1ms     0.0%   state_t.monad._lambda_20
    1ms     0.0%   string.empty
    1ms     0.0%   expr.to_pos_nat
    1ms     0.0%   tactic.exact
    1ms     0.0%   or.decidable
    1ms     0.0%   tactic.ring.eval._main._lambda_29
    1ms     0.0%   expr.to_nat
elaboration: tactic execution took 11.5s
num. allocated objects:  7909
num. allocated closures: 11047
11461ms   100.0%   tactic.interactive.ring._lambda_1
11461ms   100.0%   tactic.istep._lambda_1
11461ms   100.0%   interaction_monad_orelse
11461ms   100.0%   _interaction
11461ms   100.0%   tactic.interactive.ring
11461ms   100.0%   tactic.istep
11461ms   100.0%   _interaction._lambda_2
11461ms   100.0%   tactic.step
11461ms   100.0%   tactic.interactive.ring1
11461ms   100.0%   scope_trace
11461ms   100.0%   tactic.interactive.ring1._lambda_3
11453ms    99.9%   tactic.ring.ring_m.run._lambda_1
11453ms    99.9%   interaction_monad.monad._lambda_9
11453ms    99.9%   tactic.using_new_ref
11453ms    99.9%   tactic.ring.ring_m.run
 9524ms    83.1%   tactic.norm_num
 5637ms    49.2%   tactic.ring.eval_add._main._lambda_2
 5253ms    45.8%   tactic.ring.eval_pow._main._lambda_2
 5251ms    45.8%   norm_num.eval_pow
 4390ms    38.3%   tactic.ext_simplify_core
 4390ms    38.3%   norm_num.derive
 4308ms    37.6%   norm_num.derive._main._lambda_3
 1710ms    14.9%   tactic.mk_app
  180ms     1.6%   tactic.ring.eval_mul._main._lambda_2
  156ms     1.4%   tactic.ring.eval_add._main._lambda_21
  115ms     1.0%   expr.of_nat._lambda_1
   90ms     0.8%   tactic.ring.eval_add._main._lambda_20
   66ms     0.6%   tactic.ring.eval_pow._main._lambda_19
   62ms     0.5%   tactic.mk_mapp
   21ms     0.2%   tactic.ring.eval_atom._lambda_2
   18ms     0.2%   tactic.ring.eval_add._main._lambda_14
   17ms     0.1%   tactic.ring.eval._main._lambda_42
    8ms     0.1%   char.of_nat
    8ms     0.1%   tactic.exact
    8ms     0.1%   state_t.monad._lambda_19
    7ms     0.1%   reader_t.monad._lambda_19
    7ms     0.1%   interaction_monad.failed
    5ms     0.0%   tactic.failed
    5ms     0.0%   tactic.ring.eval_add._main._lambda_13
    5ms     0.0%   expr.app
    3ms     0.0%   norm_num.eval_div_ext
    2ms     0.0%   expr.mk_app
    2ms     0.0%   tactic.ring.horner_expr.xadd'
    2ms     0.0%   tactic.ring.ring_m.mk_app._lambda_2
    1ms     0.0%   nat.binary_rec
    1ms     0.0%   tactic.ring.eval_pow._main._lambda_20
    1ms     0.0%   interaction_monad.monad._lambda_4
    1ms     0.0%   tactic.ring.eval_add._main._lambda_22
    1ms     0.0%   nat.bodd
    1ms     0.0%   string.push
    1ms     0.0%   norm_num.eval_ineq
    1ms     0.0%   tactic.ring.eval_pow._main._lambda_23
    1ms     0.0%   tactic.ring.eval_pow._main._lambda_22
    1ms     0.0%   tactic.ring.eval_pow
    1ms     0.0%   state_t.monad._lambda_20
    1ms     0.0%   norm_num.derive._main._lambda_2
    1ms     0.0%   nat.decidable_lt
    1ms     0.0%   reader_t.monad._lambda_20

view this post on Zulip Mario Carneiro (Apr 06 2020 at 14:30):

(as always, I am frustrated because I know that this job can be done in <1ms if you consider what literally has to happen to produce the term; everything beyond that is lean/vm bloat)

view this post on Zulip Patrick Massot (Apr 06 2020 at 14:31):

Why don't you hack^W develop Lean, instead? Would that be much harder?

view this post on Zulip Mario Carneiro (Apr 06 2020 at 14:34):

I don't want to touch typeclass inference with a ten foot pole

view this post on Zulip Mario Carneiro (Apr 06 2020 at 14:34):

maybe the issue in mk_app has a simple fix, maybe not


Last updated: May 10 2021 at 08:14 UTC