# Zulip Chat Archive

## Stream: maths

### Topic: ring is slow with polynomials

#### Kevin Buzzard (Apr 05 2020 at 21:22):

Is this just a fact of life for some reason:

import data.polynomial set_option profiler true open polynomial example (R : Type) [comm_ring R] (x : R) : (1 + x^2 + x^4 + x^6) * (1 + x) = 1+x+x^2+x^3+x^4+x^5+x^6+x^7 := by ring -- 5 seconds example (R : Type) [comm_ring R] : (1 + X^2 + X^4 + X^6 : polynomial R) * (1 + X) = 1+X+X^2+X^3+X^4+X^5+X^6+X^7 := by ring -- 18 seconds

#### Mario Carneiro (Apr 05 2020 at 21:30):

I suspect that the majority of the time is coming from the few times I "save some time" by calling lean builtins instead of hand rolling exprs using the constructor

#### Kevin Buzzard (Apr 05 2020 at 21:31):

My workaround is to prove the top one and deduce the bottom one with a rw.

#### Mario Carneiro (Apr 05 2020 at 21:33):

8612ms 37.1% tactic.mk_mapp ... 3405ms 14.7% tactic.mk_app ... 5ms 0.0% expr.app 3ms 0.0% tactic.ring.cache.cs_app

the ones on the top are the lean builtins, the ones on the bottom are the hand rolled ones

#### Mario Carneiro (Apr 05 2020 at 21:33):

there is something seriously wrong with the lean internals here

#### Mario Carneiro (Apr 05 2020 at 21:34):

it's almost certainly calling typeclass inference, and that's responsible for the time spent, but I don't understand why the typeclass cache isn't working

#### Mario Carneiro (Apr 05 2020 at 21:36):

Actually, it might not be a bad idea to do something like that inside ring, that is, generalize the type of the expression before starting the proof proper

#### Mario Carneiro (Apr 05 2020 at 21:36):

that way those monstrous typeclass arguments are not copied everywhere, but rather are passed in via a `have`

#### orlando (Apr 05 2020 at 21:47):

Hey Kevin, with my computer it's faster with the polynomial ? It's normal ? 15 for the first and 11 for the second :upside_down:

#### Ryan Lahfa (Apr 05 2020 at 22:09):

FWIW, on a Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz, got 7 seconds for the first and 28 seconds for the second.

Profile for the 2nd one:

elaboration: tactic execution took 28.9s num. allocated objects: 7769 num. allocated closures: 10240 28887ms 100.0% _interaction._lambda_2 28887ms 100.0% scope_trace 28887ms 100.0% tactic.interactive.ring1._lambda_3 28887ms 100.0% tactic.interactive.ring1 28887ms 100.0% tactic.step 28887ms 100.0% tactic.istep 28887ms 100.0% tactic.interactive.ring 28887ms 100.0% _interaction 28887ms 100.0% interaction_monad_orelse 28887ms 100.0% tactic.istep._lambda_1 28887ms 100.0% tactic.interactive.ring._lambda_1 28886ms 100.0% tactic.ring.ring_m.run 28886ms 100.0% interaction_monad.monad._lambda_9 28872ms 99.9% tactic.ring.lift 13373ms 46.3% tactic.norm_num 10911ms 37.8% tactic.mk_mapp 9164ms 31.7% tactic.ring.eval_add._main._lambda_2 7687ms 26.6% tactic.ring.eval_pow._main._lambda_2 7154ms 24.8% norm_num.eval_pow 5977ms 20.7% norm_num.derive 5977ms 20.7% tactic.ext_simplify_core 5861ms 20.3% norm_num.derive._main._lambda_3 4157ms 14.4% tactic.mk_app 1846ms 6.4% expr.of_nat._lambda_1

Profile for the 1st:

elaboration: tactic execution took 7.45s num. allocated objects: 7771 num. allocated closures: 10240 7445ms 100.0% interaction_monad_orelse 7445ms 100.0% tactic.interactive.ring._lambda_1 7445ms 100.0% tactic.istep._lambda_1 7445ms 100.0% _interaction 7445ms 100.0% tactic.interactive.ring 7445ms 100.0% tactic.istep 7445ms 100.0% _interaction._lambda_2 7445ms 100.0% tactic.step 7445ms 100.0% scope_trace 7445ms 100.0% tactic.interactive.ring1 7445ms 100.0% tactic.interactive.ring1._lambda_3 7444ms 100.0% interaction_monad.monad._lambda_9 7444ms 100.0% tactic.ring.ring_m.run 7434ms 99.9% tactic.ring.lift 3567ms 47.9% tactic.norm_num 2348ms 31.5% tactic.ring.eval_add._main._lambda_2 2147ms 28.8% tactic.ring.eval_pow._main._lambda_2 2120ms 28.5% tactic.mk_mapp 2050ms 27.5% norm_num.eval_pow 1698ms 22.8% tactic.ext_simplify_core 1698ms 22.8% norm_num.derive 1593ms 21.4% norm_num.derive._main._lambda_3 1440ms 19.3% tactic.mk_app

#### Anne Baanen (Apr 06 2020 at 07:55):

I don't want to sound too repetitive,, but you should use `ring_exp`

here too :slight_smile:

#### Anne Baanen (Apr 06 2020 at 08:01):

I get approximately the same times as Kevin for `ring`

, but 150ms and 500ms for `ring_exp`

. I'm actually surprised that the difference is so large

#### Kevin Buzzard (Apr 06 2020 at 08:11):

Being repetitive is very helpful for me -- I have to hear something several times before it sinks in. Does `ring_exp`

not use `ring`

then??

#### Kevin Buzzard (Apr 06 2020 at 08:15):

@Anne Baanen you just shaved 10 seconds off a Codewars Kata I'm working on, which might move it from "too slow to work on the site" to "killer question" :-)

#### Anne Baanen (Apr 06 2020 at 08:20):

Kevin Buzzard said:

Being repetitive is very helpful for me -- I have to hear something several times before it sinks in. Does

`ring_exp`

not use`ring`

then??

It doesn't! Basically `ring`

only knows how to represent `+`

and `*`

, but can do so efficiently (using the Horner form), and `ring_exp`

trades in the efficiency on simpler problems to also deal with `^`

#### Yury G. Kudryashov (Apr 06 2020 at 08:28):

I like "trades in the efficiency" together with "runs much faster".

#### Sebastien Gouezel (Apr 06 2020 at 08:31):

If in practice `ring_exp`

is most often faster on reasonable problems, shouldn't we rename `ring`

to `ring_horner`

and `ring_exp`

to `ring`

? @Mario Carneiro

#### Patrick Massot (Apr 06 2020 at 08:31):

Reading assignment for student Kevin: https://lean-forward.github.io/ring_exp/paper.pdf

#### Kevin Buzzard (Apr 06 2020 at 08:34):

This looks like some implementation issue to me

#### Kevin Buzzard (Apr 06 2020 at 08:34):

ring = ring_exp, right? By ext.

#### Kevin Buzzard (Apr 06 2020 at 08:37):

So should Grégoire-Mahboubi be called "proving equalities in a comm ring done wrong in coq"? :P (sorry Assia, I love your paper!)

#### Johan Commelin (Apr 06 2020 at 08:44):

Kevin Buzzard said:

ring = ring_exp, right? By ext.

No, I don't think so. `ring`

can't deal with `(a + b)^2`

, whereas `ring_exp`

can.

#### Anne Baanen (Apr 06 2020 at 08:52):

I wouldn't say that `ring_exp`

is always faster in practice: `linarith`

uses `ring`

internally, and linear expressions are exactly the simpler kind that `ring`

is better at. The Horner form is optimal in theory (if you use the correct definition of "optimal")

#### Rob Lewis (Apr 06 2020 at 08:53):

IIRC, there are some situations where `ring`

is faster than `ring_exp`

. Anne and I did some investigation when they were working on `ring_exp`

and there was a performance regression when we just replaced `ring`

entirely.

(Yep, what Anne said!)

#### Rob Lewis (Apr 06 2020 at 08:54):

Of course implementation details in both can shift the performance either way for specific problems.

#### Rob Lewis (Apr 06 2020 at 08:54):

It's probably possible to backport some of the `ring_exp`

optimizations to `ring`

.

#### Anne Baanen (Apr 06 2020 at 08:55):

Johan Commelin said:

`ring`

can't deal with`(a + b)^2`

, whereas`ring_exp`

can.

It can deal with that specific case, because it first rewrites `(a + b)^2`

to `(a + b)*(a + b)`

. You can replace the `2`

with `n : ℕ`

to get a real counterexample

#### Rob Lewis (Apr 06 2020 at 08:56):

But what I'm hearing from all these questions about `ring`

and polynomials is that there are still type class issues with polynomials. The inefficiency in `ring`

is probably that it's repeating unnecessary searches, but these searches are slow, and that's the root issue.

#### Kevin Buzzard (Apr 06 2020 at 08:58):

Patrick Massot said:

Reading assignment for student Kevin: https://lean-forward.github.io/ring_exp/paper.pdf

@Anne Baanen typo in table of p3: "$a^{b+c}=a^b+a^c$"

#### Kevin Buzzard (Apr 06 2020 at 09:01):

section 4 you seem to say "we tried random identities and `ring_exp`

was slower" -- but I'm trying identities that come up in practice and `ring_exp`

is a lot quicker :-)

#### Anne Baanen (Apr 06 2020 at 09:05):

Kevin Buzzard said:

Anne Baanen typo in table of p3: "$a^{b+c}=a^b+a^c$"

Well spotted! I already fixed it in the updated version of the paper, which should be available soon.

#### Anne Baanen (Apr 06 2020 at 09:11):

Indeed, it's not the most informative comparison, but it's objective and can be explained in a few sentences when there is a 5 page limit, so I think I will have to live with it appearing a bit slower :slight_smile:

#### Mario Carneiro (Apr 06 2020 at 11:04):

To repeat: the reason `ring`

is slow here has little to do with `ring`

, it has to do with the lean functions that `ring`

calls that are slow when they should not be. Yes, it is possible to make `ring`

faster by avoiding these functions, but this essentially means that you aren't allowed to use the elaborator at all, you have to build a fully formed expr by hand if you want to get performance and that's ridiculous. I'll do it, ring is already 99% implemented like this and the remaning 1% of lean is causing the problem, but this really should be fixed at a lower level

#### Jasmin Blanchette (Apr 06 2020 at 11:30):

But Anne, you will have 6 pages for the final version, right? Maybe you can also show some examples from this thread.

#### Kevin Buzzard (Apr 06 2020 at 11:31):

and this thread

#### Anne Baanen (Apr 06 2020 at 11:45):

Yeah, I want to add a few more positive examples. But space is still pretty tight, since there are some places that the reviewers pointed out that they want more explanation. And also Mario's objection that the same optimizations can be applied to `ring`

.

#### Mario Carneiro (Apr 06 2020 at 14:26):

Okay, I did a bunch of optimization (which is to say, `mk_app`

considered harmful) and managed to get it down to 11 s. There are a lot of places that implicitly assume that calling `mk_app`

is not that bad, so it's not easy going and the result looks horrible, but that's what you get for trying to circumvent lean internals. I don't think this is likely to get much better until I rewrite `tactic.norm_num`

.

elaboration: tactic compilation took 5.97ms elaboration: tactic compilation took 4.2ms elaboration: tactic execution took 3.59s num. allocated objects: 7909 num. allocated closures: 11047 3594ms 100.0% tactic.interactive.ring1 3594ms 100.0% scope_trace 3594ms 100.0% tactic.interactive.ring1._lambda_3 3594ms 100.0% tactic.step 3594ms 100.0% _interaction._lambda_2 3594ms 100.0% tactic.istep 3594ms 100.0% tactic.interactive.ring 3594ms 100.0% _interaction 3594ms 100.0% interaction_monad_orelse 3594ms 100.0% tactic.istep._lambda_1 3594ms 100.0% tactic.interactive.ring._lambda_1 3593ms 100.0% tactic.ring.ring_m.run 3590ms 99.9% tactic.ring.ring_m.run._lambda_1 3590ms 99.9% interaction_monad.monad._lambda_9 3590ms 99.9% tactic.using_new_ref 2770ms 77.1% tactic.norm_num 1559ms 43.4% tactic.ring.eval_add._main._lambda_2 1519ms 42.3% tactic.ring.eval_pow._main._lambda_2 1517ms 42.2% norm_num.eval_pow 1240ms 34.5% tactic.ext_simplify_core 1240ms 34.5% norm_num.derive 1166ms 32.4% norm_num.derive._main._lambda_3 603ms 16.8% tactic.mk_app 202ms 5.6% tactic.ring.eval_add._main._lambda_21 141ms 3.9% expr.of_nat._lambda_1 107ms 3.0% tactic.ring.eval_add._main._lambda_20 85ms 2.4% tactic.mk_mapp 78ms 2.2% tactic.ring.eval_pow._main._lambda_19 42ms 1.2% tactic.ring.eval_mul._main._lambda_2 28ms 0.8% tactic.ring.eval_atom._lambda_2 18ms 0.5% tactic.ring.eval_add._main._lambda_14 16ms 0.4% state_t.monad._lambda_19 15ms 0.4% reader_t.monad._lambda_19 15ms 0.4% tactic.ring.eval._main._lambda_42 12ms 0.3% char.of_nat 7ms 0.2% interaction_monad.failed 6ms 0.2% tactic.failed 6ms 0.2% tactic.ring.eval_add._main._lambda_13 4ms 0.1% norm_num.derive._main._lambda_2 4ms 0.1% tactic.ring.eval_add 3ms 0.1% tactic.ring.horner_expr.xadd' 3ms 0.1% tactic.ring.eval_add._main._lambda_23 3ms 0.1% tactic.mk_instance 3ms 0.1% expr.app 2ms 0.1% tactic.ring.ring_m.mk_app._lambda_2 2ms 0.1% tactic.ring.eval_add._main._lambda_22 2ms 0.1% and.decidable 2ms 0.1% tactic.ring.eval._main._lambda_43 2ms 0.1% tactic.ring.eval_add._main._lambda_4 2ms 0.1% norm_num.eval_div_ext 2ms 0.1% expr.mk_app 1ms 0.0% tactic.ring.eval_atom._lambda_3 1ms 0.0% tactic.ring.eval_add._main._lambda_6 1ms 0.0% interaction_monad.fail 1ms 0.0% tactic.ring.eval 1ms 0.0% tactic.ring.eval_add._main._lambda_15 1ms 0.0% state_t.monad._lambda_20 1ms 0.0% string.empty 1ms 0.0% expr.to_pos_nat 1ms 0.0% tactic.exact 1ms 0.0% or.decidable 1ms 0.0% tactic.ring.eval._main._lambda_29 1ms 0.0% expr.to_nat elaboration: tactic execution took 11.5s num. allocated objects: 7909 num. allocated closures: 11047 11461ms 100.0% tactic.interactive.ring._lambda_1 11461ms 100.0% tactic.istep._lambda_1 11461ms 100.0% interaction_monad_orelse 11461ms 100.0% _interaction 11461ms 100.0% tactic.interactive.ring 11461ms 100.0% tactic.istep 11461ms 100.0% _interaction._lambda_2 11461ms 100.0% tactic.step 11461ms 100.0% tactic.interactive.ring1 11461ms 100.0% scope_trace 11461ms 100.0% tactic.interactive.ring1._lambda_3 11453ms 99.9% tactic.ring.ring_m.run._lambda_1 11453ms 99.9% interaction_monad.monad._lambda_9 11453ms 99.9% tactic.using_new_ref 11453ms 99.9% tactic.ring.ring_m.run 9524ms 83.1% tactic.norm_num 5637ms 49.2% tactic.ring.eval_add._main._lambda_2 5253ms 45.8% tactic.ring.eval_pow._main._lambda_2 5251ms 45.8% norm_num.eval_pow 4390ms 38.3% tactic.ext_simplify_core 4390ms 38.3% norm_num.derive 4308ms 37.6% norm_num.derive._main._lambda_3 1710ms 14.9% tactic.mk_app 180ms 1.6% tactic.ring.eval_mul._main._lambda_2 156ms 1.4% tactic.ring.eval_add._main._lambda_21 115ms 1.0% expr.of_nat._lambda_1 90ms 0.8% tactic.ring.eval_add._main._lambda_20 66ms 0.6% tactic.ring.eval_pow._main._lambda_19 62ms 0.5% tactic.mk_mapp 21ms 0.2% tactic.ring.eval_atom._lambda_2 18ms 0.2% tactic.ring.eval_add._main._lambda_14 17ms 0.1% tactic.ring.eval._main._lambda_42 8ms 0.1% char.of_nat 8ms 0.1% tactic.exact 8ms 0.1% state_t.monad._lambda_19 7ms 0.1% reader_t.monad._lambda_19 7ms 0.1% interaction_monad.failed 5ms 0.0% tactic.failed 5ms 0.0% tactic.ring.eval_add._main._lambda_13 5ms 0.0% expr.app 3ms 0.0% norm_num.eval_div_ext 2ms 0.0% expr.mk_app 2ms 0.0% tactic.ring.horner_expr.xadd' 2ms 0.0% tactic.ring.ring_m.mk_app._lambda_2 1ms 0.0% nat.binary_rec 1ms 0.0% tactic.ring.eval_pow._main._lambda_20 1ms 0.0% interaction_monad.monad._lambda_4 1ms 0.0% tactic.ring.eval_add._main._lambda_22 1ms 0.0% nat.bodd 1ms 0.0% string.push 1ms 0.0% norm_num.eval_ineq 1ms 0.0% tactic.ring.eval_pow._main._lambda_23 1ms 0.0% tactic.ring.eval_pow._main._lambda_22 1ms 0.0% tactic.ring.eval_pow 1ms 0.0% state_t.monad._lambda_20 1ms 0.0% norm_num.derive._main._lambda_2 1ms 0.0% nat.decidable_lt 1ms 0.0% reader_t.monad._lambda_20

#### Mario Carneiro (Apr 06 2020 at 14:30):

(as always, I am frustrated because I know that this job can be done in <1ms if you consider what literally has to happen to produce the term; everything beyond that is lean/vm bloat)

#### Patrick Massot (Apr 06 2020 at 14:31):

Why don't you hack^W develop Lean, instead? Would that be much harder?

#### Mario Carneiro (Apr 06 2020 at 14:34):

I don't want to touch typeclass inference with a ten foot pole

#### Mario Carneiro (Apr 06 2020 at 14:34):

maybe the issue in `mk_app`

has a simple fix, maybe not

Last updated: May 10 2021 at 08:14 UTC