Zulip Chat Archive

Stream: general

Topic: set is slow

Kenny Lau (Apr 09 2020 at 10:08):

the tactic set takes a long time to parse and elaborate:

import tactic

set_option profiler true
-- parsing took 399ms
-- elaboration of test_let took 139ms
theorem test_let : true :=
begin
  let f : ℕ → ℕ := λ n, n+1,
  let f : ℕ → ℕ := λ n, n+1,
  let f : ℕ → ℕ := λ n, n+1,
  let f : ℕ → ℕ := λ n, n+1,
  let f : ℕ → ℕ := λ n, n+1,
  let f : ℕ → ℕ := λ n, n+1,
  let f : ℕ → ℕ := λ n, n+1,
  let f : ℕ → ℕ := λ n, n+1,
  let f : ℕ → ℕ := λ n, n+1,
  let f : ℕ → ℕ := λ n, n+1,
  trivial
end

-- parsing took 1.38s
-- elaboration of test_set took 687ms
theorem test_set : true :=
begin
  set f : ℕ → ℕ := λ n, n+1 with hf,
  set f : ℕ → ℕ := λ n, n+1 with hf,
  set f : ℕ → ℕ := λ n, n+1 with hf,
  set f : ℕ → ℕ := λ n, n+1 with hf,
  set f : ℕ → ℕ := λ n, n+1 with hf,
  set f : ℕ → ℕ := λ n, n+1 with hf,
  set f : ℕ → ℕ := λ n, n+1 with hf,
  set f : ℕ → ℕ := λ n, n+1 with hf,
  set f : ℕ → ℕ := λ n, n+1 with hf,
  set f : ℕ → ℕ := λ n, n+1 with hf,
  trivial
end

Kenny Lau (Apr 09 2020 at 10:14):

meta def «let» (h : parse ident?) (q₁ : parse (tk ":" *> texpr)?) (q₂ : parse $ (tk ":=" *> texpr)?) : tactic unit :=
let h := h.get_or_else `this in
match q₁, q₂ with
| some e, some p := do
  t ← i_to_expr e,
  v ← i_to_expr ``(%%p : %%t),
  tactic.definev h t v
| none, some p := do
  p ← i_to_expr p,
  tactic.pose h none p
| some e, none := i_to_expr e >>= tactic.define h
| none, none := do
  u ← mk_meta_univ,
  e ← mk_meta_var (sort u),
  tactic.define h e
end >> skip

meta def set (h_simp : parse (tk "!")?) (a : parse ident) (tp : parse ((tk ":") >> texpr)?)
  (_ : parse (tk ":=")) (pv : parse texpr)
  (rev_name : parse opt_dir_with) :=
do let vt := match tp with | some t := t | none := pexpr.mk_placeholder end,
   let pv := ``(%%pv : %%vt),
   v ← to_expr pv,
   tp ← infer_type v,
   definev a tp v,
   when h_simp.is_none $ change' pv (some (expr.const a [])) loc.wildcard,
   match rev_name with
   | some (flip, id) :=
     do nv ← get_local a,
        pf ← to_expr (cond flip ``(%%pv = %%nv) ``(%%nv = %%pv)) >>= assert id,
        reflexivity
   | none := skip
   end

Kenny Lau (Apr 09 2020 at 10:15):

so by elimination parse opt_dir_with should be the problem?

Mario Carneiro (Apr 09 2020 at 10:15):

There are some things that can be cleaned up there, like using mk_eq instead of elaborating pre-expressions

Mario Carneiro (Apr 09 2020 at 10:16):

there is also change'

Mario Carneiro (Apr 09 2020 at 10:18):

the reflexivity also looks suspicious

Kenny Lau (Apr 09 2020 at 10:20):

what's the difference between set a : b := c with d and let a : b := c, have d : a = c := rfl?

Mario Carneiro (Apr 09 2020 at 10:22):

nothing

Mario Carneiro (Apr 09 2020 at 10:23):

actually not nothing, set also replaces occurrences of c with a in the hypotheses and goal if you use !, I think

Mario Carneiro (Apr 09 2020 at 10:23):

or if you don't use !?

Mario Carneiro (Apr 09 2020 at 10:24):

from looking at the code it seems it's the other way around

Mario Carneiro (Apr 09 2020 at 10:26):

I should mention that I'm seeing some significantly different profile results

-- parsing took 95.4ms
-- elaboration of test_let took 25ms
theorem test_let : true :=
...

-- parsing took 264ms
-- elaboration of test_set took 60.9ms
theorem test_set : (λ n, n+1) = (λ n, n+1) :=
...

Mario Carneiro (Apr 09 2020 at 10:26):

are you using a chromebook or some similar low power device?

Kenny Lau (Apr 09 2020 at 10:27):

I think my computer is 4x slower than others here

Kenny Lau (Apr 09 2020 at 10:27):

which, if you factor that in, seems accurate

Mario Carneiro (Apr 09 2020 at 10:27):

the difference is more like 10x here though

Kenny Lau (Apr 09 2020 at 10:28):

399 ~ 95.4 x 4
139 ~ 25 x 4
1380 ~ 264 x 4
687 ~ 60 x 10??

Kenny Lau (Apr 09 2020 at 10:28):

yeah, about 4 times

Mario Carneiro (Apr 09 2020 at 10:29):

okay, I'll just make the test 4x longer

Kenny Lau (Apr 09 2020 at 10:29):

Intel(R) Core(TM) i5-7300U CPU @ 2.60 GHz 2.71 GHz 8.00 GB

Gabriel Ebner (Apr 09 2020 at 10:31):

That seems like a reasonable CPU. Are you running a debug build of Lean?

Kenny Lau (Apr 09 2020 at 10:32):

why does it seem like I have only 4GB of RAM available?

Kenny Lau (Apr 09 2020 at 10:32):

when I open the task manager, it says I'm using ~3000 MB and that is ~80%

Kenny Lau (Apr 09 2020 at 10:33):

hmm, if I go to the performance tab the it says I'm using 6.6 GB

Mario Carneiro (Apr 09 2020 at 10:33):

are you on windows?

Kenny Lau (Apr 09 2020 at 10:33):

yeah

Kenny Lau (Apr 09 2020 at 10:33):

Gabriel Ebner said:

That seems like a reasonable CPU. Are you running a debug build of Lean?

what does this mean?

Mario Carneiro (Apr 09 2020 at 10:33):

windows does heap compression

Kenny Lau (Apr 09 2020 at 10:34):

also I only have 2 cores and I think you guys have like 8 cores

Kenny Lau (Apr 09 2020 at 10:34):

I have 2 cores and 4 threads

Gabriel Ebner (Apr 09 2020 at 10:34):

Did you compile lean yourself?

Kenny Lau (Apr 09 2020 at 10:34):

no, I used elan

Mario Carneiro (Apr 09 2020 at 10:34):

I don't think individual proofs are multithreaded though

Mario Carneiro (Apr 09 2020 at 10:44):

the parsing is surprisingly expensive here, and playing around with things it seems like prod.mk is more expensive than I would have guessed

Kenny Lau (Apr 09 2020 at 10:44):

why would prod.mk be expensive at all?

Gabriel Ebner (Apr 09 2020 at 10:45):

As I've posted in a private chat:
Did you know how the syntax for the tactic arguments is implemented? When an interactive tactic has an argument of type parse foo : α, then foo : parser α is a metaprogram that is executed in the VM to parse the argument. So you can make parsing as slow as you want by running arbitrary programs at parse time!

Gabriel Ebner (Apr 09 2020 at 10:46):

To make matters even worse, the VM compiler is run for every argument, every time you parse an interactive tactic.

Kenny Lau (Apr 09 2020 at 10:46):

also parsing seems to be in general slower than the last time I was here

Kenny Lau (Apr 09 2020 at 10:46):

even though I didn't use any new tactics

Kenny Lau (Apr 09 2020 at 10:46):

(partly because I don't know them)

Mario Carneiro (Apr 09 2020 at 10:50):

open lean.parser
local postfix `?`:9001 := optional
namespace tactic
namespace interactive
open interactive

meta def foo (rev_name : parse (do
  tk "with",
  a ← (tk "<-")?,
  b ← ident,
  return (a,b))?) := skip

meta def bar (rev_name : parse (do
  tk "with",
  a ← (tk "<-")?,
  b ← ident,
  return (b))?) := skip

end interactive
end tactic

set_option profiler true

-- parsing took 950ms - 1.5s
theorem test_foo : true :=
begin
  foo with hf,
  foo with hf,
  -- ×40
  trivial
end

-- parsing took 161ms - 207ms
theorem test_bar : true :=
begin
  bar with hf,
  bar with hf,
  -- ×40
  trivial
end

Mario Carneiro (Apr 09 2020 at 10:51):

note that foo and bar accept the same grammar

Gabriel Ebner (Apr 09 2020 at 10:51):

Is there a typo in test_bar? It should be bar with instead of foo with, right?

Mario Carneiro (Apr 09 2020 at 10:51):

yes

Rob Lewis (Apr 09 2020 at 10:51):

Indeed, it looks like there are two independent slowdowns, the parsing and the replacing. Mario seems to have a handle on the parsing. Using set! instead of set makes the elaboration times about the same for me. So change' is slow.

Kenny Lau (Apr 09 2020 at 10:52):

oh no

Mario Carneiro (Apr 09 2020 at 10:52):

what's the story here? change' is fixing something in change with?

Rob Lewis (Apr 09 2020 at 10:53):

change with gets things wrong when there are dependencies.

Mario Carneiro (Apr 09 2020 at 10:54):

do you have an example?

Mario Carneiro (Apr 09 2020 at 10:54):

I would have assumed that since everything is defeq there are no problems with dependencies

Rob Lewis (Apr 09 2020 at 10:56):

The example in the old PR is

example (a : ℕ) (h : a = 0) : true :=
begin
  change a with a + 0 at *,
end

Rob Lewis (Apr 09 2020 at 10:57):

https://github.com/leanprover-community/mathlib/pull/712

Rob Lewis (Apr 09 2020 at 10:58):

I don't remember the details.

Mario Carneiro (Apr 09 2020 at 10:58):

is it just at * that has the problem? I tried at h and at h |- and at a h |- and they all work fine

Rob Lewis (Apr 09 2020 at 10:59):

at h a doesn't work.

Mario Carneiro (Apr 09 2020 at 11:01):

hm. well to some extent I would expect set to be asymptotically slower than let because it has to examine the whole context on each new line, so an O(n^2) operation

Mario Carneiro (Apr 09 2020 at 11:02):

the parsing on the other hand is completely baffling

Rob Lewis (Apr 09 2020 at 11:03):

On my machine the parsing makes a way bigger difference than the change.

Mario Carneiro (Apr 09 2020 at 11:03):

inlining the parser helps some

Mario Carneiro (Apr 09 2020 at 11:03):

I tried @[inline] but it doesn't seem to work

Sebastian Ullrich (Apr 09 2020 at 11:19):

Gabriel Ebner said:

To make matters even worse, the VM compiler is run for every argument, every time you parse an interactive tactic.

Whoever came up with that :information_desk_person:

Gabriel Ebner (Apr 09 2020 at 11:21):

The compilation overhead seems to be negligible, at least in Mario's example. I've tried factoring out the parsers into separate definitions, it's not faster, and I still see the performance difference:

open lean.parser
local postfix `?`:9001 := optional
namespace tactic
namespace interactive
open interactive

set_option trace.compiler.optimize_bytecode true
set_option trace.compiler.preprocess true

meta def foo_core := (do
tk "with",
a ← (tk "<-")?,
b ← ident,
return (a,b))?

meta def bar_core := (do
tk "with",
a ← (tk "<-")?,
b ← ident,
return (b))?

meta def foo (rev_name : parse bar_core) := skip

meta def bar (rev_name : parse foo_core) := skip

Gabriel Ebner (Apr 09 2020 at 11:21):

Weirdly enough, foo_core and bar_core only differ in two bytecode instructions...

Gabriel Ebner (Apr 09 2020 at 11:27):

Oh, no.. I think I know what is going on. After the parser is finished, the result (for foo this is a value of type option (option unit × string)) is converted into an expr by calling reflect. It's not prod.mk that is slow, but reflecting the product.

Mario Carneiro (Apr 09 2020 at 11:28):

I saw the reflectable thing but I'm not exactly sure how it is implemented

Mario Carneiro (Apr 09 2020 at 11:29):

or what it's for. Is it like decidable p where the reflectable instance is the "real" parser and everything else is just window dressing?

Gabriel Ebner (Apr 09 2020 at 11:30):

This is from lean#19 which was merged into 3.5...

Mario Carneiro (Apr 09 2020 at 11:30):

so Kenny's claims about a performance regression are true?

Gabriel Ebner (Apr 09 2020 at 11:31):

At least they're plausible.

Kenny Lau (Apr 09 2020 at 11:31):

Could someone eli5?

Gabriel Ebner (Apr 09 2020 at 12:23):

lean#180

Kenny Lau (Apr 09 2020 at 12:34):

thanks

Last updated: May 02 2025 at 03:31 UTC