Zulip Chat Archive

Stream: lean4

Topic: heartbeats, GMP vs no GMP


David Renshaw (Oct 08 2022 at 03:50):

I'm poking at https://github.com/leanprover/lean4/issues/827 a little bit.
I have found an input that causes a non-GMP build to emit

Error pretty printing expression: (deterministic) timeout at 'delab', maximum number of heartbeats (1000) has been reached (use 'set_option maxHeartbeats <num>' to set the limit

while the equivalent GMP build succesfully pretty-prints without hitting the limit.

Is that an interesting result? It was somewhat surprising to me, as I did not expect mpz operations to change the heartbeat count.

Chris Lovett (Oct 08 2022 at 05:46):

sounds like the heartbeat counter needs to look for wraparound in the counter?

Leonardo de Moura (Oct 08 2022 at 14:06):

@David Renshaw Yes, this is an interesting result, but it is consistent with the current implementation. We increase the number of heartbeats in our custom memory allocator, and the non-GMP implementation uses this custom allocator.
https://github.com/leanprover/lean4/blob/master/src/runtime/mpz.cpp#L264

David Renshaw (Oct 08 2022 at 15:10):

here's a reduced input:

def x: Int :=(1)
#reduce x*49

David Renshaw (Oct 08 2022 at 15:11):

where is the non-utf8 byte 0xff

David Renshaw (Oct 08 2022 at 15:15):

$ echo "ZGVmIHg6IEludCA6Pf8oMSkKI3JlZHVjZSB4KjQ5Cg==" | base64 -d > differ.lean

$ ~/src/lean4/build/release-no-gmp/stage1/bin/lean differ.lean -D maxHeartbeats=1000
differ.lean:1:13: error: unknown identifier '«»'
differ.lean:1:14: error: expected command
failed to pretty print expression (use 'set_option pp.rawOnError true' for raw representation)

$ ~/src/lean4/build/release/stage1/bin/lean differ.lean -D maxHeartbeats=1000
differ.lean:1:13: error: unknown identifier '«»'
differ.lean:1:14: error: expected command
Int.rec
  (fun a =>
    Int.ofNat
... [large expression]

David Renshaw (Oct 08 2022 at 15:17):

We increase the number of heartbeats in our custom memory allocator, and the non-GMP implementation uses this custom allocator.

Ah, okay, that makes sense. This result is no longer surprising to me.

Leonardo de Moura (Oct 08 2022 at 15:21):

@David Renshaw Is it clear to you where the big numbers are being created in this example?

Leonardo de Moura (Oct 08 2022 at 15:23):

Is the non-utf8 byte relevant to trigger the discrepancy? That is, can you observe the discrepancy if you use def x: Int := ???

David Renshaw (Oct 08 2022 at 15:24):

yes, the non-utf8 byte seems to be necessary

David Renshaw (Oct 08 2022 at 15:25):

I assume some error handling fallback logic is creating a large number

David Renshaw (Oct 08 2022 at 15:26):

though maybe if I get just the right large int I don't need that...

David Renshaw (Oct 08 2022 at 15:27):

hm.. this is not triggering it:

def x: Int :=1000000000000000000000000000000000000000000000000900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
#reduce x*49

Leonardo de Moura (Oct 08 2022 at 15:29):

David Renshaw said:

hm.. this is not triggering it:

def x: Int :=1000000000000000000000000000000000000000000000000900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
#reduce x*49

#reduce and the kernel have special support for reducing terms like that efficiently in one step. That is, they will not unfold Nat.mul and compute the result using the unary encoding.

Leonardo de Moura (Oct 08 2022 at 15:35):

David Renshaw said:

I assume some error handling fallback logic is creating a large number

It is unclear to me what is happening here. I added

set_option pp.all true
#print x

and I get

def x : Int :=
@sorryAx.{1} Int Bool.true

as expected. I don't have the "non-gmp" version built on my machine right now.

David Renshaw (Oct 08 2022 at 17:56):

Aha! This input exhibits the same behavior, with no bad utf8:

def x: Int := sorry
#reduce x * 49

David Renshaw (Oct 08 2022 at 17:57):

the behavior is: without GMP enabled and with -D maxHeartbeats=1000, we get the error failed to pretty print expression (use 'set_option pp.rawOnError true' for raw representation), while with GMP enabled and the same maxHeartbeats, pretty printing succeeds

Leonardo de Moura (Oct 09 2022 at 12:50):

Great. Did you find where the extra "heartbeats" are coming from? I am assuming there are no big-numbers being created in this #reduce command.

David Renshaw (Oct 10 2022 at 00:21):

I don't know where the extra heartbeats are coming from.

The exact counts are: non-gmp version takes 1045 heartbeats, while the gmp version takes 963 heartbeats.

Mario Carneiro (Oct 10 2022 at 00:26):

That doesn't seem like an excessive difference. GMP is not using an allocator that counts heartbeats so some difference is expected.

David Renshaw (Oct 10 2022 at 17:03):

stack trace of an allocation that occurs in the non-gmp version and does not occur in the gmp version:

#9  0x00007ffff4b531e2 in lean::mpz::allocate(unsigned long) [clone .constprop.0] () from /home/dwrensha/src/lean4/build/release-no-gmp/stage1/bin/../lib/lean/libleanshared.so
#10 0x00007ffff4b5321a in lean::mpz::init() () from /home/dwrensha/src/lean4/build/release-no-gmp/stage1/bin/../lib/lean/libleanshared.so
#11 0x00007ffff4b5327a in lean::mpz::mpz(char const*) () from /home/dwrensha/src/lean4/build/release-no-gmp/stage1/bin/../lib/lean/libleanshared.so
#12 0x00007ffff79282d9 in lean_cstr_to_nat () from /home/dwrensha/src/lean4/build/release-no-gmp/stage1/bin/../lib/lean/libleanshared.so
#13 0x00007ffff4dc410f in initialize_Init_Prelude () from /home/dwrensha/src/lean4/build/release-no-gmp/stage1/bin/../lib/lean/libleanshared.so
#14 0x00007ffff4e293c9 in initialize_Init () from /home/dwrensha/src/lean4/build/release-no-gmp/stage1/bin/../lib/lean/libleanshared.so
#15 0x00007ffff791fce7 in lean_initialize () from /home/dwrensha/src/lean4/build/release-no-gmp/stage1/bin/../lib/lean/libleanshared.so
#16 0x00007ffff77d1f4e in lean_main () from /home/dwrensha/src/lean4/build/release-no-gmp/stage1/bin/../lib/lean/libleanshared.so
#17 0x00007ffff4229d90 in __libc_start_call_main (main=main@entry=0x555555555060 <main>, argc=argc@entry=4, argv=argv@entry=0x7fffffffdcc8) at ../sysdeps/nptl/libc_start_call_main.h:58
#18 0x00007ffff4229e40 in __libc_start_main_impl (main=0x555555555060 <main>, argc=4, argv=0x7fffffffdcc8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdcb8)

David Renshaw (Oct 10 2022 at 17:09):

I believe this traces back to https://github.com/leanprover/lean4/blob/7b3709e28ab1252a1e175dbe2ebd5c468fa359b9/src/Init/Prelude.lean#L1814

David Renshaw (Oct 10 2022 at 17:10):

and https://github.com/leanprover/lean4/blob/7b3709e28ab1252a1e175dbe2ebd5c468fa359b9/src/Init/Prelude.lean#L1889

David Renshaw (Oct 10 2022 at 19:02):

further along, in the non-gmp version I observe Lean.Subexpr.Pos.push calling lean::mpz::allocate:
https://github.com/leanprover/lean4/blob/7b3709e28ab1252a1e175dbe2ebd5c468fa359b9/src/Lean/SubExpr.lean#L45

Leonardo de Moura (Oct 10 2022 at 20:15):

David Renshaw said:

further along, in the non-gmp version I observe Lean.Subexpr.Pos.push calling lean::mpz::allocate:
https://github.com/leanprover/lean4/blob/7b3709e28ab1252a1e175dbe2ebd5c468fa359b9/src/Lean/SubExpr.lean#L45

Now that you found it, it makes sense :)
Thanks for investigating!


Last updated: Dec 20 2023 at 11:08 UTC