Zulip Chat Archive

Stream: lean4

Topic: initialization order fiasco


Mario Carneiro (Jun 03 2023 at 09:09):

I've been debugging !4#4573, which has a random failure in the first file to import Mathlib.Tactic.LabelAttr after disabling the compiler. (Spoilers, the new compiler is completely unrelated to the bug.) Isolating only the important parts, the code looks something like this (without the print statements):

initialize a : IO.Ref Nat  println! "1"; IO.mkRef 0
initialize b : Unit  println! "2"; a.modify (· + 1)

In actual practice the first initialize is initialize labelExtensionMapRef and the second one is created indirectly by

/-- A dummy label attribute, to ease testing. -/
register_label_attr dummy_label_attr

Now my first reaction to this was, how is this allowed in the first place? You can't evaluate initialize variables which were declared in the same file. But since it's in another initialize it doesn't actually run in the file itself, the initialize blocks are run when the next file imports this one. So far so good.

But then when one tries to minimize this, you randomly get either a failure error: cannot evaluate [init] declaration 'a' in the same module when you try to import this file in another, or everything works fine. For example, this works:

initialize a : IO.Ref Nat  println! "1"; IO.mkRef 0
initialize b : Unit  println! "2"; a.modify (· + 1)

but this fails

initialize a : IO.Ref Nat  println! "1"; IO.mkRef 0
initialize c : Unit  println! "2"; a.modify (· + 1)

where the only difference is that the second initialize is named c instead of b. It is also affected by adding or removing definitions with other names between and around the initialize. I am at this point fairly sure that the effect I am seeing is hashmap iteration order. Finally, we see that in the a,b case we see 1, 2 and no error, and in the a,c case we see 2 and then the failure.

In other words [init] declarations are executed in random order, which is bad when there are dependencies between them as in this case. It is surprising we don't run into this more often, but for the most part dependency relations have to go via separate files anyway because the body code of a lean file executes before any [init] declarations made in the same file.

So I guess this is a PSA, or possibly a bug report if the [init] declarations are supposed to be happening in declaration order (which would be more sensible).

Sebastian Ullrich (Jun 03 2023 at 09:18):

Yes, this is definitely a bug in [init]

Patrick Massot (Jun 03 2023 at 09:27):

Do you think there is any hope to fix that today? It is not crucial of course but it would allow to start the MSRI summer school on Monday with 50 people downloading a lot less.

Mario Carneiro (Jun 03 2023 at 09:28):

I have a patch for this particular issue open at !4#4623 . Putting the dummy_label_attr in the same file was ill-advised to begin with.

Mario Carneiro (Jun 03 2023 at 09:29):

if we can get that merged I can try to rebase the compiler.enableNew PR on it

Mario Carneiro (Jun 03 2023 at 09:32):

(BTW, the title is a reference to https://en.cppreference.com/w/cpp/language/siof, where C++ had exactly this issue of static initializers getting run in a random order and bad things happening.)

Patrick Massot (Jun 03 2023 at 09:46):

Thanks!

Patrick Massot (Jun 03 2023 at 12:35):

@Mario Carneiro Do you understand the new failure in !4#4573?

Mario Carneiro (Jun 03 2023 at 12:41):

fpowerSeries shouldn't be marked simp

Patrick Massot (Jun 03 2023 at 12:41):

And how is that related to disabling the new compiler?

Mario Carneiro (Jun 03 2023 at 12:42):

it's probably latent nondeterminism like the other error

Mario Carneiro (Jun 03 2023 at 12:43):

if I use simp [-fpowerSeries] it works

Mario Carneiro (Jun 03 2023 at 12:44):

I don't think simp should be nondeterministic in this way, so it could be another bug to track down

Mario Carneiro (Jun 03 2023 at 12:44):

but also the inputs are non-confluent so :shrug:

Sebastian Ullrich (Jun 03 2023 at 13:38):

I'm a little concerned about these issues, is it actual nondeterminism in this case? In the case of [init], it is probably the Name.quickLt order, so unpredictable ordering based on the hash, but ultimately determined solely by the corresponding declaration names

Mario Carneiro (Jun 03 2023 at 14:57):

I think it's not just the declaration names but also the other declarations that exist. After all, turning off the new compiler basically had the effect of deleting a large number of constants that are otherwise unused, which probably affected the size of some hashmaps and hence the relative order of existing constants

Mario Carneiro (Jun 03 2023 at 14:59):

I would expect the result of quickLt not to be affected by the existence of other declarations, except that they might have used up some unique names, making some hygienic names have different numbers in them, making them have a different hash...

Mario Carneiro (Jun 03 2023 at 15:00):

still I don't see why that would arise in simp. Aren't the lemmas applied strictly in declaration / priority order once we have a candidate list from the discrimination tree?

Sebastian Ullrich (Jun 03 2023 at 15:15):

Ah, shifted hygiene scopes are a good point. I don't know why simp could be affected either though


Last updated: Dec 20 2023 at 11:08 UTC