Zulip Chat Archive

Stream: general

Topic: lean 4 and panics

Jukka Suomela (Dec 16 2024 at 16:09):

Hi all, I've got some fundamental misunderstanding of what "panic" means in Lean 4 (e.g. in the context of the documentation of functions like Array.get! and String.toNat!).

For example, here is a program that is utterly broken in so many different ways and that I was expecting to fail in some visible sense, either at compilation time or run time (note the array indexing overflow + attempts of parsing invalid strings as natural numbers):

def foo (x : Nat) : Nat := Id.run do
  let a : Nat := Array.mkArray0.get! x
  let b := "bad".toNat!
  panic! "bad bad"
  panic "bad bad bad"
  unreachable!
  assert! false
  x + a + b

def main : IO Unit :=
  let r := foo 123
  IO.println s!"all good, got result {r}"

Yet if I just do lake exe foo, it happily compiles it and runs it and it prints out "all good, got result 0" and terminates successfully. And if I remove just assert! false, it says "all good, got result 123".

Is there some way to ask Lean 4 to really panic in cases in which it claims to panic (when running stuff with lake)? Or should I just avoid all functions that say something about panicking in their documentation?

Jannis Limperg (Dec 16 2024 at 16:32):

Set environment variable LEAN_ABORT_ON_PANIC=1.

Jukka Suomela (Dec 16 2024 at 16:43):

@Jannis Limperg Many thanks, this helps! At least the program crashes now! It just doesn't print out anything useful, but it also doesn't complete successfully if I'm using Array.get! or String.toNat! incorrectly. And apparently assert! false is also leading to crashes now, good.

Jukka Suomela (Dec 16 2024 at 16:46):

I still don't understand the semantics of stuff like panic! and friends, as e.g. this still succeeds:

def foo (x : Nat) : Nat := Id.run do
  if x == 123 then
    panic! "bad bad"
  x

def main : IO Unit :=
  let r := foo 123
  IO.println s!"all good, got result {r}"

While e.g. this crashes:

def foo (x : Nat) : Nat := Id.run do
  if x == 123 then
    panic! "bad bad"
  if x == 123 then
    panic! "bad bad"
  x

def main : IO Unit :=
  let r := foo 123
  IO.println s!"all good, got result {r}"

Jukka Suomela (Dec 16 2024 at 16:47):

Also, is there any way to ask it to also print out something to stderr when it crashes, or do I need an external shell script for that?

Jason Rute (Dec 16 2024 at 16:48):

I don’t know why, but it works with single =

Jukka Suomela (Dec 16 2024 at 16:49):

I've noticed that there are lots of minor variants, some of which "work" and some don't and I don't really see a pattern here…

Jason Rute (Dec 16 2024 at 16:53):

I think the panic behavior is confusing in Lean. Maybe this is a subtle way the developers are telling us to do things right. :smile: Using IO monads errors, option types (List.get?), or just prove that the behavior is correct (.List.get). But it is annoying to not be sure your code will respect panics.

Kyle Miller (Dec 16 2024 at 16:53):

I think the issue is that the compiler is free to eliminate the panic since this is the Id monad, and there's no data dependency on the panic.

set_option trace.compiler.ir.rc true
def foo (x : Nat) : Nat := Id.run do
  if x == 123 then
    panic! "bad bad"
  x
/-
def foo (x_1 : @& obj) : obj :=
  inc x_1;
  ret x_1
-/

Kyle Miller (Dec 16 2024 at 16:54):

The IR is saying that the function was simplified to the identity function.

Kyle Miller (Dec 16 2024 at 16:54):

The fix is easy: make sure you have a data dependence on the panic.

def foo (x : Nat) : Nat := Id.run do
  if x == 123 then
    panic! "bad bad"
  else
    x

In the first case, it must return the result of the panic!, so it must be executed.

Jukka Suomela (Dec 16 2024 at 16:56):

Oh right I should use panic! like this:

def foo (x : Nat) : Nat := Id.run do
  if x == 123 then
    return panic! "bad bad"
  x

Jukka Suomela (Dec 16 2024 at 16:57):

I for some reason thought panic! is magic that implicitly returns but it doesn't.

Jukka Suomela (Dec 16 2024 at 16:57):

Would be awesome to have a compiler warning that I should use the result of panic!…

Jason Rute (Dec 16 2024 at 16:57):

Or a linter.

Jukka Suomela (Dec 16 2024 at 16:58):

OK, stuff making more sense now. Always do return panic! in this kind of context and always run everything with LEAN_ABORT_ON_PANIC=1.

Kyle Miller (Dec 16 2024 at 16:58):

Jukka Suomela said:

I for some reason thought panic! is magic that implicitly returns but it doesn't.

These examples make me think that it should implicitly return. That's doable, if there were a doElem macro for it.

Feel free to create a Lean 4 issue with the feedback that the version without the return was surprising that it didn't work.

Jukka Suomela (Dec 16 2024 at 16:59):

Btw, is there a reason LEAN_ABORT_ON_PANIC=1 isn't a default?

Jason Rute (Dec 16 2024 at 17:01):

Also, why does assert! return the default value even without a data dependency, but panic! doesn’t? Is assert immediately returning?

Kyle Miller (Dec 16 2024 at 17:03):

assert! isn't a function, it's a do element (docs#Lean.Parser.Term.doAssert)

Kyle Miller (Dec 16 2024 at 17:04):

It turns out it expands to assert! c; k where k is the rest of the do block, and this expands to an if/else. docs#Lean.Elab.Term.expandAssert

Jason Rute (Dec 16 2024 at 17:05):

Which is what maybe panic! should also be? Or is the issue that panic! also works outside of do blocks?

Kyle Miller (Dec 16 2024 at 17:06):

There shouldn't be a problem with it working outside of do blocks. There are a few examples of multipurpose syntax like that (even do works as a do element and as a term)

Jannis Limperg (Dec 16 2024 at 17:10):

Jukka Suomela said:

Btw, is there a reason LEAN_ABORT_ON_PANIC=1 isn't a default?

The reason is that this may invalidate proofs about programs with panics. From a logical perspective, panic! "..." = default, where default is some arbitrary (but fixed) element of the respective type. Hence, for example, (panic! "..." : Nat) = 0. So this is a provably correct implementation of multiplication:

import Aesop

def times : Nat → Nat → Nat
  | _, 0 => panic! "zero"
  | n, .succ m => times n m + n

theorem times_eq_mul (n m) : times n m = n * m := by
  induction m <;> aesop (add simp [times])

It's a bit of a weird tradeoff with no clear best option.

Jason Rute (Dec 16 2024 at 17:59):

Arguably panics shouldn’t return a predictable value (similar to partial). So then you can’t prove stuff about the panic route.

Jason Rute (Dec 16 2024 at 18:02):

In other words why do you want to prove theorems like the above one? Wouldn’t you want to prove if m is not 0 then times n m = n * m?

Jason Rute (Dec 16 2024 at 18:04):

And also, does that flag prevent you from proving that theorem? (And if the issue is that Lean proves the program gives an answer but in practice it crashes, there are lots of ways to have Lean do that without that flag, like external implementations.)

Kyle Miller (Dec 16 2024 at 18:04):

It's a lot easier to apply that sort of theorem, which helps a lot with formalization. Well-chosen junk values mean you can have unconditional simp lemmas.

Kyle Miller (Dec 16 2024 at 18:05):

I'm not sure I understand your example @Jannis Limperg, unless the point is "panic! has no logical meaning, it's just default with special runtime support, so beware"

Jason Rute (Dec 16 2024 at 18:06):

Yeah, why do proofs have anything to do with that flag being on as the default?

Kyle Miller (Dec 16 2024 at 18:09):

I'm not understanding this claim about the flag having anything to do with proofs

Jannis Limperg (Dec 16 2024 at 18:10):

The example shows a program, times, that is proven to behave exactly like mul. So if you run this program, arguably it should behave exactly like mul. The LEAN_ABORT_ON_PANIC flag subverts this expectation. That's my understanding of why Leo wanted it off by default.

Jukka Suomela (Dec 16 2024 at 18:10):

I guess it'd be fairly easy to also write a Lean program that is provably correct but crashes at run time because it tries to allocate gazillion terabytes of memory?

Jannis Limperg (Dec 16 2024 at 18:10):

FWIW I agree that this mostly produces weird behaviour in practice. But I can see the reason.

Jason Rute (Dec 16 2024 at 18:11):

Then List.get! Is a major footgun. It behaves like getD, but the user doesn’t realize that. So they are surprised when 0s silently show up in their code, and it is hard to debug, no?

Jannis Limperg (Dec 16 2024 at 18:12):

Yes, this occasionally happens.

Jason Rute (Dec 16 2024 at 18:12):

It it panicked by default then that is fine, but otherwise it is a trap.

Jannis Limperg (Dec 16 2024 at 18:12):

I run the Aesop test suite with LEAN_ABORT_ON_PANIC=1 for this reason, and I think Mathlib also does that.

Jukka Suomela (Dec 16 2024 at 18:13):

But I really like this perspective. Does a provably correct program/function in the ideal world mean:

it always works correctly, full stop
if it happens to terminate successfully (and it didn't e.g. run out of memory or panic for a similar reason) it also worked correctly?

Kyle Miller (Dec 16 2024 at 18:15):

Found the likely comment that told you about Leo's position @Jannis Limperg: #lean4 > add_decl_doc panic loop @ 💬

Kyle Miller (Dec 16 2024 at 18:16):

(This might be the only documentation about it. The Lean 4 source doesn't have any information about the choice for the default value.)

Jason Rute (Dec 16 2024 at 18:18):

There isn’t a great match between the semantics of the kernel math and the semantics of the code. It is at best something they try to keep functionally the same, but I don’t think there are any guarantees. A proof in Lean is a proof about the ideal version of the code, not the implementation. @Mario Carneiro complains about this sometimes.

Jukka Suomela (Dec 16 2024 at 18:46):

Btw, something like lake exe --abort-on-panic foo would be convenient. This kind of a flag would also be visible in lake exe --help, so unlike LEAN_ABORT_ON_PANIC=1 one might accidentally learn about it.

And it would be especially convenient if with this flag lake exe also printed out to stderr at least some basic stuff like "program crashed" when the program crashes so that it doesn't look like everything was silently OK.

This would make some difference especially for programs that don't normally print out anything but just e.g. manipulate some files.

Jukka Suomela (Dec 16 2024 at 18:48):

I mean writing stuff like LEAN_ABORT_ON_PANIC=1 lake exe foo || echo "something went wrong" gets somewhat tiresome and isn't that intuitive for beginners.

Mario Carneiro (Dec 16 2024 at 20:20):

Jason Rute said:

There isn’t a great match between the semantics of the kernel math and the semantics of the code. It is at best something they try to keep functionally the same, but I don’t think there are any guarantees.

There is at least a soft guarantee that the compiled code does not produce different (valid) results from the kernel version of the code, because violation of this leads to proofs of false via native_decide. (This guarantee is invalidated when you use implemented_by or extern, which is the equivalent of axiom for the purpose of compiler correctness, but I believe the devs try to keep the existing implemented_by and extern implementations from violating this property.)

Mario Carneiro (Dec 16 2024 at 20:21):

But there is no guarantee that the compiled code won't panic while the kernel code doesn't (as indeed it can't panic, that's not a thing in the type theory)

Mario Carneiro (Dec 16 2024 at 20:22):

From a modeling perspective, panic is just a no-op

Mario Carneiro (Dec 16 2024 at 20:22):

in particular it does not halt execution (and as observed, it also does not halt execution in normal compiled code either, it just reports an error to some side channel and continues unless you set a certain environment variable)

Mario Carneiro (Dec 16 2024 at 20:23):

if you want to halt execution you should use exceptions or other mechanisms that play well with the type system

Mario Carneiro (Dec 16 2024 at 20:23):

Some people will tell you that panic itself was a mistake and you shouldn't use it at all

Niels Voss (Dec 16 2024 at 20:27):

I could see the argument for this either way. I'm somewhat worried that at some point, someone is going to perform a critical IO action like deleting a folder, without realizing that panic doesn't stop their code, and it will cause a bunch of damage. On the other hand, having panic terminate the program basically breaks the semantics

Niels Voss (Dec 16 2024 at 20:27):

Maybe we need some way to track which implemented_by or partial function we rely on, in the same way that we track which axioms that are relied on?

Mario Carneiro (Dec 16 2024 at 20:27):

yeah, I have to remind people often not to treat panic like it early aborts

Mario Carneiro (Dec 16 2024 at 20:28):

I think it's not well named to give that impression

Mario Carneiro (Dec 16 2024 at 20:29):

I think it would be useful to have a way to check specifically for uses of panic (and a few other lean runtime functions that also have the effect of a panic, like Array.get!)

Niels Voss (Dec 16 2024 at 20:29):

The other problem I can see with making it abort early by default is that people will learn to rely on this, and it might be even more likely someone will use it to safeguard dangerous IO

Niels Voss (Dec 16 2024 at 20:31):

People coming from rust in particular are highly susceptible to this I would imagine

Mario Carneiro (Dec 16 2024 at 20:31):

I think it would be better to rename it

Mario Carneiro (Dec 16 2024 at 20:31):

Rust panics absolutely cannot continue to the next line of code

Mario Carneiro (Dec 16 2024 at 20:31):

they can abort or they can throw

Niels Voss (Dec 16 2024 at 20:32):

Some programming language have different logging levels like INFO, WARNING, ERROR, FATAL. If the intended behavior of panic is a fatal log, maybe that's what it should be renamed to?

Mario Carneiro (Dec 16 2024 at 20:32):

but rust panics have a type which amounts to panic : Unit -> False

Mario Carneiro (Dec 16 2024 at 20:32):

and this is problematic for a logic language

Mario Carneiro (Dec 16 2024 at 20:33):

fatal also sounds like it stops execution

Niels Voss (Dec 16 2024 at 20:33):

yeah, that's a problem

Mario Carneiro (Dec 16 2024 at 20:33):

how about just bug!

Mario Carneiro (Dec 16 2024 at 20:34):

which is the name for the rustc internal macro for internal compiler errors

Niels Voss (Dec 16 2024 at 20:34):

That sounds sort of like a temporary statement, even though it isn't

Niels Voss (Dec 16 2024 at 20:35):

if we rename panic, should we also have an abort! to actually stop execution?

Mario Carneiro (Dec 16 2024 at 20:35):

the point is that we can't really have that to begin with

Kyle Miller (Dec 16 2024 at 20:35):

no! or drat! would be a little amusing

Mario Carneiro (Dec 16 2024 at 20:35):

at least not without bringing back all the flaws that panic! has

Mario Carneiro (Dec 16 2024 at 20:36):

(it's easy to have this in the IO monad, but the problem is having it in pure code)

Niels Voss (Dec 16 2024 at 20:36):

well, I think the idea would be that abort! should only be used in areas of the code we don't care about semantics that much, like partial or unsafe

Niels Voss (Dec 16 2024 at 20:37):

no! or drat! would be a little amusing

What about d'oh!

Mario Carneiro (Dec 16 2024 at 20:39):

Keep in mind that this is not the entirety of the macro, it's followed by a string literal

Niels Voss (Dec 16 2024 at 20:40):

Would we also need to rename unreachable!?

Kyle Miller (Dec 16 2024 at 20:42):

d'oh! "the argument is out of bounds"

Jukka Suomela (Dec 16 2024 at 20:42):

Just a quick note: even if we set LEAN_ABORT_ON_PANIC=1, the panic message is not shown anywhere or logged anywhere, it seems to be completely lost, right? So this isn't really comparable to "logging" in other programming languages?

Mario Carneiro (Dec 16 2024 at 20:42):

that seems like a bug

Mario Carneiro (Dec 16 2024 at 20:43):

in vscode it ends up in the output panel

Jukka Suomela (Dec 16 2024 at 20:44):

I mean if you run e.g. LEAN_ABORT_ON_PANIC=1 lake exe foo in the terminal, you don't see anything? At least on macOS and Lean 4.14.0?

Mario Carneiro (Dec 16 2024 at 20:44):

there seems to be a flag to enable

Niels Voss (Dec 16 2024 at 20:44):

Would it be feasible to create an abort! that only works in partial and unsafe functions? I feel like terminating abruptly fits my conception of a partial function (and in most cases, aborting is probably better than looping forever)

Mario Carneiro (Dec 16 2024 at 20:45):

but no, the flag seems to be always on during regular code, so it should print to stderr

Kim Morrison (Dec 16 2024 at 20:46):

I do like the idea of renaming panic!

Mario Carneiro (Dec 16 2024 at 20:46):

they aren't particularly panicked, just more of a mild concern

Kim Morrison (Dec 16 2024 at 20:47):

Even something verbose like raise_panic_flag! might be helpful. (Less suggestive that it makes execution? Longer, to discourage people using it?)

Mario Carneiro (Dec 16 2024 at 20:47):

I don't think it's feasible to get rid of all the ! suffixes though

Niels Voss (Dec 16 2024 at 20:48):

even something like raise! is probably better than panic!

Mario Carneiro (Dec 16 2024 at 20:48):

nope, that one also sounds like non-local control flow

Kim Morrison (Dec 16 2024 at 20:48):

In any case, certainly the doc-string for panic! should mention LEAN_ABORT_ON_PANIC.

Niels Voss (Dec 16 2024 at 20:48):

oh, that's fair

Mario Carneiro (Dec 16 2024 at 20:49):

it does, although maybe not with that capitalization

Niels Voss (Dec 16 2024 at 20:49):

if panic! is renamed, would this mean getting rid of LEAN_ABORT_ON_PANIC?
If not, I'm not sure it will be easy to find a word that means "Possibly abort, but also possibly just log an error"

Jukka Suomela (Dec 16 2024 at 20:49):

Mario Carneiro said:

but no, the flag seems to be always on during regular code, so it should print to stderr

Sorry, do you mean that the message is visible currently and I'm doing something wrong if I'm not seeing it?

Or do you mean that there is a bug but the intended behavior is that the message would get printed to stderr?

Mario Carneiro (Dec 16 2024 at 20:49):

it seems the intended behavior is to print to stderr, and from what I can tell it is going through the motions

Mario Carneiro (Dec 16 2024 at 20:49):

I have not tested the code

Mario Carneiro (Dec 16 2024 at 20:52):

does it work if you pass lean -e?

Mario Carneiro (Dec 16 2024 at 20:52):

this uses a slightly different code path but should be the same as LEAN_ABORT_ON_PANIC

Jukka Suomela (Dec 16 2024 at 20:53):

Here is a very simple example that reproduces it for me:

def foo : Nat := Id.run do
  panic! "bad bad bad"

def main : IO Unit := do
  let r := foo
  IO.println s!"all well, got result {r}"

Running lake exe foo in the terminal gives "all well, got result 0".

And running LEAN_ABORT_ON_PANIC=1 lake exe foo in the terminal gives nothing (which was surprising to me), but the program crashes with a non-zero error code (which is of course good and intended here).

Jukka Suomela (Dec 16 2024 at 20:58):

@Mario Carneiro If I add #eval main in the end of the file, and then run it with lean -e, I can see the panic error message. Is this what you meant? I don't really know how to use lean -e to properly run a full Lean program (comparable to lake exe)?

Mario Carneiro (Dec 16 2024 at 20:59):

yes, I'm starting to suspect the issue has to do with the fact that you aren't using lean

Jukka Suomela (Dec 16 2024 at 21:01):

Oh, I didn't know I'm doing something wrong or unusual. :)

Isn't lake the right frontend for everything? It's after all using lean to generate C code, etc.?

Mario Carneiro (Dec 16 2024 at 21:02):

not wrong, there are simply multiple ways to run lean code and panic messages go in different directions depending

Mario Carneiro (Dec 16 2024 at 21:02):

the point here is that you are building a standalone program and running it, rather than panicking in the elaborator (e.g. via #eval)

Jukka Suomela (Dec 16 2024 at 21:04):

Yes, I do usually see panic messages and all kinds of useful stuff if I do quick interactive experiments by adding #eval in VS Code and looking at Lean Infoview.

Mario Carneiro (Dec 16 2024 at 21:05):

oh! it's panicking during initialization due to closed term extraction

Mario Carneiro (Dec 16 2024 at 21:10):

this is lean#1965

Mario Carneiro (Dec 16 2024 at 21:10):

error messages are masked during initialization

Mario Carneiro (Dec 16 2024 at 21:10):

to fix it you need to make your panic message not constant

Jukka Suomela (Dec 16 2024 at 21:13):

It also happens with e.g. this, where I guess the problem is the same (get! uses a fixed error message)?

def foo : Nat := Id.run do
  Array.mkArray0.get! 123

def main : IO Unit := do
  let r := foo
  IO.println s!"all well, got result {r}"

Mario Carneiro (Dec 16 2024 at 21:15):

no, here the issue is that the entire computation of foo happens at initialization time

Mario Carneiro (Dec 16 2024 at 21:15):

because foo is a constant

Jukka Suomela (Dec 16 2024 at 21:15):

Oh I see, makes sense.

Mario Carneiro (Dec 16 2024 at 21:16):

I tried setting set_option compiler.extract_closed false which should help but it's apparently not enough when you have global constants like foo

Mario Carneiro (Dec 16 2024 at 21:16):

if you make foo (_ : Unit) : Nat then it will finally output a message

Jukka Suomela (Dec 16 2024 at 21:18):

No, that's not enough, or at least this doesn't fix it:

def foo (_ : Unit) : Nat := Id.run do
  Array.mkArray0.get! 123

def main : IO Unit := do
  let r := foo ()
  IO.println s!"all well, got result {r}"

Mario Carneiro (Dec 16 2024 at 21:18):

you also need the extract_closed option

Jukka Suomela (Dec 16 2024 at 21:18):

Oh, right, got it!

Jukka Suomela (Dec 16 2024 at 21:20):

So stuff like this is apparently enough to get error messages:

set_option compiler.extract_closed false

def foo (_ : Unit) : Nat := Id.run do
  Array.mkArray0.get! 123

def main : IO Unit := do
  let r := foo ()
  IO.println s!"all well, got result {r}"

Jukka Suomela (Dec 17 2024 at 18:40):

@Mario Carneiro Exactly what does compiler.extract_closed do? I've got a program with this behavior:

Seems to work fine with lake exe
Crashes with no message with LEAN_ABORT_ON_PANIC=1 lake exe
Seems to work fine with set_option compiler.extract_closed false + LEAN_ABORT_ON_PANIC=1 lake exe

Mario Carneiro (Dec 17 2024 at 18:56):

extract_closed is an optimization the compiler performs to move all closed subterms into top level definitions, which are calculated at initialization time. lean#1965 is the bug that this process occurs even to closed subterms which contain panic!, meaning that the code will panic (and abort if you have the appropriate setting) at initialization time, even if the code it was extracted from is never called. So it can cause a perfectly well defined program to crash. (This crash also occurs at initialization time, which is why there is no message.)

Last updated: May 02 2025 at 03:31 UTC