Zulip Chat Archive
Stream: lean4
Topic: Segfault due to mutual recursion?
Siddharth Bhat (Sep 29 2021 at 02:03):
I'm on the LEAN4 nightly, and I've encountered a segfault when implementing a custom Parser
monad to write a hand written recursive-descent parser. Here's a full gist with the repro. I'm on LEAN4 commit f4759c9a223f
I've pasted the program here for ease-of-use:
inductive Maybe (a : Type) : Type where
| ok: a -> Maybe a
| err: Maybe a
structure P (a: Type) where
runP: String -> Maybe (String × a)
def ppure {a: Type} (v: a): P a := { runP := λ s => Maybe.ok (s, v) }
def pbind {a b: Type} (pa: P a) (a2pb : a -> P b): P b :=
{ runP := λs => match pa.runP s with
| Maybe.ok (s', a) => (a2pb a).runP s'
| Maybe.err => Maybe.err
}
instance : Monad P := { pure := ppure, bind := pbind }
def pfail : P a := { runP := λ _ => Maybe.err }
instance : Inhabited (P a) where default := pfail
mutual
partial def pregion : P Unit := do
pblock
partial def pop : P Unit := do
pregion
partial def pblock : P Unit := do
pop
end -- end mutual
def main: IO Unit := do
return ()
Am I violating some contract that causes this program to crash? I currently suspect that the mutual
block I have somehow violates something. However, it's dead code (main
doesn't invoke it!) so I'm not sure what the actual problem is.
Mac (Sep 29 2021 at 02:06):
I would suspect the problem is the combination of mutual
and partial
. You should be using one or the other, not both.
Siddharth Bhat (Sep 29 2021 at 02:07):
What if I do really need both? (mutually recursive functions whose termination is too complicated to prove, and are hence partial
for practical purposes.)
Also, is this a bug?
Mac (Sep 29 2021 at 02:10):
@Siddharth Bhat partial definitions can already be written in a mutual recursive manner but just unpacking them. The partial
definition:
partial def foo := bar
is essentially equivalent to:
unsafe def fooImpl : T := bar
@[implementedBy fooImpl] constant foo : T
Siddharth Bhat (Sep 29 2021 at 02:13):
@Mac I don't understand how to extend this to a definition of bar
that depends on foo
. I naively tried:
unsafe def fooImpl : Unit := bar
unsafe def barImpl : Unit := foo
@[implementedBy fooImpl] constant foo : Unit
@[implementedBy barImpl] constant bar : Unit
which doesn't work. Could you please show me how? :)
Mac (Sep 29 2021 at 02:15):
This should work:
constant foo : Unit
constant bar : Unit
unsafe def fooImpl : Unit := bar
unsafe def barImpl : Unit := foo
attribute [implementedBy fooImpl] foo
attribute [implementedBy barImpl] bar
Mac (Sep 29 2021 at 02:18):
Siddharth Bhat said:
Also, is this a bug?
Yes. Lean should not be segfaulting. My point was that you are doing something that is likely very unexpected (combining mutual
and partial
) and thus it was probably not well tested (resulting in said segfault).
Siddharth Bhat (Sep 29 2021 at 02:21):
Interesting, the suggested change fixes the segfault :) Thanks @Mac , I'll file a bug report and use the workaround for now.
Siddharth Bhat (Sep 29 2021 at 02:28):
@Mac I filed an issue: https://github.com/leanprover/lean4/issues/697
Mario Carneiro (Sep 29 2021 at 04:04):
Mac said:
Siddharth Bhat said:
Also, is this a bug?
Yes. Lean should not be segfaulting. My point was that you are doing something that is likely very unexpected (combining
mutual
andpartial
) and thus it was probably not well tested (resulting in said segfault).
I have used partial and mutual together many times without issue. All definitions in the block must be all-partial
or all not, but otherwise it works just as you would expect.
Mario Carneiro (Sep 29 2021 at 04:07):
The fact that the recursion is completely unguarded here might be an issue though (mutual functions are usually functions with at least one argument), and my guess is that it is hitting an internal evaluation limit when constructing the closed terms pregion
, pop
, pblock
Mac (Sep 29 2021 at 04:38):
Mario Carneiro said:
I have used partial and mutual together many times without issue. All definitions in the block must be all-
partial
or all not, but otherwise it works just as you would expect.
In that case, would it make more sense for parital
to be a modifier to mutual
in such cases rather than part of each def
?
Mario Carneiro (Sep 29 2021 at 04:40):
There are several features of the declarations of the mutual block that could arguably be moved to the header. That was the lean 3 design. My understanding is that the lean 4 design is intended to make mutual defs read just like regular defs, and enable more flexibility in the way the individual definitions are constructed
Mac (Sep 29 2021 at 04:43):
My reason for saying this is that mutual
tends to imply a lot of complex kernel type checking to make sure the recursion is valid. With all partial definitions, all that is skipped. The only thing the block is doing at that point is allowing for forward reference. Right?
Mac (Sep 29 2021 at 04:44):
The heavy distinction there feels like it should be clearer.
Mario Carneiro (Sep 29 2021 at 04:44):
The elaboration of partial
and non-partial
mutual defs is the same
Mac (Sep 29 2021 at 04:44):
@Mario Carneiro What do you mean by that?
Mario Carneiro (Sep 29 2021 at 04:45):
The definitions are typechecked in the same context, with the same things in scope
Mario Carneiro (Sep 29 2021 at 04:45):
the only difference is in the backend compilation in the kernel
Mario Carneiro (Sep 29 2021 at 04:46):
they also produce the same compiled code
Mac (Sep 29 2021 at 04:46):
Mario Carneiro said:
the only difference is in the backend compilation in the kernel
which is why I defined the difference as "complex kernel type checking"
Mario Carneiro (Sep 29 2021 at 04:47):
I'm not sure I understand your suggestion
Mario Carneiro (Sep 29 2021 at 04:48):
Do you think that unsafe
should be divorced from the line with the def
as well?
Mario Carneiro (Sep 29 2021 at 04:48):
If it's a large mutual block, it might not be obvious that the def is in a block at all, in which case the displaced partial
could be confusing
Mac (Sep 29 2021 at 04:50):
That is a weird argument, if its a large mutual block that one has forgotten is mutual, it might be just as confusing where the recursive functions come from (as one normally assume they are defined earlier in the file rather than later).
Mac (Sep 29 2021 at 04:52):
My view was simply that as the termination checking occurs w.r.t. the entire mutual
block and partial
essentially means 'don't do termination checking', it seems most logica; (to me) to apply that to the mutual
which is doing said checking as opposed to the def
which is not.
Mario Carneiro (Sep 29 2021 at 04:54):
The application of partial
, or unsafe
, is restricted to apply to a block at a time, which is why all defs in a block have to have the same partial
/unsafe
setting
Mac (Sep 29 2021 at 04:54):
For instance, the new termination_by
statement for wf recursion will apply to the end of the mutual
block instead of at each def
-- so for consistency it makes sense to apply the partial
(its opposite, essentially) to the block as well.
Mario Carneiro (Sep 29 2021 at 04:55):
The fact that it is written on a def is only for convenience / style
Mac (Sep 29 2021 at 04:55):
@Mario Carneiro I think it would be hard to argue that repeating partial
multiple times is more 'convenient' than doing it once.
Mario Carneiro (Sep 29 2021 at 04:56):
Perhaps you might not feel the same way if you have used lean 3 mutual defs
Mac (Sep 29 2021 at 04:57):
how so?
Mac (Sep 29 2021 at 04:58):
It also violates a common adage of CS: don't repeat yourself (DRY).
Mac (Sep 29 2021 at 04:59):
Also, Lean is already quite fond of this in other places -- i.e., DRY is what things like universe
and variable
are for (which also apply to the whole block).
Mario Carneiro (Sep 29 2021 at 05:00):
Like I said, lean 3 mutual defs don't repeat anything: meta
/noncomputable
shows up only once, in the mutual header, as well as parameters that aren't participating in the recursion. It is followed by a list of with
defs that give the body of the definitions. In actual use, with
has an unfortunate status as not quite a def; it means that you have to learn a different syntax for defs and you can't use other commands in the block
Mario Carneiro (Sep 29 2021 at 05:01):
Keep in mind, when proposing to change this, that this is literally what lean 4 changed away from
Mac (Sep 29 2021 at 05:01):
Mario Carneiro said:
as well as parameters that aren't participating in the recursion.
I can definitely see why this would be annoying
Mac (Sep 29 2021 at 05:02):
@Mario Carneiro Also, fyi, I did use Lean 3 myself. :wink: I don't remember having a problem with mutual
there.
Mario Carneiro (Sep 29 2021 at 05:02):
Lean 4 also does the parameter thing differently when you call a function recursively, even without mutual
- you need to repeat parameters in the function call
Mario Carneiro (Sep 29 2021 at 05:02):
the lean 3 way is DRY but also confusing
Mac (Sep 29 2021 at 05:02):
@Mario Carneiro Yeah I am quite with that. That always tripped me up in Lean 3.
Mac (Sep 29 2021 at 05:03):
Mario Carneiro said:
the lean 3 way is DRY but also confusing
There is also the fact that it just wouldn't work with partial defs at all.
Mario Carneiro (Sep 29 2021 at 05:03):
how so?
Mac (Sep 29 2021 at 05:04):
A partial def has no delineation between recursive and non-recursive parameters so the idea wouldn't really apply there, right?
Mario Carneiro (Sep 29 2021 at 05:04):
meta def foo (a : ℕ) : ℕ → ℕ
| 0 := foo a
| (n+1) := foo n
Mac (Sep 29 2021 at 05:05):
How does meta
know that I don't want to recur on both arguments?
Mario Carneiro (Sep 29 2021 at 05:05):
because only one is right of the colon
Mario Carneiro (Sep 29 2021 at 05:06):
that's how the rule goes in lean 3
Mac (Sep 29 2021 at 05:06):
Yeah, it ties pattern matching with recursion, which I think is the bigger problem there.
Mario Carneiro (Sep 29 2021 at 05:07):
you don't have to do any actual pattern matching though
Mac (Sep 29 2021 at 05:08):
Yes but it ties the pattern-matching syntax with recursion, which is weird. And it would make writing most partial
functions (which tend to be structure in normal CS style) very weird.
Mac (Sep 29 2021 at 05:09):
As the boilerplate of partial def foo : a -> b -> c | a, b =>
would be quite common.
Mac (Sep 29 2021 at 05:10):
Furthermore, def
alternates in Lean 4 are equivalent to an internal match
(which also makes me quite happy), this style of delineating recursion wouldn't even be logical anymore in Lean 4.
Mac (Sep 29 2021 at 05:12):
My point being that the parameter part of old mutual
is much more than just DRY it also encapsulates/demands more information than one might wish to provide (and, as you said, breaks consistency).
Mac (Sep 29 2021 at 05:13):
On the other hand partial
is a modifier, and some modifiers already only work in certain circumstances, so there is much less consistency to break (if any).
Mario Carneiro (Sep 29 2021 at 05:14):
Note that you can still have "parameters" in lean 4 style recursion, by using variable
. I haven't decided whether this is a good thing or inconsistent yet
Mario Carneiro (Sep 29 2021 at 05:16):
partial def foo (a : Nat) : Nat → Nat
| 0 => foo a a
| n+1 => foo a n
variable (a : Nat)
partial def foo' : Nat → Nat
| 0 => foo' a
| n+1 => foo' n
Mac (Sep 29 2021 at 05:16):
@Mario Carneiro Personally, if I were writing a style guide, I would ban explicit binders in variable
, as it makes the parameters one needs to specify of a def
(and there order) hard to figure out.
Mac (Sep 29 2021 at 05:18):
Especially if the def
is far removed from the variable
.
Mario Carneiro (Sep 29 2021 at 05:19):
looking at lean 4 itself it seems like about 1/3 of variable
lines violate your style guide
Mac (Sep 29 2021 at 05:19):
I am aware. :(
Mac (Sep 29 2021 at 05:19):
It has caused me much pain.
Scott Morrison (Sep 29 2021 at 05:20):
And in mathlib3 there are at least 1400 occurrences. :-)
Mac (Sep 29 2021 at 05:20):
My only exception to this rule would be the the binder meant to be used with dot notation (which I would demand be named self
).
Mario Carneiro (Sep 29 2021 at 05:21):
Regarding determining the parameters of a def
, I use #print
for that
Mac (Sep 29 2021 at 05:24):
@Mario Carneiro while I like Lean's interactivity, having to go right a #print
statement while deep in a function is not fun.
Mac (Sep 29 2021 at 05:24):
And is also impossible when I am viewing Lean code in a non-interactive manner.
Mario Carneiro (Sep 29 2021 at 05:25):
well, one day we will have #print
at expression/tactic scope
Mac (Sep 29 2021 at 05:25):
Honestly, I would much prefer to see it at an editor level through a command.
Mario Carneiro (Sep 29 2021 at 05:25):
hover already fills that niche to a great extent
Mac (Sep 29 2021 at 05:26):
@Mario Carneiro not when there are elaboration errors in your code. :(
Mac (Sep 29 2021 at 05:26):
which is quite common when writing a function.
Mac (Sep 29 2021 at 05:27):
and even more common when said error is the thing you are trying to fix by verifying types/defs. XD
Mario Carneiro (Sep 29 2021 at 05:28):
usually if lean has enough understanding to give you a type error at a function application, it will be able to give you a hover on the function
Mac (Sep 29 2021 at 05:30):
yeah, but such types are generally less than useful as they are some intermediate output of the inference engine (i.e., with metavariables and dependent types) and thus unreadable to my unexpert eyes.
Mac (Sep 29 2021 at 05:30):
also, if the problem is syntactic hovers may not even exist.
Mario Carneiro (Sep 29 2021 at 05:32):
when I hover on a constant name, the output looks like the original type of the constant, before unification
Mario Carneiro (Sep 29 2021 at 05:33):
although I would really like hovering on the name of a definition to also work and act just like hovering on a use of that definition
Mario Carneiro (Sep 29 2021 at 05:33):
that would solve your parameters issue pretty well
Mac (Sep 29 2021 at 05:33):
Oh, you mean the actual function itself, in that case, yeah, but that is rarely my problem.
Mario Carneiro (Sep 29 2021 at 05:34):
as in, if you are looking at some def foo
that picks up some unknown set of variables above, you can hover on foo
to find out the actual type
Mac (Sep 29 2021 at 05:34):
Mario Carneiro said:
that would solve your parameters issue pretty well
I think you greatly underestimate how much Lean code I read on GitHub.
Mac (Sep 29 2021 at 05:34):
Also, I have been having major problems getting Lean to properly find definitions when browsing its source at all.
Mario Carneiro (Sep 29 2021 at 05:34):
I gave up on reading proof assistants from raw code long ago
Mac (Sep 29 2021 at 05:34):
Most of my go-to-defs find nothing.
Mario Carneiro (Sep 29 2021 at 05:35):
can you #mwe?
Mac (Sep 29 2021 at 05:35):
Note that this is only when browsing the cloned repo, not in the distribution acquired through elan
.
Mac (Sep 29 2021 at 05:36):
@Mario Carneiro How does one #mwe an interactivity issue?
Mario Carneiro (Sep 29 2021 at 05:36):
there are interactive tests
Mac (Sep 29 2021 at 05:36):
I suspect this may be due to the fact I often update the code without rebuilding it. Since rebuilding still takes a good 15-30mins on my machine.
Mario Carneiro (Sep 29 2021 at 05:36):
but the usual way to do it here is to set up something reproducible and describe what you are doing and what you see
Mac (Sep 29 2021 at 05:40):
Okay, just to demonstrate how this works for me (and how I don't understand interactivity #mwe):
- I open up my local copy of the
lean4
repo in VSCode - I open up
src/Lean/Parser.lean
- Wait for elaboration to finish
- Hover over
Parenthesizer
inident.parenthesizer
- Right-click 'Go to Definition'
- Get "no definition found for 'Parenthesize'"
Mario Carneiro (Sep 29 2021 at 05:45):
That works for me (go-to-definition takes me to Parenthesizer
); I have vscode, just updated to master, never built, vscode-lean4 v0.0.33
Mac (Sep 29 2021 at 05:46):
Does it work if you have an older build with the proper settings for elan
? That is my case.
Mac (Sep 29 2021 at 05:47):
I have my elan
set up in the folder as described here: https://leanprover.github.io/lean4/doc/dev/index.html#dev-setup-using-elan
Mario Carneiro (Sep 29 2021 at 05:48):
I just ran elan update leanprover/lean4:nightly
, still works
Mac (Sep 29 2021 at 05:48):
I meant having overrides set for the repo root and src
dirs
Mario Carneiro (Sep 29 2021 at 05:48):
I have elan set up to use nightly, not master
Mac (Sep 29 2021 at 05:49):
The dev setup for lean4
is suppose to look like the following (as described here):
# in the Lean rootdir
elan toolchain link lean4 build/release/stage1
elan toolchain link lean4-stage0 build/release/stage0
# make `lean` etc. point to stage1 in the rootdir and subdirs
elan override set lean4
cd src
# make `lean` etc. point to stage0 anywhere inside `src`
elan override set lean4-stage0
Mario Carneiro (Sep 29 2021 at 05:50):
I haven't actually built lean 4, the stage 0 stuff scares me off
Johan Commelin (Sep 29 2021 at 05:50):
Mac said:
Mario Carneiro Personally, if I were writing a style guide, I would ban explicit binders in
variable
, as it makes the parameters one needs to specify of adef
(and there order) hard to figure out.
I think there are many places where it explicit binders in variables
can be reasonably defended.
Example:
variables {X : Type} [base_class1 X] [base_class2 X] [base_class3 X] [mixin1 X] [mixin2 X] [mixin3 X] (x : X)
lemma API_lemma_1 : blabla x
lemma API_lemma_2 : blabla x
lemma API_lemma_3 : blabla x
lemma API_lemma_4 : blabla x
variables (X) -- make `X` explicit (but don't repeat all the typeclass assumptions, DRY)
def cool_construction [mixin4 X] : foobar X := somestuff involving ∀ x : X, and API_lemma_4 from above.
variables {X} -- from now on `X` can be implicit again
Johan Commelin (Sep 29 2021 at 05:51):
This pattern is pretty common in mathlib, I think.
Mac (Sep 29 2021 at 05:52):
@Johan Commelin wouldn't the following just be better? (if it was supported)
variables {X : Type} [base_class1 X] [base_class2 X] [base_class3 X] [mixin1 X] [mixin2 X] [mixin3 X] (x : X)
lemma API_lemma_1 : blabla x
lemma API_lemma_2 : blabla x
lemma API_lemma_3 : blabla x
lemma API_lemma_4 : blabla x
-- modify the binder in the def for just this def
def cool_construction (X) [mixin4 X] : foobar X := somestuff involving ∀ x : X, and API_lemma_4 from above.
Johan Commelin (Sep 29 2021 at 05:54):
If that were supported, that would be quite useful. But somehow it has to work together with introducing fresh variables that can typically shadow names that were introduced by variables
before. I don't have a good sense for how that would work. (Because I don't know anything about language design at all.)
And nevertheless. Sometimes you want to make X
explicit for 8 declarations in a row, so you put those in a section
and add variables (X)
to that section.
Johan Commelin (Sep 29 2021 at 05:55):
Note that there are also very reasonable situations where you want to make write variables (R A B M N)
and make 5 variables explicit for the next 7 declarations.
Mac (Sep 29 2021 at 05:57):
Johan Commelin said:
hat can typically shadow names that were introduced by
variables
before.
I know that is how it currently but I think being able to shadow a binder from variable
is just more confusing. I view the idea behind variable
being that, within, this section x
is X
. If that is not the case, I don't really feel like the def
should be in the same section.
Mac (Sep 29 2021 at 06:00):
Johan Commelin said:
write
variables (R A B M N)
and make 5 variables explicit for the next 7 declarations.
Where? Having multiple functions with the same 5 explicit variables (and 5 explicit arguments in the first place) is a bit concerning. Generally, if a function is taking that many explicit parameters, you should probably be passing a struct with said fields instead.
Though I guess one reasonable exception / use case I can imagine would be explicit binders with defaults that are meant to be used only by named argument syntax. In such cases, the order is unimportant and you can still use the function without specifying them (so shares more in common with an inferred binder than with other explicit binders) . However, unfortunately, since variable
as its arguments at the start this is usually not possible.
Even if it were, though, I don't wouldn't the fact that you can't tell from the definition which arguments are to be specify to it and which it infers or otherwise inherits from the surrounding scope. But I would be less opinionated on such cases.
Scott Morrison (Sep 29 2021 at 06:05):
@Mac, for context, there are many examples in mathlib where we change many variables binders at once, e.g. https://github.com/leanprover-community/mathlib/blob/master/src/algebra/lie/tensor_product.lean#L77
Scott Morrison (Sep 29 2021 at 06:07):
(That one appears to be a record, but there are lots where we change 3 or 4 at a time.)
Mac (Sep 29 2021 at 06:07):
The expression lift R L M N P f (m ⊗ₜ n)
just hurts me.
Mac (Sep 29 2021 at 06:09):
However, I suspect this has more to do with the nature of mathlib. I suspect that you rarely actually do apply those arguments explicitly and instead apply them through notation.
Scott Morrison (Sep 29 2021 at 06:10):
Three modules, a ring, a lie algebra, a linear map, and two vectors! Just maths. :-)
Johan Commelin (Sep 29 2021 at 06:10):
you should probably be passing a struct with said fields instead.
but without eta for records, this can cause defeq issues down the road, rigth? (just one objection)
Also, this might be nice while setting up the API for that particular theory, but when applying it, users would constantly have to bundled up random types into a record, instead of just passing them directly to the functions.
In 50% of the applications, unification can figure out the types, so you just write an _
. But for those other 50%, we're really happy that the argument has explicit binding.
Johan Commelin (Sep 29 2021 at 06:11):
Mac said:
However, I suspect this has more to do with the nature of mathlib. I suspect that you rarely actually do apply those arguments explicitly and instead apply them through notation.
I'm not so sure. Things like lift R L M N P f
or very similar things are certainly not an uncommon thing in mathlib
Mac (Sep 29 2021 at 06:12):
Johan Commelin said:
In 50% of the applications, unification can figure out the types, so you just write an
_
. But for those other 50%, we're really happy that the argument has explicit binding.
In that case, wouldn't those arguments work better as implicit arguments (that are some times specified by name)? Since Lean 3 lacked named arguments, maybe that is why such formulations were so popular?
Johan Commelin (Sep 29 2021 at 06:13):
I agree that this is now a new feature that is quite exciting. And we'll have to do experiments to find out what the right balance is in using it.
Johan Commelin (Sep 29 2021 at 06:13):
_
is really short, so it makes "a bit too much explicit variables" really cheap.
Johan Commelin (Sep 29 2021 at 06:14):
If something is implicit, and 50% of the time you have to write (base_ring := R)
, this will quickly make me wish that R
was explicit.
Johan Commelin (Sep 29 2021 at 06:15):
But certainly, named variables provide a new potential!
Mac (Sep 29 2021 at 06:16):
@Johan Commelin Also, I think my rules are more geared towards CS than theorem proving. In theorem proving you are likely to have a lot of parameters, whereas in CS, you tend to have relatively few.
Johan Commelin (Sep 29 2021 at 06:17):
Sure, I guess we can different variations in style guide for different areas
Mac (Sep 29 2021 at 06:18):
For instance, there are many places variable
is used for def
s that have one or two parameters, and thus it makes it hard to distinguish them from def
s with none (and the same would be true in end-user code).
Johan Commelin (Sep 29 2021 at 06:19):
Just for the record: here is how you say "let be a manifold with corners" in mathlib-speak:
variables {ι : Type uι}
{E : Type uE} [normed_group E] [normed_space ℝ E] [finite_dimensional ℝ E]
{H : Type uH} [topological_space H] {I : model_with_corners ℝ E H}
{M : Type uM} [topological_space M] [charted_space H M] [smooth_manifold_with_corners I M]
Mac (Sep 29 2021 at 06:20):
This is why I feel such explicit binders are not really explicit binders in theorem proving. They are things that on a paper proof would be implicit, but are made explicit mostly due to the limitations / requirements of the formalization.
Mac (Sep 29 2021 at 06:21):
Thus it makes much more since for them to be tossed in a variable
header (as that is how it would appear in the paper proof).
Mac (Sep 29 2021 at 06:25):
In CS, on the other hand, the explicit binders are what you are expecting to see in the compiled functions signature. Thus, I think it is reasonable to expect to see them in signature of the def
being compiled. The implicit / synthetic binders, however, are abstract fluff that is (usually) compiled away and therefore not part of the signature. Thus, it makes sense to stick them in some variable
header (as they have to do with the abstraction that section is dealing with).
Last updated: Dec 20 2023 at 11:08 UTC