Zulip Chat Archive
Stream: general
Topic: lean3 impl questions
Leonard Wiechmann (Aug 01 2022 at 17:26):
hey, i've been studying the lean3 source code.
while i've been able to answer most of my questions so far, there are a few things, where asking the authors seems like the better idea:
- what's the rationale behind using
expr_local
for uses of locals? is it just the convenience of not having to look up the local in the local context to get its type and stuff like that, or is there a "deeper" reason? - where is the kernel type checker invoked for definitions commands? or do you not do that for regular operation (and instead have some "verify the stuff please" action)?
Kyle Miller (Aug 01 2022 at 17:29):
For your second question, it should be https://github.com/leanprover-community/lean/blob/master/src/frontends/lean/definition_cmds.cpp#L194 and you can see where it gets called in the definition commands. (I'm not the author.)
Leonard Wiechmann (Aug 01 2022 at 17:35):
ah, indeed, thanks a lot!
i was searching for type_checker tc(
cause that seemed to be a common pattern, but only found it in the check
command.
Leonard Wiechmann (Aug 01 2022 at 17:55):
another question related to the first one: expr_var
doesn't seem to be used much at all. what's the rationale behind that?
actually, nvm. it's used in instantiate_rev_locals
and resolve_local_name_core
, which are called quite a bit.
Mario Carneiro (Aug 01 2022 at 17:58):
That's the locally nameless approach. Most functions expect closed terms, so whenever you "enter" a binder you instantiate all the var
s in it for local constants
Mario Carneiro (Aug 01 2022 at 17:59):
note that the actual type information in a local_const
is not reliable, it's usually a dummy value
Mario Carneiro (Aug 01 2022 at 18:00):
I'm not sure what you mean by "rationale behind using expr_local
for uses of locals". What else would you use it for?
Leonard Wiechmann (Aug 01 2022 at 18:00):
i see, so using expr_local
is a kind of "certification/invariant" thing (> Most functions expect closed terms)
Mario Carneiro (Aug 01 2022 at 18:01):
in any case, in lean 4 local_const
is now .fvar (fvarId : Name) : Expr
, so it seems it was agreed that all the other stuff in the local const is dead weight
Mario Carneiro (Aug 01 2022 at 18:02):
it's now solely a pointer into the LocalContext
which has the actual type information and whatnot
Leonard Wiechmann (Aug 01 2022 at 18:02):
i found it a little weird that both the "definition" and the "use" of a local used exactly the same expr (except for adjusted source position information).
Leonard Wiechmann (Aug 01 2022 at 18:03):
it's now solely a pointer into the LocalContext which has the actual type information and whatnot
that sounds more like what i would have expected, i think.
Mario Carneiro (Aug 01 2022 at 18:03):
Huh? what is a "definition" of a local constant?
Leonard Wiechmann (Aug 01 2022 at 18:04):
well fun id (a: nat) -> nat := a
the (a: nat)
part is the definition.
the a
part in the body is the use.
clearly at source level, these are quite different, but in the ast, they use the same expr.
Mario Carneiro (Aug 01 2022 at 18:04):
I suppose you could call the things in the LocalContext the "definition of a local_const", but that's local_decl
, not local_const
Mario Carneiro (Aug 01 2022 at 18:05):
The expression (a: nat) -> nat
is represented as an expr like Pi "a" (const "nat") (const "nat")
, there aren't any local_const involved at all
Leonard Wiechmann (Aug 01 2022 at 18:07):
i'm not sure what local_const
is.
i was talking about lean3's expr_local
:thinking:
Mario Carneiro (Aug 01 2022 at 18:07):
Mario Carneiro (Aug 01 2022 at 18:07):
the names on the C++ side are weird
Leonard Wiechmann (Aug 01 2022 at 18:09):
ok, i see.
Leonard Wiechmann (Aug 01 2022 at 18:09):
but the body (a
) would be a local_const?
Mario Carneiro (Aug 01 2022 at 18:11):
the local consts get involved when you enter a binder during elaboration. Let's say the body is \lam a, a
, so we start by entering the a
binder. This means creating a new local_decl like {name := "a", uniq := 1234, type := nat}
and then all occurrences of var 0
get replaced with local_const "a" 1234 <>
Mario Carneiro (Aug 01 2022 at 18:13):
and then we would resolve the user reference a
to that local constant, and when we close the scope we would abstract it back to var 0
resulting in the term lam "a" (const "nat") (var 0)
for that identity function
Leonard Wiechmann (Aug 01 2022 at 18:14):
ok, makes sense!
what's the idea behind replacing the vars with locals, instead of doing the lookups ad hoc? (this really is my central question)
Leonard Wiechmann (Aug 01 2022 at 18:15):
i guess at some points they're turned back into local_consts, because the kernel (in lean3) doesn't seem to accept vars.
Leonard Wiechmann (Aug 01 2022 at 18:17):
i've got to go, back in like an hour.
thanks for the help so far!
Leonard Wiechmann (Aug 01 2022 at 18:23):
i guess one reason for switching to the unique names could be that you don’t have to bother with adjusting de-bruijn indices or alpha conversion.
Leonard Wiechmann (Aug 01 2022 at 18:35):
hmm, but then why would de-bruijn be used at all? :thinking:
Kyle Miller (Aug 01 2022 at 18:55):
I believe this is the keyword:
That's the locally nameless approach.
There seem to be papers you could look up about "locally nameless representation"
Mario Carneiro (Aug 01 2022 at 19:00):
Leonard Wiechmann said:
what's the idea behind replacing the vars with locals, instead of doing the lookups ad hoc? (this really is my central question)
Leonard Wiechmann said:
i guess one reason for switching to the unique names could be that you don’t have to bother with adjusting de-bruijn indices or alpha conversion.
You answered your own question. It basically localizes the problem of lifting and substitution to one place, so nothing else has to worry about it. See also http://www.cs.ru.nl/~james/RESEARCH/haskell2004.pdf which explains some of the theory.
Mario Carneiro (Aug 01 2022 at 19:01):
Leonard Wiechmann said:
i guess at some points they're turned back into local_consts, because the kernel (in lean3) doesn't seem to accept vars.
Actually it's the other way around. Kernel terms must not contain local_consts, although the kernel itself also does the locally nameless thing so it will turn vars into local_consts as it traverses the term.
Leonard Wiechmann (Aug 01 2022 at 20:00):
Kyle Miller said:
There seem to be papers you could look up about "locally nameless representation"
thanks, i'll take a look!
Leonard Wiechmann (Aug 01 2022 at 20:00):
Mario Carneiro said:
It basically localizes the problem of lifting and substitution to one place, so nothing else has to worry about it. See also http://www.cs.ru.nl/~james/RESEARCH/haskell2004.pdf which explains some of the theory.
that makes sense, thanks for the link!
Leonard Wiechmann (Aug 01 2022 at 20:03):
Mario Carneiro said:
Actually it's the other way around. Kernel terms must not contain local_consts, although the kernel itself also does the locally nameless thing so it will turn vars into local_consts as it traverses the term.
ah, i see. i did wonder how the kernel verified the bindings if the input used local_const.
i saw the lean_assert(closed(e));
in type_checker::infer_type_core
, i must have missed the conversion.
Leonard Wiechmann (Aug 01 2022 at 20:41):
found it: finalize_definition
in definition_cmds.cpp
abstracts the locals.
and stuff like type_checker::infer_lambda
temporarily switches to expr_local
for inference.
seems closed(e)
doesn't mean "doesn't contain any vars", but just means "doesn't contain free vars". (yes, that's exactly what it means. says it in the comment right above it... :D)
man, this would probably be so much easier, if i had just downloaded and compiled the code :D
Leonard Wiechmann (Aug 01 2022 at 20:45):
anyway, thanks kyle and mario for the help, really appreciate it!
Leonard Wiechmann (Aug 02 2022 at 13:35):
working my way through the typing rules in the lean paper again (The Type Theory of Lean).
have a few questions:
1) has the type theory changed in any way for lean4?
2) isn't the proof irrelevance reduction rule (second line in the image) technically unjustified?
the def-eq rule (first line) only has one p
, no p'
.
it's my understanding that you can't "apply def-eq to the rules themselves" (i.e. if the rule only has one p
, you can't apply it if you have two different p
, p'
, even if they're def-eq).
image.png
Mario Carneiro (Aug 02 2022 at 21:19):
1) Yes and no. The theory is not supposed to have changed in any essential way, but lean 4 does include some new kernel features not supported by lean 3, mostly optimizations of things that were supported already but not primitively.
- Primitive projections. This is just shorthand for a
rec
term applied to a lambda that extracts one of the arguments, and it has the expected definitional equalities. - Primitive nat and string literals. This makes
#reduce
stop early and changes the behavior of whnf around these terms, but otherwise they are just shorthand for the term construction you could do before.- Primitive nat literal operations like
Nat.add
. It only works on literals, not variables, so it can only compute things that were already computable through a long sequence of kernel reductions.
- Primitive nat literal operations like
- Eta for structures. This is a "real" change, but it is actually a bit closer to the paper than before, since I had to add eta for (some) structures in the "reduction to W-types" chapter. It does not pose any really difficult problems.
- Nested and mutual inductives. This is a big change, and one that probably needs some investigation at some point, but it should be checking the same rules as in lean 3 (where nested and mutual inductives were only simulated by the frontend). The simulation did not have all the desired definitional equalities though, so this is a real change.
Mario Carneiro (Aug 02 2022 at 21:25):
2) The "algorithmic" rules do not have to be 1 for 1 copies of the "idealized" rules. They just need to be able to show essentially the same things in the same context. In this case, you can show that if you replace with in the second rule you still get a provable theorem: Suppose , , , and . Then , so by the conv rule ( implies -> ). Thus by proof irrelevance.
Reid Barton (Aug 02 2022 at 21:39):
BTW, does "eta for structures" literally mean for things defined with the structure
keyword? Or does the equivalent inductive
also have eta?
Eric Rodriguez (Aug 02 2022 at 22:13):
https://github.com/leanprover/lean4/commit/0aa32d643e063efd68bd31478ccd8028a3de7533
Mario Carneiro (Aug 02 2022 at 22:18):
Yes:
inductive Foo : Type
| mk : Nat → Nat → Foo
def Foo.fst : Foo → Nat := λ ⟨n, _⟩ => n
def Foo.snd : Foo → Nat := λ ⟨_, n⟩ => n
example (f : Foo) : f = Foo.mk f.fst f.snd := rfl -- ok
inductive Rec : Type
| mk : Nat → Rec → Rec
def Rec.fst : Rec → Nat := λ ⟨n, _⟩ => n
def Rec.snd : Rec → Rec := λ ⟨_, n⟩ => n
example (f : Rec) : f = Rec.mk f.fst f.snd := rfl -- fail
Mario Carneiro (Aug 02 2022 at 22:20):
The kernel doesn't actually receive a "structure" flag. It computes whether an inductive is "structure-like" itself
Leonard Wiechmann (Aug 03 2022 at 08:59):
Mario Carneiro said:
2) The "algorithmic" rules do not have to be 1 for 1 copies of the "idealized" rules. They just need to be able to show essentially the same things in the same context.
meaning "imply the ideal rules"? (in a way that's implementation friendly)
totally forgot about the conv rule. noticed it was absent, but didn't realize it was "baked into the proof irrelevance rule" (like eta, which had pi-intro baked in, turning it into an extensionality principle).
Mario Carneiro (Aug 03 2022 at 09:07):
The conv rule doesn't work very well algorithmically, since you don't know what other type to prove definitionally equal to the one you have. Instead, everywhere type equality constraints appear, those equalities get replaced with definitional equality checks
Mario Carneiro (Aug 03 2022 at 09:08):
Another standard place definitional equality checks appear is in the function application rule
Mario Carneiro (Aug 03 2022 at 09:09):
... I guess it doesn't show up in the paper since I don't give an algorithmic analogue of the typing judgment itself (i.e. a typechecker)
Leonard Wiechmann (Aug 03 2022 at 09:26):
Mario Carneiro said:
The conv rule doesn't work very well algorithmically, since you don't know what other type to prove definitionally equal to the one you have.
which also applies to the transitivity rule, i guess (besides that thing about sub-singleton-elim & decidability).
Mario Carneiro said:
Another standard place definitional equality checks appear is in the function application rule
... I guess it doesn't show up in the paper
i'm not sure i understand.
there's this, looks like a func app rule to me: image.png
and these two: image.png
Mario Carneiro (Aug 03 2022 at 13:59):
the first one is application compatibility, and the second one is the beta rule. The function application rule is
Mario Carneiro (Aug 03 2022 at 14:01):
and the algorithmic version looks like
Leonard Wiechmann (Aug 05 2022 at 12:32):
Mario Carneiro said:
and the algorithmic version looks like
ah, makes sense.
so like with proof irrelevance, you bake in the conversion rule.
Leonard Wiechmann (Aug 05 2022 at 12:44):
i found this check in lean3's abstract function:
lean_assert(std::all_of(subst, subst+n, [](expr const & e) { return closed(e) && is_local(e); }));
i was confused by the closed(e)
part. how can a local not be closed? in lean3 maybe if the type isn't closed.
then i checked lean4 to see if there were any changes.
and indeed there were.
the check is still there, just called !has_loose_bvars(e)
now. (https://github.com/leanprover/lean4/blob/6a767a66a13c588cce3829bf1de00a1830883843/src/kernel/abstract.cpp#L15)
however it doesn't seem to do anything.
has_loose_bvars seems to check the computed Expr.data
field. but looseBVarRange is always zero for fvars: https://github.com/leanprover/lean4/blob/6a767a66a13c588cce3829bf1de00a1830883843/src/Lean/Expr.lean#L428
is that a bug? (and the fvar's type should have been checked for loose bvars; not that the type should ever contain any bvars either, because it's instantiated before putting it into the local context, though that's technically not enforced by the local context)
or is that just a redundant check?
Mario Carneiro (Aug 05 2022 at 12:45):
Like I said, lean mostly works on closed terms
Mario Carneiro (Aug 05 2022 at 12:45):
that doesn't mean no local consts, that means no var
s
Mario Carneiro (Aug 05 2022 at 12:45):
or bvar
s in lean 4
Leonard Wiechmann (Aug 05 2022 at 12:46):
yes, i got that.
Mario Carneiro (Aug 05 2022 at 12:46):
so it's not a bug that looseBVarRange
is 0 for fvars
Mario Carneiro (Aug 05 2022 at 12:46):
because it doesn't have any bvars in it
Leonard Wiechmann (Aug 05 2022 at 12:47):
yeah, that's why i was confused by that check.
Leonard Wiechmann (Aug 05 2022 at 12:47):
so it's just redundant?
Mario Carneiro (Aug 05 2022 at 12:47):
?
Mario Carneiro (Aug 05 2022 at 12:47):
there are other expressions that have bvars in them
Mario Carneiro (Aug 05 2022 at 12:47):
but fvars don't
Leonard Wiechmann (Aug 05 2022 at 12:48):
this is the check: return !has_loose_bvars(e) && is_fvar(e);
(https://github.com/leanprover/lean4/blob/6a767a66a13c588cce3829bf1de00a1830883843/src/kernel/abstract.cpp#L15)
conjunction, but the second implies the first.
Leonard Wiechmann (Aug 05 2022 at 12:49):
in lean3, it seems to check the type's vars.
Mario Carneiro (Aug 05 2022 at 12:50):
Oh I see. That could be a performance optimization but it seems unlikely, both checks should be O(1)
Mario Carneiro (Aug 05 2022 at 12:50):
it could also be grandfathered code since it's the exact same line modulo naming
Mario Carneiro (Aug 05 2022 at 12:51):
and it's in an assert anyway so it doesn't matter
Mario Carneiro (Aug 05 2022 at 12:51):
so it could just be for clarity
Leonard Wiechmann (Aug 05 2022 at 12:57):
i think the behavior is technically different in lean3:
locals are initialized with their type's free var range.
(https://github.com/leanprover/lean/blob/master/src/kernel/expr.cpp#L164)
expr_mlocal::expr_mlocal(bool is_meta, name const & n, name const & pp_n, expr const & t, tag g):
expr_composite(is_meta ? expr_kind::Meta : expr_kind::Local, n.hash(), is_meta || t.has_expr_metavar(), t.has_univ_metavar(),
!is_meta || t.has_local(), t.has_param_univ(),
1, get_free_var_range(t), g),
// ^ THIS sets expr_composite.m_free_var_range
and that's read in:
(https://github.com/leanprover/lean/blob/master/src/kernel/expr.h#L594)
inline unsigned get_free_var_range(expr const & e) {
switch (e.kind()) {
case expr_kind::Var: return var_idx(e) + 1;
case expr_kind::Constant: case expr_kind::Sort: return 0;
default: return static_cast<expr_composite*>(e.raw())->m_free_var_range; // HERE
}
}
/** \brief Return true iff the given expression has free variables. */
inline bool has_free_vars(expr const & e) { return get_free_var_range(e) > 0; }
/** \brief Return true iff the given expression does not have free variables. */
inline bool closed(expr const & e) { return !has_free_vars(e); }
so the implementations seem to be inconsistent.
but it shouldn't matter because
Mario Carneiro said:
Like I said, lean mostly works on closed terms
Leonard Wiechmann (Aug 05 2022 at 12:58):
so i guess
Mario Carneiro said:
so it could just be for clarity
Mario Carneiro (Aug 05 2022 at 13:31):
A local constant cannot have a type that depends on a var
Leonard Wiechmann (Aug 05 2022 at 16:53):
i mean, syntactically it can, of course.
i guess you're implying that lean always instantiates before putting locals into the context, right?
(so if you have lam (A: Type), lam (a: #Var 0), a
the outer lambda is instantiated first. so a
's type will be a local const by the time a
is put into the context)
Mario Carneiro (Aug 05 2022 at 17:05):
right
Leonard Wiechmann (Aug 21 2022 at 09:09):
question about K-reduction:
my understanding is that it applies to any sub-singleton (because all arguments are given in the indices, so you can reconstruct the intro term).
but the implementation only supports k-reduction if the ctor has no arguments.
https://github.com/leanprover/lean4/blob/fa7769260a805440f34bdd3da982e6925e7d72b4/src/kernel/inductive.cpp#L538
is this because it's not needed in practice and simplifies the implementation, or am i misunderstanding the type theory?
Mario Carneiro (Aug 21 2022 at 19:03):
what you describe is the requirements for large elimination for an inductive type in Prop
. For K-like reduction, you need to have no arguments at all to the constructor because otherwise you would need projections to recover the arguments
Mario Carneiro (Aug 21 2022 at 19:07):
Among common inductive types, only Eq
has K-like reduction (and indeed that's the main application), while Acc
is a large eliminating Prop which does not have K-like reduction
Leonard Wiechmann (Aug 22 2022 at 07:11):
Mario Carneiro said:
For K-like reduction, you need to have no arguments at all to the constructor because otherwise you would need projections to recover the arguments
i'm not sure i understand.
the recursor lets you recover the (non-recursive) arguments.
in rec C e p[b] h
, all non-recursive arguments must occur directly in the indices p[b]
(for sub-singletons).
so you can turn this into rec C e p[b] (intro b)
, recovering b
from p[b]
.
recursive arguments can't be recovered because they can't be used in types.
that's also how i understood this: "This rule only applies when all the variables in b are actually on the LHS, which is the reason for
the peculiar requirements on subsingleton eliminators. If bi appears in the parameters for its type,
that means that pj [b] = bi for some j, and so bi is on the LHS [meaning it can be recovered]."
Leonard Wiechmann (Aug 22 2022 at 07:16):
recovering them would require a little more book keeping (storing the j
for each bi
).
which is why i thought the implementation choice was pragmatic in nature.
Mario Carneiro (Aug 22 2022 at 22:14):
it's possible that lean implements a proper subset of the possible K-like inductives. It's not a very important / used mechanism except in the case of eq
Mario Carneiro (Aug 22 2022 at 22:16):
I think of the three categories of things you can have in subsingleton eliminating types: (1) prop args, (2) arguments that appear in the indices, (3) recursive args, only (2) is allowed in K-like inductives, and lean implements none of them
Leonard Wiechmann (Aug 23 2022 at 07:35):
right, makes sense, thanks!
Last updated: Dec 20 2023 at 11:08 UTC