Zulip Chat Archive

Stream: general

Topic: lean3 impl questions

Leonard Wiechmann (Aug 01 2022 at 17:26):

hey, i've been studying the lean3 source code.
while i've been able to answer most of my questions so far, there are a few things, where asking the authors seems like the better idea:

what's the rationale behind using expr_local for uses of locals? is it just the convenience of not having to look up the local in the local context to get its type and stuff like that, or is there a "deeper" reason?
where is the kernel type checker invoked for definitions commands? or do you not do that for regular operation (and instead have some "verify the stuff please" action)?

Kyle Miller (Aug 01 2022 at 17:29):

For your second question, it should be https://github.com/leanprover-community/lean/blob/master/src/frontends/lean/definition_cmds.cpp#L194 and you can see where it gets called in the definition commands. (I'm not the author.)

Leonard Wiechmann (Aug 01 2022 at 17:35):

ah, indeed, thanks a lot!
i was searching for type_checker tc( cause that seemed to be a common pattern, but only found it in the check command.

Leonard Wiechmann (Aug 01 2022 at 17:55):

another question related to the first one: expr_var doesn't seem to be used much at all. what's the rationale behind that?
actually, nvm. it's used in instantiate_rev_locals and resolve_local_name_core, which are called quite a bit.

Mario Carneiro (Aug 01 2022 at 17:58):

That's the locally nameless approach. Most functions expect closed terms, so whenever you "enter" a binder you instantiate all the vars in it for local constants

Mario Carneiro (Aug 01 2022 at 17:59):

note that the actual type information in a local_const is not reliable, it's usually a dummy value

Mario Carneiro (Aug 01 2022 at 18:00):

I'm not sure what you mean by "rationale behind using expr_local for uses of locals". What else would you use it for?

Leonard Wiechmann (Aug 01 2022 at 18:00):

i see, so using expr_local is a kind of "certification/invariant" thing (> Most functions expect closed terms)

Mario Carneiro (Aug 01 2022 at 18:01):

in any case, in lean 4 local_const is now .fvar (fvarId : Name) : Expr, so it seems it was agreed that all the other stuff in the local const is dead weight

Mario Carneiro (Aug 01 2022 at 18:02):

it's now solely a pointer into the LocalContext which has the actual type information and whatnot

Leonard Wiechmann (Aug 01 2022 at 18:02):

i found it a little weird that both the "definition" and the "use" of a local used exactly the same expr (except for adjusted source position information).

Leonard Wiechmann (Aug 01 2022 at 18:03):

it's now solely a pointer into the LocalContext which has the actual type information and whatnot

that sounds more like what i would have expected, i think.

Mario Carneiro (Aug 01 2022 at 18:03):

Huh? what is a "definition" of a local constant?

Leonard Wiechmann (Aug 01 2022 at 18:04):

well fun id (a: nat) -> nat := a
the (a: nat) part is the definition.
the a part in the body is the use.
clearly at source level, these are quite different, but in the ast, they use the same expr.

Mario Carneiro (Aug 01 2022 at 18:04):

I suppose you could call the things in the LocalContext the "definition of a local_const", but that's local_decl, not local_const

Mario Carneiro (Aug 01 2022 at 18:05):

The expression (a: nat) -> nat is represented as an expr like Pi "a" (const "nat") (const "nat"), there aren't any local_const involved at all

Leonard Wiechmann (Aug 01 2022 at 18:07):

i'm not sure what local_const is.
i was talking about lean3's expr_local :thinking:

Mario Carneiro (Aug 01 2022 at 18:07):

docs#expr

Mario Carneiro (Aug 01 2022 at 18:07):

the names on the C++ side are weird

Leonard Wiechmann (Aug 01 2022 at 18:09):

ok, i see.

Leonard Wiechmann (Aug 01 2022 at 18:09):

but the body (a) would be a local_const?

Mario Carneiro (Aug 01 2022 at 18:11):

the local consts get involved when you enter a binder during elaboration. Let's say the body is \lam a, a, so we start by entering the a binder. This means creating a new local_decl like {name := "a", uniq := 1234, type := nat} and then all occurrences of var 0 get replaced with local_const "a" 1234 <>

Mario Carneiro (Aug 01 2022 at 18:13):

and then we would resolve the user reference a to that local constant, and when we close the scope we would abstract it back to var 0 resulting in the term lam "a" (const "nat") (var 0) for that identity function

Leonard Wiechmann (Aug 01 2022 at 18:14):

ok, makes sense!
what's the idea behind replacing the vars with locals, instead of doing the lookups ad hoc? (this really is my central question)

Leonard Wiechmann (Aug 01 2022 at 18:15):

i guess at some points they're turned back into local_consts, because the kernel (in lean3) doesn't seem to accept vars.

Leonard Wiechmann (Aug 01 2022 at 18:17):

i've got to go, back in like an hour.
thanks for the help so far!

Leonard Wiechmann (Aug 01 2022 at 18:23):

i guess one reason for switching to the unique names could be that you don’t have to bother with adjusting de-bruijn indices or alpha conversion.

Leonard Wiechmann (Aug 01 2022 at 18:35):

hmm, but then why would de-bruijn be used at all? :thinking:

Kyle Miller (Aug 01 2022 at 18:55):

I believe this is the keyword:

That's the locally nameless approach.

There seem to be papers you could look up about "locally nameless representation"

Mario Carneiro (Aug 01 2022 at 19:00):

Leonard Wiechmann said:

what's the idea behind replacing the vars with locals, instead of doing the lookups ad hoc? (this really is my central question)

Leonard Wiechmann said:

i guess one reason for switching to the unique names could be that you don’t have to bother with adjusting de-bruijn indices or alpha conversion.

You answered your own question. It basically localizes the problem of lifting and substitution to one place, so nothing else has to worry about it. See also http://www.cs.ru.nl/~james/RESEARCH/haskell2004.pdf which explains some of the theory.

Mario Carneiro (Aug 01 2022 at 19:01):

Leonard Wiechmann said:

i guess at some points they're turned back into local_consts, because the kernel (in lean3) doesn't seem to accept vars.

Actually it's the other way around. Kernel terms must not contain local_consts, although the kernel itself also does the locally nameless thing so it will turn vars into local_consts as it traverses the term.

Leonard Wiechmann (Aug 01 2022 at 20:00):

Kyle Miller said:

There seem to be papers you could look up about "locally nameless representation"

thanks, i'll take a look!

Leonard Wiechmann (Aug 01 2022 at 20:00):

Mario Carneiro said:

It basically localizes the problem of lifting and substitution to one place, so nothing else has to worry about it. See also http://www.cs.ru.nl/~james/RESEARCH/haskell2004.pdf which explains some of the theory.

that makes sense, thanks for the link!

Leonard Wiechmann (Aug 01 2022 at 20:03):

Mario Carneiro said:

Actually it's the other way around. Kernel terms must not contain local_consts, although the kernel itself also does the locally nameless thing so it will turn vars into local_consts as it traverses the term.

ah, i see. i did wonder how the kernel verified the bindings if the input used local_const.
i saw the lean_assert(closed(e)); in type_checker::infer_type_core, i must have missed the conversion.

Leonard Wiechmann (Aug 01 2022 at 20:41):

found it: finalize_definition in definition_cmds.cpp abstracts the locals.
and stuff like type_checker::infer_lambda temporarily switches to expr_local for inference.
seems closed(e) doesn't mean "doesn't contain any vars", but just means "doesn't contain free vars". (yes, that's exactly what it means. says it in the comment right above it... :D)
man, this would probably be so much easier, if i had just downloaded and compiled the code :D

Leonard Wiechmann (Aug 01 2022 at 20:45):

anyway, thanks kyle and mario for the help, really appreciate it!

Leonard Wiechmann (Aug 02 2022 at 13:35):

working my way through the typing rules in the lean paper again (The Type Theory of Lean).
have a few questions:

1) has the type theory changed in any way for lean4?

2) isn't the proof irrelevance reduction rule (second line in the image) technically unjustified?
the def-eq rule (first line) only has one p, no p'.
it's my understanding that you can't "apply def-eq to the rules themselves" (i.e. if the rule only has one p, you can't apply it if you have two different p, p', even if they're def-eq).
image.png

Mario Carneiro (Aug 02 2022 at 21:19):

1) Yes and no. The theory is not supposed to have changed in any essential way, but lean 4 does include some new kernel features not supported by lean 3, mostly optimizations of things that were supported already but not primitively.

Primitive projections. This is just shorthand for a rec term applied to a lambda that extracts one of the arguments, and it has the expected definitional equalities.
Primitive nat and string literals. This makes #reduce stop early and changes the behavior of whnf around these terms, but otherwise they are just shorthand for the term construction you could do before.
- Primitive nat literal operations like Nat.add. It only works on literals, not variables, so it can only compute things that were already computable through a long sequence of kernel reductions.
Eta for structures. This is a "real" change, but it is actually a bit closer to the paper than before, since I had to add eta for (some) structures in the "reduction to W-types" chapter. It does not pose any really difficult problems.
Nested and mutual inductives. This is a big change, and one that probably needs some investigation at some point, but it should be checking the same rules as in lean 3 (where nested and mutual inductives were only simulated by the frontend). The simulation did not have all the desired definitional equalities though, so this is a real change.

Mario Carneiro (Aug 02 2022 at 21:25):

2) The "algorithmic" rules do not have to be 1 for 1 copies of the "idealized" rules. They just need to be able to show essentially the same things in the same context. In this case, you can show that if you replace $\Leftrightarrow$ with $\equiv$ in the second rule you still get a provable theorem: Suppose $\Gamma\vdash p:\mathbb{P}$ , $\Gamma\vdash h:p$ , $\Gamma\vdash h':p'$ , and $\Gamma\vdash p\equiv p'$ . Then $\Gamma\vdash p'\equiv p$ , so $\Gamma\vdash h':p$ by the conv rule ( $\Gamma\vdash\alpha\equiv\beta$ implies $\Gamma\vdash e:\alpha$ -> $\Gamma\vdash e:\beta$ ). Thus $\Gamma\vdash h\equiv h'$ by proof irrelevance.

Reid Barton (Aug 02 2022 at 21:39):

BTW, does "eta for structures" literally mean for things defined with the structure keyword? Or does the equivalent inductive also have eta?

Eric Rodriguez (Aug 02 2022 at 22:13):

https://github.com/leanprover/lean4/commit/0aa32d643e063efd68bd31478ccd8028a3de7533

Mario Carneiro (Aug 02 2022 at 22:18):

Yes:

inductive Foo : Type
| mk : Nat → Nat → Foo

def Foo.fst : Foo → Nat := λ ⟨n, _⟩ => n
def Foo.snd : Foo → Nat := λ ⟨_, n⟩ => n

example (f : Foo) : f = Foo.mk f.fst f.snd := rfl -- ok

inductive Rec : Type
| mk : Nat → Rec → Rec

def Rec.fst : Rec → Nat := λ ⟨n, _⟩ => n
def Rec.snd : Rec → Rec := λ ⟨_, n⟩ => n

example (f : Rec) : f = Rec.mk f.fst f.snd := rfl -- fail

Mario Carneiro (Aug 02 2022 at 22:20):

The kernel doesn't actually receive a "structure" flag. It computes whether an inductive is "structure-like" itself

Leonard Wiechmann (Aug 03 2022 at 08:59):

Mario Carneiro said:

2) The "algorithmic" rules do not have to be 1 for 1 copies of the "idealized" rules. They just need to be able to show essentially the same things in the same context.

meaning "imply the ideal rules"? (in a way that's implementation friendly)
totally forgot about the conv rule. noticed it was absent, but didn't realize it was "baked into the proof irrelevance rule" (like eta, which had pi-intro baked in, turning it into an extensionality principle).

Mario Carneiro (Aug 03 2022 at 09:07):

The conv rule doesn't work very well algorithmically, since you don't know what other type to prove definitionally equal to the one you have. Instead, everywhere type equality constraints appear, those equalities get replaced with definitional equality checks

Mario Carneiro (Aug 03 2022 at 09:08):

Another standard place definitional equality checks appear is in the function application rule

Mario Carneiro (Aug 03 2022 at 09:09):

... I guess it doesn't show up in the paper since I don't give an algorithmic analogue of the typing judgment itself (i.e. a typechecker)

Leonard Wiechmann (Aug 03 2022 at 09:26):

Mario Carneiro said:

The conv rule doesn't work very well algorithmically, since you don't know what other type to prove definitionally equal to the one you have.

which also applies to the transitivity rule, i guess (besides that thing about sub-singleton-elim & decidability).

Mario Carneiro said:

Another standard place definitional equality checks appear is in the function application rule
... I guess it doesn't show up in the paper

i'm not sure i understand.

there's this, looks like a func app rule to me: image.png

and these two: image.png

Mario Carneiro (Aug 03 2022 at 13:59):

the first one is application compatibility, and the second one is the beta rule. The function application rule is

$\dfrac{\Gamma \vdash f:\forall x: A.\;B(x)\quad \Gamma \vdash e: A}{\Gamma\vdash f\;x: B[e/x]}$

Mario Carneiro (Aug 03 2022 at 14:01):

and the algorithmic version looks like

$\dfrac{\Gamma \Vdash f:\forall x: A.\;B(x)\quad \Gamma \Vdash e: A'\quad \Gamma \vdash A\Leftrightarrow A'}{\Gamma\Vdash f\;x: B[e/x]}$

Leonard Wiechmann (Aug 05 2022 at 12:32):

Mario Carneiro said:

and the algorithmic version looks like

$\dfrac{\Gamma \Vdash f:\forall x: A.\;B(x)\quad \Gamma \Vdash e: A'\quad \Gamma \vdash A\Leftrightarrow A'}{\Gamma\Vdash f\;x: B[e/x]}$

ah, makes sense.
so like with proof irrelevance, you bake in the conversion rule.

Leonard Wiechmann (Aug 05 2022 at 12:44):

i found this check in lean3's abstract function:
lean_assert(std::all_of(subst, subst+n, [](expr const & e) { return closed(e) && is_local(e); }));

i was confused by the closed(e) part. how can a local not be closed? in lean3 maybe if the type isn't closed.
then i checked lean4 to see if there were any changes.
and indeed there were.

the check is still there, just called !has_loose_bvars(e) now. (https://github.com/leanprover/lean4/blob/6a767a66a13c588cce3829bf1de00a1830883843/src/kernel/abstract.cpp#L15)
however it doesn't seem to do anything.

has_loose_bvars seems to check the computed Expr.data field. but looseBVarRange is always zero for fvars: https://github.com/leanprover/lean4/blob/6a767a66a13c588cce3829bf1de00a1830883843/src/Lean/Expr.lean#L428

is that a bug? (and the fvar's type should have been checked for loose bvars; not that the type should ever contain any bvars either, because it's instantiated before putting it into the local context, though that's technically not enforced by the local context)
or is that just a redundant check?

Mario Carneiro (Aug 05 2022 at 12:45):

Like I said, lean mostly works on closed terms

Mario Carneiro (Aug 05 2022 at 12:45):

that doesn't mean no local consts, that means no vars

Mario Carneiro (Aug 05 2022 at 12:45):

or bvars in lean 4

Leonard Wiechmann (Aug 05 2022 at 12:46):

yes, i got that.

Mario Carneiro (Aug 05 2022 at 12:46):

so it's not a bug that looseBVarRange is 0 for fvars

Mario Carneiro (Aug 05 2022 at 12:46):

because it doesn't have any bvars in it

Leonard Wiechmann (Aug 05 2022 at 12:47):

yeah, that's why i was confused by that check.

Leonard Wiechmann (Aug 05 2022 at 12:47):

so it's just redundant?

Mario Carneiro (Aug 05 2022 at 12:47):

there are other expressions that have bvars in them

Mario Carneiro (Aug 05 2022 at 12:47):

but fvars don't

Leonard Wiechmann (Aug 05 2022 at 12:48):

this is the check: return !has_loose_bvars(e) && is_fvar(e); (https://github.com/leanprover/lean4/blob/6a767a66a13c588cce3829bf1de00a1830883843/src/kernel/abstract.cpp#L15)
conjunction, but the second implies the first.

Leonard Wiechmann (Aug 05 2022 at 12:49):

in lean3, it seems to check the type's vars.

Mario Carneiro (Aug 05 2022 at 12:50):

Oh I see. That could be a performance optimization but it seems unlikely, both checks should be O(1)

Mario Carneiro (Aug 05 2022 at 12:50):

it could also be grandfathered code since it's the exact same line modulo naming

Mario Carneiro (Aug 05 2022 at 12:51):

and it's in an assert anyway so it doesn't matter

Mario Carneiro (Aug 05 2022 at 12:51):

so it could just be for clarity

Leonard Wiechmann (Aug 05 2022 at 12:57):

i think the behavior is technically different in lean3:

locals are initialized with their type's free var range.
(https://github.com/leanprover/lean/blob/master/src/kernel/expr.cpp#L164)

expr_mlocal::expr_mlocal(bool is_meta, name const & n, name const & pp_n, expr const & t, tag g):
    expr_composite(is_meta ? expr_kind::Meta : expr_kind::Local, n.hash(), is_meta || t.has_expr_metavar(), t.has_univ_metavar(),
                   !is_meta || t.has_local(), t.has_param_univ(),
                   1, get_free_var_range(t), g),
                      // ^ THIS sets expr_composite.m_free_var_range

and that's read in:
(https://github.com/leanprover/lean/blob/master/src/kernel/expr.h#L594)

inline unsigned get_free_var_range(expr const & e) {
    switch (e.kind()) {
    case expr_kind::Var:                            return var_idx(e) + 1;
    case expr_kind::Constant: case expr_kind::Sort: return 0;
    default:                                        return static_cast<expr_composite*>(e.raw())->m_free_var_range;  // HERE
    }
}
/** \brief Return true iff the given expression has free variables. */
inline bool has_free_vars(expr const & e) { return get_free_var_range(e) > 0; }
/** \brief Return true iff the given expression does not have free variables. */
inline bool closed(expr const & e) { return !has_free_vars(e); }

so the implementations seem to be inconsistent.
but it shouldn't matter because
Mario Carneiro said:

Like I said, lean mostly works on closed terms

Leonard Wiechmann (Aug 05 2022 at 12:58):

so i guess

Mario Carneiro said:

so it could just be for clarity

Mario Carneiro (Aug 05 2022 at 13:31):

A local constant cannot have a type that depends on a var

Leonard Wiechmann (Aug 05 2022 at 16:53):

i mean, syntactically it can, of course.
i guess you're implying that lean always instantiates before putting locals into the context, right?
(so if you have lam (A: Type), lam (a: #Var 0), a the outer lambda is instantiated first. so a's type will be a local const by the time a is put into the context)

Mario Carneiro (Aug 05 2022 at 17:05):

right

Leonard Wiechmann (Aug 21 2022 at 09:09):

question about K-reduction:
my understanding is that it applies to any sub-singleton (because all arguments are given in the indices, so you can reconstruct the intro term).
but the implementation only supports k-reduction if the ctor has no arguments.
https://github.com/leanprover/lean4/blob/fa7769260a805440f34bdd3da982e6925e7d72b4/src/kernel/inductive.cpp#L538
is this because it's not needed in practice and simplifies the implementation, or am i misunderstanding the type theory?

Mario Carneiro (Aug 21 2022 at 19:03):

what you describe is the requirements for large elimination for an inductive type in Prop. For K-like reduction, you need to have no arguments at all to the constructor because otherwise you would need projections to recover the arguments

Mario Carneiro (Aug 21 2022 at 19:07):

Among common inductive types, only Eq has K-like reduction (and indeed that's the main application), while Acc is a large eliminating Prop which does not have K-like reduction

Leonard Wiechmann (Aug 22 2022 at 07:11):

Mario Carneiro said:

For K-like reduction, you need to have no arguments at all to the constructor because otherwise you would need projections to recover the arguments

i'm not sure i understand.
the recursor lets you recover the (non-recursive) arguments.
in rec C e p[b] h, all non-recursive arguments must occur directly in the indices p[b] (for sub-singletons).
so you can turn this into rec C e p[b] (intro b), recovering b from p[b].
recursive arguments can't be recovered because they can't be used in types.
that's also how i understood this: "This rule only applies when all the variables in b are actually on the LHS, which is the reason for
the peculiar requirements on subsingleton eliminators. If bi appears in the parameters for its type,
that means that pj [b] = bi for some j, and so bi is on the LHS [meaning it can be recovered]."

Leonard Wiechmann (Aug 22 2022 at 07:16):

recovering them would require a little more book keeping (storing the j for each bi).
which is why i thought the implementation choice was pragmatic in nature.

Mario Carneiro (Aug 22 2022 at 22:14):

it's possible that lean implements a proper subset of the possible K-like inductives. It's not a very important / used mechanism except in the case of eq

Mario Carneiro (Aug 22 2022 at 22:16):

I think of the three categories of things you can have in subsingleton eliminating types: (1) prop args, (2) arguments that appear in the indices, (3) recursive args, only (2) is allowed in K-like inductives, and lean implements none of them

Leonard Wiechmann (Aug 23 2022 at 07:35):

right, makes sense, thanks!

Last updated: May 02 2025 at 03:31 UTC