Zulip Chat Archive

Stream: general

Topic: HOAS support


Todd Wilson (Feb 07 2023 at 22:57):

I used Lean (version 2, I guess) back in 2016 to completely formalize my 2001 JSL paper, but haven't used it since. Coming back now, I see that Lean is already up to version 4. I have an interest in formalizing several HOAS developments that I had previously formalized in Abella and Twelf, and I see that the Lean Manual gives some examples (in Sections 6.5 and 6.6) of how to achieve similar results.

Here is my question: Are the syntax and tactic definition capabilities of Lean 4 sufficient to be able to add new constructions for defining LF-like specifications and reasoning about them in a way that would be familiar to Abella and Twelf users, effectively hiding their "implementations" as either dependent de Bruijn indices or parametric HOAS (or something else)?

For concreteness, here is how I might imagine the syntax looking for both an extrinsic and intrinsic formulation of simply typed lambda calculus (adapted from Twelf):

signature extrinsic
   tm : type.
   lam : (tm -> tm) -> tm.
   app : tm -> tm -> tm.
   tp : type.
   arr : tp -> tp -> tp.
   of : tm -> tp -> type.
   of-lam : ({x:tm}{d:of x T} of (B x) T') -> of (lam B) (arr T T').
   of-app : of M' (arr T T') -> of M T -> of (app M' M) T'.
end

signature intrinsic
   tp : type.
   arr : tp -> tp -> tp.
   tm : tp -> type.
   lam : (tm T -> tm T') -> tm (arr T T').
   app : tm (arr T T') -> tm T -> tm T'.
end

There would also need to be a mechanism for defining "contexts" (Abella) or "worlds" (Twelf) and stating and proving, with tactic support, theorems about contextual judgments in these signatures.

Eric Wieser (Feb 07 2023 at 23:32):

It's hard for me to tell, but this might be a literal translation of your first example into Lean 4

structure Extrinsic where
   tm : Type
   lam : (tm  tm)  tm
   app : tm  tm  tm
   tp : Type
   arr : tp  tp  tp
   of : tm  tp  Type  -- you maybe want `Prop` here
   of_lam {T T' B} {x : tm} {d : of x T} : of (B x) T' of (lam B) (arr T T')
   of_app {T T' M M'} : of M' (arr T T')  of M T  of (app M' M) T'

Todd Wilson (Feb 08 2023 at 00:20):

The point of HOAS, though, is that the argument to lam is not the full function space but just the lambda functions, i.e., terms of type tm relative to a context containing a variable of type tm. Such types, unlike the full function space, have an induction principle on which most of the proofs are based. Once you allow variables in the context, functions and predicates over these types must be generalized to include contexts as part of their "input", which then you have to limit with some kind of context predicate (or "world" in the terminology of Twelf) to get totality. This kind of genericity is captured in both of the approaches in Sections 6.5 and 6.6 of the manual but at an additional "semantic" cost. My question is whether it is possible to hide all of this semantic cost behind a kind of interface that mimics the reasoning you do in Abella or Twelf.

Chris Hughes (Feb 08 2023 at 00:33):

I think something like this. It's not how I would do lamba calculus, but it seems close to your definition.

inductive tp : Type
| arr : tp -> tp - > tp

inductive tm : tp -> Type
| lam : (tm T-> tm T') -> tm (arr T T')
| app : tm (arr T T') -> tm T -> tm T'

Eric Wieser (Feb 08 2023 at 00:34):

Isn't tp isomorphic to empty there?

Chris Hughes (Feb 08 2023 at 00:35):

Yes. I would probably do another constructor like var : Nat -> tp

Todd Wilson (Feb 08 2023 at 00:40):

Chris, I'm not sure whether your post crossed with my reply to Eric, but that answer I think applies also to your example. If I'm not mistaken, the type A -> B in Lean includes all functions from A to B, including functions that can be defined by case splits, recursion, and other means. In HOAS, A -> B is crucially inductively generated only by the constructors of B and (effectively) a new constant of type A.

Chris Hughes (Feb 08 2023 at 00:55):

I think it should be done something like this

inductive tp : Type
| const : Nat - > tp
| arr : tp -> tp -> tp

inductive Context : Type
| empty : context
| snoc : context -> tp -> context

open Context

inductive tm : Context -> tp -> Type
| const : (T : tp) -> Nat -> tm empty T
| intro : tm (snoc C T) T' -> tm C (arr T T')
| app : tm C (arr T T') -> tm C T' -> tm C T
| var : ...

I can't quite figure out the whole lot right now. You need a constructor for var which takes a natural numver smaller than the size of the context.

Chris Hughes (Feb 08 2023 at 00:55):

I don't really understand what -> is in HOAS.

Eric Wieser (Feb 08 2023 at 00:58):

Todd, in case you haven't already seen it, it might provide some inspiration to see how lean models it's own expression syntax (for metaprogramming), at docs4#Lean.Expr. Obviously you can't use that directly.

Eric Wieser (Feb 08 2023 at 00:59):

(ouch, that page is ugly on mobile; I'll try to fix it tomorrow)

Chris Hughes (Feb 08 2023 at 01:00):

Probably you need a constructor for tm C T -> tm (snoc C T') T as well. Lean.Expr allows for expressions that don't type check as well, and this isn't the best way to do lambda calculus I've heard.

Todd Wilson (Feb 08 2023 at 01:08):

Yes, this is closer to what I'm getting at. For var you need some kind of look-up function that has a dependent type depending on the index. But the whole point of my question is trying to hide all of this in a nice "interface".

To illustrate the difference with HOAS, suppose we had this, modified from your example:

inductive tp : Type
| nat : tp
| arr : tp -> tp - > tp

inductive tm : tp -> Type
| z : tm nat
| s : tm nat -> tm nat
| lam : (tm T-> tm T') -> tm (arr T T')
| app : tm (arr T T') -> tm T -> tm T'

Then, the term lam (fun x => if x == z then s z else z)) would have type tm (arr nat nat) assuming that there was an if-then-else available. By contrast, the only terms of type tm (arr nat nat) in HOAS would be lam (fun x => s (s ( ... (s *)))), where * is either z or x.

Eric Wieser (Feb 08 2023 at 01:20):

Presumably lam should just take a tm, and produce a tm whose context has one element removed?

Eric Wieser (Feb 08 2023 at 01:23):

Which is the intro rule in Chris' message

Eric Wieser (Feb 08 2023 at 01:24):

Todd Wilson said:

But the whole point of my question is trying to hide all of this in a nice "interface".

I think it will be a lot easier for people to help with that if we can first get to a correct but ugly/specialized example.

Todd Wilson (Feb 08 2023 at 01:33):

Eric Wieser said:

Todd Wilson said:

But the whole point of my question is trying to hide all of this in a nice "interface".

I think it will be a lot easier for people to help with that if we can first get to a correct but ugly/specialized example.

That's been done already, in two different ways, in Sections 6.5 and 6.6 of the manual. So that's exactly what I'm trying to make less ugly. Maybe I could do both of those examples in Abella and Twelf, for those not familiar, to show what they look like for comparison. (Probably will have to wait until tomorrow, though.)

Trebor Huang (Feb 08 2023 at 09:42):

It's generally hard to translate LF-style into other proof assistants like Lean and Agda. It can be done via a combination of encoding techniques like PHOAS, but the advantage of LF-style assistants is precisely the weaker function space. So if Lean could painlessly do this, languages like Twelf would become useless.

Eric Wieser (Feb 08 2023 at 11:46):

Todd Wilson said:

That's been done already, in two different ways, in Sections 6.5 and 6.6 of the manual.

Oh sorry, I completely missed those two examples

Todd Wilson (Feb 08 2023 at 19:36):

Trebor, thanks for your statement, which is useful and was in fact my own starting point. Glancing over the Lean 4 manual, I was impressed by all of the syntax-extension (Chapter 8 of the manual) and meta-programming (Functional Programming in Lean) capabilities, so I was inspired to ask about whether those capabilities might, in effect, allow one to embed Abella or Twelf into Lean 4 in a nice way, not only so that developments in those systems could be accommodated almost directly, but also so that those systems could effectively be extended to include new type constructors (e.g., universes) and make use of Lean's built in types and reasoning.

And, in fact, the example from Section 6.5 of the manual illustrates this perfectly. I cannot encode that example directly in either Abella or Twelf, because the denote function there requires a universe, as it constructs types recursively from an inductive type, and neither Abella nor Twelf has universes. Furthermore, that example makes use of Lean's Nat type and associated (+) operation, which, if I wanted to code it in Abella or Twelf, would need to be defined from scratch.

Todd Wilson (Feb 10 2023 at 22:29):

So, after the above clarifications, I guess I can restate my question now more concisely as: Are Lean 4's syntax-extension and meta-programming capabilities sufficient for hiding a development in the style of the example in Section 6.5 of the manual (Dependent de Bruijn Indices) behind an Abella- or Twelf-like HOAS "interface" that would seem familiar to anyone with experience with those systems?

I'm hoping that someone with knowledge of both Lean 4 capabilities and LF-style proof assistants can give me some confidence that such a project has a reasonable chance of success before I dive in and make the investment.

Wojciech Nawrocki (Feb 10 2023 at 23:38):

@Todd Wilson the answer to "are Lean 4's syntax-extension and meta-programming capabilities sufficient for implementing MyLF" is almost certainly 'yes', but the implementation effort may vary hugely depending on how you want to do it (I'm sorry to say I did not read your posts carefully enough to understand what exactly you are trying).

  • Maybe you just want to use Lean as a programming language in which you define a type of MyLF syntax, a typechecker for MyLF expressions, etc, etc. Lean 4 is general purpose, so you can write any language you want this way (including Lean, which is implemented in itself). However the implementation effort of doing things this way is huge because you can't really reuse any type-theoretic infrastructure.
  • Maybe you also want to use Lean the type theory (which should really be given a different name for clarity) in which you reason about LF derivations. With this approach you can use all the metaprogramming capabilities purpose-built for this type theory. I am not sufficiently familiar with Twelf to tell whether a shallow embedding can work. If you go with a deep embedding in which you have an inductive type of LF derivations, you can definitely write some meta code which makes it appear to the user as if they are working in LF, but in reality will construct an LF derivation tree in Lean's type theory. You could look to lean-iris for inspiration on how this might work.

I hope this helps somewhat!

Todd Wilson (Feb 11 2023 at 00:09):

Thanks, Wojciech! Of your two bullets, I really have the second in mind -- a "shallow" embedding, as you say, along the lines of the Section 6.5 example. This gives me the ability to use type capabilities of Lean that go beyond the source language, as Section 6.5 again illustrates with its use of universes and pre-defined Lean types like Nat (and Lists, finite sets, etc.), in my LF developments. At the same time, I'd rather have my nice LF specifications translated automatically to the more complicated ones involving de Bruijn indices without constantly having to pay the cost of maintaining those de Bruijn indices and having them clutter my theorems and proofs.

Wojciech Nawrocki (Feb 11 2023 at 00:19):

Am I understanding correctly that you don't want a Lean encoding of the LF syntax itself but rather want to translate constructions in the LF to Lean constructions directly? Indeed this would be closer to something 'shallow'.

Todd Wilson (Feb 11 2023 at 00:25):

Yes, exactly. I want to give an LF specification, perhaps augmented with additional Lean types (e.g, Nat, Lists, finite sets, universes), and have Lean define for me instead the Dependent de Bruijn Indices version behind the scenes, while maintaining the illusion of an LF-style proof assistant with various syntax and tactics that allow me to work directly with the LF specifications, without having to break the abstraction layer and mess with de Bruijn indices -- unless absolutely necessary, but that would just mean that I needed more syntax and tactics to hide some more stuff.

The point is, once I set up this embedding, I could start with the LF developments I've already done and then expand outward to include more and more of Lean's types as needed, kind of like a workbench where extensions to existing LF-style proof assistants could be explored, all within Lean's type system, and with Lean acting as the ultimate consistency check.

Wojciech Nawrocki (Feb 11 2023 at 01:58):

I see, in that case lean-iris is the closest-in-spirit project that I can think of. It sounds like it would be a fair amount of work, but possible!

Todd Wilson (Feb 11 2023 at 02:08):

Thanks for this reference! I'll take a look.


Last updated: Dec 20 2023 at 11:08 UTC