Zulip Chat Archive

Stream: general

Topic: Implementing internal language with proper variable scoping

Andrej Bauer (Dec 12 2024 at 14:15):

Are there examples of implementation of an internal language (of a category), especially ones with proper variable bindings? Something along the lines of:

Algebraic theories: write something like x, y, z ⊢ t to build a term t in the context with variables x, y, z. The term t denotes a morphism S × S × S → S where S is the carrier of the algebra (an object in a category with finite limits). For example x, y, z ⊢ y should be the second projection. If we also introduce pairs, then x ⊢ (x,x) has to be the diagonal.
Typed λ-calculus as the internal language of a cartesian-closed category (ccc): here the challenge is handling of bound variables. A term x : A ⊢ t : B would denote a morphism A → B where A and B are objects of the underlying ccc.

I am looking for code that will teach me how to humanely handle variable scoping (the x, y, z in the above examples, and bound variables in a λ-calculus). I suspect that this is the job for the elaborator, but I am not sure. It might be the case that I can use parsing to construct a tree representation, and apply an interpretation functions to the tree. However, the tree would have to be valid, so some checking of variable scopes and type-checking needs to occur (so we're back to an elaborator).

The best I managed to do so far was to use de Bruijn indices for variables, and even there I cheated by interpreting invalid de Bruijn indices in an ad hoc maner. Wasn't there a "HOTT0" project along the lines of what I am asking for? Perhaps @Mario Carneiro did some work on it?

Andrej Bauer (Dec 12 2024 at 14:22):

So there is an implementation of the simply-typed λ-calculus at https://github.com/muldrik/lean-stlc/tree/master which uses strings as variable names. I suppose that's not too bad, but one would hope that Lean would offer some support for variables.

Mario Carneiro (Dec 12 2024 at 14:27):

Unfortunately HoTT-0 isn't yet at the stage where it would have such a tactic. The thing we end up with is also likely to be rather more "heavyweight" than you would want for simply interpreting the language of CCCs. I think there has been some discussion about that before but I don't know where it's gone.

One concrete implementation which gets pretty close to what you want and may be good enough to cargo cult is a small tactic I wrote for lean4lean to be able to quote lean terms into the deeply embedded VExpr type used for reasoning about the lean type theory. You give it a typing context and an expression (which is elaborated using lean, so it can e.g. infer types in the expression), like and it returns a literal of the VExpr type. There are a few versions of it, here is an example use to define lean's quotient definitional equality rule

Mario Carneiro (Dec 12 2024 at 14:31):

cc: @Wojciech Nawrocki who has also thought a lot about this kind of thing

Mario Carneiro (Dec 12 2024 at 14:36):

Another (rather elaborate) example of this is the Qq library, which builds a custom context within which to elaborate lean terms, then post-processes the result to produce typed Expr terms

Mario Carneiro (Dec 12 2024 at 14:38):

As long as the internal language is a subset of lean (possibly extended with axiomatic constants), the nicest approach here is likely to set up a custom context, elaborate the body as plain lean syntax, then postprocess the result to translate it into your internal language and ensure nothing outside the subset is accepted.

Jannis Limperg (Dec 12 2024 at 19:27):

I have a toy embedding of STLC (with a nice elaborator) here in case that helps.

Jannis Limperg (Dec 12 2024 at 19:29):

Examples are at the bottom of that file. Some of them fail to typecheck, in which case you get a compile-time error.

Last updated: May 02 2025 at 03:31 UTC