Zulip Chat Archive

Stream: lean4

Topic: open type family with parameters

Sébastien Michelland (May 13 2022 at 09:52):

I'd like to build an open type family using an inductive type. Essentially, the user should be able to add types through a generic interface:

inductive MyType :=
  | myInt
  | generic (α: Type) ... -- parameters as needed

I hope to satisfy the following requirements:

Each new declared type need to have a provably unique string identifier (so that generic types can be compared for equality)
The generic types need to allow parameters (so we can add eg. lists) and might live in different universes

The most natural option is to use a typeclass, which takes care of parameters and universes. However, I don't know if it's possible to restrict typeclass instances in a way that guarantees uniqueness (not to mention expressing it in a proof).

Another option is to make a global registry and add to it with a custom command (which is fine for my application). However, each generic type depends on two universe levels (one for its parameters and one for the type itself) and I don't know how to represent the registry in a way that allows different universes to be mixed.

Any insight or suggestion on how to tackle this would be much appreciated.
Cc @Siddharth Bhat

Mario Carneiro (May 13 2022 at 09:57):

What do you want to do with the provable uniqueness? Proving the strings different is easy but using this to deduce something about the types is hard

Mario Carneiro (May 13 2022 at 09:58):

I would be inclined to do this at the meta-level, i.e. have a declaration in a given namespace for the type family, with the name of the declaration being the provably unique thing

Mario Carneiro (May 13 2022 at 10:01):

Also there is an obvious problem with having MyType contain all of these types, since it would have to live in an arbitrarily large universe in order to encompass types of large universes that have yet to be defined

Sébastien Michelland (May 13 2022 at 10:01):

In the end I'd like to have DecidableEq on MyType; my expectation is that having equal names implies having equal custom types. (The name would be part of the data as well, so different names means different types.) I tend to think of the type family as a dictionary [name → generic type definition], hence the intuition.

Mario Carneiro (May 13 2022 at 10:03):

what if you put Nat -> Nat as one of the generic types? How do you still maintain DecidableEq?

Mario Carneiro (May 13 2022 at 10:05):

If MyType is just the "key" part, then it's just Name or an inductive containing it so DecidableEq is easy, but the hard part is the name -> generic type definition part, depending on what you mean by "generic type definition". What is the type of that function?

Sébastien Michelland (May 13 2022 at 10:08):

You would declare it with a fixed type name (eg. "nat_nat") that would be attached to the generic constructor, so DecidableEq of two generics would compare the names. That would be correct because each name could only be tied to a single custom type.

As for the function that queries the instances, I have tried both typeclass resolution and using a manually-defined dictionary. I'm having trouble allowing parameters in both, though I'm sure some of it is just me not being smart enough.

You're definitely right about the universes. Given the use case we can assume everything lives under a fixed constant universe, eg. Type 20.

Mario Carneiro (May 13 2022 at 10:09):

If we ignore the multiple universes thing and just suppose we want a map like

def MyTypeFamily : Name -> Type
| `Nat => Nat
| `Int => Int
| _ => TBD??

then I think the most reasonable encoding is as a constraint on possible satisfying functions:

structure MyTypeFamily (F : Name -> Type)
  Nat : F `Nat = Nat
  Int : F `Int = Int

which you can extends to add more constraints later

Mario Carneiro (May 13 2022 at 10:11):

But I have to say this problem skirts uncomfortably close to "fundamentally impossible"

Mario Carneiro (May 13 2022 at 10:14):

I'm not exactly sure what you are expecting when you talk about parameters. Do you have an example?

Sébastien Michelland (May 13 2022 at 10:15):

Yeah, it doesn't feel quite right. Initially I just though of it as "use a typeclass that gives the required information" but the equality test makes it much more difficult.

I'm now wondering if I could forsake DecidableEq and use some non-proven boolean equality in my theory, as then I can meta-assume that there is only once occurrence of each name and never have to prove it.

Sébastien Michelland (May 13 2022 at 10:18):

For parameters, I have a multi-dimensional vector type, which depends on a list of dimensions and a base type. The vector is only fully-specified when both of these parameters are applied, so whatever data I attach to generic would need to be parameterized (which typeclass instances do just fine).

Also, the base type is almost always going to be a MyType, and the universe bumps only in very rare cases, which is why it's fine if the whole scheme is typed in a fixed universe and doesn't go arbitrarily high.

Mario Carneiro (May 13 2022 at 10:22):

Could you explain the application a bit more? This seems like an #xy problem

Mario Carneiro (May 13 2022 at 10:23):

Also, the base type is almost always going to be a MyType, and the universe bumps only in very rare cases, which is why it's fine if the whole scheme is typed in a fixed universe and doesn't go arbitrarily high.

Umm.. that's going to be a problem because MyType (the containing type) will need to have a larger universe than MyType (the data)

Sébastien Michelland (May 13 2022 at 10:33):

Sure! I'm modeling language semantics. The language I'm targeting allows new types to be defined; this happens through the interpreter's C++ interface so there is no predefined framework (like structures/inductives/etc in Lean), instead the new types can be any C++ class with some interface.

My goal is to define MyType in a way that reflects this behavior; instead of implementing a C++ class with some interface you would supply a Lean type with a typeclass instance (or something similar).

Restricting what can be defined to fit a clean formal definition in Lean is fine, but one of the things I have to support is that new types can quantify on other types, like vectors of integers.

Umm.. that's going to be a problem because MyType (the containing type) will need to have a larger universe than MyType (the data)

You're right again, that would bump the universe as well. The current formalism I have has an explicit vector type that doesn't quantify, which I why I mixed it up. I do believe this is fine if MyType is universe polymorphic, though.

Mario Carneiro (May 13 2022 at 10:34):

the latter issue is more than just a universe bump, it needs a universe satisfying u < u

Mario Carneiro (May 13 2022 at 10:39):

I think you should have a generic type constructor that encodes the rules for new types into a single "closed" inductive. For example for a lean-like language I would define the mu-operator and its constructors and recursors, and then something like inductive Nat compiles to def Nat := mu `Nat (\lam X, 1 + X)

Mario Carneiro (May 13 2022 at 10:44):

in other words, it is not generic (A : Type) with no constraints on A, it is more like generic (A : TypeSpec) (p : A.args MyType) where TypeSpec is a declaration not depending on MyType which specifies the grammar of new type specifications in the target language and TypeSpec.args gives the arguments to that type (with MyType passed as argument because it can't directly refer to MyType)

Mario Carneiro (May 13 2022 at 10:44):

There will probably be complications, it depends on what TypeSpec looks like

Sébastien Michelland (May 13 2022 at 10:47):

Maybe I'm misunderstanding the universe problem. In the typeclass version, we only register the instantiated type so it works just fine:

class MyTypeIntf (α: Type) where
  default_value: α

inductive MyType :=
  | int
  | generic (α: Type) [MyTypeIntf α]

structure MyType.vector (τ: MyType) where
  length: Nat

instance {τ}: MyTypeIntf (MyType.vector τ) where
  default_value := { length := 0 }

#check MyType.generic (MyType.vector MyType.int)

Mario Carneiro (May 13 2022 at 10:49):

Oh, I thought MyType was data-carrying

Mario Carneiro (May 13 2022 at 10:51):

Well, no actually there is still a problem. How do you have a vector of MyTypes?

Sébastien Michelland (May 13 2022 at 10:55):

You don't; the language doesn't have polymorphism, only type constructors. Looking back I definitely said "quantify" quite liberally. :sweat_smile:

Mario Carneiro (May 13 2022 at 10:56):

Oh I was thinking this is like a Python-style type system with an "any" type

Mario Carneiro (May 13 2022 at 10:58):

In that case, remind me why you need MyType at all? Or type comparisons for that matter

Sébastien Michelland (May 13 2022 at 10:58):

Fortunately not. The example above would be sufficient for the whole system if not for the fact that it doesn't name the types or provide any sort of equality

Sébastien Michelland (May 13 2022 at 10:59):

For program analysis, I want to eg. check whether two variables are defined to be of the same type.

Mario Carneiro (May 13 2022 at 10:59):

can you implement those at the meta-level?

Mario Carneiro (May 13 2022 at 11:00):

This language seems to be at around the same level as lean in consistency strength so it's going to be tricky to implement meta-level operations at the object level

Mario Carneiro (May 13 2022 at 11:01):

like if you have functions like Nat -> Nat "in" your system, then you can't analyze the contents of these functions at the object level in any way other than by calling them

Mario Carneiro (May 13 2022 at 11:03):

you would need to get the meta-level Expr corresponding to the function to take a peek at the variable declarations inside

Sébastien Michelland (May 13 2022 at 11:03):

Well I do have some sort of injectivity; custom types are only equal if they come from the same definition with the same parameters. And we can assume that all parameters have decidable equality. Hence the scheme with the name to identify each custom type

Mario Carneiro (May 13 2022 at 11:04):

If you set aside the data part, it shouldn't be too hard to get a (nonrecursive) type with decidable equality which represents your type tag grammar

Mario Carneiro (May 13 2022 at 11:05):

actually I guess it can be recursive

Mario Carneiro (May 13 2022 at 11:06):

something like

inductive TypeTag
| app : Name -> List TypeTag -> TypeTag

maybe there are some primitives in there but that's the main part

Sébastien Michelland (May 13 2022 at 11:07):

Right. I'll go back and experiment a little more, so I can understand/explore these ideas.

Sébastien Michelland (May 13 2022 at 11:08):

Thank you very much for your feedback, this is extremely useful :grinning:

Andrés Goens (May 13 2022 at 15:22):

We also considered a related problem of having provably-unique identifiers with @Marcus Rossel. We were thinking of just assuming the existence of a function that would give us such names, but we didn't quite figure that out. I know (or rather, suppose) you're both trying to model more or less the same thing here, so probably the same solution would work for both.

Mac (Jul 07 2022 at 20:11):

I stumbled upon this while searching for something else today and felt it might be worthwhile to note that Lake now has its own approach to open type families in Lean: Lake.Util.Family. Its a little hacky and has the potential for some unsafe typing, but I think it works decently for software.

Last updated: May 02 2025 at 03:31 UTC