Zulip Chat Archive

Stream: lean4

Topic: universe polymorphic IO

Sébastien Michelland (May 16 2022 at 13:28):

I have a set of functions living in IO which I use to produce a data structure (that recently became universe polymorphic) along with logs and possible errors. I really like IO for this task because logs and errors are displayed in a very natural way by LSP plugins. Unfortunately, IO only lives in Type, so it can't accommodate for the latest version of the data structure.

I've seen a couple of threads suggesting that there is no reason IO couldn't be universe polymorphic, albeit with the limitation of using ULift (which is fine for me). How realistic would it be to make it universe polymorphic in the current codebase? Specifically is it an achievable target for a new contributor?

If there are other ways to achieve the same integration as IO with a lower-level structure, I'd be interested to know. I tried returning some Results but that doesn't seem to be what top-level error displays are based on.

Cc @Siddharth Bhat

Leonardo de Moura (May 16 2022 at 21:05):

How realistic would it be to make it universe polymorphic in the current codebase? Specifically is it an achievable target for a new contributor?

It is a realistic goal, but it can get messy. We consider this a low-priority modification because it is currently quite inconvenient to manually add the ULift. Moreover, error messages corresponding to missing ULifts are quite hard to understand. See comment at https://github.com/leanprover/lean4/issues/1136

Eric Wieser (May 16 2022 at 21:45):

There was some discussion in the context of lean3 about changing the monadic typeclasses to use out_params rather than quantifying over types in their fields; such that no ulift would be needed. I'll try and find the thread...

Eric Wieser (May 16 2022 at 21:49):

Zulip doesn't let me find a link to the message on mobile, but the thread was

https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/Changing.20the.20functor.20typeclasses

Sébastien Michelland (May 17 2022 at 09:29):

Leonardo de Moura said:

It is a realistic goal, but it can get messy. We consider this a low-priority modification because it is currently quite inconvenient to manually add the ULift. Moreover, error messages corresponding to missing ULifts are quite hard to understand. See comment at https://github.com/leanprover/lean4/issues/1136

Thanks. I've certainly seen the confusing messages while updating other parts of my project. Fortunately I had read about this effect of bind before, so I didn't have too much trouble. At least it seems worth trying.

Eric Wieser said:

There was some discussion in the context of lean3 about changing the monadic typeclasses to use out_params rather than quantifying over types in their fields; such that no ulift would be needed. I'll try and find the thread...

I don't fully get the approach here. As far as I understand, outParam (I assume it's still the same in Lean4) only influences the resolution process. But the trouble with ULift comes from the use of do/bind, which are going to remain the same regardless of how the instance is found. Could you please elaborate a little bit?

Eric Wieser (May 17 2022 at 09:34):

The reason bind causes universe troubles it that docs4#Bind.bind is defined as

class Bind (m : Type u → Type v) where
  bind : {α β : Type u} → m α → (α → m β) → m β

which restricts α and β to the same universe, The suggestion in that thread is to change it to something like

-- this allows `α` to be extracted from `list α`
class is_monadic (α : out_param Type*) (mα : Type*) : Prop

instance {α : Type*} : is_monadic α (list α) := is_monadic.mk

class Bind' {α β : out_param Type*} (mα mβ : Type*) [is_monadic α mα] [is_monadic β mβ] :=
  map : mα → (α → mβ) → mβ

which allows α and β to live in different universes. It also allows restrictions to be put on α and β, such as finiteness or decidable equality.

Sébastien Michelland (May 17 2022 at 10:18):

Right, I see now. This sounds like it would impact a lot of code though, compared to using ULift :thinking:

Eric Wieser (May 17 2022 at 10:33):

Yes, this would be a big change; but also, mathlib would be able to make a lot more use of these classes if they were phrased in this way. Right now we just end up mostly not using them at all because we'd lose universe polymorphism.

Eric Wieser (May 17 2022 at 10:34):

One example is defining a bind operation on finsets, which isn't possible without decideable equality. With Bind, there's nowhere to put that assumption, so it's not possible. With Bind', you can put it before the colon in the Bind' instance as something like

instance [DecidableEq β] : Bind' (Finset α) (Finset β) where
   bind s f := s.bUnion f

(docs#finset.bUnion)

Eric Wieser (May 17 2022 at 12:11):

(cc @Yaël Dillies, who may not follow this stream)

Mac (May 17 2022 at 16:30):

I made a similar suggestion for an HBind a while back which would could also solve the problem. See this message.

Reid Barton (May 17 2022 at 17:12):

How would you express operations that are polymorphic in a monad in this setup?

Sébastien Michelland (May 17 2022 at 19:54):

An HBind typeclass looks pretty appealing (from my limited perspective) since it would be non-invasive. I also looked at the code that elaborates do, and the uses of bind are fairly shallow, so it seems like testing a variation of do that uses it would be possible.

I'm not sure what would be the best way to approach the problem. @Leonardo de Moura do you have any thoughts on ULift vs. heterogeneous binds? Is it maybe worth an RFC issue?

Leonardo de Moura (May 17 2022 at 21:37):

@Sébastien Michelland This kind of modification is tricky and generates a bunch of unexpected elaboration problems. It is hard to tell without trying. Note that HBind does not propagate type information as effectively as the regular Bind. We have a bunch of polymorphic monadic operations and their elaboration will be "stuck" until we apply the HBind default instance. HBind will also affect a few meta-programs such as the "auto-lifting" procedure. Recall that it took us several iterations to get the heterogeneous arithmetic operators working in a reasonable way, but we still had to improve the elaborator a few weeks ago.
I do not have time right now to perform these experiments and devise workarounds for the new issues HBind may create. I will be very happy if you have time to try it. We cannot guarantee we will be able to merge the solution, but I am sure we will learn a lot from this experiment.

Leonardo de Moura (May 17 2022 at 21:39):

I have similar reservations regarding the is_monadic solution. Both is_monadic and HBind are at a very preliminary stage, and both may generate a bunch of unexpected elaboration issues.

Eric Wieser (May 17 2022 at 23:28):

Reid Barton said:

How would you express operations that are polymorphic in a monad in this setup?

Was that aimed at me or Mac? I'm not sure I understand the question.

Leonardo de Moura (May 18 2022 at 00:00):

@Eric Wieser I would also be very happy if you could explore the is_monadic approach you suggested. I could only find preliminary experiments in the commit you posted above. You had compelling examples above. Again, I think even if the experiments are not successful, I think we would learn a lot from them. For is_monadic, I am also curious about the interaction with all "gadgets" we use (e.g., transformers, MonadLift, MonadControl, etc). It would also be great to explore whether the two approaches could coexist. I suspect it should be feasible, but I didn't have time to play with your approach yet.

Eric Wieser (May 18 2022 at 00:17):

I'm unlikely to find time for that any time soon unfortunately (I haven't written any lean4 yet); but was hoping a more interested party might pick up the idea in the meantime!

We of course also need to be wary of any such change disturbing mathport in a big way, but I'm not remotely qualified to assess that.

Sébastien Michelland (May 18 2022 at 14:58):

Let me try to get some basics going. I pushed some tests in my project with @Siddharth Bhat.

Using HBind in itself works exactly as one would expect; the first harder challenge is to have do use it.

The elaborator for do uses MProd instead of Prod in a couple of places because it assumes no universe polymorphism, so I changed it back to Prod. I also mechanically replaced bind with hBind. This allows non-polymorphic examples to elaborate properly, eg.

def test_IO: IO Unit := hdo
  IO.println "Lean4"
  IO.println "hBind"
  return ()
-- HBind.hBind (IO.println "Lean4") fun x =>
-- HBind.hBind (IO.println "hBind") fun x =>
-- pure ()

The next issue is that do finds the monad in the return type (here IO Unit) and refers to it through a meta-variable. Both of these facts restrict the construction to a single universe.

We hacked around this by allowing the polymorphic form of the monad to be specified as an argument of the hdo notation, then building a new definition instead of a metavariable. This allows the following expression to elaborate (where each get_* returns a value in a different universe):

universe u in
example: Id Unit := hdo (monad := Id.{u})
  let _ ← get_0
  let _ ← get_1
  let _ ← get_any.{2}
  get_any.{0}

I'm pretty happy about this, even if it's mostly hacks. :smile: Feedback on how to do these things properly would be much welcome. In particular I'd like to know why a metavariable is created to refer to the monad instance.

Mac (May 18 2022 at 16:36):

Sébastien Michelland said:

I'm pretty happy about this, even if it's mostly hacks. :smile: Feedback on how to do these things properly would be much welcome.

Looks reasonable to me. The quickest way to test it would be to apply these changes to the Lean core and see if it builds and passes all the tests.

Eric Wieser (May 18 2022 at 16:38):

Note that the Hbind solution doesn't solve the finset DecidableEq problem

Leonardo de Moura (May 18 2022 at 18:55):

@Sébastien Michelland This is great start.

The elaborator for do uses MProd instead of Prod in a couple of places because it assumes no universe polymorphism, so I changed it back to Prod.

The MProd was originally introduced to ensure Lean did not get stuck into universe-level constraints while elaborating the for in do notation. Note that, we used Prod before we switched to MProd. Since then, we have improved the universe-level constraint solving part, but you should expect problems when switching back to MProd. Universe-level constraints still create issues. See comment on this recent issue. https://github.com/leanprover/lean4/issues/1136

In particular I'd like to know why a metavariable is created to refer to the monad instance.

This is a trick to embed an expression inside of Syntax. Note that, m in the code you pointed to is an Expr, then we create a metavariable ?m := m and refer to it in Syntax objects using the named hole notation ?m.

I'm pretty happy about this, even if it's mostly hacks.

It would be great if you could try harder examples before trying to apply the changes to Lean core. I think it is very unlikely you will be able to compile Lean core using these changes. Trying examples that use for in, try catch, ... will provide you insight into the challenges ahead. I think it is also worth experimenting with polymorphic monadic operators (i.e., functions that should work with arbitrary monads, and have parameters such as [MonadRef m], [MonadEnv m], etc), and functions that require auto lifting.

Reid Barton (May 20 2022 at 11:23):

Leonardo de Moura said:

I think it is also worth experimenting with polymorphic monadic operators (i.e., functions that should work with arbitrary monads, and have parameters such as [MonadRef m], [MonadEnv m], etc), [...]

@Eric Wieser This last part is what I meant earlier.

Sébastien Michelland (Jul 14 2022 at 13:15):

I'm picking this up again, and started by pushing the code to a proper repository updated to a recent nightly: https://github.com/lephe/lean4-hbind
I will try and build more complex examples to see how it inevitably fails. I'm not familiar with all the features of do notation yet.

Sébastien Michelland (Jul 14 2022 at 15:31):

Some more progress: I updated the elaborator (which has access to the monad name as a identifier with a universe polymorphic definition) to specify it on both sides of every hBind, which makes it possible to use actions defined for any monad:

-- Dn: Type n; Dx is universe polymorphic
def getD0 [Monad m]: m D0 := pure default
def getD1 [Monad m]: m D1 := pure default
def getDx [Monad m]: m Dx := pure default

universe u in
example: Id Unit := hdo (monad := Id.{u})
  let _ ← getD0
  let _ ← getD1
  let _ ← getDx.{2}
  getDx.{0}

Nothing fancy obviously, but I did spend some time trying to make the monad alias @[reducible]. I couldn't find where the DefViews are used, and ended up using Term.applyAttribute after declaring the alias with addAndCompile.

I believe that since only top-level definitions can be universe polymorphic, there are necessarily going to be restrictions on what can be used as input. For instance I don't think we could work with [MonadLift m m'] since that hypothesis would only be for a particular universe level.

Mac (Jul 14 2022 at 16:32):

@Sébastien Michelland It may also be worth playing with a fully generalized variant of HBind. That is, something like the following:

class HBind (m : Type u1 → Type v1) (n : outParam $ Type u2 → Type v2) (o : Type u2 → Type v3) where
  hBind {α : Type u1} {β : Type u2} : m α → (α → n β) → o β

Sébastien Michelland (Jul 14 2022 at 17:16):

Thanks. When does this sort of signature come into play?

Mac (Jul 14 2022 at 17:28):

@Sébastien Michelland A good example is the asynchronous binds (e.g., mapTask, bindTask) which, in Lean, have signatures like:

mapTask : (a -> BaseIO b) -> Task a -> BaseIO (Task a)
bindTask : Task a -> (a -> BaseIO (Task b)) -> BaseIO (Task a)

Thus, the mapTask instance would be:

abbrev AsyncBaseIO (α : Type) := BaseIO (Task α)
instance : HBind Task BaseIO AsyncBaseIO where
  hBind x f := BaseIO.mapTask f x

Mac (Jul 14 2022 at 17:34):

and the bindTask instance would be:

instance : HBind Task AsyncBaseIO AsyncBaseIO where
  hBind := BaseIO.bindTask

Thus, whether n should be an outParam and which one (or both) of the instances should be considered a bind is debatable.

Sébastien Michelland (Jul 14 2022 at 19:27):

Right. I don't feel like I can grasp the full extent of this change yet (especially on how this influences the "inference capital" of the whole scheme), so I've shelved it on the TODO list for now. I think I should consider this once I get the basic versions to work in a remotely useful manner.

Sébastien Michelland (Jul 14 2022 at 19:29):

On that note, I updated the monad detection code to automatically generalize universe parameters if the monad is a simple constant name, which means this example now works without the extra annotation and the universe u in... hack.

example: Id Unit := hdo
  let _ ← getD0
  let _ ← getD1
  let _ ← getDx.{2}
  getDx.{0}

Sébastien Michelland (Jul 18 2022 at 17:11):

Another update! I pushed a small improvement in the monad generalization step, and I figure I might as well explain the thought process. Any insight is welcome!

The basic problem is identifying the monad at play. The definition of HBind highlights what universe polymorphism tries to hide, in that the same monad at different universe levels is really different monads. Since the return type of the do block only gives us the monad used in the final bind, we either have to infer or provide the general shape.

For instance, here the monad determined from the output type is Id.{0}, but the second and third actions use Id.{1} and Id.{2} instead.

example: Id Unit := hdo
  let _ ← getD0
  let _ ← getD1
  let _ ← getDx.{2}
  getDx.{0}

The scheme works by generalizing the universe levels of Id.{0} into Id.{u}, storing it as a new constant, and then using that constant as a monad for each call to HBind.hBind. Determining the proper level is left to unification based on the action, as I don't see anything reasonable that can be done.

Note that since level parameters only exist on constants, this idea only works if the monad is a constant. In fact, in a function on eg. [MonadEnv m], there is only one monad m with one fixed universe, so it can't be universe polymorphic. So barring any typeclass shenanigans, it seems that the best possible result here is to stay in a single universe and just be compatible with do.

But there are still problems even with simpler types. For instance, this essentially freer monad has three levels u, v and w but only the last is relevant for HBind (the parameter E is fixed for any whole do block).

inductive ITree (E: Type u → Type v) (R: Type w): Type _ :=
  | Ret (r: R)
  | Vis {T: Type _} (e: E T) (k: T → ITree E R)

The most favorable case for the elaborator would be to specialize with an abbrev ITree'.{w} := ITree.{_,_,w} E, but this is inconvenient to write manually. In my latest commit, I just (non-recursively) generalize level parameters in constant parameters as well:

inductive PVoid: Type u → Type v :=

set_option trace.Elab.do true in
example: ITree PVoid Nat := hdo
  let _ ← getD0
  let _ ← getDx.{4}
  let _ ← getD1
  pure 0
-- [Elab.do] Found monad constant ITree with levels [?u.5198, ?u.5197, 0]
-- [Elab.do] Generalizing ITree into ITree.{u v w}
-- [Elab.do] Generalizing ITree.PVoid.{?_uniq.5198 ?_uniq.5197} into ITree.PVoid.{?_uniq.5227 ?_uniq.5226}
-- [Elab.do] Final monad: ITree.{u v w} ITree.PVoid.{u v}

Now when declaring the intermediate constant for the generalized monad, the unification step determines that the argument PVoid generalizes as PVoid.{u,v}.

However, this is quite fragile; if PVoid were only defined as Type → Type, this would fail since the generality of u and v would not be captured. In this simple example only w needs to be generalized, but I don't know how feasible it is to determine that. There are more complications if we have eg. ITree (E: Type u → Type v) (R: Type u).

Mac (Jul 18 2022 at 17:52):

@Sébastien Michelland I think you may be doing more work than necessary. For instance, you have this example:

-- The two monads in `hBind` are morally the same, but Lean won't guess that.
-- We need to indicate it by specifying the `m` parameter for the left action.
example: Id Unit :=
  hBind getD0 (m := Id) fun _ =>
  hBind getD1 (m := Id) fun _ =>
  hBind getDx.{5} (m := Id) fun _ =>
  pure default

The problem here is that the monad of getD[0,1,x] is polymorphic and hBind is polymorphic in its left action so there is nothing constraining the monad of the left action. I would also not say that it is morally the same as right action monad. If the right action monad is some super complicated monad stack (e.g., CommandElabM), we don't really want it to default to that (and have to lift the left action up the whole stack). In fact one of the main advantages of HBind is we don't have to do liftings in many cases in order to perform binds. Instead, we would like the monad to default to the simplest one possible. In this case, Id. For example, this works:

@[defaultInstance] instance : Monad Id := inferInstance

-- No type ascriptions needed.
example: Id Unit :=
  hBind getD0 fun _ =>
  hBind getD1 fun _ =>
  hBind getDx.{5} fun _ =>
  pure default

Mac (Jul 18 2022 at 18:00):

Your universe generalization approach, while neat, also fails on some basic use cases. For example:

example: ReaderT Nat Id Unit := hdo
  let n ← read
  let _ ← getD0
  let _ ← getD1
  return n

/-
application type mismatch
  ReaderT.{u, v} Nat
argument
  Nat
has type
  Type : Type 1
but is expected to have type
  Type u : Type (u + 1)
-/

I do not really see why you are generalizing them in the first place,. The signature, as you should be enough to determine the type of the final bind and that should be sufficient. The left actions of the higher behinds should be fixed before being bound based either on their type signature or some form of type class defaulting (it should not be using type information from the bind to determine that).

Mac (Jul 18 2022 at 18:01):

Regardless, I am very happy to see you working on this concept and am very impressed that you managed to port over the do code to make a new hdo notation. I look forward to seeing where you go from here. :big_smile:

Sébastien Michelland (Jul 18 2022 at 18:22):

@Mac Thanks for this input. The reason I'm generalizing is to guide instance search because this fails:

example: Id Unit :=
  hBind getD0 fun _ =>
  hBind getD1 fun _ =>
  hBind getDx.{5} fun _ =>
  pure default
-- typeclass instance problem is stuck, it is often due to metavariables
--  HBind.{?u.567, ?u.566, ?u.565, ?u.564} ?m.568 ?m.569

But thinking about it now, I think it should work, and neither (m := Id) nor the default instance (which I just learned is a thing!) should be required. This is because, unlike what the error message suggests, in the example the right monad is known so the HBind instance for Id is found. It fails for a different reason, as we can evidence in a simpler example:

set_option trace.Meta.synthInstance true in
#synth HBind.{1, _, 0, 0} _ Id.{0}
-- failed to synthesize
--   HBind.{1, ?u.524, 0, 0} ?m.525 Id.{0}

With the trace:

  [Meta.synthInstance] main goal HBind.{1, ?u.524, 0, 0} ?m.525 Id.{0}
  [Meta.synthInstance.newSubgoal] HBind.{1, ?u.524, 0, 0} ?m.525 Id.{0}
  [Meta.synthInstance.globalInstances] HBind.{1, ?u.524, 0, 0} ?m.525
    Id.{0}, [@instHBind.{?u.528, ?u.529}, instHBindIdId.{?u.530, ?u.531}]
  [Meta.synthInstance.generate] instance instHBindIdId.{?u.530, ?u.531}
  [Meta.synthInstance.tryResolve] ∨
    [Meta.synthInstance.tryResolve] HBind.{1, ?u.524, 0, 0} ?m.525
      Id.{0} =?= HBind.{?u.530, ?u.530, ?u.531, ?u.531} Id.{?u.530} Id.{?u.531}

Note how instHBindIdId is found, and as far as I can tell the last problem unification step should have succeeded with ?u.524 = ?u.530 = 1, ?u.531 = 0, and ?m.525 = Id.{1}. If you know how to debug this further that would be interesting, as indeed this might rid us of the fragile generalization idea.

Sébastien Michelland (Jul 18 2022 at 18:23):

Mac said:

Regardless, I am very happy to see you working on this concept and am very impressed that you managed to port over the do code to make a new hdo notation. I look forward to seeing where you go from here. :big_smile:

I'm glad I can help make some progress! Changing the do code was quite easy honestly, I just replaced a bunch of MProd with Prod (this might haunt me later), Bind.bind to HBind.hBind, and the main elaboration command. I was quite surprised.

Sébastien Michelland (Jul 21 2022 at 11:05):

Ok I figured out the problem, which is partly sitting on my chair. When synthesizing HBind.{1,1,0,0} ?m.524 Id.{0}, the failure comes from the unification of ?m.524 with the discovered instance HBind Id.{u} Id.{v}:

[Meta.isDefEq] ?m.524 [nonassignable] =?= Id.{?u.529} [nonassignable]
  [Meta.isDefEq.stuck] ?m.524 =?= Id.{?u.529}

I modified isDefEq to understand why ?m.524 is non-assignable:

[Meta.isDefEq] ?m.524 [nonassignable natural depth 0/1] =?= Id.{?u.529} [nonassignable]

And the reason is the metavariable depth was increased, which I believe is here specifically. It seems obvious in retrospect that it's supposed to be that way, and that this only works if m is an outParam in HBind.

class HBind (m : outParam (Type u₁ → Type v₁)) (n : Type u₂ → Type v₂) where
  hBind {α : Type u₁} {β : Type u₂} : m α → (α → n β) → n β

-- Easy
#synth HBind.{1, 1, 0, 0} _ Id.{0}

-- Also easy
example: Id Unit :=
  hBind getD0 fun _ =>
  hBind getD1 fun _ =>
  hBind getDx.{2} fun _ =>
  pure default

Now the question is how reliable is this approach? Do we have monads that would HBind with more than one instance? What does the typeclass engine do with outParam when there are several options?

Mac (Jul 21 2022 at 18:27):

Sébastien Michelland said:

Do we have monads that would HBind with more than one instance?

Definitely! The main point of HBInd is to enable binds between a wide variety of different monads.

Mac (Jul 21 2022 at 18:32):

However, I think the code is somewhat cheating here, so that may not matter. TheoutParam is actually non-injective (the are already multiple instances with different ms for a given n) based on the instances you have defined. Thus, Lean is able to pick the right instance when m has already been fixed and fix it when it has not,. Thus, the outParam may be fine.

Mac (Jul 21 2022 at 18:34):

However, this approach is still essentially causing the left action to default to the monad of the right action, which is not ideal.

Mac (Jul 22 2022 at 20:05):

Leonardo de Moura said:

How realistic would it be to make it universe polymorphic in the current codebase? Specifically is it an achievable target for a new contributor?

It is a realistic goal, but it can get messy. We consider this a low-priority modification because it is currently quite inconvenient to manually add the ULift.

Would a PR with the basic universe polymorphic IO (requiring ULift) be acceptable? I am currently battling a problem in Lake that very much wants IO to be in universe 3 (Type 2) and is been heavily constrained by the fact it only accepts Type.

Leonardo de Moura (Jul 22 2022 at 20:54):

@Mac I am surprised that you need Type 2 in Lake. It would be great if you could find a simpler design.
We already have so many things on our plates.

Mac (Jul 22 2022 at 20:55):

k, I will stick with my current workarounds :thumbs_up:

Kevin Buzzard (Jul 22 2022 at 22:36):

Dare I ask why you need the universe bump? In mathematics if you need a higher universe there's usually (but not always) a set theory whizz who can talk you back down.

Mario Carneiro (Jul 22 2022 at 22:58):

One "reasonable" option is a usquash operator that maps Type u to Type. Obviously this is unsound if we posit all the isomorphism stuff, but it is possible to soundly introduce the necessary operators as uninterpreted partial functions, and for programming applications universes are really irrelevant (evidence: haskell)

Mac (Jul 22 2022 at 23:06):

The problem is that FacetConfig has the type signature:

FacetConfig : (Name → Type u) → (Type u → Type v) → Name → Type (max (u + 2) v)

And that max (u + 2) bubbles up much of the type hierarchy of Lake.

Eric Wieser (Jul 23 2022 at 00:32):

I've no context at all, but it seems quite strange to be storing an equality of types in FacetConfig

Mac (Jul 23 2022 at 01:05):

It is strange, but it was the best mechanism I could think of to implement open type families in Lean (see Family.lean for details).

Mario Carneiro (Jul 23 2022 at 03:29):

@Mac Could you elaborate on this general design? I would like to #xy this but it seems there are a lot of threads to pull. What is FacetConfig trying to accomplish: is it a user extension point? How does the user use this type, and how does the system make use of the information?

Mario Carneiro (Jul 23 2022 at 03:30):

A more specific question: do you actually need FacetConfig to be universe polymorphic, and if so, why? Is it just "trivial generalization"?

Mario Carneiro (Jul 23 2022 at 05:44):

@Mac I added a refactor proposal at leanprover/lake#107 which rejiggers things so that FacetConfig can live in Type.

Mario Carneiro (Jul 23 2022 at 05:51):

One observation that might have gotten lost: DTT already has something that acts a lot like open type families: Function types. These are morally "coinductive" which means that you can create an open inductive by specifying all the ways you would like to destruct that value, and then downstream users can construct as many extra types as they want by choosing different tuples of functions. (These are often supplied via typeclasses.) If you want to control how the functions are derived, you can provide a "smart constructor" / corecursor that takes arbitrary data in any universe and applies your functions to construct data in the lowest universe.

Siddharth Bhat (Jul 23 2022 at 12:52):

@Mario Carneiro i don't follow this idea of using functions to mimic open type families. I'd be much obliged if there's a #mwe demonstrating this.

Mario Carneiro (Jul 24 2022 at 00:19):

The expression problem is the observation that when we have $m$ producers and $n$ consumers of an interface, there are $mn$ interactions to be specified, and we can't have both the producer side and the consumer side be fully decoupled and extensible because then it is possible to introduce a producer in one import and a consumer in another import and there is no place to put the interaction between them.

So we must choose either the producer side or the consumer side to be "non-extensible", which in this case means that extensions, although possible, require changing the source code and updating everyone on the other side to accommodate.

The case where the producers are non-extensible and the consumers are extensible is well known to lean users: we create an inductive type containing a case for each producer, and then consumers pattern match on the inductive type; there is no need for coordination between the consumers so you can have as many as you like.

The case where the consumers are non-extensible and the producers are extensible is what type classes (or generally structures of functions) are for. We create a type class containing a field for each consumer (each thing we want to do with the data), and then the producers are instances of the type class. The producers don't need to coordinate with each other, and you can have as many as you like.

Mario Carneiro (Jul 24 2022 at 00:25):

Here is an illustration of the two approaches in code:

-- non-extensible Producer, extensible consumer
inductive Producer
| A : Producer
| B : Producer

def consumerC : Producer → String
| .A => "interaction of A and C"
| .B => "interaction of B and C"

def consumerD : Producer → String
| .A => "interaction of A and D"
| .B => "interaction of B and D"

-- non-extensible consumer, extensible Producer
structure Producer where
  C : String
  D : String

def producerA : Producer where
  C := "interaction of A and C"
  D := "interaction of A and D"

def producerB : Producer where
  C := "interaction of B and C"
  D := "interaction of B and D"

Mac (Jul 24 2022 at 02:37):

Mario Carneiro said:

we can't have both the producer side and the consumer side be fully decoupled and extensible because then it is possible to introduce a producer in one import and a consumer in another import and there is no place to put the interaction between them.

As far as I know, there is no reason (in the abstract) why you can't decouple all three, have the producers in one place, the consumers in another, and their interactions in a third.

Mac (Jul 24 2022 at 02:43):

Which is an approach taken by e..g., open types and open definitions.

Mac (Jul 24 2022 at 02:45):

That is, open types are extensible producers, open definitions are extensible consumers, and their interactions can be defined independently.

Mac (Jul 24 2022 at 02:46):

The Expression Problem, while a core concern in programming language design, does have many hypothesized solutions, many approaches of which have been implemented (sometimes only partially, though) in new languages.

Mac (Jul 24 2022 at 03:01):

@Mario Carneiro I feel my responses may have come off as overly critical. I so. I am sorry. The strong assertion that "we can't have both the producer side and the consumer side be fully decoupled and extensible" confused me. Figuring out the best way to do so is generally a main part of expression problem solutions whereas the statement made it sound to me like it was impossible. Or were you just talking about in Lean for the topic in question?

Mario Carneiro (Jul 24 2022 at 03:02):

No, I mean that it is a fairly fundamental issue in decoupling things

Mario Carneiro (Jul 24 2022 at 03:03):

Because of this $m\times n$ thing, we need to know all consumers when we add a producer, and we need to know all producers when we add a consumer, and it's not obvious how to have them both without linearizing the addition of producers and consumers

Mario Carneiro (Jul 24 2022 at 03:06):

If you can separate the producers and the consumers via an "API", then you have essentially split the problem into two since each side only has to interact with the API, and then both sides can grow independently while the API becomes "non-extensible". This is totally possible. The expression problem is specifically for the case where there is no mediator, each producer wants to talk individually to each consumer and all pairwise interactions are potentially completely different

Mario Carneiro (Jul 24 2022 at 03:06):

(Note that in both examples, there are 4 string literals corresponding to every pairwise interaction of producer and consumer)

Kyle Miller (Jul 24 2022 at 03:08):

I thought I'd heard the tagless-final style solved the expression problem https://okmij.org/ftp/tagless-final/ (and a blogpost with an introduction)

This is a solution in the sense that it's possible to encode extensions to the API between producers and consumers as types, and given a valid producer/consumer pair, you can replace the producer with a "simpler" producer or the consumer with a "more complex" consumer, with respect to the API.

Mario Carneiro (Jul 24 2022 at 03:08):

I am assuming some aspects which could be considered specific to the lean setup, in particular that there is no "global view" of the entire set of things because everything could have a downstream dependent which performs some kind of extension

Mario Carneiro (Jul 24 2022 at 03:09):

so you have to decide in advance what things can be extended and what can't by downstream dependents

Mario Carneiro (Jul 24 2022 at 03:21):

A simplification of the tagless final pattern for lean is the observation that we can use typeclasses to just fill in elements of the matrix on an ad hoc basis:

class Interaction (producer consumer : Type) where
  interaction : String

structure ProducerA
structure ProducerB
structure ConsumerC
structure ConsumerD

instance : Interaction ProducerA ConsumerC where
  interaction := "interaction of A and C"

instance : Interaction ProducerA ConsumerD where
  interaction := "interaction of A and D"

instance : Interaction ProducerB ConsumerC where
  interaction := "interaction of B and C"

instance : Interaction ProducerB ConsumerC where
  interaction := "interaction of B and D"

The downside of this pattern is that you don't know until you call the function whether the interaction between the two chosen types has in fact been defined

Mario Carneiro (Jul 24 2022 at 03:22):

You can also take either the producer side or the consumer side to be the typeclass with the other one being the argument

Mac (Jul 24 2022 at 06:53):

Mario Carneiro said:

I am assuming some aspects which could be considered specific to the lean setup, in particular that there is no "global view" of the entire set of things because everything could have a downstream dependent which performs some kind of extension

For many things in Lean which are extensible (e.g., syntax, environment extensions, etc.) it is entirely possible to freeze the setup at any given point and perform global optimizations.

Mac (Jul 24 2022 at 06:54):

The only real exceptions to this is the type system, but it also doesn't really have true open types in the first place that need freezing. Type classes could be an exception, but they are technically also freezable.

Mac (Jul 24 2022 at 06:55):

Also, specialization could be seen as a form of freezing as well.

François G. Dorais (Jul 24 2022 at 10:32):

Mac said:

As far as I know, there is no reason (in the abstract) why you can't decouple all three, have the producers in one place, the consumers in another, and their interactions in a third.

I don't disagree but I would like to point out that this type of polygamy might be illegal in some jurisdictions. :rolling_on_the_floor_laughing: https://media2.giphy.com/media/MuI2Oxjb9gCy9s4Nxq/giphy.gif?cid=c623cb35w3hvqnavuoz1eqhktax59hsh81dlevxza57i33ou&rid=giphy.gif&ct=g

Eric Wieser (Dec 01 2023 at 20:57):

Mac Malone said:

Would a PR with the basic universe polymorphic IO (requiring ULift) be acceptable?

I found myself wanting this basic version again; is a PR like lean4#3010 (which is enough for me to do the rest downstream if needed) acceptable?

Last updated: May 02 2025 at 03:31 UTC