Zulip Chat Archive
Stream: lean4
Topic: universe polymorphic IO
Sébastien Michelland (May 16 2022 at 13:28):
I have a set of functions living in IO
which I use to produce a data structure (that recently became universe polymorphic) along with logs and possible errors. I really like IO
for this task because logs and errors are displayed in a very natural way by LSP plugins. Unfortunately, IO
only lives in Type
, so it can't accommodate for the latest version of the data structure.
I've seen a couple of threads suggesting that there is no reason IO
couldn't be universe polymorphic, albeit with the limitation of using ULift
(which is fine for me). How realistic would it be to make it universe polymorphic in the current codebase? Specifically is it an achievable target for a new contributor?
If there are other ways to achieve the same integration as IO
with a lower-level structure, I'd be interested to know. I tried returning some Result
s but that doesn't seem to be what top-level error displays are based on.
Cc @Siddharth Bhat
Leonardo de Moura (May 16 2022 at 21:05):
How realistic would it be to make it universe polymorphic in the current codebase? Specifically is it an achievable target for a new contributor?
It is a realistic goal, but it can get messy. We consider this a low-priority modification because it is currently quite inconvenient to manually add the ULift
. Moreover, error messages corresponding to missing ULift
s are quite hard to understand. See comment at https://github.com/leanprover/lean4/issues/1136
Eric Wieser (May 16 2022 at 21:45):
There was some discussion in the context of lean3 about changing the monadic typeclasses to use out_param
s rather than quantifying over types in their fields; such that no ulift would be needed. I'll try and find the thread...
Eric Wieser (May 16 2022 at 21:49):
Zulip doesn't let me find a link to the message on mobile, but the thread was
Sébastien Michelland (May 17 2022 at 09:29):
Leonardo de Moura said:
It is a realistic goal, but it can get messy. We consider this a low-priority modification because it is currently quite inconvenient to manually add the
ULift
. Moreover, error messages corresponding to missingULift
s are quite hard to understand. See comment at https://github.com/leanprover/lean4/issues/1136
Thanks. I've certainly seen the confusing messages while updating other parts of my project. Fortunately I had read about this effect of bind before, so I didn't have too much trouble. At least it seems worth trying.
Eric Wieser said:
There was some discussion in the context of lean3 about changing the monadic typeclasses to use
out_param
s rather than quantifying over types in their fields; such that no ulift would be needed. I'll try and find the thread...
I don't fully get the approach here. As far as I understand, outParam
(I assume it's still the same in Lean4) only influences the resolution process. But the trouble with ULift
comes from the use of do
/bind
, which are going to remain the same regardless of how the instance is found. Could you please elaborate a little bit?
Eric Wieser (May 17 2022 at 09:34):
The reason bind
causes universe troubles it that docs4#Bind.bind is defined as
class Bind (m : Type u → Type v) where
bind : {α β : Type u} → m α → (α → m β) → m β
which restricts α
and β
to the same universe, The suggestion in that thread is to change it to something like
-- this allows `α` to be extracted from `list α`
class is_monadic (α : out_param Type*) (mα : Type*) : Prop
instance {α : Type*} : is_monadic α (list α) := is_monadic.mk
class Bind' {α β : out_param Type*} (mα mβ : Type*) [is_monadic α mα] [is_monadic β mβ] :=
map : mα → (α → mβ) → mβ
which allows α
and β
to live in different universes. It also allows restrictions to be put on α
and β
, such as finiteness or decidable equality.
Sébastien Michelland (May 17 2022 at 10:18):
Right, I see now. This sounds like it would impact a lot of code though, compared to using ULift
:thinking:
Eric Wieser (May 17 2022 at 10:33):
Yes, this would be a big change; but also, mathlib would be able to make a lot more use of these classes if they were phrased in this way. Right now we just end up mostly not using them at all because we'd lose universe polymorphism.
Eric Wieser (May 17 2022 at 10:34):
One example is defining a bind operation on finsets, which isn't possible without decideable equality. With Bind
, there's nowhere to put that assumption, so it's not possible. With Bind'
, you can put it before the colon in the Bind'
instance as something like
instance [DecidableEq β] : Bind' (Finset α) (Finset β) where
bind s f := s.bUnion f
Eric Wieser (May 17 2022 at 12:11):
(cc @Yaël Dillies, who may not follow this stream)
Mac (May 17 2022 at 16:30):
I made a similar suggestion for an HBind
a while back which would could also solve the problem. See this message.
Reid Barton (May 17 2022 at 17:12):
How would you express operations that are polymorphic in a monad in this setup?
Sébastien Michelland (May 17 2022 at 19:54):
An HBind
typeclass looks pretty appealing (from my limited perspective) since it would be non-invasive. I also looked at the code that elaborates do
, and the uses of bind
are fairly shallow, so it seems like testing a variation of do
that uses it would be possible.
I'm not sure what would be the best way to approach the problem. @Leonardo de Moura do you have any thoughts on ULift
vs. heterogeneous binds? Is it maybe worth an RFC issue?
Leonardo de Moura (May 17 2022 at 21:37):
@Sébastien Michelland This kind of modification is tricky and generates a bunch of unexpected elaboration problems. It is hard to tell without trying. Note that HBind
does not propagate type information as effectively as the regular Bind
. We have a bunch of polymorphic monadic operations and their elaboration will be "stuck" until we apply the HBind
default instance. HBind
will also affect a few meta-programs such as the "auto-lifting" procedure. Recall that it took us several iterations to get the heterogeneous arithmetic operators working in a reasonable way, but we still had to improve the elaborator a few weeks ago.
I do not have time right now to perform these experiments and devise workarounds for the new issues HBind
may create. I will be very happy if you have time to try it. We cannot guarantee we will be able to merge the solution, but I am sure we will learn a lot from this experiment.
Leonardo de Moura (May 17 2022 at 21:39):
I have similar reservations regarding the is_monadic
solution. Both is_monadic
and HBind
are at a very preliminary stage, and both may generate a bunch of unexpected elaboration issues.
Eric Wieser (May 17 2022 at 23:28):
Reid Barton said:
How would you express operations that are polymorphic in a monad in this setup?
Was that aimed at me or Mac? I'm not sure I understand the question.
Leonardo de Moura (May 18 2022 at 00:00):
@Eric Wieser I would also be very happy if you could explore the is_monadic
approach you suggested. I could only find preliminary experiments in the commit you posted above. You had compelling examples above. Again, I think even if the experiments are not successful, I think we would learn a lot from them. For is_monadic
, I am also curious about the interaction with all "gadgets" we use (e.g., transformers, MonadLift
, MonadControl
, etc). It would also be great to explore whether the two approaches could coexist. I suspect it should be feasible, but I didn't have time to play with your approach yet.
Eric Wieser (May 18 2022 at 00:17):
I'm unlikely to find time for that any time soon unfortunately (I haven't written any lean4 yet); but was hoping a more interested party might pick up the idea in the meantime!
We of course also need to be wary of any such change disturbing mathport in a big way, but I'm not remotely qualified to assess that.
Sébastien Michelland (May 18 2022 at 14:58):
Let me try to get some basics going. I pushed some tests in my project with @Siddharth Bhat.
Using HBind
in itself works exactly as one would expect; the first harder challenge is to have do
use it.
The elaborator for do
uses MProd
instead of Prod
in a couple of places because it assumes no universe polymorphism, so I changed it back to Prod
. I also mechanically replaced bind
with hBind
. This allows non-polymorphic examples to elaborate properly, eg.
def test_IO: IO Unit := hdo
IO.println "Lean4"
IO.println "hBind"
return ()
-- HBind.hBind (IO.println "Lean4") fun x =>
-- HBind.hBind (IO.println "hBind") fun x =>
-- pure ()
The next issue is that do
finds the monad in the return type (here IO Unit
) and refers to it through a meta-variable. Both of these facts restrict the construction to a single universe.
We hacked around this by allowing the polymorphic form of the monad to be specified as an argument of the hdo
notation, then building a new definition instead of a metavariable. This allows the following expression to elaborate (where each get_*
returns a value in a different universe):
universe u in
example: Id Unit := hdo (monad := Id.{u})
let _ ← get_0
let _ ← get_1
let _ ← get_any.{2}
get_any.{0}
I'm pretty happy about this, even if it's mostly hacks. :smile: Feedback on how to do these things properly would be much welcome. In particular I'd like to know why a metavariable is created to refer to the monad instance.
Mac (May 18 2022 at 16:36):
Sébastien Michelland said:
I'm pretty happy about this, even if it's mostly hacks. :smile: Feedback on how to do these things properly would be much welcome.
Looks reasonable to me. The quickest way to test it would be to apply these changes to the Lean core and see if it builds and passes all the tests.
Eric Wieser (May 18 2022 at 16:38):
Note that the Hbind
solution doesn't solve the finset
DecidableEq
problem
Leonardo de Moura (May 18 2022 at 18:55):
@Sébastien Michelland This is great start.
The elaborator for do uses MProd instead of Prod in a couple of places because it assumes no universe polymorphism, so I changed it back to Prod.
The MProd
was originally introduced to ensure Lean did not get stuck into universe-level constraints while elaborating the for in do
notation. Note that, we used Prod
before we switched to MProd
. Since then, we have improved the universe-level constraint solving part, but you should expect problems when switching back to MProd
. Universe-level constraints still create issues. See comment on this recent issue. https://github.com/leanprover/lean4/issues/1136
In particular I'd like to know why a metavariable is created to refer to the monad instance.
This is a trick to embed an expression inside of Syntax
. Note that, m
in the code you pointed to is an Expr
, then we create a metavariable ?m := m
and refer to it in Syntax
objects using the named hole notation ?m
.
I'm pretty happy about this, even if it's mostly hacks.
It would be great if you could try harder examples before trying to apply the changes to Lean core. I think it is very unlikely you will be able to compile Lean core using these changes. Trying examples that use for in
, try catch
, ... will provide you insight into the challenges ahead. I think it is also worth experimenting with polymorphic monadic operators (i.e., functions that should work with arbitrary monads, and have parameters such as [MonadRef m]
, [MonadEnv m]
, etc), and functions that require auto lifting.
Reid Barton (May 20 2022 at 11:23):
Leonardo de Moura said:
I think it is also worth experimenting with polymorphic monadic operators (i.e., functions that should work with arbitrary monads, and have parameters such as
[MonadRef m]
,[MonadEnv m]
, etc), [...]
@Eric Wieser This last part is what I meant earlier.
Sébastien Michelland (Jul 14 2022 at 13:15):
I'm picking this up again, and started by pushing the code to a proper repository updated to a recent nightly: https://github.com/lephe/lean4-hbind
I will try and build more complex examples to see how it inevitably fails. I'm not familiar with all the features of do
notation yet.
Sébastien Michelland (Jul 14 2022 at 15:31):
Some more progress: I updated the elaborator (which has access to the monad name as a identifier with a universe polymorphic definition) to specify it on both sides of every hBind
, which makes it possible to use actions defined for any monad:
-- Dn: Type n; Dx is universe polymorphic
def getD0 [Monad m]: m D0 := pure default
def getD1 [Monad m]: m D1 := pure default
def getDx [Monad m]: m Dx := pure default
universe u in
example: Id Unit := hdo (monad := Id.{u})
let _ ← getD0
let _ ← getD1
let _ ← getDx.{2}
getDx.{0}
Nothing fancy obviously, but I did spend some time trying to make the monad alias @[reducible]
. I couldn't find where the DefView
s are used, and ended up using Term.applyAttribute
after declaring the alias with addAndCompile
.
I believe that since only top-level definitions can be universe polymorphic, there are necessarily going to be restrictions on what can be used as input. For instance I don't think we could work with [MonadLift m m']
since that hypothesis would only be for a particular universe level.
Mac (Jul 14 2022 at 16:32):
@Sébastien Michelland It may also be worth playing with a fully generalized variant of HBind
. That is, something like the following:
class HBind (m : Type u1 → Type v1) (n : outParam $ Type u2 → Type v2) (o : Type u2 → Type v3) where
hBind {α : Type u1} {β : Type u2} : m α → (α → n β) → o β
Sébastien Michelland (Jul 14 2022 at 17:16):
Thanks. When does this sort of signature come into play?
Mac (Jul 14 2022 at 17:28):
@Sébastien Michelland A good example is the asynchronous binds (e.g., mapTask
, bindTask
) which, in Lean, have signatures like:
mapTask : (a -> BaseIO b) -> Task a -> BaseIO (Task a)
bindTask : Task a -> (a -> BaseIO (Task b)) -> BaseIO (Task a)
Thus, the mapTask
instance would be:
abbrev AsyncBaseIO (α : Type) := BaseIO (Task α)
instance : HBind Task BaseIO AsyncBaseIO where
hBind x f := BaseIO.mapTask f x
Mac (Jul 14 2022 at 17:34):
and the bindTask
instance would be:
instance : HBind Task AsyncBaseIO AsyncBaseIO where
hBind := BaseIO.bindTask
Thus, whether n
should be an outParam
and which one (or both) of the instances should be considered a bind is debatable.
Sébastien Michelland (Jul 14 2022 at 19:27):
Right. I don't feel like I can grasp the full extent of this change yet (especially on how this influences the "inference capital" of the whole scheme), so I've shelved it on the TODO list for now. I think I should consider this once I get the basic versions to work in a remotely useful manner.
Sébastien Michelland (Jul 14 2022 at 19:29):
On that note, I updated the monad detection code to automatically generalize universe parameters if the monad is a simple constant name, which means this example now works without the extra annotation and the universe u in...
hack.
example: Id Unit := hdo
let _ ← getD0
let _ ← getD1
let _ ← getDx.{2}
getDx.{0}
Sébastien Michelland (Jul 18 2022 at 17:11):
Another update! I pushed a small improvement in the monad generalization step, and I figure I might as well explain the thought process. Any insight is welcome!
The basic problem is identifying the monad at play. The definition of HBind
highlights what universe polymorphism tries to hide, in that the same monad at different universe levels is really different monads. Since the return type of the do
block only gives us the monad used in the final bind, we either have to infer or provide the general shape.
For instance, here the monad determined from the output type is Id.{0}
, but the second and third actions use Id.{1}
and Id.{2}
instead.
example: Id Unit := hdo
let _ ← getD0
let _ ← getD1
let _ ← getDx.{2}
getDx.{0}
The scheme works by generalizing the universe levels of Id.{0}
into Id.{u}
, storing it as a new constant, and then using that constant as a monad for each call to HBind.hBind
. Determining the proper level is left to unification based on the action, as I don't see anything reasonable that can be done.
Note that since level parameters only exist on constants, this idea only works if the monad is a constant. In fact, in a function on eg. [MonadEnv m]
, there is only one monad m
with one fixed universe, so it can't be universe polymorphic. So barring any typeclass shenanigans, it seems that the best possible result here is to stay in a single universe and just be compatible with do
.
But there are still problems even with simpler types. For instance, this essentially freer monad has three levels u
, v
and w
but only the last is relevant for HBind
(the parameter E
is fixed for any whole do
block).
inductive ITree (E: Type u → Type v) (R: Type w): Type _ :=
| Ret (r: R)
| Vis {T: Type _} (e: E T) (k: T → ITree E R)
The most favorable case for the elaborator would be to specialize with an abbrev ITree'.{w} := ITree.{_,_,w} E
, but this is inconvenient to write manually. In my latest commit, I just (non-recursively) generalize level parameters in constant parameters as well:
inductive PVoid: Type u → Type v :=
set_option trace.Elab.do true in
example: ITree PVoid Nat := hdo
let _ ← getD0
let _ ← getDx.{4}
let _ ← getD1
pure 0
-- [Elab.do] Found monad constant ITree with levels [?u.5198, ?u.5197, 0]
-- [Elab.do] Generalizing ITree into ITree.{u v w}
-- [Elab.do] Generalizing ITree.PVoid.{?_uniq.5198 ?_uniq.5197} into ITree.PVoid.{?_uniq.5227 ?_uniq.5226}
-- [Elab.do] Final monad: ITree.{u v w} ITree.PVoid.{u v}
Now when declaring the intermediate constant for the generalized monad, the unification step determines that the argument PVoid
generalizes as PVoid.{u,v}
.
However, this is quite fragile; if PVoid
were only defined as Type → Type
, this would fail since the generality of u
and v
would not be captured. In this simple example only w
needs to be generalized, but I don't know how feasible it is to determine that. There are more complications if we have eg. ITree (E: Type u → Type v) (R: Type u)
.
Mac (Jul 18 2022 at 17:52):
@Sébastien Michelland I think you may be doing more work than necessary. For instance, you have this example:
-- The two monads in `hBind` are morally the same, but Lean won't guess that.
-- We need to indicate it by specifying the `m` parameter for the left action.
example: Id Unit :=
hBind getD0 (m := Id) fun _ =>
hBind getD1 (m := Id) fun _ =>
hBind getDx.{5} (m := Id) fun _ =>
pure default
The problem here is that the monad of getD[0,1,x]
is polymorphic and hBind
is polymorphic in its left action so there is nothing constraining the monad of the left action. I would also not say that it is morally the same as right action monad. If the right action monad is some super complicated monad stack (e.g., CommandElabM
), we don't really want it to default to that (and have to lift the left action up the whole stack). In fact one of the main advantages of HBind
is we don't have to do liftings in many cases in order to perform binds. Instead, we would like the monad to default to the simplest one possible. In this case, Id
. For example, this works:
@[defaultInstance] instance : Monad Id := inferInstance
-- No type ascriptions needed.
example: Id Unit :=
hBind getD0 fun _ =>
hBind getD1 fun _ =>
hBind getDx.{5} fun _ =>
pure default
Mac (Jul 18 2022 at 18:00):
Your universe generalization approach, while neat, also fails on some basic use cases. For example:
example: ReaderT Nat Id Unit := hdo
let n ← read
let _ ← getD0
let _ ← getD1
return n
/-
application type mismatch
ReaderT.{u, v} Nat
argument
Nat
has type
Type : Type 1
but is expected to have type
Type u : Type (u + 1)
-/
I do not really see why you are generalizing them in the first place,. The signature, as you should be enough to determine the type of the final bind and that should be sufficient. The left actions of the higher behinds should be fixed before being bound based either on their type signature or some form of type class defaulting (it should not be using type information from the bind to determine that).
Mac (Jul 18 2022 at 18:01):
Regardless, I am very happy to see you working on this concept and am very impressed that you managed to port over the do code to make a new hdo
notation. I look forward to seeing where you go from here. :big_smile:
Sébastien Michelland (Jul 18 2022 at 18:22):
@Mac Thanks for this input. The reason I'm generalizing is to guide instance search because this fails:
example: Id Unit :=
hBind getD0 fun _ =>
hBind getD1 fun _ =>
hBind getDx.{5} fun _ =>
pure default
-- typeclass instance problem is stuck, it is often due to metavariables
-- HBind.{?u.567, ?u.566, ?u.565, ?u.564} ?m.568 ?m.569
But thinking about it now, I think it should work, and neither (m := Id)
nor the default instance (which I just learned is a thing!) should be required. This is because, unlike what the error message suggests, in the example the right monad is known so the HBind
instance for Id
is found. It fails for a different reason, as we can evidence in a simpler example:
set_option trace.Meta.synthInstance true in
#synth HBind.{1, _, 0, 0} _ Id.{0}
-- failed to synthesize
-- HBind.{1, ?u.524, 0, 0} ?m.525 Id.{0}
With the trace:
[Meta.synthInstance] main goal HBind.{1, ?u.524, 0, 0} ?m.525 Id.{0}
[Meta.synthInstance.newSubgoal] HBind.{1, ?u.524, 0, 0} ?m.525 Id.{0}
[Meta.synthInstance.globalInstances] HBind.{1, ?u.524, 0, 0} ?m.525
Id.{0}, [@instHBind.{?u.528, ?u.529}, instHBindIdId.{?u.530, ?u.531}]
[Meta.synthInstance.generate] instance instHBindIdId.{?u.530, ?u.531}
[Meta.synthInstance.tryResolve] ∨
[Meta.synthInstance.tryResolve] HBind.{1, ?u.524, 0, 0} ?m.525
Id.{0} =?= HBind.{?u.530, ?u.530, ?u.531, ?u.531} Id.{?u.530} Id.{?u.531}
Note how instHBindIdId
is found, and as far as I can tell the last problem unification step should have succeeded with ?u.524 = ?u.530 = 1
, ?u.531 = 0
, and ?m.525 = Id.{1}
. If you know how to debug this further that would be interesting, as indeed this might rid us of the fragile generalization idea.
Sébastien Michelland (Jul 18 2022 at 18:23):
Mac said:
Regardless, I am very happy to see you working on this concept and am very impressed that you managed to port over the do code to make a new
hdo
notation. I look forward to seeing where you go from here. :big_smile:
I'm glad I can help make some progress! Changing the do
code was quite easy honestly, I just replaced a bunch of MProd
with Prod
(this might haunt me later), Bind.bind
to HBind.hBind
, and the main elaboration command. I was quite surprised.
Sébastien Michelland (Jul 21 2022 at 11:05):
Ok I figured out the problem, which is partly sitting on my chair. When synthesizing HBind.{1,1,0,0} ?m.524 Id.{0}
, the failure comes from the unification of ?m.524
with the discovered instance HBind Id.{u} Id.{v}
:
[Meta.isDefEq] ?m.524 [nonassignable] =?= Id.{?u.529} [nonassignable]
[Meta.isDefEq.stuck] ?m.524 =?= Id.{?u.529}
I modified isDefEq
to understand why ?m.524
is non-assignable:
[Meta.isDefEq] ?m.524 [nonassignable natural depth 0/1] =?= Id.{?u.529} [nonassignable]
And the reason is the metavariable depth was increased, which I believe is here specifically. It seems obvious in retrospect that it's supposed to be that way, and that this only works if m
is an outParam
in HBind
.
class HBind (m : outParam (Type u₁ → Type v₁)) (n : Type u₂ → Type v₂) where
hBind {α : Type u₁} {β : Type u₂} : m α → (α → n β) → n β
-- Easy
#synth HBind.{1, 1, 0, 0} _ Id.{0}
-- Also easy
example: Id Unit :=
hBind getD0 fun _ =>
hBind getD1 fun _ =>
hBind getDx.{2} fun _ =>
pure default
Now the question is how reliable is this approach? Do we have monads that would HBind
with more than one instance? What does the typeclass engine do with outParam
when there are several options?
Mac (Jul 21 2022 at 18:27):
Sébastien Michelland said:
Do we have monads that would
HBind
with more than one instance?
Definitely! The main point of HBInd
is to enable binds between a wide variety of different monads.
Mac (Jul 21 2022 at 18:32):
However, I think the code is somewhat cheating here, so that may not matter. TheoutParam
is actually non-injective (the are already multiple instances with different m
s for a given n
) based on the instances you have defined. Thus, Lean is able to pick the right instance when m
has already been fixed and fix it when it has not,. Thus, the outParam
may be fine.
Mac (Jul 21 2022 at 18:34):
However, this approach is still essentially causing the left action to default to the monad of the right action, which is not ideal.
Mac (Jul 22 2022 at 20:05):
Leonardo de Moura said:
How realistic would it be to make it universe polymorphic in the current codebase? Specifically is it an achievable target for a new contributor?
It is a realistic goal, but it can get messy. We consider this a low-priority modification because it is currently quite inconvenient to manually add the
ULift
.
Would a PR with the basic universe polymorphic IO
(requiring ULift
) be acceptable? I am currently battling a problem in Lake that very much wants IO
to be in universe 3 (Type 2
) and is been heavily constrained by the fact it only accepts Type
.
Leonardo de Moura (Jul 22 2022 at 20:54):
@Mac I am surprised that you need Type 2
in Lake. It would be great if you could find a simpler design.
We already have so many things on our plates.
Mac (Jul 22 2022 at 20:55):
k, I will stick with my current workarounds :thumbs_up:
Kevin Buzzard (Jul 22 2022 at 22:36):
Dare I ask why you need the universe bump? In mathematics if you need a higher universe there's usually (but not always) a set theory whizz who can talk you back down.
Mario Carneiro (Jul 22 2022 at 22:58):
One "reasonable" option is a usquash
operator that maps Type u
to Type
. Obviously this is unsound if we posit all the isomorphism stuff, but it is possible to soundly introduce the necessary operators as uninterpreted partial
functions, and for programming applications universes are really irrelevant (evidence: haskell)
Mac (Jul 22 2022 at 23:06):
The problem is that FacetConfig
has the type signature:
FacetConfig : (Name → Type u) → (Type u → Type v) → Name → Type (max (u + 2) v)
And that max (u + 2)
bubbles up much of the type hierarchy of Lake.
Eric Wieser (Jul 23 2022 at 00:32):
I've no context at all, but it seems quite strange to be storing an equality of types in FacetConfig
Mac (Jul 23 2022 at 01:05):
It is strange, but it was the best mechanism I could think of to implement open type families in Lean (see Family.lean for details).
Mario Carneiro (Jul 23 2022 at 03:29):
@Mac Could you elaborate on this general design? I would like to #xy this but it seems there are a lot of threads to pull. What is FacetConfig
trying to accomplish: is it a user extension point? How does the user use this type, and how does the system make use of the information?
Mario Carneiro (Jul 23 2022 at 03:30):
A more specific question: do you actually need FacetConfig
to be universe polymorphic, and if so, why? Is it just "trivial generalization"?
Mario Carneiro (Jul 23 2022 at 05:44):
@Mac I added a refactor proposal at leanprover/lake#107 which rejiggers things so that FacetConfig
can live in Type
.
Mario Carneiro (Jul 23 2022 at 05:51):
One observation that might have gotten lost: DTT already has something that acts a lot like open type families: Function types. These are morally "coinductive" which means that you can create an open inductive by specifying all the ways you would like to destruct that value, and then downstream users can construct as many extra types as they want by choosing different tuples of functions. (These are often supplied via typeclasses.) If you want to control how the functions are derived, you can provide a "smart constructor" / corecursor that takes arbitrary data in any universe and applies your functions to construct data in the lowest universe.
Siddharth Bhat (Jul 23 2022 at 12:52):
@Mario Carneiro i don't follow this idea of using functions to mimic open type families. I'd be much obliged if there's a #mwe demonstrating this.
Mario Carneiro (Jul 24 2022 at 00:19):
The expression problem is the observation that when we have producers and consumers of an interface, there are interactions to be specified, and we can't have both the producer side and the consumer side be fully decoupled and extensible because then it is possible to introduce a producer in one import and a consumer in another import and there is no place to put the interaction between them.
So we must choose either the producer side or the consumer side to be "non-extensible", which in this case means that extensions, although possible, require changing the source code and updating everyone on the other side to accommodate.
The case where the producers are non-extensible and the consumers are extensible is well known to lean users: we create an inductive type containing a case for each producer, and then consumers pattern match on the inductive type; there is no need for coordination between the consumers so you can have as many as you like.
The case where the consumers are non-extensible and the producers are extensible is what type classes (or generally structures of functions) are for. We create a type class containing a field for each consumer (each thing we want to do with the data), and then the producers are instances of the type class. The producers don't need to coordinate with each other, and you can have as many as you like.
Mario Carneiro (Jul 24 2022 at 00:25):
Here is an illustration of the two approaches in code:
-- non-extensible Producer, extensible consumer
inductive Producer
| A : Producer
| B : Producer
def consumerC : Producer → String
| .A => "interaction of A and C"
| .B => "interaction of B and C"
def consumerD : Producer → String
| .A => "interaction of A and D"
| .B => "interaction of B and D"
-- non-extensible consumer, extensible Producer
structure Producer where
C : String
D : String
def producerA : Producer where
C := "interaction of A and C"
D := "interaction of A and D"
def producerB : Producer where
C := "interaction of B and C"
D := "interaction of B and D"
Mac (Jul 24 2022 at 02:37):
Mario Carneiro said:
we can't have both the producer side and the consumer side be fully decoupled and extensible because then it is possible to introduce a producer in one import and a consumer in another import and there is no place to put the interaction between them.
As far as I know, there is no reason (in the abstract) why you can't decouple all three, have the producers in one place, the consumers in another, and their interactions in a third.
Mac (Jul 24 2022 at 02:43):
Which is an approach taken by e..g., open types and open definitions.
Mac (Jul 24 2022 at 02:45):
That is, open types are extensible producers, open definitions are extensible consumers, and their interactions can be defined independently.
Mac (Jul 24 2022 at 02:46):
The Expression Problem, while a core concern in programming language design, does have many hypothesized solutions, many approaches of which have been implemented (sometimes only partially, though) in new languages.
Mac (Jul 24 2022 at 03:01):
@Mario Carneiro I feel my responses may have come off as overly critical. I so. I am sorry. The strong assertion that "we can't have both the producer side and the consumer side be fully decoupled and extensible" confused me. Figuring out the best way to do so is generally a main part of expression problem solutions whereas the statement made it sound to me like it was impossible. Or were you just talking about in Lean for the topic in question?
Mario Carneiro (Jul 24 2022 at 03:02):
No, I mean that it is a fairly fundamental issue in decoupling things
Mario Carneiro (Jul 24 2022 at 03:03):
Because of this thing, we need to know all consumers when we add a producer, and we need to know all producers when we add a consumer, and it's not obvious how to have them both without linearizing the addition of producers and consumers
Mario Carneiro (Jul 24 2022 at 03:06):
If you can separate the producers and the consumers via an "API", then you have essentially split the problem into two since each side only has to interact with the API, and then both sides can grow independently while the API becomes "non-extensible". This is totally possible. The expression problem is specifically for the case where there is no mediator, each producer wants to talk individually to each consumer and all pairwise interactions are potentially completely different
Mario Carneiro (Jul 24 2022 at 03:06):
(Note that in both examples, there are 4 string literals corresponding to every pairwise interaction of producer and consumer)
Kyle Miller (Jul 24 2022 at 03:08):
I thought I'd heard the tagless-final style solved the expression problem https://okmij.org/ftp/tagless-final/ (and a blogpost with an introduction)
This is a solution in the sense that it's possible to encode extensions to the API between producers and consumers as types, and given a valid producer/consumer pair, you can replace the producer with a "simpler" producer or the consumer with a "more complex" consumer, with respect to the API.
Mario Carneiro (Jul 24 2022 at 03:08):
I am assuming some aspects which could be considered specific to the lean setup, in particular that there is no "global view" of the entire set of things because everything could have a downstream dependent which performs some kind of extension
Mario Carneiro (Jul 24 2022 at 03:09):
so you have to decide in advance what things can be extended and what can't by downstream dependents
Mario Carneiro (Jul 24 2022 at 03:21):
A simplification of the tagless final pattern for lean is the observation that we can use typeclasses to just fill in elements of the matrix on an ad hoc basis:
class Interaction (producer consumer : Type) where
interaction : String
structure ProducerA
structure ProducerB
structure ConsumerC
structure ConsumerD
instance : Interaction ProducerA ConsumerC where
interaction := "interaction of A and C"
instance : Interaction ProducerA ConsumerD where
interaction := "interaction of A and D"
instance : Interaction ProducerB ConsumerC where
interaction := "interaction of B and C"
instance : Interaction ProducerB ConsumerC where
interaction := "interaction of B and D"
The downside of this pattern is that you don't know until you call the function whether the interaction between the two chosen types has in fact been defined
Mario Carneiro (Jul 24 2022 at 03:22):
You can also take either the producer side or the consumer side to be the typeclass with the other one being the argument
Mac (Jul 24 2022 at 06:53):
Mario Carneiro said:
I am assuming some aspects which could be considered specific to the lean setup, in particular that there is no "global view" of the entire set of things because everything could have a downstream dependent which performs some kind of extension
For many things in Lean which are extensible (e.g., syntax, environment extensions, etc.) it is entirely possible to freeze the setup at any given point and perform global optimizations.
Mac (Jul 24 2022 at 06:54):
The only real exceptions to this is the type system, but it also doesn't really have true open types in the first place that need freezing. Type classes could be an exception, but they are technically also freezable.
Mac (Jul 24 2022 at 06:55):
Also, specialization could be seen as a form of freezing as well.
François G. Dorais (Jul 24 2022 at 10:32):
Mac said:
As far as I know, there is no reason (in the abstract) why you can't decouple all three, have the producers in one place, the consumers in another, and their interactions in a third.
I don't disagree but I would like to point out that this type of polygamy might be illegal in some jurisdictions. :rolling_on_the_floor_laughing: https://media2.giphy.com/media/MuI2Oxjb9gCy9s4Nxq/giphy.gif?cid=c623cb35w3hvqnavuoz1eqhktax59hsh81dlevxza57i33ou&rid=giphy.gif&ct=g
Eric Wieser (Dec 01 2023 at 20:57):
Mac Malone said:
Would a PR with the basic universe polymorphic
IO
(requiringULift
) be acceptable?
I found myself wanting this basic version again; is a PR like lean4#3010 (which is enough for me to do the rest downstream if needed) acceptable?
Last updated: Dec 20 2023 at 11:08 UTC