Zulip Chat Archive
Stream: lean4
Topic: Nested-inductive-safe types
Mario Carneiro (Apr 17 2024 at 16:00):
Inspired by @Markus Himmel's PR std4#748, I'd like to talk about how we should approach the issue of types being used in nested inductives, in the manner of:
inductive Foo
| mk : MyDataStructure Foo -> Foo
Let's say that we are making MyDataStructure
and want users to be able to write the above. In this case:
MyDataStructure
must be an inductive typeor definition wrapping one. (In particular it cannot beopaque
)- it must not contain any proofs, subtypes, or well formedness assumptions in it, because these use the value negatively
It is possible to encode a number of well formedness properties using only inductive types. I once saw a Coq talk on it. But my takeaway is that this is an extraordinarily painful road to take, and lean has been steadily evolving in the direction of more abstraction hiding and less reliance on definitional equality (e.g. using proofs by well founded recursion).
Nested inductives, as they are currently designed, are fundamentally abstraction breaking. If you write an inductive such as the one above, the Foo.rec
theorem will write out all the details of every inductive type involved in the definition of MyDataStructure
. This does not have to be the case; the way inductives are handled in Isabelle, adapted to lean with the QPF package, uses something closer to typeclasses asserting that MyDataStructure
is suitably functorial, and does not involve unwrapping any definitions.
It is undeniably a use case that people will want these kind of inductive types, but I am not sure following that thread will take us in a good direction. I'm curious how other people think we should approach this issue in the standard library and elsewhere.
Joachim Breitner (Apr 17 2024 at 16:18):
I had serious trouble recently because I could not nest under IO
, even though if unrolled completely, it would be a positive occurrence, I believe, and had to work around it. So more flexibility here might have helped me.
Mario Carneiro (Apr 17 2024 at 16:23):
ah, so actually my "or definition wrapping one" is not true
inductive Foo -- ok
| mk : List Foo → Foo
abbrev List' := List
inductive Foo' -- fail
| mk : List' Foo' → Foo'
Mario Carneiro (Apr 17 2024 at 16:30):
Joachim Breitner said:
I had serious trouble recently because I could not nest under
IO
, even though if unrolled completely, it would be a positive occurrence, I believe, and had to work around it. So more flexibility here might have helped me.
This is another case where if we had gone this way it would end up with backward incentives, since if we make IO
nested-inductive-safe (and promise this for the future), then we can't also make it opaque
Markus Himmel (Apr 18 2024 at 10:12):
@Mario Carneiro For something like HashMap
, I would want to consider the following things to be the public API:
- That there is a type
HashMap
that is not necessarily safe to use in nested inductives - That there is a type
HashMap.Imp
(perhaps deserving to be renamed) that is safe to use in nested inductives - All of the operations on
HashMap
andHashMap.Imp
likeinsert
,findEntry?
, etc. - That there is a predicate
HashMap.Imp.WF
that is preserved by the operations that modify the hash map - All of the (mostly yet-to-be-developed) lemmas for
HashMap.Imp
andHashMap
, with theHashMap.Imp
lemmas taking an extraHashMap.Imp.WF
- (Maybe) an equivalence between
HashMap
and{ m : HashMap.Imp // m.WF }
that is compatible with all of the operations
There are other things that I would not consider to be part of the public API, for example
- The definitions of
HashMap.Imp
andHashMap
- Things like
HashMap.rec
andHashMap.Imp.rec
- Helper functions like
reinsertAux
I agree that making sure that HashMap.Imp
is usable in nested inductives now and in the future is somewhat annoying given what is currently possible in Lean. I also agree with you that today it is not possible to prevent users from relying on the things that I do not consider to be part of the public API on a technical level. I do however think that as long as we clearly communicate our promises and non-promises in the documentation, it's acceptable to be abstraction-breaking on a technical level, and if users choose to ignore that, they will have to adapt their developments. I also think that it is worth the effort to support usability of our datastructures in nested inductives starting now, and that this does not prevent us from taking advantage of better technical solutions like the ones you referenced, should they become available in the future.
Timo Carlin-Burns (Apr 18 2024 at 16:35):
Wouldn't it be preferable if HashMap
was safe to use in nested inductives and HashMap.Imp
could remain an implementation detail?
Mario Carneiro (Apr 18 2024 at 17:04):
Markus Himmel said:
There are other things that I would not consider to be part of the public API, for example
- Things like
HashMap.rec
andHashMap.Imp.rec
Note that I didn't say MyDataStructure.rec
in the above description, I said Foo.rec
. When someone writes a nested inductive involving HashMap.Imp
, the definition of HashMap.Imp
is made visible as part of Foo.rec
, and I regard this as a major privacy violation given that we can't reasonably ask people not to use Foo.rec
(how else would they make use of the nested inductive they just defined?), but it still contains basically the same information as HashMap.Imp.rec
(which we want to be private).
Mario Carneiro (Apr 18 2024 at 17:10):
I also think that it is worth the effort to support usability of our datastructures in nested inductives starting now, and that this does not prevent us from taking advantage of better technical solutions like the ones you referenced, should they become available in the future.
When this means things like "definitions are not allowed", I just can't bring myself to agree with this. Nested inductives are a poorly supported feature in lean already, and they have some structural problems. Bending over backwards to support them in our data structures is going to actively make them worse along other axes and also will take more work to revert once we have a better mechanism that doesn't leak these details and will just allow people to write inductive Foo | mk: HashMap A Foo -> Foo
which is what they want anyway.
Markus Himmel (Apr 19 2024 at 07:26):
Mario Carneiro said:
When someone writes a nested inductive involving
HashMap.Imp
, the definition ofHashMap.Imp
is made visible as part ofFoo.rec
, and I regard this as a major privacy violation given that we can't reasonably ask people not to useFoo.rec
(how else would they make use of the nested inductive they just defined?), but it still contains basically the same information asHashMap.Imp.rec
(which we want to be private).
This is a good point, however I would expect that almost everyone uses the equation compiler/pattern matching instead of invoking Foo.rec
directly, and with that you only get exposed to the internals of the data structure if you actively force it. So I think this is a non-issue in almost all practical applications.
Markus Himmel (Apr 19 2024 at 07:33):
Mario Carneiro said:
I also think that it is worth the effort to support usability of our datastructures in nested inductives starting now, and that this does not prevent us from taking advantage of better technical solutions like the ones you referenced, should they become available in the future.
When this means things like "definitions are not allowed", I just can't bring myself to agree with this. Nested inductives are a poorly supported feature in lean already, and they have some structural problems. Bending over backwards to support them in our data structures is going to actively make them worse along other axes and also will take more work to revert once we have a better mechanism that doesn't leak these details and will just allow people to write
inductive Foo | mk: HashMap A Foo -> Foo
which is what they want anyway.
I agree that the current state of what's possible in Lean is not great, and I think that it is a very good idea to explore ways to make this better. However, at least for the map-like data structures in Std, it's really not that bad: docs#Std.RBNode already works in nested inductive types without any changes, and the changes in std4#748 seem innocous enough to me.
Mario Carneiro (Apr 19 2024 at 09:38):
Markus Himmel said:
Mario Carneiro said:
When someone writes a nested inductive involving
HashMap.Imp
, the definition ofHashMap.Imp
is made visible as part ofFoo.rec
, and I regard this as a major privacy violation given that we can't reasonably ask people not to useFoo.rec
(how else would they make use of the nested inductive they just defined?), but it still contains basically the same information asHashMap.Imp.rec
(which we want to be private).This is a good point, however I would expect that almost everyone uses the equation compiler/pattern matching instead of invoking
Foo.rec
directly, and with that you only get exposed to the internals of the data structure if you actively force it. So I think this is a non-issue in almost all practical applications.
I agree to an extent, but it is frequently the case that this is insufficient; we are still missing a lot of tools needed to be able to make definitions over such nested inductives such that well founded recursion works correctly. As long as it's all partial
functions there isn't too much issue though.
Last updated: May 02 2025 at 03:31 UTC