Zulip Chat Archive
Stream: lean4
Topic: Understanding the docstring for docs#Array.data
Eric Wieser (Oct 26 2023 at 14:45):
I'm confused by docs#Array.data and how it relates to docs#Array.toList; the former is documented as:
structure Array (α : Type u) where
/-- Convert a `List α` into an `Array α`. This function is overridden
to `List.toArray` and is O(n) in the length of the list. -/
mk ::
/-- Convert an `Array α` into a `List α`. This function is overridden
to `Array.toList` and is O(n) in the length of the list. -/
data : List α
Does "X is overridden to Y" mean "X is implemented_by
Y"? Does it mean "X
is an implementation detail, use Y
instead"?
Alex J. Best (Oct 26 2023 at 14:51):
It looks like it means the line below that in the source attribute [extern "lean_array_data"] Array.data
which is similar in meaning to implemented_by
yes
Joachim Breitner (Oct 26 2023 at 14:59):
And that C function is simply calling Array.toList
in Lean land. This detour is I think needed implemented_by
would insis on equal types. But otherwise it's the same. What's a better wording? “This projection is implemented by Array.toList
” maybe?
Alex J. Best (Oct 26 2023 at 15:09):
How do the types differ here?
Joachim Breitner (Oct 26 2023 at 15:12):
Hmm, nevermind, I confused it with something else I tried with implemented_by.
Joachim Breitner (Oct 26 2023 at 15:15):
So the C code merely translates between the different ABIs for extern
and export
:
https://github.com/leanprover/lean4/blob/6c5f79c0df01e9cd74ac4ee411ec392a5ac1b5d3/src/runtime/object.cpp#L346
(Why is it even different?)
So why isn't it using implemented_by
? Because it'd an unsupported forward reference?
Eric Wieser (Oct 26 2023 at 15:25):
Joachim Breitner said:
What's a better wording? “This projection is implemented by
Array.toList
” maybe?
Perhaps "implemented at runtime". Also there should be a docstring on Array.toList
that says "do not use this function, it is an implementation detail; use Array.data
instead"
Joachim Breitner (Oct 26 2023 at 15:33):
Or the other way around (I think using Array.toList
, and treating Array
as an opaque type, is more natural, but not sure.)
Eric Wieser (Oct 26 2023 at 15:35):
Indeed; I guess my point is that the documentation makes it very hard to tell which pair of functions is the implementation detail! (it doesn't help that I would expect "overridden by" not "overridden to"...)
Joachim Breitner (Oct 26 2023 at 15:37):
I think the important message here is that .mk
and .data
are not just constructor and projection, as you would expect, and have very different performance characteristics. I’ll draft a PR and let you refine it.
Eric Wieser (Oct 26 2023 at 15:41):
Would renaming .mk
to ofList
and data
to toList
be reasonable, and renaming the existing toList
to private toListImpl
or similar?
Joachim Breitner (Oct 26 2023 at 15:45):
Probably. But when doing proofs, where these really are just constructor and projection, it’s nice to have .mk
around, and to have a nicer variable name in inductions? That’s the only reason I can think of right now why you might want to keep them separate.
Joachim Breitner (Oct 26 2023 at 15:47):
Can you have a look at lean4#2771, refine if needed, and express support otherwise? Then I’ll mark it ready for review.
Joachim Breitner (Oct 26 2023 at 15:48):
(I did not explicitly describe one as “internal only”)
Eric Wieser (Oct 26 2023 at 15:55):
I think this doesn't solve the fundamental question I'm trying to answer, which is "how do I pick between .mk
and .ofList
?"
Joachim Breitner (Oct 26 2023 at 16:12):
Hmm. My intuition is: Pick .ofList
if you are treating Arrays
as an opaque data structure (e.g. when programming). Pick .mk
if you are treating it as a wrapper around Lists (e.g. when proving).
But I don’t know if this intuition is a good one, and if it is, how to concisely convey it.
Eric Wieser (Oct 26 2023 at 16:23):
I'd claim that you never want .ofList
, because it equal to and has the same performance characteristics as .mk
but is harder to prove things about
Joachim Breitner (Oct 26 2023 at 16:28):
But it’s much easier to find, if you have a List
and want an Array
, and don’t know about the internal definition of Array
:-D
Eric Wieser (Oct 26 2023 at 16:29):
Well then mk
has the wrong name!
Joachim Breitner (Oct 26 2023 at 16:29):
Right
Joachim Breitner (Oct 26 2023 at 16:30):
I guss I wouldn’t oppose a PR that renames .mk
and .data
and makes the other functions private.
Eric Wieser (Oct 26 2023 at 16:30):
Joachim Breitner said:
Probably. But when doing proofs, where these really are just constructor and projection, it’s nice to have
.mk
around
I don't really buy that; in a proof you'd probably just use the angle brackets anyway, and .ofList
is hardly much worse to type
and to have a nicer variable name in inductions?
Doesn't induction
pretty much force us to name the induction argument manually anyway? I agree toList
is slightly weirder than data
as the induction variable, but there's not much in it
Joachim Breitner (Oct 26 2023 at 16:31):
That’s true, the auto-picked name is rarely used in the end. I’m convinced.
Mac Malone (Oct 26 2023 at 18:08):
Joachim Breitner said:
So why isn't it using
implemented_by
? Because it'd an unsupported forward reference?
The forward reference concern can be elided by using e.g. attribute [implemented_by Array.toList] Array.data
after the relevant function is defined. I think this may be an interesting question to ask @Sebastian Ullrich or Leo because it seems to me that, if possible, using implemented_by
would be much clearer (as it would be deducible from the Lean level that one implements the other without needing to look at the C code).
Joachim Breitner (Oct 26 2023 at 18:33):
If you set thr attribute later, will it still affect the compilation of all definitions in between?
Mac Malone (Oct 26 2023 at 18:54):
@Joachim Breitner No, but the in-between definitions shouldn't be using Array.mk
/data
(in their compiled code) anyway. For Array.mk
, this certainly true as List.toArray
is defined a little ways down in the same file. Unfortunately, Array.toList
is defined separately in Init.Data.Array.Basic
, so it is harder to keep Array.data
honest (even though I suspect it still is). Regardless, implemented_by
simply does not work across files (understandably), so that separation does prevent it from being use for Array.data
. Thus, now having relooked at it, supporting this file separation is probably (at least one reason) why the export/extern trick was used.
Mac Malone (Oct 26 2023 at 18:54):
As an aside, If it was possible to mark the structure constructors and projections noncomputable
, one could have Lean force them not to be used in compiled code until they are provided with computable definitions via implemented_by
.
Mario Carneiro (Oct 26 2023 at 21:53):
FWIW I agree with Eric here: data
should be preferred, but it should be renamed to toList
and toList
should never be used directly (it can even be private, although it is nice to have the proof of toList = toListImpl
somewhere to make sure this isn't us pulling a fast one).
Mac Malone (Oct 27 2023 at 02:33):
@Mario Carneiro I imagine it should be the reverse (toListImpl = toList
) as a @[simp]
lemma to make sure it disappears if it accidently used somewhere.
Mario Carneiro (Oct 27 2023 at 02:33):
If it's not accessible without hacks (or even if it only has a weird name), I'm not too worried about it being accidentally used
Mario Carneiro (Oct 27 2023 at 02:34):
but sure, that ordering sounds fine
Mario Carneiro (Oct 27 2023 at 02:34):
I haven't seen any issues with people calling List.mapTR
by mistake
Mario Carneiro (Oct 27 2023 at 02:35):
Although those functions do show up in docgen and loogle; it might be nice to have some @[hidden]
attribute to indicate that it isn't meant for general use
Joachim Breitner (Oct 27 2023 at 07:36):
Related to functions showing up in docs that shouldn’t, also see https://github.com/leanprover/lean4/issues/2719 about let
/where
functions showing up in docs and in doc linters.
Last updated: Dec 20 2023 at 11:08 UTC