Zulip Chat Archive

Stream: mathlib4

Topic: measured lists?


Thomas Murrills (May 05 2023 at 23:11):

Does a structure that encodes redundancy for list length exist in mathlib4/std4/lean4 currently? For my current tactic port I'm appending lots of lists (I need the sharing, so I can't use Arrays), and want to avoid measuring their length each time I do. I rolled my own (shown below), but want to make sure I'm not duplicating something. Or, is it advisable to just track the length internally to the function and deal with pairs (n, list)? I thought the structure approach would be more legible.

/-- A list redundantly accompanied by its length. This redundancy is not enforced with a `Prop` field, as this structure is used only for optimization. -/
structure MeasuredList (α) where
  /-- The number of elements in the stored list. -/
  count : Nat
  /-- The list represented by an instance of this structure. -/
  items : List α

(Oops, meant to post in #Is there code for X? , but I suppose this is ok.)

Adam Topaz (May 05 2023 at 23:40):

We have docs4#Vector where the length is part of the type

Scott Morrison (May 05 2023 at 23:41):

I'm sure this doesn't exist already. It's obviously a weird example, carrying the length around but neither carrying a proof or using dependent types. I'd suggest just defining this locally to its use case, and we can punt on whether others would want to use it.

Scott Morrison (May 05 2023 at 23:41):

@Adam Topaz: Thomas is looking for the run-time efficiency of a cached value, but doesn't want to have to deal with proofs, or DTT hell.

Adam Topaz (May 05 2023 at 23:41):

Gotcha.

Adam Topaz (May 05 2023 at 23:50):

I guess in principle you could just sorry all proofs (since props are dropped when the code is compiled), then the measured lists would be essentially equivalent to the sigma type of Vector n as n varies.

Eric Wieser (May 06 2023 at 04:13):

Is this a place where computed properties can be used?

Thomas Murrills (May 06 2023 at 04:48):

Hmm, are you referring to computed_fields? (Interesting, I didn’t know about these!) Would that let me specify transformations on computed fields somehow? Like I’m guessing this would look something like

inductive MeasuredList (α) where
| mk : List α  MeasuredList α
with
  @[computed_fields] size : MeasuredList α  Nat
    | .mk l => l.length

right? (untested, on mobile) Then I would need something like

def MeasuredList.append : MeasuredList α MeasuredList α  MeasuredList α
| .mk l, .mk l' => .mk (l ++ l')

but I’d worry Lean would then have to recompute the computed field for the result. Is there a way to somehow tell Lean the quick way to calculate the new computed field (just add the two existing computed fields) in this definition?

Kyle Miller (May 06 2023 at 08:43):

List append involves iterating through the whole first list, so you don't get sharing there -- I just want to check that this is OK (maybe you're always appending small lists to the front?) Otherwise I'd suggest working with a tree of list/size pairs, where the size is the sum of all lengths in the tree, if possible

Thomas Murrills (May 06 2023 at 08:44):

I actually wound up finding computed fields useful, but in a different way! But: is this doing what I think it is? Here s is a "size" function I'd like to be generic, if possible. I'm not sure if that parameter is messing the memoization up.

inductive Foo (α) (s : α  Nat) where
| of : α  Foo α s
| node : Foo α s  Foo α s  Foo α s
with
  @[computed_field] size :  α s, Foo α s  Nat
  | _, s, .of a => s a
  | _, s, .node t₁ t₂ => t₁.size + t₂.size

Thomas Murrills (May 06 2023 at 08:45):

Oops, sorry, just so happened to be writing that before your message came in! Yes, I was appending small lists to the front, but I've now switched to trees and gluing them all together later, as you can see is hinted at from the example :)

Thomas Murrills (May 06 2023 at 08:47):

I'm also trying to store the sum of the lengths as a computed field at each node—I could just store it in the node the usual way, but I thought I may as well use this tech if it's available.

Kyle Miller (May 06 2023 at 08:50):

I think that if Lean accepts that code, then you should expect the computed field to memoize correctly (after all, Lean has strict evaluation and needs to fill in the size field). We could check the generated IR or C code just to be sure

Kyle Miller (May 06 2023 at 09:01):

Some relevant IR (from set_option trace.Compiler true):

      def Foo.node._override._redArg a.1 a.2 : Foo._impl lcErased lcErased :=
        jp _jp.3 _y.4 _y.5 : Foo._impl lcErased lcErased :=
          let _x.6 := Nat.add _y.4 _y.5;   --- Sizes are added here
          let _x.7 := Foo.node._impl _  _x.6 a.1 a.2;
          return _x.7;
        jp _jp.8 _y.9 : Foo._impl lcErased lcErased :=
          cases a.2 : Foo._impl lcErased lcErased -- sizes of the second argument are extracted from a field here
          | Foo.of._impl size a.10 =>
            goto _jp.3 _y.9 size
          | Foo.node._impl size a.11 a.12 =>
            goto _jp.3 _y.9 size;
        cases a.1 : Foo._impl lcErased lcErased -- sizes of the first argument are extracted from a field here
        | Foo.of._impl size a.13 =>
          goto _jp.8 size
        | Foo.node._impl size a.14 a.15 =>
          goto _jp.8 size
      def Foo.node._override α s a.1 a.2 : Foo lcErased lcErased := -- override of the `Foo.node` constructor
        let _x.3 := Foo.node._override._redArg a.1 a.2;
        return _x.3
      def Foo.size._override α s x : Nat := -- override of the `Foo.size` accessor; gets the pre-stored fields
        cases x : Nat
        | Foo.of._impl size a.1 =>
          return size
        | Foo.node._impl size a.2 a.3 =>
          return size

Thomas Murrills (May 06 2023 at 09:18):

Oh, that's interesting! Thanks for the comments, I might not have felt confident interpreting that IR otherwise. I can make sense of it pretty well now! :)

Thomas Murrills (May 06 2023 at 09:28):

Though, thinking about it, these are both for size on the .node constructor, which takes no meaningful parameters—my inclusion of s was unnecessary. I suppose the issue, if there was one, would likely be in Foo.of._override. But I'll continue tomorrow...

Thomas Murrills (May 06 2023 at 09:30):

Ah, I just checked, and it does immediately perform let _x.2 := s a.1 :)

Kyle Miller (May 06 2023 at 09:30):

Time for more IR!

      def Foo.of._override._redArg s a.1 : Foo._impl lcErased lcErased :=
        let _x.2 := s a.1; -- call the function
        let _x.3 := Foo.of._impl _  _x.2 a.1; -- save it using the actual 2-argument constructor
        return _x.3
      def Foo.of._override α s a.1 : Foo lcErased lcErased :=
        let _x.2 := Foo.of._override._redArg s a.1;
        return _x.2

Thomas Murrills (May 06 2023 at 09:31):

Great! Oh, I was wondering! What is that operation?

Kyle Miller (May 06 2023 at 09:33):

I think it's a marker that the argument was erased since it's irrelevant? The s argument doesn't need to actually be stored inside constructors since it's a parameter to the inductive type.

Kyle Miller (May 06 2023 at 09:34):

It looks like _ is used to mark arguments that are inherently irrelevant (like types)

Kyle Miller (May 06 2023 at 09:34):

I'm speculating, but at least in Lean 3 all erased arguments were displayed with a box.

Thomas Murrills (May 06 2023 at 09:37):

Hmm, that would make sense! Though if that's the case, I wonder why the argument just before it isn't a box as well, but an underscore, since I'm guessing it's not needed either... (Maybe it's things which have just been erased?)

Kyle Miller (May 06 2023 at 09:39):

That's a type -- I'm figuring that type and proof arguments are erased first because they're never computed, but then there's a pass to erase arguments that correspond to constructor parameters. My guess is that the first sort of erasure is indicated differently from the second type.

Thomas Murrills (May 06 2023 at 09:41):

Sounds reasonable! I searched in the lean4 repo, and is indeed arguments or IRTypes which are irrelevant internally. Maybe a closer look at Lean.Compiler.IR.Format would reveal the rule for _ as well.


Last updated: Dec 20 2023 at 11:08 UTC