Zulip Chat Archive

Stream: lean4

Topic: Label attribute improvements?


Thomas Murrills (Mar 16 2024 at 19:02):

There are a couple of minor improvements/fixes I'd like to make to the recently-upstreamed-from-std4 Lean.LabelAttribute:

  • pass the AttributeKind to ext.add (currently every label is added globally, and there's no error if someone uses local, so this doesn't seem intentional (is it?))
  • allow users to provide a validate function and specify the ApplicationTime in the same way TagAttribute does
  • improve/fix docs, specifically
    • fix reference to Std.Tactic.LabelAttr.labelled
    • explain difference in usage from TagAttribute further in the module doc (TagAttribute is for checking if a given decl has the tag, LabelAttribute is for getting a collection of all labelled decls)
    • after allowing validate etc.: point users from the simple macro to registerLabelAttr if they need these features

Would a PR (or PRs) be welcome for these?

Kyle Miller (Mar 16 2024 at 19:14):

  • local handling seems like a good idea, but isn't local already being handled by SimpleScopedEnvExtension? (If not, could you make a #mwe ?)
  • Do you have a use case for specifying the ApplicationTime or for needing to validate that you need this for?
  • That difference between TagAttribute and LabelAttribute doesn't seem right to me. It's possible to retrieve a list of every declaration that has a given TagAttribute, as far as I've understood. I thought the big differences were (1) you can add a LabelAttribute in a different module from the declaration and (2) the LabelAttribute can be locally scoped.

Thomas Murrills (Mar 16 2024 at 19:16):

The issue with local handling is that add for the attribute registered by mkLabelAttr looks like this:

  add   := fun declName _ _ =>
    ext.add declName

where ext.add uses SimpleScopedEnvExtension's add, and has a default AttributeKind.global argument. (I'll make a #mwe to make sure this does what I expect, though.) So, it can be handled by SimpleScopedEnvExtension, but we just don't pass the data along.

Thomas Murrills (Mar 16 2024 at 19:18):

  • I do have a case for validate, where I want to make sure everything added with the label has a certain type. I don't have a case for the application time, but I figure if it's in core, someone might want it; consistency with TagAttribute is probably good; and it's very easy/low maintenance to allow this feature. Though, honestly, I'm not familiar yet with what effects this has, so maybe it doesn't make sense.

Thomas Murrills (Mar 16 2024 at 19:22):

  • How do you retrieve a list of every tagged declaration with TagAttribute? My understanding is that it only provides hasTag, and works by looking in the extension data in the olean of the module of the given decl (or looking in the current environment if it's been tagged in the current module); and likewise addImportedFn is fun _ _ => pure {}, so we don't wind up collecting all tagged decls in something accessible in the environment.

Kyle Miller (Mar 16 2024 at 19:29):

Thomas Murrills said:

So, it can be handled by SimpleScopedEnvExtension, but we just don't pass the data along.

I think with a test case and a small modification to pass that data along, it'd likely be welcome.

Thomas Murrills said:

consistency with TagAttribute is probably good

If this is giving LabelAttribute a similar interface to TagAttribute, then I agree. A validate function makes sense to me as well, but it's good to be sure it's being motivated by a concrete application.

Kyle Miller (Mar 16 2024 at 19:32):

There's a big performance improvement on the table for LabelAttribute.

The Array Name should be in sorted order, like for the TagAttribute arrays, so that (1) when looking up, it can do a binary search and (2) when loading a module, it merges the sorted arrays.

Thomas Murrills (Mar 16 2024 at 19:36):

Would merging that way be more costly on import? I feel like LabelAttribute should be optimized for the application of "give me a collection of all declarations" while leaving lookup to TagAttribute (if I'm right about the distinction)—and/or create a different sort of attribute that's a "flexible TagAttribute" (allowing cross-module declarations but still being used for lookup only).

Thomas Murrills (Mar 16 2024 at 19:43):

(Re: the last part, see somewhere near the end of #general > pointless tactic linter; though maybe those messages should be moved.)

Patrick Massot (Mar 16 2024 at 19:44):

I confirm label attributes don’t handle local, I’ve been bitten by this in the past.

Patrick Massot (Mar 16 2024 at 19:44):

See https://leanprover.zulipchat.com/#narrow/stream/348111-std4/topic/register_label_attr.20and.20local.20attribute/near/402959403

Kyle Miller (Mar 16 2024 at 19:45):

The asymptotic running time of merging two sorted lists is O(n+m)O(n+m), and two non-sorted lists is O(nm)O(nm), so it's the opposite of more costly :-)

Kyle Miller (Mar 16 2024 at 19:45):

Here's how you can get a list of all the TagAttribute declarations:

import Lean
import Mathlib

open Lean

def Lean.TagAttribute.getAll (attr : TagAttribute) (env : Environment) : Array Name := Id.run do
  let decls := (attr.ext.toEnvExtension.getState env).importedEntries.flatten
  return decls ++ (attr.ext.getState env).toArray

elab "#match_patterns" : command => do
  let decls := matchPatternAttr.getAll ( getEnv)
  logInfo m!"Decls: {MessageData.joinSep (decls.map fun n => n).toList ", "}"

#match_patterns
/-
Decls: Unit.unit, HEq.rfl, Add.add, List.toArray, HAdd.hAdd, Nat.add, Neg.neg, rfl, Char.ofNat,
Nat.cast, Int.cast, BitVec.ofNatLt, BitVec.ofNat, Lean.mkApp9, Lean.mkApp7, Lean.mkApp8,
Lean.mkAppB, Lean.mkApp10, Lean.mkApp4, Lean.mkApp6, Lean.mkApp2, Lean.mkApp3, Lean.mkApp,
Lean.mkApp5, Lean.IR.Alt.ctor, Lean.IR.Alt.default, Lean.Meta.AC.bin, bit1, bit0, Rat.cast,
Sum3.in₂, Sum3.in₁, Sum3.in₀, Bot.bot, Top.top, ofLex, toLex, WithBot.some, WithTop.some,
Functor.AddConst.mk, Functor.Comp.mk, Functor.Const.mk, Mathlib.Meta.NormNum.Result.isFalse,
Mathlib.Meta.NormNum.Result.isRat, Mathlib.Meta.NormNum.Result.isNat,
Mathlib.Meta.NormNum.Result.isNegNat, Mathlib.Meta.NormNum.Result.isTrue, Vector.cons, Vector.nil,
Sym.nil, Sym.cons, Sym.mk, Sum.inlₗ, Sum.inrₗ, CategoryTheory.Limits.WalkingCospan.Hom.inl,
CategoryTheory.Limits.WalkingCospan.right, CategoryTheory.Limits.WalkingSpan.left,
CategoryTheory.Limits.WalkingCospan.Hom.id, CategoryTheory.Limits.WalkingCospan.one,
CategoryTheory.Limits.WalkingSpan.right, CategoryTheory.Limits.WalkingSpan.Hom.fst,
CategoryTheory.Limits.WalkingSpan.Hom.id, CategoryTheory.Limits.WalkingSpan.zero,
CategoryTheory.Limits.WalkingCospan.Hom.inr, CategoryTheory.Limits.WalkingSpan.Hom.snd,
CategoryTheory.Limits.WalkingCospan.left, UniformSpace.mk', ENNReal.ofNNReal, SymAlg.sym,
Booleanisation.comp, Booleanisation.lift, SimpleGraph.Adj.toWalk, SimpleGraph.Walk.nil',
SimpleGraph.Walk.cons', SimpleGraph.Coloring.mk, SNum.not, NzsNum.not, SNum.bit, Multiset.mkToType,
Vector3.nil, Vector3.cons, Hamming.ofHamming, Hamming.toHamming,
FirstOrder.Language.BoundedFormula.not, FirstOrder.Language.BoundedFormula.ex, NatOrdinal.toOrdinal,
Ordinal.toNatOrdinal, Lists.of', Lists.atom, OnePoint.infty, OnePoint.some,
Topology.WithLower.ofLower, Topology.WithUpper.toUpper, Topology.WithLower.toLower,
Topology.WithUpper.ofUpper, Topology.WithUpperSet.toUpperSet, Topology.WithLowerSet.ofLowerSet,
Topology.WithUpperSet.ofUpperSet, Topology.WithLowerSet.toLowerSet, Specialization.toEquiv,
Specialization.ofEquiv, Topology.WithScott.toScott, Topology.WithScott.ofScott
-/

Thomas Murrills (Mar 16 2024 at 19:54):

Huh, interesting—how does that work? And how does it square with the notion that for TagAttributes, we're meant to look up whether a decl has a tag by going to the olean of the module it was defined in? At some point, are we just going through all decls in the environment and checking if they're tagged?

Kyle Miller (Mar 16 2024 at 19:56):

Easy: the TagAttribute stores an Array (Array Name) indexed by module id.

Kyle Miller (Mar 16 2024 at 19:56):

Flattening that array gives you the array of all declarations with that attribute. (Minus the ones defined in the current module.)

Thomas Murrills (Mar 16 2024 at 19:57):

Hmm—where/how does it do that? And what's the (non?)-relation with addImportedFn being constant?

Kyle Miller (Mar 16 2024 at 20:00):

Isn't addImportedFn for initializing the state of the extension once everything is imported? The state for TagAttribute is just the declarations tagged in the current module.

My understanding is that the argument that's passed to it (the Array (Array Name)) is still accessible from Lean.PersistentEnvExtensionState.importedEntries, which is what hasTag is making use of.

Kyle Miller (Mar 16 2024 at 20:06):

Hm, maybe what I said about making LabelAttribute use sorted lists isn't possible since it uses docs#Lean.ScopedEnvExtension.addImportedFn

Still, it should be possible to use NameSet instead of Array Name for the state, which would speed things up in theory (according to the asymptotics at least).

Thomas Murrills (Mar 16 2024 at 20:08):

Kyle Miller said:

Isn't addImportedFn for initializing the state of the extension once everything is imported? The state for TagAttribute is just the declarations tagged in the current module.

My understanding is that the argument that's passed to it (the Array (Array Name)) is still accessible from Lean.PersistentEnvExtensionState.importedEntries, which is what hasTag is making use of.

Ah, ok! This is making more sense now. :)

Thomas Murrills (Mar 16 2024 at 20:16):

But I guess then I'd like to reconcile it with what @Mario Carneiro said here. I'm figuring the "loading from disk for ~free using mmap" part is the construction of importedEntries, and the overhead would be flattening (if performed). For TagAttributes we decide that the NameSet state in the environemnt created by addImportedFn should start fresh in each file, and only include current-module tagged decls. Access is quick because it's either "access the untouched importedEntries array at the right index, or use the small, current NameSet". (If that's right, then things make sense to me.)

Thomas Murrills (Mar 16 2024 at 20:21):

Kyle Miller said:

Hm, maybe what I said about making LabelAttribute use sorted lists isn't possible since it uses docs#Lean.ScopedEnvExtension.addImportedFn

Still, it should be possible to use NameSet instead of Array Name for the state, which would speed things up in theory (according to the asymptotics at least).

I've been seeing NameSet around and was wondering: do you know if folding over a NameSet is more/less/~equally efficient as folding over an Array? Sometimes (as in my application) that's what you want to do with the bag of decls you get. Or, would it even make more sense to doubly fold over importedEntries directly, without flattening first?

I guess I'm worried about the fact that you'd essentially be folding over every single module, most of them empty. But maybe you have to do that anyway when using addImportedFn?

Kyle Miller (Mar 16 2024 at 20:29):

Just guessing, folding over an array is more efficient, but the same O(n)O(n) running time.

Or, would it even make more sense to doubly fold over importedEntries directly, without flattening first?

For LabelAttribute? I don't think we have that choice. It's a scoped attribute, so it has to process all the global vs scoped entries anyway in docs#Lean.ScopedEnvExtension.addImportedFn, which is what calls docs#Lean.ScopedEnvExtension.Descr.addEntry per entry

Thomas Murrills (Mar 16 2024 at 20:31):

Right, I was wondering about the general case of "how to store names then get all of them", not necessarily LabelAttribute—basically, "is TagAttribute optimized for getting/folding over collections of names, and if not, what would be".

I suppose I'll mention more properties of my specific application as motivation: I need to get and fold over a small growable collection of names which will never be erased from, and which doesn't need scoping.

(I'm actually wondering if maybe a simple IO.Ref of some sort is most appropriate here, since it won't be managed by the user...but I'm not familiar yet with doing things at such a low level.)

Thomas Murrills (Mar 16 2024 at 20:36):

(Btw, parallel conversation, but #mwe for label attributes not handling scoping:

import Std.Test.Internal.DummyLabelAttr

section
@[local dummy_label_attr] def x := true
end

#eval Lean.labelled `dummy_label_attr -- #[`x]

I'm not sure how to include this test in a Lean PR itself, though, given that afaik no label attributes are registered in core and you can't import test files; maybe the fix is straightforward enough that it doesn't need one...)

Kyle Miller (Mar 16 2024 at 20:38):

Thanks. It seems good to include in a PR description to illustrate the issue (or to create an issue with this and then link to that issue in the PR).

Kyle Miller (Mar 16 2024 at 20:40):

When I wrote Lean.TagAttribute.getAll I was trying to think about whether a Lean.TagAttribute.foldM would be better, but it's really hard to say without profiling.

Sometimes things are faster if you create a temporary structure that you immediately throw away. There might be a lot of overhead in folding a function over all the entries, vs running an efficient loop over the returned array. Though maybe with compiler magic (inlining) the fold could become a nested for loop.

Thomas Murrills (Mar 16 2024 at 20:45):

Oh, interesting—are nested fors more efficient than nested folds?

Kyle Miller (Mar 16 2024 at 20:56):

Calling a function has overhead (from maintaining stack frames), and fors should compile into loops (on the Lean side, into recursive functions that do tail calls, which the C compiler can turn into jumps rather than calls).

folds seem to be marked @[inline] though, so they should be loops too.

Thomas Murrills (Mar 16 2024 at 20:57):

Oh, good to know! (Especially that I haven't been slowing things down when using fold instead of for.)

Thomas Murrills (Mar 16 2024 at 20:58):

Kyle Miller said:

Thanks. It seems good to include in a PR description to illustrate the issue (or to create an issue with this and then link to that issue in the PR).

Issue: lean4#3697; PR: lean4#3698

Thomas Murrills (Mar 16 2024 at 21:06):

One question re: ApplicationTime: why does TagAttribute apply after typechecking, but LabelAttribute applies after compilation?

Thomas Murrills (Mar 16 2024 at 21:30):

lean4#3699 is a draft PR for bringing registerLabelAttr into consistency with registerTagAttribute, including making the default applicationTime .afterTypeChecking. There are a couple of opinions I'd like to solicit, though:

Thomas Murrills (Mar 16 2024 at 21:31):

  1. Should the name be modified to registerLabelAttribute for consistency with registerTagAttribute? (If so, does that include register_label_attr += ibute?)

Thomas Murrills (Mar 16 2024 at 21:33):

  1. Should the macro accept an optional argument for validation, or just point users to registerLabelAttr(ibute) in the docstring if they need that feature?

Damiano Testa (Mar 16 2024 at 22:07):

This issue seems relevant as well.

Damiano Testa (Mar 16 2024 at 22:08):

This issue seems relevant as well.

Damiano Testa (Mar 16 2024 at 22:18):

(It is only one issue, repeated since I'm on mobile.)


Last updated: May 02 2025 at 03:31 UTC