Zulip Chat Archive

Stream: mathlib4

Topic: porting simps and to_additive


Floris van Doorn (Nov 30 2021 at 13:20):

I want to start porting some meta code to Lean 4, and I want to start with the things I'm most familiar with, which are @[simps] and @[to_additive].

I'll have to ask a bunch of questions along the way, so to start:
If in Lean 3 we used an attribute with a cache value, how do we do this in Lean 4? With a PersistentEnvExtension directly?

Floris van Doorn (Nov 30 2021 at 13:31):

Although I guess that most cache-attributes like that just stored the name_map of parameters, which seems to be handled automatically by the attribute.

Floris van Doorn (Nov 30 2021 at 14:03):

How hard would it be to use mathport to already port all files that are below all/most tactic files in the file hierarchy? Things like data.list.defs and data.string.defs? That would be useful for tactic writing.

Gabriel Ebner (Nov 30 2021 at 14:06):

Starting from the end: you can look at the ported files here: https://github.com/leanprover-community/mathlib3port/blob/master/Mathbin/Data/String/Defs.lean

But particularly strings are probably a bad thing to port directly. These should be rewritten.

Gabriel Ebner (Nov 30 2021 at 14:09):

Although I guess that most cache-attributes like that just stored the name_map of parameters, which seems to be handled automatically by the attribute.

There are various kinds of attributes that store (and compute) different kinds of state. I suggest reading through https://github.com/leanprover/lean4/blob/f01a124f187793cdf09788b0cc102eba156b1c60/src/Lean/Attributes.lean to get an idea of the available attribute types. You almost certainly shouldn't create a PersistentEnvExtension manually.

Gabriel Ebner (Nov 30 2021 at 14:10):

We also have another cache API here: https://github.com/leanprover-community/mathlib4/blob/8fe680846b7f4dd1b93688a6355c1c1a0cba7f06/Mathlib/Tactic/Cache.lean But I don't think this is applicable for simps/toAdditive.

Floris van Doorn (Nov 30 2021 at 14:15):

Thanks for the pointers!

Floris van Doorn (Nov 30 2021 at 14:44):

In Lean 3 there is has_attribute which takes the name of an attribute, which is convenient if your attribute has not defined yet (e.g. the implementation of to_additive checks whether other declarations are marked as to_additive). How to do this in Lean 4? I can only find ParametricAttribute.getParam and isAttribute which are somewhat similar.

Gabriel Ebner (Nov 30 2021 at 14:46):

I don't think this exists in general. But for to_additive, you could just look in the state of the attribute.

Floris van Doorn (Nov 30 2021 at 14:48):

I guess I could do that, but then I have to inline the definition src#to_additive.first_multiplicative_arg.

Gabriel Ebner (Nov 30 2021 at 14:51):

I mean you can still write a has_attribute' `to_additive function, it will just be specific to toAdditive (and won't work for other attributes).

Floris van Doorn (Nov 30 2021 at 14:57):

before I declare the attribute? I don't see how.

Floris van Doorn (Nov 30 2021 at 15:00):

Also, am I correct that I cannot tag a declaration with an attribute in the file where I define that attribute?

Gabriel Ebner (Nov 30 2021 at 15:48):

before I declare the attribute? I don't see how.

Ah yes, the easiest thing is probably to just pass the state around. You could also try to apply the @[init] attribute afterwards (not sure if that works):

constant myAttr : ParametricAttribute Foo

-- here you can already refer to myAttr

def myAttrInit : IO (ParametricAttribute Foo) :=
  registerParametricAttribute { ... }

attribute [init myAttrInit] myAttr

Gabriel Ebner (Nov 30 2021 at 15:49):

Also, am I correct that I cannot tag a declaration with an attribute in the file where I define that attribute?

Yes, depending on the attribute type you can also only tag declarations in the same file.

Floris van Doorn (Dec 01 2021 at 11:52):

Thanks, that is helpful!

Two more questions about attributes.

  • If I define an attribute like this
syntax (name := foo) "foo" num* : attr
initialize fooAttr : ParametricAttribute (List Nat) 
  registerParametricAttribute {
    name := `foo
    descr := "foo desc."
    getParam := λ decl stx =>
      match stx with
        | `(attr|foo $[$ns]*) => (ns.map (·.toNat)).toList
        | _ => throwError "unexpected foo syntax {stx}"
  }

I don't have to check that ns is a list of identifiers, because the parser will already complain if this is not the case, right?
Also, can I actually get to the throwError line? It seems like the parser will not even run this code if the syntax does not match the first case.

  • Suppose I have two syntax declarations
syntax (name := bar) "bar" num* : attr
syntax (name := bar!) "bar!" num* : attr

(this is in the Mathport.Syntax file for e.g. toAdditive). How do I make a single attribute that is triggered on both these parses? I tried something like

initialize barAttr : ParametricAttribute (List Nat) 
  registerParametricAttribute {
    name := `bar
    descr := "bar desc."
    getParam := λ decl stx =>
      match stx with
        | `(attr|bar $[$ns]*) => (ns.map (·.toNat)).toList
        | `(attr|bar! $[$ns]*) => (ns.map (·.toNat)).toList
        | _ => throwError "unexpected bar syntax {stx}"
  }

but that doesn't work for bar! (I get unknown attribute [bar!])

Gabriel Ebner (Dec 01 2021 at 12:22):

I don't have to check that ns is a list of identifiers, because the parser will already complain if this is not the case, right?

It's an array of numbers, but otherwise you're correct.

Gabriel Ebner (Dec 01 2021 at 12:24):

How do I make a single attribute that is triggered on both these parses?

I don't think you can. I would just either make two attributes or turn the parser into "bar" "!"?.

Mario Carneiro (Dec 01 2021 at 12:26):

does that actually work? I would have expected it to parse bar! as an ident anyway

Gabriel Ebner (Dec 01 2021 at 12:28):

You're right. Writing bar! doesn't work, but bar ! works, which is almost as good.

Mario Carneiro (Dec 01 2021 at 12:28):

For a lot of attributes, I think we want to separate the data storage from the syntax/parsing part. In this case we have two syntax-level attributes but only one data storage component

Mario Carneiro (Dec 01 2021 at 12:29):

I think the built in mechanisms like TagAttribute tie the two parts together, so we don't want to use that

Mario Carneiro (Dec 01 2021 at 12:29):

not sure about ParametricAttribute

Gabriel Ebner (Dec 01 2021 at 12:31):

It also has a storage part, but we can just ignore it (or use registerBuiltinAttribute directly).

Mario Carneiro (Dec 01 2021 at 12:34):

yeah, I think registerBuiltinAttribute is the way to go, most of the things done in the function registerParametricAttribute passes to it are not needed for a generic attribute definition

Mario Carneiro (Dec 01 2021 at 12:36):

In fact, I think we rarely want a simple PersistentEnvExtension that just tracks applications of the attribute. Usually the actual data store is somewhat derived from this

Jannis Limperg (Dec 01 2021 at 12:42):

I've defined a custom attribute like this for Aesop. Code: https://github.com/JLimperg/aesop/blob/55c0d71da597d540b51703afc13f488503ac2885/Aesop/Config.lean#L542

Mario Carneiro (Dec 01 2021 at 12:44):

What's the reason for holding on to the impl after passing it to registerBuiltinAttribute?

Floris van Doorn (Dec 01 2021 at 12:51):

Mario Carneiro said:

In fact, I think we rarely want a simple PersistentEnvExtension that just tracks applications of the attribute. Usually the actual data store is somewhat derived from this

One way we can do this with ParametricAttribute is to let the getParam function compute the data we want to store, right?

Mario Carneiro (Dec 01 2021 at 12:54):

That assumes that what we want to build is a NameMap A for some A, where the key is the name of the definition we are putting the attribute on. Sometimes we want that, other times the key is not this definition but one of the parameters of the attribute or something else altogether, or perhaps it's not a NameMap at all but rather a discrimination tree or other data structure

Floris van Doorn (Dec 01 2021 at 12:54):

Ok, that's fair.

Mario Carneiro (Dec 01 2021 at 12:55):

In this case, we also don't want to build two NameMaps, one for the toAdditive lemmas and one for toAdditive!

Floris van Doorn (Dec 01 2021 at 12:56):

So shall I make a variant of ParametricAttribute that

  • has a list/array of attribute names instead of 1 (e.g. toAdditive and toAdditive!; it will declare BuiltinAttributes for each of them
  • It has a custom type of what data it stores (generalizing NameMap α from ParametricAttribute). This data is "shared" among the different names of the attribute

Jannis Limperg (Dec 01 2021 at 12:56):

Mario Carneiro said:

What's the reason for holding on to the impl after passing it to registerBuiltinAttribute?

Not sure there is one now that I look at it.

Mario Carneiro (Dec 01 2021 at 12:57):

@Floris van Doorn Is PersistentEnvExtension sufficient, or too general for you?

Mario Carneiro (Dec 01 2021 at 12:57):

basically, I'm suggesting that you adapt the code of registerParametricAttribute instead of calling it directly

Floris van Doorn (Dec 01 2021 at 12:59):

I could work with that directly. But I expect that we will have multiple similar uses (for other attributes), and I thought a less general interface could be helpful.

Mario Carneiro (Dec 01 2021 at 12:59):

you are probably right, but it might be helpful to have 1 or 2 examples first

Floris van Doorn (Dec 01 2021 at 12:59):

sounds reasonable

Floris van Doorn (Dec 01 2021 at 13:55):

Am I correct that this function does not exist yet?

/-- Parse a string. -/
def toString! : Syntax  String
| node _ strLitKind stxs => stxs[0].getAtomVal!
| stx => panic!"String expected"

Mario Carneiro (Dec 01 2021 at 13:58):

I don't think that is correct, since the stored atom includes the quotation marks

Mario Carneiro (Dec 01 2021 at 13:59):

#eval show TermElabM _ from do toString! ( `("foo"))
-- "\"foo\""

Gabriel Ebner (Dec 01 2021 at 14:00):

There is Syntax.isStrLit?.

Floris van Doorn (Dec 01 2021 at 14:27):

@Gabriel Ebner in my PR you mentioned to use forallTelescope. Am I correct that I need to be in MetaM to do that? I believe that means I cannot execute this during AttributeImpl.add (which is in AttrM a.k.a. CoreM)

Gabriel Ebner (Dec 01 2021 at 14:28):

Yes, forallTelescope is in Meta. You can use MetaM.run' to call a MetaM function in CoreM.

Floris van Doorn (Dec 01 2021 at 14:30):

Thanks!

Mac (Dec 01 2021 at 16:25):

Floris van Doorn said:

  • Suppose I have two syntax declarations
syntax (name := bar) "bar" num* : attr
syntax (name := bar!) "bar!" num* : attr

(this is in the Mathport.Syntax file for e.g. toAdditive). How do I make a single attribute that is triggered on both these parses?

Does this not work?

syntax (name := bar) ("bar" <|> "bar!") num* : attr

Mac (Dec 01 2021 at 16:27):

Or, if you want to discriminate between them:

syntax bar := "bar" num*
syntax bar! :=  "bar!" num*
syntax (name := barAttr) (bar <|> bar!) : attr

Floris van Doorn (Dec 01 2021 at 16:33):

Oh, that does seem to work as well.

Floris van Doorn (Dec 01 2021 at 16:40):

Another question about auxiliary definitions (needed for toAdditive).
In Lean 3, a def or lemma foo could generate a couple of automatically generated declarations. On the top of my head we have the following, but there are probably more.

  • foo._proof_i and foo._match_i which occur in the value of foo.
  • foo.eqn_i for the equations generated by foo.

What is the status in Lean 4? I already see foo._cstagei, foo._sunfold, foo._unsafe_rec, foo.match_i. Are there methods of getting some/all auxiliary declarations for foo (something like docs#tactic.get_eqn_lemmas_for)?

Mac (Dec 01 2021 at 16:48):

I don't know enough about how to to_additive works in Lean 3 to know if this is reasonable, but is there any reason you just can't do the transformation on the ~syntax rather than the elaborated Expr`? That way you don't have to worry about the assorted of auxiliary definitions the elaborator may generate (which I don't think is actually bounded in Lean 4).

Floris van Doorn (Dec 01 2021 at 17:04):

Getting all the features of toAdditive on Syntax might be hard. For example, toAdditive checks for each occurrence of a declaration whether it is applied to Nat (or some other fixed type), and if so, don't change that occurrence. This information is usually implicit in the user input.

Sebastian Ullrich (Dec 01 2021 at 17:09):

@Floris van Doorn You could have the elaborator repeat some of the work without starting at Syntax, e.g. using Lean.Elab.addPreDefinitions. It takes a list of (mutually recursive) PreDefinitions and generates the final definition and these helper defs from them.

Mario Carneiro (Dec 01 2021 at 17:26):

@Mac I think using syntax (name := bar) ("bar" <|> "bar!") num* : attr leads to poor indexing of the syntax, since it doesn't start with a constant. If you have a lot of these then they all end up in a grab bag at the end

Mac (Dec 01 2021 at 17:28):

@Mario Carneiro yeah, hence my second example.

Mario Carneiro (Dec 01 2021 at 17:28):

that still has the same problem though, for barAttr

Mac (Dec 01 2021 at 17:28):

@Mario Carneiro how so?

Mario Carneiro (Dec 01 2021 at 17:29):

barAttr doesn't start with a constant, so it can't be indexed like one

Mac (Dec 01 2021 at 17:29):

the firstTokens system does account for nested parsers.

Mac (Dec 01 2021 at 17:29):

it doesn't require the start to literally be a token

Mac (Dec 01 2021 at 17:31):

The problem with ("bar" <|> "bar!") num* is not that it won't be indexed properly (it will still have as its firstTokens bar and bar! and be indexed accordingly). The problem is that you will no be easily able to distinguish between bar and bar! syntaxes (as they do not have separate node kinds).

Mac (Dec 01 2021 at 17:32):

At least, that is my understanding.

Mario Carneiro (Dec 01 2021 at 17:35):

That's not necessarily a problem; in this case we actually want them to have the same node kind since they are morally the same tactic/attr

Mario Carneiro (Dec 01 2021 at 17:36):

for example, that would allow us to have only one @[elab] for the corresponding tactic, or only one name for the attr

Mario Carneiro (Dec 01 2021 at 17:36):

(This is all working around the fact that "bar" "!"? doesn't work)

Yakov Pechersky (Dec 01 2021 at 17:38):

What's the block to using noWs between "bar" and "!"?

Sebastian Ullrich (Dec 01 2021 at 17:41):

I would put the keywords only in syntax defs:

syntax bar := "bar"
syntax bar! :=  "bar!"
syntax (name := barAttr) (bar <|> bar!) num* : attr

That way semantically different tactic invocations are structurally syntactically different and thus work with pattern matching

Mario Carneiro (Dec 01 2021 at 17:42):

@Yakov Pechersky The issue is that ! is an identifier char so bar! always gets lexed as an identifier

Yakov Pechersky (Dec 01 2021 at 17:44):

Maybe a different character then to differentiate simps from simps with reporting of output lemmas?

Mario Carneiro (Dec 01 2021 at 17:48):

this issue comes up a lot in mathlib tactics. It's a popular character for tactic modifiers and there aren't any good replacements

Mario Carneiro (Dec 01 2021 at 17:52):

I'd rather investigate why it got turned into an identifier character in the first place but I don't want to burn social capital in the process, so I'm sticking with workarounds for now

Johan Commelin (Dec 01 2021 at 17:54):

Innocent newbie question: does this also mean that we can't use n! notation for factorials?

Mario Carneiro (Dec 01 2021 at 17:54):

yes

Mario Carneiro (Dec 01 2021 at 17:54):

we'll need to pull out an inverted turkish dotted I or something

Mario Carneiro (Dec 01 2021 at 17:55):

or use n ! or (n)!

Sebastian Ullrich (Dec 01 2021 at 18:00):

Mario Carneiro said:

I'd rather investigate why it got turned into an identifier character in the first place but I don't want to burn social capital in the process, so I'm sticking with workarounds for now

The short answer: Array.get!/Array.get?/Array.get etc.. I'm afraid it's a very useful convention for programming.

Mario Carneiro (Dec 01 2021 at 18:04):

Those are good examples, but I would still be inclined toward a convention like getE/getO/getD/get for that, considering the tradeoff

Sebastian Ullrich (Dec 01 2021 at 18:07):

There definitely was some amount of regret when we figured out this would prevent syntax "foo" ident?. But it's also really hard to imagine giving it up at this point.

Johan Commelin (Dec 01 2021 at 18:11):

Is there no way to combine the two use cases using some clever macros or such?

Sebastian Ullrich (Dec 01 2021 at 18:17):

It would definitely be possible to e.g. turn off parsing ? as part of the ident in syntax, I'm relatively sure we will do that at some point. For bar! custom parsing would be a bit more problematic because of the mentioned indexing, i.e. the token is parsed before this specific parser is even run. And for factorials... you could write a custom ident elaborator that tries to resolve the name including the ! or else interpret it as a factorial application right now (edit: ah, hygiene might be a problem here though).

Mac (Dec 01 2021 at 22:27):

@Sebastian Ullrich Would it be possible to flip this idea on its head and remove !/? from the ident token and just make a composite ident plus punctuation syntax that expands to identifier with the symbol at the end? That is something like:

import Lean.Syntax

open Lean Syntax

syntax ident' := ident noWs ("!" <|> "?")

def expandIdent' (stx : Syntax) :=
  mkIdentFrom stx <| stx[0].getId.appendAfter stx[1].getAtomVal!

macro:max id:ident' : term => expandIdent' id

#check «foo»! -- unknown identifier 'foo!'

Mac (Dec 01 2021 at 22:31):

This would allow for the !/? to be used as identifiers in standard programming context (as is currently convenient) while making it easier to do something different in DSLs.

Sebastian Ullrich (Dec 02 2021 at 16:21):

@Mac Same issue as with the flipped idea: it will fail in macros because hygiene doesn't like ident manipulation

Mac (Dec 02 2021 at 16:51):

@Sebastian Ullrich isn't there modifyBase and other such methods to modify identifier names in a hygiene-preserving manner? Or I am misunderstanding how those work?

Mac (Dec 02 2021 at 16:52):

(deleted modifyBase example -- appendAfter apparently already uses modifyBase)

Floris van Doorn (Dec 09 2021 at 11:58):

Sebastian Ullrich said:

Floris van Doorn You could have the elaborator repeat some of the work without starting at Syntax, e.g. using Lean.Elab.addPreDefinitions. It takes a list of (mutually recursive) PreDefinitions and generates the final definition and these helper defs from them.

Thank you for this suggestion @Sebastian Ullrich! That is very useful.

So just checking that I understand this: If I add the @[toAdditive] attribute to a definition, theorem or opaque decl (is that constant?) then I'll use that Declaration to compute a new PreDefinition, and then use Lean.Elab.addPreDefinitions to add this declaration to the environment.

Some questions:

  • Do I just flatten out all the sub-definitions (like foo._cstagei) in the value, and then addPreDefinitions will add them in for the new declaration? Or should I do what we did in Lean 3: first additivize the sub-definitions and then finally process foo?
  • Is all the Modifiers information of a PreDefinition still obtainable from the environment when executing the attribute code? For example, can I check whether a declaration is computable?
  • Is the corresponding commands for inductives/structures Lean.Elab.Command.elabInductiveViews and Lean.Elab.Command.elabStructure?

Sebastian Ullrich (Dec 09 2021 at 12:51):

opaque decl (is that constant?)

Yes.

Do I just flatten out all the sub-definitions (like foo._cstagei) in the value

_cstage should not occur in value, but for recursive definitions going back to the PreDefinition.value might be a bit problematic now that I think about it. If it was possible to access the pre-definitions in the attribute hook, that might yield a more feasible solution, but it is not, currently.

Sebastian Ullrich (Dec 09 2021 at 12:59):

The other alternative I can see is to do the translation syntactically after all, but with type information taking from the info tree... which is not currently enabled outside of the server, but that might change soon

Floris van Doorn (Dec 09 2021 at 13:05):

I see. I indeed also see that _cstage does not occur in the value.
It would indeed be nice if the predefinition value is still obtainable somehow.
I could indeed also try to do the translation on the syntax: the main functionality still works without any type information/information about implicit arguments.

Sebastian Ullrich (Dec 09 2021 at 13:32):

Note that the _cstage are not even valid Lean terms but compiler IRs at various stages of type erasure... it might be possible to translate them as well, but personally I would try to avoid that

Floris van Doorn (Dec 09 2021 at 13:45):

I see. It seems like the main options are:
(1) From a declaration, make a predefinition and translate that (disadvantage: predefinition value might be different than the declaration value)
(2) From the syntax, build the additive syntax (disadvantage: no information about implicit arguments when translating)
(3) Wait until there is better support for either (1) or (2).

Is the problem with (1) only with recursive (or unsafe) definitions? If the predefinition value and the declaration value is the same for all safe nonrecursive definitions, then I think I'll go with (1). Almost all definitions involving toAdditive are not recursive.

Sebastian Ullrich (Dec 09 2021 at 14:07):

Yes, I believe so - https://github.com/leanprover/lean4/blob/265c18accd055c59e00afa837ab3469141b5ebd9/src/Lean/Elab/PreDefinition/Basic.lean#L99

Sebastian Ullrich (Dec 09 2021 at 14:30):

Mac said:

Sebastian Ullrich isn't there modifyBase and other such methods to modify identifier names in a hygiene-preserving manner? Or I am misunderstanding how those work?

References to global declarations are detected at macro declaration time (when the syntax quotation is processed), since hygiene says that the macro's definition site, not its use site should provide the scope. When the macro runs, the information is already lost.

Floris van Doorn (Dec 09 2021 at 15:28):

Am I correct that the Environment doesn't store whether a definition foo is computable? Is a semireliable way to check this by looking at whether foo._cstage1 exists?

Sebastian Ullrich (Dec 09 2021 at 16:03):

That sounds like a quite reliable method to me :)

Floris van Doorn (Dec 09 2021 at 16:27):

Am I correct that Lean 4 does not have variants of src#expr.replace and src#expr.fold yet? Is Lean.Expr.foldConsts a good implementation to mimic if I want to define similar functions on Expr (in such a way that they only visit each subexpression once)?

Gabriel Ebner (Dec 09 2021 at 16:28):

There is an Expr.replace function.

Floris van Doorn (Dec 09 2021 at 16:29):

Oh thanks! I missed that before.

Leonardo de Moura (Dec 09 2021 at 21:48):

@Floris van Doorn We also have https://github.com/leanprover/lean4/blob/master/src/Lean/Meta/Transform.lean


Last updated: Dec 20 2023 at 11:08 UTC