Zulip Chat Archive

Stream: lean4

Topic: tactic docstrings


Kevin Buzzard (Sep 28 2023 at 08:40):

A beginner hovers over the rewrite tactic and sees the fairly helpful

rewrite [e] applies identity e as a rewrite rule to the target of the main goal. If e is preceded by left arrow ( or <-), the rewrite is applied in the reverse direction. If e is a defined constant, then the equational theorems associated with e are used. This provides a convenient way to unfold e.

rewrite [e₁, ..., eₙ] applies the given rules sequentially.
rewrite [e] at l rewrites e at location(s) l, where l is either * or a list of hypotheses in the local context. In the latter case, a turnstile  or |- can also be used, to signify the target of the goal.
Using rw (config := {occs := .pos L}) [e], where L : List Nat, you can control which "occurrences" are rewritten. (This option applies to each rule, so usually this will only be used with a single rule.) Occurrences count from 1. At the first occurrence, whether allowed or not, arguments of the rewrite rule e may be instantiated, restricting which later rewrites can be found. {occs := .neg L} allows skipping specified occurrences.

(although if I were using this for teaching I would also have a more low-level explanation rather than launching straight into applying an identity as a rewrite rule, and I would definitely have examples).

Unfortunately, beginners will never hover over the rewrite tactic because nobody ever uses it. They will hover over the rw tactic and see

rw is like rewrite, but also tries to close the goal by "cheap" (reducible) rfl afterwards.

This doesn't even tell you what the tactic does, it just says "oh it's like some other tactic, whose docstring you can't see right now, and it's something to do with reducible, an idea which seems to be notoriously difficult to find documentation for".

For my teaching in Lean 3 I was really unhappy with the tactic documentation provided by the system. For a bunch of common tactics I wrote new explanations here

https://www.ma.imperial.ac.uk/~buzzard/xena/formalising-mathematics-2023/Part_C/Part_C.html

and the documentation for each tactic typically contained one or more of (a) an overview, written with no CS language, (b) examples (c) variants and bells/whistles, and sometimes (d) details. I will be updating everything to Lean 4 in December in preparation for me teaching the course in January. Similarly in NNG4 I make detailed jargon-free explanations of many basic tactics available just one click away.

My impression from looking at the docstring of the rw tactic is that the issue here (making the system explain itself well) is not something which has been concentrated on so far. In contrast, whenever anyone is unfortunate enough to hover over = (which is everywhere) one finds a huge explanation of introduction and elimination rules which fills up a chunk of the screen and which, I suspect, is not what one wants to know 99% of the time (e.g. I often just want to know the type of the two things which are equal, information which is often not available unless one carefully scrolls through the pop-up taking care not to move ones cursor out of the box because then you'll start scrolling through the proof you're looking at).

Clearly one cannot please all of the people all of the time here, and in fact it's probably even hard to please most of the people most of the time. But is there a vision going forwards for where, if anywhere, beginner-friendly and expert-helpful tactic documentation can live in the Lean ecosystem? The rewrite docstring has all this useful information about rewrite [t1, t2] at h \|- and an explanation of a config option allowing more targetted rewriting. But I cannot find a single usage of this tactic in mathlib so nobody ever sees it. Can we do better?

Mario Carneiro (Sep 28 2023 at 08:43):

rw is using a discouraged method of documentation, which is to use a cross reference instead of copy paste

Mario Carneiro (Sep 28 2023 at 08:44):

alias used to do this too but now it properly copies the original docs

Mario Carneiro (Sep 28 2023 at 08:44):

rw should just get its docs by inheritDoc, and there should be a blurb at the end of their common doc about how rewrite and rw differ

Eric Wieser (Sep 28 2023 at 08:45):

does inheritDoc allow adding a blurb?

Mario Carneiro (Sep 28 2023 at 08:45):

no

Mario Carneiro (Sep 28 2023 at 08:45):

so both rewrite and rw would have the blurb

Kevin Buzzard (Sep 28 2023 at 09:10):

The docstring for rewrite even mentions an example using rw so this confusion is already happening :-)

Kevin Buzzard (Sep 28 2023 at 09:13):

So are core PRs for this sort of thing welcome or do people not have time for them right now or do I have to open an RFC or what? Is there a way of adding documentation to a core tactic in mathlib? If so, would this idea be unhelpful in the long term?

Mario Carneiro (Sep 28 2023 at 11:01):

please open a PR in core

Kevin Buzzard (Sep 28 2023 at 15:29):

Are you speaking for the core devs? I don't want to waste their time.

Jireh Loreaux (Sep 28 2023 at 15:31):

Kevin, this is about system usability, and it's only documentation, so it's an easy PR to review. I don't think this will waste their time, regardless of whether Mario is explicitly speaking on their behalf or not.

Kevin Buzzard (Sep 28 2023 at 16:31):

One worry I have is that I'm going to have to figure out how to build core and I don't even have a copy of core on any of my computers (this is some indication of how totally new to core I am). Another issue is that I'd never heard of inheritDoc and I've just searched for it in mathlib and I can't find any occurrences. But if people are prepared to help me through then I can try this.

Henrik Böving (Sep 28 2023 at 16:40):

You wanna search for inherit_doc which gets used a ton in mathlib (it got renamed)

Building core itself is rather easy. THe docs are here: https://lean-lang.org/lean4/doc/make/index.html. iirc you are on Ubuntu right? For that you wanna do:

sudo apt-get install git libgmp-dev cmake ccache clang # gets the dependencies
git clone https://github.com/leanprover/lean4 --recurse-submodules # get the sources
cd lean4
mkdir -p build/release # create the directory to build in
cd build/release
cmake ../.. # auto generate the Makefiles from CMakeList.txt
make # run the make file to compile

If you wanna compile faster you can use make -j6 to use e.g. 6 cores. To run tests you do make test. If you want to recompile you just have to go into build/release again and re run make.

As to how to get your editor to cooperate with this custom lean that guide is right next to the above one: https://lean-lang.org/lean4/doc/dev/index.html#development-setup

@Kevin Buzzard

Joachim Breitner (Sep 28 2023 at 16:48):

Do we even need both rw and rewrite? (Having fewer tactics would be desirable). Kevin says no one ever uses rewrite? So why have it? If it's rarely needed, could it be an option of rw?

Kevin Buzzard (Sep 28 2023 at 16:53):

It's used in NNG for pedagogical purposes, but it was renamed to rw for time-saving purposes ;-)

Ruben Van de Velde (Sep 28 2023 at 17:12):

I do occasionally use rewrite, mostly when debugging a proof. Making it an option somehow would be fine too

Kyle Miller (Sep 28 2023 at 17:14):

@Joachim Breitner I've thought of rewrite as being an implementation detail for rw, since rw is a macro for rewrite followed by try (with_reducible rfl). It seems to be the easiest way to implement rw.

Maybe rewrite could be named to something else, like rw only or something.

Joachim Breitner (Sep 28 2023 at 17:31):

Is that something one usually needs? If not, I'd make that behavior a rw config option and ban rewrite from our collective mental model. (If there is a rewriteCore tactic for implementation reasons that's fine, but it should be documented as such, not a “normal” user facing tactic). WDYT?

Kyle Miller (Sep 28 2023 at 17:48):

I don't think I've ever needed rewrite, but I think the easiest improvement would be to rename rewrite to something that distinguishes it better from rw.

Renaming rewrite to rewriteCore might require successfully making the case that no one ever should use rewrite. Something like rw only suggests that rw is the normal one, sidestepping the issue.

Adding a rw configuration option is a bit tricky since you either add it to the main rewrite configuration structure (in which case it might be worth just merging rw and rewrite into a single tactic), or you make a rw-specific configuration structure and make the macro a bit more complex since you need to transform it to be the rewrite one. It's not hard doing these things, but takes some nonzero energy to evaluate the options.

Joachim Breitner (Sep 28 2023 at 18:24):

Ah, but for rw only that wouldn't be needed? I didn't immediately see that advantage.

The main point of my suggestion is less the renaming (although it helps), but to move the docstring to rw, and give rewrite a docstring saying “internal tactic of rw”. Maybe also explaining the difference in behavior (useful for advanced users), and check documentation to introduce rw only. Ideally new users never lean about rewrite, if that's realistic.

Richard Copley (Sep 28 2023 at 20:50):

This is almost as weird as the idea of changing theorem to lemma everywhere!

Everyone should fix their code and documentation to use rw when the rfl is needed, and rewrite otherwise. It will run faster.

Mario Carneiro (Sep 28 2023 at 21:01):

[citation needed]

Joachim Breitner (Sep 28 2023 at 21:04):

Not sure if Buster’s suggestion is sarcastic, or not. (Both seem plausible to me.)

Kyle Miller (Sep 28 2023 at 21:06):

rfl at reducible transparency is very weak. I wouldn't expect it to take more than a very small fraction of the overhead of rewrite itself.

I'm curious though if modifying mathlib to use rewrite in every place that doesn't use rfl would make a measurable impact.

Jireh Loreaux (Sep 28 2023 at 21:08):

It would really be a shame if we had to start worrying about such inexpensive parts of tactics in order to get better performance.

Ruben Van de Velde (Sep 28 2023 at 21:08):

Joachim Breitner said:

Not sure if Buster’s suggestion is sarcastic, or not. (Both seem plausible to me.)

I think that's Poe's law

Richard Copley (Sep 28 2023 at 21:17):

Not at all. The difference is significant. Here I have a little project with 60 rw's and 296 rewrites. Time to rebuild after deleting build artifacts (10 trials):

12.011s
11.886s
12.024s
12.040s
11.962s
11.989s
11.958s
11.955s
11.968s
11.955s

After replacing all the rewrite's with rw's:

15.333s
12.549s
12.631s
12.574s
12.997s
12.549s
12.551s
12.654s
12.673s
12.697s

Should I do a p-test?

Richard Copley (Sep 28 2023 at 21:18):

There are 50697 rw's in mathlib, give or take. I don't know how many are in terminal position.

Joachim Breitner (Sep 28 2023 at 21:21):

Hmm, good point, I wasn’t expecting that.

Jireh Loreaux (Sep 28 2023 at 21:31):

Just a quick point of reference: the average difference between your runs (including the big outlier in the second set) was : 0.9465s, and that's for 296 new rws. If we expect the same speed-up across all of mathlib (possibly not a fair assumption), we would be looking at 0.9465 * 50607 / 296 = 162.11s difference in total build time (not wall time).

Richard Copley (Sep 28 2023 at 21:39):

A little less, since some rw's will remain. Perhaps 0.9465s * 50697 / (296 + 60)=134.78s. Not large, but I still don't think it should be forced on people. These little inefficiencies add up.

Mario Carneiro (Sep 28 2023 at 21:40):

what percentage of total build time is that?

Richard Copley (Sep 28 2023 at 21:40):

90s / 3 hours? Not large

Mario Carneiro (Sep 28 2023 at 21:41):

I mean, is it noise-level

Mario Carneiro (Sep 28 2023 at 21:41):

it may not even be statistically significant

Richard Copley (Sep 28 2023 at 21:41):

It is both statistically significant and noise-level.

Mario Carneiro (Sep 28 2023 at 21:43):

it's probably nonzero, but I suspect the macro expansion costs more than the rfl. We would need my (still fictional) auto-rewriter to do the test for real (detecting all the rw that can be changed to rewrite without breaking the proof)

Joachim Breitner (Sep 28 2023 at 21:49):

So if the rfl check was part of the actual tactics code, it might be statistically significantly faster than the current implementation with macros?

Kyle Miller (Sep 28 2023 at 21:52):

Maybe we should distinguish between "practically significant" and "statistically significant". It's likely statistically significant (in that one can tell after enough runs whether it was the macro or non-macro one), but the question is whether it's significantly faster.

Richard Copley (Sep 28 2023 at 21:57):

If speed is important, it's a (small) win. If there's something else that's more important (I can't see it ... is it consistency? saved typing? tradition?) then fine, Mathlib should make its decision. But speed is sometimes important. I think it would be a shame to make this change in Lean.

Damiano Testa (Sep 29 2023 at 05:36):

Going back to the doc-string issue, there is also the possibility of improving old doc-strings. This is a little hacky, but seems to give better docs to rw:

import Mathlib.Data.Nat.Basic

namespace Mathlib.Tactic.ExtendDocs

/-- `extend_docs <declName> before <prefix_string> after <suffix_string>` extends the
docs of `<declName>` by adding `<prefix_string>` before and `<suffix_string>` after. -/
syntax "extend_docs" ident (colGt "before" str)? (colGt "after" str)? : command

open Lean in
elab_rules : command
  | `(command| extend_docs $na:ident $[before $bef:str]? $[after $aft:str]?) => do
    if bef.isNone && aft.isNone then throwError "expected at least one of 'before' or 'after'"
    let declName  Elab.resolveGlobalConstNoOverloadWithInfo na
    let bef := if bef.isNone then "" else (bef.get!).getString ++ "\n\n"
    let aft := if aft.isNone then "" else "\n\n" ++ (aft.get!).getString
    let oldDoc := ( findDocString? ( getEnv) declName).getD ""
    addDocString declName <| bef ++ oldDoc ++ aft
end Mathlib.Tactic.ExtendDocs

example : 0 = 0 := by
  rw [ Nat.add_zero 0]
-- hover over `rw` shows the old doc-string, since nothing has changed yet:
/-
`rw` is like `rewrite`, but also tries to close the goal by "cheap" (reducible) `rfl` afterwards.
-/

/- redefine `rw`, since I cannot find a way to replace the doc-string of a declaration that
has been defined in an earlier file.
The "new" `rw` inherits the docs of `rewrite`. -/
open Lean.Parser.Tactic Lean in
@[inherit_doc rewriteSeq]
macro (name := my_rw) "rw" c:(config)? s:rwRuleSeq l:(location)? : tactic =>
  `(tactic| (rw $(c)? $s $(l)?))

-- now we add to the docs of the "new" `rw` the prefix that we want.
extend_docs my_rw
  before "`rw` is like `rewrite` (see below), but also tries to close the goal by \"cheap\"
    (reducible) `rfl` afterwards."

example : 0 = 0 := by
  rw [ Nat.add_zero 0]
-- hover over `rw` now shows
/-
`rw` is like `rewrite` (see below), but also tries to close the goal by "cheap" (reducible) `rfl` afterwards.

[rewrite doc-string]
-/

Damiano Testa (Sep 30 2023 at 18:54):

Eric Wieser said:

does inheritDoc allow adding a blurb?

Since I have also sometimes wanted to use inherit_doc and add some text to the added docs, I opened #7446.

Moreover, I also try out the change in the docs to rw. While I think that those docs can be improved, the solution in the PR does not convince me completely. However, I still think that extend_doc is something that is potentially useful and I took rw as a Mathlib-wide test.

Honestly, if it were not a draft PR, I would probably only add the command and would not use it for rw, but since it is a draft... you can explore how Mathlib would look like if the doc-string to rw actually talked about rw and not rewrite!

Mario Carneiro (Sep 30 2023 at 21:32):

why not add this mechanism directly to inherit_doc?

Damiano Testa (Sep 30 2023 at 21:37):

I had thought of it, but then I saw that inherit_doc was not in mathlib and I did not know how to produce the PR...

Thomas Murrills (Sep 30 2023 at 23:32):

I really like the idea of composable docstrings and have several times wanted something like this myself! :) If this is making its way to a core PR, though, I wonder if it would be better to simply be able to interpolate docstrings in general instead of extending inherit_doc. That is, to be able to write something like

/-- `foo` is 0. -/
def foo := 0

/--
`bar` is one greater than `foo`.

doc!{foo}

Note that this causes `bar` to be unequal to `foo`.
-/
def bar := foo + 1

up to syntax.

Maybe it could also (have the option to?) automatically indicate that the inserted text is foo's docstring somehow, either by an automatically inserted heading for foo or a summary element (not sure if those are supported in hovers, though). Not sure if there are any obstacles that would prevent this kind of feature, or if there are reasons (speed?) to just inherit docstrings at the beginning and/or end.

Mario Carneiro (Sep 30 2023 at 23:46):

speed is usually not the issue here, the major problem is increased code complexity

Thomas Murrills (Oct 01 2023 at 00:08):

Gotcha...maybe it would make more sense to make this a separate thing somehow, then—if at all. (E.g. demand an attribute @[inherit_docs] or something which just processes and replaces the existing docstring, interpolating occurrences like this appropriately.)

My guess would be that if you wanted to do this "right", you'd want to change the docComment parser to add syntax nodes encoding the literal parts of the comment vs. the interpolated parts, and I can see how that could be a big change...I took a look to see how it currently works, and I see that everything is very simple right now! :)

Mario Carneiro (Oct 01 2023 at 00:12):

To implement this, I would probably keep the parser the same, but add an extra function for "evaluating" a doc template string into an actual docstring and call it somewhere between extracting the docstring from the syntax and calling addDocString with the result

Mario Carneiro (Oct 01 2023 at 00:13):

I don't think there is a need to have the parser in on it as long as you can't write expressions in doc template arguments, only labels

Mario Carneiro (Oct 01 2023 at 00:15):

on the other hand, you might want to have the doc template string parse available to the server so that it can use it for semantic highlighting (to make it more obvious that it is special and enable go-to-def on the label)

Thomas Murrills (Oct 01 2023 at 00:24):

Right! (Also likewise I figured you could have better error reporting when an identifier couldn't be found in the environment) So—how would you hand the server one parse while using another? (Is that straightforward?) Or are you saying that that's an argument for changing the parser?

Mario Carneiro (Oct 01 2023 at 00:42):

the server could call the doc string parser again

Mario Carneiro (Oct 01 2023 at 00:43):

you could also have a parser combinator that takes the result of the docstring parser and parses templates in it, if we want the Syntax itself to represent this structure

Damiano Testa (Oct 01 2023 at 04:33):

Before trying to interpolate doc-strings directly, I am trying to make the addition of a suffix and a prefix, similar to the PR.
Here is what I have:

import Lean.Elab.InfoTree.Main
import Lean.DocString

namespace Mathlib.Tactic.ExtendDoc

syntax (name := extend_doc) "extend_doc" (ident)? (colGt "before" str)? (colGt "after" str)? : attr

-- interpolated `inherit_doc` attribute and `extend_doc` PR
open Lean in
initialize registerBuiltinAttribute {
    name := `extend_doc
    descr := "inherit documentation from a specified declaration"
    add   := fun decl stx kind => do
      unless kind == AttributeKind.global do
        throwError "invalid `[extend_doc]` attribute, must be global"
      match stx with
      | `(attr| extend_doc $[$id?:ident]? $[before $bef:str]? $[after $aft:str]?) => withRef stx[0] do
        let some id := id?
          | throwError "invalid `[extend_doc]` attribute, could not infer doc source"
        let declName  Elab.resolveGlobalConstNoOverloadWithInfo id
        if ( findDocString? ( getEnv) decl).isSome then
          logWarning m!"{← mkConstWithLevelParams decl} already has a doc string"
        let some doc  findDocString? ( getEnv) declName
          | logWarningAt id m!"{← mkConstWithLevelParams declName} does not have a doc string"
        let bef := if bef.isNone then "" else (bef.get!).getString ++ "\n\n"
        let aft := if aft.isNone then "" else "\n\n" ++ (aft.get!).getString
        addDocString decl (bef ++ doc ++ aft)
      | _  => throwError "invalid `[extend_doc]` attribute"
  }

If I import in a separate file the code above, this is what I get (as expected):

import Mathlib.Tactic.extend_doc  -- where the attribute is defined

/-- `foo` is `1` -/
def foo :  := 1

@[extend_doc foo before "`alsoFoo` looks like `foo`:" after "... except that it add one to it."]
def alsoFoo :  := foo + 1

#check alsoFoo
-- hover over `alsoFoo`, as expected you see
/-
`alsoFoo` looks like `foo`:

`foo` is 1

... except that it add one to it.
-/

However, with notation, everything breaks down:

@[extend_doc]  -- invalid `[extend_doc]` attribute, could not infer doc source
notation "ex_foo" => foo
--  invalid `[extend_doc]` attribute, could not infer doc source
-- while `inherit_doc` does work

@[extend_doc foo before "prefix to `foo`" after "suffix to `foo`"]
notation "newFoo" => foo

#check newFoo
--  hover shows: `foo` is 1

Mario Carneiro (Oct 01 2023 at 04:59):

If you do this for real, those errors will go away

Mario Carneiro (Oct 01 2023 at 04:59):

various other parts of the system look for inherit_doc annotations and modify them before they execute

Damiano Testa (Oct 01 2023 at 05:01):

Ok, thanks!

Does this seem like a good suggestion, then? Doing it for real means actually modifying core?

Mario Carneiro (Oct 01 2023 at 05:01):

yes

Damiano Testa (Oct 01 2023 at 05:01):

So, should I open an issue? I have never done anything like this...

Mario Carneiro (Oct 01 2023 at 05:01):

RFC first, I think

Damiano Testa (Oct 01 2023 at 05:01):

ok, I'll look up what this actually means!

Damiano Testa (Oct 01 2023 at 05:24):

Here it is: Issue #2609!

It is the first time that I post an issue, so let me know if anything looks strange! :smile:


Last updated: Dec 20 2023 at 11:08 UTC