Zulip Chat Archive
Stream: new members
Topic: tactic preference
Damiano Testa (Oct 18 2020 at 05:56):
Dear All,
in the process of cleaning up proofs, I find myself having to make choices between using different tactics with the same final effect. In practice, I often end up with the following scheme:
simp
becomes rw
(if I am lucky, via squeeze_simp
)
apply [...], exact [...]
becomes exact [......]
rw [... \iff ...]
becomes apply [(...).mp(r)...]
(when reasonable/possible)
have
becomes a previous lemma
I realize that tactics are not just not linearly ordered, they are probably not even a poset! Still, a proof commits you to some choices and I would like to know that I am at least going in a reasonable direction. I would like to know whether you find the scheme above reasonable. What other substitutions would you make? For instance, I rarely use subst [...]
since the next tactic is almost invariably rw
and then I merge them in a longer rw
. Which is normally preferred: subst [...], rw [...]
or rw [......]
?
Sorry for the vague question: I hope that someone can still give me some pointers!
Mario Carneiro (Oct 18 2020 at 05:57):
I try to merge subst
into the nearest rcases
Damiano Testa (Oct 18 2020 at 05:58):
Mario Carneiro said:
I try to merge
subst
into the nearestrcases
Ah, I use very rarely rcases
! The fact that you mention it, makes me understand that I should use it more! Thanks!
Mario Carneiro (Oct 18 2020 at 05:59):
You can go really far with rcases
and refine
Mario Carneiro (Oct 18 2020 at 05:59):
it's basically term mode proof in tactic clothing
Damiano Testa (Oct 18 2020 at 05:59):
refine
I *try * a lot, but end up being able to apply it only very rarely.
Mario Carneiro (Oct 18 2020 at 06:00):
refine can be used in place of almost any tactic, although the equivalent may not be pretty depending on the tactic it replaces
Damiano Testa (Oct 18 2020 at 06:00):
so would a good measure be the length of the output of show_term
to inform the choices above?
Mario Carneiro (Oct 18 2020 at 06:01):
it's good for replacing split
, existsi
/use
and intro
in some combination
Damiano Testa (Oct 18 2020 at 06:02):
ah, I did not know that split
s could be avoided by refine
: this will keep me busy! Ahaha
Mario Carneiro (Oct 18 2020 at 06:02):
yes, although you should be aware that some tactics do work that doesn't show up in the term, for example simp
Mario Carneiro (Oct 18 2020 at 06:02):
split
is just refine <_, _>
Mario Carneiro (Oct 18 2020 at 06:02):
and you can stick a term in there if one of the sides is easy
Damiano Testa (Oct 18 2020 at 06:02):
I never leave a simp
and almost never leave even a simp only
Damiano Testa (Oct 18 2020 at 06:03):
Mario Carneiro said:
split
is justrefine <_, _>
I did not know! I will stop using split
, then!
Mario Carneiro (Oct 18 2020 at 06:04):
I will also point out that this doesn't necessarily match everyone's preference; this is good for making low compile time proofs but may go against one's aesthetics
Mario Carneiro (Oct 18 2020 at 06:05):
existsi a
is just refine <a, _>
Damiano Testa (Oct 18 2020 at 06:05):
Yes, I understand that. Although, to be honest, proof irrelevance has the effect in my mind of replacing "beauty" with "efficiency" in a proof.
Mario Carneiro (Oct 18 2020 at 06:06):
a kindred spirit :grinning_face_with_smiling_eyes:
Damiano Testa (Oct 18 2020 at 06:06):
Mario Carneiro said:
existsi a
is justrefine <a, _>
I did not even know about the tactic existsi
.
Mario Carneiro (Oct 18 2020 at 06:06):
use
is basically the same as existsi
Damiano Testa (Oct 18 2020 at 06:07):
Ah, good to know! I work aournd statements so as to avoid needing use
whenever I can. I do not know why...
Mario Carneiro (Oct 18 2020 at 06:08):
It does a bit of postprocessing with simp
to make it more user friendly I think
Mario Carneiro (Oct 18 2020 at 06:09):
check out this beautiful proof:
theorem split_bits.determ_l {n₁ n₂ l} (h₁ : split_bits n₁ l) (h₂ : split_bits n₂ l) : n₁ = n₂ :=
begin
induction l generalizing n₁ n₂, {cases h₁, cases h₂, refl},
rcases l_hd with ⟨_, l', rfl⟩,
induction l' generalizing n₁ n₂,
{ cases h₁, cases h₂, exact l_ih h₁_a h₂_a },
{ have : ∀ {n l'},
split_bits n l' →
l' = ⟨_, l'_hd :: l'_tl, rfl⟩ :: l_tl →
l'_hd = nat.bodd n ∧
split_bits (nat.div2 n) (⟨_, l'_tl, rfl⟩ :: l_tl),
{ intros, cases a; try {cases a_1},
rcases a_bs with ⟨l₂, rfl⟩,
injection a_1, cases h_2,
cases congr_arg (λ v : Σ n, bitvec n, v.2.1) h_1,
exact ⟨rfl, a_a⟩ },
rcases this h₁ rfl with ⟨rfl, h₁'⟩,
rcases this h₂ rfl with ⟨e, h₂'⟩,
rw [← nat.bit_decomp n₁, e, l'_ih h₁' h₂', nat.bit_decomp] }
end
100% old school tactics
Damiano Testa (Oct 18 2020 at 06:09):
I now need to resist the urge to go back to all the code that I have written and preform further substitutions... surely, alongside proof-checkers, there are "proof-spas", beautifying your code, no?
Mario Carneiro (Oct 18 2020 at 06:10):
it's big business in metamath land, but lean doesn't make it so easy to optimize proofs in any large scale way
Damiano Testa (Oct 18 2020 at 06:11):
Why do you call the proof above "old school tactics"? What are examples "new school tactics"? I feel that I am a beginner learning the old-school...
Heather Macbeth (Oct 18 2020 at 06:12):
Is this something AI can do for us someday? Take the corpus of all the lemmas and definitions in mathlib, and find its own proofs, sometimes 3 lines shorter than the existing proofs?
Damiano Testa (Oct 18 2020 at 06:13):
Heather Macbeth said:
Is this something AI can do for us someday? Take the corpus of all the lemmas and definitions in mathlib, and find its own proofs, sometimes 3 lines shorter than the existing proofs?
In my proofs, I think I would be happy if it only shortened them by 3 lines! Even my one-line-proofs are long...
Mario Carneiro (Oct 18 2020 at 06:14):
It's something that can absolutely be done in principle (that's basically what GPT-f did for metamath), but the limiting factor is ease of access of the data to automated tools, which is crazy considering that it's a formal theorem prover
Heather Macbeth (Oct 18 2020 at 06:16):
Mario Carneiro said:
that's basically what GPT-f did for metamath
I don't know about this, can you suggest something for further reading?
Mario Carneiro (Oct 18 2020 at 06:17):
In fact, for a long time the main metamath tool has had a proof processing tool called "minimize" that is basically a peephole optimizer: it looks at sequences of 2 or 3 steps in your proof and tries to find ways to do the same thing in 1 or 2 steps utilizing the whole library. It's computationally expensive, so it is run once a year or so on the entire library, but it can also be run in the background while you are working on your new proof
Mario Carneiro (Oct 18 2020 at 06:18):
There is some discussion about GPT-f on the #Machine Learning for Theorem Proving stream
Mario Carneiro (Oct 18 2020 at 06:20):
The upshot is that most proofs in set.mm are "locally optimal": they don't necessarily take the globally shortest path but there are no local missteps so everything proceeds straightforwardly
Mario Carneiro (Oct 18 2020 at 06:21):
a simple analogue for lean would be deleting have
statements that are never used
Mario Carneiro (Oct 18 2020 at 06:22):
But none of that can happen in lean 3 because no one can parse lean except lean
Damiano Testa (Oct 18 2020 at 06:38):
@Mario Carneiro Thank you very much for your input! I have already shortened a couple of proofs using some of your ideas!
Julian Berman (Oct 18 2020 at 13:29):
What causes the parsing difficulties in Lean 3 (and what's changing)?
Mario Carneiro (Oct 18 2020 at 13:29):
Well, lean 4 will be even harder to parse
Mario Carneiro (Oct 18 2020 at 13:30):
the reason lean 3 is hard to parse is because you can run arbitrary code while parsing
Julian Berman (Oct 18 2020 at 13:30):
Ah, of course.
Mario Carneiro (Oct 18 2020 at 13:31):
which means that parsing correctly implies also running the VM correctly, having a typechecker and tactic framework and every other observable feature of lean
Kenny Lau (Oct 18 2020 at 13:36):
how do you run arbitrary code while parsing?
Julian Berman (Oct 18 2020 at 13:40):
notation I guess?
Julian Berman (Oct 18 2020 at 13:41):
Or are you saying how would you implement that? I mean I guess if you're sufficiently motivated you start with running a Lean process externally or embedding it in the parser, parse everything you can statically and then communicate with it for the bits you can't parse yourself or something
Mario Carneiro (Oct 18 2020 at 13:42):
Anything using the parser
monad
Mario Carneiro (Oct 18 2020 at 13:42):
tactic argument parsing and user commands
Mario Carneiro (Oct 18 2020 at 13:46):
meta def tactic.interactive.even (e : interactive.parse (do
n ← lean.parser.small_nat,
guard (2 ∣ n))) := tactic.skip
example : true := by even 1234 -- ok
example : true := by even 1233 -- parse error
Mario Carneiro (Oct 18 2020 at 13:47):
I'm saying that it is impossible to parse lean without also doing everything else lean does. It is not a separate pass
Kevin Lacker (Oct 19 2020 at 17:21):
can you get Lean to output a representation of the abstract syntax tree of a file? like in python, it's also a pain to make a third-party parser for Python code, but you don't have to do it because Python itself can use its parser and then you can walk the AST to encode it into json or something
Reid Barton (Oct 19 2020 at 17:23):
I think there never really is an abstract syntax tree, or at least, not of the whole file at the same time
Reid Barton (Oct 19 2020 at 17:24):
which isn't to say you couldn't retrofit such a concept onto Lean, but it sounds like a lot of work
Reid Barton (Oct 19 2020 at 17:25):
I looked into how the "return information for the identifier at a specific location" feature works and it just reparses the whole file up to that location
Kevin Lacker (Oct 19 2020 at 17:28):
weird
Reid Barton (Oct 19 2020 at 17:28):
In Lean 4 the parser is written in Lean and will be user-extensible in ways which hopefully include being able to capture the parsed AST for whatever nefarious purposes one might have
Reid Barton (Oct 19 2020 at 17:31):
and there's actually a Syntax
Lean type which is the AST, rather than, when the parser encounters the inductive
keyword, it just calling the handler for inductive
commands, or however it works now
Mario Carneiro (Oct 19 2020 at 17:35):
One way to implement AST retrieval would be to have an AST structure in the parser state, and then just log all parser function calls in it, i.e. whenever you call p.inductive()
the first thing it does is call ast.inductive()
which pushes an inductive AST node on the stack. You end up with a preorder traversal of the AST as an array and stream that out
Kevin Lacker (Oct 19 2020 at 17:36):
i guess even the ast is kind of weird if you're thinking of external proof minimizers or similar
Kevin Lacker (Oct 19 2020 at 17:37):
what you really want is a reduced language... like theoretically, would it be possible to write every proof in lean as a sequence of have ___, from <existing lemma> _ _ _
type lines?
Kevin Lacker (Oct 19 2020 at 17:37):
because otherwise the minimizer has to understand norm_num
, solve_by_elim
, all these complex things
Mario Carneiro (Oct 19 2020 at 17:37):
That kind of thing is in the olean files and the export format, and there are already external parsers for that
Mario Carneiro (Oct 19 2020 at 17:38):
the external typecheckers work on the data from the export format, which you can get by running lean --export
Mario Carneiro (Oct 19 2020 at 17:39):
but what we don't have access to is stuff like the list of tactics used in a file
Reid Barton (Oct 19 2020 at 17:39):
But this might somehow be "too reduced"
Kevin Lacker (Oct 19 2020 at 17:40):
it's too reduced to minimize a proof - it isn't obvious how minimal a lean proof is, from its compiled form. it might be useful for something though...
Mario Carneiro (Oct 19 2020 at 17:41):
the olean files are a bit more feature rich in that they contain notations and annotations, in addition to raw logical proof data (which is all that is in the lean export files), but they don't have anything about begin end
blocks
Kevin Lacker (Oct 19 2020 at 17:41):
is lean 4 going to maintain these formats
Mario Carneiro (Oct 19 2020 at 17:41):
no, olean was never meant to be a stable format
Mario Carneiro (Oct 19 2020 at 17:42):
and even the core logic has changed so the export format would have to be different
Mario Carneiro (Oct 19 2020 at 17:42):
but I think in lean 4 there won't be an export format (but there will be enough tools to roll your own)
Kevin Lacker (Oct 19 2020 at 17:42):
i wonder how painful transitioning from 3 -> 4 is going to be....
Mario Carneiro (Oct 19 2020 at 17:42):
quite
Mario Carneiro (Oct 19 2020 at 17:44):
I'm told that lean 4 is at least compatible enough to read lean 3 olean files and get "raw mathlib"
Mario Carneiro (Oct 19 2020 at 17:44):
or maybe that was the export format? @Gabriel Ebner knows
Reid Barton (Oct 19 2020 at 17:44):
olean or export?
Reid Barton (Oct 19 2020 at 17:44):
I think it was export
Mario Carneiro (Oct 19 2020 at 17:45):
that would make more sense. No one wants to parse the rats nest that is olean files
Reid Barton (Oct 19 2020 at 17:45):
says the author of olean-rs
Reid Barton (Oct 19 2020 at 17:45):
by the way, last time I checked it was out of date :upside_down:
Mario Carneiro (Oct 19 2020 at 17:46):
I should do something about that
Gabriel Ebner (Oct 19 2020 at 17:46):
There is an importer for the text output format here: https://github.com/gebner/lean4-mathlib-import
Gabriel Ebner (Oct 19 2020 at 17:47):
Import mathlib into Lean 4
This tool takes the text export of mathlib and creates an olean file that you can import with Lean 4.
Notation, etc., is not supported. All theorems are imported as axioms. Some names are adapted to the new convention, some aren't. This code is not intended for production.
You can build the Mathlib.olean
as follows (make sure you have enough RAM):
make
make emacs # install the lean4 emacs package yourself!
(If you're on NixOS, you might need to run everything inside nix-shell ~/lean4
.)
After that, you can import mathlib and play with it in Lean 4:
import Import.Mathlib
#check (deriv : (real → real) → (real → real))
Reid Barton (Oct 19 2020 at 17:47):
Relevant Zulip thread https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/Importing.20mathlib.20into.20Lean.204/near/187561178
Mario Carneiro (Oct 19 2020 at 17:47):
oh, you write olean files?
Kevin Lacker (Oct 19 2020 at 17:47):
so let's say I want to parse, or quasi-parse mathlib. the theory is that you could make a faster library_search if you did some preprocessing and wrote things in a faster language. I just want the names, namespaces, and signatures of the lemmas/theorems, though. enough to generate some code that refers to them. I don't need to verify the proofs. is that possible from the export format?
Mario Carneiro (Oct 19 2020 at 17:47):
so I suppose you know what lean 4 olean looks like
Gabriel Ebner (Oct 19 2020 at 17:48):
I'm using the Lean 4 API for olean files.
Mario Carneiro (Oct 19 2020 at 17:49):
You can fairly easily write an introspection tactic from lean to gather that information. That's what leandoc is
Reid Barton (Oct 19 2020 at 17:49):
You can also reconstruct this information from the export format.
Kevin Lacker (Oct 19 2020 at 17:50):
ok. so it's not the full AST but it's like the first level of the AST
Mario Carneiro (Oct 19 2020 at 17:50):
more like the last level
Mario Carneiro (Oct 19 2020 at 17:50):
it's the fully digested form
Kevin Lacker (Oct 19 2020 at 17:51):
well, if you think of the AST as a tree where the root is like a namespace, and the first level of the tree is the children of that namespace, and then the next levels are how those children start to be defined, etc
Mario Carneiro (Oct 19 2020 at 17:51):
you don't even have information about implicit args though, so generating a parseable expr from that info is tough
Mario Carneiro (Oct 19 2020 at 17:51):
lean isn't really structured like that though
Mario Carneiro (Oct 19 2020 at 17:52):
you can put stuff in namespaces in an arbitrary order
Mario Carneiro (Oct 19 2020 at 17:52):
from any file
Mario Carneiro (Oct 19 2020 at 17:53):
I mean, the file structure is like that with namespaces and sections, but none of that structure is visible in the digested theorems
Kevin Lacker (Oct 19 2020 at 17:53):
makes sense. well i'll check out what leandoc does
Gabriel Ebner (Oct 19 2020 at 17:54):
Implicit arguments are there in the text output format. Even infix notation.
Gabriel Ebner (Oct 19 2020 at 17:54):
But yes, please write your own exporter in lean like doc-gen.
Mario Carneiro (Oct 19 2020 at 17:54):
That's another thing that you could only get at with a proper AST - we currently have no idea when namespaces and sections start and end, except when you are inside them
Mario Carneiro (Oct 19 2020 at 17:55):
and getting the current namespace is only recently available too. I think we still can't get at the list of variables in scope
Mario Carneiro (Oct 19 2020 at 17:56):
so my dream of writing a def
-equivalent tactic is as yet unfulfilled
Last updated: Dec 20 2023 at 11:08 UTC