Zulip Chat Archive

Stream: lean4

Topic: Automation


Rosie Baish (Jun 01 2022 at 15:29):

Does Lean4 have any automation features?
I've previously used Agda so I'm thinking of things like:

  1. the auto option which attempts to fill in the proof for you
  2. The refine option which takes part of a proof and creates the right combination of holes for you
  3. The auto-split features which does a case split on a variable or variables and creates all the cases with holes in the right places

I found these really useful in Agda, both to reduce the need for tons of typing, and as a kind of built-in tutorial that could teach me things I didn't know.

I'd love to hear about any other features that lean has which result in any additional code/proof being written by Lean, or any projects thinking about these things

Reid Barton (Jun 01 2022 at 15:46):

We use tactics for this (e.g. simp, refine/apply, cases)

Henrik Böving (Jun 01 2022 at 16:21):

As a side node, I am a little surprised that of all things an agda user would be asking for automation, I was under the impression that agda was especially well known for not making use of too much automation but rather sticking to the basics and typing the proof terms out etc? At least most agda people I've come across so far usually disliked tactics * a lot*.

Reid Barton (Jun 01 2022 at 16:38):

I think the Agda automation in question is the editor mode feature set (C-c C-r for refine, etc). The main difference is that in Agda the source .agda file contains the results of this automation, and not the automation steps themselves.

Reid Barton (Jun 01 2022 at 16:38):

In my (limited) Agda experience, you wouldn't really want to write all the proof terms without assistance and Agda doesn't make it easy to do so.

Jakob von Raumer (Jun 03 2022 at 07:31):

Yes, the three things they askes for are exactly the useful commands bound by agda-mode.

Rosie Baish (Jun 03 2022 at 08:05):

I should possibly clarify, that I'm a long way from an experienced Agda user.
As @Reid Barton said, I was mostly wondering about the editor features which put code into the source file for you with holes to fill yourself.
I found this increadibly useful when learning, as it created all the syntax for me and hugely reduced both the amount of typing and the amount of errors I had

Sebastian Ullrich (Jun 03 2022 at 08:09):

We may eventually get there, but the existence of the tactic framework lowers the priority of these features quite a bit compared to Agda

Rosie Baish (Jun 03 2022 at 10:39):

Is there anything in the pipeline that would facilitate it as part of Lean, or would that kind of thing be better as a standalone tool, or part of lean4-mode in emacs?

Rosie Baish (Jun 03 2022 at 10:39):

I'll probably have a play around at implementing something myself, but I might as well do it in the most sensible way so any advice is much appreciated!

Sebastian Ullrich (Jun 03 2022 at 10:49):

It would have to be a Lean plugin at the very least to reliably get type information and potentially for structurally modifying the syntax tree. I think any reasonable implementation could definitely be merged into Lean.

Sebastian Ullrich (Jun 03 2022 at 10:51):

My student assistant @Lars König is planning to look into improving the pretty printer towards a practical code formatter next, which any tools that want to generate syntax should benefit from as they don't have to care about formatting anymore

Reid Barton (Jun 03 2022 at 14:50):

Does Lean 4 have something like Lean 3's hole commands?

Rosie Baish (Jun 08 2022 at 23:43):

Me again with more stupid questions.
Is there a way to get lean to dump the AST of an expression?
I want to play around with some stuff in python, so ideally need the AST in some machine readable form.
All implicit brackets put back in and prefix notation would work, but a more proper AST would be amazing.

Rosie Baish (Jun 08 2022 at 23:44):

I want to create a mini proof planner which can take some theorems, some axioms and a goal and attempt to generate a proof in first order logic which lean would check.

Henrik Böving (Jun 08 2022 at 23:44):

Do you want the syntax level AST or the logical expression?

Henrik Böving (Jun 08 2022 at 23:45):

okay expression

Henrik Böving (Jun 08 2022 at 23:47):

You can use something like this

import Lean
open Lean Elab Term


#eval show TermElabM _ from do
  let expr  `(Nat.add 1 2)
  elabTerm expr none

it will return the docs4#Lean.Expr that corresponds to the syntax you put in the `() notation. If you want to integrate this into Lean as a tactic you probably want to start looking into how Lean meta programming works first, we're currently putting a (very WIP) resource together for this here. https://github.com/arthurpaulino/lean4-metaprogramming-book

Henrik Böving (Jun 08 2022 at 23:48):

And as a general remark, there is no need to do all of this in python and outside of Lean, Lean 4 is perfectly adequate and well equipped to do just what you want on its own with its meta programming facilities.

Chris Lovett (Jun 08 2022 at 23:49):

@Rosie Baish there are work items on the VSCode extension for doing some things like code actions to "fill holes" with suggestions form Lean Language Server. So definitely open to suggestions and even PR help on any great ideas you might have there. The poor mans version right now is that the lean program can print stuff in the InfoView which you can then "copy to editor" with a button in the info view :-) The other poor man's trick for filling holes is to type underscore in your Lean program and then look at the intellisense when you hover over the underscore. That at least tells you what kind of think you need to put there to complete the program.

Rosie Baish (Jun 09 2022 at 00:00):

I was aware that Lean4 could do everything I needed, but I'm a very long way from proficient in it
I just wanted to hack a quick prototype together, and python is my most familiar language for that
If it turns out to be a good idea I'll look at porting it to lean etc.

Rosie Baish (Jun 09 2022 at 00:00):

Also, I'm not sure I understand the distinction you made between syntax level AST and logical expression @Henrik Böving ?

Henrik Böving (Jun 09 2022 at 00:02):

Lean has a docs4#Lean.Syntax type for representing the user level syntax, for example things like 1 + 1 are not desugared to basically Nat.add 1 1 in this but are represented as 1 + 1 but in a syntax tree form so it wouldn't be too interesting for you. There is then a component in the compiler called the term elaborator (which is what I'm calling to here, hence TermElabM that turns a Syntax into an Expr which can actually be used for type checking and all that proof stuff you are interested in.

Rosie Baish (Jun 09 2022 at 00:05):

I wanted the version that turns (a \and b) into (prop.and a b), or something similar
I ran the code snippet you posted above and it generated a huge expression that didn't fit on my screen

Rosie Baish (Jun 09 2022 at 00:05):

I basically just want lean to output stuff in a way that means I don't have to figure out all the infix notation and associativity myself

Rosie Baish (Jun 09 2022 at 00:09):

The syntax definition you linked looks perfect to me, so I think I just need to figure out how to turn the current assumptions and goal into that format, and how to turn an arbitrary theorem into that format

Henrik Böving (Jun 09 2022 at 00:16):

No operating on the Syntax directly for proof generation is really not a good idea, if you want this to work on any user defined notation and properly together with the Lean compiler you want to operate on Expr because Expr is what all tactics and proof automation operates on really. The syntax tree for the above expression will look more tame but once you start getting into logic stuff blows up again big time, consider for example:

#eval show MetaM _ from do
  `( (a : Prop), a  ¬a)

Rosie Baish (Jun 09 2022 at 00:25):

Point taken, thanks!
I think I got spooked by the amount of stuff that the Nat example gave, the second one you sent is much more readable!

John Burnham (Jun 10 2022 at 02:44):

Rosie Baish said:

Is there a way to get lean to dump the AST of an expression?
I want to play around with some stuff in python, so ideally need the AST in some machine readable form.
All implicit brackets put back in and prefix notation would work, but a more proper AST would be amazing.

We're doing this in the Yatima project, you can take a look at our https://github.com/yatima-inc/yatima-lang/blob/main/Main.lean for an example of how to parse a .lean file to get the Lean.Environment that contains all the constants, expressions, etc. And https://github.com/yatima-inc/yatima-lang/blob/main/Yatima/DebugUtils.lean for an example of how to print things.

Most of the components you need, like the Lean parser and Lean.Expr printer, are importable from the Lean namespace, so our stuff is mostly just gluing existing things together, but might be of interest

Rosie Baish (Jun 10 2022 at 15:09):

@John Burnham Thanks

Rosie Baish (Jun 10 2022 at 15:11):

I've managed to get a basic prototype working that inserts bits of proof into the file for me.
However whenever it inserts it the syntax highlighting in Emacs utterly breaks.
I'm assuming that the LSP server isn't being updated or something, is there a part of lean4-mode I can use to workaround this?
Or do I need to get my code to paste into the buffer at point rather than do a direct insertion?

Sebastian Ullrich (Jun 10 2022 at 15:22):

You'd have to look into lsp-mode's implementation as to when it triggers an update and whether you can do that manually

Sebastian Ullrich (Jun 10 2022 at 15:23):

Note that synthesizing & inserting text on the client side definitely is not a robust approach, we're aiming for syntax tree transformation on the server side. Multiple people are working on a general framework for this.

Rosie Baish (Jun 10 2022 at 15:27):

@Sebastian Ullrich Yeah, I know that what I'm doing falls into the "horrible and hacky" category :-)
It's mostly just there to save me a load of typing, because I tend to go through the same few steps on every proof/case and they're highly amenable to automation!

Reid Barton (Jun 10 2022 at 15:54):

Have you considered using tactics?

Rosie Baish (Jun 11 2022 at 08:37):

@Reid Barton Yes
I don't want to use tactics because of what I consider to be a major floor in the concept - they're not cacheable. (n.b. there may be a more technical term that I don't know here)
Phrased another way, the complexity of generating a proof is in a different class to the complexity of checking a proof.
The reason I'm using Lean is to automate proofs about a programming language semantics
I therefore have a bunch of theorems which I want to prove hold for an inductive definition.
However, the inductive definition is generated from my semantics, and so will change over time.
Because of this, as part of my continuous integration I want to re-run all of my proofs.

Lets assume I have some arbitrarily complex tactic which can prove any of the theorems I want, but has a runtime measured in hours.
If it's a tactic, then every time I rerun my proof (either because theres a new version of Lean, or a new version of my semantics, or any of 100 other reasons to want to re-verify) I have to take that multi-hour runtime hit for every single one of my theorems

On the other hand, if I have a proof generator, it will spit out a proof of how it constructed the proof, which can then be checked much faster.

I can then have my CI simply use lean to check the proof that it generates, which is much more efficient.
If it fails, then I can go and repair the proof, either manually or automatically using the proof-generator.

Rosie Baish (Jun 11 2022 at 08:40):

If a tactic had a way to "show it's working" as it were, I'd be entirely happy with them

Alexander Bentkamp (Jun 11 2022 at 08:57):

There was a similar question here: https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/Leans.20version.20of.20.22Show.20Proof.22.3F
Maybe show_term would help you with what you are trying to do? I don't know it there is a Lean 4 equivalent yet, but it shouldn't be too hard to implement.

Rosie Baish (Jun 11 2022 at 09:23):

@Alexander Bentkamp
Do you mean do a show_term and then paste the result in as the proof?
Or does lean use the output of show_term on subsequant runs?
Because the latter is what I need, and modifying the source file seems like the most sensible way to acheive it

Alexander Bentkamp (Jun 11 2022 at 09:32):

No, Lean will always rerun the tactics inside show_term because where is Lean supposed to cache this? On the other hand, with Lean 4, I guess you could make a variant of show_term that caches the results in an external file.

Tomas Skrivan (Jun 11 2022 at 09:36):

You can also use the #print command to print the tactic generated proof and then copy&paste it back to your source code.

A simple example:

theorem foo (x : Nat) : id x = x := by simp[id]

#print foo  --- fun x => of_eq_true (eq_self x)

theorem foo' (x : Nat) : id x = x := of_eq_true (eq_self x)

Alexander Bentkamp (Jun 11 2022 at 09:37):

I was thinking you could use show_term as is, but I don't know if this is too much manual labor for your application. You could paste the output of show_term, but keep the show_term command in a comment. Whenever the proof breaks, you can uncomment to regenerate a proof term.

theorem ... :=
/- by show_term { expensive_tactic } -/
fun x y z => @explicit proof term

Rosie Baish (Jun 11 2022 at 10:02):

Ah, that's cool.
And a lot closer to what I had in mind!

Reid Barton (Jun 11 2022 at 11:47):

Yes, caching of tactic results is an issue in general, though in this context, it sounds like mostly a theoretical (or perhaps philosophical) one.

Reid Barton (Jun 11 2022 at 11:53):

If you're just applying refine, induction, constructor, etc. in an easily-automated way, there's no reason to suspect that proof generation time would be large compared to proof checking time.

Rosie Baish (Jun 11 2022 at 16:24):

Reid Barton said:

If you're just applying refine, induction, constructor, etc. in an easily-automated way, there's no reason to suspect that proof generation time would be large compared to proof checking time.

That's the start, but I'm currently implementing some backtracking stuff where it keeps attempting different strategies, with a timeout option for stuff it can't do!

Mac (Jun 11 2022 at 16:37):

My thought would be to have a tactic parameterized by another tactic/term (e.g., cache proof) that globally loads a Name to proof term map for the file (e.g., from a JSON file) and then checks if the current definition is in the map. If not, elaborate proof. If it is in the map, try to use that term. If it fails to type check, elaborate proof. If proof is successfully re-elaborated, save the new term for prrof into the map and save the whole map to back to the cache file at the end of the program.

Rosie Baish (Jun 11 2022 at 16:56):

Mac said:

My thought would be to have a tactic parameterized by another tactic/term (e.g., cache proof) that globally loads a Name to proof term map for the file (e.g., from a JSON file) and then checks if the current definition is in the map. If not, elaborate proof. If it is in the map, try to use that term. If it fails to type check, elaborate proof. If proof is successfully re-elaborated, save the new term for prrof into the map and save the whole map to back to the cache file at the end of the program.

That would be really cool
How easy would that be to implement?
I'd have no idea where to start!

Mac (Jun 11 2022 at 17:31):

I am working on a proof of concept. I just need to figure out how to serialize (and deserialize) expressions.

Henrik Böving (Jun 11 2022 at 17:34):

The olean loader/saver should be able to do that right? Maybe look there.

Henrik Böving (Jun 11 2022 at 17:35):

In general what you are describing sounds like olean but split per declaration instead of per module.

Mac (Jun 11 2022 at 18:26):

A fair comparison -- oleans do store a lot more than just terms, though.

Mac (Jun 11 2022 at 23:33):

Here is the main body of my PoC:

import Lean.Elab.ElabRules
open System Lean Meta Elab Term Command

initialize termCacheExt : EnvExtension (Option (NameMap Expr)) 
  registerEnvExtension (pure none)

def setTermCache (cache : NameMap Expr)  : CoreM PUnit := do
  modifyEnv (termCacheExt.setState · cache)

def getTermCacheFile : CoreM FilePath :=
  return FilePath.mk ( getFileName) |>.withExtension "json"

def getTermCache : CoreM (NameMap Expr) := do
  if let some cache := termCacheExt.getState ( getEnv) then
    return cache
  else
    let cacheFile  getTermCacheFile
    let cache  liftM (m := IO) do
      try
        let contents  IO.FS.readFile cacheFile
        let json  IO.ofExcept <| Json.parse contents
        IO.ofExcept <| fromJson? json
      catch
      | IO.Error.noFileOrDirectory .. => pure {}
      | err => throw err
    setTermCache cache
    return cache

def modifyTermCache (f : NameMap Expr  NameMap Expr) : CoreM PUnit := do
  setTermCache <| f ( getTermCache)

def saveTermCache : CoreM PUnit := do
  if let some cache := termCacheExt.getState ( getEnv) then
    IO.FS.writeFile ( getTermCacheFile) (Json.pretty <| toJson cache)

elab "cached_term " key:ident tm:term : term <= expectedType => do
  let id := key.getId
  let cache  getTermCache
  if let some exp := cache.find? id then
    let ty  inferType exp
    if ( isDefEq ty expectedType) then
      return exp
  let exp  elabTermAndSynthesize tm expectedType
  modifyTermCache (·.insert id exp)
  return exp

elab "save_term_cache" : command =>
  liftCoreM saveTermCache

The JSON serialization of Expr is not efficient and it does not work for auto-generated names, but it was quick and dirty solution in the my absence of knowing the proper way to serialize expressions. I am also not sure how to register save_term_cache to run at the end of elaboration so right now it has to be done manually. However, I think the code serves as a decent demonstrate of how this could be done.

Rosie Baish (Jun 11 2022 at 23:50):

Mac said:

Here is the main body of my PoC:

Here is my somewhat hacky python stuff, very much in a pre-alpha state for now:
https://github.com/RosieBaish/lean_auto_mode
You pass it the output of the "Lean Goals" buffer in emacs either in a file or stdin an it prints out it's best effort at a proof.

Sebastian Ullrich (Jun 12 2022 at 13:14):

Mac said:

My thought would be to have a tactic parameterized by another tactic/term (e.g., cache proof) that globally loads a Name to proof term map for the file (e.g., from a JSON file) and then checks if the current definition is in the map. If not, elaborate proof. If it is in the map, try to use that term. If it fails to type check, elaborate proof. If proof is successfully re-elaborated, save the new term for prrof into the map and save the whole map to back to the cache file at the end of the program.

One issue with this approach is that you won't notice the proof is broken until the cached proof breaks as well. Which means you lose reproducibility unless you check in the cache as well, with all the usual downsides of generated files in git such as fun merge conflicts. But it might not be the end of the world.

Siddhartha Gadgil (Jun 12 2022 at 14:51):

(deleted)


Last updated: Dec 20 2023 at 11:08 UTC