Zulip Chat Archive

Stream: general

Topic: ImProver discussion


David Renshaw (Oct 08 2024 at 14:59):

Screenshot from 2024-10-08 10-58-32.png
Please open a pull request with this improvement! https://github.com/dwrensha/compfiles/pulls

David Renshaw (Oct 08 2024 at 15:00):

(ImProver announcement)

Kevin Buzzard (Oct 08 2024 at 15:34):

ImProver is a great name for this tool by the way!

David Renshaw (Oct 08 2024 at 15:42):

(me deliberately misreading it)
it roves over the implications!

Kevin Buzzard (Oct 08 2024 at 16:11):

Reminds me of Noam Elkies referring to google as "go ogle" back in the day

Heather Macbeth (Oct 08 2024 at 16:36):

By varying the metric, we can perform tasks such as shortening proofs, making them more readable, or even automated proving. We consider 3 metrics:

  • Length Metric: The length metric measures the number of tactic invocations in the tactic proof ...
  • Readability Metric: .... the ratio of number of explicitly typed have tactics to total number of tactic invocations.
  • Completion Metric: .... This is a trivial metric which measures the number of errors present.

@Riyaz Ahuja I'd love to see some optimization using my favourite metric: number of lemmas invoked by name in the proof.

Kevin Buzzard (Oct 08 2024 at 16:39):

The idea being that the more lemmas people have to learn by name, the worse their life is? And if instead we could do everything with tactics then we're making life easier for both humans and machines, presumably?

Heather Macbeth (Oct 08 2024 at 16:40):

Yup! This is the metric I mentally adhere to when writing for beginners, like in Mechanics of Proof.

Riyaz Ahuja (Oct 08 2024 at 20:21):

This is a good idea - on a previous commit, ff9eb30a1ab5ed90d3ec96cf4292c65b84596a28 I believe, I added support for extracting dependency data from proofs along with the states. It was originally meant for adding better context for optimizing theorems that use a lot of definitions and lemmas that the model might have trouble finding, but it could easily be used to make a metric that counts the number of explicit lemmas that are used.

Riyaz Ahuja (Oct 08 2024 at 20:23):

Only thing is that this extraction was incredibly slow, so I decided to not put it in the paper. However, a lot of our current goals involve making ImProver a lot more efficient and usable as a general-purpose tool, so as we restructure the architecture of the project, I think this metric would be really interesting to test and use.

Riyaz Ahuja (Oct 09 2024 at 15:38):

David Renshaw said:

Screenshot from 2024-10-08 10-58-32.png
Please open a pull request with this improvement! https://github.com/dwrensha/compfiles/pulls

PR Submitted!

Robin Carlier (Oct 12 2024 at 11:27):

On a freshly cloned copy of the repo and fresh python3.12 virtual environment , running pip install -r requirements.txt; python3 scripts/build.py --config configs/config.json (the default mathlib:v4.9.0-only config file in the repo), the scripts builds a cache, but looking at the .json it generates, it’s only made of error messages and the tactics and proofTrees objects are empty. I don’t think this is normal behaviour. What am I doing wrong here?

Riyaz Ahuja (Oct 12 2024 at 20:28):

Robin Carlier said:

messages

I installed 3.12 with a fresh clone on a new machine and was unable to reproduce the error (i.e. cache was built normally), can you DM me the error messages you're getting and if you're getting the full 4436 build jobs?

Austin Letson (Oct 15 2024 at 00:13):

It would be interesting to incorporate ImProver into a CI workflow to provide feedback on PRs

Austin Letson (Oct 15 2024 at 00:26):

@Riyaz Ahuja Would incorporating ImProver into a Lean development workflow be feasible?

Riyaz Ahuja (Oct 15 2024 at 00:56):

This is 100% the goal. In its current state, integrating ImProver to standard workflows is feasible and also quite flexible (i.e. for mathlib, define your metrics to optimize for satisfying the mathlib style/formatting requirements, etc.).

The main limitation is the requirement of LLM API keys for users to use ImProver, as well as the time cost for large PR's - both of which hurt the usability for regular users to have ImProver fully integrated in their workflow. However, with some more grad students joining our team as well as access to more compute power, we've started on building our own model distilled from the more expensive GPT-4o to be far cheaper and faster. With that, integration into workflows will be far more usable and user-friendly.

Austin Letson said:

Riyaz Ahuja Would incorporating ImProver into a Lean development workflow be feasible?

Austin Letson (Oct 15 2024 at 23:04):

Thanks for providing the context. I am excited to try this out!


Last updated: May 02 2025 at 03:31 UTC