Zulip Chat Archive

Stream: lean-gptf

Topic: simplicity


Julian Berman (Jan 31 2021 at 04:01):

Hi -- naive question perhaps, but how does/did gpt-f learn what constitutes a simpler proof than another? Or more specifically, what made it produce this: https://github.com/leanprover-community/mathlib/pull/5903/files -- is it explicitly trained that rw is a simpler tactic than simp even though in terms of number of characters that's longer?

Mario Carneiro (Jan 31 2021 at 04:03):

The comments on the last few gpt-f PR's suggest that it's using the length of pp.all output

Julian Berman (Jan 31 2021 at 04:04):

A ha.

Jesse Michael Han (Jan 31 2021 at 04:34):

Mario Carneiro said:

The comments on the last few gpt-f PR's suggest that it's using the length of pp.all output

it is just the length of format.to_string <$> format.flatten <$> tactic.pp $PROOF_TERM with default pretty-printing settings --- they get an order of magnitude larger with pp.all enabled

Alex J. Best (Jan 31 2021 at 04:36):

But it's not trained to output short proofs , is that right? It just happens to find shorter ones sometimes?

Jesse Michael Han (Jan 31 2021 at 04:38):

correct, it's not explicitly trained to produce short proofs, nor do we optimize the data collection process in any way for this --- it just tries to predict what it thinks is the most likely tactic for a given tactic state, based on the training data (tactic steps in mathlib)

Jesse Michael Han (Jan 31 2021 at 04:41):

we determined which proofs were shorter with another script which replays the proofs found by gptf and compares the proof term to the one already in the environment.

Julian Berman (Jan 31 2021 at 14:14):

Makes sense, thanks! Nonetheless I suppose, during training did it get to see multiple proofs of the same thing at all?

Jesse Michael Han (Jan 31 2021 at 15:17):

probably --- remember, its training data is precisely what you see when you step through a tactic proof in mathlib, so for example, it's probably seen false proven from Q and not Q (and other extraneous hypotheses in the context) using contradiction, exact absurd h1 h2, tauto, finish...

we can check this rigorously, but later---our attention is occupied with some other experiments at the moment.


Last updated: Dec 20 2023 at 11:08 UTC