Zulip Chat Archive

Stream: Is there code for X?

Topic: better/best solving AI?

Belisarius Cawl (Mar 26 2021 at 07:13):

I was looking at the docs for simp and it said that the program can get stuck in replacement loops if one is not careful to put the simpler side on the left hand. This got me wondering - since the concept of automated provers (possibly with statistical approaches) is not new: How far is this integrated and how would one go about invoking it?

EDIT
For example, remembering already visited nodes in the reformulation-equivalence graph would get rid of loops. Since this graph is probably quite unending this is where the AI/statistics part would come in.

Johan Commelin (Mar 26 2021 at 07:19):

@Belisarius Cawl Several things have been attempted. For example, there is rewrite_search.
One thing that I've discovered since I started formalising is that there is a delicate balance between proof search, speed, and maintainability.
So if you have a smart AI that takes some time to find a proof that isn't completely straightforward, it should be able to emit a fast (and maintainable) way to replicate the proof. This isn't always straightforward.

Johan Commelin (Mar 26 2021 at 07:20):

But if you have 5000 lemmas in mathlib that are proven using by smart_ai and the ai takes 15 seconds to find the proof, then compile time is sky-rocketing.

Johan Commelin (Mar 26 2021 at 07:22):

On the other hand, if something in logic.basic changes, and smart_ai for some stupid reason can no longer find the proof of a certain lemma X, then it shouldn't require an author with domain specific knowledge to fix the proof of X. That is why by smart_ai should output a maintainable proof, and not just a fast but completely opaque proof term.

Johan Commelin (Mar 26 2021 at 07:23):

simp seems to find the middle ground, by asking from library authors to adhere to certain conventions (e.g., simpler side goes on the right, and make the simp-set as confluent as possible, etc...). If we stick to those conventions, in practice we seem to hit a nice balance between automation, speed, and maintainability.

Johan Commelin (Mar 26 2021 at 07:24):

Speed is obtained by using squeeze_simp to output a certificate that uses simp only [15, different, lemma, names]

Belisarius Cawl (Mar 26 2021 at 07:25):

@Johan Commelin I don't understand the last part. If I find a path why not just print it? Verifying it should be orders of magnitude faster.

Johan Commelin (Mar 26 2021 at 07:26):

It depends on what smart_ai does... what do you mean with "path"?

Johan Commelin (Mar 26 2021 at 07:27):

If smart_ai is only rewriting along lemmas of the form A = B that it finds in the library, then yes. This is rewrite_search

Johan Commelin (Mar 26 2021 at 07:27):

If you want it to also handle inequalities, or maybe other smart knowledge, then I don't know what "path" is.

Belisarius Cawl (Mar 26 2021 at 07:27):

I visualize the problem as a graph, where each edge is a lemma application.

Johan Commelin (Mar 26 2021 at 07:28):

Right, so if you print that "path", you are printing the proof term. Which is often larger and uglier than you expect. And quite opaque.

Belisarius Cawl (Mar 26 2021 at 07:28):

ah yes, I did not register that as a problem.

Johan Commelin (Mar 26 2021 at 07:29):

theorem LCFP.map_comp_Tinv : ∀ (V : NormedGroup) (r' : ℝ≥0) (n : ℕ) [_inst_5 : fact (0 < r')] (c c₂ : ℝ≥0)
[_inst_6 : fact (c₂ ≤ r' * c)] {M₁ M₂ : (ProFiltPseuNormGrpWithTinv r')ᵒᵖ} (f : M₁ ⟶ M₂),
  (LCFP V r' c n).map f ≫ Tinv V r' n c c₂ M₂ = Tinv V r' n c c₂ M₁ ≫ (LCFP V r' c₂ n).map f :=
λ (V : NormedGroup) (r' : ℝ≥0) (n : ℕ) [_inst_5 : fact (0 < r')] (c c₂ : ℝ≥0)
[_inst_6 : fact (c₂ ≤ r' * c)] {M₁ M₂ : (ProFiltPseuNormGrpWithTinv r')ᵒᵖ} (f : M₁ ⟶ M₂),
  id
    ((id
        ((λ
          (a a_1 :
            (LCP V n).obj (op (filtration_obj ↥(unop M₁) c)) ⟶
              (LCP V n).obj (op (filtration_obj ↥(unop M₂) c₂))) (e_1 : a = a_1)
          (ᾰ ᾰ_1 :
            (LCP V n).obj (op (filtration_obj ↥(unop M₁) c)) ⟶
              (LCP V n).obj (op (filtration_obj ↥(unop M₂) c₂))) (e_2 : ᾰ = ᾰ_1),
            congr (congr_arg eq e_1) e_2)
           ((LCP V n).map ((ProFiltPseuNormGrpWithTinv.level r' c).map f.unop).op ≫
              (LCP V n).map (Tinv₀_hom ↥(unop M₂) c₂ c).op)
           ((LCP V n).map (Tinv₀_hom ↥(unop M₂) c₂ c ≫ (ProFiltPseuNormGrpWithTinv.level r' c).map f.unop).op)
           (((LCP V n).map_comp ((ProFiltPseuNormGrpWithTinv.level r' c).map f.unop).op
               (Tinv₀_hom ↥(unop M₂) c₂ c).op).symm.trans
              ((λ (c : Profiniteᵒᵖ ⥤ NormedGroup) {X Y : Profiniteᵒᵖ} (ᾰ ᾰ_1 : X ⟶ Y)
                (e_4 : ᾰ = ᾰ_1), congr_arg c.map e_4)
                 (LCP V n)
                 (((ProFiltPseuNormGrpWithTinv.level r' c).map f.unop).op ≫ (Tinv₀_hom ↥(unop M₂) c₂ c).op)
                 (Tinv₀_hom ↥(unop M₂) c₂ c ≫ (ProFiltPseuNormGrpWithTinv.level r' c).map f.unop).op
                 op_comp.symm))
           ((LCP V n).map (Tinv₀_hom ↥(unop M₁) c₂ c).op ≫
              (LCP V n).map ((ProFiltPseuNormGrpWithTinv.level r' c₂).map f.unop).op)
           ((LCP V n).map
              ((ProFiltPseuNormGrpWithTinv.level r' c₂).map f.unop ≫ Tinv₀_hom ↥(unop M₁) c₂ c).op)
           (((LCP V n).map_comp (Tinv₀_hom ↥(unop M₁) c₂ c).op
               ((ProFiltPseuNormGrpWithTinv.level r' c₂).map f.unop).op).symm.trans
              ((λ (c : Profiniteᵒᵖ ⥤ NormedGroup) {X Y : Profiniteᵒᵖ} (ᾰ ᾰ_1 : X ⟶ Y)
                (e_4 : ᾰ = ᾰ_1), congr_arg c.map e_4)
                 (LCP V n)
                 ((Tinv₀_hom ↥(unop M₁) c₂ c).op ≫ ((ProFiltPseuNormGrpWithTinv.level r' c₂).map f.unop).op)
                 ((ProFiltPseuNormGrpWithTinv.level r' c₂).map f.unop ≫ Tinv₀_hom ↥(unop M₁) c₂ c).op
                 op_comp.symm)))).mpr
       ((λ (c : Profiniteᵒᵖ ⥤ NormedGroup) {X Y : Profiniteᵒᵖ} (ᾰ ᾰ_1 : X ⟶ Y) (e_4 : ᾰ = ᾰ_1),
           congr_arg c.map e_4)
          (LCP V n)
          (Tinv₀_hom ↥(unop M₂) c₂ c ≫ (ProFiltPseuNormGrpWithTinv.level r' c).map f.unop).op
          ((ProFiltPseuNormGrpWithTinv.level r' c₂).map f.unop ≫ Tinv₀_hom ↥(unop M₁) c₂ c).op
          ((λ (f f_1 : filtration_obj ↥(unop M₂) c₂ ⟶ filtration_obj ↥(unop M₁) c) (e_1 : f = f_1),
              congr_arg has_hom.hom.op e_1)
             (Tinv₀_hom ↥(unop M₂) c₂ c ≫ (ProFiltPseuNormGrpWithTinv.level r' c).map f.unop)
             ((ProFiltPseuNormGrpWithTinv.level r' c₂).map f.unop ≫ Tinv₀_hom ↥(unop M₁) c₂ c)
             (continuous_map.ext
                (λ (x : ↥((filtration_obj ↥(unop M₂) c₂).to_Top)),
                   subtype.cases_on x
                     (λ (x : ↥(unop M₂)) (hx : x ∈ filtration ↥(unop M₂) c₂),
                        subtype.ext (profinitely_filtered_pseudo_normed_group_with_Tinv_hom.map_Tinv f.unop x)))))))

Johan Commelin (Mar 26 2021 at 07:29):

That's the proof term of the latest lemma that I just proved

Johan Commelin (Mar 26 2021 at 07:30):

You can #print lemma_name to try this yourself

Johan Commelin (Mar 26 2021 at 07:30):

The proof script that generated the term is

begin
  dsimp [Tinv, LCFP],
  simp only [← (LCP V n).map_comp, ← op_comp],
  congr' 2,
  ext ⟨x, hx⟩,
  exact f.unop.map_Tinv x
end

Belisarius Cawl (Mar 26 2021 at 07:30):

Seems pretty obvious... haha :D

Jokes aside, I treat that as machine code. I guess a real mathematician would want to be able to understand it though.

Johan Commelin (Mar 26 2021 at 07:31):

And this is an "easy" lemma, that everyone who knows the maths would consider "too easy to even state"

Johan Commelin (Mar 26 2021 at 07:31):

The important point is that this proof term will break very easily if some little detail in another place of the library is changed.

Johan Commelin (Mar 26 2021 at 07:31):

But the tactic script is a lot more robust

Belisarius Cawl (Mar 26 2021 at 07:31):

It's a kind of philosophical topic it seems - If I proved my prover and my prover proves something with gibberish, does that count as understanding?

Belisarius Cawl (Mar 26 2021 at 07:32):

you could solve the brittleness with version tags, no?

Johan Commelin (Mar 26 2021 at 07:32):

I appreciate the philosophical point. But my arguments are much more pragmatic.

Belisarius Cawl (Mar 26 2021 at 07:34):

so you could have recent versions tagged as buggy or just "friendly" updated (like expansion or equivalent rewrite)

Johan Commelin (Mar 26 2021 at 07:34):

We want the following features from mathlib:

it should be able to grow (so we want some form of automation, because we don't want to keep stuck with low-level concepts and tools)
it should be reasonably fast to compile mathlib (currently ~2 hrs on reasonable hardware)
it should be maintainable (PR authors that edit file X shouldn't expect 100 other files to break, and if 3 files break that they don't know much about, they should still be able to fix the proofs easily)

Johan Commelin (Mar 26 2021 at 07:36):

So, any form of automation that helps with (1) forward, should take (2) and (3) into account

Johan Commelin (Mar 26 2021 at 07:37):

It's not clear to me how the version tags would help. It sounds like they will turn (3) into a nightmare.

Johan Commelin (Mar 26 2021 at 07:37):

tidy is another tactic that will search for proofs, and tidy? will print a tactic script that uses lower-level tactics to produce the proof that tidy found

Johan Commelin (Mar 26 2021 at 07:38):

This model seems to scale well, but it's not easy to write a good smart_ai?. (At least it's not easy for me.)

Scott Morrison (Mar 26 2021 at 07:39):

There is also the new gptf, with a very impressive trained neural net behind it. It definitely fits Johan's criterion, in that its job is solely to suggest what you might type in next! At the end of the process there's no sign that it was ever involved.

Scott Morrison (Mar 26 2021 at 07:41):

Another example is solve_by_elim, which does backwards chaining from a specified set of lemmas, and library_search that tries to apply a single lemma from the entire library to close the goal.

Scott Morrison (Mar 26 2021 at 07:41):

solve_by_elim is usually robust enough to "leave in place", but library_search is sadly slow.

Belisarius Cawl (Mar 26 2021 at 07:41):

I am thinking along the lines of - what are edits that cause a proof to break? Either the old version had a bug (in which case you need of course to discard everything relying on it) or you rewrite it because you have an equivalent but somehow "better" formulation (in which case you can keep your old proof, as long as you show that your rewrite is equivalent). Likely I don't really understand your problem though.

I am thinking of something like Nix. Also I got a feeling that modularization could help. Again, as an outsider.

Of course such an AI would be nontrivial which makes it exciting. There is a lot of art/subjectivity in (statistical) AI.

Scott Morrison (Mar 26 2021 at 07:41):

One of course wishes for the pushout of these two tactics. :-)

Scott Morrison (Mar 26 2021 at 07:42):

Proofs break constantly in mathlib. :-) Make a few pull requests and you'll get a feel for what is going on.

Belisarius Cawl (Mar 26 2021 at 07:43):

I really want to right now - but I have to constrain myself. Am already three rabbit holes down from my original task (which has nothing to do at all with proofs ;) )

Scott Morrison (Mar 26 2021 at 07:43):

mathlib definitely does not grow "just from the top". We are perpetually adding e.g. simp lemmas that make things work more smoothly, but that changes the behaviour of simp in every existing proof. It takes some care to only use tactics like this in ways that will break in predicable / manageable ways.

Anne Baanen (Mar 26 2021 at 09:09):

Belisarius Cawl said:

Johan Commelin I don't understand the last part. If I find a path why not just print it? Verifying it should be orders of magnitude faster.

A (related) technical term you might be interested in is a proof certificate, although that's more of a problem-specific concept. For example, to prove "¬ linear_independent R v" a proof certificate would be a list of coefficients c i such that Σ i, c i * v i = 0. So you could have a find_linear_dependence tactic outputting a definition of c, and a check_linear_dependence c tactic that verifies the linear combination adds up to 0. (You could say squeeze_simp outputs a proof certificate for the problem of determining whether two expressions are the same.)

Last updated: May 02 2025 at 03:31 UTC