Zulip Chat Archive

Stream: general

Topic: AI Lean definitions


Justin Asher (Feb 05 2026 at 18:53):

This thread spins out the discussion of how well AI can write definitions in Lean 4, i.e., how well it can autoformalize things.

This arose out of discussion for a potential sloplib:

Kevin Buzzard said:

2) Michael's comment made me realise that I know full well one of the problems here -- some definitions are hard. I think this is worth splitting out into a new thread; independent of the discussion here one could ask if AI tools could make definitions and I think we have a couple of interesting candidates.

Justin Asher (Feb 05 2026 at 18:55):

Justin Asher said:

Kevin Buzzard said:

What's the easiest way to ask an AI to formalize the definition of a connected reductive group over a field?

I tried this using a Claude Code setup I have on my computer, which utilizes the LeanExplore MCP tooling.

I got the following file:

Formalization of the definition of a reductive group over a field

It seems like there is (natural) difficulty. How does this look? Any clear problems with the way Claude approached this? Would this be fixable via better prompting or self-improvement loops?

Will repost this here.

Kevin Buzzard (Feb 05 2026 at 19:58):

So here's a brain dump on reductive algebraic groups.

The first two things to note are: (1) these things are not groups and (2) we should definitely have them in mathlib. What we really want is reductive group schemes and perhaps the definitive foundational work on this stuff is Conrad's notes https://math.stanford.edu/~conrad/papers/luminysga3.pdf , but if you look at the definition (Definition 3.1.1, p81) of a reductive group over a base scheme you see that one of the conditions is that the geometric fibres are reductive groups over a (algebraically closed) field, so it's best to focus first on the definition of a reductive group over a field; we'll need it for schemes.

The first design decision to make is whether to make the definition over an algebraically closed field or over a field. My instinct is to make the definition over a field, because in Langlands you want connected reductive groups over a global field (and these are not algebraically closed) and it seems a bit OTT to have to use schemes here. So I think that the first goal should be a definition of a reductive group over a field.

However some global fields are not perfect (and have characteristic p>0), so one has to be careful. For example the definition "has a completely reducible representation with finite kernel" doesn't work over imperfect fields (this is I think the definition of a pseudo-reductive group, which is more general and not what we want here I don't think). I asked Conrad and he said that he thought the best definition for a general field was the one in Borel's book "Linear algebraic groups" -- G is a smooth affine group scheme of finite type over k and the unipotent radical of G_{kbar} is trivial, where kbar is a random algebraic closure of k.

This means that one has to define unipotent radicals for smooth affine group varieties over an algebraically closed field. The definition in Conrad's notes is

Screenshot 2026-02-05 at 19.52.19.png

(p10 of the linked pdf above) and this will need a notion of dimension. The definition of unipotent is Def 1.1.5 (p8) with explanation of g_s and g_u above that.

Kevin Buzzard (Feb 05 2026 at 23:49):

Justin Asher said:

I got the following file:

Formalization of the definition of a reductive group over a field

This definition is no good because if I understood it correctly it's not correct in characteristic pp, and we need a good definition in characteristic pp for the Langlands conjectures.

Kim Morrison (Feb 06 2026 at 00:03):

@Kevin Buzzard, I just gave Claude a copy of Conrad's notes, and your post above, and prompted it with:

I would like you to attempt to sketch a preliminary definition of reductive algebraic groups for Mathlib. It's okay if this is incomplete for now: this project will definitely require human review and assistance, but I want you to get as far as you can confidently.

I've given you some notes in references/kevin.md, containing Kevin Buzzard's brain dump about how we should approach making these definitions.

It's fine to take your time to read the relevant parts of the Conrad notes, or indeed to do online research, or to stop to ask me questions. Of course you should also explore Mathlib to see what is available, and what we have to build or stub out before we can even write down approximate definitions.

The resulting (draft, experimental, never to be merged!) PR is at #34897.

Here's the full conversation log, although it's pretty painful to read; I'm posting this so you know there are no tricks! :-)

I would love to hear what you think of the result!

Kim Morrison (Feb 06 2026 at 00:09):

Actually, this version is pretty clearly wrong in some ways that even I can tell. :-) I don't want to spoil the experiment by actually trying to say anything myself, but I have just said to Claude "Can you please get a second opinion on this PR?" which it knows (via a second_opinion/SKILL.md file) means that it has to go ask Codex for a review.

Codex came back with (as summarized by Claude!)

  1. k-points definition is fundamentally wrong: GroupLike k A gives group-like elements (satisfying Δ(g) =
    g⊗g), which form a commutative group for commutative Hopf algebras. This means G(k) would be abelian even
    for GL_n. The correct k-points are A →ₐ[k] k (algebra homomorphisms) with convolution group structure.

  2. IsUnipotent definition is vacuous: Since AffineAlgGroup assumes A is geometrically reduced (hence
    reduced), any nilpotent element is zero. So (g - 1) nilpotent forces g = 1. This makes every group
    reductive, defeating the entire purpose.

  3. unipotentRadical is at wrong level: Should be a subgroup scheme (Hopf ideal/subalgebra), not a Subgroup
    of points. The current definition has no scheme-level content.

  4. IsConnected is too strong: IsDomain is strictly stronger than connectedness and not stable under base
    change. Should use "no nontrivial idempotents" instead.

  5. Base change instances may already exist: Mathlib/RingTheory/HopfAlgebra/TensorProduct.lean provides Hopf
    algebra structure on tensor products.

and I told it "Please make a plan for thoroughly addressing these criticisms." I'll update the PR with whatever that produces.

Kim Morrison (Feb 06 2026 at 00:45):

Okay, Claude has pushed a new version after it finished its conversation with Codex. I also had them write a PLAN.md for future work.

The current state is in #34897.

Here's the updated conversation log (just for verification that I contributed nothing interesting!)

Kim Morrison (Feb 06 2026 at 00:46):

Oh, it occurs to me that we could ask Claude to learn how to use blueprint, and convert PLAN.md into one.

Justin Asher (Feb 06 2026 at 05:15):

Here is the updated Lean file using Kevin's feedback:

Formalization of the definition of a reductive group over a field pt. 2

Johan Commelin (Feb 06 2026 at 07:24):

Using Kevin's feedback means that there is now a human expert in the loop.
I think it is important in Kim's approach that they avoid this.

Kim Morrison (Feb 06 2026 at 07:53):

(I'm offline for 48 hours, sorry I won't be able to respond on this thread until then.)

Kevin Buzzard (Feb 06 2026 at 11:25):

My expert comments on #34897 are that after ploughing through the 12000 lines PRed so far I found

class ReductiveGroup : Prop extends AffineAlgGroup k A where
  -- (nothing here at all)

class SemisimpleGroup : Prop extends ReductiveGroup k A where
  radical_trivial : True

which doesn't bode well.

My expert comments on Formalization of the definition of a reductive group over a field pt. 2 is that when I click on the link nothing happens.

Kim Morrison (Feb 06 2026 at 11:32):

Kevin Buzzard said:

My expert comments on #34897 are that after ploughing through the 12000 lines PRed so far I found

It's 750 lines?

Kevin Buzzard (Feb 06 2026 at 11:32):

Yeah, most of the PR is not Lean code :-)

Justin Asher (Feb 06 2026 at 12:40):

Johan Commelin said:

Using Kevin's feedback means that there is now a human expert in the loop.
I think it is important in Kim's approach that they avoid this.

I had it develop the relevant Lean code according to the references and path that Kevin provided, similar to a blueprint.

While having a human expert in the loop slows things down, it (hopefully) would still be noticeably faster than writing everything by hand.

Matthew Ballard (Feb 06 2026 at 13:04):

There are four separate modes here depending whether you want it one-shot or not and whether you are supplying tests or not. There is no way you are getting something useful without tests. With tests, then I’d still doubt a one shot but could be surprised. Really you need an extensive plan mode interview if you aren’t supplying some scaffolding. And Claude should come back to plan mode when things appear complicated. I would just hook codex up via mcp if you want to off load natural language math planning.

Matthew Ballard (Feb 06 2026 at 13:10):

There is one domain expert that I can think of that might be more indefatigable than claude, making for a great interview subject

Kevin Buzzard (Feb 06 2026 at 14:23):

Here's a hack: instead of saying "GkG_{\overline{k}} has trivial unipotent radical" (which forces you to define the unipotent radical, which might be some work) it suffices to say "GkG_{\overline{k}} has no nontrivial smooth connected unpotent normal kk-subgroup" which is probably easier.

Matthew Ballard (Feb 06 2026 at 14:24):

More to point: Kevin, what do you want it to prove to show it has the right implementation?

Kevin Buzzard (Feb 06 2026 at 14:25):

Classification over an algebraically closed field, i.e. connected reductive groups are the same thing as based root data.

Matthew Ballard (Feb 06 2026 at 14:26):

What things on the way to that?

Justin Asher (Feb 06 2026 at 14:26):

That certainly would be a lot of work?

Matthew Ballard (Feb 06 2026 at 14:29):

For a given natural language notion, what is the set of natural language API that convinces you "this is the right thing"? From those tools, you should feel confident you can build most else you need.

Kevin Buzzard (Feb 06 2026 at 14:42):

Right, so this experiment is really shedding light on the kind of questions that there are here.

My answer to the question "why formalise mathematics" has changed hugely over time, and continues to evolve. In 2017 it was "it's no more pointless than proving theorems in the Langlands program and I did that for 25 years; in fact my instinct is that it's probably a lot less pointless even though right now we're still proving Sylow's theorems". In 2019 it was "formalizing mathematics is digitising mathematics and thus has the potential to revolutionise the area just like Spotify revolutionised music (but I cannot give any specific and realistic examples of how it might really turn research mathematics on its head because we only just proved the fundamental theorem of Galois theory)". By 2021 it was "oh look Lean can check a proof which Scholze claimed would not fit into his head properly, even though it took us 6 months" and by this point the mathematical community were beginning to think that the answers were actually becoming respectable. In 2026 part of my answer is "maybe Lean can help humans (a) check old proofs and (b) find new proofs", but if we actually want this to be happening in Langlands rather than in Erdos problems then we'd better define the main objects of study in Lean, which are automorphic representations for connected reductive groups over global fields. We are many thousands of lines of code away from this definition, and currently I am trying to think of ways of using (a) humans and (b) AI to get us there. Right now there seem to be serious problems with making either approach happen quickly and I am a very impatient person. We absolutely need a correct definition. Whether it should be powered by slop or by carefully-curated code is the question I raised in the sloplib thread, and this thread is experimenting with an AI approach. But I am still none the wiser about how we should be approaching this and even if I had a massive bag of money to throw at this question (which I don't) it is still not at all clear to me in which direction to throw it.

Matthew Ballard (Feb 06 2026 at 14:50):

It's not an either-or. You need humans that know what they are doing and can use the AI tools productively. More manager and less direct developer. But the person has to quickly and deeply understand the goals and the outputs. Otherwise, you just lose too much time in the cycles to really make it productive.

Kevin Buzzard (Feb 06 2026 at 16:31):

So the person we are looking for is expert in (1) Langlands (2) Lean and (3) AI as well as (4) not already snapped up by a tech company. Should we be looking to create these people? I don't have time to train anyone right now, I am already juggling two big projects. What is a practical vision for accelerating mathlib here?

Chris Birkbeck (Feb 06 2026 at 16:34):

Hire one postdoc that knows the math and another that knows lean, give them lots of Claude code credits and let them have at it :sweat_smile:

Matthew Ballard (Feb 06 2026 at 16:42):

I don't have a better practical answer that. Picking up facility with the tools will take less time than the other two but it still takes some time. Getting everyone in one place physically to start with some people having experience with the tools and conversant in lean and math will lower the activation energy.

Kim Morrison (Feb 06 2026 at 22:54):

How useful is PLAN.md?

If it's really true that we don't have that the Hopf antipode is an anti-homomorphism, and that is on the critical path, we should be getting someone on that. :-)

(I thought I proved this already, but perhaps only for Hopf algebra objects in an arbitrary category, not the concrete version. :rolling_on_the_floor_laughing:)

Yaël Dillies (Feb 07 2026 at 07:57):

@Kim Morrison, this fact is proved in Toric. Upstreaming is currently blocked on #32245. Maybe you can get someone to review that? :wink:

Kevin Buzzard (Feb 07 2026 at 16:01):

Kim Morrison said:

How useful is PLAN.md?

I've read the first paragraph

We work with affine group schemes that are:

  • Affine: Represented by a commutative Hopf algebra A over a field k
  • Finite type: A is finitely generated as a k-algebra
  • Smooth: A is geometrically reduced (equivalently, A ⊗ k̄ is reduced)

and I don't think that's the definition of smooth: e.g. C[X,Y]/(XY)\mathbb{C}[X,Y]/(XY) is reduced but the corresponding variety over C\mathbb{C} is the union of the lines X=0X=0 and Y=0Y=0 so is not reduced. Although perhaps for group schemes the notions are equivalent? But I would imagine that by now we have the definition of a smooth morphism of rings in mathlib, so probably we should stick to this?

Missing infrastructure: my memory was that people were working on functor of points associated to a Hopf algebra. @Yaël Dillies is this true?

Yaël Dillies (Feb 07 2026 at 17:14):

Yeah, we have this in Toric

Kevin Buzzard (Feb 08 2026 at 15:13):

Kim Morrison said:

How useful is PLAN.md?

I've had a proper look at it now, and it looks to me like a coherent plan. Parts of it (e.g. "we could do this instead which is right in char 0" or "we could just do GL_n, SL_n, tori first") I would just delete, because "it kind of works but it's not the right generality" is not something we want in mathlib.


Last updated: Feb 28 2026 at 14:05 UTC