Zulip Chat Archive

Stream: Lean for Scientists and Engineers 2024

Topic: Using LLMs to assist when needed

Eric Taucher (Jul 12 2024 at 15:55):

The title is not exact but needed something so can be changed if needed.

The goal of this is to share some LLM prompts with others to help them move along in the class. These are not meant to help students solve the problems, etc. but to help keep pace with the class when they get stuck or to possibly gain some insights of using Lean from reading the prompts and replies.

If the instructor Tyler Josephson asks for this topic to be removed that is OK, no need to explain.

This is not meant for just me so feel free to chime in, enhance and repost, share examples, etc.

Eric Taucher (Jul 12 2024 at 15:57):

Prompt (Claude)

Given the following word problem, create the Lean 4 prover code as a student exercise to be solved using single chain of expressions.

Use Lean example instead of theorem or lemma.
Include variable after keyword example in a format like example { x y : R}.
Use hypothesis in a format like (h1 : 10 = (a - b) ).
Put example and hypothesizes on separate lines.
Use Lean sorry instead of providing solution.
Use words for variable name with underscores.
Convert each statement to one hypothesis if possible.

 word problem
-- A plane flies 450 miles with the wind in 3 hours
-- Flying back against the wind, the plane takes 5 hours to make the trip
-- Prove that the plane's speed in still air would be 120 miles/hour

Reply

Here's the Lean 4 prover code set up as a student exercise based on the given word problem:

example { plane_speed wind_speed : ℝ}
(h1 : 450 = (plane_speed + wind_speed) * 3)
(h2 : 450 = (plane_speed - wind_speed) * 5)
: plane_speed = 120 :=
sorry

Would you like me to explain or break down the code?

Kevin Buzzard (Jul 12 2024 at 16:17):

Ask it how to prove it :-) The spacing is poor but I guess we can forgive that. The pro solution to how to prove it is

import Mathlib.Tactic

example { plane_speed wind_speed : ℝ}
    (h1 : 450 = (plane_speed + wind_speed) * 3)
    (h2 : 450 = (plane_speed - wind_speed) * 5) :
    plane_speed = 120 := by
  polyrith

Tyler Josephson ⚛️ (Jul 12 2024 at 16:20):

Great idea! Neat to see this thread. You're asking a question about "autoformalization", one task of which is "how to ask AI to translate natural language proofs into formal proofs," specifically, in your case "how to translate an informal theorem statement into a formal one?"

A different task is "how to get an AI to autocomplete a proof."

You'll probably find the Zulip channel https://leanprover.zulipchat.com/#narrow/stream/219941-Machine-Learning-for-Theorem-Proving to be interesting

Simon Beaumont (Jul 12 2024 at 16:20):

If it can come up with the most efficient tactic -- I'll buy it ;-) @Kevin Buzzard thanks for polyrith another one to add to my list...

Kevin Buzzard (Jul 12 2024 at 16:25):

Warning: polyrith only works if you're connected to the internet ;-) (so I can't use it on the London underground)

Tyler Josephson ⚛️ (Jul 12 2024 at 16:25):

Related to your prompt above - be careful about leaking aspects of the solution into the prompt. To me, it's not surprising that it types the speeds as \R and formats the hypotheses in that way, since you include those in the prompt. It's often interesting to see what style choices it makes on its own.

Or - if including `example { x y : Nat}' in the prompt, does it figure out that maybe \R would be a better choice for these properties?

Eric Taucher (Jul 12 2024 at 16:31):

Another use for LLMs is to understand words, phrases, etc.

Tyler Josephson ⚛️ said:

"autoformalization"

Prompt (Claude)

Explain `autoformalization` in the context of proof assistant.

Reply

Autoformalization in the context of proof assistants refers to the process of automatically converting informal mathematical statements or proofs into formal, machine-checkable representations that can be verified by a proof assistant.

Proof assistants are software tools that help mathematicians and computer scientists develop and verify formal proofs of mathematical theorems. They typically use a specific formal language to express mathematical concepts, definitions, and proofs. However, writing mathematics directly in these formal languages can be time-consuming and requires specialized knowledge.

Autoformalization aims to bridge the gap between the way mathematicians typically write proofs (in natural language with some mathematical notation) and the formal representations required by proof assistants. The key aspects of autoformalization include:

Natural language processing: Analyzing and interpreting informal mathematical text written in natural language.
Symbol recognition: Identifying and interpreting mathematical symbols and notation.
Semantic parsing: Understanding the logical structure and meaning of mathematical statements.
Formal language generation: Translating the interpreted content into the formal language used by the proof assistant.
Proof structure conversion: Transforming the logical flow of an informal proof into a sequence of formal proof steps.

The benefits of autoformalization include:

Increased accessibility: Allowing mathematicians to work with proof assistants without extensive training in formal methods.
Time-saving: Reducing the manual effort required to formalize mathematical content.
Error detection: Identifying ambiguities or gaps in informal proofs during the formalization process.
Knowledge transfer: Facilitating the conversion of existing mathematical literature into formal, machine-checkable formats.

While autoformalization is an active area of research, it remains a challenging problem due to the complexity and nuance of mathematical language. Current approaches often combine machine learning techniques with rule-based systems to achieve better results.

Would you like me to elaborate on any specific aspect of autoformalization or its applications in proof assistants?

Eric Taucher (Jul 12 2024 at 16:38):

Tyler Josephson ⚛️ said:

it's not surprising that it types the speeds as \R and formats the hypotheses in that way, since you include those in the prompt.

Thanks for the gloss note.

The version of the prompt posted here is based on several revisions that I started with GPT 4o. Because the prompt was re-edited and submitted and the context window had access to the entire conversation, the AI started to give me several hypothesis instead of the desired two, so I took the prompt you see posted here and posted it to Claude to see if the context was messing up the reply. Yes I know I could have started a new conversation on GPT 4o but was also interested in seeing if the prompt worked elsewhere.

Eric Taucher (Jul 12 2024 at 17:19):

Tyler Josephson ⚛️ said:

if including `example { x y : Nat}' in the prompt, does it figure out that maybe \R would be a better choice for these properties?

Prompt

Given the following word problem, create the Lean 4 prover code as a student exercise to be solved using single chain of expressions.

Use Lean example instead of theorem or lemma.
Include variable after keyword example in a format like example { x y : Nat}.
Use hypothesis in a format like (h1 : 10 = (a - b) ).
Put example and hypothesizes on separate lines.
Use Lean sorry instead of providing solution.
Use words for variable name with underscores.
Convert each statement to one hypothesis if possible.

-- A plane flies 450 miles with the wind in 3 hours
-- Flying back against the wind, the plane takes 5 hours to make the trip
-- Prove that the plane's speed in still air would be 120 miles/hour

Reply (Claude) - Does not change Nat to Real.

example { plane_speed wind_speed : Nat } (h1 : 450 = (plane_speed + wind_speed) * 3) (h2 : 450 = (plane_speed - wind_speed) * 5) : plane_speed = 120 := by sorry

Reply (ChatGPT 4o) - Does not change Nat to Real.

example { plane_speed wind_speed : Nat }
  (h1 : 450 = plane_speed + wind_speed * 3)
  (h2 : 450 = plane_speed - wind_speed * 5) :
  plane_speed = 120 :=
sorry

Reply (Mistral Codetral) - Does not change Nat to Real.

example { wind_speed : Nat } (h1 : 450 = 3 * (wind_speed + 120)) (h2 : 450 = 5 * (wind_speed - 120)) : 120 = wind_speed :=
sorry

Reply (Mistral Large) - Does not change Nat to Real.

example { plane_speed_with_wind plane_speed_against_wind plane_speed_in_still_air : Nat }
  (h1 : 450 = 3 * plane_speed_with_wind)
  (h2 : 450 = 5 * plane_speed_against_wind)
  (h3 : plane_speed_in_still_air = plane_speed_with_wind + plane_speed_against_wind)
  : plane_speed_in_still_air = 120 :=
sorry

Tyler Josephson ⚛️ (Jul 12 2024 at 17:55):

Great test! Exactly - you can see the models aren’t reflecting on / understanding the context of the task at hand. They’re matching the pattern without “thinking” about what they’re doing.

Eric Taucher (Jul 12 2024 at 19:52):

Having thought about this a bit, does anyone know if Lean has a type that is not specific that Lean can use like sorry to be filled in by the user later or that Lean can use type inferencing to resolve the type if the rest of the proof is sound?

My prefered programming language is Prolog, specifically SWI-Prolog so working without types when possible is quite intuitive for me. I am also quite familiar with some of the ML family of functional languages so quite use to using type inferencing.

Eric Taucher (Jul 12 2024 at 20:02):

Prompt (ChatGPT 4o)

Does Lean have a type that is not specific that Lean can use like sorry to be filled in by the user later or that Lean can use type inferencing to resolve the type if the rest of the proof is sound?

Reply (Be mindful of possible hallucinations)

Yes, Lean has a mechanism for dealing with unspecified types that can be filled in later or inferred. Here are some tools that Lean provides for such scenarios:

_ (underscore): Lean can use an underscore to denote a placeholder for a type or term that Lean's type inference mechanism will attempt to fill in. This can be used both in types and in terms.

Example:
lean def example (x : _) := x + 1
Here, Lean will infer the type of x based on the context.
sorry: This keyword is used to temporarily skip a proof obligation. When you use sorry, Lean will accept the proof as correct, but with a warning that it is incomplete. This is useful for incrementally developing a proof.

Example:
lean theorem my_theorem (x : ℕ) : x + 1 > x := sorry
Implicit arguments: Lean allows you to define functions with implicit arguments that can be inferred from the context.

Example:
lean def add {α : Type} [has_add α] (a b : α) : α := a + b
Here, α and the instance of has_add α are implicit and will be inferred from the usage context.
Type classes: Lean's type class mechanism can be used to automatically infer types and instances based on the context.

Example:
lean def add_monoid {α : Type} [add_monoid α] (a b : α) : α := a + b
In this example, Lean will infer the type α and the corresponding instance of add_monoid based on how the function is used.

These tools make it easier to write generic and reusable code in Lean by allowing the type inference mechanism to handle unspecified types and terms.

Since I do not know which may be valid or a hallucination will have to verify each part.

Eric Taucher (Jul 12 2024 at 20:42):

Tyler Josephson ⚛️ said:

if including `example { x y : Nat}' in the prompt, does it figure out that maybe \R would be a better choice for these properties?

However changing the type from R to _ the LLMs are setting the type to Nat when generating the code for this example.

Modified part of prompt

* Include variable after keyword `example` in a format like `example { x y : _ }`.

Reply

example { wind_speed : Nat, plane_speed : Nat }
  (h1 : 450 = 3 * (plane_speed + wind_speed))
  (h2 : 450 = 5 * (plane_speed - wind_speed))
: plane_speed = 120 :=
sorry

Eric Taucher (Jul 12 2024 at 20:47):

Eric Taucher said:

_ (underscore): Lean can use an underscore to denote a placeholder for a type or term that Lean's type inference mechanism will attempt to fill in.

See: 2.2. Types as Objects

Equivalently, we can write Type _ or Type* to avoid giving the arbitrary universe a name:

constant α : Type _
#check α

constant β : Type*
#check β

Eric Taucher (Jul 12 2024 at 20:56):

Also tried modifying the prompt to use Type* but the LLMs would always insert a bound type for the generated code, some use R, some used Nat. If someone knows how to modify the prompt to insertType* in the generated examples, please post here.

Tyler Josephson ⚛️ (Jul 12 2024 at 23:52):

While yes, Lean has mechanisms for type inference (they’re invoked implicitly in this example to infer the type of “450” - beyond this, though, I’m not an expert in how they work), that’s still a syntactic question, not a semantic question. My only reason for preferring R over Nat is my understanding of the context of the problem - speeds of airplanes and winds are often values in between the Nats (4.5 for example), so R or at least Q would be more appropriate than Nat.

I could write the theorem statement with Nat or with R or with Q and the ensuing proof would remain valid. But someone else reading this might object “is velocity really a natural number?”

Eric Taucher (Jul 13 2024 at 12:00):

Was reading this by Terence Tao, amazing how even new users to Lean will see it is not just them that encounter similar situations.

Journaling my Lean learning process: Oct 9 2023, Oct 10 2023, Oct 11 2023, Oct 12 2023, Oct 14 2023 part 1, Oct 14 2023 part 2, Oct 14 2023 part 3, Oct 16 2023, Oct 17 2023.

One gem in reading was Lean phrasebook
I don't know if he is updating it but it is something of value that I hope gets expanded or others can share similar data, e.g. Mathematical English and Lean equivalent

Eric Taucher (Jul 13 2024 at 12:07):

The other section of interest related to this topic from Terrence's blog

AI, proof assistants, and other machine assistants:

You will have to scroll down to find the section, no direct link. Same as with Lean, new users to LLMs will see it is not just them that encounter similar situations.

Last updated: May 02 2025 at 03:31 UTC