Zulip Chat Archive
Stream: new members
Topic: Intro / First PR
Knala Solomon (Jun 29 2025 at 17:39):
Hi, I'm Knala. I'm new here and just wanted to pop in and say hi, as I have made my first PR to mathlib. Inspired by a TODO I saw in the docs, I defined the von Neumann hierarchy in ZFC, and connected my definition with ZFSet.rank. File here. I would appreciate any feedback on code or style if anyone has capacity to offer, as I did not have much experience with lean prior to this project.
Kenny Lau (Jun 29 2025 at 17:49):
Wow, the code looks really neat! Well done!
Kenny Lau (Jun 29 2025 at 17:50):
Next time please link to the PR as well! #26518
Knala Solomon (Jun 29 2025 at 17:52):
thanks!
Knala Solomon (Jun 29 2025 at 17:52):
Unfortunately, I am seeing now that my PR did not pass CI :smiling_face_with_tear:. it seems to be because I was not able to run lake exe mk_all on my own machine before pushing. Here is the error message I get
dyld: Symbol not found: __ZNKSt3__115basic_stringbufIcNS_11char_traitsIcEENS_9allocatorIcEEE3strEv
Referenced from: $HOME/.elan/toolchains/leanprover--lean4---v4.21.0-rc3/bin/../lib/libLLVM.dylib (which was built for Mac OS X 10.19)
Expected in: /usr/lib/libc++.1.dylib
error: external command '$HOME/.elan/toolchains/leanprover--lean4---v4.21.0-rc3/bin/clang' exited with code 134
The error appears not when trying to exe my new file, but rather on a series of Mathlib files I have not modified. I suspect that it may have to do with my OS version (I am using mac OS 10.15 Catalina, which according what I can see is sufficent for lake), but I am a little confused as I am not sure that "Mac OS X 10.19" even exists...
Kenny Lau (Jun 29 2025 at 17:55):
What do you mean "trying to exe my new file"? You should just run lake exe mk_all once in the main directory (the directory containing .gitignore and all the other relevant files)
Knala Solomon (Jun 29 2025 at 17:58):
that is what i did. I just meant that the error came up before lake reached the file I modified.
Kenny Lau (Jun 29 2025 at 18:02):
@Knala Solomon this is not really recommended, but you can go to Mathlib.lean (which is sorted alphabetically) and manually add your file to the list.
Knala Solomon (Jun 29 2025 at 18:07):
do you have any idea what would be causing the dyld error?
Kenny Lau (Jun 29 2025 at 18:09):
I've never seen that, sorry; maybe you can try asking GPT
Weiyi Wang (Jun 29 2025 at 18:13):
Kenny Lau said:
Knala Solomon this is not really recommended, but you can go to
Mathlib.lean(which is sorted alphabetically) and manually add your file to the list.
wait... you are not supposed to manually edit Mathlib.lean? I have done that for at least three merged PRs :sweat_smile:
Ruben Van de Velde (Jun 29 2025 at 18:16):
It's not forbidden :)
Aaron Liu (Jun 29 2025 at 18:28):
The intended course of action is to run lake exe mk_all
Knala Solomon (Jun 29 2025 at 20:26):
i got it to work!
Kenny Lau (Jun 29 2025 at 20:29):
@Knala Solomon how did you fix it?
Kenny Lau (Jun 29 2025 at 20:59):
@Knala Solomon (btw if you're wondering why there's been nothing, I'm still reviewing your PR)
Knala Solomon (Jun 29 2025 at 20:59):
I set my DYLD_LIBRARY_PATH to pull from the Homebrew version of llvm rather than the system one
Knala Solomon (Jun 29 2025 at 21:00):
I'm patient. I imagine there are lots of PR's for ppl to review.
Kenny Lau (Jun 29 2025 at 21:03):
no i'm actually only reviewing your PR
Knala Solomon (Jun 29 2025 at 21:03):
cool, thanks!
Kenny Lau (Jun 29 2025 at 22:05):
@Knala Solomon I left a big review :slight_smile: Feel free to ask me questions, but I'll probably take a break for now.
Kenny Lau (Jun 30 2025 at 00:09):
@Knala Solomon I've added the "awaiting-author" tag because you seem to be editing the file. Feel free to remove the tag after you've finished editing, and then I'll review further.
Knala Solomon (Jun 30 2025 at 08:21):
thanks again for your review. I do have a some questions whenever you get a chance:
- is it preferred for mathlib that everything be golfed down to its shortest form?
- is it preferred to use simp for as much as possible? if so, why? that seems to obfuscate the proof.
- can you elaborate on constructivism?
Michael Rothgang (Jun 30 2025 at 08:29):
CC @Damiano Testa he PR summary comment job fails on this PR with an error "needs merge": is this expected (or a known issue)?
Michael Rothgang (Jun 30 2025 at 08:30):
These two red crosses don't seem related to your pull request: for the time being, ignoring them is safe.
Ruben Van de Velde (Jun 30 2025 at 08:33):
- no, but we prefer not making proofs longer than is useful. The limit to that is often up to personal opinion :)
Damiano Testa (Jun 30 2025 at 08:34):
Besides the error that suggests to merge, there is also a merge conflict, so possibly resolving that would make the action work?
Knala Solomon (Jun 30 2025 at 08:34):
i'm also actively editing the code so the pr is not ready to move forward atm
Michael Rothgang (Jun 30 2025 at 09:31):
My take on "how much golfing", which I believe is somewhat shared, is the following:
- we want code to be maintainable going forward: absolute golfing down to a soup of ASCII characters (as you see in some code golf competitions on the internet) is not a goal, because editing that is a mess
- unnecessarily long code also means extra maintenance work: if there's a more succinct, equally nice proof, we prefer that.
- if a proof has "almost no mathematical content", it's fine to golf that. This last item requires a judgement call.
Kenny Lau (Jun 30 2025 at 09:34):
Knala Solomon said:
thanks again for your review. I do have a some questions whenever you get a chance:
- is it preferred for mathlib that everything be golfed down to its shortest form?
- is it preferred to use simp for as much as possible? if so, why? that seems to obfuscate the proof.
- can you elaborate on constructivism?
- I think length serves as a good indicator of how much "stuff" there is in a given proof, or in other words the complexity of a proof. Golfing it down helps to make main steps of the proof more clear. For example,
/-- The stages of the von Neumann hierarchy are transitive as sets,
meaning that every element of a stage is a subset of it. -/
@[simp] theorem isTransitive_stage (o : Ordinal.{u}) : (stage o).IsTransitive := by
induction o using Ordinal.limitRecOn
case zero => simp
case succ a ih => simpa using ih.powerset
case isLimit o olim ih => simp_all [IsTransitive.sUnion']
If I see this proof, then I can immediately know that the main step is an induction, and the substeps are "trivial" (i.e. just simp).
- Again, it helps to clear out the obvious parts and makes the structure clearer.
- So, in short, constructivism is classical logic but without the law of excluded middle (
P or not P) and without double negation elimination ((not not P) implies P), and is actually the core of Lean's logic (i.e. using type theory with inductive types).
The story of type theory representing proofs is that we use inductive types and somehow everything works out to be constructivist logic. Recall the very famous example Nat where we have the principle of induction (which is why they're called inductive types):
We have these two axiomatic constructors of Nat (i.e. "how to build a natural number"):
zero : Natsucc : Nat -> Nat
When I say "axiomatic", I mean that these are basically declared without any proof, as in, I just declare that there is a way to construct the natural number called "zero", and then given a natural number, I declare that you can just form its successor called "succ".
And then implicitly we have a "smallness" constraint, i.e. saying that "every natural number is formed only using those constructors". How do we say this? One way to translate it is to say:
If any set S of natural numbers contains 0, and satisfies the extra requirement that "whenever n is in S, n+1 is also in S", then S has to contain all the natural numbers.
So now in Lean:
import Mathlib
axiom nat : Type
axiom zero : nat
axiom succ : nat → nat
axiom nat.induction : ∀ S : Set nat, (zero ∈ S) → (∀ n ∈ S, succ n ∈ S) → (∀ n, n ∈ S)
But also recall that in Mathlib, Set nat just means nat -> Prop, and then when you do that step, you've basically recovered Lean's Nat.rec:
Nat.rec.{u} {motive : ℕ → Sort u} (zero : motive Nat.zero) (succ : (n : ℕ) → motive n → motive n.succ) (t : ℕ) :
motive t
So, in summary, an inductive type has axiomatic constructors, and then a "recursion principle" that has one argument per constructor.
Now here are the objects you'll see in Prop and their induction principles:
- True:
- One constructor
intro : True - You basically don't need induction.
- One constructor
- False:
- No constructor (you can't make False)
False.rec.{u} : (motive : False → Sort u) → (t : False) → motive t(no constructors, so no additional arguments)
- And:
- Constructor:
intro : p → q → p ∧ q - Recursor:
rec : (p → q → r) → ((p ∧ q) → r)
- Constructor:
- Or:
- Constructors:
left : p → p ∨ q,right : q → p ∨ q - Recursor:
rec : (p → r) → (q → r) → (p ∨ q → r)
- Constructors:
(Not is not an atom, ¬p is just shorthand for p → False.)
So from these axioms built from the theory of "inductive types", you'll hopefully be convinced that there's actually no way to prove p or not p or (not not p) -> p. Basically, why it's called constructivism is because to prove p or q, you essentially have to either supply a proof of p or a proof of q, so you have to construct something, rather than doing a proof by contradiction.
So the way to get around this is that you have an axiom Classical.em : p or not p (it is not an actual axiom but it follows from another axiom, this is just a technicality), from which you can also deduce (not not p) -> p, which let me stress again, you're free to use this as many times as you want in Mathlib.
But the point I'm trying to make here is that Lean is more "inherently" constructivist, so it's easier to use a statement that looks like p → q than a statement that looks like ¬q → ¬p, which I call "you added extra negations for no reason"; and it's easier to use a statement that looks like "induction" (which should have no "not"s anywhere) rather than a statement that "eliminates possibilities".
So, the example is ZFSet's induction:
/-- Induction on the `∈` relation. -/
@[elab_as_elim]
theorem inductionOn {p : ZFSet → Prop} (x) (h : ∀ x, (∀ y ∈ x, p y) → p x) : p x
You can see how "neat" this is because it does not have any unnecessary negations.
The mathematician's formulation of this might instead look like:
import Mathlib
@[elab_as_elim]
theorem ZFSet.Mathematician's_induction {p : ZFSet.{u} → Prop} (h : ¬∀ x, p x) :
∃ x, ¬ p x ∧ ¬ ∃ y ∈ x, ¬ p y := by
by_contra h₁; refine h fun x ↦ ?_
induction x using inductionOn with
| h x ih =>
by_contra hp
exact h₁ ⟨x, hp, fun ⟨y, hyx, hy⟩ ↦ hy <| ih y hyx⟩
(which you can check that it fails to compile! because Lean really doesn't see this as a sort of induction!)
So, in summary, when we state "induction principles", we prefer a formulation without negations that builds "from the ground up", rather than a formulation with negations that "negates possibilities from top to bottom", and that's the core idea of constructivism. (again lemme stress that Classical is not forbidden in Mathlib at all)
Michael Rothgang (Jun 30 2025 at 09:43):
is it preferred to use simp for as much as possible? if so, why? that seems to obfuscate the proof.
If you're reading that file, you can always use simp? to find out which lemmas were used.
In practice, I find that many simp steps would translate to "we simplify the term in a standard way" on paper. This is not where a "new mathematical idea" is hiding --- and if you want to see more detail, simp? already helps.
This is my personal opinion, yours may be different!
Kenny Lau (Jun 30 2025 at 09:51):
I agree with the given translation here.
Junyan Xu (Jun 30 2025 at 12:16):
Note that there's already #17027; it helps to search for open pull requests
(I'd like to cc @Violeta Hernández but she's not subscribed to this channel)
Kenny Lau (Jun 30 2025 at 12:19):
I don't know what to say.
Kenny Lau (Jun 30 2025 at 12:20):
I think the membership theorems there would be a good addition to this PR:
theorem vonNeumann_mem_of_lt {a b : Ordinal} (h : a < b) : V_ a ∈ V_ b
theorem mem_vonNeumann {o : Ordinal} {x : ZFSet} : x ∈ V_ o ↔ rank x < o
as well as:
theorem iUnion_vonNeumann : ⋃ o, (V_ o : Class) = Class.univ
In addition, I think naming it vonNeumann with the notation V_ is also a good idea.
Kenny Lau (Jun 30 2025 at 12:21):
vonNeumann is a better name than stage, and V_ avoids the complaint that the name is too long to type.
Kenny Lau (Jun 30 2025 at 12:21):
@Knala Solomon would you like to add these to your PR?
Kenny Lau (Jun 30 2025 at 12:22):
Also these two:
theorem vonNeumann_mem_vonNeumann_iff {a b : Ordinal} : V_ a ∈ V_ b ↔ a < b := by
theorem vonNeumann_subset_vonNeumann_iff {a b : Ordinal} : V_ a ⊆ V_ b ↔ a ≤ b := by
Junyan Xu (Jun 30 2025 at 12:23):
I've request a review from Violeta on GitHub, and hopefully you can then coordinate to decide what to keep from each PR :)
Kenny Lau (Jun 30 2025 at 12:26):
I propose we move to a new thread.
Last updated: Dec 20 2025 at 21:32 UTC