Zulip Chat Archive

Stream: general

Topic: words


Jeremy Avigad (Jan 14 2020 at 03:25):

In his talk in the Formal Abstracts session of the meeting at Carnegie Mellon last week, Tom Hales pointed out that for some purposes, it may be better to use the word "or" instead of the symbol . The former is not harder to type than the latter, it is familiar, it is easy to read, and everyone knows what it means.

For fun, I took Chris Hughes' proof of quadratic reciprocity in the library and replaced all the logical connectives with the words "not" "and" "or" "implies" "iff" "exists" and "forall". The result is attached.:

quadratic_reciprocity_with_words.lean

The theorem

lemma euler_criterion_units {x : units (zmodp p hp)} :
  (∃ y : units (zmodp p hp), y ^ 2 = x) ↔ x ^ (p / 2) = 1

becomes

lemma euler_criterion {a : zmodp p hp} (ha : a ≠ 0) :
  (exists y : zmodp p hp, y ^ 2 = a) iff a ^ (p / 2) = 1

and the line

have he : ∀ {x}, x ∈ Ico 1 (p / 2).succ → x ≠ 0 ∧ x ≤ p / 2

becomes

have he : forall {x}, x ∈ Ico 1 (p / 2).succ implies x ≠ 0 and x ≤ p / 2

Call me crazy, but I kind of like the text versions.

Enrico Borba (Jan 14 2020 at 04:36):

Some thoughts:

  • In symbol-heavy statements, the words and/not/etc, will stand out more, and make the reading of the theorem much easier.
  • Similarly, in word-heavy statements, the symbolic forms of the connectives will likely be more readable than having even more text
  • I think it fits well with Lean supporting many different "styles" for statements & proofs

Johan Commelin (Jan 14 2020 at 05:52):

@Enrico Borba But maybe in the word-heavy statements some subtle syntax highlighting could make the connectives stand out, even if they are in text form.
Also, in my field of mathematics word-heavy seems to be preferred. I never see the \not symbol. I sometimes see \and and \or, but it's usually words. The quantifier-symbols are used, but I would still say less than their word-variants.

Enrico Borba (Jan 14 2020 at 05:59):

I don't mean the second point as a downside of having "word" form of the connectives. I mean it as a support for both versions to exist. Syntax highlighting will definitely help resolving any ambiguity for sure.

Kevin Buzzard (Jan 14 2020 at 08:12):

My experience is the same as Johan's. I never see the symbols for not, and or or. Hales quoted Knuth as saying that in mathematical prose symbols should be avoided, and looking at the maths papers I read I see a lot of it is in prose. This is one reason to be very excited about Tom's project with Koepke. Their natural language examples looked very natural to me. Tom's work will make Jeremy's examples even better. Ideally we want Euler's criterion to literally look like this: "Let aa be a non-zero element of Z/pZ\mathbb{Z}/p\mathbb{Z}. Then there exists yZ/pZy\in\mathbb{Z}/p\mathbb{Z} such that y2=ay^2=a if and only if ap/2=1a^{p/2}=1."

Kevin Buzzard (Jan 14 2020 at 08:12):

eew that p/2p/2 looks so ugly ;-) I would always write a(p1)/2=1a^{(p-1)/2}=1...

Patrick Massot (Jan 14 2020 at 09:10):

I think we can all agree that almost every mathematicians simply don't have symbols for "and", "or" and "not". And we indeed write a lot of sentences. But we don't write formulas with quantifiers as words. Jeremy's statement of the Euler criterion looks nothing like mathematics to me. I would either write a full sentence like Kevin did or a formula using symbols for quantifiers and equivalences. Having a controlled natural language parser for Lean would be something interesting. But until that happens, please please don't give us ugly Coq-like formulas.

Johan Commelin (Jan 14 2020 at 09:21):

We do write things like

U(g1,,gnf)={xf(x)0 and gi(x)f(x) for all i}U(\frac{g_1, \ldots, g_n}{f}) = \{ x \mid f(x) \ne 0 \text{ and } |g_i(x)| \le |f(x)| \text{ for all } i\}

Johan Commelin (Jan 14 2020 at 09:23):

At least, I see that quite often. (Especially putting quantifiers at the end of a sentence :rolling_on_the_floor_laughing:)

Kevin Buzzard (Jan 14 2020 at 09:28):

Johan -- the original version of that definition had gi(x)f(x)0|g_i(x)|\leq|f(x)|\not=0 and I initially mis-formalised it when doing the perfectoid project! If the index set of ii's was empty then f(x)0f(x)\not=0 was omitted. I only noticed when I started formalising lemmas about the sets ;-) Mathematicians are sloppy!

Chris B (Jan 14 2020 at 09:43):

I think this kind of stuff that lies right at the contact point between the user and the computer is mad important, but if you surveyed 50 Lean users about which language items they prefer as symbols/words I think you'd get 50 different responses. It would be nice if you could tell the editor extension which representation you prefer for a given language item. There are VSCode extensions that have features which can overlay text in the buffer you're working on without actually changing the source code but I don't know how intrusive those are on the implementation side.

Sebastian Reichelt (Jan 15 2020 at 00:57):

I hope it's OK if I chime in, as I'm doing something that's similar to natural-language input for formal math. In my alternative approach, the user selects the natural-language content from menus, so the text never has to be parsed because it was already formal internally. As a result, the user doesn't have quite as much choice, but here is how Euclid's criterion turned out (I hope I didn't mess it up): https://slate-prover.herokuapp.com/libraries/hlm/Algebra/Fields/Prime%20fields/Eulers%20criterion
To get an idea how this is entered, you can click the "edit" button or create a new definition or theorem. But a lot of things are not finished yet, including proof input.
The reason I think this "reverse approach" works better than plain natural-language input is that with the latter, it is not easy for an inexperienced user to find out what the system will or will not accept.

Jeremy Avigad (Jan 15 2020 at 15:30):

Nobody is claiming that the examples I posted look like ordinary mathematics, but I found it striking that the trivial syntactic substitutions made the expressions easier and more pleasant for me to read, despite the fact that, as a logician, I have been steeped in logical notation every day of my adult life.

Of course, readability is a subjective thing. We'd have to resort to polls and psychological experiments for anything like hard data. But Tom Hales, Peter Koepke, and all the people before them who have emphasized the importance of natural language are right that readability is important to make the technology attractive.

By the way, in his talk, Tom Hales mentioned the "harder direction," translating formal expressions into something approximating natural language. Many years ago, an MS student of mine, Steve Kieffer, did that:
http://www.andrew.cmu.edu/user/avigad/Papers/mkm/index.html
http://www.andrew.cmu.edu/user/avigad/Papers/mkm/munkres.pdf
Tom is right that the output text does have a run-on computer-generated feel, but I think is is surprisingly reasonable. (The natural-language component wasn't even in Steve's MS thesis, which had only the formal expressions. We generated them afterwards using some very simple heuristics.)

I was happy that Steve came to the meeting last week. He is currently working on graphical tools for exploring / teaching mathematics (https://royalroadmath.org/newpfsc.html). It would be nice to have something like that to explore mathlib, perfectoids, etc.

Johan Commelin (Jan 15 2020 at 16:19):

Thanks for the pointer to that file by Steve Kieffer. I agree that there is too much stuff in the output text to make it feel "natural". But I guess using typing information could certainly improve that.

Johan Commelin (Jan 15 2020 at 16:19):

As well natural language analogues of implicit variables and type classes.

Kevin Buzzard (Jan 15 2020 at 16:21):

Can somebody tell me why I want to write k[X₁,X₂,…Xₙ] and I have to write mv_polynomial (fin n) k, and why I want to write kⁿ and I have to write fin n → k, in my algebraic geometry course? Is this to do with words, or is notation something else?

Kevin Buzzard (Jan 15 2020 at 16:22):

k[X₁,X₂,…Xₙ] and kⁿ are tokens which I would happily use in sentences in a maths paper -- in prose, I mean.

Chris Hughes (Jan 15 2020 at 16:24):

The parser doesn't know that has anything to do with n I guess.

Kevin Buzzard (Jan 15 2020 at 16:25):

Is this fixed in Lean 4? My readers are pretty good at spotting this.

Johan Commelin (Jan 15 2020 at 17:50):

@Kevin Buzzard But you don't type kⁿ into LaTeX either.

Johan Commelin (Jan 15 2020 at 17:50):

So maybe there could be some intermediate form... which is exactly what Hales and Koepke are going for, I think.

Sebastian Ullrich (Jan 17 2020 at 10:29):

The parser doesn't know that has anything to do with n I guess.

Yes, that is a first issue. We can tell both Lean 3 (I think) and Lean 4 to interpret xⁿ (for any x, but specifically for ) as x ^ n, though it should be a bit simpler to implement in Lean 4. And then of course we can overload ^ to mean a function space if the LHS is a type and hope that Lean can disambiguate it. Perhaps not a great idea in general, but acceptable in specific contexts?

Mario Carneiro (Jan 17 2020 at 10:31):

what happens with name resolution here?

Mario Carneiro (Jan 17 2020 at 10:31):

(does lean have "hygienic notation"?)

Sebastian Ullrich (Jan 17 2020 at 11:25):

notation is always hygienic in Lean 3 & Lean 4:

postfix `ⁿ`:100 := λ x, x ^ n

does not work because n is unbound. However, every good hygienic macro system also allows hygiene-bending operations, which in Lean 4 you can use like

macro (x : term) "ⁿ":100 => `(x ^ $(mkTermId `n))

Now n will be resolved in the caller's scope, not the macro declaration's scope.

Kevin Buzzard (Jan 17 2020 at 16:54):

In this part of my course, n is fixed once and for all, so I would at this point be happy if I could just have a token kⁿ meaning n -> k

Sebastian Ullrich (Jan 19 2020 at 21:27):

@Kevin Buzzard If n and k are both variable/parameters, notation `kⁿ` := n -> k should work in Lean 3, no?

Kevin Buzzard (Jan 19 2020 at 21:38):

invalid notation declaration, contains reference to local variables

Sebastian Ullrich (Jan 19 2020 at 21:39):

Ah, should be local notation I assume?

Kevin Buzzard (Jan 19 2020 at 21:40):

Oh this is great :D

Sebastian Ullrich (Jan 19 2020 at 21:43):

Just not very general or reusable :) . But for use in a single file it's probably fine, yes.

Sebastian Ullrich (Jan 19 2020 at 21:46):

(fun fact: is a valid character in a name in Lean 3. Other superscripts have been removed at some point, I removed as well now in Lean 4 so that the macro above can indeed work: https://github.com/leanprover/lean4/blob/master/tests/lean/run/kevin.lean#L6)


Last updated: Dec 20 2023 at 11:08 UTC