Zulip Chat Archive

Stream: general

Topic: parsing question about `∥x∥`


view this post on Zulip Kevin Buzzard (Feb 17 2021 at 23:40):

I am surprised about the following:

import analysis.normed_space.basic

example (α : Type) [has_norm α] [has_add α] (x y : α) :
  x + y  max x y := sorry -- doesn't compile :-(

The error is

type mismatch at application
  ∥x + y∥ ≤ max
term
  max
has type
  ?m_1 → ?m_1 → ?m_1 : Type ?
but is expected to have type
  ℝ : Type
-/

In analysis.normed_space.basic we have this:

class has_norm (α : Type*) := (norm : α  )

export has_norm (norm)

notation `∥`:1024 e:1 `∥`:1 := norm e

but yet if I #print notation \|| I get

`∥`:1 _:1 `∥`:1 := has_norm.norm #0

In fact even the following file doesn't make the binding power go above 1:

def norm : Type := sorry

notation `∥`:1024 e:1 `∥`:1 := norm e

#print notation  -- `∥`:1 _:1 `∥`:1 := norm #0

What have I misunderstood here?

view this post on Zulip Kevin Buzzard (Feb 17 2021 at 23:42):

def norm : Type := sorry

notation `∥`:512 e:37 `∥`:999 := norm e

#print notation  -- `∥`:999 _:37 `∥`:999 := norm #0

Is this to be expected? I can't have the first and second \|| with different binding power? This is clearly not what the authors of analysis.normed_space.basic were expecting.

view this post on Zulip Eric Wieser (Feb 17 2021 at 23:46):

I think binding power is global per-token?

view this post on Zulip Kevin Buzzard (Feb 17 2021 at 23:47):

But ∥x + y∥ ≤ max ∥x∥ ∥y∥ is a set phrase in nonarchimedean geometry! Can we not get it to parse? ∥x + y∥ ≤ (max ∥x∥ ∥y∥) is the current workaround.

view this post on Zulip Eric Wieser (Feb 17 2021 at 23:50):

How does parsing fail if you use 1024 for both priorities?

view this post on Zulip Kevin Buzzard (Feb 17 2021 at 23:50):

`⌊`:1024 _:1 `⌋`:1 := floor #0 is the correct choice, so it seems; it's just that we are bitten by both sides being the same token.

view this post on Zulip Kevin Buzzard (Feb 17 2021 at 23:51):

with 1024 on both I get two invalid expression errors, it's hard to tell what's going wrong.

view this post on Zulip Kevin Buzzard (Feb 17 2021 at 23:52):

one red squiggle under \le, the other under ||y||

view this post on Zulip Mario Carneiro (Feb 18 2021 at 00:01):

Yeah, bracketing operators that are the same on both sides are nasty to parse, for example |x+|+f|y|+z|+z|

view this post on Zulip Mario Carneiro (Feb 18 2021 at 00:03):

I think there are vertical bar tokens in unicode with little left/right ticks on them that are an okay compromise IMO

view this post on Zulip Kevin Buzzard (Feb 18 2021 at 00:09):

Mario Carneiro said:

Yeah, bracketing operators that are the same on both sides are nasty to parse, for example |x+|+f|y|+z|+z|

You just need good spacing :-)

view this post on Zulip Kevin Buzzard (Feb 18 2021 at 00:12):

@Patrick Massot @Heather Macbeth @Sebastien Gouezel @Yury G. Kudryashov did you never have problems with parsing and \||? If not then you are presumably not too interested in the idea of replacing it with some variant of lfloor and rfloor (or any other two different tokens). What you have here in analysis.normed_space.basic is being ignored by the parser -- the 1024 is apparently overruled by the 1, which is quite a big difference! I'm surprised this issue didn't show up sooner. Or maybe it did and you know about it...

view this post on Zulip Eric Wieser (Feb 18 2021 at 00:26):

How about the "new" tokens (∥x + y∥) - I hear parts of mathlib already use those /s

view this post on Zulip Mario Carneiro (Feb 18 2021 at 00:31):

that also works here, if you write ∥x + y∥ ≤ max (∥x∥) (∥y∥)

view this post on Zulip Mario Carneiro (Feb 18 2021 at 00:35):

Suggestion: ⸡x⸠

view this post on Zulip Mario Carneiro (Feb 18 2021 at 00:36):

Note that unicode calls the left and the right, but I think it makes more sense with the quills pointing out rather than in

view this post on Zulip Eric Wieser (Feb 18 2021 at 00:50):

That's "⸡x⸠" for android users

view this post on Zulip Yury G. Kudryashov (Feb 18 2021 at 00:52):

You've pasted the same symbol, so we see the same symbol.

view this post on Zulip Yury G. Kudryashov (Feb 18 2021 at 00:53):

@Kevin Buzzard Sometimes I see parsing errors and add parentheses.

view this post on Zulip Alex J. Best (Feb 18 2021 at 01:27):

How about a subscript?

import data.real.basic
class has_norm (α : Type*) := (norm : α  )

export has_norm (norm)

notation `∥`:1024 e:1 `∥ₙ`:1 := norm e

example (α : Type) [has_norm α] [has_add α] (x y : α) :
  x + y∥ₙ  max x∥ₙ y∥ₙ := sorry

seems to work fine and looks way more mathematically reasonable than ⸡x⸠

view this post on Zulip Alex J. Best (Feb 18 2021 at 01:28):

( for norm :smile: )

view this post on Zulip Mario Carneiro (Feb 18 2021 at 01:32):

What's mathematically unreasonable about it? I think the quills are almost invisible, just barely big enough that you can tell which is which. Just look at them as vertical bars with an orientation

view this post on Zulip Mario Carneiro (Feb 18 2021 at 01:32):

FYI I got the idea from mathematica, which uses characters that look essentially like this when you use the absolute value notation

view this post on Zulip Mario Carneiro (Feb 18 2021 at 01:33):

https://reference.wolfram.com/language/ref/character/LeftBracketingBar.html

view this post on Zulip Mario Carneiro (Feb 18 2021 at 01:34):

Drawn in monospaced fonts with a small left‐pointing tee to indicate direction.

I guess they draw it as a plain vertical bar in "presentation mode"

view this post on Zulip Alex J. Best (Feb 18 2021 at 01:44):

Mario Carneiro said:

What's mathematically unreasonable about it? I think the quills are almost invisible, just barely big enough that you can tell which is which. Just look at them as vertical bars with an orientation

Well I've never seen that character used in a math paper and if I saw it out of context I would have less idea what it meant than ∥y∥ₙ, and I assume most mathematicians would feel the same. A big strength of lean for me is that I can show mathematicians who don't know lean formalized statements and quite often they can understand them, I think using less "weird" symbols helps with this psychologically.

view this post on Zulip Alex J. Best (Feb 18 2021 at 01:46):

Alex J. Best said:

( for norm :smile: )

Actually I guess one issue with n is that it might be interpreted as "some natural number"

view this post on Zulip Mario Carneiro (Feb 18 2021 at 03:03):

I would hope that the context makes it clear:

example (α : Type) [has_norm α] [has_add α] (x y : α) : x + y  max x y := sorry

Frankly I think there are much worse offenders when it comes to the set of unicode characters we use, like all the arrow variations. This is a vertical bar variation, and it's maybe not obvious what the variation means but that's going to be true for any notation - at some point you have to actually look it up or guess based on the names of theorems about it. I don't like a subscript n because then it looks parameterized, like a p-norm or something

view this post on Zulip Mario Carneiro (Feb 18 2021 at 03:05):

I think being able to visually see the difference between the left and right delimiters is also useful for human consumers. I think the mathematica design is telling: they use the plain bars for presentation but make the left/right thing visible in the monospace font so that authors don't confuse the two

view this post on Zulip Heather Macbeth (Feb 18 2021 at 03:15):

Personally, I'd rather keep the current notation, and just add parentheses when Lean can't figure it out . . . I really value that Lean code is semi-readable to outsiders, I think any step away from standard math notation should be very carefully weighed.

view this post on Zulip Heather Macbeth (Feb 18 2021 at 03:18):

@Alex J. Best's suggestion just shows up to me as blank boxes ... maybe I don't have the right fonts installed, but again, half our newcomer readers also won't?

view this post on Zulip Alex J. Best (Feb 18 2021 at 03:21):

That's very strange, they show as boxes for me too on Android (as does Mario's suggestion) but apart from the subscript I just copy pasted the usual norm bars from Kevin's post which do show on my phone. So I have no idea how those characters could have ended up different

view this post on Zulip Johan Commelin (Feb 18 2021 at 04:23):

will Lean4's parser be able to deal with the problem better? I got the impression that it's quite a bit more powerful. If so, I wouldn't refactor the notation just to switch back in half a year.

view this post on Zulip Sebastien Gouezel (Feb 18 2021 at 07:12):

I'm on the same line as Heather and Yury here. We know that the current notation works fine if we are ready to add parentheses when it doesn't :-) And it has the very valuable advantage of being self-explanatory. So I wouldn't change it.

view this post on Zulip Eric Wieser (Feb 18 2021 at 07:47):

As less offensive notation, could a subscript go on the closing ||?

view this post on Zulip Kevin Buzzard (Feb 18 2021 at 07:48):

Yes I also very much value the fact that right now it's readable and any change will make it less so. This is very much an area where the CS people don't seem to understand the value of this, I had some Twitter spat recently with a few people who argued that all unicode notation was bad because it just added an extra layer of unnecessary noise. Ok if it doesn't cause too many problems I think we should leave it. I'm still very much interested in the question about whether this issue can be fixed in lean 4

view this post on Zulip Eric Wieser (Feb 18 2021 at 07:49):

Like, perhaps xr\|x\|_r with r standing for R\mathbb{R}

view this post on Zulip Kevin Buzzard (Feb 18 2021 at 07:50):

This is the other option but we've just heard from the people who use this notation a lot that it doesn't seem to be a problem in practice

view this post on Zulip Sebastian Ullrich (Feb 18 2021 at 11:25):

Johan Commelin said:

will Lean4's parser be able to deal with the problem better? I got the impression that it's quite a bit more powerful. If so, I wouldn't refactor the notation just to switch back in half a year.

It turns out that this works out of the box in Lean 4.

axiom max {α} : α  α  α
axiom norm {α} : α  α
notation "∥" e "∥" => norm e

#check fun (x y : Nat) => x + y  max x y

I was a bit surprised by that, but the secret is that function application in Lean 4 has to include whitespace. So y∥ is not confused as an incomplete application.

view this post on Zulip Patrick Massot (Feb 18 2021 at 11:32):

This is somehow an accident though. It doesn't answer the question: could we assign different binding power to the opening and closing ? Or is it no longer a meaningful question?

view this post on Zulip Sebastian Ullrich (Feb 18 2021 at 11:44):

Correct, it is no longer meaningful because precedences are now associated with parsers, i.e. e and the whole notation. Nor would an extended Lean 3 model help: You have to tell the parser that y∥ is not a valid start point of the notation, but max ∥ is. That cannot be solved with precedences alone. The parser (or any such parser) is not even thinking about the precedence of the closing at this point.

view this post on Zulip Kevin Buzzard (Feb 18 2021 at 11:47):

This is good news, presumably? Are you saying that this whole "|":a e:b "|":c thing with three numbers was a hack all along? Does Lean 4 use a Pratt parser? My understanding of binding is based on this.

view this post on Zulip Sebastian Ullrich (Feb 18 2021 at 11:51):

It does, I've written a bit about it at https://leanprover.github.io/lean4/doc/syntax.html. Do tell me if that's intelligible for non-CS audiences :) .

view this post on Zulip Sebastian Ullrich (Feb 18 2021 at 11:53):

Kevin Buzzard said:

This is good news, presumably? Are you saying that this whole "|":a e:b "|":c thing with three numbers was a hack all along?

Well, now it's two numbers (the expanded form is notation:max "∥" e:0 "∥" => norm e), but at least they're now local to the notation. I'm not sure right now if the third number ever mattered in Lean 3 either.

view this post on Zulip Kevin Buzzard (Feb 18 2021 at 12:03):

While introducing new notations is a relatively rare feature in programming languages and sometimes even frowned upon because of its potential to obscure code, it is an invaluable tool in formalization for expressing established conventions and notations of the respective field succinctly in code.

Did you see the Twitter conversation I had about this? Andrej Bauer asked a maths question in Coq and my initial reaction was "that is so ugly to read, here is how it looks in Lean, mathematicians can read my version" and then several people popped up saying "unicode is just a stupid gimmick, it makes people's lives worse" and I was just dumbfouded. I will say it again -- on several occasions mathematicians have said to me that they are attracted to Lean because it looks more readable to them than the other languages. I think it is marvellous that we are going in this direction. The CS people were arguing that it adds an extra layer of obfuscation. What they don't seem to realise is that for mathematicians it is serious unobfuscation. It is a translation which is bridging the gap between our two worlds.

view this post on Zulip Eric Wieser (Feb 18 2021 at 12:05):

Heather Macbeth said:

Alex J. Best's suggestion just shows up to me as blank boxes ... maybe I don't have the right fonts installed, but again, half our newcomer readers also won't?

Mathlib is full of notation that doesn't render properly in zulip on android - even linear maps often don't. I don't think we should consider zulip android monospace font bugs in mathlibs design. Discounting the blank box complaint, do you still prefer (∥x + y∥) over ∥x + y∥ₙ?

view this post on Zulip Gabriel Ebner (Feb 18 2021 at 12:06):

Is that a zulip bug though? It seems to be limited to certain android versions. (IIRC you mentioned that you couldn't see the characters on github either.)

view this post on Zulip Kevin Buzzard (Feb 18 2021 at 12:09):

This is fixed in Lean 4 so for me the matter of whether we should change notation is now closed.

view this post on Zulip Sebastian Ullrich (Feb 18 2021 at 12:37):

Kevin Buzzard said:

Did you see the Twitter conversation I had about this?

I did, and I agree with you (modulo choice of words :) ). I'm pretty sure though that this wasn't "CS people" but (mostly) "Coq people". No-one coming from Agda would challenge use of Unicode, while in Isabelle they're not even content with Unicode but use their own extended markup, font, and text rendering (so if you want an ITP that actually allows you to put \N in the subscript...).

view this post on Zulip Mario Carneiro (Feb 18 2021 at 20:51):

Sebastian Ullrich said:

the secret is that function application in Lean 4 has to include whitespace. So y∥ is not confused as an incomplete application.

Does this mean that folks can't pretend lean has C style function call syntax anymore, e.g. f(x, y, z) (which works as long as f : A x B x C -> D)?

view this post on Zulip Mario Carneiro (Feb 18 2021 at 20:53):

I'm not sure how common this is, but I routinely see people posting lean 3 proofs using things like P(sin(x))

view this post on Zulip Sebastian Ullrich (Feb 18 2021 at 21:11):

Mario Carneiro said:

Sebastian Ullrich said:

the secret is that function application in Lean 4 has to include whitespace. So y∥ is not confused as an incomplete application.

Does this mean that folks can't pretend lean has C style function call syntax anymore, e.g. f(x, y, z) (which works as long as f : A x B x C -> D)?

Even better, they can now implement that syntax properly without clashing with the internal application syntax, and then reevaluate their life choices.


Last updated: May 08 2021 at 09:11 UTC