Zulip Chat Archive

Stream: lean4

Topic: Bug/Observation about Empty Lists


Shreyas Srinivas (May 23 2023 at 09:43):

Without importing Mathlib [] is treated as an empty list.
After I import mathlib I get a term expected error on all occurrences of []. I am guessing there is some similar operator in mathlib that has higher precedence. Because when I hover the mouse over the error term I get the docstring : The ore localization of a monoid and a submonoid fulfilling the ore condition. Isn't Lean sufficiently polymorphic to infer that when an argument of a list type is expected [] is an empty list?

Shreyas Srinivas (May 23 2023 at 09:45):

Or is the corresponding syntax parser/notation/macro/elaborator doing its work before any such inference is done?

Mario Carneiro (May 23 2023 at 09:45):

that's definitely a missing scoped on a notation

Mario Carneiro (May 23 2023 at 09:46):

presumably this one

@[inherit_doc OreLocalization]
notation:1075 R "[" S "⁻¹]" => OreLocalization R S

Shreyas Srinivas (May 23 2023 at 09:48):

Mario Carneiro said:

presumably this one

@[inherit_doc OreLocalization]
notation:1075 R "[" S "⁻¹]" => OreLocalization R S

Is it possible to make notations "typed"?

Mario Carneiro (May 23 2023 at 09:48):

it's not a typing problem, it's a parse error

Mario Carneiro (May 23 2023 at 09:48):

setting scopedness aside this notation is broken

Shreyas Srinivas (May 23 2023 at 09:52):

Mario Carneiro said:

it's not a typing problem, it's a parse error

No I understand this. I also understand that in Lean, notation defines a syntactic unit that is parsed in a certain way, as long as the input string matches this notation (and precedence, longest match....). But in agda's mixfix definitions or haskell's infix operator definitions, the operators only accept arguments of the right type and are suitably polymorphic. I am asking if there is such a feature in lean or a possibility to implement one.

Mario Carneiro (May 23 2023 at 09:53):

that feature exists

Mario Carneiro (May 23 2023 at 09:55):

I can't replicate the issue in OreLocalization.lean, do you have an example?

Shreyas Srinivas (May 23 2023 at 09:55):

what is it called (so that I can search in the docs)?

Mario Carneiro (May 23 2023 at 09:56):

it doesn't have a name, it's just part of elaboration, notations can be overloaded and are determined based on the types

Shreyas Srinivas (May 23 2023 at 09:56):

import Mathlib
def z := Expr.const ``Nat.zero [] --error

Mario Carneiro (May 23 2023 at 09:56):

do you have an example that imports less than all of mathlib?

Mario Carneiro (May 23 2023 at 09:57):

that's not even correct, Expr is in the Lean namespace

Shreyas Srinivas (May 23 2023 at 09:57):

Mario Carneiro said:

do you have an example that imports less than all of mathlib?

sorry there was an open Lean

Shreyas Srinivas (May 23 2023 at 09:58):

I have some examples in other projects, but I would have to search for them now.

Shreyas Srinivas (May 23 2023 at 09:59):

And I usually just import Mathlib so I am not sure any of them would be the kind of example you are looking for.

Max Nowak 🐉 (May 23 2023 at 09:59):

Probably importing the file which defines the notation.

Shreyas Srinivas (May 23 2023 at 10:02):

import Mathlib.RingTheory.OreLocalization.Basic

open Lean

def z := Expr.const `Nat.zero []

Mario Carneiro (May 23 2023 at 10:02):

that was what I had tried, but I did just #check [] which seems to work fine, not the expr.const thing

Sebastian Ullrich (May 23 2023 at 10:06):

Yes, this notation is problematic under the local backtracking the Lean parser implements. It should really use noWs

Mario Carneiro (May 23 2023 at 10:21):

this is what my patch looks like:

@[inherit_doc OreLocalization]
scoped syntax:1075 term noWs atomic("[" term "⁻¹]") : term
macro_rules | `($R[$S⁻¹]) => ``(OreLocalization $R $S)

unfortunately you can't use atomic with notation

Mario Carneiro (May 23 2023 at 10:21):

using noWs alone isn't good enough since it will still clash with array indexing IIUC

Mario Carneiro (May 23 2023 at 10:26):

!4#4267

Mario Carneiro (May 23 2023 at 10:28):

Even with this fix, this notation is just terrible. You still can't write #check [1, 2⁻¹] because it makes ⁻¹] a token

Kevin Buzzard (May 23 2023 at 12:08):

Perfect is the enemy of good ;-) You guys are so obsessed about edge cases :-)

Shreyas Srinivas (May 23 2023 at 12:38):

Kevin Buzzard said:

Perfect is the enemy of good ;-) You guys are so obsessed about edge cases :-)

I was just bitten by this edge case. Also, strictly speaking the empty list is a base case :wink: and appears as such in lots of places.

Kevin Buzzard (May 23 2023 at 13:25):

Yeah I'm well aware that this sort of attitude to notation is not acceptable :-) Mario has always held us to high standards.

Shreyas Srinivas (May 23 2023 at 13:50):

Kevin Buzzard said:

Yeah I'm well aware that this sort of attitude to notation is not acceptable :-) Mario has always held us to high standards.

At some level I am curious to see how this tension between mathematicians' wild and inconsistent notation and the limits of parsers plays out. Mathematicians did not develop their notation with the foresight that some day they would have to be sufficiently consistent and unique. Lean (and to varying degrees, Coq and Isabelle) seem to do what they can to accommodate all sorts of extensions that work without giving incurable headaches so far. I am waiting to see when the point comes where computer scientists say "nope, can't do".

Patrick Massot (May 23 2023 at 13:52):

This already happens all the time.

Kevin Buzzard (May 23 2023 at 14:43):

Having the same character for beginning and ending certain parens like x|x| apparently really stinks but we do it

Shreyas Srinivas (May 23 2023 at 15:07):

Kevin Buzzard said:

Having the same character for beginning and ending certain parens like x|x| apparently really stinks but we do it

Interesting. How is each occurrence of | identified as a closing or opening | in say | |x| + |y| + | { v \in V | ... }| | |? I can imagine that there are much more annoying expressions like that and the notation would have to be defined with great care.

Eric Wieser (May 23 2023 at 15:22):

In mathlib3 you see a lot of (|x|) because of this issue

Sebastian Ullrich (May 23 2023 at 15:23):

[a, b) etc. are also fun, I don't think anyone even considered importing these into a theorem prover because they are just asking for trouble

Eric Wieser (May 23 2023 at 15:23):

I wouldn't be so sure that it hasn't been considered...

Patrick Massot (May 23 2023 at 15:27):

We definitely considered it. More precisely we considered it was obviously needed, and then we were very disappointed.

Sebastian Ullrich (May 23 2023 at 15:29):

I mean, it should actually work. Except for VS Code's paren highlighter of course.

Shreyas Srinivas (May 23 2023 at 15:32):

Sebastian Ullrich said:

I mean, it should actually work. Except for VS Code's paren highlighter of course.

Yes it already messes up much more basic things for e.g. with markdown + math. An _ for subscript in math-mode is read as the beginning of an italics block.

Kyle Miller (May 23 2023 at 15:59):

Sebastian Ullrich said:

[a, b) etc. are also fun

Even better would be supporting [a,b[ etc if those aren't trouble enough :smile:

Sebastian Ullrich (May 23 2023 at 16:01):

Ah, that's what I actually had in mind I guess

Patrick Massot (May 23 2023 at 16:08):

Of course that would be the ideal notation.

Johan Commelin (May 23 2023 at 16:12):

Knuth has a whole algorithm in TeX for determining whether a symbol is an opening or closing delimiter. I suppose that it could be mostly reused by the Lean parser?

Johan Commelin (May 23 2023 at 16:12):

I don't have the TeXbook with me here in Banff (yes, I know, I'm sorry), but I can look it up next week

Shreyas Srinivas (May 23 2023 at 16:19):

Johan Commelin said:

Knuth has a whole algorithm in TeX for determining whether a symbol is an opening or closing delimiter. I suppose that it could be mostly reused by the Lean parser?

Doesn't Tex get away with a lot by requiring escaped characters where ambiguities might arise between a character to be printed and a special character?

Shreyas Srinivas (May 23 2023 at 16:22):

Quoting directly :

Every once in a while it is necessary to treat part of a manuscript as a unit, so you
need to indicate somehow where that part begins and where it ends. For this
purpose TEX gives special interpretation to two “grouping characters,” which
(like the escape character) are treated differently from the normal symbols that
you type. We assume in this manual that { and } are the grouping characters,
since they are the ones used in plain TEX

Sebastien Gouezel (May 23 2023 at 16:43):

In LaTeX, when writing the interval ]a, b] with the French convention, you are supposed to write it as \mathopen{]} a, b], as the spacing is not correct otherwise. Essentially everyone gets it wrong...

Johan Commelin (May 23 2023 at 17:06):

But TeX gets the spacing in |a| + |b| right automatically. So registering [ and ] as the correct character class should also get you quite far, I think.


Last updated: Dec 20 2023 at 11:08 UTC