Zulip Chat Archive

Stream: lean4

Topic: Ambiguous braces


Gabriel Ebner (May 24 2022 at 18:56):

Sebastian's talk just reminded me of an issue that has been on my mind for some time: the syntax {a, b, c} is ambiguous at the moment (with the set syntax from mathlib). It can mean either:

  1. The structure instance { a := a, b := b, c := c }, or
  2. the "set" with the three elements a, b, and c (or anything with insert and singleton operations).

Currently the ambiguity is resolved by always picking (2), which is nonideal. Another option would be to make the decision based on whether we have HasInsert and HasSingleton instances (but that means #check {1,2,3} doesn't work). Yet another option would be to use a different syntax for set literals, but I don't see any common alternative to {1,2,3}.

Sebastian Ullrich (May 24 2022 at 19:17):

#check {1,2,3} doesn't do all that much in Lean 3 either, does it? How about simply failing with an appropriate message if the expected type (still) isn't sufficiently known?

Sebastian Ullrich (May 24 2022 at 19:17):

Branching on "is a struct" might be a bit more robust than "has a HasInsert"

Henrik Böving (May 24 2022 at 19:18):

Alternatively to failing one could also pick a default for the two when what is desired is not recognizable from context I guess?

Gabriel Ebner (May 24 2022 at 19:19):

#check {1,2,3} doesn't do all that much in Lean 3 either, does it?

It prints {1, 2, 3} : ?M_1, which is better than nothing.

Gabriel Ebner (May 24 2022 at 19:20):

Branching on "is a struct" might be a bit more robust than "has a HasInsert"

This is unfortunately not correct. Finsetis a structure. We might want to turn Set into a structure as well.

Sebastian Ullrich (May 24 2022 at 19:21):

Hah, yes... checking for HasInsert is probably better than checking for "is a structure with these exact field names"

Sebastian Ullrich (May 24 2022 at 19:26):

Gabriel Ebner said:

#check {1,2,3} doesn't do all that much in Lean 3 either, does it?

It prints {1, 2, 3} : ?M_1, which is better than nothing.

I'm still surprised about this part though. Is "could be any type" really better than "please provide a type here"?

Gabriel Ebner (May 24 2022 at 19:28):

How about simply failing with an appropriate message if the expected type (still) isn't sufficiently known?

I've just realized that overloaded syntax does exactly this. It just requires being very careful with antiquotations, which is why I couldn't make it work the last time. https://github.com/leanprover-community/mathlib4/pull/274

Sebastian Ullrich (May 24 2022 at 19:29):

We should do something against that :thinking:

Gabriel Ebner (May 24 2022 at 19:32):

I'm still surprised about this part though. Is "could be any type" really better than "please provide a type here"?

At least in Lean 3 this was very important because the type would need to be available immediately. Type class constraints can be solved later when the type is filled in.

In Lean 4 there is still a bit of an issue, where you might see 1 ∈ ?m... in the context because the expected type of the set syntax is not available yet. If we elaborate into the type class operations immediately, then you'll see 1 ∈ {a,b,c} even if the type class instance is not synthesized yet.

Sebastian Ullrich (May 24 2022 at 19:35):

I see, interesting

Gabriel Ebner (May 24 2022 at 19:44):

One downside of the overloaded syntax solution is that it doesn't work in patterns:

let { ctx, fvarIdToLemmaId, dischargeWrapper }  ...

For now I've added a , .. for disambiguation. Longer term it would be great if the pattern syntax was at least somewhat extensible.

James Gallicchio (May 24 2022 at 19:56):

It feels like a kinda bad idea to overload structure syntax, no? To programmer-me a finset is just a type of collection, so some sort of overloaded syntax for all collections would be worth considering; maybe overload #[...] syntax with an OfList typeclass?

Henrik Böving (May 24 2022 at 19:59):

I think having syntactic difference between notation does very much help readability of Lean code, especially once we start having more collections sticking all of them in just one type of syntax seems like a bad idea to me.

Sebastian Ullrich (May 24 2022 at 21:12):

I feel like it would be quite hard to defend any set syntax other than the current one; while at the same time, we certainly don't want to change the structure syntax either

Sebastian Ullrich (May 24 2022 at 21:13):

So naturally we should make the choice dependent on the presence of whitespace inside the braces :grimacing:

James Gallicchio (May 24 2022 at 21:18):

are there unicode braces you can use? :joy:

Kyle Miller (May 24 2022 at 21:31):

It's not ideal, but maybe the structure syntax could be extended to allow an optional with at the beginning, so to disambiguate you could write {with a, b, c}. This is meant to generalize the structure update syntax {x with a, b, c}, which works and is short for {x with a := a, b := b, c := c}.

James Gallicchio (May 24 2022 at 21:58):

Is the mathlib set syntax behind an import? Or at the top level if you have mathlib imported?

Julian Berman (May 24 2022 at 22:14):

It's not ideal, but maybe the structure syntax could be extended to allow an optional with at the beginning, so to disambiguate you could write {with a, b, c}

Or another alternative is to not automatically do a := a right? E.g. it's a remote analogy, in Python there's f"{name=}" which prints name = name, so maybe in Lean doing {a :=, b :=, c :=} to expand to a := a etc. would be a possible resolution. Or maybe the other way is less ugly. {:= a, := b, := c}.

Henrik Böving (May 24 2022 at 22:40):

If we are already talking about changing the syntax of structures why not just go the Array way and #{} ?

Julian Berman (May 24 2022 at 22:44):

Sebastian above said "we certainly don't want to change the structure syntax either" so don't know, that may already mean even these small changes (Kyle's or ^) aren't interesting :D

Sebastian Ullrich (May 25 2022 at 06:57):

Yes, moving away from curly braces for structures at least seems quite unlikely to happen. := a in fact would have been my original proposal in anticipation of this issue, but by it's hard to imagine moving away from the tantalizingly compact current notation as well...

Evgeniy Kuznetsov (May 25 2022 at 08:38):

it is useful to have an extra syntax when you can't or don't want to rely on type inference for generic literal notation:

ˢ{a, b, c} => ({a, b, c} : Set _)
{a, b, c} => ({a, b, c} : Finset _)
{a, b, c} => ({a, b, c} : Multiset _)

Leonardo de Moura (May 26 2022 at 02:21):

Gabriel Ebner said:

One downside of the overloaded syntax solution is that it doesn't work in patterns:

let { ctx, fvarIdToLemmaId, dischargeWrapper }  ...

For now I've added a , .. for disambiguation. Longer term it would be great if the pattern syntax was at least somewhat extensible.

I pushed a workaround for this issue. We can work on a better solution in the future.
I collected a few examples here: https://github.com/leanprover/lean4/blob/master/tests/lean/run/setStructInstNotation.lean#L39-L71
I tried to collect all examples posted here, and they all seem to be working. Do you have other examples that are not working?


Last updated: Dec 20 2023 at 11:08 UTC