Zulip Chat Archive

Stream: lean4

Topic: no semicolon required before return


Mario Carneiro (Jun 06 2023 at 17:45):

This apparently parses:

#check do pure () return

and means the same as

#check do pure (); return

I don't think this is intentional?

Mario Carneiro (Jun 06 2023 at 17:47):

other doSeq/term constructors have varying behavior:

Interpreted as an argument:

  • do foo () for x in [] do e means do foo () (for x in [] do e)
  • do foo () try e catch _ => e means do foo () (try e catch _ => e')
  • do foo () try e finally e means do foo () (try e finally e')
  • do foo () do e means do foo () (do e)
  • do foo () unless e do e' means do foo () (unless e do e')

Interpreted as a doSeq:

  • do foo () return means do foo (); return
  • do foo () break means do foo (); break
  • do foo () continue means do foo (); continue
  • do foo () dbg_trace e means do foo (); dbg_trace e
  • do foo () assert! e means do foo (); assert! e
  • do foo () repeat e means do foo (); repeat e
  • do foo () while e do e' means do foo (); while e do e'
  • do foo () match e with | _ => e' means do foo (); match e with | _ => e'
  • do foo () if e then e else e means do foo (); if e then e else e

Reid Barton (Jun 06 2023 at 18:03):

Is there a reason layout doesn't just work the same way as in Haskell?

Mario Carneiro (Jun 06 2023 at 18:05):

details?

Mario Carneiro (Jun 06 2023 at 18:10):

my guess is that most of the problems here are fixed by using sepBy1IndentSemicolon instead of many1Indent in the definition of doSeq. That will give doSeq the same semicolon-or-newline behavior as tacticSeq

Reid Barton (Jun 06 2023 at 18:10):

https://www.haskell.org/onlinereport/lexemes.html section 2.7 but in brief, in a layout context, an equally-indented line desugars to ; and a less-indented line desugars to } (or multiple }s)

Mario Carneiro (Jun 06 2023 at 18:11):

and a more-indented line?

Reid Barton (Jun 06 2023 at 18:12):

is just whitespace

Mario Carneiro (Jun 06 2023 at 18:14):

I think this would not be a small change, lean's layout mechanism is architected very differently

Reid Barton (Jun 06 2023 at 18:15):

Lean's system seems a lot less predictable for the non-omniscient user

Mario Carneiro (Jun 06 2023 at 18:16):

One difference of note is that adding semicolons doesn't make the result whitespace-insensitive:

by
  . skip;
    skip;
  skip;

the indentation of the third line matters

Reid Barton (Jun 06 2023 at 18:17):

I think that e.g.

do
  do x;
     y;
  z;

means the same in Haskell

Reid Barton (Jun 06 2023 at 18:18):

Layout in Haskell is only disabled if you use braces

Mario Carneiro (Jun 06 2023 at 18:18):

can you use braces without semicolons?

Reid Barton (Jun 06 2023 at 18:19):

(I'm not sure if the extra semicolons are actually legal--it's an unnatural thing to write--but I am sure that the block that y belongs to will be determined by the indentation)

Reid Barton (Jun 06 2023 at 18:19):

if there's only one thing :upside_down:

Reid Barton (Jun 06 2023 at 18:20):

weird but legal

a = let { x =
 1 } in x

Reid Barton (Jun 06 2023 at 18:22):

Anyways the main effect is that the layout rules effectively take priority over whatever other syntax is happening, e.g., something starting in column 1 is definitely a new top-level command, which I think is not the case in Lean 4 (based on error messages at least)

Reid Barton (Jun 06 2023 at 18:24):

actually I'm sure it's not the case because I have been putting definition bodies there

Mario Carneiro (Jun 06 2023 at 18:24):

Yeah I'm reading the layout rules in https://www.haskell.org/onlinereport/syntax-iso.html#layout and

A nested context must be further indented than the enclosing context (n>m). If not, L fails, and the compiler should indicate a layout error.

is definitely not true for lean

Mario Carneiro (Jun 06 2023 at 18:25):

one of the effects of a bracketing syntax is withoutPosition, so you can insert nested scopes with less indentation

Reid Barton (Jun 06 2023 at 18:25):

GHC has some extensions that allow optional semicolons in things like if c [;] then expr1 [;] else expr2 for people who don't want to nest those

Reid Barton (Jun 06 2023 at 18:26):

(when the if appears in a do block)

Reid Barton (Jun 06 2023 at 18:28):

Being able to understand what the compiler thinks the nesting structure from the indentation is a very useful feature, even just for better error reporting

Reid Barton (Jun 06 2023 at 18:28):

the rest of the details aren't very important

Reid Barton (Jun 06 2023 at 18:29):

hmm I think there is also something like RelaxedLayout in GHC

Mario Carneiro (Jun 06 2023 at 18:29):

I think that those notions should broadly apply, do you have an example in mind where you need something galaxy-brained to interpret valid lean code?

Reid Barton (Jun 06 2023 at 18:32):

I'm under the impression that e.g. apply can take its argument on the next line unindented?

Reid Barton (Jun 06 2023 at 18:32):

Maybe there are rules and I just don't know them

Reid Barton (Jun 06 2023 at 18:33):

I mean also the examples you posted at the start of this thread

Mario Carneiro (Jun 06 2023 at 18:35):

yes, most tactics that have a mandatory argument do not require colGt, perhaps that is a mistake

Reid Barton (Jun 06 2023 at 18:37):

right while optional arguments obviously can't be given on the next line at the same indentation level

Reid Barton (Jun 06 2023 at 18:38):

and the same thing happens following using, where, etc. whenever there is a mandatory argument I guess

Mario Carneiro (Jun 06 2023 at 18:38):

the reasoning is that the next thing has to be an expr regardless of where it appears. It's still difficult to take advantage of this in practice because the app parser does require a column check, so

by
        apply
f

parses but

by
        apply
f x

does not

Reid Barton (Jun 06 2023 at 18:38):

so in order to make sense of the layout, you have to know the whole syntax of every tactic

Mario Carneiro (Jun 06 2023 at 18:40):

I get worried about column checks in too many places because it can make it difficult to keep the whitespace-insensitive variant working

Mario Carneiro (Jun 06 2023 at 18:41):

whitespace-insensitive syntax is about as well maintained in lean 4 as ascii equivalent symbols in lean 3

Reid Barton (Jun 06 2023 at 18:42):

Presumably all sensible systems will agree on the meaning of correct, tastefully written files

Reid Barton (Jun 06 2023 at 18:43):

so what's at stake is the system not inventing an obviously unintended meaning of something (e.g. an unindented "argument" to apply)

Reid Barton (Jun 06 2023 at 18:44):

Mario Carneiro said:

I get worried about column checks in too many places because it can make it difficult to keep the whitespace-insensitive variant working

Yeah so Haskell's layout rules can be completely avoided if you always write a { after the layout keywords, which is nice.

Reid Barton (Jun 06 2023 at 18:44):

People don't really write code like that (though I think there is still some in GHC itself) but programs might want to

Reid Barton (Jun 06 2023 at 18:45):

OTOH, maybe generating textual Lean code is not a really sensible thing to attempt.

Mario Carneiro (Jun 06 2023 at 18:45):

the haskell report has this lovely example of tasteless code:

The parse-error rule is hard to implement in its full generality, because doing so involves fixities. For example, the expression

 do a == b == c

has a single unambiguous (albeit probably type-incorrect) parse, namely

(do { a == b }) == c

because (==) is non-associative. Programmers are therefore advised to avoid writing code that requires the parser to insert a closing brace in such situations.

Reid Barton (Jun 06 2023 at 18:46):

Yeah this is a bit of a dark corner. I think it is to support let a = 1 in

Reid Barton (Jun 06 2023 at 18:48):

otherwise you would logically be required to use either braces or a newline before in

Mario Carneiro (Jun 06 2023 at 18:48):

Reid Barton said:

OTOH, maybe generating textual Lean code is not a really sensible thing to attempt.

Heh, this has been done by many people, including me. It's difficult to tell people not to

Mario Carneiro (Jun 06 2023 at 18:50):

unless you are working in lean it's not all that easy to construct lean source directly via Syntax

Mario Carneiro (Jun 06 2023 at 18:51):

maybe we need a lean frontend that reads JSON ASTs

Mac Malone (Jun 07 2023 at 01:24):

Reid Barton said:

Presumably all sensible systems will agree on the meaning of correct, tastefully written files

I certainly would not presume this. For a Lean example, parts of a definition can be flushed left when in many other languages the different parts would be considered separate statements. For example:

opaque foo :
ExceptT String
(StateT Nat BaseIO)
(Nat × String) :=
return (0,"hi")

Furthermore, there is actually a lot of Lean code that leverages this style, so at least some users consider it "tasteful". However, this design decision also prevents terms from being lifted into commands . This means things like Template Haskell shorthand would be impossible in Lean (and this style impossible in Template Haskell).

Scott Morrison (Jun 07 2023 at 01:29):

... which Lean code uses this style, I'm curious?

Mario Carneiro (Jun 07 2023 at 01:58):

Mac's

Mario Carneiro (Jun 07 2023 at 02:03):

(I have to resist the urge to reformat things whenever I read lake code)

Mac Malone (Jun 07 2023 at 07:34):

@Mario Carneiro Also, Lean core code. However, not as much as I thought. I did notice you reformatted some cases when changing String.lean. Maybe the few cases were just mistakes and they made more of impression on my perception of Lean style than they should have. However, Mathlib code also flushes definition contents to left at times as well.

Mac Malone (Jun 07 2023 at 07:36):

Note that for my code, I only flush parameters and return types, not contents (except in the special case of multi-line strings due to the fact that such indentation would end up in the string).

Mac Malone (Jun 07 2023 at 07:37):

Also, FYI, my example was meant to be a particularly egregious case of the flush left style, not an example of my own code, but apparently there is overlap from others' perspectives. :rofl:

Reid Barton (Jun 07 2023 at 07:40):

I see. I will say that certainly the linked code from lake is not nearly as bad as your example (which among other things could be written on just 1-2 lines).

Mac Malone (Jun 07 2023 at 07:47):

Mario Carneiro said:

(I have to resist the urge to reformat things whenever I read lake code)

As a bikeshedding tangent, I am curious how you would format the signature of compileLeanModule. I would think there are too many parameters to go on one line, so some multi-line formatting would be inevitable. I also would think that indenting them to the same level as the definition contents would be more confusing as it would then be hard to visually separate them. So I am interested in what your approach would be.

Sebastian Ullrich (Jun 07 2023 at 08:00):

We usually indent the declaration header by 4 spaces for this reason

Sebastian Ullrich (Jun 07 2023 at 08:01):

Mac said:

Mario Carneiro Also, Lean core code.

This is old code written in Lean 3 style

Mac Malone (Jun 07 2023 at 08:13):

@Sebastian Ullrich Ah, I was not aware the indentation style had changed between Lean 3 and Lean 4.

Mac Malone (Jun 07 2023 at 08:25):

Sebastian Ullrich said:

We usually indent the declaration header by 4 spaces for this reason

So, do you mean like this?

def foo
    (bar : Nat) : String :=
  baz

Personally, this is hard for me to parse because the dedent in baz makes me feel like it should be a separate, outer statement of (bar ... :=, when it is actually the RHS of :=. This is probably I habit I formed from the common programming style of requiring the RHS of an operator to be grater than the LHS indent of the operator. Essentially, that style feels like this:

"some long string" + "another long string" +
      "more long strings" ==
    "even more and more long strings" + "and more and more"

Which feels wrong (and does not parse in some whitespace-sensitive languages).

Sebastian Ullrich (Jun 07 2023 at 09:18):

It would be more like

def foo (bar : Nat)
    (baz : Nat) :
    String :=
  baz

depending on the available space.

This is probably I habit I formed from the common programming style of requiring the RHS of an operator to be grater than the LHS indent of the operator

You are talking about infix operators. def ... := ... is not an infix operator. This argument looks very subjective. Which is fine, it just doesn't seem to be the predominant opinion.

Mac Malone (Jun 07 2023 at 20:57):

Sebastian Ullrich said:

def ... := ... is not an infix operator.

The := sort of is, and it feels like one to my eyes. It also used as one, for example, in do notation.

This argument looks very subjective. Which is fine, it just doesn't seem to be the predominant opinion.

True, it is subjective and my opinion does seem to be the minority one. I guess that's life. :cry: I do think I have a better grip of what is common Lean 4 style now, which is good. :smile:

Mario Carneiro (Jun 07 2023 at 21:13):

you know, there is a formatter that will tell you what the recommended format is (modulo some bugs)

Mac Malone (Jun 07 2023 at 21:14):

@Mario Carneiro The 'modulo some bugs' however makes it hard to tell what is proper formatting and what is the result of bugs.

Mario Carneiro (Jun 07 2023 at 21:16):

you can always ask, most of the bugs are known at this point since we have a small army looking at hundreds of thousands of lines of auto-formatted code

James Gallicchio (Jun 09 2023 at 21:17):

i am excited for the day when we are all using structure editors, so we can have a sliding bar of how much vertical whitespace to display the code with :) my preferred style there is

def foo
    (bar : Nat)
    (baz : Nat)
  : String
  :=
  baz

which I admit is a lot of newlines for most people :joy:

Damiano Testa (Jun 09 2023 at 21:24):

Your style doesn't typecheck: baz is a Nat, not a String...


Last updated: Dec 20 2023 at 11:08 UTC