Zulip Chat Archive

Stream: PR reviews

Topic: lean#352 Syntactic horror or pure elegance?


Gabriel Ebner (Jun 23 2020 at 13:53):

def new_string_literal := /-"
## Markdown heading

String blocks are like comments:
``lean
/-- My declaration -/
def my_declaration : string :=
/-"
  They can be nicely nested
"-/
``
"-/

Reid Barton (Jun 23 2020 at 13:56):

What's the first character of the literal? # or newline?

Reid Barton (Jun 23 2020 at 13:56):

and the last?

Gabriel Ebner (Jun 23 2020 at 13:57):

The newline is the first character. The last two characters are ` and a newline.

Anne Baanen (Jun 23 2020 at 14:14):

I'm not convinced by this specific syntax, because my mental lexing algorithm uses /- to always indicate the start of a comment, and the next character in the case of /-- or /-! just specify which kind of comment. Turning it into a string literal would cause a bunch more mental lookahead. But I can't find any alternative nestable string literal syntax in a popular language, except perhaps CDATA in XML...

Anne Baanen (Jun 23 2020 at 14:16):

What is the expected result for /-" am I a string or a comment? -/?

Gabriel Ebner (Jun 23 2020 at 14:55):

Obviously unknown command `«/-"»` .

Reid Barton (Jun 23 2020 at 14:57):

how about using and for nestable strings?

Gabriel Ebner (Jun 23 2020 at 14:57):

Let me explain the motivation behind making the syntax similar to comments. A big requirement is here that comments and strings can be nested in any alternation. The comments already had code to support the /- /- -/ -/ nesting. If we introduced a brand-new syntax for string literals, then we would also have to adapt the nesting code and the syntax highlighting.

Gabriel Ebner (Jun 23 2020 at 15:03):

It is also problematic to use the same token for both the begin and the end. Let's say we choose """:

{ description := """
How to write documentation:
`​``lean
add_tactic_doc {
 description := """
   Description goes here.
 """
}
`​``
""" }

Gabriel Ebner (Jun 23 2020 at 15:04):

Oh, the Zulip highlighting correctly highlights the issue.

Scott Olson (Jun 26 2020 at 11:21):

I haven't seen a programming language with nestable "normal" strings, but a few languages have nestable "raw" strings (which don't interpret escape sequences), by tagging the outermost delimiter with a string that never occurs inside the string. In Rust this looks like:

let example1 = r#"
  These "quotes" do not cause the end of the string literal, only a quote with a # can.
"#;

let example2 = r#####"
  With more hashes like this, the string can mention ", "#, "##, "###, and "#### without ending.
"#####;

C++ implements the same idea with R"foo( ... )foo" instead of r###" ... "### (arbitrary identifier labels instead of #s).

Your alternative approach seems to be driven by a need these languages didn't have - specifically embedding Lean code in Lean strings, so you can depend on the fact the internal delimiters will be balanced. This is the same reason some languages adopted nestable comments, since you are often putting, say, Lean code in Lean comments.

Gabriel Ebner (Jun 26 2020 at 11:35):

Yes, we already have nestable comments. And we also use comments for extended documentation. It seemed a reasonable choice to reuse the comment syntax since /-" "-/ will be mostly used by strings containing documentation.

Gabriel Ebner (Jun 26 2020 at 11:38):

There are actually a few other (not so hip) languages that have nestable string syntax:

  • Perl: print q{q{}q{q{}}} prints q{}q{q{}}. For bonus points you can use any kind of parenthesis after q. Other kinds of strings have similar prefixes like qq (with escapes), or m (for regexps).
  • Tcl: puts {{a}}. Interestingly, Tcl also uses the string syntax for function bodies (because they are strings)

Scott Olson (Jun 26 2020 at 11:41):

I completely forgot that Perl (and Ruby, which copied it from Perl) had this feature literally, nice.

Scott Olson (Jun 26 2020 at 11:44):

Raku (previously Perl 6) which seems to aspire to be the cutest programming language, has Q[...] syntax akin to Perl but also allows the use of Japanese quoting like 「this, which also 「nests」 properly」.

Scott Olson (Jun 26 2020 at 11:45):

Come to think of it, Lean already uses French quotation marks, which could nest, but for identifiers instead of strings.


Last updated: Dec 20 2023 at 11:08 UTC