Zulip Chat Archive

Stream: lean4

Topic: Macro for testing error messages


David Thrane Christiansen (Apr 09 2022 at 12:23):

I'm working on a document in which I'd like to accurately describe error messages thrown by Lean, and have CI keep them up to date as the document and the compiler evolve over time.

To support this, I'm trying to build a macro. This should be a command macro that takes a command and an expected error as arguments. If the command elaborates successfully, then an error should be thrown. If the command results in an error, then the error should be compared with the expected error. If they are equal, the result of elaboration should be thrown away and nothing more should happen. If they are not equal, then an error should be thrown.

Here's my quick attempt to make this work:

import Lean

syntax withPosition("expect" "error" colGt command "message" str "end") : command

elab_rules : command
  | `(expect error $cmd:command message $msg:str end) =>
    open Lean.Elab.Command in
    open Lean.Meta in do
      let savedState <- get
      try
        elabCommandTopLevel cmd
        throwError "no error"
      catch ex =>
        match msg.isStrLit? with
          | none => throwError "Invalid message format {msg}, expected a string literal"
          | some str => do
            let foundString <- ex.toMessageData.toString
            if str == foundString
              then return ()
              else throwError "Expected error {str} but got {foundString}"
      finally
        modify fun s => savedState

expect error
  def x : Nat := "I am not a Nat"
message
  "nope"
end

I would expect this to throw an error like Expected error "nope" but got "type mismatch...". However, it throws the error Expected error nope but got no error, which makes me think that I need to do something after the call to elabCommandTopLevel to make it fail if there's errors. I get the same result with elabCommand - I'm not sure which is appropriate here.

Any help would be appreciated!

David Thrane Christiansen (Apr 09 2022 at 12:26):

Also, as an aside, the conversion from the string literal required in the grammar to an inhabitant of type String seems a bit overly fiddly here - is there a better way to project a string out of a syntax pattern match when the grammar declares it to be so?

Arthur Paulino (Apr 09 2022 at 12:39):

You can use dbg_trace to see what's going on:


let foundString <- ex.toMessageData.toString
dbg_trace str
dbg_trace foundString
if str == foundString

It's outputting:

nope
no error

Arthur Paulino (Apr 09 2022 at 12:40):

Those are indeed different, so Lean is throwing an error with the message "Expected error {str} but got {foundString}", which results in that exact message we're seeing

David Thrane Christiansen (Apr 09 2022 at 12:41):

Thanks!

That part doesn't surprise me. What surprises me is that throwError "no error" is running, because I would expect the call to elabCommandTopLevel to throw an error itself, short-circuiting the evaluation of the try, and throwing a different error.

Arthur Paulino (Apr 09 2022 at 12:43):

Got it. I'll try debugging in a few minutes

David Thrane Christiansen (Apr 09 2022 at 12:47):

Thank you!

Sebastian Ullrich (Apr 09 2022 at 12:53):

The error is caught in withLogging inside elabCommand so that Lean continues elaboration after a failed command. I think the simplest solution is to check (← get).messages before and after elabCommandTopLevel cmd; for example, whether the size increased by exactly 1 and the head is the expected error.

David Thrane Christiansen (Apr 09 2022 at 12:58):

That makes sense - thanks! I'll give it a shot.

Could a command elaborator plausibly throw more than one message?

Sebastian Ullrich (Apr 09 2022 at 13:04):

Yes, error recovery inside the term elaborator can do that! Or a command macro that unfolds to multiple commands.

David Thrane Christiansen (Apr 09 2022 at 13:05):

Sounds like I should instead look for errors that were introduced by the sub-elaborator, and then ensure that the one I wanted happens to be present, then. Should be reasonable.

David Thrane Christiansen (Apr 09 2022 at 14:11):

Well, this seems to work:

syntax withPosition("expect" "error" colGt command "message" str "end") : command

elab_rules : command
  | `(expect error $cmd:command message $msg:str end) =>
    open Lean.Elab.Command in
    open Lean in
    open Lean.Meta in do
      let savedState <- get
      match msg.isStrLit? with
      | none => throwError "Desired message must be a string, but got {msg}"
      | some desiredError => do
        try
          elabCommand cmd
          let afterState <- get
          let lengthDifference := afterState.messages.msgs.size - savedState.messages.msgs.size
          let newMessages := afterState.messages.msgs.getN! lengthDifference
          let newErrors := newMessages.filter (fun m => m.severity == MessageSeverity.error)
          let mut errStrings : List String := []

          for err in newErrors do
            let asString <- err.data.toString
            errStrings := asString :: errStrings
            if (<- err.data.toString) == desiredError
              then return ()
              else pure ()
          throwError "The desired error {desiredError} was not found in\n{errStrings.reverse}"

        finally
          modify fun s => savedState

expect error
  def x : Nat := "I am not a Nat"
message
"type mismatch
  \"I am not a Nat\"
has type
  String : Type
but is expected to have type
  Nat : Type"
end

Thanks for the help!

It looks like the removal of traverse and its replacement with forM a while back makes let mut and for ... in mandatory here, whereas my Haskell-loving self would normally have wanted to first calculate the list of strings using traverse and then done a separate check for membership, rather than having to interleave them as I did above. Am I missing something about that?

David Thrane Christiansen (Apr 09 2022 at 14:12):

Also, is there something like Python's triple-quoted strings to avoid escaping quotes in the message? This is all going to be slurped through some regexp nonsense that I'd like to keep simple if possible :-)

Leonardo de Moura (Apr 09 2022 at 14:25):

It looks like the removal of traverse and its replacement with forM a while back makes let mut and for ... in mandatory here, whereas my Haskell-loving self would normally have wanted to first calculate the list of strings using traverse and then done a separate check for membership, rather than having to interleave them as I did above. Am I missing something about that?

You can also write

          let errStrings  newErrors.mapM fun err => err.data.toString
          unless errStrings.contains desiredError do
            throwError "The desired error {desiredError} was not found in\n{errStrings.reverse}"

It seems closer to what you want.

Leonardo de Moura (Apr 09 2022 at 14:29):

David Thrane Christiansen said:

Also, is there something like Python's triple-quoted strings to avoid escaping quotes in the message? This is all going to be slurped through some regexp nonsense that I'd like to keep simple if possible :-)

We don't, but I agree it would be great for the expect error macro.

David Thrane Christiansen (Apr 09 2022 at 14:33):

Thanks for that update, I think it makes things much easier to follow. I can even get rid of .reverse :-)

And the lack of triple-quotes strings will be easy enough to hack around.

For those keeping score, here's the macro:

syntax withPosition("expect" "error" colGt command "message" str "end") : command

elab_rules : command
  | `(expect error $cmd:command message $msg:str end) =>
    open Lean.Elab.Command in
    open Lean in
    open Lean.Meta in do
      let savedState <- get
      match msg.isStrLit? with
      | none => throwError "Desired message must be a string, but got {msg}"
      | some desiredError => do
        try
          elabCommand cmd
          let afterState <- get
          let lengthDifference := afterState.messages.msgs.size - savedState.messages.msgs.size
          let newMessages := afterState.messages.msgs.getN! lengthDifference
          let newErrors := newMessages.filter fun m => m.severity == MessageSeverity.error
          let errStrings <- newErrors.mapM fun err => err.data.toString
          unless errStrings.contains desiredError do
            throwError "The desired error {desiredError} was not found in\n{errStrings}"
        finally
          set savedState

expect error
  def x : Nat := "I am not a Nat"
message
"type mismatch
  \"I am not a Nat\"
has type
  String : Type
but is expected to have type
  Nat : Type"
end

Sebastian Ullrich (Apr 09 2022 at 17:04):

David Thrane Christiansen said:

Also, is there something like Python's triple-quoted strings to avoid escaping quotes in the message? This is all going to be slurped through some regexp nonsense that I'd like to keep simple if possible :-)

I believe the primary purpose of triple-quoted strings in Python is to enable multi-line literals, which is already the case for standard string literals in Lean. So I'd rather like to see "raw" string literals (which Python also has) that also disable other escape characters. There was a little prior discussion in https://github.com/leanprover/lean4/issues/407.

David Thrane Christiansen (Apr 09 2022 at 17:52):

Multi-line literals are nice, but another effect of triple-quoted strings in Python is that single quotes don't automatically terminate the string. For instance:

>>> """Strings are "great", right?"""
'Strings are "great", right?'

There's some corner cases when the quote is the last character in the string, but it is still useful in many cases for multi-line strings that contain string literals.

Sebastian Ullrich (Apr 10 2022 at 07:34):

Right, but this can also be true for raw string literals (e.g. Rust's), which are a more general solution to the problem of embedding a string literally

David Thrane Christiansen (Apr 10 2022 at 08:22):

That's true, and the "add more #s" solution is a nice one.

Patrick Massot (Aug 05 2022 at 12:05):

Is there any progress on the question of triple-quoted strings? Is there any term elaborator that could do that? The syntax doesn't have to be the same as in python, but the effect would be that we have multiline line strings where you can put as many " or ' as you want without escaping.

Sebastian Ullrich (Aug 05 2022 at 12:13):

I created a separate issue at least: https://github.com/leanprover/lean4/issues/1422
Feel free to upvote important issues I'd say, we could use that for prioritization

Patrick Massot (Aug 05 2022 at 12:17):

Thanks, I upvoted!

Siddhartha Gadgil (Aug 05 2022 at 14:03):

Just today I wanted this and had to use a lot of escaped quotes.

Mac (Aug 05 2022 at 18:05):

@David Thrane Christiansen
Leonardo de Moura said:

It looks like the removal of traverse and its replacement with forM a while back makes let mut and for ... in mandatory here, whereas my Haskell-loving self would normally have wanted to first calculate the list of strings using traverse and then done a separate check for membership, rather than having to interleave them as I did above. Am I missing something about that?

You can also write

          let errStrings  newErrors.mapM fun err => err.data.toString
          unless errStrings.contains desiredError do
            throwError "The desired error {desiredError} was not found in\n{errStrings.reverse}"

It seems closer to what you want.

Sorry for resuscitating an old discussion, but the Haskell-esque solution is much less efficient in Lean, right? (Since Lean, unless I am mistaken, lacks list fusion). That is, the conventional imperative approach only iterates through the list once, but the functional version loops through it twice.

David Thrane Christiansen (Aug 05 2022 at 18:07):

You're absolutely right about that. These lists typically contain 2-3 elements at most, so I'm not particularly concerned here, but in other contexts it could matter quite a bit.


Last updated: Dec 20 2023 at 11:08 UTC