Zulip Chat Archive

Stream: lean4

Topic: Pattern matching on Syntax nodes


Siddharth Bhat (Oct 10 2021 at 18:09):

It seems like if I want to match on Syntax nodes where the terms represent "nested" constructors, for example, things like Foo.mk (some 10), I need to nest pattern matches. I'm not able to pattern match on such terms using a single complex pattern match. For concretencess, Consider:

inductive Foo
| mk: Option Int ->  Foo

@[appUnexpander Foo.mk]
def unexpandEinSym :
Lean.PrettyPrinter.Unexpander
| `(Foo.mk `(some $k)) => do -- A
       `(mk_some_at_once $k)
| `(Foo.mk (some $k)) => do -- B
       `(mk_some_at_once_two $k)
| `(Foo.mk Option.none) => do -- C
      `(mk_none_at_once)
| `(Foo.mk $x) => do
     match x with
     | `(some $y) => `(mk_then_some) -- E
     | `(none) => `(mk_then_none) -- F
     | _ => `("missing")
| _ => throw ()


theorem thm (x: Int): Foo.mk (some x) = Foo.mk none  := by {
-- ⊢ mk_then_some = mk_then_none

}

It seems that the rules that are firing are those that correspond to the nested match .. match (the rules marked E and F) , and not the rules marked A, B, or C.

Is there some way to write the above pattern-match with a single level of match? I feel like I'm misunderstanding something about the syntax that allows us to pattern on Syntax using backticks.

Sebastian Ullrich (Oct 10 2021 at 18:19):

This is a bit unfortunate. The delaborator runs before the parenthesizer, so your input term lacks the parentheses of pattern B. You cannot wrote down the pattern that would match the whole input immediately.

Siddharth Bhat (Oct 10 2021 at 18:23):

Thanks for the explanation! I now understand the behaviour. I feel that allowing such pattern matches would help expressivity. I guess it's hard to implement?

Mac (Oct 10 2021 at 19:31):

@Siddharth Bhat I would argue that doing it another way would hurt expressivity. Your examples B and C are matching different things then E and F.

B matches exactly the term Foo.mk (some $k) (with parentheses) whereas E matches any term and then does matching on that term for some $y. To get B to work either parentheses would need to be added before delaboration or parentheses would have to be ignored in patterns. Running the parenthesizer before the delaborator would mean that pattern matches like E would have to worry about potentially inserted parentheses, which would make a lot of simple examples much more complex. On the other hand, Ignoring parentheses in B would break matches where parentheses are semantically significant (and would require a special case in antiquote pattern matching for a specific syntax kind). Thus, I think the current approach is the best solution for both expressivity and simplicity.

C matches the exact identifier Option.none whereas F matches the exact identifier none. Resolving the identifier at match time would break matches where a specific identifier is desired. However, on this point there is a good question as to whether the resolved identifier or the exact identifier is a better default for antiquote patterns. An argument for the choice of exact matches is avoids special casing for particular syntax type, thus making pattern matching behavior more uniform. An argument for resolved matches is that it is usually what one wants and that ident is already a special type of Syntax and there is already special behavior for Syntax.atoms. There is also a question of what to do if the ident can't be resolved in the scope in question.

Siddharth Bhat (Oct 10 2021 at 21:59):

Thank you for the detailed response!

  • I was imagining ignoring parentheses, yes, but you're right, matching on parens would be complicated.
  • I didn't realise that Syntax patterns could witness the difference between things like Option.none and none. I imagined that things were "normalized" before being matched. I guess this is too simplistic.

I imagine I'll develop more of a sense for what defaults I would like when I use the macro system for longer :)


Last updated: Dec 20 2023 at 11:08 UTC