Zulip Chat Archive
Stream: new members
Topic: e-graphs
Bryan Gin-ge Chen (Mar 21 2021 at 15:52):
Welcome! I guess you're talking about the paper "Efficient E-matching for SMT Solvers"? I'm probably not the right person to try to answer your question but maybe @Mario Carneiro or @Gabriel Ebner can help.
Mario Carneiro (Mar 21 2021 at 15:53):
What are e-graphs? Maybe I know them by another name, but that doesn't ring a bell
Miguel Raz Guzmán Macedo (Mar 27 2021 at 20:07):
Chiming here - would also be interested in knowing about egraph use in Lean. I take it the pattern matcher for tactics got an algorithmkc overhaul from Lean 3
Miguel Raz Guzmán Macedo (Apr 11 2021 at 20:33):
Bump. Does the synbolic rewriter in rfl
use a particular algorithm for egraphs?
Miguel Raz Guzmán Macedo (Apr 11 2021 at 20:33):
If one wanted to add tactics similar to RUBI, would that be by overloading simp
?
Miguel Raz Guzmán Macedo (Apr 11 2021 at 20:37):
https://rulebasedintegration.org/
Eric Wieser (Apr 11 2021 at 21:22):
What do you mean by the symbolic rewriter in rfl
?
Eric Wieser (Apr 11 2021 at 21:23):
rfl
is just a shorthand for docs#eq.refl, a constructor for docs#eq
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:07):
(deleted)
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:11):
@Eric Wieser here's waht I understand:
- Lean4 has a "symbolic" rewriter for terms in the goals
From the manual:
-- You can prove theorems about your functions.
-- The following theorem states that for any natural number `a`,
-- adding 2 twice produces a value equal to `a + 4`.
theorem twiceAdd2 (a : Nat) : twice (fun x => x + 2) a = a + 4 :=
-- The proof is by reflexivity. Lean "symbolically" reduces both sides of the equality
-- until they are identical.
rfl
- the
eq.refl
you pointed to is in mathlib, aka Lean3. - In RUBI, the symbolic term rewriter is crucial for performance and solvability of integration problems.
- Leo published a fundamental paper of egraphs for symbolic term rewriting, and @Alessandro and I are wondering if that was implemented as the symbolic term rewriter in Lean4.
Mario Carneiro (Apr 12 2021 at 01:11):
Lean 4's rfl is the same as lean 3's rfl. It's not a "symbolic rewriter", it compares values up to a relation called definitional equality
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:12):
Thanks Mario, nice to see you around here too :D
Mario Carneiro (Apr 12 2021 at 01:12):
which involves weakly normalizing both sides to a constructor
Mario Carneiro (Apr 12 2021 at 01:12):
I am everywhere
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:13):
Well I'm bringing this up because I'm "close" :tm: to porting RUBI to Julia, and from there porting RUBI to Lean wouldn't be "that much" additional effort. I would want to know which tactic I could dump a lhs = rhs
ruleset, if at all feasible.
Mario Carneiro (Apr 12 2021 at 01:14):
As far as I know, nothing about egraphs or symbolic term rewriting is involved in the implementation of rfl and/or defeq in any version of lean. This is much more based on literature on dependent type theory, reduction and strong normalization stuff
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:15):
Oh neat.
Mario Carneiro (Apr 12 2021 at 01:15):
However, you will find more of leo's creativity in the simp
tactic, which does symbolic term rewriting closer to what you are talking about
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:15):
Ah, that's the one. So I could in theory overload simp
with a bunch of these integration rules, right?
Mario Carneiro (Apr 12 2021 at 01:16):
yes
Mario Carneiro (Apr 12 2021 at 01:16):
We've done that for derivatives in mathlib
Mario Carneiro (Apr 12 2021 at 01:16):
I think integration too but I guess the coverage is lower
Mario Carneiro (Apr 12 2021 at 01:17):
its ability to pattern match on lambdas is limited though, so most likely you would do a full integration routine as a dedicated tactic
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:17):
Ouch.
Mario Carneiro (Apr 12 2021 at 01:20):
I mean, it's pretty hard to recognize that in the function \lam x, sin x * (2 / cos x)
you want to pull out the constant factor 2 (let's say this is a preprocessing step)
Mario Carneiro (Apr 12 2021 at 01:21):
you would have a huge number of possible patterns to look for to find the 2
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:22):
Yup, that's a transformation that egraphs can achieve linear in the number of patterns that match, without running into the phase ordering problem of some transformations will bork others.
Hence why we're asking :D
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:22):
This is a good overview (that I'm still becoming familiar with myself)
https://egraphs-good.github.io/
Mario Carneiro (Apr 12 2021 at 01:25):
I don't think I have seen anything like this implemented in lean
Mario Carneiro (Apr 12 2021 at 01:25):
It is possible to implement it yourself, of course
Mario Carneiro (Apr 12 2021 at 01:26):
(in lean 3 or lean 4, although it's probably better suited for lean 4)
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:36):
Yeah, condolences to the mathlib crowd, I'm learning Lean4 - the garbage collector's too good to pass up :muscle:
Miguel Raz Guzmán Macedo (Apr 12 2021 at 01:37):
(And in fact it will probably do amazingly in the benchmarks what with the functional but in place style not generating tons of allocations in all the symbolic computations.)
Wojciech Nawrocki (Jan 18 2022 at 19:02):
Bumping this thread to note a new work which (claims to?) achieve exponential (!) speedups on e-matching queries in the egg library.
Martin Dvořák (Jan 18 2022 at 19:25):
:eyes: 2022 :eyes:
Utensil Song (May 22 2023 at 13:59):
I was looking for the counter-part of e-graphs in Lean and came across this thread. Based on some search here, I assume it's not implemented in Lean yet.
I'm working on something based on it that should have been done in a proof assistant like Lean, but it's tempting to give it a try as it's a lighter, seemingly sufficient and possible faster solution.
I wonder:
- the level of its efficiency in the eyes of Lean tactic developers
- its soundness in the eyes of Lean type theorists
- the closest mechanism in Lean, pros&cons etc.
Any comments will be greatly appreciated!
Also, here's their latest progress:
- paper Better Together: Unifying Datalog and Equality Saturation
- web demo (rust compiled to wasm): https://egraphs-good.github.io/egglog/ , there are quite some serious and sophisticated use cases like for Herbie’s EqSat
- code (rust): https://github.com/egraphs-good/egglog/
- code (high-level python binding): https://github.com/metadsl/egglog-python
Marcus Rossel (May 22 2023 at 16:14):
cc @Andrés Goens
Andrés Goens (May 22 2023 at 16:18):
@Utensil Song E-graphs (and congruence closure) were implemented in older versions of Lean: see this paper for the Lean 2 description and there's a mathlib tactic for Lean 3 called cc that ports that. If you use Lean 3, then that's great to use! As far as I understand it's not been ported to Lean 4 yet.
Andrés Goens (May 22 2023 at 16:21):
We have a prototype that uses egg, the predecessor of egglog in Lean 4, see: https://github.com/opencompl/egg-tactic-code this is not really mature yet, but we might get back to it at some point hopefully, perhaps integrate with egglog instead and/or with a native Lean 4 implementation, but that's work in progress. This is sound too, because it checks the proof witness generated by egg. If this is useful we're happy to help you use it too, but chances are it might not work for your use case (e.g. currently it doesn't support different types in the term, etc, so it's very bare-bones).
Andrés Goens (May 22 2023 at 16:21):
What's your use case?
Utensil Song (May 22 2023 at 23:45):
Thanks for the comprehensive explanation! It's great to know it naturally worked/works/will work with Lean 2/3/4 ! I'll check them out
Utensil Song (May 23 2023 at 00:04):
I need some experiments before I can determine the extent of my use cases. Now that knowing the results could integrate with Lean, it's definitely worth to dig deeper :big_smile:
Utensil Song (May 23 2023 at 00:13):
I was aware of cc
but didn't connect the dots. It would seem that a full integration with egg/egglog could quite some more than cc
.
Utensil Song (May 23 2023 at 00:16):
I was a bit concerned with its efficiency (particularly the space complexity) since it preserve more info than usual tree-ish heuristic based approach although it's quite efficient for a graph-based one. Will experiment with complicated use cases and report back.
Last updated: Dec 20 2023 at 11:08 UTC