Zulip Chat Archive

Stream: lean4

Topic: bv_decide questions

Vlad Tsyrklevich (Jan 19 2025 at 15:45):

@Henrik Böving Picking up from the conversation here, I played around with implementing some of these lemmas this weekend and I have maybe about ~10 done; however, I realize that upstreaming these also requires understanding what the proper form of statements required for bv_decide is. I have a couple of questions.

1) For now I've mostly been focusing on simple statements that I really consider just missing BitVec API lemmas rather than something that has to only live in the bv_normalize simpset; however, I'm not sure how you think about this. For example, docs#Std.Tactic.BVDecide.Normalize.BitVec.and_contra and docs#Std.Tactic.BVDecide.Normalize.BitVec.add_not seem like they should just be BitVec simp lemmas to me. Or e.g. docs#Std.Tactic.BVDecide.Normalize.BitVec.shiftLeft_zero' seems like it's an exact copy of docs#BitVec.shiftLeft_zero' I guess I'm just looking for if you have specific recommendations about where to put a lemma, or if it's somewhat up to personal interpretation on the given lemma's generality.

2) lean4#5297 adds BitVec.lt_irrefl (among other theorems), but it's not in the simpset. Is that just an oversight or intentional?

3) For now I'm avoiding anything with constant folding (as I haven't looked at the front end code at all), ite/dite (due to lean4#6453), and negative constants (since it seems like reduceNeg always runs first?). Is there anything else I should avoid?

Henrik Böving (Jan 19 2025 at 15:59):

Yes, the reason here is that I mostly wanted to push further on as many simple rules as I could without too much regarder for the greater BitVec reasoning API at the time. Syncing up what we have in BitVec now and in the BVDecide.Normalize space such that general purpose BitVec lemmas get pulled out or in case they already are, removed and then just tagged with the bv_normalize attribute would be a good contribution.
5297 was in preparation for work on UIntX around the time but yes we could add thsi into bv_normalize , note that the canonical form of < is BitVec.ult in the bv_normalize simp set as I want to avoid working with both Prop and Bool so everything just gets pushed over to Bool. So the proper form of this lemma is BitVec.ult x x = false.
To comment on each:
- touching ite,dite could be fine, note however that we normalize to bif: https://github.com/leanprover/lean4/blob/master/src/Std/Tactic/BVDecide/Normalize/Bool.lean#L48, so the lemmas should be written in terms of bif. The reason for this is that we avoid carrying around decidable instances that might need to be type checked in this way
- Everything with constants is a bit interesting yes, in particular things that involve negations (usually -1#w for example) are usually handled in the Simproc module that I sent you. This has the additional benefit that we will also detect a -1 even if it's in its written out decimal form.
- One thing to consider avoiding might be the lemmas that have a lot of potentially symmetric appliactions. For example a && (!a && b) = false may be written in a multitude of other ways, the two sides of the and could be flipped and the inner and could be flipped as well so there are 4 versions of this lemma. While we can of course write them out there is also a world where simp learns to match lemmas up to commutativity and we might be able to make use of that.

Do you have some particular things to work on in mind?

Vlad Tsyrklevich (Jan 19 2025 at 16:16):

I don't have much lean experience, so to start I think I'd like to just fill in some simple lemmas that I think the BitVec API should have regardless of bv_decide since I've started looking at it already. I think looking at adding more constant folding may be a simple goal to get my toes wet in meta-programming, but I should probably read MPIL first and I need to finish one of the books I'm reading before starting any more.

I'll send some PRs slowly, for updates to BitVec/Lemmas.lean should I add you as a reviewer? And should I add lemmas to the bv_normalize simpset simultaneously, or is there perf/other verification you do on your end that I should be aware of?

Henrik Böving (Jan 19 2025 at 16:23):

I usually put a unit test for each rule that is added to bv_normalize into here: https://github.com/leanprover/lean4/blob/master/tests/lean/run/bv_decide_rewriter.lean. Evaluation on SMTLIB always takes ~half a day so we only do this on a periodic basis. Additionally one of the biggest performance issues right now is that simp does not perform well when hunreds of let are involved which is quite often the case in SMTLIB problems. That will be tackled eventually but it's going to take some time until it happens so the benefit that a lot of the rules get us will probably remain hidden underneath the noise that is caused by simp' struggling on large SMTLIB problems for a bit. That being said I have seen on multiple occurences in the past that adding rules can make even small problems speed up by as much as 10x so it will certainly be beneficial regardless!

Yes please do ad me as a reviewer, if the rules are within the form that bv_normalize expects you can also tag them as shown in the stdlib files yes.

Henrik Böving (Jan 19 2025 at 16:24):

Note that one thing you can do to figure out the "correct form of a lemma" is just write out the rule you want to prove before putting it into bv_normalize, then running bv_normalize and inspecting the goal state

Vlad Tsyrklevich (Jan 26 2025 at 13:39):

Question about reviews: lean4#6718 is awaiting review, it's not urgent at all, but I wasn't sure if I should ping the ticket since it may have fallen of the review stack, or what the normal procedure is in such cases.

I've been finished with what I considered all of the low-hanging fruit on the list for a few days. I figured I'd share the illustrative tests here in case you have comments on any of them (in parens is the line of the spreadsheet it correlates to). I think with the exception of the last rule (mul + shiftleft), there should not be concerns about confluence with any of them, though I wasn't sure 100% if there are concerns for the lemmas for sub->add (44 in the spreadsheet)

-- neg_mul / mul_neg (181,182)
example (x y : BitVec 16) : (-x) * y = -(x * y) := by bv_normalize
example (x y : BitVec 16) : x * (-y) = -(x * y) := by bv_normalize
example (x y : BitVec 16) : -x * -y = x * y := by bv_normalize
example (x y : BitVec 16) : (~~~x + 1) * y = ~~~(x * y) + 1 := by bv_normalize
example (x y : BitVec 16) : (1 + ~~~x) * y = ~~~(x * y) + 1 := by bv_normalize
example (x y : BitVec 16) : x * (~~~y + 1) = ~~~(x * y) + 1 := by bv_normalize
example (x y : BitVec 16) : x * (1 + ~~~y) = ~~~(x * y) + 1 := by bv_normalize
example (x y : BitVec 16) : (~~~x + 1) * (~~~y + 1) = x * y := by bv_normalize
example (x y : BitVec 16) : (1 + ~~~x) * (~~~y + 1) = x * y := by bv_normalize
example (x y : BitVec 16) : (1 + ~~~x) * (1 + ~~~y) = x * y := by bv_normalize

-- or_beq_zero_iff (25)
example (x y : BitVec 16) : (x ||| y == 0) = (x == 0 && y == 0) := by bv_normalize

-- xor_beq_zero_iff (24)
example (x y : BitVec 16) : (x ^^^ y == 0) = (x == y) := by bv_normalize

-- add_left_eq_self / add_right_eq_self (42)
example (x y : BitVec 16) : (x + y == x) = (y == 0) := by bv_normalize
example (x y : BitVec 16) : (x + y == y) = (x == 0) := by bv_normalize
example (x y : BitVec 16) : (x == x + y) = (y == 0) := by bv_normalize
example (x y : BitVec 16) : (x == y + x) = (y == 0) := by bv_normalize

-- add_left_inj / add_right_inj (43,36)
example (x y z : BitVec 16) : (x + z == y + z) = (x == y) := by bv_normalize
example (x y z : BitVec 16) : (x + z == z + y) = (x == y) := by bv_normalize
example (x y z : BitVec 16) : (z + x == y + z) = (x == y) := by bv_normalize
example (x y z : BitVec 16) : (z + x == z + y) = (x == y) := by bv_normalize
-- TODO remove
example (x y : BitVec 16) : (-x == -y) = (x == y) := by bv_normalize -- Didn't use to work

-- eq_sub_iff_add_eq / sub_eq_iff_eq_add (44)
example (x y z : BitVec 16) : (x - y == z) = (x == z + y) := by bv_normalize
example (x y z : BitVec 16) : (-x + y == z) = (y == z + x) := by bv_normalize
example (x y z : BitVec 16) : (z == x - y) = (z + y == x) := by bv_normalize
example (x y z : BitVec 16) : (z == -x + y) = (z + x == y) := by bv_normalize

-- lt_irrefl (218,248)
example (x : BitVec 16) : ¬x < x := by bv_normalize
example (x : BitVec 16) : !(x.ult x) := by bv_normalize
example (x : BitVec 16) : !(x.slt x) := by bv_normalize

-- not_lt_zero (245)
example (x : BitVec 16) : ¬x < 0 := by bv_normalize
example (x : BitVec 16) : x ≥ 0 := by bv_normalize
example (x : BitVec 16) : !(x.ult 0) := by bv_normalize

-- lt_one_iff (246)
example (x : BitVec 16) : (x < 1) ↔ (x = 0) := by bv_normalize
example (x : BitVec 16) : (x.ult 1) = (x == 0) := by bv_normalize

-- ushiftRight_self (209)
example (x : BitVec 16) : (x >>> x) == 0 := by bv_normalize

-- ushiftRight_eq_zero / shiftLeft_eq_zero (200,207)
example (x : BitVec 16) : (x >>> 16) == 0 := by bv_normalize
example (x : BitVec 16) : (x >>> 100) == 0 := by bv_normalize
example (x : BitVec 16) : (x <<< 16) == 0 := by bv_normalize
example (x : BitVec 16) : (x <<< 100) == 0 := by bv_normalize

-- shiftLeft_mul / mul_shiftLeft (185)
example (x y : BitVec 16) (n : Nat) : (x <<< n) * y == (x * y) <<< n := by bv_normalize
example (x y : BitVec 16) (n : Nat) : x * (y <<< n) == (x * y) <<< n := by bv_normalize

Henrik Böving (Jan 26 2025 at 14:45):

6718 is fine from my side, I'm just waiting for Kim to approve but she is on holiday till tomorrow so that's why its taking a while

Henrik Böving (Jan 26 2025 at 14:47):

And I think these are pretty great! I'd suggest to wait until Kim approved 6718 and then we can start upstreaming the rest.

Henrik Böving (Jan 27 2025 at 08:58):

Though looking through the lemmas I think the easiest to merge are probably going to be (apart from the one alrady in PR):

218,248
245
246
209
42
43,36

and then we can look into the remainder. In particular if you can prove some of these without extending any theory by BitVec I'd also be willing to review and merge them myself, I'm just holding off on the current PR because I'm no expert on the Int theory

Henrik Böving (Jan 27 2025 at 08:59):

@Vlad Tsyrklevich

Vlad Tsyrklevich (Jan 27 2025 at 09:07):

Sounds good, I figured I'd submit follow-up PRs one-by-one until I feel confident enough that I'm aligned with reviewer expectations that I'm not wasting reviewer time parallelizing them.

I just looked at my local git diff and I've implemented every one of these in BitVec/Lemmas.lean (modulo maybe adding some symmetries for bv_normalize.) The backing theorems seem (to me) to be general enough to be worth having outside of bv_normalize, though I'm happy to move them if you or Kim disagree.

Henrik Böving (Jan 27 2025 at 09:08):

I do think that at least for a couple of these it's useful to have them in BitVec.Lemmas so feel free to leave them there and sen the PR for irrefl :thumbs_up:

Henrik Böving (Jan 27 2025 at 09:48):

@Vlad Tsyrklevich one thing, could you comment changelog-language on your PR so that it lands in the correct changelog section?

Vlad Tsyrklevich (Jan 27 2025 at 09:49):

Sure, should I do this for every PR? The PR submission text I think mentions changelog-library or no-changelog being added by reviewers for external PRs

Henrik Böving (Jan 27 2025 at 09:50):

In general we are touching a language feature here so it should be in the language changelog

Henrik Böving (Jan 27 2025 at 09:51):

Also it seems one of the counter example tests broke, let's see

Henrik Böving (Jan 27 2025 at 09:52):

Ah the last test is now so easy that bv_normalize already changes it to True

Henrik Böving (Jan 27 2025 at 09:54):

https://github.com/leanprover/lean4/blob/master/tests/lean/run/bv_counterexample.lean#L112-L118 If you make this a slightly harder example the test should pass again. The background is that if bv_normalize already detects that a theorem is always incorrect by rewriting we currently throw this error about not finding any theorems as we just reduce the proof by contradiction hypothesis to True

Vlad Tsyrklevich (Jan 27 2025 at 09:54):

sorry, I thought i had run the test suite locally and verified it was green. Will fix now

Henrik Böving (Jan 27 2025 at 14:30):

Vlad Tsyrklevich said:

sorry, I thought i had run the test suite locally and verified it was green. Will fix now

Merged, feel free to continue^^

Kim Morrison (Jan 27 2025 at 22:55):

I've left a comment. I'd like to have the mod/tmod/fmod variants of your new bmod_neg_bmod theorem (if such exist?). I'm happy either if you add them here, or we add them to an internal TODO list (I don't want to bog down a new contributor too much!)

Henrik Böving (Jan 28 2025 at 08:48):

@Vlad Tsyrklevich I put the lt_one_iff one on the merge queue.

Henrik Böving (Jan 29 2025 at 14:17):

@Vlad Tsyrklevich https://github.com/leanprover/lean4/pull/6849#issuecomment-2621777762 is writing a simproc something you feel able to do? Or should I take that?

Vlad Tsyrklevich (Jan 30 2025 at 08:03):

I think not at the moment, I'd rather have a better grasp of meta-programming before spending anyone else's time to review my code. I am curious about the answer to https://github.com/leanprover/lean4/pull/6849#issuecomment-2623782902 though, e.g. when would the simproc handle cases that it doesn't right now? And when does Lean auto-synthesize those hypotheses actually?

Henrik Böving (Jan 30 2025 at 08:09):

When simp tries to apply a simp theorem that has pre conditions it tries to run its so called discharger to resolve them. It would appear that the default discharger is powerful enough for the constant case

Vlad Tsyrklevich (Jan 30 2025 at 08:12):

Yeah from local testing it can't handle the 16 ≤ x + 20 case

Henrik Böving (Jan 30 2025 at 08:17):

Yes one would need to change the discharger to e.g. omega for that but we don't want that for bv_decide

Vlad Tsyrklevich (Jan 30 2025 at 08:20):

The 3 remaining changes I have locally are shiftLeft_mul / eq_sub_iff_add_eq / or_beq_zero_iff + xor_beq_zero_iff from above. Do you think any of these should also be handled by simproc? I could imagine perhaps the or/xor ones could be handled more generally by splitting ands/ors/xors comparisons to constants into bitsliced comparisons instead of just handling the 0 cases.

Henrik Böving (Jan 30 2025 at 08:31):

shiftLeft_mul is probably going to have to be implemented differently. After all we wish to reduce shifts by constants to a zero vec + extract so the pattern from your simp lemma is not going to match that. It would probably still be useful if we had the theorem for the x <<< y y : BitVec w case.
eq_sub_iff_add_eq is probably fine but it's of course important to keep the symmetries in mind here
or_beq_zero_iff and xor_beq_zero_iff are fine I would say

Henrik Böving (Jan 30 2025 at 09:38):

btw as a little motivation towards this effort, when we measured SMTLIB the last time benchmarks like sage/app1/bench_2187.smt2 used to take hundreds of seconds. I took another measurement yesterday and this one is now solved instantly in the rewriting stage https://share.firefox.dev/40TuUje this is of course very punctual, we didn't generally get a 100x improvement but it shows how effective rewriting systems can be. Looking forward to getting some fresh bv_decide data potentially next week on all of SMTLIB^^

Frederick Pu (Jan 30 2025 at 21:19):

is the below a refactor that still needs to be done?
1. Yes, the reason here is that I mostly wanted to push further on as many simple rules as I could without too much regarder for the greater `BitVec` reasoning API at the time. Syncing up what we have in `BitVec` now and in the `BVDecide.Normalize` space such that general purpose `BitVec` lemmas get pulled out or in case they already are, removed and then just tagged with the `bv_normalize` attribute would be a good contribution.

Henrik Böving (Jan 30 2025 at 21:23):

I would say it's been getting less and less as Vlad has been pushing on the rewrites I would say, scrolling through the file right now nothing instantly jumps at me anymore. @Vlad Tsyrklevich do you have some left on your TODO list?

Vlad Tsyrklevich (Jan 30 2025 at 22:14):

I think there's some unification that could be done (e.g. and_contra can now depend on BitVec.and_not_self, this example isn't any shorter line-wise but it's just a bit cleaner as it's depending on BitVec results instead of reproving things), and some results can still be in BitVec like BitVec.zero_lt_iff_zero_neq. I haven't really looked at it, I thought I might try to do a once over once my current changes are merged, but happy to have someone else take a look

Henrik Böving (Jan 31 2025 at 13:22):

@Vlad Tsyrklevich did you not derive a bv_normalize theorem from and_eq_allOnes_iff on purpose?

Vlad Tsyrklevich (Jan 31 2025 at 13:24):

I thought since it needs to match against -1 it needs a simproc

Henrik Böving (Jan 31 2025 at 13:25):

Ah! You are correct yes

Vlad Tsyrklevich (Jan 31 2025 at 13:26):

Yea I just included it for API completeness on the bitvec side

Henrik Böving (Jan 31 2025 at 13:26):

Makes sense yes :thumbs_up:

Henrik Böving (Jan 31 2025 at 13:31):

Okay I put the stuff on the merge queue now. What's the next step from your side?

Vlad Tsyrklevich (Jan 31 2025 at 13:52):

I will submit eq_sub_iff_add_eq / sub_eq_iff_eq_add and I need to think about what you said about shiftLeft_mul / mul_shiftLeft and see if it still makes sense to contribute. After that I'll take another scan of the spreadsheet, but I may be done with the simplest rewrites that don't require simprocs. If I see nothing else I'll probably have another PR or two to clean up things I noticed but otherwise I'll set it down for now.

Henrik Böving (Jan 31 2025 at 13:57):

Sounds good, if you want to get into writing simprocs feel free to ask questions^^

Last updated: May 02 2025 at 03:31 UTC