Zulip Chat Archive

Stream: mathlib4

Topic: bug in linarith


Mario Carneiro (Jan 23 2023 at 09:26):

This error seems to be similar to the one that we found in lean 3 linarith while porting to lean 4. The indented hypotheses below are not used in the proof, but when they are present linarith fails to close the goal. Removing any of them or reordering the hypotheses will make linarith prove the goal. (This is actually a permutation of one of the existing linarith test cases, I stumbled on it when changing the way hypotheses are collected by linarith.)

set_option maxHeartbeats 300000 in
example [LinearOrderedCommRing α] : (u v x A B : α)
-> (3 * A * B < u * x + v * x + u * v)
  -> (0 < (A - v) * (A - v))
  -> (0 < (A - v) * (A - u))
  -> (0 < (A - v) * v)
-> (0 < (A - v) * (B - 1))
-> (0 < (A - u) * (B - 1))
-> (0 < v * (B - x))
-> (0 < v * (B - 1))
-> (0 < u * (A - v))
-> (0 < u * (B - x))
-> (0 < u * (1 - A))
-> (0 < A - v)
-> (0 < A - u)
  -> (0 < (A - v) * (1 - A))
  -> (0 < (A - u) * v)
  -> (0 < (A - u) * u)
  -> (0 < A * (1 - A))
-> False := by
  intros
  linarith

Mario Carneiro (Jan 23 2023 at 09:44):

if anyone feels up to minimizing this example even more, that will be very helpful. This one is large and also expensive (note the maxHeartbeats setting) even when everything is going correctly.

Mario Carneiro (Jan 23 2023 at 09:45):

I don't really want to put this in the test suite after it is fixed

Mario Carneiro (Jan 24 2023 at 01:35):

I have reduced the problem to:

example :
  (3 * x4 - x3 - x2 - x1 : ) < 0 
  x5 - x4 < 0 
  2 * (x5 - x4) < 0 
  -x6 + x3 < 0 
  -x6 + x2 < 0 
  2 * (x6 - x5) < 0 
  x8 - x7 < 0 
  -x8 + x2 < 0 
  -x8 + x7 - x5 + x1 < 0 
  x7 - x5 < 0 
  False := by intros; linarith

Mario Carneiro (Jan 24 2023 at 01:40):

It appears that the issue is in the redundancy criterion (again). Here are the last few steps of elimination:

round 5
[(4, 3), (3, -1), (2, -1), (1, -1)]<0, history = [0], effective = [], implicit = []
[(5, -2), (2, 2)]<0, history = [4, 5], effective = [6], implicit = []
[(5, -2), (3, 2)]<0, history = [3, 5], effective = [6], implicit = []
[(5, -1), (1, 1)]<0, history = [6, 8], effective = [8], implicit = [7]
[(5, -1), (2, 1)]<0, history = [6, 7, 9], effective = [7, 8], implicit = []
[(5, 1), (4, -1)]<0, history = [1], effective = [], implicit = []
[(5, 2), (4, -2)]<0, history = [2], effective = [], implicit = []

accepted [(5, 1), (4, -1)]<0 + [(5, -2), (2, 2)]<0 = [(4, -2), (2, 2)]<0
[(4, -2), (2, 2)]<0, history = [1, 4, 5], effective = [5, 6], implicit = []
accepted [(5, 1), (4, -1)]<0 + [(5, -2), (3, 2)]<0 = [(4, -2), (3, 2)]<0
[(4, -2), (3, 2)]<0, history = [1, 3, 5], effective = [5, 6], implicit = []
accepted [(5, 1), (4, -1)]<0 + [(5, -1), (1, 1)]<0 = [(4, -1), (1, 1)]<0
[(4, -1), (1, 1)]<0, history = [1, 6, 8], effective = [5, 8], implicit = [7]
accepted [(5, 1), (4, -1)]<0 + [(5, -1), (2, 1)]<0 = [(4, -1), (2, 1)]<0
[(4, -1), (2, 1)]<0, history = [1, 6, 7, 9], effective = [5, 7, 8], implicit = []
accepted [(5, 2), (4, -2)]<0 + [(5, -2), (2, 2)]<0 = [(4, -2), (2, 2)]<0
[(4, -2), (2, 2)]<0, history = [2, 4, 5], effective = [5, 6], implicit = []
accepted [(5, 2), (4, -2)]<0 + [(5, -2), (3, 2)]<0 = [(4, -2), (3, 2)]<0
[(4, -2), (3, 2)]<0, history = [2, 3, 5], effective = [5, 6], implicit = []
accepted [(5, 2), (4, -2)]<0 + [(5, -1), (1, 1)]<0 = [(4, -2), (1, 2)]<0
[(4, -2), (1, 2)]<0, history = [2, 6, 8], effective = [5, 8], implicit = [7]
accepted [(5, 2), (4, -2)]<0 + [(5, -1), (2, 1)]<0 = [(4, -2), (2, 2)]<0
[(4, -2), (2, 2)]<0, history = [2, 6, 7, 9], effective = [5, 7, 8], implicit = []

round 4
[(4, -2), (1, 2)]<0, history = [2, 6, 8], effective = [5, 8], implicit = [7]
[(4, -2), (2, 2)]<0, history = [2, 6, 7, 9], effective = [5, 7, 8], implicit = []
[(4, -2), (3, 2)]<0, history = [2, 3, 5], effective = [5, 6], implicit = []
[(4, -1), (1, 1)]<0, history = [1, 6, 8], effective = [5, 8], implicit = [7]
[(4, -1), (2, 1)]<0, history = [1, 6, 7, 9], effective = [5, 7, 8], implicit = []
[(4, 3), (3, -1), (2, -1), (1, -1)]<0, history = [0], effective = [], implicit = []

accepted [(4, 3), (3, -1), (2, -1), (1, -1)]<0 + [(4, -2), (1, 2)]<0 = [(3, -2), (2, -2), (1, 4)]<0
[(3, -2), (2, -2), (1, 4)]<0, history = [0, 2, 6, 8], effective = [4, 5, 8], implicit = [7]
accepted [(4, 3), (3, -1), (2, -1), (1, -1)]<0 + [(4, -2), (2, 2)]<0 = [(3, -2), (2, 4), (1, -2)]<0
[(3, -2), (2, 4), (1, -2)]<0, history = [0, 2, 6, 7, 9], effective = [4, 5, 7, 8], implicit = []
accepted [(4, 3), (3, -1), (2, -1), (1, -1)]<0 + [(4, -2), (3, 2)]<0 = [(3, 4), (2, -2), (1, -2)]<0
[(3, 4), (2, -2), (1, -2)]<0, history = [0, 2, 3, 5], effective = [4, 5, 6], implicit = []
accepted [(4, 3), (3, -1), (2, -1), (1, -1)]<0 + [(4, -1), (1, 1)]<0 = [(3, -1), (2, -1), (1, 2)]<0
[(3, -1), (2, -1), (1, 2)]<0, history = [0, 1, 6, 8], effective = [4, 5, 8], implicit = [7]
accepted [(4, 3), (3, -1), (2, -1), (1, -1)]<0 + [(4, -1), (2, 1)]<0 = [(3, -1), (2, 2), (1, -1)]<0
[(3, -1), (2, 2), (1, -1)]<0, history = [0, 1, 6, 7, 9], effective = [4, 5, 7, 8], implicit = []

round 3
[(3, -2), (2, -2), (1, 4)]<0, history = [0, 2, 6, 8], effective = [4, 5, 8], implicit = [7]
[(3, -2), (2, 4), (1, -2)]<0, history = [0, 2, 6, 7, 9], effective = [4, 5, 7, 8], implicit = []
[(3, -1), (2, -1), (1, 2)]<0, history = [0, 1, 6, 8], effective = [4, 5, 8], implicit = [7]
[(3, -1), (2, 2), (1, -1)]<0, history = [0, 1, 6, 7, 9], effective = [4, 5, 7, 8], implicit = []
[(3, 4), (2, -2), (1, -2)]<0, history = [0, 2, 3, 5], effective = [4, 5, 6], implicit = []

accepted [(3, 4), (2, -2), (1, -2)]<0 + [(3, -2), (2, -2), (1, 4)]<0 = [(2, -6), (1, 6)]<0
[(2, -6), (1, 6)]<0, history = [0, 2, 3, 5, 6, 8], effective = [3, 4, 5, 6, 8], implicit = [7]
accepted [(3, 4), (2, -2), (1, -2)]<0 + [(3, -2), (2, 4), (1, -2)]<0 = [(2, 6), (1, -6)]<0
[(2, 6), (1, -6)]<0, history = [0, 2, 3, 5, 6, 7, 9], effective = [3, 4, 5, 6, 7, 8], implicit = []
accepted [(3, 4), (2, -2), (1, -2)]<0 + [(3, -1), (2, -1), (1, 2)]<0 = [(2, -6), (1, 6)]<0
[(2, -6), (1, 6)]<0, history = [0, 1, 2, 3, 5, 6, 8], effective = [3, 4, 5, 6, 8], implicit = [7]
rejected [(3, 4), (2, -2), (1, -2)]<0 + [(3, -1), (2, 2), (1, -1)]<0 = [(2, 6), (1, -6)]<0
[(2, 6), (1, -6)]<0, history = [0, 1, 2, 3, 5, 6, 7, 9], effective = [3, 4, 5, 6, 7, 8], implicit = []

round 2
[(2, -6), (1, 6)]<0, history = [0, 1, 2, 3, 5, 6, 8], effective = [3, 4, 5, 6, 8], implicit = [7]
[(2, 6), (1, -6)]<0, history = [0, 2, 3, 5, 6, 7, 9], effective = [3, 4, 5, 6, 7, 8], implicit = []

rejected [(2, 6), (1, -6)]<0 + [(2, -6), (1, 6)]<0 = []<0
[]<0, history = [0, 1, 2, 3, 5, 6, 7, 8, 9], effective = [2, 3, 4, 5, 6, 7, 8], implicit = [1]

round 1

round 0

fail: no inequalities remaining

Mario Carneiro (Jan 24 2023 at 01:50):

Here each accepted means we took two inequalities from the previous round and combined them, and rejected means that the resulting inequality violates Imbert's bound. At the end of round 5, we generate the same inequality [(4, -2), (2, 2)]<0 in two ways:

accepted [(5, 2), (4, -2)]<0 + [(5, -2), (2, 2)]<0 = [(4, -2), (2, 2)]<0
[(4, -2), (2, 2)]<0, history = [2, 4, 5], effective = [5, 6], implicit = []
accepted [(5, 2), (4, -2)]<0 + [(5, -1), (2, 1)]<0 = [(4, -2), (2, 2)]<0
[(4, -2), (2, 2)]<0, history = [2, 6, 7, 9], effective = [5, 7, 8], implicit = []

If you omit the last hypothesis (called equation 9 here), then the second one is not generated, but with it the second one overrides the first and so we get the same inequality but with a different history [2, 6, 7, 9] instead of [2, 4, 5]. This on its own is not a problem, but later we end up being unable to construct the contradiction without violating Imbert's bound.

I think the issue is the assumption that we can remove redundant inequalities if they prove the same thing (which we are currently enforcing by putting them in an RBSet based keyed on the inequality). I haven't found too many accessible online sources about the Cernikov-Fourier algorithm but https://www.airc.aist.go.jp/aitec-icot/ICOT/Museum/FGCS/FGCS92en-proc2/92eCLP-1.pdf (sec 3.2.2) mentions that it is incorrect to throw away redundant inequalities unless one history subsumes the other, and gives an example

Heather Macbeth (Jan 24 2023 at 02:03):

We have been thinking along the same lines, but you got further: here's what I had reached:

import Mathlib.Tactic.Linarith

example [LinearOrderedCommRing α]
    {u v A B a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 : α}
    (h1 : 3 * a5 - a7 - a8 - a9 < 0)
    (h2 : -a1 + 2 * a2 - a6 < 0) -- comment out this
    (h3 : -a1 + a2 + a3 - a9 < 0) -- and this
    (h4 : - a2 + a6 < 0) -- and this
    (h5 : - a5 + a10 + A - v < 0)
    (h6 : - a5 + a11 + A - u < 0)
    (h7 : - a10 + a8 < 0)
    (h8 : - a10 + v < 0)
    (h9 : - a3 + a9 < 0)
    (h10 : 0 < a11 - a7)
    (h11 : 0 < u - a3)
    (h12 : 0 < A - v)
    (h13 : 0 < A - u)
    (h14 : 0 < A - v - a1 + a2)
    (h15 : 0 < a2 - a9)
    (h16 : 0 < a3 - a4)
    (h17 : 0 < A - a1) :
    False := by
  linarith

Heather Macbeth (Jan 24 2023 at 02:04):

It feels like @Rob Lewis might be the person to consult on the weeds of Fourier-Motzkin.

Mario Carneiro (Jan 24 2023 at 02:27):

Unfortunately simply not removing redundant inequalities results in really bad performance; Imbert's bound alone is apparently not enough to make Fourier-Motzkin not crazy slow

Heather Macbeth (Jan 24 2023 at 02:28):

Does mathlib3 linarith have the same bug?

Heather Macbeth (Jan 24 2023 at 02:29):

... or Coq lra?

Mario Carneiro (Jan 24 2023 at 02:35):

pretty sure mathlib3 linarith has the same bug, although I haven't tried to replicate it. No idea about lra

Mario Carneiro (Jan 24 2023 at 02:45):

lra does not use F-M, it uses the simplex method

Heather Macbeth (Jan 24 2023 at 03:02):

In the short term, it seems better to keep the version that proves everything it's supposed to, even if it's very slow. What do you think?

Heather Macbeth (Mar 08 2023 at 07:43):

I opened !4#2717 to track this.


Last updated: Dec 20 2023 at 11:08 UTC