Zulip Chat Archive

Stream: mathlib4

Topic: bug in linarith


Mario Carneiro (Jan 23 2023 at 09:26):

This error seems to be similar to the one that we found in lean 3 linarith while porting to lean 4. The indented hypotheses below are not used in the proof, but when they are present linarith fails to close the goal. Removing any of them or reordering the hypotheses will make linarith prove the goal. (This is actually a permutation of one of the existing linarith test cases, I stumbled on it when changing the way hypotheses are collected by linarith.)

set_option maxHeartbeats 300000 in
example [LinearOrderedCommRing α] : (u v x A B : α)
-> (3 * A * B < u * x + v * x + u * v)
  -> (0 < (A - v) * (A - v))
  -> (0 < (A - v) * (A - u))
  -> (0 < (A - v) * v)
-> (0 < (A - v) * (B - 1))
-> (0 < (A - u) * (B - 1))
-> (0 < v * (B - x))
-> (0 < v * (B - 1))
-> (0 < u * (A - v))
-> (0 < u * (B - x))
-> (0 < u * (1 - A))
-> (0 < A - v)
-> (0 < A - u)
  -> (0 < (A - v) * (1 - A))
  -> (0 < (A - u) * v)
  -> (0 < (A - u) * u)
  -> (0 < A * (1 - A))
-> False := by
  intros
  linarith

Mario Carneiro (Jan 23 2023 at 09:44):

if anyone feels up to minimizing this example even more, that will be very helpful. This one is large and also expensive (note the maxHeartbeats setting) even when everything is going correctly.

Mario Carneiro (Jan 23 2023 at 09:45):

I don't really want to put this in the test suite after it is fixed

Mario Carneiro (Jan 24 2023 at 01:35):

I have reduced the problem to:

example :
  (3 * x4 - x3 - x2 - x1 : ) < 0 
  x5 - x4 < 0 
  2 * (x5 - x4) < 0 
  -x6 + x3 < 0 
  -x6 + x2 < 0 
  2 * (x6 - x5) < 0 
  x8 - x7 < 0 
  -x8 + x2 < 0 
  -x8 + x7 - x5 + x1 < 0 
  x7 - x5 < 0 
  False := by intros; linarith

Mario Carneiro (Jan 24 2023 at 01:40):

It appears that the issue is in the redundancy criterion (again). Here are the last few steps of elimination:

round 5
[(4, 3), (3, -1), (2, -1), (1, -1)]<0, history = [0], effective = [], implicit = []
[(5, -2), (2, 2)]<0, history = [4, 5], effective = [6], implicit = []
[(5, -2), (3, 2)]<0, history = [3, 5], effective = [6], implicit = []
[(5, -1), (1, 1)]<0, history = [6, 8], effective = [8], implicit = [7]
[(5, -1), (2, 1)]<0, history = [6, 7, 9], effective = [7, 8], implicit = []
[(5, 1), (4, -1)]<0, history = [1], effective = [], implicit = []
[(5, 2), (4, -2)]<0, history = [2], effective = [], implicit = []

accepted [(5, 1), (4, -1)]<0 + [(5, -2), (2, 2)]<0 = [(4, -2), (2, 2)]<0
[(4, -2), (2, 2)]<0, history = [1, 4, 5], effective = [5, 6], implicit = []
accepted [(5, 1), (4, -1)]<0 + [(5, -2), (3, 2)]<0 = [(4, -2), (3, 2)]<0
[(4, -2), (3, 2)]<0, history = [1, 3, 5], effective = [5, 6], implicit = []
accepted [(5, 1), (4, -1)]<0 + [(5, -1), (1, 1)]<0 = [(4, -1), (1, 1)]<0
[(4, -1), (1, 1)]<0, history = [1, 6, 8], effective = [5, 8], implicit = [7]
accepted [(5, 1), (4, -1)]<0 + [(5, -1), (2, 1)]<0 = [(4, -1), (2, 1)]<0
[(4, -1), (2, 1)]<0, history = [1, 6, 7, 9], effective = [5, 7, 8], implicit = []
accepted [(5, 2), (4, -2)]<0 + [(5, -2), (2, 2)]<0 = [(4, -2), (2, 2)]<0
[(4, -2), (2, 2)]<0, history = [2, 4, 5], effective = [5, 6], implicit = []
accepted [(5, 2), (4, -2)]<0 + [(5, -2), (3, 2)]<0 = [(4, -2), (3, 2)]<0
[(4, -2), (3, 2)]<0, history = [2, 3, 5], effective = [5, 6], implicit = []
accepted [(5, 2), (4, -2)]<0 + [(5, -1), (1, 1)]<0 = [(4, -2), (1, 2)]<0
[(4, -2), (1, 2)]<0, history = [2, 6, 8], effective = [5, 8], implicit = [7]
accepted [(5, 2), (4, -2)]<0 + [(5, -1), (2, 1)]<0 = [(4, -2), (2, 2)]<0
[(4, -2), (2, 2)]<0, history = [2, 6, 7, 9], effective = [5, 7, 8], implicit = []

round 4
[(4, -2), (1, 2)]<0, history = [2, 6, 8], effective = [5, 8], implicit = [7]
[(4, -2), (2, 2)]<0, history = [2, 6, 7, 9], effective = [5, 7, 8], implicit = []
[(4, -2), (3, 2)]<0, history = [2, 3, 5], effective = [5, 6], implicit = []
[(4, -1), (1, 1)]<0, history = [1, 6, 8], effective = [5, 8], implicit = [7]
[(4, -1), (2, 1)]<0, history = [1, 6, 7, 9], effective = [5, 7, 8], implicit = []
[(4, 3), (3, -1), (2, -1), (1, -1)]<0, history = [0], effective = [], implicit = []

accepted [(4, 3), (3, -1), (2, -1), (1, -1)]<0 + [(4, -2), (1, 2)]<0 = [(3, -2), (2, -2), (1, 4)]<0
[(3, -2), (2, -2), (1, 4)]<0, history = [0, 2, 6, 8], effective = [4, 5, 8], implicit = [7]
accepted [(4, 3), (3, -1), (2, -1), (1, -1)]<0 + [(4, -2), (2, 2)]<0 = [(3, -2), (2, 4), (1, -2)]<0
[(3, -2), (2, 4), (1, -2)]<0, history = [0, 2, 6, 7, 9], effective = [4, 5, 7, 8], implicit = []
accepted [(4, 3), (3, -1), (2, -1), (1, -1)]<0 + [(4, -2), (3, 2)]<0 = [(3, 4), (2, -2), (1, -2)]<0
[(3, 4), (2, -2), (1, -2)]<0, history = [0, 2, 3, 5], effective = [4, 5, 6], implicit = []
accepted [(4, 3), (3, -1), (2, -1), (1, -1)]<0 + [(4, -1), (1, 1)]<0 = [(3, -1), (2, -1), (1, 2)]<0
[(3, -1), (2, -1), (1, 2)]<0, history = [0, 1, 6, 8], effective = [4, 5, 8], implicit = [7]
accepted [(4, 3), (3, -1), (2, -1), (1, -1)]<0 + [(4, -1), (2, 1)]<0 = [(3, -1), (2, 2), (1, -1)]<0
[(3, -1), (2, 2), (1, -1)]<0, history = [0, 1, 6, 7, 9], effective = [4, 5, 7, 8], implicit = []

round 3
[(3, -2), (2, -2), (1, 4)]<0, history = [0, 2, 6, 8], effective = [4, 5, 8], implicit = [7]
[(3, -2), (2, 4), (1, -2)]<0, history = [0, 2, 6, 7, 9], effective = [4, 5, 7, 8], implicit = []
[(3, -1), (2, -1), (1, 2)]<0, history = [0, 1, 6, 8], effective = [4, 5, 8], implicit = [7]
[(3, -1), (2, 2), (1, -1)]<0, history = [0, 1, 6, 7, 9], effective = [4, 5, 7, 8], implicit = []
[(3, 4), (2, -2), (1, -2)]<0, history = [0, 2, 3, 5], effective = [4, 5, 6], implicit = []

accepted [(3, 4), (2, -2), (1, -2)]<0 + [(3, -2), (2, -2), (1, 4)]<0 = [(2, -6), (1, 6)]<0
[(2, -6), (1, 6)]<0, history = [0, 2, 3, 5, 6, 8], effective = [3, 4, 5, 6, 8], implicit = [7]
accepted [(3, 4), (2, -2), (1, -2)]<0 + [(3, -2), (2, 4), (1, -2)]<0 = [(2, 6), (1, -6)]<0
[(2, 6), (1, -6)]<0, history = [0, 2, 3, 5, 6, 7, 9], effective = [3, 4, 5, 6, 7, 8], implicit = []
accepted [(3, 4), (2, -2), (1, -2)]<0 + [(3, -1), (2, -1), (1, 2)]<0 = [(2, -6), (1, 6)]<0
[(2, -6), (1, 6)]<0, history = [0, 1, 2, 3, 5, 6, 8], effective = [3, 4, 5, 6, 8], implicit = [7]
rejected [(3, 4), (2, -2), (1, -2)]<0 + [(3, -1), (2, 2), (1, -1)]<0 = [(2, 6), (1, -6)]<0
[(2, 6), (1, -6)]<0, history = [0, 1, 2, 3, 5, 6, 7, 9], effective = [3, 4, 5, 6, 7, 8], implicit = []

round 2
[(2, -6), (1, 6)]<0, history = [0, 1, 2, 3, 5, 6, 8], effective = [3, 4, 5, 6, 8], implicit = [7]
[(2, 6), (1, -6)]<0, history = [0, 2, 3, 5, 6, 7, 9], effective = [3, 4, 5, 6, 7, 8], implicit = []

rejected [(2, 6), (1, -6)]<0 + [(2, -6), (1, 6)]<0 = []<0
[]<0, history = [0, 1, 2, 3, 5, 6, 7, 8, 9], effective = [2, 3, 4, 5, 6, 7, 8], implicit = [1]

round 1

round 0

fail: no inequalities remaining

Mario Carneiro (Jan 24 2023 at 01:50):

Here each accepted means we took two inequalities from the previous round and combined them, and rejected means that the resulting inequality violates Imbert's bound. At the end of round 5, we generate the same inequality [(4, -2), (2, 2)]<0 in two ways:

accepted [(5, 2), (4, -2)]<0 + [(5, -2), (2, 2)]<0 = [(4, -2), (2, 2)]<0
[(4, -2), (2, 2)]<0, history = [2, 4, 5], effective = [5, 6], implicit = []
accepted [(5, 2), (4, -2)]<0 + [(5, -1), (2, 1)]<0 = [(4, -2), (2, 2)]<0
[(4, -2), (2, 2)]<0, history = [2, 6, 7, 9], effective = [5, 7, 8], implicit = []

If you omit the last hypothesis (called equation 9 here), then the second one is not generated, but with it the second one overrides the first and so we get the same inequality but with a different history [2, 6, 7, 9] instead of [2, 4, 5]. This on its own is not a problem, but later we end up being unable to construct the contradiction without violating Imbert's bound.

I think the issue is the assumption that we can remove redundant inequalities if they prove the same thing (which we are currently enforcing by putting them in an RBSet based keyed on the inequality). I haven't found too many accessible online sources about the Cernikov-Fourier algorithm but https://www.airc.aist.go.jp/aitec-icot/ICOT/Museum/FGCS/FGCS92en-proc2/92eCLP-1.pdf (sec 3.2.2) mentions that it is incorrect to throw away redundant inequalities unless one history subsumes the other, and gives an example

Heather Macbeth (Jan 24 2023 at 02:03):

We have been thinking along the same lines, but you got further: here's what I had reached:

import Mathlib.Tactic.Linarith

example [LinearOrderedCommRing α]
    {u v A B a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 : α}
    (h1 : 3 * a5 - a7 - a8 - a9 < 0)
    (h2 : -a1 + 2 * a2 - a6 < 0) -- comment out this
    (h3 : -a1 + a2 + a3 - a9 < 0) -- and this
    (h4 : - a2 + a6 < 0) -- and this
    (h5 : - a5 + a10 + A - v < 0)
    (h6 : - a5 + a11 + A - u < 0)
    (h7 : - a10 + a8 < 0)
    (h8 : - a10 + v < 0)
    (h9 : - a3 + a9 < 0)
    (h10 : 0 < a11 - a7)
    (h11 : 0 < u - a3)
    (h12 : 0 < A - v)
    (h13 : 0 < A - u)
    (h14 : 0 < A - v - a1 + a2)
    (h15 : 0 < a2 - a9)
    (h16 : 0 < a3 - a4)
    (h17 : 0 < A - a1) :
    False := by
  linarith

Heather Macbeth (Jan 24 2023 at 02:04):

It feels like @Rob Lewis might be the person to consult on the weeds of Fourier-Motzkin.

Mario Carneiro (Jan 24 2023 at 02:27):

Unfortunately simply not removing redundant inequalities results in really bad performance; Imbert's bound alone is apparently not enough to make Fourier-Motzkin not crazy slow

Heather Macbeth (Jan 24 2023 at 02:28):

Does mathlib3 linarith have the same bug?

Heather Macbeth (Jan 24 2023 at 02:29):

... or Coq lra?

Mario Carneiro (Jan 24 2023 at 02:35):

pretty sure mathlib3 linarith has the same bug, although I haven't tried to replicate it. No idea about lra

Mario Carneiro (Jan 24 2023 at 02:45):

lra does not use F-M, it uses the simplex method

Heather Macbeth (Jan 24 2023 at 03:02):

In the short term, it seems better to keep the version that proves everything it's supposed to, even if it's very slow. What do you think?

Heather Macbeth (Mar 08 2023 at 07:43):

I opened !4#2717 to track this.

Yury G. Kudryashov (May 03 2024 at 22:13):

The following example doesn't work:

import Mathlib

example (a b : ) (h1 : a + b = 2) (h2 : 2 * a + b = 3) : a = 1 := by
  linarith

Yury G. Kudryashov (May 03 2024 at 22:13):

@Mario Carneiro @Rob Lewis :up: Any hints about ways to debug this?

Yury G. Kudryashov (May 03 2024 at 22:14):

Here is the output with set_option trace.linarith true and set_option trace.linarith.detail true:

[linarith] target is an equality: splitting
[linarith] linarith is running on the following hypotheses:
[linarith] [, , a + b = 2, 2 * a + b = 3, a < 1]
[linarith] Preprocessing: split conjunctions
[linarith] [, , a + b = 2, 2 * a + b = 3, a < 1]
[linarith] Preprocessing: filter terms that are not proofs of comparisons
[linarith] [a + b = 2, 2 * a + b = 3, a < 1]
[linarith] Preprocessing: replace negations of comparisons
[linarith] [a + b = 2, 2 * a + b = 3, a < 1]
[linarith] Preprocessing: move nats to ints
[linarith] [a + b = 2, 2 * a + b = 3, a < 1]
[linarith] Preprocessing: strengthen strict inequalities over int
[linarith] [a + b = 2, 2 * a + b = 3, a < 1]
[linarith] Preprocessing: make comparisons with zero
[linarith] [a + b - 2 = 0, 2 * a + b - 3 = 0, a - 1 < 0]
[linarith] Preprocessing: cancel denominators
[linarith] [a + b - 2 = 0, 2 * a + b - 3 = 0, a - 1 < 0]
[linarith] after preprocessing, linarith has 3 facts:
[linarith] [a + b - 2 = 0, 2 * a + b - 3 = 0, a - 1 < 0]
[linarith] hypotheses appear in 1 different types
[linarith.detail] Beginning work in `proveFalseByLinarith`.
[linarith.detail] ... finished `addNegEqProofs`.
[linarith.detail] ... finished `mkNegOneLtZeroProof`.
[linarith.detail] [-1 < 0, -(a + b - 2) = 0, a + b - 2 = 0, -(2 * a + b - 3) = 0, 2 * a + b - 3 = 0, a - 1 < 0]
[linarith.detail] monomial map: [([], 0), ([(1, 1)], 2), ([(2, 1)], 1)]
[linarith.detail] ... finished `linearFormsAndMaxVar`.
[linarith.detail] [[(0, -1)]<0,
     [(2, -1), (1, -1), (0, 2)]=0,
     [(2, 1), (1, 1), (0, -2)]=0,
     [(2, -2), (1, -1), (0, 3)]=0,
     [(2, 2), (1, 1), (0, -3)]=0,
     [(2, 1), (0, -1)]<0]
[linarith] failed

Yury G. Kudryashov (May 03 2024 at 22:15):

It works with integer variables but fails with rational or real vars.

Yury G. Kudryashov (May 03 2024 at 22:18):

Strange detail: linarith only [h1, h2] works while linarith fails.

Jireh Loreaux (May 03 2024 at 23:01):

Does it work in the PR with the simplex oracle for linarith?

Shreyas Srinivas (May 04 2024 at 00:42):

why does nlinarith work?

Yury G. Kudryashov (May 04 2024 at 00:53):

It works with the new oracle, thanks!


Last updated: May 02 2025 at 03:31 UTC