Zulip Chat Archive
Stream: general
Topic: Minimizer script
Anne Baanen (Oct 11 2023 at 15:07):
I wrote a semi-automatic minimization script for creating a #mwe from a test case. This script is inspired by coq-bug-minimizer.
Basic usage is to create a file, say testcase.lean, that exhibits some error, for example a timeout. Run ./scripts/minimizer testcase.lean and it will automatically create and update a more minimal testcase.min.1.lean which exhibits the same error¹. The file is updated as more minimization passes succeed. When no more minimization applies and the script stops, or you are tired of waiting and you kill the script, you can edit testcase.min.1.lean manually and restart the script on your edited output by running ./scripts/minimizer testcase.min.1.lean. The script then produces testcase.min.2.lean etc.
The architecture applies a sequence of minimization passes repeatedly, stopping when no pass makes any more progress. The passes can do basically anything with the source, and a pass makes progress when the new file is different but exhibits the same error. I've implemented a few naïve textual replacements, like substituting everything that comes after a := with sorry. Since only a subset of those changes will actually work, the script includes a framework that bisects the set of changes to find exactly which ones can be applied. That also means very brute-force changes like deleting declarations can be done in a pass.
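To illustrate the pass-and-bisect architecture described here, a minimal sketch in Python, assuming span-shaped edits, a lake env lean invocation and made-up helper names rather than the script's actual interface:

```python
# Illustrative sketch (not the actual script): a pass proposes candidate edits
# as (start, end, replacement) spans over the source text, and the framework
# bisects to find a subset of them that still reproduces the original error.
import subprocess
import tempfile

def compiler_output(source: str) -> str:
    """Compile `source` and return the Lean compiler's stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".lean", delete=False) as f:
        f.write(source)
        path = f.name
    result = subprocess.run(["lake", "env", "lean", path],
                            capture_output=True, text=True)
    return result.stdout  # the real script also normalizes filenames, line numbers, ...

def apply_spans(source: str, edits):
    """Apply span edits from last to first so earlier offsets stay valid."""
    for start, end, replacement in sorted(edits, reverse=True):
        source = source[:start] + replacement + source[end:]
    return source

def bisect_edits(source: str, edits, reference_output: str):
    """Find a subset of candidate edits whose application keeps the error."""
    if not edits:
        return []
    if compiler_output(apply_spans(source, edits)) == reference_output:
        return list(edits)            # all candidates work at once
    if len(edits) == 1:
        return []                     # this lone edit makes the error disappear
    mid = len(edits) // 2
    left = bisect_edits(source, edits[:mid], reference_output)
    right = bisect_edits(source, edits[mid:], reference_output)
    kept = left + right
    # the halves were tested independently, so confirm they still combine
    if kept and compiler_output(apply_spans(source, kept)) == reference_output:
        return kept
    return left if len(left) >= len(right) else right
```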
Current transformations are:
- stripping out comments
- deleting #align and #align_import
- deleting declarations
- replacing everything after := with sorry (see the sketch below)
- deleting imports
- replacing imports with the contents of those files
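As an illustration of how one textual pass might propose candidate edits in the same span format as the framework sketch above, a sketch of the ":= … with sorry" replacement; the regex and the column-0 heuristic for finding the end of a body are simplifying assumptions:

```python
import re

def sorry_pass(source: str):
    """Propose one candidate edit per ':=' body: replace it with ' sorry'.
    A body here crudely runs until the next line starting in column 0."""
    edits = []
    for match in re.finditer(r":=", source):
        start = match.end()
        nxt = re.search(r"\n(?=\S)", source[start:])
        end = start + nxt.start() if nxt else len(source)
        edits.append((start, end, " sorry"))
    return edits

# The candidates are then filtered by the bisection framework, e.g.
#   reference_output = compiler_output(original_source)
#   kept = bisect_edits(original_source, sorry_pass(original_source), reference_output)
#   minimized = apply_spans(original_source, kept)
```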
¹Currently, the script checks that the Lean compiler's stdout is the same on the minimized file and the original, ignoring filenames, line numbers and sorry.
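A minimal sketch of such an equivalence check, assuming the normalization amounts to scrubbing file:line:column prefixes and sorry warnings from the compiler output; the script's actual rules may differ:

```python
import re

def normalize(output: str) -> str:
    """Scrub details that legitimately differ between the original and the
    minimized file: file/line/column prefixes and 'sorry' warnings."""
    out = re.sub(r"^[^\s:]*\.lean:\d+:\d+:", "FILE:L:C:", output, flags=re.MULTILINE)
    return "\n".join(line for line in out.splitlines()
                     if "declaration uses 'sorry'" not in line)

def same_error(original_output: str, minimized_output: str) -> bool:
    return normalize(original_output) == normalize(minimized_output)
```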
Anne Baanen (Oct 11 2023 at 15:13):
Things are still very naïve and unoptimized, but it should be enough to get your code going.
Scott Morrison (Oct 11 2023 at 17:09):
This is amazing. A potentially very useful transformation is replacing a structure field definition with sorry.
Scott Morrison (Oct 11 2023 at 17:12):
An unfortunately non-local transformation that is often essential for good minimisation is to remove a structure field entirely (which then requires removing it from all constructors, too). I guess this can be converted into a sequence of local transformations by allowing adding a default value of sorry?
Julian Berman (Oct 11 2023 at 17:31):
In case it's interesting, or even better perhaps useful, https://github.com/DRMacIver/structureshrink is a cool thing. Not sure if what the minimization script is doing can be coerced to run within it, but it implements a concrete shrinking algorithm.
Anne Baanen (Oct 11 2023 at 17:52):
Scott Morrison said:
An unfortunately non-local transformation that is often essential for good minimisation is to remove a structure field entirely (which then requires removing it from all constructors, too). I guess this can be converted into a sequence of local transformations by allowing adding a default value of sorry?
That's an interesting idea! But I suspect that the local transformation wouldn't work in case someone calls the constructor directly or uses anonymous structure notation. On the other hand, there's nothing in the setup that prohibits calling Lean during a pass, so we could try to use that as feedback for how to fix a broken build.
Anne Baanen (Oct 11 2023 at 17:57):
Julian Berman said:
In case it's interesting, or even better perhaps useful, https://github.com/DRMacIver/structureshrink is a cool thing. Not sure if what the minimization script is doing can be coerced to run within it, but it implements a concrete shrinking algorithm.
That's an interesting algorithm! I don't think we want to interface with structureshrink directly, since the notion of minimal is quite different (in particular, we want our file to grow when we inline imports).
Scott Morrison (Oct 12 2023 at 03:30):
Something I've found essential in minimization previously is to be able to run the sources against two versions of Lean, preserving that it works in one, and errors with a specified error message in the other.
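This two-compiler oracle is not something the script currently does; a sketch of what the check might look like, with lean-good and lean-bad as hypothetical stand-ins for the two toolchains:

```python
import subprocess

def two_version_oracle(path: str, expected_error: str,
                       good_lean: str = "lean-good", bad_lean: str = "lean-bad") -> bool:
    """A candidate file is interesting if it compiles with one toolchain and
    fails with the expected message on the other.  Toolchain names are hypothetical."""
    good = subprocess.run([good_lean, path], capture_output=True, text=True)
    bad = subprocess.run([bad_lean, path], capture_output=True, text=True)
    return (good.returncode == 0
            and bad.returncode != 0
            and expected_error in (bad.stdout + bad.stderr))
```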
Anne Baanen (Oct 12 2023 at 08:34):
I let the script run on the field structure for DirectLimit for a night. Its progress is... not so impressive:
import Mathlib.Algebra.DirectSum.Basic
import Mathlib.LinearAlgebra.Basis
import Mathlib.LinearAlgebra.DFinsupp
import Mathlib.RingTheory.FreeCommRing
import Mathlib.RingTheory.Ideal.Quotient
universe u v w u₁
namespace DirectSum
open DirectSum
section General
variable {R : Type u} [Semiring R]
variable {ι : Type v} [dec_ι : DecidableEq ι]
variable {M : ι → Type w} [∀ i, AddCommMonoid (M i)] [∀ i, Module R (M i)]
instance : Module R (⨁ i, M i) :=
DFinsupp.module
instance {S : Type*} [Semiring S] [∀ i, Module S (M i)] [∀ i, SMulCommClass R S (M i)] :
SMulCommClass R S (⨁ i, M i) := sorry
instance {S : Type*} [Semiring S] [SMul R S] [∀ i, Module S (M i)] [∀ i, IsScalarTower R S (M i)] :
IsScalarTower R S (⨁ i, M i) := sorry
instance [∀ i, Module Rᵐᵒᵖ (M i)] [∀ i, IsCentralScalar R (M i)] : IsCentralScalar R (⨁ i, M i) := sorry
theorem smul_apply (b : R) (v : ⨁ i, M i) (i : ι) : (b • v) i = b • v i := sorry
variable (R ι M)
def lmk : ∀ s : Finset ι, (∀ i : (↑s : Set ι), M i.val) →ₗ[R] ⨁ i, M i := sorry
def lof : ∀ i : ι, M i →ₗ[R] ⨁ i, M i :=
DFinsupp.lsingle
theorem lof_eq_of (i : ι) (b : M i) : lof R ι M i b = of M i b := sorry
variable {ι M}
theorem single_eq_lof (i : ι) (b : M i) : DFinsupp.single i b = lof R ι M i b := sorry
theorem mk_smul (s : Finset ι) (c : R) (x) : mk M s (c • x) = c • mk M s x := sorry
theorem of_smul (i : ι) (c : R) (x) : of M i (c • x) = c • of M i x := sorry
variable {R}
theorem support_smul [∀ (i : ι) (x : M i), Decidable (x ≠ 0)] (c : R) (v : ⨁ i, M i) :
(c • v).support ⊆ v.support := sorry
variable {N : Type u₁} [AddCommMonoid N] [Module R N]
variable (φ : ∀ i, M i →ₗ[R] N)
variable (R ι N)
def toModule : (⨁ i, M i) →ₗ[R] N :=
FunLike.coe (DFinsupp.lsum ℕ) φ
theorem coe_toModule_eq_coe_toAddMonoid :
(toModule R ι N φ : (⨁ i, M i) → N) = toAddMonoid fun i ↦ (φ i).toAddMonoidHom := sorry
variable {ι N φ}
@[simp]
theorem toModule_lof (i) (x : M i) : toModule R ι N φ (lof R ι M i x) = φ i x := sorry
variable (ψ : (⨁ i, M i) →ₗ[R] N)
theorem toModule.unique (f : ⨁ i, M i) : ψ f = toModule R ι N (fun i ↦ ψ.comp <| lof R ι M i) f := sorry
variable {ψ} {ψ' : (⨁ i, M i) →ₗ[R] N}
@[ext]
theorem linearMap_ext ⦃ψ ψ' : (⨁ i, M i) →ₗ[R] N⦄
(H : ∀ i, ψ.comp (lof R ι M i) = ψ'.comp (lof R ι M i)) : ψ = ψ' := sorry
def lsetToSet (S T : Set ι) (H : S ⊆ T) : (⨁ i : S, M i) →ₗ[R] ⨁ i : T, M i := sorry
variable (ι M)
@[simps apply]
def linearEquivFunOnFintype [Fintype ι] : (⨁ i, M i) ≃ₗ[R] ∀ i, M i :=
{ DFinsupp.equivFunOnFintype with
toFun := (↑)
map_add' := fun f g ↦ by
ext
rw [add_apply, Pi.add_apply]
map_smul' := fun c f ↦ by
simp_rw [RingHom.id_apply]
rw [DFinsupp.coe_smul] }
variable {ι M}
@[simp]
theorem linearEquivFunOnFintype_lof [Fintype ι] [DecidableEq ι] (i : ι) (m : M i) :
(linearEquivFunOnFintype R ι M) (lof R ι M i m) = Pi.single i m := sorry
@[simp]
theorem linearEquivFunOnFintype_symm_single [Fintype ι] [DecidableEq ι] (i : ι) (m : M i) :
(linearEquivFunOnFintype R ι M).symm (Pi.single i m) = lof R ι M i m := sorry
@[simp]
theorem linearEquivFunOnFintype_symm_coe [Fintype ι] (f : ⨁ i, M i) :
(linearEquivFunOnFintype R ι M).symm f = f := sorry
protected def lid (M : Type v) (ι : Type* := PUnit) [AddCommMonoid M] [Module R M] [Unique ι] :
(⨁ _ : ι, M) ≃ₗ[R] M :=
{ DirectSum.id M ι, toModule R ι M fun _ ↦ LinearMap.id with }
variable (ι M)
def component (i : ι) : (⨁ i, M i) →ₗ[R] M i := sorry
variable {ι M}
@[ext]
theorem ext {f g : ⨁ i, M i} (h : ∀ i, component R ι M i f = component R ι M i g) : f = g := sorry
theorem ext_iff {f g : ⨁ i, M i} : f = g ↔ ∀ i, component R ι M i f = component R ι M i g := sorry
@[simp]
theorem lof_apply (i : ι) (b : M i) : ((lof R ι M i) b) i = b := sorry
theorem component.of (i j : ι) (b : M j) :
component R ι M i ((lof R ι M j) b) = if h : j = i then Eq.recOn h b else 0 := sorry
section CongrLeft
variable {κ : Type*}
def lequivCongrLeft (h : ι ≃ κ) : (⨁ i, M i) ≃ₗ[R] ⨁ k, M (h.symm k) := sorry
@[simp]
theorem lequivCongrLeft_apply (h : ι ≃ κ) (f : ⨁ i, M i) (k : κ) :
lequivCongrLeft R h f k = f (h.symm k) := sorry
end CongrLeft
section Sigma
variable {α : ι → Type*} {δ : ∀ i, α i → Type w}
variable [∀ i j, AddCommMonoid (δ i j)] [∀ i j, Module R (δ i j)]
def sigmaLcurry : (⨁ i : Σi, _, δ i.1 i.2) →ₗ[R] ⨁ (i) (j), δ i j := sorry
noncomputable def sigmaLuncurry [∀ i, DecidableEq (α i)] [∀ i j, DecidableEq (δ i j)] :
(⨁ (i) (j), δ i j) →ₗ[R] ⨁ i : Σ_, _, δ i.1 i.2 := sorry
end Sigma
section Option
variable {α : Option ι → Type w} [∀ i, AddCommMonoid (α i)] [∀ i, Module R (α i)]
@[simps]
noncomputable def lequivProdDirectSum : (⨁ i, α i) ≃ₗ[R] α none × ⨁ i, α (some i) :=
{ addEquivProdDirectSum with map_smul' := DFinsupp.equivProdDFinsupp_smul }
end Option
end General
section Submodule
section Semiring
variable {R : Type u} [Semiring R]
variable {ι : Type v} [dec_ι : DecidableEq ι]
variable {M : Type*} [AddCommMonoid M] [Module R M]
variable (A : ι → Submodule R M)
def coeLinearMap : (⨁ i, A i) →ₗ[R] M :=
toModule R ι M fun i ↦ (A i).subtype
@[simp]
theorem coeLinearMap_of (i : ι) (x : A i) : DirectSum.coeLinearMap A (of (fun i ↦ A i) i x) = x :=
sorry
variable {A}
theorem IsInternal.submodule_iSup_eq_top (h : IsInternal A) : iSup A = ⊤ := sorry
theorem IsInternal.submodule_independent (h : IsInternal A) : CompleteLattice.Independent A := sorry
noncomputable def IsInternal.collectedBasis (h : IsInternal A) {α : ι → Type*}
(v : ∀ i, Basis (α i) R (A i)) : Basis (Σi, α i) R M where
repr :=
((LinearEquiv.ofBijective (DirectSum.coeLinearMap A) h).symm ≪≫ₗ
DFinsupp.mapRange.linearEquiv fun i ↦ (v i).repr) ≪≫ₗ
(sigmaFinsuppLequivDFinsupp R).symm
theorem IsInternal.collectedBasis_mem (h : IsInternal A) {α : ι → Type*}
(v : ∀ i, Basis (α i) R (A i)) (a : Σi, α i) : h.collectedBasis v a ∈ A a.1 := sorry
theorem IsInternal.isCompl {A : ι → Submodule R M} {i j : ι} (hij : i ≠ j)
(h : (Set.univ : Set ι) = {i, j}) (hi : IsInternal A) : IsCompl (A i) (A j) := sorry
end Semiring
section Ring
variable {R : Type u} [Ring R]
variable {ι : Type v} [dec_ι : DecidableEq ι]
variable {M : Type*} [AddCommGroup M] [Module R M]
theorem isInternal_submodule_of_independent_of_iSup_eq_top {A : ι → Submodule R M}
(hi : CompleteLattice.Independent A) (hs : iSup A = ⊤) : IsInternal A := sorry
theorem isInternal_submodule_iff_independent_and_iSup_eq_top (A : ι → Submodule R M) :
IsInternal A ↔ CompleteLattice.Independent A ∧ iSup A = ⊤ := sorry
theorem IsInternal.addSubgroup_independent {M : Type*} [AddCommGroup M] {A : ι → AddSubgroup M}
(h : IsInternal A) : CompleteLattice.Independent A :=
CompleteLattice.independent_of_dfinsupp_sumAddHom_injective' _ h.injective
end Ring
end Submodule
end DirectSum
open Submodule
variable {R : Type u} [Ring R]
variable {ι : Type v}
variable [dec_ι : DecidableEq ι] [Preorder ι]
variable (G : ι → Type w)
class DirectedSystem (f : ∀ i j, i ≤ j → G i → G j) : Prop where
map_self' : ∀ i x h, f i i h x = x
map_map' : ∀ {i j k} (hij hjk x), f j k hjk (f i j hij x) = f i k (le_trans hij hjk) x
section
variable [∀ i, AddCommGroup (G i)] [∀ i, Module R (G i)]
variable {G} (f : ∀ i j, i ≤ j → G i →ₗ[R] G j)
nonrec theorem DirectedSystem.map_map [DirectedSystem G fun i j h => f i j h] {i j k} (hij hjk x) :
f j k hjk (f i j hij x) = f i k (le_trans hij hjk) x := sorry
variable (G)
def DirectLimit : Type max v w :=
DirectSum ι G ⧸
(span R <|
{ a |
∃ (i j : _) (H : i ≤ j) (x : _),
DirectSum.lof R ι G i x - DirectSum.lof R ι G j (f i j H x) = a })
namespace DirectLimit
instance addCommGroup : AddCommGroup (DirectLimit G f) :=
Quotient.addCommGroup _
instance module : Module R (DirectLimit G f) :=
Quotient.module _
variable (R ι)
def of (i) : G i →ₗ[R] DirectLimit G f :=
(mkQ _).comp <| DirectSum.lof R ι G i
end DirectLimit
namespace AddCommGroup
variable [∀ i, AddCommGroup (G i)]
def DirectLimit (f : ∀ i j, i ≤ j → G i →+ G j) : Type _ := sorry
variable (f : ∀ i j, i ≤ j → G i →+ G j)
variable (P : Type u₁) [AddCommGroup P]
variable (g : ∀ i, G i →+ P)
variable (Hg : ∀ i j hij x, g j (f i j hij x) = g i x)
namespace Ring
variable [∀ i, CommRing (G i)]
variable (f : ∀ i j, i ≤ j → G i → G j)
open FreeCommRing
def DirectLimit : Type max v w :=
FreeCommRing (Σi, G i) ⧸
Ideal.span
{ a |
(∃ i j H x, of (⟨j, f i j H x⟩ : Σi, G i) - of ⟨i, x⟩ = a) ∨
(∃ i, of (⟨i, 1⟩ : Σi, G i) - 1 = a) ∨
(∃ i x y, of (⟨i, x + y⟩ : Σi, G i) - (of ⟨i, x⟩ + of ⟨i, y⟩) = a) ∨
∃ i x y, of (⟨i, x * y⟩ : Σi, G i) - of ⟨i, x⟩ * of ⟨i, y⟩ = a }
namespace DirectLimit
instance commRing : CommRing (DirectLimit G f) :=
Ideal.Quotient.commRing _
end DirectLimit
namespace Field
variable (f' : ∀ i j, i ≤ j → G i →+* G j)
namespace DirectLimit
instance nontrivial [DirectedSystem G fun i j h => f' i j h] :
Nontrivial (Ring.DirectLimit G fun i j h => f' i j h) := sorry
theorem exists_inv {p : Ring.DirectLimit G f} : p ≠ 0 → ∃ y, p * y = 1 := sorry
open Classical
noncomputable def inv (p : Ring.DirectLimit G f) : Ring.DirectLimit G f :=
if H : p = 0 then 0 else Classical.choose (DirectLimit.exists_inv G f H)
protected theorem mul_inv_cancel {p : Ring.DirectLimit G f} (hp : p ≠ 0) : p * inv G f p = 1 := sorry
@[reducible]
protected noncomputable def field [DirectedSystem G fun i j h => f' i j h] :
Field (Ring.DirectLimit G fun i j h => f' i j h) :=
{ Ring.DirectLimit.commRing G fun i j h => f' i j h,
DirectLimit.nontrivial G fun i j h => f' i j h with
inv := inv G fun i j h => f' i j h
mul_inv_cancel := fun p => DirectLimit.mul_inv_cancel G fun i j h => f' i j h
inv_zero := dif_pos rfl }
Mario Carneiro (Oct 12 2023 at 10:30):
is it still converging?
Mario Carneiro (Oct 12 2023 at 10:34):
One thing you could do to speed it up, assuming that your edits are quick and don't require Lean information but the checking time is long, is to perform many steps in one go, and bisect if you overshoot. This is a good option for edits you don't think are likely to fail (like inlining) and less good for cases where most edits will fail (like deleting declarations, if the input is already mostly tree-shaken).
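A sketch of that batching strategy, reusing the hypothetical compiler_output, apply_spans and bisect_edits helpers from the earlier sketches: try the whole batch in one compile and only fall back to bisection on overshoot.

```python
def greedy_then_bisect(source: str, edits, reference_output: str):
    """Try all proposed edits in a single compile; only bisect on overshoot."""
    if compiler_output(apply_spans(source, edits)) == reference_output:
        return list(edits)   # the optimistic batch worked: one compile, many edits
    return bisect_edits(source, edits, reference_output)
```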
Anne Baanen (Oct 12 2023 at 12:19):
In fact, the modifications I made last night already do this :D
I think the real issue is that changes that have been rejected in previous passes are retried in the next pass, when we should really only focus on the parts of the file that changed.