Zulip Chat Archive

Stream: mathlib4

Topic: Matroids in mathlib - feedback?

Peter Nelson (Jul 21 2023 at 00:45):

I'm making this post to solicit feedback on my lean4 matroid repo : https://github.com/apnelson1/Matroid

This work has undergone many iterations in the last couple of years, including huge refactors to make it more sane, and most recently a porting to lean4; now I think it's basically in a decent state for mathlib . The reason I haven't PRed it yet is that I'm waiting on a couple of PRs currently in the queue (#5908 and #5911) to give me some API for enat set cardinality and some minimality/maximality stuff that I need. (This is all in the ForMathlib subfolder).

I'm imagining that the sequence of PRs for the project itself will take quite some reviewing, since this adds up to a lot of new stuff, so I'm hoping to get a jump on at least the first one by inviting anyone interested to take a look now.

The first PR will be the contents of the file https://github.com/apnelson1/Matroid/blob/main/Matroid/Basic.lean, which defines matroids, and contains definitions of bases, independence, and bases of sets, and gives a few different constructors for matroids via sets of axioms. The docstring contains a detailed discussion of what I would consider the important design decisions. These decisions are mostly my own, and I'd be very happy to learn from experts how I could improve things.

The rest of the project is less well-document but I think it's nearly all in good shape - it contains around 5k loc, with API for the majority of basic matroid notions including duality, minors, circuits, closure, flats, hyperplanes, uniform matroids, the rank function, and relaxation. Apart from one or two leaf files, it's all sorry-free.

Notable omissions are linear representations of matroids and graph representations of matroids, which are important topics but are still work in progress.

Jireh Loreaux (Jul 21 2023 at 02:48):

@Peter Nelson I'll review #5908 tomorrow.

Yaël Dillies (Jul 21 2023 at 06:50):

Just to comment: For having helped Alena a bit with her sorries, I think the entire repo needs a general cleanup and design review. Once that's done, in mathlib it goes.

Peter Nelson (Jul 21 2023 at 13:48):

Yes, a design review is what I'm asking for. Do you have any specific comments?

Eric Wieser (Jul 21 2023 at 14:24):

Having only looked very briefly: is there a reason you need the ground set in matroids vs using the type directly?

Eric Wieser (Jul 21 2023 at 14:25):

For instance, why not

structure Matroid (α : Type _) where
  (Base : Set α → Prop)
  (exists_base' : ∃ B, Base B)
  (base_exchange' : Matroid.ExchangeProperty Base)
  (maximality' : ∀ X, Matroid.ExistsMaximalSubsetProperty (∃ B, Base B ∧ · ⊆ B) X)

instead of your

@[ext] structure Matroid (α : Type _) where
  (ground : Set α)
  (Base : Set α → Prop)
  (exists_base' : ∃ B, Base B)
  (base_exchange' : Matroid.ExchangeProperty Base)
  (maximality' : ∀ X, X ⊆ ground →
    Matroid.ExistsMaximalSubsetProperty (∃ B, Base B ∧ · ⊆ B) X)
  (subset_ground' : ∀ B, Base B → B ⊆ ground)

Peter Nelson (Jul 21 2023 at 14:25):

Yes, this is discussed in the docstring at the top. It's basically because of how often one needs to consider a number of related matroids on related but distinct ground sets.

Kyle Miller (Jul 21 2023 at 14:25):

https://github.com/apnelson1/Matroid/blob/main/Matroid/Basic.lean#L70

Eric Wieser (Jul 21 2023 at 14:32):

Many of the concerns there feel like they could also be applied to vector spaces, which we solved by having a separate Module _ V typeclass and a Submodule R V structure (and ModuleCat.{R} _ which bundles the ground type). Would a similar approach work for matroids, or are they not similar enough for the analogy to hold?

Kyle Miller (Jul 21 2023 at 14:33):

I was thinking about this earlier today, comparing it to the choices made for graphs. For graphs, this would affect graph homomorphisms, since a graph homomorphism is a map on vertices that carries adjacent vertices to adjacent vertices. If they used ground sets, then you'd need to either work with functions on the ambient type that restrict to the ground sets or with functions on the subtypes. With the first, you lose extensionality, and with the second, you have the annoyance of subtypes.

Are matroid homomorphisms used much in the literature? I once saw a definition for strong maps (inverse images of flats are flats), but I have no idea how important these are.

Kyle Miller (Jul 21 2023 at 14:37):

(For simple graphs, there docs#SimpleGraph with a fixed vertex type and docs#SimpleGraph.Subgraph for bundling in a ground set; here you can use the coincidence that the Subgraphs of the complete graph give you the alternative definition of a simple graph. Only SimpleGraph has homomomorphisms. @Eric Wieser I think this is closer of an analogy than Module/Submodule.)

Johan Commelin (Jul 21 2023 at 14:38):

Eric Wieser said:

Many of the concerns there feel like they could also be applied to vector spaces, which we solved by having a separate Module _ V typeclass and a Submodule R V structure (and ModuleCat.{R} _ which bundles the ground type). Would a similar approach work for matroids, or are they not similar enough for the analogy to hold?

To be fair, converting between these different POVs can be quite painful. (i) The DG-algebra challenge is still an open problem. (ii) Working with graded rings/modules is not as smooth in Lean as in the informal counterpart.

Peter Nelson (Jul 21 2023 at 14:39):

Strong/weak maps do show up, but aren't particularly central, at least to the research on my personal radar. (I have weak maps defined in my lean3 repo, where I hadn't yet adopted this 'ground set is a Set' approach).

A similar issue shows up when I define matroid isomorphisms in the Equivfile. They are defined using LocalEquiv, which is a little awkward. Dealing with minors, which are everywhere in the literature, is so absolutely hellish if the ground set is a type, that I've learned to live with such things.

Eric Wieser (Jul 21 2023 at 14:40):

@Johan Commelin, I agree that the Module/Submodule/ModuleCat design may not be perfect; but there's a lot of value to using the same approach throughout mathlib for consistency

Peter Nelson (Jul 21 2023 at 14:42):

I've thought about the module/submodule analogy; in fact, it kinda got me to where I am now with this. For matroid theory, one ends up being drawn towards absolutely everything in the API existing for the analogous object to a Submodule, and getting to a point where the Module object is secondary.

Kyle Miller (Jul 21 2023 at 14:43):

Eric Wieser said:

there's a lot of value to using the same approach throughout mathlib for consistency

Just so long as it's not a foolish consistency.

Already it's different because Module/Submodule/ModuleCat is designed around using typeclasses for implementing the usual synecdoche (that M is a module). Matroids don't use synecdoche.

Kyle Miller (Jul 21 2023 at 14:45):

Typeclasses also implement transparent coercions between different structures in the algebraic hierarchy, but I'm not sure matroids need that either.

Peter Nelson (Jul 21 2023 at 14:47):

I think the concrete problem with the ground set being a type is that you can't talk about minors reasonably. See the lemmas here, for example : https://github.com/apnelson1/Matroid/blob/main/Matroid/Minor.lean#L560

They are mathematically trivial, but would have formal statements involving canonical maps between images/preimages of coercions of four sets between multiple subtypes, subtypes of subtypes, etc. I tried this for a long long time, becoming intimately familiar with preimage_image_coe-type lemmas, but it just isn't sustainable.

Kyle Miller (Jul 21 2023 at 14:54):

Another potential design for minors (which definitely involves a tradeoff) is to introduce another type that records what should be deleted and what should be contracted. This at least works for graphs (you record a set of edges to keep and a subset of those edges to contract), but the issue is that you can't apply any theorems about graphs to minors without using a toGraph function.

It seems that the reasons for wanting this in graphs (such as avoiding quotient types until you really need them) don't apply to matroids though.

Kyle Miller (Jul 21 2023 at 14:55):

It's nice that for matroids with an embedded ground set you can get contraction-deletion as an Eq

Peter Nelson (Jul 21 2023 at 14:56):

Kyle Miller said:

It's nice that for matroids with an embedded ground set you can get contraction-deletion as an Eq

Yes. The first time I used rw with such an equality (after months of DTT hell) filled me with joy!

Peter Nelson (Jul 21 2023 at 14:59):

I actually think a similar approach might be reasonable for minors in graph theory in some cases. If you have an embedded edge and vertex set, then you can contract an edge uv by choosing one of the ends uv to contract it to, so you get a graph whose vertex and edge sets live in the same type, avoiding quotients. Tradeoffs abound, of course.

Kyle Miller (Jul 21 2023 at 15:03):

Yeah, though it's too bad that choosing representatives isn't very stable (ex: contracting an edge between two contracted components yields a representative that might have no relation to either components' representative), and the edge set won't generally be a subset of the original edge set.

Peter Nelson (Jul 21 2023 at 15:04):

the edge set will still be a subset if you have an edge type (if you're doing anything with minors, this seems almost a necessity to me)

Peter Nelson (Jul 21 2023 at 15:06):

(to be clear, I mean a type in which the edges are an embedded set)

Peter Nelson (Jul 21 2023 at 15:35):

Kyle Miller said:

Are matroid homomorphisms used much in the literature? I once saw a definition for strong maps (inverse images of flats are flats), but I have no idea how important these are.

On this topic, much of the interest of strong/weak maps can be captured in the special case where they are between two matroids on the same ground set; so my approach doesn't rule that out. Homomorphisms aren't really well-defined for general matroids, but they are pretty much just linear maps in the representable case, and that is quite compatible with the 'embedded ground sets' approach as well; elements outside the ground set are just 'zero'.

Peter Nelson (Aug 03 2023 at 18:52):

#6352
I started this before my kid was born; he's now 2.5 years old and I'm finally making a matroid PR!

Antoine Chambert-Loir (Apr 23 2024 at 19:44):

Kyle Miller said:

Are matroid homomorphisms used much in the literature? I once saw a definition for strong maps (inverse images of flats are flats), but I have no idea how important these are.

The classic theory of matroids isn't categorized, but there is now a definition of a morphism of matroids in this paper by Eur and Huh. It is not clear to me how much this definition will fit the need.

Peter Nelson (Apr 24 2024 at 14:44):

Eric Wieser said:

I think a lot of this is sort of a symptom that the design of docs#Matroid doesn't mesh well with the rest of the library (c.f. the discussion here), and it forces you to write A : set \alpha and use A \inter B everywhere, instead of just using A : Set \beta

@Eric Wieser

You've raised this point in review, and I want to reiterate that I really don't know a better way, and having the design of matroids 'mesh well' with mathlib would make it incredibly difficult to make even this simplest statements that appear in actual proofs. I know this because I tried for a long time - the current approach was suggested to me by @Johan Commelin last year, as a response to the issues that came up.

Here is the specific problem. In matroid theory, statements like M / C \ D / e \ f = M / (C ∪ {e}) \ (D ∪ {f}), or more generally 'contractions and deletions commute' are everywhere, and need to be possible to state. If each matroid is tied to a type, then that = is a disastrous combination of isomorphisms, and we descend into DTT hell. If there is a Minor M type for each E : Type and M : Matroid E (in the vein of SimpleGraph.Subgraph), then we end up needing to duplicate the entire API for Minor, which is tantamount to the current approach anyway.

If you can show me a way to reasonably state M / C \ D / e \ f = M / (C ∪ {e}) \ (D ∪ {f}) that isn't the current approach, I would really love to hear it. Otherwise, I think options are some 'principled' answer involving unavailable HOTT, or treating matroids as the set-theoretic objects they are in practice.

On that topic, I think that the same issues will arise when doing serious work on minors of graphs. A student of mine will be embarking on this journey for his PhD project later this year.

Kyle Miller (Apr 24 2024 at 14:59):

Something that I found to be frustrating when working on the graph theory library is how I had to keep re-explaining why it is that graphs are not algebraic objects, so the design patterns we've learned for how to deal with algebraic objects do not necessarily apply.

Kyle Miller said:

Eric Wieser said:

there's a lot of value to using the same approach throughout mathlib for consistency

Already it's different because Module/Submodule/ModuleCat is designed around using typeclasses for implementing the usual synecdoche (that M is a module). Matroids don't use synecdoche.

These combinatorial objects do not use coercions and typeclasses to make it so you can identify the ground set with the object.
These combinatorial objects are not associated to a particular type. The common case is studying multiple objects with the same ground set, unlike algebraic objects where there's "the" set of operations.

Can we agree to let combinatorics explore these new design patterns here without needing to keep re-justifying why the algebraic hierarchy pattern isn't appropriate?

Also: I think the fact that deletion-contraction was proved using an Eq is very strong evidence that @Peter Nelson came up with a great design for matroids. Let's not talk about "symptoms" that it was designed incorrectly unless there's an alternative working design backing that up.

Eric Wieser (Apr 24 2024 at 15:05):

I was careful to chose my words above; I don't think the design of Matroid is incorrect, merely that it's out-of-distribution for the way various bits of API elsewhere in mathlib are designed. I completely agree that it's fine to let combinatorics explore new design patterns, and part of that is noting friction (but continuing despite it) that appears between the two designs.

Kyle Miller (Apr 24 2024 at 15:22):

I think it's fine taking note of design tradeoffs, but they're tradeoffs.

My belief is that, if they must be called symptoms, they are symptoms of the underlying design complexities. The complexities will express symptoms somewhere no matter the design. For example, you may see DTT hell rather than a clean deletion-contraction Eq. I think it's unfair to phrase it as "a symptom of the design of Matroid," which is a sort of sentiment that I personally find to be discouraging. Unless the design isn't already at the Pareto frontier, it's not fair to bring up deficiencies along single axes like it's an issue with the design.

To what extent is this an issue with Matroid not "meshing well" with mathlib, or rather is it mathlib not yet having the supporting set theory? I'm not sure there's any project that hasn't contributed something to it. I think it would be short-sighted to design based on what set theory is already there.

Eric Wieser (Apr 24 2024 at 16:52):

Peter Nelson said:

If you can show me a way to reasonably state M / C \ D / e \ f = M / (C ∪ {e}) \ (D ∪ {f}) that isn't the current approach

Just to check; is this "current approach" in mathlib, in a branch, or in a repo elsewehere?

Peter Nelson (Apr 24 2024 at 16:54):

https://github.com/apnelson1/Matroid/blob/main/Matroid/Minor/Basic.lean

Minors work smoothly if the ground set is a Set.

nicheng0019 (Apr 25 2024 at 00:16):

(deleted)

Peter Nelson (Apr 25 2024 at 10:46):

Yaël Dillies said:

For the record, I already gave you an alternative design which seems strictly better to me (from poking around in your repo), namely remove the ground field and instead have a IsSupportedOn predicate checking that everything is defined on a given ground set.

How exactly would this work? From the way I understand it, a supportedOn predicate would give Eq for my example, but it would also give no way to distinguish between a matroid not containing e, and one where e is a loop.

And this would (for instance) break duality - it would require something like Matroid.dualWRT (E : Set α) rather than Matroid.dual. This would cascade to every piece of the theory that considers loops, which do not dualize to loops. Statements that want to talk about loops would have to remember the ground set explicitly. So then {M : Matroid α // M.supportedOn E} would come into play in any proof where you need to keep track of the ground set. And this gives worse versions of all the problems I've mentioned.

Peter Nelson (Apr 25 2024 at 13:19):

And the real problem is that if a Matroid doesn't know its ground set, a Matroid isn't actually a matroid, in that it doesn't carry as much information. So while you have M / C \ D / e \ f = M / (C ∪ {e}) \ (D ∪ {f}), you also have mathematically incorrect statements as theorems, like M / {e,f} = M / e for a parallel pair e,f. (even if M itself is loopless).

Martin Dvořák (Apr 26 2024 at 08:59):

Let me summarize the situation. Peter Nelson put a lot of effort into definitions and results related to matroids. However, most of this work hasn't been integrated into Mathlib yet. Because of that, Ivan and I are in the process of re-doing a good portion of that work. Additionally, since Peter and Ivan&I worked in isolation, all of us have made some design decisions that make sense in the context of our own work, but might not be ideal in the long run. As a result, now we find it difficult to consolidate our contributions and to move on to formalizing more complex results.

I would like all of us to team up and combine our efforts. To make it easier to proceed, we should review our work, decide on a good architecture, refactor the existing code, and make a large consolidated pull request with all the materials we have so far. What do you think?

Eric Wieser (Apr 26 2024 at 09:02):

and make a large consolidated pull request with all the materials we have so far. What do you think?

This seems doomed to be stuck in review. In the short term, I think it would be better for you to contribute your matroid work directly to @Peter Nelson's repository, assuming Peter is on board with that.

Peter Nelson (Apr 26 2024 at 14:14):

Having this come up has given me a good opportunity to think about what the issues are and why they led to the decisions they did, and I think it's good to have some of the discussion recorded here. The following is in a sense a restatement of what @Kyle Miller has said, but I hope it helps for those curious.

Here is a design question about a more familiar mathematical object than a matroid. (the fact that it is mathematically completely unhinged is not the point.)

Suppose I have produced a 'union' operation on groups, so

for groups $G = (S, *)$ and $G = (S', *')$ on sets $S,S'$ , there is a group $G \cup G' = (S \cup S', *'')$ on the union of $S$ and $S'.$ (Here, S and S' might intersect nontrivially)

Suppose that I also have a 'difference' operation on groups, so

For each group $G = (S, *)$ and each subset $X$ of $S$ , then there is a group $G \backslash X$ on $(S \backslash X)$ .

Now I tell you that I have proved that my operations obey the following commutative/ associative/distribute-type laws, so

$(G \cup G') \cup G'' = G \cup (G' \cup G'')$
$G \backslash X \backslash Y = G \backslash Y \backslash X = G \backslash (X \cup Y)$ for all $X,Y$ contained in $G$ , and
for $X$ contained in $G$ and $G'$ , we have $(G \cup G') \backslash X = (G \backslash X) \cup (G' \backslash X).$

How do I formalize these statements?

As far as I can tell, it's a good thing that the above is mathematically nonsense, because mathlib's Group would have nothing pretty to offer. (at least nothing I'd want to interact with). This is fine, because it is nonsense. That's not what groups are for.

For combinatorial objects like graphs and matroids, on the other hand, these are analagous to things we actually do! We glue things together on common intersections, throw things away, identify different things, fiddle with adjacency/independence predicates, and lots more. And not always in algebraically principled ways. And ideally, we need these things to be flexible enough to have Eq and rewrites wherever possible.

In my opinion, designs for combinatorics that embrace the set theory more than most of mathlib, are the way forward.

Dean Young (Apr 26 2024 at 23:04):

Hi Peter,

I've been trying to find someone who is interested in the matroids for Laurent series- the ones involving L[[t₁,...,tₙ]] and L((t₁,...,tₙ))

I thought you would know or would be interested in the matroids associated to these, roughly analogous to the ones for transcendence degree, and maybe evening adding them in as examples.

The most interesting part to me is that L((t,s)) has no natural locally compact topology.

Peter Nelson (Apr 27 2024 at 12:18):

I don’t know anything about that, sorry!

Eric Wieser (Apr 27 2024 at 12:58):

(@Peter Nelson, note that you can make LaTeX work in your message above by using $$ $$ instead of $ $ )

Eric Wieser (Apr 27 2024 at 13:41):

In your example of glueing together G and G' with nontrivial intersection, is $*$ required to agree with $*'$ on that intersection?

Peter Nelson (Apr 27 2024 at 13:43):

I can say yes for now to make things easier, but probably in some (analogous) applications the answer is no, for example the union of two graphs with disjoint edge sets and intersecting vertex sets.

Eric Wieser (Apr 27 2024 at 14:35):

I think in the "yes" case the existing docs#Subgroup design is adequate and works well in algebra. The "no" case sounds like the interesting one

Peter Nelson (Apr 27 2024 at 14:37):

Doesn’t Subgroup presuppose a global ambient group structure? I’m certainly not going to be that generous.

Eric Wieser (Apr 27 2024 at 14:44):

It does, but it's not yet clear to me that that's actually really consuming any generosity; if the largest object you're going to talk about is $G_1 ∪ G_2 ∪ G_3$ then you can just pretend you are working in some arbitrary ambient group G such that (⊤ : Subgroup G) = G₁ ⊔ G₂ ⊔ G₃ (and in most cases you'll find you don't use this assumption).

Eric Wieser (Apr 27 2024 at 14:45):

The point of the ambient group is to enforce in the type system that your $*$ operator always agrees on intersections

Eric Wieser (Apr 27 2024 at 14:56):

Maybe a good example of where this lack of generosity manifests in mathlib today is when talking about subgroups of a ring, where no global inverse is available. There our solution is to talk about Subgroup Rˣ instead, which is usually not too painful.

Peter Nelson (Apr 27 2024 at 15:48):

I think I’m still missing something. What are the type signatures of the relevant functions for associativity, and if my proof that the union group operation gives a group is long and complicated, where is that mathematics hiding?

Eric Wieser (Apr 27 2024 at 15:52):

The associativity lemma in this case would just be docs#sup_assoc, which follows from a lattice instance somewhere.

and if my proof that the union group operation gives a group is long and complicated, where is that mathematics hiding?

The mathematics is hiding in showing that the Subgroup is closed under the ambient * operations, as the "agreement on intersections" requirement means that there isn't actually any new group structure to define.

Peter Nelson (Apr 27 2024 at 15:59):

Ah thanks, I understand - the case resolved by the Sub_ pattern is where there is at most one substructure for each set in the ambient type. This isn’t the case for matroids or graphs with minors, so I’ll revise my earlier answer to ‘no’.

Peter Nelson (Apr 27 2024 at 16:10):

Or slightly more generally, I suppose, it resolves the case where substructures are SetLike, as with Subgraph

Last updated: May 02 2025 at 03:31 UTC