Zulip Chat Archive

Stream: graph theory

Topic: What kind of graphs are in Mathlib already?

Pieter Cuijpers (Mar 28 2025 at 15:56):

Hi, I'm trying to figure out which types of graphs are defined already in Mathlib and where to find them, and which ones are not.

So far, I've seen SimpleGraph, Digraph, and Quiver. These give rise to two questions.

1) In the first two definitions, I would have expected that SimpleGraph would extend Digraph, and that things like walk would be defined on Digraph, but that seems not to be the case. Is this just a matter of did not happen yet or is there a rationale behind that? And how does this community go about refactoring efforts - if it is indeed a matter of did not happen yet?

2) The definition of Quiver is the next step up from Digraph and generalizes Prop to sort as a way of generating different edges between vertices. I understand that this is helpful as a step towards defining Category later, but in some of the graph theoretic work that I encounter considering a MultiGraph (I think it would be called) as:

class MultiGraph (Vertex : Type*) (Edge: Type*) where
   source : Edge -> Vertex
   destination : Edge -> Vertex

seems helpful, in that it seems a lot easier to me when I want to consider all the edges in the graph at the same time. When I use Quiver I always have to first identify the source and destination explicitly, which is sometimes cumbersome. Does that definition (or similar) exist under a different name perhaps? Is there a reason not to have it? And if not to both, should I try setting up a PR for it?

Rida Hamadani (Mar 28 2025 at 16:16):

Regarding part (1), it is a case of "nobody did it yet". It shouldn't be too hard, feel free to do it!

Rida Hamadani (Mar 28 2025 at 16:23):

Actually "nobody did it" is technically wrong, @Jeremy Tan did it here #16018, but that PR got closed and nobody worked on it after that.

Digraphs are relatively recent compared to simple graphs, and that part is still lacking, I wanted to port parts of simple graph to digraphs but got too busy with other things, I would be happy to help if somebody else starts working on it.

I think this is worth it because there are a lot of important results on, for instance, tournaments, directed SRGs, flow networks, etc and this part of the library can be the foundation for formalizing them.

Rida Hamadani (Mar 28 2025 at 16:28):

cc. @Peter Nelson regarding point (2).

Kyle Miller (Mar 28 2025 at 16:49):

With 1, it's that it's not necessarily useful to have it literally extend Digraph. If you're thinking about this in terms of object-oriented programming and subtyping, Lean doesn't work the same way. You wouldn't be able to pass a SimpleGraph to a function expecting a Digraph, except in special cases (via dot notation), and you can't use dot notation to make Digraph functions that operate on SimpleGraphs that return SimpleGraphs.

#4127 has a class-based approach to unify the API, but as you can see it's quite an old PR. I think I have a branch somewhere with a more updated version, but it might be on my desktop computer...

Pieter Cuijpers (Mar 28 2025 at 17:06):

@Kyle Miller I do get that the subtyping doesn't work in exactly the same way, but if the notion of 'walk' was defined on Digraphs, it would automatically be defined on SimpleGraphs, so it would prevent duplication of code - right?

The tricky part, I think, that in some cases the concepts may generalize, but the proofs would go via a different route. Worse, the formulation of the concepts may be different. For example my question on Multigraphs vs Quivers is an example of that. The two definitions may - up to some details - be equivalent (up to some type-theoretic differences), but the proofs of some theorems will be awkward on one, while going smoothly on the other, and vice versa. In such a situation, people may prefer one basic definition over another, and there is no clear decision anymore on how the inheritance should go.

Pieter Cuijpers (Mar 28 2025 at 17:09):

Rida Hamadani said:

Actually "nobody did it" is technically wrong, Jeremy Tan did it here #16018, but that PR got closed and nobody worked on it after that.

Digraphs are relatively recent compared to simple graphs, and that part is still lacking, I wanted to port parts of simple graph to digraphs but got too busy with other things, I would be happy to help if somebody else starts working on it.

I think this is worth it because there are a lot of important results on, for instance, tournaments, directed SRGs, flow networks, etc and this part of the library can be the foundation for formalizing them.

Hm... that is not too encouraging... hence my question "how does refactoring work" in this community? Is there a way to avoid that such an effort "fails" (if that is the right word).

Kyle Miller (Mar 28 2025 at 17:10):

Pieter Cuijpers said:

it would automatically be defined on SimpleGraphs, so it would prevent duplication of code - right?

No, not exactly. If G : SimpleGraph V, you can't write Digraph.Walk G u v. However, the dot notation elaboration would let you write G.Walk u v to get Digraph.Walk G.toDigraph u v.

The class system I mentioned in the PR I linked to is the way to go to get the unified interface you want (or maybe not done exactly like in the PR, but the general idea). That's how homomorphisms work in mathlib; they have similar issues, since you want, for example, a homomorphism that's both multiplicative and additive to "be" a multiplicative homomorphism and an additive homomorphism.

Kyle Miller (Mar 28 2025 at 17:12):

extends is a very very weak concept, and it's not worth it here I think. That's not to say that no one wants a unified interface for graphs! It's just that extends isn't the correct mechanism.

Kyle Miller (Mar 28 2025 at 17:13):

extends works a lot better for class than structure, since the parents each give a projection instance, and that simulates the OO-like subtyping. For graphs, what we want are classes that provide the interface for what it means to be a graph with certain kinds of features, and then definitions can make use of these classes. They would not be in terms of a concrete data type.

Kyle Miller (Mar 28 2025 at 17:19):

The reason I stopped pursuing #4127 is (1) I wasn't sure about some of the interface I was developing (e.g. should the vertex type of a graph type be allowed to vary with the graph? or does the graph type itself fix the vertex type? I ran into some Lean bugs here, and they might be fixed by now) and (2) I like dot notation and don't want to give it up, which the class-based system requires, and in the meantime I've been working on Lean features to preserve dot notation.

Pieter Cuijpers (Mar 28 2025 at 17:19):

I'm a bit lost on why extends does not work on structure, but that aside I think it is indeed the unified interface that I'm looking for, which usually is created either by extends or by an instance theorem (which is also fine with me). And both give the dot notation elaboration, which is I think quite an okay way to achieve the interface we need.

(I'm probably not enough of a programmer to use all the jargon correctly.)

Pieter Cuijpers (Mar 28 2025 at 17:22):

Kyle Miller said:

The reason I stopped pursuing #4127 is (1) I wasn't sure about some of the interface I was developing (e.g. should the vertex type of a graph type be allowed to vary with the graph? or does the graph type itself fix the vertex type? I ran into some Lean bugs here, and they might be fixed by now) and (2) I like dot notation and don't want to give it up, which the class-based system requires, and in the meantime I've been working on Lean features to preserve dot notation.

You have my full support on liking dot notation and prioritizing that :-)

What you mention about #4127 sounds like the decision between Multigraph as I proposed it above and Quiver as it is in the library to me, or am I mistaken? There may be value to both. And strictly speaking, a Multigraph would be a Quiver, but not necessarily the other way around?

Pieter Cuijpers (Mar 28 2025 at 17:23):

I may be overlooking important details here.

Kyle Miller (Mar 28 2025 at 17:26):

I'm a bit lost on why extends does not work on structure

I invite you to experiment here with using extends on structure. It "works", but there are some deep usability problems, because you don't get a coercion from SimpleGraph to Digraph that actually works. This is why I point out that it's not like the OO concept of extension — the OO one is how people think it will work, even if they aren't consciously thinking about it that way.

There's also a fundamental conceptual issue: a simple graph isn't a digraph. Sure, you can model a simple graph as a symmetric irreflexive relation, but that doesn't mean that there are edges going in both directions. For simplicity's sake we may as well take the perspective that simple graphs are digraphs, but I think it's worth keeping this in mind. It has a big impact on how you name functions. E.g., you want the neighbor set of a vertex, right? Is that the set of heads or tails from a given vertex?

Kyle Miller (Mar 28 2025 at 17:31):

What you mention about #4127 sounds like the decision between Multigraph as I proposed it above and Quiver as it is in the library to me, or am I mistaken?

It's different. With Quiver/MultiGraph, you're putting a fixed structure on a given Vertex and Edge type. With my PR, you have an actual type of graphs under consideration, and each term of that type has a graph structure.

We care about the lattice structure on SimpleGraph and Digraph for example, and we might have general theorems about it.

Multigraphs generally don't have a good lattice structure however, so your MultiGraph type could be workable. You just need to keep in mind that you should never have two graphs with the same Vertex and Edge types. (It's possible to have a Multigraph type with a lattice structure; recall that there are two definitions of a multigraph, one is that the edge set is a multiset of unordered pairs, and the other is what you have. That is, indistinguishable vs distinguishable multiedges. They're slightly different, and I think we should be able to support both.)

Peter Nelson (Mar 28 2025 at 17:33):

You just need to keep in mind that you should never have two graphs with the same Vertex and Edge types

I disagree on this point. For instance, I could remove an edge from multigraph, and add it back somewhere else.

Kyle Miller (Mar 28 2025 at 17:33):

What are you disagreeing with? I'm pointing this out as a problem with the arrangement.

Peter Nelson (Mar 28 2025 at 17:34):

I interpreted 'should' as a design mandate. If it's not that, I have no disagreement.

Kyle Miller (Mar 28 2025 at 17:34):

It's a "should" as in "class instances are supposed to be canonical, so you shouldn't violate that"

Peter Nelson (Mar 28 2025 at 17:36):

This goes against the design of matroids that seems to be working very well - for instance, operations like truncation, and results like the matroid intersection theorem, use different matroids on the same ground set (type).

Peter Nelson (Mar 28 2025 at 17:37):

Ah, the difference is the 'class' part.

Kyle Miller (Mar 28 2025 at 17:37):

Yes, I'm saying why the MultiGraph class from the beginning of this thread has some difficulties. It's workable, and I'm sure it's great for certain applications, but it's not necessarily what everyone would want.

Kyle Miller (Mar 28 2025 at 17:39):

We want a class (or a system of classees) that describes a generic Multigraph interface. Concrete implementations of multigraphs would then implement these classes, and each person can use their favorite kind of multigraph with the generic library.

Pieter Cuijpers (Mar 28 2025 at 17:40):

Got that, but I would be fine with making MultiGraph a structure as well. I was not aware of a cannonical "lattice of Digraphs", but it makes sense to consider it as a structure too.

Kyle Miller (Mar 28 2025 at 17:41):

Digraph V and SimpleGraph V have lattice structures, via edge containment

Peter Nelson (Mar 28 2025 at 17:51):

The plan my student and I have at the moment is something like the following:

structure Graph (α β : Type*) where
-- vertex set
  V : Set α
-- edge set
  E : Set β
-- edge-vertex incidence
  Inc : β → α → Prop
  vertex_support : ∀ e v, Inc e v → v ∈ V
  edge_support : ∀ e v, Inc e v → e ∈ E
-- extra propositional stuff, such as the fact that each edge is incident with at most two vertices.

Peter Nelson (Mar 28 2025 at 17:54):

This uses the 'embedded set' design that has worked well for matroids. I am growing more and more convinced that this will be the best way to formalize real-world proofs involving minors.

If contracting an edge of a graph forces passing to a new graph with a different vertex type, then any vaguely complicated 'contract an edge, apply induction'-type proof will descend to DTT hell.

Kyle Miller (Mar 28 2025 at 17:57):

Seems reasonable. I think this basically what's in https://link.springer.com/chapter/10.1007/3-540-58450-1_40, but with types for the elements of the V and E sets. (I can't check right now.)

Kyle Miller (Mar 28 2025 at 17:59):

Here's another definition I experimented with: https://github.com/leanprover-community/mathlib3/blob/3c940418e050f85795b91778aa17fdf7c354c700/src/combinatorics/multigraph/basic.lean#L58-L60

I think it might be equivalent; it uses, basically, a set of triples V x V x E that's symmetric in the first two components, rather than an Inc relation.

Kyle Miller (Mar 28 2025 at 18:02):

(As always, you'll have to decide how you want to define the degree of a vertex that has a loop. The topologist in me wants it to be 2, but with all these proposals the easiest answer is 1.)

Peter Nelson (Mar 28 2025 at 18:02):

It's 2!

Peter Nelson (Mar 28 2025 at 18:07):

(Provided you want a handshake theorem, at least)

Peter Nelson (Mar 28 2025 at 18:10):

(and nice linear algebra over Z2)

Kyle Miller (Mar 28 2025 at 18:10):

How do you define degree to get 2 though?

Peter Nelson (Mar 28 2025 at 18:17):

edit : nevermind

Peter Nelson (Mar 28 2025 at 18:50):

Yeah, it's annoying. Define a function f from edges to Sym2; then the degree of a vertex is obtained by the sum over e of the sum over f e of an indicator function.

Peter Nelson (Mar 28 2025 at 18:52):

But the rewards are obvious!

Peter Nelson (Mar 28 2025 at 19:48):

This might be look a bit heavy-handed, but I think it works.

structure Graph (α β : Type*) where
  V : Set α
  E : Set β
  inc : β → α →₀ ℕ
  sum_eq : ∀ e ∈ E, (inc e).sum (fun _ x ↦ x) = 2
  vertex_support : ∀ e v, inc e v ≠ 0 → v ∈ V
  edge_support : ∀ e v, inc e v ≠ 0 → e ∈ E

def Graph.IsLoopAt (G : Graph α β) (e : β) (x : α) : Prop := G.inc e x = 2

def Graph.IsNonloopAt (G : Graph α β) (e : β) (x : α) : Prop := G.inc e x = 1

Peter Nelson (Mar 28 2025 at 19:49):

The handshake theorem should naturally be true with no fussing with loops, and I think even contractions will work nicely.

Aaron Liu (Mar 28 2025 at 19:50):

What is sum_eq : ∀ e ∈ E, (inc e).sum (fun _ x ↦ x) = 2 for?

Aaron Liu (Mar 28 2025 at 19:52):

Why not just have inc : β → Sym2 α instead?

Peter Nelson (Mar 28 2025 at 19:52):

The sum of the incidences for the vertices at each edge e is either 1+1=2 (for a nonloop e) or 2=2 (for a loop e)

Peter Nelson (Mar 28 2025 at 19:53):

Aaron Liu said:

Why not just have inc : β → Sym2 α instead?

Because that will be ugly for nonelements of E.

Peter Nelson (Mar 28 2025 at 19:54):

(In my opinion) the design should allow for deleting edges in a Graph a b to get another Graph a b, without changing the edge type.

Aaron Liu (Mar 28 2025 at 19:56):

You could also go hypergraph and drop it and then edges connect arbitrarily many vertices

Peter Nelson (Mar 28 2025 at 19:57):

You can also allow for digraphs by having the incidence function be Int-valued (plus or minus 1) and summing absolute values.

Peter Nelson (Mar 28 2025 at 20:26):

The annoyance is that this will require Decidable in places that Prop-valued predicates won’t

Peter Nelson (Mar 29 2025 at 10:50):

I believe there is no way to avoid the decidability assumption in the following :

def Graph.edgeDel (G : Graph α β) (D : Set β) [DecidablePred (· ∈ D)] : Graph α β where
  V := G.V
  E := G.E \ D
  inc := fun e ↦ if e ∈ D then 0 else G.inc e
  sum_eq e he := by simp [he.2, G.sum_eq e he.1]
  vertex_support e v h := G.vertex_support e _ <| by aesop
  edge_support e v h := ⟨G.edge_support _ v (by aesop), by aesop⟩

Peter Nelson (Mar 29 2025 at 11:00):

Ok, I was wrong; you can make it mostly invisible this way.

noncomputable def Graph.edgeDel (G : Graph α β) (D : Set β) : Graph α β where
  V := G.V
  E := G.E \ D
  inc e :=
    haveI := Classical.dec (e ∈ D)
    if e ∈ D then 0 else G.inc e
  sum_eq e he := by simp [he.2, G.sum_eq e he.1]
  vertex_support e v h := G.vertex_support e _ <| by aesop
  edge_support e v h := ⟨G.edge_support _ v (by aesop), by aesop⟩

Peter Nelson (Mar 29 2025 at 19:50):

This file uses the above to define multigraphs, edge deletions, extensionality with prop-valued incidence, loops, nonloops, images under maps that merge vertices, and the handshake theorem (with no special-casing for loops) in just over 200 lines.

Pieter Cuijpers (Mar 31 2025 at 10:08):

Ahh! The discussion is spiralling in quite a different direction then I would have expected, but it does give me some answer as to why there is no clear-cut definition yet :sweat_smile: . I was just being happily naive.

Pieter Cuijpers (Mar 31 2025 at 10:30):

But why the explicit addition of having sets of vertices and edges?
Would something like:

structure Graph (V E : Type*) where
  inc : E → V → ... whatever needs to be here

already give you enough structure to work with?

From the remarks you're making, you have experience with how the proofs will go further down the line.
I'm trying to understand why you make the choices the way you do here.

Peter Nelson (Mar 31 2025 at 10:50):

Yes, there is a reasonable discussion to be had about design here, and on this matter I’m expressing one particular point of view, which is informed by my work on matroids. I discuss this at length in this post: https://leanprover.zulipchat.com/#narrow/stream/287929-mathlib4/topic/The.20design.20of.20matroids

The issues in graph theory aren’t the same, but the analogies are there.

The shorter version: in real-world graph theory proofs, you might do something like contract an edge e, delete a vertex x, and consider the role of a vertex v and a path P, before and after the contractions and deletions were performed. If the graphs $G$ , $G-x$ , $G / e$ and $(G-x)/e$ are all modeled using different types, then there are already four ‘versions’ of v and P, related by a mess of invisible canonical maps. The more involved the proof, the worse this picture gets, and in my opinion this is a road to disaster.

To avoid this, you want the four graphs to have the same type. I don’t know a better choice for this type than simply Graph a b with vertex and edge sets as structure fields.

Pieter Cuijpers (Mar 31 2025 at 11:22):

I see. Would there, alternatively, be a way to turn the bookkeeping over all these invisible canonical maps into a tactic?

Personally, I like it when the definitions are kept as simple as possible, sometimes at the cost of a bit of bookkeeping in the proofs. In my work on process algebras (computer science) the bookkeeping is usually unavoidable, which is the very reason I like theorem provers. They do that bookkeeping for me. But it's only fun if it remains readable indeed.

Pieter Cuijpers (Mar 31 2025 at 11:24):

I see that in the other discussion, there were also people who usually have fixed graphs they are studying, in which case my definition seems preferable.

Pieter Cuijpers (Mar 31 2025 at 11:26):

Perhaps there is a general trend here... "fixed structures" versus "mutable structures", requiring different definitions even while they belong to the same branch of mathematics.

Aaron Liu (Mar 31 2025 at 11:28):

Peter Nelson said:

To avoid this, you want the four graphs to have the same type. I don’t know a better choice for this type than simply Graph a b with vertex and edge sets as structure fields.

You could bundle the type, so you have Graph.{v, u} : Type max u v, and for g : Graph you would talk about g.Vertex and g.Edge.

Peter Nelson (Mar 31 2025 at 11:32):

Maybe a bookkeeping tactic is possible. But it is hard to even precisely delineate the intended scope of such a tactic, and often it is the statements rather than the proofs of theorems that become horrible. Think about statements of the form 'minor operations commute' for example.

In my approach, the bookkeeping tactic needed is one that is capable of proving using existing assumptions that things are 'real' vertices and edges (members of G.V and G.E respectively). This is much more limited in scope, and can be implemented in practice.

Also, the model where the edges and vertices are types are easy to describe in terms of the more general one. Have propositional typeclasses [FullVertex G] and [FullEdge G] for G : Graph a b that state G.V = Set.univ and G.E = Set.univ. A lot of the API can be shared with the general case, and there can be specialized sugar API for the FullVertex/FullEdge cases where it is needed.

Peter Nelson (Mar 31 2025 at 11:36):

Aaron Liu said:

Peter Nelson said:

To avoid this, you want the four graphs to have the same type. I don’t know a better choice for this type than simply Graph a b with vertex and edge sets as structure fields.

You could bundle the type, so you have Graph.{v, u} : Type max u v, and for g : Graph you would talk about g.Vertex and g.Edge.

If I remove a vertex v, then how do the types G.vertex and (G-v).vertex relate?

Aaron Liu (Mar 31 2025 at 11:36):

Peter Nelson said:

Think about statements of the form 'minor operations commute' for example.

This seems relatively pain-free to write out, what problem are you seeing?

Peter Nelson (Mar 31 2025 at 11:37):

Suppose I have two edges e, f, and I want to contract them in one order versus another. How do I state that I have 'the same' graph both ways?

Aaron Liu (Mar 31 2025 at 11:40):

This is a def contractCommIso : (g.contract e).contract (map (contractGraphHom g e) f) ≃g (g.contract f).contract (map (contractGraphHom g f) e)

Aaron Liu (Mar 31 2025 at 11:40):

hmmm, this turned out longer than I thought it would be

Peter Nelson (Mar 31 2025 at 11:41):

And it's the beginning of DTT hell. Now you are pushing forward paths along that map, taking initial subsegments of the paths, etc etc.

Peter Nelson (Mar 31 2025 at 11:41):

It should be an = if at all possible, and for this example, = is possible.

Aaron Liu (Mar 31 2025 at 11:42):

I don't see how using = makes it any better though...

Peter Nelson (Mar 31 2025 at 11:44):

It means that pushforward/pullback API stops being API, and becomes rw.

Pieter Cuijpers (Mar 31 2025 at 11:52):

(newb question: what is DTT?)

Peter Nelson (Mar 31 2025 at 11:52):

Dependent type theory.

Peter Nelson (Apr 03 2025 at 16:42):

To give a concrete example on this point, I was recently looking at the definition docs#SimpleGraph.Subgraph.Connected . There is roughly 500 lines of API for it (and it's still incomplete), even though it is just SimpleGraph.Connected modulo some mathematically invisible maps.

If the vertex set for SimpleGraph were aSet, there would have been no need to have separate SimpleGraph and Subgraph types at all, and this kind of API nearly disappears.

Peter Nelson (Apr 16 2025 at 21:05):

#24122

Last updated: May 02 2025 at 03:31 UTC