## Stream: Is there code for X?

### Topic: Graph Theory

#### Anatole Dedecker (Jun 23 2020 at 21:07):

Do we have Graph Theory in mathlib ? If not, I'd be happy to give it a try this summer

#### Alex J. Best (Jun 23 2020 at 21:08):

There was some work on it with the goal of formalizing the disproof of hedetniemis conjecture at https://github.com/leanprover-community/mathlib/tree/hedetniemi/src/graph_theory

#### Alena Gusakov (Jun 23 2020 at 21:10):

There's some stuff like Ramsey theory, Kruskal-Katona/Erdos-Ko-Rado (not in mathlib yet, they're on personal repos)

#### Alex J. Best (Jun 23 2020 at 21:10):

Searching for Hedetniemi in chat will bring up some threads I guess.

#### Alena Gusakov (Jun 23 2020 at 21:11):

Sorry those things aren't in mathlib btw

#### Alena Gusakov (Jun 23 2020 at 21:13):

I started trying to work on some basic definitions but I kept changing my mind on how I wanted to do that lol

#### Bhavik Mehta (Jun 23 2020 at 21:14):

Alena Gusakov said:

I started trying to work on some basic definitions but I kept changing my mind on how I wanted to do that lol

I think this is the same reason no-one else has made a PR yet either

#### Bryan Gin-ge Chen (Jun 23 2020 at 21:14):

There's an even older post here: https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/Type.20classes.20again/near/167340123 (this could probably be PR'd to archive/ once we get some of the stuff into mathlib)

#### Kyle Miller (Jun 23 2020 at 21:15):

I'm doing some graph theory things right now. There are a few ways of defining graphs, depending on what you're wanting to do. One version is using relations, another is Type-valued "relations" for multiple edges between vertices, and one I have now is like 1-D CW complexes. With the latter, I proved yesterday the handshaking lemma, that the sum of the degrees of the vertices is twice the number of edges.

#### Bryan Gin-ge Chen (Jun 23 2020 at 21:17):

There are some papers on formalizing graph theory in Coq by Doczkal and collaborators here which might be helpful when thinking about design decisions. I haven't had a chance to really dig in though.

#### Bhavik Mehta (Jun 23 2020 at 21:17):

Kyle Miller said:

I proved yesterday the handshaking lemma, that the sum of the degrees of the vertices is twice the number of edges.

Yeah this is a good exercise - I did it too when I proved Ramsey: https://github.com/b-mehta/combinatorics/blob/extras/src/handshaking.lean#L122

#### Anatole Dedecker (Jun 23 2020 at 21:21):

Ok so it has been done a lot but not in mathlib

#### Anatole Dedecker (Jun 23 2020 at 21:21):

Alex J. Best said:

There was some work on it with the goal of formalizing the disproof of hedetniemis conjecture at https://github.com/leanprover-community/mathlib/tree/hedetniemi/src/graph_theory

By "was" do you mean it's inactive ?

#### Bhavik Mehta (Jun 23 2020 at 21:22):

Yeah, there's no one actively working on it any more unfortunately; though as far as I know it's the biggest bit of graph theory that's been done in lean

#### Kyle Miller (Jun 23 2020 at 21:22):

@Bhavik Mehta It looks like you have it for simple graphs. Since I'm going to need it, I used a definition that allows loops and multiple edges: https://github.com/kmill/lean-graphs/blob/master/src/graphs.lean#L25

#### Alex J. Best (Jun 23 2020 at 21:22):

Well the last commit was a couple of months ago, and the thread died off, but it was only this year so it should be in good shape still.

#### Bhavik Mehta (Jun 23 2020 at 21:23):

Kyle Miller said:

Bhavik Mehta It looks like you have it for simple graphs. Since I'm going to need it, I used a definition that allows loops and multiple edges: https://github.com/kmill/lean-graphs/blob/master/src/graphs.lean#L25

Ah nice, yeah I see

#### Kyle Miller (Jun 23 2020 at 21:23):

(and every relation-based graph can be represented as such a graph using the constructor at https://github.com/kmill/lean-graphs/blob/master/src/graphs.lean#L128)

#### Alex J. Best (Jun 23 2020 at 21:24):

I wasn't too involved, but an important part of the disproof is the construction of some specific graph with a particular girth and chromatic number, the paper used a probabilistic argument here and my impression was that such arguments are harder to develop in lean at present. Might be better to try a more explicit approach to constructing this part of the proof instead.

#### Anatole Dedecker (Jun 23 2020 at 21:27):

Alex J. Best said:

Well the last commit was a couple of months ago, and the thread died off, but it was only this year so it should be in good shape still.

I'll have a look and see if I can help, though I may be overestimating myself here :sweat_smile:

#### Alex J. Best (Jun 23 2020 at 21:32):

Yeah I'm thinking of the paper https://arxiv.org/abs/2004.09028 which looks like it might give a more explicit example for that step than the Erdos-Renyi stuff (but once again I am not a graph theorist so would welcome corrections!)

#### Anatole Dedecker (Jun 23 2020 at 21:36):

Bryan Gin-ge Chen said:

There are some papers on formalizing graph theory in Coq by Doczkal and collaborators here which might be helpful when thinking about design decisions. I haven't had a chance to really dig in though.

If I wanna work on a generic-purpose approach, I think I'm going to keep that as a reference

(deleted)

(deleted)

#### Kyle Miller (Jun 23 2020 at 21:52):

@Anatole Dedecker Given my experiences so far with graphs (where I need non-simple graphs with multiple edges) and what I've seen in those papers that @Bryan Gin-ge Chen kindly shared, here's how I might define directed and undirected non-simple graphs (optionally edge-labeled):

-- An edge-labeled directed graph
structure dgraph (L : Type*) :=
(V : Type*) (E : Type*) (p : E → V × V) (ℓ : E → L)

inductive sym2_rel (α : Type*) : (α × α) → (α × α) → Prop
| refl (x y : α) : sym2_rel ⟨x,y⟩ ⟨x,y⟩
| swap (x y : α) : sym2_rel ⟨x,y⟩ ⟨y,x⟩

-- The symmetric square is the cartesian product α × α modulo swap.
def sym2 (α : Type*) := quot (sym2_rel α)
def incl_diag {α : Type*} (x : α) : sym2 α := quot.mk (sym2_rel α) ⟨x, x⟩

-- An edge-labeled undirected graph
structure graph (L : Type*) :=
(V : Type*) (E : Type*) (p : E → sym2 V) (ℓ : E → L)

-- For example,
def loopless {L} (g : graph L) :=
∀ (v : g.V), ¬∃ (e : g.E), g.p e = incl_diag v


The benefit with this definition for undirected graphs is that the edges are intrinsically undirected. (Depending on your tastes, you might change sym2 to finset and add in the axiom that (p e).card is either 1 or 2.)

#### Anatole Dedecker (Jun 23 2020 at 21:56):

Yeah I think I'd have done something like that. In classical logic I would rather have used quotient, but I've never used them in Lean yet, so I don't really know

#### Yury G. Kudryashov (Jun 23 2020 at 22:13):

I'd move V to the arguments because sometimes you want to consider a graph on a given type and having two ways to represent a type (α and g.V) can be bad for simp etc. But this is not too important, and it would be nice of someone finally PRs one of the definitions + basic theory to mathlib.

#### Anatole Dedecker (Jun 23 2020 at 22:16):

Yup that is true indeed, in some sense we are building a structure of graph on the type of its vertices

#### Kyle Miller (Jun 23 2020 at 22:22):

@Bhavik Mehta Yeah, the "Type/Sort-valued relation" seems like a good generalization of the Prop-valued relation approach to graphs, and it probably also makes edge labelings sort of come for free if you want them. One thing I want to do is induct on the edge set in a graph, and I went for the sort of dual definition (edges are attached to the vertices) in anticipation of that. You also get the number of edges in the graph by looking at the cardinality of the edge set this way, rather than having to sum up the edge sets over all pairs of vertices.

One thing I've struggled with when thinking about a graph library is how a lot of graph theory can be turned into statements about irreflexive symmetric relations on a (finite) type. It makes me think that there should be some part of mathlib about this specific case, maybe defining relation.graph and giving basic statements about these sorts of relations. However, there's also the part of graph theory that considers non-simple graphs, and for this you might have something like the definitions I gave. @Yury G. Kudryashov Do you think it's reasonable to have both definitions in mathlib?

#### Kyle Miller (Jun 23 2020 at 22:27):

Anatole Dedecker said:

Yup that is true indeed, in some sense we are building a structure of graph on the type of its vertices

I think the decision might be more based on whether you think you want to consider lots of different graphs on the same vertex set. I wasn't sure which way to go with the above definitions (and, in fact, I've been doing it with graph V in my own code). It seems like it would be annoying working with {g // g.V = V}, and if you really did need the type of all graphs, there's always Σ (V : Type*), graph V.

#### Yury G. Kudryashov (Jun 23 2020 at 22:28):

It's Σ V, graph V, not Π.

Oh, thanks!

#### Yury G. Kudryashov (Jun 23 2020 at 22:28):

Π type is the type of functions sending each V to some graph on V.

#### Yury G. Kudryashov (Jun 23 2020 at 22:31):

I'm not sure how many definitions of a graph do we want to have in mathlib but I'm pretty sure that we need at least one.

#### Yury G. Kudryashov (Jun 23 2020 at 22:31):

I'd suggest starting with some general definition, then see what special cases do we need.

#### Kyle Miller (Jun 23 2020 at 22:39):

Do you know if mathlib already has symmetric products (like sym2)? If not, where would be a good place to put their definition?

#### Scott Morrison (Jun 23 2020 at 22:40):

I'd encourage PR'ing parts of the hedetniemi branch. The definitions there have been at least partially battle-tested.

#### Kyle Miller (Jun 23 2020 at 22:57):

It seems like the definition of a multigraph isn't quite right for loop edges. You'd want inv to be the identity equivalence on the diagonal, I think.

#### Aaron Anderson (Jun 23 2020 at 23:40):

@Jalex Stark and I have done all of the graph theory in a proof of the Friendship Theorem, but we've slowed down on the linear algebra. https://github.com/jalex-stark/friendship-theorem/tree/master/src

#### Aaron Anderson (Jun 23 2020 at 23:43):

I'd be happy to join a bigger graph theory (or other combinatorics) project, if people want a summer collaboration.

#### Yury G. Kudryashov (Jun 24 2020 at 02:27):

Don't forget: whatever is not PRed, will stop compiling against master soon.

#### Yury G. Kudryashov (Jun 24 2020 at 02:28):

And it's better to make 10 small PRs than one huge PR.

#### Yury G. Kudryashov (Jun 24 2020 at 02:30):

So, don't wait till you have nice theorems.

#### Yury G. Kudryashov (Jun 24 2020 at 02:31):

A PR with basic definitions and "obvious" simp lemmas is a good start.

#### Scott Morrison (Jun 24 2020 at 04:01):

We can always fix the definitions in later PRs. :-)

#### Bryan Gin-ge Chen (Jun 24 2020 at 04:40):

If no one is already working on a PR, I propose that someone who worked on the hedetniemi branch either:

1. makes a PR, or
2. gives pointers to what files are in PR-ready shape

:slight_smile:

#### Kyle Miller (Jun 24 2020 at 05:04):

I'm working on a PR. I'm taking some of what's in hedetniemi and reworking it for multigraphs. I'm just trying to prove that the definition I'm using is equivalent to (a corrected version) of what's already there.

#### Bryan Gin-ge Chen (Jun 24 2020 at 05:09):

Great! I've sent you an invitation to collaborate, so you can push to a branch in mathlib and PR from that.

#### Scott Morrison (Jun 24 2020 at 07:26):

@Kyle Miller, what was the mistake that needed correcting?

#### Scott Morrison (Jun 24 2020 at 07:27):

Will your generalisation to multigraphs still be usable for simple graphs?

#### Kyle Miller (Jun 24 2020 at 07:30):

The mistake is that in

structure multigraph (V : Type u) extends directed_multigraph.{v} V :=
(inv : Π (x y), edge x y ≃ edge y x)


the inv x x permutation might not be trivial, so you can do some weird quantum thing where you keep flipping over the edge and get a totally new one. It seems loop edges aren't particularly popular to consider, but I use them a lot in my own work so I'd like them to be supported.

#### Kyle Miller (Jun 24 2020 at 07:33):

My plan for graphs is something like this:

structure multigraph (V : Type u) :=
(E : Type v) (ends : E → sym2 V)

structure graph (V : Type u) extends multigraph V :=
(single_edge : injective ends)

structure simple_graph (V : Type u) extends graph V :=
(loopless : ∀ e : E, ¬(ends e).is_diag)


There would be a number of constructors to put the main ways of working with graphs into one of these forms. For example, there's one that takes a corrected version of the hedetniemi-branch multigraph in the following format:

structure edge_set_desc (V : Type u) :=
(edge : V → V → Sort v)
(inv : Π x y, edge x y ≃ edge y x)
(loops : Π x, inv x x = equiv.refl (edge x x))


#### Kyle Miller (Jun 24 2020 at 07:36):

Another constructor (not yet written) would be

def from_relation {α : Type u} {r : α → α → Prop} (h1 : symmetric r) (h2 : irreflexive r): simple_graph α


#### Kyle Miller (Jun 24 2020 at 07:37):

Does this use of structure extensions seem OK?

#### Kyle Miller (Jun 24 2020 at 07:38):

I'm also unsure about the terminology. And, maybe it would make sense to combine graph and simple_graph, since now that I think about it, graph doesn't seem particularly useful.

#### David Wärn (Jun 24 2020 at 08:43):

It's good to hear that you're working on a PR! There are a lot of questions here about how to set up definitions. One of the most basic is whether we want a type E of all edges together with an assignment of endpoints, or an indexed family V -> V -> Sort u. One issue with the former definition is that when you say "given a b : V, consider an edge e with endpoints a b", then the endpoints of e will never be definitionally equal to a b -- instead you will have some equalities to rewrite along. (It's easier to bundle an unbundled definition than it is to unbundle a bundled definition.)

#### David Wärn (Jun 24 2020 at 08:53):

I think this issue of loops being their own reversal never came up because Johan was working exclusively with the Prop-case, where loops definitionally equal their reversals. Another lawfulness issue is that edge reversal should be involutive -- the reversal of the reversal is the original loop. I think sometimes you might want this condition, while allowing loops which do not equal their reversal? (The example I have in mind is a groupoid, where the reversal of an arrow / edge is its inverse.)

#### Kyle Miller (Jun 24 2020 at 08:53):

An awkwardness with undirected graphs is that edges don't really have endpoints in a specific order. It seems the indexed family approach requires you to sort of keep track of a quotient type yourself by having some involution that inverts edges. The dual to the approach I was considering might be having the edges be an indexed family sym2 V -> Sort u, which makes the edges inherently unoriented.

I need to go to sleep now, but here's what I have so far: https://github.com/leanprover-community/mathlib/tree/graphs/src/combinatorics/graphs

#### David Wärn (Jun 24 2020 at 08:59):

Yes, undirected graphs are awkward in the indexed approach. edge : sym2 V -> Sort u might also be problematic since edge (a, b) and edge (b, a) are still not def eq.

#### Kyle Miller (Jun 24 2020 at 09:00):

David Wärn said:

I think this issue of loops being their own reversal never came up because Johan was working exclusively with the Prop-case, where loops definitionally equal their reversals. Another lawfulness issue is that edge reversal should be involutive -- the reversal of the reversal is the original loop. I think sometimes you might want this condition, while allowing loops which do not equal their reversal? (The example I have in mind is a groupoid, where the reversal of an arrow / edge is its inverse.)

Loops don't seem to be too popular, so that's fair. The sorts of things I have in mind in the future (say, Tutte polynomials) involve any numbers of loops at vertex, so it would be nice to make sure this works.

(I sort of like the approach of gluing edges to the vertices, but I'm probably just biased as a topologist.)

#### Kyle Miller (Jun 24 2020 at 09:08):

@David Wärn Another approach I was using in a different project is

structure multigraph (V : Type u) :=
(D : Type u) (ε : perm D) (ϕ : D → V) (ε_inv : involutive ε) (ε_fp_free : ¬has_fixed_point ε)


Here, D is the set of "darts," which are the two ends of an edge. The edges are the orbits under ε. This, at least, makes accounting easier for the proof that ∑ (v : V), g.deg v = 2 * g.nedges, since it has an intermediate g.ndarts.

#### Anatole Dedecker (Jun 24 2020 at 11:16):

Wow great to see someone taking care of it ! @Kyle Miller Are you restarting from scratch or do you use the hedetniemi base ?

#### Kyle Miller (Jun 24 2020 at 18:21):

@Anatole Dedecker Right now, I'm looking deeper into the literature and trying to prove some basic theorems with a given formalism to evaluate the options, and the plan is to port over the hedetniemi branch as much as possible. I'm happy to have some help, especially since you wanted to work on graphs, too -- I mostly just want to make sure whatever graph formalism mathlib has is general enough for the sorts of things I like to do with graphs.

This paper's approach looks very promising. It avoids the mess of dealing with quotient types (sym2), which has been OK but leaves something to be desired. Here's the paper's definition in Lean (modified a bit to make use of some Lean features):

universes u v

-- A *link* represents an edge along with its endpoints in some given
-- order.
structure links (V : Type u) (E : Type v) :=
(src : V) (via : E) (dest : V)

variables {V : Type u} {E : Type v}

-- Since we are modeling undirected graphs, two links are essentially
-- the same if they correspond to the same edge.
λ x₁ x₂, x₁ = x₂ ∨ x₁ = x₂.rev

-- Surjectivity of f restricted to s.
def surjective_on {α : Type u} {β : Type v} (f : α → β) (s : set α) := ∀ y, ∃ x ∈ s, f x = y

-- Injectivity of f on a given set s modulo a given relation R.
def injective_mod_on {α : Type u} {β : Type v} (R : α → α → Prop) (f : α → β) (s : set α)
:= ∀ x ∈ s, ∀ y ∈ s, y ∈ s → f x = f y → R x y

-- A multigraph consists of a vertex set and an edge set along with
-- exactly two links per non-loop edge and one link per loop edge,
-- representing how the edge is glued to the vertices.
structure multigraph (V : Type u) :=
(E : Type v) (L : set (links V E))


This definition will support things like paths through a graph as a sequence of links with compatible src and dest fields.

A "dual" version, like in hedetneimi, might be

structure multigraph (V : Type u) :=
(E : Type v) (edges : V → V → set E)
(all_edges : ∀ e, ∃ x y, e ∈ edges x y)
(reversible : ∀ x y, edges x y = edges y x)
(adequate : ∀ {x y} (e ∈ edges x y) {x' y'}, e ∈ edges x' y' → (x = x' ∧ y = y') ∨ (x = y' ∧ y = x'))


This unbundles the hedetneimi graph definition even further, I think, since we don't need to think of edges as being equivalence classes under the inv operation: there is an actual type representing the edges.

I need to think about @David Wärn's point about definitional equalities. It seems to me that the first definition might indirectly address it: you ask for the set of links between vertices a and b, and these carry the corresponding edges. The second definition gives you edge sets, and the reversible axiom identifies edge a b with edge b a. I'd like having a total edge set, like in this definition, since you can do things like define a finite graph as one where both V and g.E are fintypes, so perhaps this is a nice variation on the hedetneimi branch's definition.

I'm going to try rewriting everything I have using this second definition and see how it goes -- while many things were OK already, I think this one will go more smoothly. (I would appreciate any thoughts about improving these definitions, including finding better names for the structure's fields, especially adequate.)

#### Anatole Dedecker (Jun 25 2020 at 17:37):

This all looks very promising ! To be honest I'm not at all a specialist of graph theory, so I'm happy you're doing the hard part of choosing the best definition, I wouldn't have been self-confident enough to do it :sweat_smile: . But once you've made your choice I'll be glad to help you stating and proving as many useful lemmas as I can (I even do know some friends who might be interested too) !

#### Kyle Miller (Jun 25 2020 at 18:27):

Anatole Dedecker said:

But once you've made your choice I'll be glad to help you stating and proving as many useful lemmas as I can (I even do know some friends who might be interested too) !

Great! My goal is to finish up this current design today or tomorrow, since a variation on Chou94's approach seems like it's working out well enough for what you'd want out of multigraphs and simple graphs.

There are some surprising issues when it comes to some basic graph operations. For example, none of the approaches seem to let you define a computable function that gives you the vertex opposite a given edge from a vertex. The Chou approach sidesteps this by having you work with links instead, which already carry that opposite vertex (proving it was computable). I'm wanting to avoid anything noncomputable as far as possible.

The definition ends up being very similar to the one in hedetneimi. There, you effectively have a function edges : V → V → set E that gives the edge set between two vertices along with an axiom ∀ v w, edges v w = edges w v. Here, it's instead essentially a function links : set (V × E × V) with an axiom that reversing a link is still a link. While this is just an uncurried version of the edges function (recall, a set is a function to Prop), it's proven to be convenient working with elements of V × E × V. (Don't worry, I'm just expanding definitions here for sake of explanation. There is a structure definition for links.)

#### David Wärn (Jun 25 2020 at 20:57):

Hmm, it's interesting that "the other end" should be noncomputable. Fwiw, I think the issue is with Prop -- it generally causes issues when you want things to be computable. Here I computably define "the other element" of an unordered pair, by defining membership using trunc instead of Prop:

import trunc
import tactic
universe u
variable (α : Type u)

-- the relation "equal as unordered pairs"
inductive swap_rel : α × α → α × α → Prop
| same (a b : α) : swap_rel (a, b) (a, b)
| swap (a b : α) : swap_rel (a, b) (b, a)

-- the relation "equal as unordered pairs" is an equivalence
instance swap_setoid : setoid (α × α) :=
begin
refine_struct { r:= swap_rel α }, tidy;
{ cases_matching* swap_rel _ _ _;
apply swap_rel.same <|> apply swap_rel.swap, },
end

-- hence we can quotient by it
def unordered_pair := quotient (swap_setoid α)

variable {α}

-- swapping order really does give the same unordered pair
lemma eq_swap {a b : α} : ⟦(a, b)⟧ = ⟦(b, a)⟧ :=
by { rw quotient.eq, apply swap_rel.swap }

-- the "other element" of an unordered pair is unique
lemma other_unique (a b c : α) (h : ⟦(a, b)⟧ = ⟦(a, c)⟧) : b = c :=
by { rw quotient.eq at h, cases h; refl }

-- trunc_mem a p is a constructive way of saying that a is an element of p.
-- Crucially, it is a subsingleton, but not a Prop.
def trunc_mem (a : α) (p : unordered_pair α) : Type u :=
trunc ${b // p = ⟦(a, b)⟧} -- if a is an element of p, then we can computably extract "the other element" of p def other (a : α) (p : unordered_pair α) (h : trunc_mem a p) : α := trunc.rec_on h subtype.val$ λ ⟨b, _⟩ ⟨c, _⟩,
by { convert other_unique a b c (by cc), simp }

-- checking that other computes as expected
variables (a b : α)
#reduce other a ⟦(a, b)⟧ (trunc.mk ⟨b, rfl⟩) -- b
#reduce other b ⟦(a, b)⟧ (trunc.mk ⟨a, eq_swap⟩) -- a


#### Kyle Miller (Jun 25 2020 at 21:29):

Thanks for showing me this use of trunc. A few days ago I was trying to prove other for unordered_pair, but I ran into issues because I only had a ∈ p with a has_mem instance that checked if a equaled either component of the pair.

#### Aaron Anderson (Jul 18 2020 at 03:41):

@Jalex Stark and I are almost done with the Friendship Theorem, and have started breaking up non-graph-theoretic parts into PRs. https://github.com/jalex-stark/friendship-theorem

#### Aaron Anderson (Jul 18 2020 at 03:44):

The definition of a graph is on adjacency_matrix.lean. Does anyone with experience on Hedetniemi want to discuss with me whether I should change that definition? Perhaps model theory has biased me, but I like thinking of simple graphs as basically just relations, which can be coerced to be two-sorted multigraphs when relevant.

#### Aaron Anderson (Jul 18 2020 at 03:45):

@Kyle Miller @David Wärn @Johan Commelin

#### Aaron Anderson (Jul 18 2020 at 03:47):

While some of this is just going in the Freek 100 archive, I want to be consistent with whatever else is going into mathlib.

#### Kyle Miller (Jul 18 2020 at 04:44):

This seems to be the correct definition for a simple graph. In what I've been working on, I have the same simple_graph (though I decided to reserve E for sym2.from_rel of the relation) and a multigraphs typeclass for giving it the structure of a multigraph when needed.

#### Aaron Anderson (Jul 18 2020 at 05:00):

Cool. It sounds like merging eventually will be easy, but link me to some code if you want me to adjust mine.

#### Kyle Miller (Jul 18 2020 at 06:02):

I've been mostly focusing on multigraphs, but here's what the simple graphs version might look like. This seems like a good enough time to check the underlying idea with everyone. One design goal is to be able to treat subgraphs of a given graph as graphs themselves.

A conceptual problem with graphs is that the usual synecdoche of referring to a structure by its carrier type fails: a graph is both its vertex and edge sets simultaneously. A way I found to get around this is to say a type consists of graphs. Then, you can have the variables {α : Type*} [simple_graphs α] (G : α)  to be able to refer to G as a graph, as you'd want. We can also write G' : subgraph G to denote a subgraph, and there is an instance so that the graph interface applies to G', too.

I'd like to hear if there are any problems with this approach, or better ways of doing things.

import data.sym2
import tactic

/--
A type consists of simple graphs if each term has a corresponding
vertex type and symmetric irreflexive adjacency relation on the
vertices.  See simple_graph for the primary implementation.
-/
class simple_graphs (α : Type*) :=
(V : Π G : α, Type*)
(adj : Π G : α, V G → V G → Prop)
(sym : Π G, symmetric (adj G))
(loopless : Π G, irreflexive (adj G))

namespace simple_graphs

/--
The edge set of a simple graph consists of all the unordered pairs
-/
def E {α : Type*} [simple_graphs α] (G : α) : set (sym2 (V G)) :=
sym2.from_rel (sym G)

end simple_graphs

/--
A simple graph on a vertex set V is an irreflexive symmetric
relation, representing which vertices are adjacent.
-/
structure simple_graph (V : Type*) :=
(adj : V → V → Prop)

/--
Of course, the type of simple graphs on a vertex set consists of
simple graphs.
-/
instance {V : Type*} : simple_graphs (simple_graph V) :=
{ V := λ G, V,
sym := simple_graph.sym,
loopless := simple_graph.loopless }

def complete_graph (V : Type*) : simple_graph V :=
{ adj := λ v w, v ≠ w,
sym := by tidy,
loopless := by tidy }

open simple_graphs

/--
A subgraph of a simple graph G.
-/
structure subgraph {α : Type*} [simple_graphs α] (G : α) :=
(V' : set (V G))
(E' : set (sym2 (V G)))
(edge_subset : E' ⊆ E G)
(has_ends : ∀ (e ∈ E') (v ∈ e), v ∈ V')

instance {α : Type*} [simple_graphs α] (G : α) : simple_graphs (subgraph G) :=
{ V := λ G', subgraph.V' G',
adj := λ G' v w, ⟦(v.val, w.val)⟧ ∈ subgraph.E' G',
sym := by { intros G' v w h, rwa sym2.eq_swap },
loopless := begin
intros G' v h,
apply sym2.from_rel_irreflexive.mp (simple_graphs.loopless G) (subgraph.edge_subset _ h),
rw sym2.is_diag_iff_proj_eq,
end }

/--
The type of subgraphs on a given simple graph is a bounded lattice.
-/
instance {α : Type*} [simple_graphs α] (G : α) : bounded_lattice (subgraph G) := sorry

/--
A spanning subgraph consists of all of the vertices along with a
subset of the edges.
-/
structure spanning_subgraph {α : Type*} [simple_graphs α] (G : α) :=
(E' : set (sym2 (V G)))
(edge_subset : E' ⊆ E G)

instance {α : Type*} [simple_graphs α] (G : α) : simple_graphs (spanning_subgraph G) :=
{ V := λ G', V G,
adj := λ G' v w, ⟦(v, w)⟧ ∈ spanning_subgraph.E' G',
sym := by { intros G' v w h, rwa sym2.eq_swap },
loopless := begin
intros G' v h,
apply sym2.from_rel_irreflexive.mp (simple_graphs.loopless G) (spanning_subgraph.edge_subset _ h),
rw sym2.is_diag_iff_proj_eq,
end }

instance {α : Type*} [simple_graphs α] (G : α) : bounded_lattice (spanning_subgraph G) := sorry


#### Aaron Anderson (Jul 19 2020 at 20:52):

@Kyle Miller, @Jalex Stark and I are about to PR a tiny bit of graph theory. We put your name on one of the files because I incorporated some of that text into it.

#3458

#### Julian Külshammer (Jul 21 2020 at 11:58):

Maybe naive question, but is there a reason why not to use has_hom from the category theory library for directed multigraphs (or vice versa). They seem more or less the same thing to me (apart from the fact that one is a structure and the other is a class).

#### Kyle Miller (Jul 21 2020 at 21:41):

@Julian Külshammer A reason I can think of is that a given vertex set might have multiple possible edge sets associated to it. In category theory, we assume that the object type determines a natural collection of morphisms. This assumption breaks down for graphs.

This, perhaps, is a way to make it work:

class digraphs (α : Type*) :=
(V : Π (G : α), Type u)
(edges : Π (G : α), V G → V G → Type v)


This says that each term G : α has an associated vertex type and has_hom structure on it.

There's also a dual version of this. Something like it is used in a paper about labeled multigraphs:

class digraphs (α : Type*) :=
(V : Π (G : α), Type u)
(E : Π (G : α), Type v)
(s : Π (G : α), E G → V G)
(t : Π (G : α), E G → V G)


A benefit is that there is a concrete edge set. This is useful, for example, in defining what a finite graph is.

Yet another possible definition, which is one step away from undirected multigraphs (which is what I was working on) is

structure link (V : Type u) (E : Type v) :=
(s : V) (e : E) (t : V)

class digraphs (α : Type*) :=
(V : Π (G : α), Type u)
(E : Π (G : α), Type v)
(links : Π (G : α), set (link (V G) (E G)))
(all_edges : Π (G : α), ∀ (e : E G), ∃ (x ∈ links G), link.e x = e)


This is surprisingly nice to work with. The links field is essentially a function V G → V G → E G → Prop (which is equivalent to V G → V G → set (E G)), but having it as a tuple seems to let you rewrite fewer things in proofs.

It's not too hard to define functions that go between each of these representations.

#### Julian Külshammer (Jul 22 2020 at 07:18):

@Kyle Miller There are many examples in category theory where you have the same vertex "set" but different kinds of morphisms between them. One very common example of this is the category of complexes, its homotopy category, or its derived category. Another example is that of the Kleisli category of a monad.

Thanks a lot for your different possible implementations of multigraphs. What you call the "dual version" is definitely the most common representation in the area of mathematics I am working in (representation theory of finite dimensional algebras), where such a thing is called a quiver.

I also saw a comment by @Reid Barton in https://github.com/leanprover-community/mathlib/pull/613 describing that has_hom was introduced to talk more easily about the free category on a graph, so I was wondering whether the two notions should be connected (or one removed and pointing to the other). I don't know enough about lean yet to answer this question, so I thought I ask here.

#### Johan Commelin (Jul 22 2020 at 07:41):

@Julian Külshammer A common trick that we use when we have the same "underlying set" is a so-called type wrapper (which is just the identity function).

def derived_category (C : Type*) := C


And then you can have different typeclasses on C and on derived_category C. In particular, their morphisms can be different.

#### Johan Commelin (Jul 22 2020 at 08:24):

Just to be clear: here I'm already thinking of C as the category of chain complexes. Not some sort of abelian category. So maybe I shouldn't have called it derived_category, but more something like localisation_category, or something like that.

#### Kyle Miller (Jul 22 2020 at 08:25):

@Julian Külshammer Yeah, I sort of had path algebras in the back of my mind when I was writing the second (and third) definitions.

Since you asked, here's how you might connect the second definition to has_hom:

import category_theory.category

universes u v

/--
Give the terms of a type the structure of a directed graph.
-/
class digraphs (α : Type*) :=
(V : Π (G : α), Type u)
(E : Π (G : α), Type v)
(source : Π {G : α}, E G → V G)
(target : Π {G : α}, E G → V G)

open digraphs

/--
Associate a has_hom structure to the digraph's vertex type.
-/
instance {α : Type*} [digraphs α] (G : α) : category_theory.has_hom (V G) :=
{ hom := λ v w, {e : E G // source e = v ∧ target e = w} }

/--
A concrete directed graph on a given vertex and edge set.
-/
structure digraph (V : Type u) (E : Type v):=
(source : E → V)
(target : E → V)

instance (V : Type u) (E : Type v) : digraphs (digraph V E) :=
{ V := λ _, V,
E := λ _, E,
source := digraph.source,
target := digraph.target }

structure subdigraph {α : Type*} [digraphs α] (G : α) :=
(V' : set (V G))
(E' : set (E G))
(has_sources : E'.image source ⊆ V')
(has_targets : E'.image target ⊆ V')

instance {α : Type*} [digraphs α] (G : α) : digraphs (subdigraph G) :=
{ V := λ H, subtype H.V',
E := λ H, subtype H.E',
source := λ H ⟨e, h⟩, ⟨source e, H.has_sources ⟨e, h, rfl⟩⟩,
target := λ H ⟨e, h⟩, ⟨target e, H.has_targets ⟨e, h, rfl⟩⟩ }


I also give a concrete definition of a digraph and sub-digraphs for illustration of the possible interface.

What I was talking about earlier is that in Lean, many algebraic and categorical objects are referred to by some sort of carrier type. Then, class resolution is able to give you the tacit structure. As you note, there are many categories that are defined on the same object set, but, as Johan explains, Lean's class resolution gives you a trick to have it give different tacit structures.

With the above has_hom instance, even though different digraphs might have the same underlying vertex types, because they are referred to V G for a specific G, the class resolution will find the correct has_hom instance for that graph. At least, that's my understanding -- I don't have much practical experience here.

#### Kevin Buzzard (Jul 22 2020 at 08:35):

@Johan Commelin Oh sorry I misunderstood! But how about

def derived_category (C : Type*) := chain_complex C


? One could imagine this line of code actually being written, right? @Scott Morrison have I understood this correctly? It just looks like a cool definition.

#### Johan Commelin (Jul 22 2020 at 08:36):

Of course derived categories will actually be a major pain when you try to formalise them.

#### Johan Commelin (Jul 22 2020 at 08:37):

Maybe that's the point where we should go infty-cats

Do we know this?

Do we know what?

#### Kevin Buzzard (Jul 22 2020 at 08:39):

I spent a long time poo-pooing the category theory library, inspired by Mario, but it's just getting better and better. The universe issues have been solved, the notation is slowly sinking in, and the fact that mathematicians like @Bhavik Mehta just seem to have learnt Scott's current API effortlessly makes me wonder how far Lean 3 can go.

#### Kevin Buzzard (Jul 22 2020 at 08:39):

The more we push Lean 3 in ways like this, the more we learn about what we want from Lean 4

#### Johan Commelin (Jul 22 2020 at 08:40):

Sure... but derived categories are different right?

#### Kevin Buzzard (Jul 22 2020 at 08:41):

I don't understand why

#### Johan Commelin (Jul 22 2020 at 08:41):

Well, maybe not the derived categories per se, but taking derived functors...

#### Johan Commelin (Jul 22 2020 at 08:41):

you need to make awful choices... and lots of stuff gets swept under the rug in informal math.

#### Kevin Buzzard (Jul 22 2020 at 08:41):

Are you saying that at least one of derived categories and derived functors will be impossible in Lean 3?

#### Kevin Buzzard (Jul 22 2020 at 08:41):

Or just that they will be very difficult?

#### Johan Commelin (Jul 22 2020 at 08:42):

I fear that even though possible, getting a useful API will be really tricky.

#### Johan Commelin (Jul 22 2020 at 08:42):

We've seen how tricky sheaves are.

#### Johan Commelin (Jul 22 2020 at 08:42):

Defining a sheaf is easy peasy lemon squeezy

#### Johan Commelin (Jul 22 2020 at 08:43):

But making them usable...

#### Kevin Buzzard (Jul 22 2020 at 08:43):

Kenny got good at them!

#### Kevin Buzzard (Jul 22 2020 at 08:43):

I guess "an expert managed to do it" is not a great response here

#### Kyle Miller (Jul 22 2020 at 08:44):

(Don't mind me as you're discussing formalizing derived categories :slight_smile:)

Julian Külshammer said:

There are many examples in category theory where you have the same vertex "set" but different kinds of morphisms between them.

One more thought about this: For categories, there are a handful of different possibilities in use. But for graphs, sort of the whole point is varying the edge sets. This suggests, at least to me, that directed graphs need to be handled somewhat differently from the way categories are handled in Lean. This informed, to some degree, the above proposed design of digraphs.

#### Johan Commelin (Jul 22 2020 at 08:45):

Ooh, sorry. We should move to another thread.

#### Kevin Buzzard (Jul 22 2020 at 08:47):

Oh I'm sorry @Kyle Miller :-) I'll move to another thread.

#### Kyle Miller (Jul 22 2020 at 08:50):

No problem! It's sort of related, and I didn't mean to interrupt you.

#### Julian Külshammer (Jul 22 2020 at 10:00):

@Kyle Miller @Johan Commelin @Kevin Buzzard Thanks for all the interesting comments. It seems there is much to learn and explore in that direction.

#### Alena Gusakov (Aug 05 2020 at 00:53):

So I'm a bit late to the party here, but I was wondering if I could join in on the graph theory project? I've been trying to do small projects here and there to learn Lean but graph theory is what I ultimately wanted to do

#### Jalex Stark (Aug 05 2020 at 00:58):

hmm are you asking a question like "if I want to contribute to the development of graph theory in mathlib, where should I start"?

#### Alena Gusakov (Aug 05 2020 at 00:58):

yes! thank you that's a better way of putting it haha

we need more API

#### Jalex Stark (Aug 05 2020 at 00:59):

I don't know how to prove anything about the cardinality of the edge set of a graph

#### Alena Gusakov (Aug 05 2020 at 00:59):

i was looking at stuff like subgraphs and connectivity - i didn't see it in the PR, are you guys working on that?

noted!

#### Jalex Stark (Aug 05 2020 at 01:00):

@Michael Hahn and I are working with eulerian circuits in a separate repo

#### Jalex Stark (Aug 05 2020 at 01:01):

i think at least our induction principle for graphs is ready to be PRed

#### Alena Gusakov (Aug 05 2020 at 01:02):

gotcha. i wanted to try to define some basic stuff for trees, would you be able to link the repo?

#### Alena Gusakov (Aug 05 2020 at 01:03):

too fast. thanks!

#### Alena Gusakov (Aug 05 2020 at 01:04):

should i just fork it or something?

hmm

#### Jalex Stark (Aug 05 2020 at 01:05):

What I would do is work in a branch of mathlib, and if there are things from that project I wanted, I would copy-paste them

#### Alena Gusakov (Aug 05 2020 at 01:06):

okay cool, thank you!

#### Jalex Stark (Aug 05 2020 at 01:07):

working in a branch of mathlib will make it easier to collaborate with people here, and also easier to spin off parts of your branch into a PR

#### Jalex Stark (Aug 05 2020 at 01:08):

michael and I are working in a separate repo because this is a summer project for him and we want there to be a place where it lives when it's done, even if it doesn't make it to mathlib

makes sense

#### Alena Gusakov (Aug 05 2020 at 01:10):

speaking of summer projects, i did originally intend to do graph theory for that, i just initially didn't see much stuff done and felt intimidated, and then i didn't keep up with stuff here. ya live and you learn ig lol

#### Jalex Stark (Aug 05 2020 at 01:14):

it's still summer, I think

i mean yeah fair

#### Alena Gusakov (Aug 05 2020 at 01:14):

wanna do a project with me lol

#### Alena Gusakov (Aug 05 2020 at 01:14):

speedrun :racecar:

#### Jalex Stark (Aug 05 2020 at 01:38):

i stepped away at an inopportune time, sorry. I'm happy to follow / code review your graph work if it happens in either a mathlib branch or a repo that you tell me about.

#### Alena Gusakov (Aug 05 2020 at 02:22):

you're fine! https://github.com/agusakov/graph_theory_2020
it doesn't have anything yet, still gotta make my first commits

appreciate it!

#### Jalex Stark (Aug 05 2020 at 02:45):

do you know which commands to type to start up the repo with a lean project? (now i see some of your other GitHub projects; i'm guessing the answer is yes)

yep!

#### Alena Gusakov (Aug 05 2020 at 03:09):

made my first commit, i only have some tiny definitions so far

#### Jalex Stark (Aug 05 2020 at 03:12):

paths can have repeated edges

#### Jalex Stark (Aug 05 2020 at 03:12):

p.is_tour says that there are no repeated edges

#### Alena Gusakov (Aug 05 2020 at 03:17):

oh gosh thanks haha

#### Alena Gusakov (Aug 05 2020 at 03:26):

i'm having trouble figuring out how to use it - i keep getting errors and i can't really find it anywhere in the files

#### Jalex Stark (Aug 05 2020 at 04:01):

sorry, it's actually called p.is_trail, defined at the end of this file

#### Alena Gusakov (Aug 05 2020 at 04:03):

gotcha! i actually went ahead and defined an is_tour cause i saw that is_trail allows for repeated vertices, just not repeated edges so it all worked out lol

#### Jalex Stark (Aug 05 2020 at 04:31):

some (most?) of our definitions are adapted from the mathlib:hedetniemi branch

#### Bhavik Mehta (Aug 06 2020 at 02:25):

Taking a look at this since I did some graph theory in lean a while back - are you working on the mathlib_simple_graph branch or master?

#### Jalex Stark (Aug 06 2020 at 03:50):

mathlib_simple_graph

#### Bhavik Mehta (Aug 06 2020 at 03:50):

alright, are you accepting PRs?

yes

#### Bhavik Mehta (Aug 06 2020 at 03:54):

and finally, what's the progress on the things listed here: https://github.com/apurvnakade/mc2020-projects/tree/mathlib_simple_graph/src/michael? I've done the fourth a while ago

#### Jalex Stark (Aug 06 2020 at 03:59):

@Michael Hahn wrote that list. I guess we currently don't have a definition of the konigsberg graph, because we haven't yet allowed multiple edges

#### Jalex Stark (Aug 06 2020 at 04:03):

i find the current type of edges pretty awkward to work with

#### Jalex Stark (Aug 06 2020 at 04:08):

we need at least a coercion H.E \to G.E when H.is_subgraph G before we can make a serious application of subgraph induction.

Last updated: May 07 2021 at 21:10 UTC