Zulip Chat Archive

Stream: new members

Topic: Design questions,


Jonathan Whitehead (May 20 2023 at 20:10):

I've been working on learning Lean the past week and I started by developing some graph objects and operations on graphs. I've learned a lot and I think that I'm finally getting used to using Lean. However, after ironing out a lot of the details and finishing all of the proofs about correctness I decided it was time to take a step back and think about the proper way to do things.

While developing the notion of a graph, I first started out with a structure that had a list of vertices and a list of edges. But then I realized that I needed more structure than that. The list of vertices needed to have no duplicates no_dups and be non empty nonempty, while the list of edges needed to have their endpoints in the list of vertices closure. So, I decided to package those properties up in the definition of a graph, but later moved those properties into the definition of a Vertex_List and an Edge_List. See below:

import Mathlib.Data.List.Basic
import Mathlib.Data.Set.Basic
import Mathlib.Tactic

@[ext]

structure Vertex :=
  (name : String)
  deriving DecidableEq, Repr

structure Edge :=
  (name : String)
  (edge : Vertex × Vertex)
  deriving DecidableEq, Repr

structure Vertex_List :=
  (l : List Vertex)
  (nonempty : l.length > 0)
  (no_dups :  (u v : Vertex), u  l  v  l  u.name = v.name  u = v)
  deriving DecidableEq, Repr

structure Edge_List (vertices : Vertex_List):=
  (edges : List Edge)
  (closure :  e  edges, Prod.fst e.edge  vertices.l  Prod.snd e.edge  vertices.l)
  deriving DecidableEq, Repr

structure Graph :=
(vertices : Vertex_List)
(edges : Edge_List vertices)
deriving DecidableEq, Repr

One of the problems (maybe it's not a problem?) is that every time that I write a program to operate on a Graph (like add_edge, add_vertex, etc.), I have to prove that the result is a graph inside the program. Hence, every program I create has to include the proofs nonempty, no_dups, and closure. It quickly became laborious.

So after researching other people's implementation of Graphs, I began noticing that other definitions of graphs are missing proofs. E.g.

Graph Theory Library I found

structure Edge (β : Type) where
  target : Nat
  weight : β

structure Vertex (α : Type) (β : Type) where
  payload : α
  adjacencyList : Array (Edge β) := #[]

instance [Inhabited α] : Inhabited (Vertex α β) :=  { payload := default } 

structure Graph (α : Type) (β : Type) where
  vertices : Array (Graph.Vertex α β) := #[]

This person used an adjacency list but I can't find any proofs about the list being nonempty or no_dups. Using an adjacency list seems to eliminate the need for closure.

When I look at the notion of a simple graph in Mathlib, it's nice how some of the proofs were packaged into symmetric and irreflexive, but I still don't see a proof that there are no parallel edges.

Mathlib Simple Graph

structure simple_graph (V : Type u) :=
(adj : V  V  Prop)
(symm : symmetric adj . obviously)
(loopless : irreflexive adj . obviously)

It feels like I'm doing something fundamentally wrong and it's difficult for me to see at this time how these definitions on graphs capture all of the properties that they need to have. Can someone help me see a better way to write? I'm mainly worried that when I get to defining something like the augmenting path algorithm that I'm going to have 300+ lines of code that are just proofs. How can we mitigate the excessive amounts of coding proofs?

Damiano Testa (May 20 2023 at 20:25):

Here is a bit of empty philosophical remarks, that hopefully help answer some of your questions.

When formalizing mathematics, it tends to be simplest to avoid putting in conditions (proofs), for as long as you can. You started with graphs, though I think that most people come across this with division by zero. What happens there is that you simply define division between any two real numbers, with the convention that when the denominator is 0, the result is 0.

Of course, not all theorems about divisions of real numbers are true for this "extended division", but it turns out that a surprising amount still are. The gain is that a lot of the theorems about division do not have the hypothesis that the denominator is non-zero. In turn, this means that when you apply these results, you do not need to prove that your denominator is non-zero! Of course, when you want to use that a number times its inverse equals 1, this specific theorem will have the non-zero assumption. But most of the time, you can work with division without worrying about this.

Damiano Testa (May 20 2023 at 20:29):

In the specific case of simple_graph, the parallel edges are excluded by design choice: the "edges" are the relation adj: if the relation holds between a pair, then there is an edge, if it doesn't, then there isn't. There is no further possibility!

This is another way on which design choices affect how easy or hard it is to proceed: finding a good implementation of a mathematical concept is hard. Making a good choice means that you will be able to formalize results more easily.

Kevin Buzzard (May 20 2023 at 20:30):

@Jonathan Whitehead writing definitions is really hard and is not really a good first project for a beginner. What exactly is wrong with just using mathlib's definition of a simple graph? What do you mean by parallel edges? adj is prop-valued, so adj v w is either true (there's an edge) or false (there's no edge). Does that answer your question?

Kevin Buzzard (May 20 2023 at 20:30):

When I am teaching undergraduates to use the software I first get them to write a gazillion proofs, with no definitions at all.

Damiano Testa (May 20 2023 at 20:30):

The more you formalize, the more you will appreciate the difference between the platonic idea of what the mathematical concept is and what is the best way of implementing it.

Jonathan Whitehead (May 20 2023 at 20:47):

Kevin Buzzard said:

Jonathan Whitehead writing definitions is really hard and is not really a good first project for a beginner. What exactly is wrong with just using mathlib's definition of a simple graph? What do you mean by parallel edges? adj is prop-valued, so adj v w is either true (there's an edge) or false (there's no edge). Does that answer your question?

Before I started trying to create my own definitions, I couldn't even read the Mathlib definitions. For me, it's not a good learning experience to start with the fully developed ideas in the Mathlib library. In fact, that's why a lot of tutorials for theorem provers start out by building their own version of the natural numbers and arithmetic operations. After working through some tutorials on theorem provers, they all teach you how to handle objects that are defined inductively, which seems to be easier. But when you come to a more advanced library, most of the objects aren't defined inductively, and learning how to work with these other kinds of objects just isn't explained very well in the tutorials I've used. There's a lot of reasons why I wouldn't make it very far in my first project if I started with the Mathlib library.

Jonathan Whitehead (May 20 2023 at 20:56):

In addition, I also don't understand why the library starts with a simple graph. A lot of the exercises that I want to prove/formalize aren't even about simple graphs. So I need a definition of a Graph.

Eric Wieser (May 20 2023 at 21:06):

A quick remark: your no_dups condition is equivalent to true (i.e. it doesn't add any constraints)

Jonathan Whitehead (May 20 2023 at 21:06):

Also, what I meant by parallel edges was that the adjacency list tells me that if v ~ u, then there exists an edge between v and u. But it doesn't tell me that there exists precisely one edge between v and u.

Jonathan Whitehead (May 20 2023 at 21:09):

Eric Wieser said:

A quick remark: your no_dups condition is equivalent to true (i.e. it doesn't add any constraints)

This is a good example of something that I would encounter in the Mathlib definitions. I would not know what this meant or how to show/prove it. I still can't say that I know what you mean. Are you saying that the field no_dups : <proof statement> is equivalent to no_dups : true?

Eric Wieser (May 20 2023 at 21:12):

Yes. But perhaps to be more clear, I'm saying that your definition considers [v, v ] to have no duplicates

Damiano Testa (May 20 2023 at 21:13):

Jonathan, you probably want to quantify over elements of l not of Vertex.

Eric Wieser (May 20 2023 at 21:13):

That's not the issue Damiano

Eric Wieser (May 20 2023 at 21:14):

The problem is that u.name = v.name already implies u = v for every possible u and b.

Damiano Testa (May 20 2023 at 21:14):

Sorry, I meant the indices of elements of l!

Jonathan Whitehead (May 20 2023 at 21:15):

Damiano Testa said:

In the specific case of simple_graph, the parallel edges are excluded by design choice: the "edges" are the relation adj: if the relation holds between a pair, then there is an edge, if it doesn't, then there isn't. There is no further possibility!

This is another way on which design choices affect how easy or hard it is to proceed: finding a good implementation of a mathematical concept is hard. Making a good choice means that you will be able to formalize results more easily.

I think I see what you're saying here. With division by zero, I imagine there are a lot of statements with proofs that don't depend on the denominator being non-zero. But in graph theory, a lot of statements about simple graphs need the requirement that only one edge exists between the vertices. So, what happens when you want to prove these statements? Do you have to create a structure Simple_Graph_Extended with the additional field that there are no parallel edges?

Damiano Testa (May 20 2023 at 21:16):

Let me try to say this better: different índices of l correspond to different elements of Vertex.

Jonathan Whitehead (May 20 2023 at 21:19):

Damiano Testa said:

Let me try to say this better: different índices of l correspond to different elements of Vertex.

Are you saying that l should be a List String instead of List Vertex?

Jonathan Whitehead (May 20 2023 at 21:19):

Then I could remove Vertex

Damiano Testa (May 20 2023 at 21:20):

simple_graph already models a graph with no parallel edges. You seem to be interpreting adj as saying "is there at least one edge between these two vertices", whereas it is "is there one or no edge between them".

Damiano Testa (May 20 2023 at 21:22):

The issue that Eric pointed out is that two elements of Vertex with the same name are equal, whether you ask for it or not. This is the extensionality question that you asked earlier.

Jonathan Whitehead (May 20 2023 at 21:22):

Eric Wieser said:

A quick remark: your no_dups condition is equivalent to true (i.e. it doesn't add any constraints)

Okay, I think I see what you're saying now. So no_dups : Vertex.mk.injEq instead of the statement I have. At the time I didn't know Vertex.mk.injEq existed

Damiano Testa (May 20 2023 at 21:23):

What you probably meant is that your list of vertices should contain no duplicates.

Jonathan Whitehead (May 20 2023 at 21:23):

Damiano Testa said:

What you probably meant is that your list of vertices should contain no duplicates.

Yeah

Damiano Testa (May 20 2023 at 21:25):

What you told lean is that if you take two vertices that are in your list and have the same name, then they are the same. Eric was saying: two vertices with the same name are the same, regardless of whether they are in your list or not.

Damiano Testa (May 20 2023 at 21:26):

So, your no_dups condition is vacuous and not what you meant it to be.

Eric Wieser (May 20 2023 at 21:39):

And what Damiano is saying is that the correct condition is "if two nodes are equal then they are at the same index in l"

Eric Wieser (May 20 2023 at 21:39):

But mathlib already has this; it's docs#list.nodup

Kyle Miller (May 20 2023 at 21:56):

Jonathan Whitehead said:

In addition, I also don't understand why the library starts with a simple graph. A lot of the exercises that I want to prove/formalize aren't even about simple graphs. So I need a definition of a Graph.

I can answer that. Simple graphs were fairly clear how to define (though even so there were some initial difficulties coming up with it), and there are a number of ways one can define a multigraph, so we decided to start with simple graphs and see how the library developed before trying to have some theoretical depth in too many combinatorial objects at the same time.

Another reason is that math textbooks about graph theory tend to be about simple graphs, so may as well start there.

Something that might not be initially obvious is that the adj relation is equivalent to set (V × V). The way this goes is set (V × V) = (V × V -> Prop) ≈ (V -> V -> Prop), where the equivalence is from currying, and the equality is the definition of set. This explains why there are no multiple edges: it's a set of pairs, two per edge (individual elements of this set are called "darts", and they're given by simple_graph.dart).

Kyle Miller (May 20 2023 at 21:58):

The symmetric axiom let you pretend that this is a set of unordered pairs. The mathlib name for unordered pairs is sym2 (for "symmetric square"), and there's some material for going back and forth between this sort of set and a set (sym2 V).

Kyle Miller (May 20 2023 at 22:02):

Kevin Buzzard said:

What exactly is wrong with just using mathlib's definition of a simple graph?

The mathlib definition of a simple graph isn't really set up for computation per se, though if you are good with decidable instances and writing fintypes then you can use it for that. There's not really much you can #eval when all you know is that you have Prop-valued things...

Though mathlib's simple graphs can be used as an interface for proving things about concrete objects that can be converted to simple_graph. There are a number of ways to represent a graph structure in programs (bool matrices, adjacency lists, among others), and simple_graph doesn't try to be any one of these.

Jonathan Whitehead (May 20 2023 at 23:05):

Kyle Miller said:

Jonathan Whitehead said:

In addition, I also don't understand why the library starts with a simple graph. A lot of the exercises that I want to prove/formalize aren't even about simple graphs. So I need a definition of a Graph.

I can answer that. Simple graphs were fairly clear how to define (though even so there were some initial difficulties coming up with it), and there are a number of ways one can define a multigraph, so we decided to start with simple graphs and see how the library developed before trying to have some theoretical depth in too many combinatorial objects at the same time.

Another reason is that math textbooks about graph theory tend to be about simple graphs, so may as well start there.

Something that might not be initially obvious is that the adj relation is equivalent to set (V × V). The way this goes is set (V × V) = (V × V -> Prop) ≈ (V -> V -> Prop), where the equivalence is from currying, and the equality is the definition of set. This explains why there are no multiple edges: it's a set of pairs, two per edge (individual elements of this set are called "darts", and they're given by simple_graph.dart).

This is a lot of good information and I'm still unpacking it, but I went ahead and loaded the Mathlib4 implementation of a SimpleGraph into my editor and I wanted to show/ask a few questions. The answers to which may be in your reply but I'm still trying to digest it.

import Mathlib.Combinatorics.SimpleGraph.Basic

#check SimpleGraph.Adj

def example_graph : SimpleGraph V :=
{ Adj := fun v w => (v, w)  [ (a, b), (b, c), (a, c) ] }

First of all, when I type check SimpleGraph.Adj it returns

 {V : Type u}  SimpleGraph V  V  V  Prop

which is why I couldn't start with the Mathlib implementation. When I see this, I think of SimpleGraph being a function, where Lean will infer V, but it tells me to "plug in" a SimpleGraph V, along with two other V's and it will give me a Prop. For someone who's just starting out, this is hard to interpret and plus, I don't think that I want a Prop? I thought that the structure should return an actual graph. I feel like SimpleGraph should be a graph where I can do things like G + v or G - e, or maybe at some point G/e.

The example graph that I defined doesn't work because the aesop_graph tactic fails to prove that the adjacency relation is symmetric and irreflexive. I think this is telling me that there's a particular way to define adj, which I think is theoretically answered in your reply when you talk about the relation being a set. But as a beginner, I can't figure out what a simple concrete implementation of a SimpleGraph looks like. Assuming vertices = [a, b, c] and edges = [(a,b), (b,c), (a,c)].

I feel like the simple answer is that the implementation of SimpleGraph isn't made to do what I'm wanting. Which brings me back to my original post where I'm asking for ways to implement my ideas without them becoming overburdened with proof objects. I think that Damiano Testa really helped in his first reply when he said that some properties can be left out (depending on what I want to do), and avoiding conditions for as long as possible.

Kyle Miller (May 20 2023 at 23:09):

SimpleGraph.Adj is the accessor function (the "projection") for the adjacency relation. You can write SimpleGraph.Adj example_graph or example_graph.Adj.

Kyle Miller (May 20 2023 at 23:10):

Prop "is" Bool, except for the fact that you can't decide whether the value is True or False in general (though Decidable instances can be registered to give the answer). But the fact is it's either True or False.

Kyle Miller (May 20 2023 at 23:13):

It's easier to prove general facts about SimpleGraphs rather than things about particular SimpleGraphs at the moment.

Kyle Miller (May 20 2023 at 23:14):

I think I shared it in another thread, but here's a concrete graph with some proofs about it: https://github.com/leanprover-community/mathlib/blob/master/archive/100-theorems-list/54_konigsberg.lean (That's Lean 3, but it shouldn't be hard to translate it.)

Jonathan Whitehead (May 20 2023 at 23:16):

Kyle Miller said:

I think I shared it in another thread, but here's a concrete graph with some proofs about it: https://github.com/leanprover-community/mathlib/blob/master/archive/100-theorems-list/54_konigsberg.lean (That's Lean 3, but it shouldn't be hard to translate it.)

I think I saw you post the Konigsberg problem in another of my posts but at the time I couldn't read it. I've learned enough now to be able to see what's going on. This looks like a good example. I'm going to go through it. Thanks!

Kyle Miller (May 20 2023 at 23:16):

I'm not too familiar yet with what we can expect aesop_graph to be able to prove. In mathlib 3, this was the tidy tactic, and while many times it could prove the two axioms, many times you still had to fill in a proof (see that link). If aesop_graph can't prove the axioms, I would not draw the conclusion that you defined Adj wrong.

Kyle Miller (May 20 2023 at 23:18):

In the Konigsberg file, half of it (edit: it seemed bigger in my memory, but it's just 6 lines) is precomputing some data (namely the degrees of all the vertices) so that the proof is a trivial application of another theorem.

Jonathan Whitehead (May 20 2023 at 23:20):

So, when someone formalizes a definition of a general graph, will the SimpleGraph definition become obsolete and the new SimpleGraph definition will just be an extension of the general graph definition? Is that typically how people do things? I'm just thinking of how to define a general graph, then a simple graph and proving that the extension of the graph to a simple graph matches the same implementation as SimpleGraph.

Jonathan Whitehead (May 20 2023 at 23:20):

By extension, I mean literally using the extension keyword

Kyle Miller (May 20 2023 at 23:33):

I don't think so, or at least I haven't been anticipating that. There will at least be functions to convert a SimpleGraph to a multigraph along with lemmas that we'll prove to say how structure is preserved.

Kyle Miller (May 21 2023 at 00:10):

Here's it ported to Lean 4, with some small improvements:

/-
Copyright (c) 2022 Kyle Miller. All rights reserved.
Released under Apache 2.0 license as described in the file LICENSE.
Authors: Kyle Miller
-/
import Mathlib.Combinatorics.SimpleGraph.Trails
import Mathlib.Tactic.DeriveFintype

/-!
# The Königsberg bridges problem

We show that a graph that represents the islands and mainlands of Königsberg and seven bridges
between them has no Eulerian trail.
-/

namespace konigsberg

/-- The vertices for the Königsberg graph; four vertices for the bodies of land and seven
vertices for the bridges. -/
inductive Verts : Type
  | V1 | V2 | V3 | V4 -- The islands and mainlands
  | B1 | B2 | B3 | B4 | B5 | B6 | B7 -- The bridges
deriving DecidableEq, Fintype

open Verts

/-- Each of the connections between the islands/mainlands and the bridges.
These are ordered pairs, but the data becomes symmetric in `konigsberg.Adj`. -/
def edges : List (Verts × Verts) :=
[ (V1, B1), (V1, B2), (V1, B3), (V1, B4), (V1, B5),
  (B1, V2), (B2, V2), (B3, V4), (B4, V3), (B5, V3),
  (V2, B6), (B6, V4),
  (V3, B7), (B7, V4) ]

/-- The adjacency relation for the Königsberg graph. -/
def adj (v w : Verts) : Bool := (v, w)  edges || (w, v)  edges

/-- The Königsberg graph structure. While the Königsberg bridge problem
is usually described using a multigraph, the we use a "mediant" construction
to transform it into a simple graph -- every edge in the multigraph is subdivided
into a path of two edges. This construction preserves whether a graph is Eulerian.

(TODO: once mathlib has multigraphs, either prove the mediant construction preserves the
Eulerian property or switch this file to use multigraphs. -/
@[simps]
def graph : SimpleGraph Verts where
  Adj v w := adj v w
  symm := by
    dsimp [Symmetric, adj]
    decide
  loopless := by
    dsimp [Irreflexive, adj]
    decide

instance : DecidableRel graph.Adj := fun a b =>
  inferInstanceAs <| Decidable (adj a b)

/-- To speed up the proof, this is a cache of all the degrees of each vertex,
proved in `konigsberg.degree_eq_degree`. -/
def degree : Verts  
  | V1 => 5 | V2 => 3 | V3 => 3 | V4 => 3
  | B1 => 2 | B2 => 2 | B3 => 2 | B4 => 2 | B5 => 2 | B6 => 2 | B7 => 2

@[simp] lemma degree_eq_degree (v : Verts) : graph.degree v = degree v := by cases v <;> rfl

lemma not_even_degree_iff (w : Verts) : ¬Even (degree w)  w = V1  w = V2  w = V3  w = V4 := by
  cases w <;> simp

lemma setOf_odd_degree_eq :
    {v | Odd (graph.degree v)} = {Verts.V1, Verts.V2, Verts.V3, Verts.V4} := by
  ext w
  simp [not_even_degree_iff]

/-- The Königsberg graph is not Eulerian. -/
theorem not_is_eulerian {u v : Verts} (p : graph.Walk u v) (h : p.IsEulerian) : False := by
  have h := h.card_odd_degree
  have h' := setOf_odd_degree_eq
  apply_fun Fintype.card at h'
  rw [h'] at h
  norm_num at h

end konigsberg

Last updated: Dec 20 2023 at 11:08 UTC