7. Topology

Calculus is based on the concept of a function, which is used to model quantities that depend on one another. For example, it is common to study quantities that change over time. The notion of a limit is also fundamental. We may say that the limit of a function \(f(x)\) is a value \(b\) as \(x\) approaches a value \(a\), or that \(f(x)\) converges to \(b\) as \(x\) approaches \(a\). Equivalently, we may say that \(f(x)\) approaches \(b\) as \(x\) approaches \(a\), or that it tends to \(b\) as \(x\) tends to \(a\). We have already begun to consider such notions in Section 3.6.

Topology is the abstract study of limits and continuity. Having covered the essentials of formalization in Chapters 2 to 6, in this chapter, we will explain how topological notions are formalized in mathlib. Not only do topological abstractions apply in much greater generality, but they also, somewhat paradoxically, make it easier to reason about limits and continuity in concrete instances.

Topological notions build on quite a few layers of mathematical structure. The first layer is naive set theory, as described in Chapter 4. The next layer is the theory of filters, which we will describe in Section 7.1. On top of that, we layer the theories of topological spaces, metric spaces, and a slightly more exotic intermediate notion called a uniform space.

Whereas previous chapters relied on mathematical notions that were likely familiar to you, the notion of a filter is less well known, even to many working mathematicians. The notion is essential, however, for formalizing mathematics effectively. Let us explain why. Let f : ℝ → ℝ be any function. We can consider the limit of f x as x approaches some value x₀, but we can also consider the limit of f x as x approaches infinity or negative infinity. We can moreover consider the limit of f x as x approaches x₀ from the right, conventionally written x₀⁺, or from the left, written x₀⁻. There are variations where x approaches x₀ or x₀⁺ or x₀⁻ but is not allowed to take on the value x₀ itself. This results in at least eight ways that x can approach something. We can also restrict to rational values of x or place other constraints on the domain, but let's stick to those eight cases.

We have a similar variety of options on the codomain: we can specify that f x approaches a value from the left or right, or that it approaches positive or negative infinity, and so on. For example, we may wish to say that f x tends to +∞ when x tends to x₀ from the right without being equal to x₀. This results in 64 different kinds of limit statements, and we haven’t even begun to deal with limits of sequences, as we did in Section 3.6.

The problem is compounded even further when it comes to the supporting lemmas. For instance, limits compose: if f x tends to y₀ when x tends to x₀ and g y tends to z₀ when y tends to y₀, then g (f x) tends to z₀ when x tends to x₀. There are three notions of “tends to” at play here, each of which can be instantiated in any of the eight ways described in the previous paragraph. This results in 512 lemmas, a lot to have to add to a library! Informally, mathematicians generally prove two or three of these and simply note that the rest can be proved “in the same way.” Formalizing mathematics requires making the relevant notion of “sameness” fully explicit, and that is exactly what Bourbaki’s theory of filters manages to do.

7.1. Filters

A filter on a type X is a collection of sets of X that satisfies three conditions that we will spell out below. The notion supports two related ideas:

  • limits, including all the kinds of limits discussed above: finite and infinite limits of sequences, finite and infinite limits of functions at a point or at infinity, and so on.

  • things happening eventually, including things happening for large enough n : ℕ, or sufficiently near a point x, or for sufficiently close pairs of points, or almost everywhere in the sense of measure theory. Dually, filters can also express the idea of things happening often: for arbitrarily large n, at a point in any neighborhood of a given point, etc.

The filters that correspond to these descriptions will be defined later in this section, but we can already name them:

  • (at_top : filter ℕ), made of sets of ℕ containing {n | n ≥ N} for some N

  • 𝓝 x, made of neighborhoods of x in a topological space

  • 𝓤 X, made of entourages of a uniform space (uniform spaces generalize metric spaces and topological groups)

  • μ.ae, made of sets whose complement has zero measure with respect to a measure μ.

The general definition is as follows: a filter F : filter X is a collection of sets F.sets : set (set X) satisfying the following:

  • F.univ_sets : univ ∈ F.sets

  • F.sets_of_superset : ∀ {U V}, U ∈ F.sets → U ⊆ V → V ∈ F.sets

  • F.inter_sets : ∀ {U V}, U ∈ F.sets → V ∈ F.sets → U ∩ V ∈ F.sets.

The first condition says that the set of all elements of X belongs to F.sets. The second condition says that if U belongs to F.sets then anything containing U also belongs to F.sets. The third condition says that F.sets is closed under finite intersections. In mathlib, a filter F is defined to be a structure bundling F.sets and its three properties, but the properties carry no additional data, and it is convenient to blur the distinction between F and F.sets. We therefore define U ∈ F to mean U ∈ F.sets. This explains why the word sets appears in the names of some lemmas that mention U ∈ F.

It may help to think of a filter as defining a notion of a “sufficiently large” set. The first condition then says that univ is sufficiently large, the second one says that a set containing a sufficiently large set is sufficiently large and the third one says that the intersection of two sufficiently large sets is sufficiently large.

It may be even more useful to think of a filter on a type X as a generalized element of set X. For instance, at_top is the “set of very large numbers” and 𝓝 x₀ is the “set of points very close to x₀.” One manifestation of this view is that we can associate to any s : set X the so-called principal filter consisting of all sets that contain s. This definition is already in mathlib and has a notation 𝓟 (localized in the filter namespace), but for the purpose of demonstration, we ask you to take this opportunity to work out the definition here.

def principal {α : Type*} (s : set α) : filter α :=
{ sets := {t | s ⊆ t},
  univ_sets := sorry,
  sets_of_superset := sorry,
  inter_sets := sorry}
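In case you get stuck, here is one possible way to fill in the proofs. This is a sketch, not necessarily the definition used in mathlib; it assumes the set namespace is open (as in the book's examples files), so that subset_univ and subset_inter refer to the standard mathlib lemmas, and it uses a hypothetical name principal' to avoid clashing with mathlib's principal.

def principal' {α : Type*} (s : set α) : filter α :=
{ sets := {t | s ⊆ t},
  -- s is contained in the universal set
  univ_sets := subset_univ s,
  -- if s ⊆ U and U ⊆ V then s ⊆ V
  sets_of_superset := λ U V hU hUV, hU.trans hUV,
  -- if s ⊆ U and s ⊆ V then s ⊆ U ∩ V
  inter_sets := λ U V hU hV, subset_inter hU hV }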

For our second example, we ask you to define the filter at_top : filter ℕ. (We could use any type with a preorder instead of ℕ.)

example : filter ℕ :=
{ sets := {s | ∃ a, ∀ b, a ≤ b → b ∈ s},
  univ_sets := sorry,
  sets_of_superset := sorry,
  inter_sets := sorry }
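If you want to check your work against one possibility, here is a solution sketch. The witness for the intersection is obtained by taking the max of the two witnesses; le_max_left, le_max_right and le_trans are standard mathlib lemmas.

example : filter ℕ :=
{ sets := {s | ∃ a, ∀ b, a ≤ b → b ∈ s},
  -- every natural number is in univ, so a = 0 works
  univ_sets := ⟨0, λ b hb, trivial⟩,
  -- a witness for U still works for any superset V
  sets_of_superset := λ U V ⟨a, ha⟩ hUV, ⟨a, λ b hb, hUV (ha b hb)⟩,
  -- combine witnesses for U and V by taking their max
  inter_sets := λ U V ⟨a, ha⟩ ⟨a', ha'⟩,
    ⟨max a a', λ b hb,
      ⟨ha b (le_trans (le_max_left a a') hb),
       ha' b (le_trans (le_max_right a a') hb)⟩⟩ }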

We can also directly define the filter 𝓝 x of neighborhoods of any x : ℝ. In the real numbers, a neighborhood of x is a set containing an open interval \((x_0 - \varepsilon, x_0 + \varepsilon)\), defined in mathlib as Ioo (x₀ - ε) (x₀ + ε). (This notion of a neighborhood is only a special case of a more general construction in mathlib.)
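As an illustration (this is not mathlib's definition, which is more general), such a filter could be written directly as follows, with the proofs again left as exercises; the name nhds' is hypothetical.

def nhds' (x₀ : ℝ) : filter ℝ :=
{ sets := {s | ∃ ε > 0, Ioo (x₀ - ε) (x₀ + ε) ⊆ s},
  univ_sets := sorry,
  sets_of_superset := sorry,
  inter_sets := sorry }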

With these examples, we can already define what it means for a function f : X → Y to converge to some G : filter Y along some F : filter X, as follows:

def tendsto₁ {X Y : Type*} (f : X → Y) (F : filter X) (G : filter Y) :=
∀ V ∈ G, f ⁻¹' V ∈ F

When X is ℕ and Y is ℝ, tendsto₁ u at_top (𝓝 x) is equivalent to saying that the sequence u : ℕ → ℝ converges to the real number x. When both X and Y are ℝ, tendsto₁ f (𝓝 x₀) (𝓝 y₀) is equivalent to the familiar notion \(\lim_{x \to x₀} f(x) = y₀\). All of the other kinds of limits mentioned in the introduction are also equivalent to instances of tendsto₁ for suitable choices of filters on the source and target.

The notion tendsto₁ above is definitionally equivalent to the notion tendsto that is defined in mathlib, but the latter is defined more abstractly. The problem with the definition of tendsto₁ is that it exposes a quantifier and elements of G, and it hides the intuition that we get by viewing filters as generalized sets. We can hide the quantifier V and make the intuition more salient by using more algebraic and set-theoretic machinery. The first ingredient is the pushforward operation \(f_*\) associated to any map f : X → Y, denoted filter.map f in mathlib. Given a filter F on X, filter.map f F : filter Y is defined so that V ∈ filter.map f F ↔ f ⁻¹' V ∈ F holds definitionally. In the examples file we’ve opened the filter namespace so that filter.map can be written as map. This means that we can rewrite the definition of tendsto using the order relation on filter Y, which is reversed inclusion of the set of members. In other words, given G H : filter Y, we have G ≤ H ↔ ∀ V : set Y, V ∈ H → V ∈ G.

def tendsto₂ {X Y : Type*} (f : X → Y) (F : filter X) (G : filter Y) :=
map f F ≤ G

example {X Y : Type*} (f : X → Y) (F : filter X) (G : filter Y) :
  tendsto₂ f F G ↔ tendsto₁ f F G := iff.rfl

It may seem that the order relation on filters is backward. But recall that we can view filters on X as generalized elements of set X, via the inclusion 𝓟 : set X → filter X which maps any set s to the corresponding principal filter. This inclusion is order preserving, so the order relation on filter can indeed be seen as the natural inclusion relation between generalized sets. In this analogy, pushforward is analogous to the direct image. And, indeed, map f (𝓟 s) = 𝓟 (f '' s).

We can now understand intuitively why a sequence u : ℕ → ℝ converges to a point x₀ if and only if we have map u at_top ≤ 𝓝 x₀. The inequality means that the “direct image under u” of “the set of very big natural numbers” is “included” in “the set of points very close to x₀.”

As promised, the definition of tendsto₂ does not exhibit any quantifiers or sets. It also leverages the algebraic properties of the pushforward operation. First, each filter.map f is monotone. And, second, filter.map is compatible with composition.

#check (@filter.map_mono : ∀ {α β} {m : α → β}, monotone (map m))
#check (@filter.map_map : ∀ {α β γ} {f : filter α} {m : α → β} {m' : β → γ},
                            map m' (map m f) = map (m' ∘ m) f)

Together these two properties allow us to prove that limits compose, yielding in one shot all 512 variants of the composition lemma described in the introduction, and lots more. You can practice proving the following statement using either the definition of tendsto₁ in terms of the universal quantifier or the algebraic definition, together with the two lemmas above.

example {X Y Z : Type*} {F : filter X} {G : filter Y} {H : filter Z} {f : X → Y} {g : Y → Z}
  (hf : tendsto₁ f F G) (hg : tendsto₁ g G H) : tendsto₁ (g ∘ f) F H :=
sorry
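For reference, the quantifier-based definition admits a one-line proof term, using the fact that (g ∘ f) ⁻¹' V is definitionally equal to f ⁻¹' (g ⁻¹' V). This is a sketch of one possible solution:

example {X Y Z : Type*} {F : filter X} {G : filter Y} {H : filter Z} {f : X → Y} {g : Y → Z}
  (hf : tendsto₁ f F G) (hg : tendsto₁ g G H) : tendsto₁ (g ∘ f) F H :=
-- for V ∈ H, we get g ⁻¹' V ∈ G from hg, hence f ⁻¹' (g ⁻¹' V) ∈ F from hf
λ V hV, hf (g ⁻¹' V) (hg V hV)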

The pushforward construction uses a map to push filters from the map source to the map target. There is also a pullback operation, filter.comap, going in the other direction. This generalizes the preimage operation on sets. For any map f, filter.map f and filter.comap f form what is known as a Galois connection, which is to say, they satisfy

filter.map_le_iff_le_comap : filter.map f F ≤ G ↔ F ≤ filter.comap f G

for every F and G. This operation could be used to provide another formulation of tendsto that would be provably (but not definitionally) equivalent to the one in mathlib.

The comap operation can be used to restrict filters to a subtype. For instance, suppose we have f : ℝ → ℝ, x₀ : ℝ and y₀ : ℝ, and suppose we want to state that f x approaches y₀ when x approaches x₀ within the rational numbers. We can pull the filter 𝓝 x₀ back to ℚ using the coercion map coe : ℚ → ℝ and state tendsto (f ∘ coe : ℚ → ℝ) (comap coe (𝓝 x₀)) (𝓝 y₀).

variables (f : ℝ → ℝ) (x₀ y₀ : ℝ)

#check comap (coe : ℚ → ℝ) (𝓝 x₀)
#check tendsto (f ∘ coe) (comap (coe : ℚ → ℝ) (𝓝 x₀)) (𝓝 y₀)

The pullback operation is also compatible with composition, but it is contravariant, which is to say, it reverses the order of the arguments.

section
variables {α β γ : Type*} (F : filter α) {m : γ → β} {n : β → α}

#check (comap_comap : comap m (comap n F) = comap (n ∘ m) F)
end

Let’s now shift attention to the plane ℝ × ℝ and try to understand how the neighborhoods of a point (x₀, y₀) are related to 𝓝 x₀ and 𝓝 y₀. There is a product operation filter.prod : filter X → filter Y → filter (X × Y), denoted by ×ᶠ, which answers this question:

example : 𝓝 (x₀, y₀) = 𝓝 x₀ ×ᶠ 𝓝 y₀ := nhds_prod_eq

The product operation is defined in terms of the pullback operation and the inf operation:

F ×ᶠ G = (comap prod.fst F) ⊓ (comap prod.snd G).

Here the inf operation refers to the lattice structure on filter X for any type X, whereby F ⊓ G is the greatest filter that is smaller than both F and G. Thus the inf operation generalizes the notion of the intersection of sets.
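As a sanity check on this analogy, the inf of two principal filters is the principal filter of the intersection. In mathlib this is the lemma inf_principal (in the filter namespace):

example {X : Type*} (s t : set X) : 𝓟 s ⊓ 𝓟 t = 𝓟 (s ∩ t) :=
inf_principal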

A lot of proofs in mathlib use all of the aforementioned structure (map, comap, inf, sup, and prod) to give algebraic proofs about convergence without ever referring to members of filters. You can practice doing this in a proof of the following lemma, unfolding the definition of tendsto and filter.prod if needed.

#check le_inf_iff

example (f : ℕ → ℝ × ℝ) (x₀ y₀ : ℝ) :
  tendsto f at_top (𝓝 (x₀, y₀)) ↔
    tendsto (prod.fst ∘ f) at_top (𝓝 x₀) ∧ tendsto (prod.snd ∘ f) at_top (𝓝 y₀) :=
sorry
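One possible solution unfolds tendsto and filter.prod and then works purely algebraically. This is a sketch; if the rewrite chain does not close the goal exactly, the relevant lemmas are le_inf_iff, map_le_iff_le_comap and map_map.

example (f : ℕ → ℝ × ℝ) (x₀ y₀ : ℝ) :
  tendsto f at_top (𝓝 (x₀, y₀)) ↔
    tendsto (prod.fst ∘ f) at_top (𝓝 x₀) ∧ tendsto (prod.snd ∘ f) at_top (𝓝 y₀) :=
begin
  rw nhds_prod_eq,                 -- 𝓝 (x₀, y₀) = 𝓝 x₀ ×ᶠ 𝓝 y₀
  unfold tendsto filter.prod,      -- expose map, comap and ⊓
  rw [le_inf_iff,                  -- split the inequality into a conjunction
      ← map_le_iff_le_comap, map_map,
      ← map_le_iff_le_comap, map_map]
end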

The ordered type filter X is actually a complete lattice, which is to say, there is a bottom element, there is a top element, and every set of filters on X has an Inf and a Sup.

Note that given the second property in the definition of a filter (if U belongs to F then anything larger than U also belongs to F), the first property (the set of all inhabitants of X belongs to F) is equivalent to the property that F is not the empty collection of sets. This shouldn’t be confused with the more subtle question as to whether the empty set is an element of F. The definition of a filter does not prohibit ∅ ∈ F, but if the empty set is in F then every set is in F, which is to say, ∀ U : set X, U ∈ F. In this case, F is a rather trivial filter, which is precisely the bottom element of the complete lattice filter X. This contrasts with the definition of filters in Bourbaki, which doesn’t allow filters containing the empty set.

Because we include the trivial filter in our definition, we sometimes need to explicitly assume nontriviality in some lemmas. In return, however, the theory has nicer global properties. We have already seen that including the trivial filter gives us a bottom element. It also allows us to define principal : set X → filter X, which maps ∅ to ⊥, without adding a precondition to rule out the empty set. And it allows us to define the pullback operation without a precondition as well. Indeed, it can happen that comap f F = ⊥ although F ≠ ⊥. For instance, given x₀ : ℝ and s : set ℝ, the pullback of 𝓝 x₀ under the coercion from the subtype corresponding to s is nontrivial if and only if x₀ belongs to the closure of s.

In order to manage lemmas that do need to assume some filter is nontrivial, mathlib has a type class filter.ne_bot, and the library has lemmas that assume (F : filter X) [F.ne_bot]. The instance database knows, for example, that (at_top : filter ℕ).ne_bot, and it knows that pushing forward a nontrivial filter gives a nontrivial filter. As a result, a lemma assuming [F.ne_bot] will automatically apply to map u at_top for any sequence u.

Our tour of the algebraic properties of filters and their relation to limits is essentially done, but we have not yet justified our claim to have recaptured the usual limit notions. Superficially, it may seem that tendsto u at_top (𝓝 x₀) is stronger than the notion of convergence defined in Section 3.6 because we ask that every neighborhood of x₀ has a preimage belonging to at_top, whereas the usual definition only requires this for the standard neighborhoods Ioo (x₀ - ε) (x₀ + ε). The key is that, by definition, every neighborhood contains such a standard one. This observation leads to the notion of a filter basis.

Given F : filter X, a family of sets s : ι → set X is a basis for F if for every set U, we have U ∈ F if and only if it contains some s i. In other words, formally speaking, s is a basis if it satisfies ∀ U : set X, U ∈ F ↔ ∃ i, s i ⊆ U. It is even more flexible to consider a predicate on ι that selects only some of the values i in the indexing type. In the case of 𝓝 x₀, we want ι to be ℝ, we write ε for i, and the predicate should select the positive values of ε. So the fact that the sets Ioo (x₀ - ε) (x₀ + ε) form a basis for the neighborhood topology on ℝ is stated as follows:

example (x₀ : ℝ) : has_basis (𝓝 x₀) (λ ε : ℝ, 0 < ε) (λ ε, Ioo (x₀ - ε) (x₀ + ε)) :=
nhds_basis_Ioo_pos x₀

There is also a nice basis for the filter at_top. The lemma filter.has_basis.tendsto_iff allows us to reformulate a statement of the form tendsto f F G given bases for F and G. Putting these pieces together gives us essentially the notion of convergence that we used in Section 3.6.

example (u : ℕ → ℝ) (x₀ : ℝ) :
  tendsto u at_top (𝓝 x₀) ↔ ∀ ε > 0, ∃ N, ∀ n ≥ N, u n ∈ Ioo (x₀ - ε) (x₀ + ε) :=
begin
  have : at_top.has_basis (λ n : ℕ, true) Ici := at_top_basis,
  rw this.tendsto_iff (nhds_basis_Ioo_pos x₀),
  simp
end

We now show how filters facilitate working with properties that hold for sufficiently large numbers or for points that are sufficiently close to a given point. In Section 3.6, we were often faced with the situation where we knew that some property P n holds for sufficiently large n and that some other property Q n holds for sufficiently large n. Using cases twice gave us N_P and N_Q satisfying ∀ n ≥ N_P, P n and ∀ n ≥ N_Q, Q n. Using set N := max N_P N_Q, we could eventually prove ∀ n ≥ N, P n ∧ Q n. Doing this repeatedly becomes tiresome.

We can do better by noting that the statement “P n and Q n hold for large enough n” means that we have {n | P n} ∈ at_top and {n | Q n} ∈ at_top. The fact that at_top is a filter implies that the intersection of two elements of at_top is again in at_top, so we have {n | P n ∧ Q n} ∈ at_top. Writing {n | P n} ∈ at_top is unpleasant, but we can use the more suggestive notation ∀ᶠ n in at_top, P n. Here the superscripted f stands for “filter.” You can think of the notation as saying that for all n in the “set of very large numbers,” P n holds. The ∀ᶠ notation stands for filter.eventually, and the lemma filter.eventually.and uses the intersection property of filters to do what we just described:

example (P Q : ℕ → Prop) (hP : ∀ᶠ n in at_top, P n) (hQ : ∀ᶠ n in at_top, Q n) :
  ∀ᶠ n in at_top, P n ∧ Q n := hP.and hQ

This notation is so convenient and intuitive that we also have specializations when P is an equality or inequality statement. For example, let u and v be two sequences of real numbers, and let us show that if u n and v n coincide for sufficiently large n then u tends to x₀ if and only if v tends to x₀. First we’ll use the generic eventually and then the one specialized for the equality predicate, eventually_eq. The two statements are definitionally equivalent, so the same proof works in both cases.

example (u v : ℕ → ℝ) (h : ∀ᶠ n in at_top, u n = v n) (x₀ : ℝ) :
  tendsto u at_top (𝓝 x₀) ↔ tendsto v at_top (𝓝 x₀) :=
tendsto_congr' h

example (u v : ℕ → ℝ) (h : u =ᶠ[at_top] v) (x₀ : ℝ) :
  tendsto u at_top (𝓝 x₀) ↔ tendsto v at_top (𝓝 x₀) :=
tendsto_congr' h

It is instructive to review the definition of filters in terms of eventually. Given F : filter X, for any predicates P and Q on X,

  • the condition univ ∈ F ensures (∀ x, P x) → ∀ᶠ x in F, P x,

  • the condition U ∈ F → U ⊆ V → V ∈ F ensures (∀ᶠ x in F, P x) → (∀ x, P x → Q x) → ∀ᶠ x in F, Q x, and

  • the condition U ∈ F → V ∈ F → U ∩ V ∈ F ensures (∀ᶠ x in F, P x) → (∀ᶠ x in F, Q x) → ∀ᶠ x in F, P x ∧ Q x.

#check @eventually_of_forall
#check @eventually.mono
#check @eventually.and

The second item, corresponding to eventually.mono, supports nice ways of using filters, especially when combined with eventually.and. The filter_upwards tactic allows us to combine them. Compare:

example (P Q R : ℕ → Prop) (hP : ∀ᶠ n in at_top, P n) (hQ : ∀ᶠ n in at_top, Q n)
  (hR : ∀ᶠ n in at_top, P n ∧ Q n → R n) :
  ∀ᶠ n in at_top, R n :=
begin
  apply (hP.and (hQ.and hR)).mono,
  rintros n ⟨h, h', h''⟩,
  exact h'' ⟨h, h'⟩
end

example (P Q R : ℕ → Prop) (hP : ∀ᶠ n in at_top, P n) (hQ : ∀ᶠ n in at_top, Q n)
  (hR : ∀ᶠ n in at_top, P n ∧ Q n → R n) :
  ∀ᶠ n in at_top, R n :=
begin
  filter_upwards [hP, hQ, hR],
  intros n h h' h'',
  exact h'' ⟨h, h'⟩
end

Readers who know about measure theory will note that the filter μ.ae of sets whose complement has measure zero (aka “the set consisting of almost every point”) is not very useful as the source or target of tendsto, but it can be conveniently used with eventually to say that a property holds for almost every point.

There is a dual version of ∀ᶠ x in F, P x, which is occasionally useful: ∃ᶠ x in F, P x means {x | ¬P x} ∉ F. For example, ∃ᶠ n in at_top, P n means there are arbitrarily large n such that P n holds. The ∃ᶠ notation stands for filter.frequently.

For a more sophisticated example, consider the following statement about a sequence u, a set M, and a value x:

If u converges to x and u n belongs to M for sufficiently large n then x is in the closure of M.

This can be formalized as follows:

tendsto u at_top (𝓝 x) → (∀ᶠ n in at_top, u n ∈ M) → x ∈ closure M.

This is a special case of the theorem mem_closure_of_tendsto from the topology library. See if you can prove it using the quoted lemmas, using the fact that cluster_pt x F means (𝓝 x ⊓ F).ne_bot.

#check mem_closure_iff_cluster_pt
#check le_principal_iff
#check ne_bot_of_le

example (u : ℕ → ℝ) (M : set ℝ) (x : ℝ)
  (hux : tendsto u at_top (𝓝 x)) (huM : ∀ᶠ n in at_top, u n ∈ M) : x ∈ closure M :=
sorry
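One possible proof term combines the three quoted lemmas (a sketch):

example (u : ℕ → ℝ) (M : set ℝ) (x : ℝ)
  (hux : tendsto u at_top (𝓝 x)) (huM : ∀ᶠ n in at_top, u n ∈ M) : x ∈ closure M :=
-- map u at_top is ≤ both 𝓝 x and 𝓟 M, and it is not ⊥,
-- so 𝓝 x ⊓ 𝓟 M is not ⊥, i.e. x is a cluster point of M
mem_closure_iff_cluster_pt.mpr
  (ne_bot_of_le (le_inf hux (le_principal_iff.mpr huM)))

Note that the ne_bot hypothesis for ne_bot_of_le is found automatically by the instance database, since pushing forward the nontrivial filter at_top gives a nontrivial filter.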

7.2. Metric spaces

Examples in the previous section focused on sequences of real numbers. In this section we will go up a bit in generality and focus on metric spaces. A metric space is a type X equipped with a distance function dist : X → X → ℝ which is a generalization of the function λ x y, |x - y| from the case where X = ℝ.

Introducing such a space is easy, and we can check all the properties required of the distance function.

variables {X : Type*} [metric_space X] (a b c : X)

#check (dist a b : ℝ)

#check (dist_nonneg : 0 ≤ dist a b)

#check (dist_eq_zero : dist a b = 0 ↔ a = b)

#check (dist_comm a b : dist a b = dist b a)

#check (dist_triangle a b c : dist a c ≤ dist a b + dist b c)

Note that we also have variants where the distance can be infinite, or where dist a b can be zero without having a = b, or both. They are called emetric_space, pseudo_metric_space and pseudo_emetric_space respectively (here “e” stands for “extended”).

Note that our journey from ℝ to metric spaces jumped over the special case of normed spaces, which also require linear algebra and will be explained as part of the calculus chapter.

7.2.1. Convergence and continuity

Using distance functions, we can already define convergent sequences and continuous functions between metric spaces. They are actually defined in a more general setting covered in the next section, but we have lemmas recasting the definition in terms of distances.

example {u : ℕ → X} {a : X} :
  tendsto u at_top (𝓝 a) ↔ ∀ ε > 0, ∃ N, ∀ n ≥ N, dist (u n) a < ε :=
metric.tendsto_at_top

example {X Y : Type*} [metric_space X] [metric_space Y] {f : X → Y} :
  continuous f ↔
  ∀ x : X, ∀ ε > 0, ∃ δ > 0, ∀ x', dist x' x < δ → dist (f x') (f x) < ε :=
metric.continuous_iff

A lot of lemmas have continuity assumptions, so we end up proving a lot of continuity results, and there is a continuity tactic devoted to this task. Let’s prove a continuity statement that will be needed in an exercise below. Notice that Lean knows how to treat a product of two metric spaces as a metric space, so it makes sense to consider continuous functions from X × X to ℝ. In particular the (uncurried version of the) distance function is such a function.

example {X Y : Type*} [metric_space X] [metric_space Y] {f : X → Y} (hf : continuous f) :
  continuous (λ p : X × X, dist (f p.1) (f p.2)) :=
by continuity

This tactic is a bit slow, so it is also useful to know how to do it by hand. We first need to use the fact that λ p : X × X, f p.1 is continuous because it is the composition of f, which is continuous by assumption hf, and the projection prod.fst, whose continuity is the content of the lemma continuous_fst. The composition property is continuous.comp, which is in the continuous namespace, so we can use dot notation to compress continuous.comp hf continuous_fst into hf.comp continuous_fst, which is actually more readable since it really reads as composing our assumption and our lemma. We can do the same for the second component to get continuity of λ p : X × X, f p.2. We then assemble those two continuities using continuous.prod_mk to get (hf.comp continuous_fst).prod_mk (hf.comp continuous_snd) : continuous (λ p : X × X, (f p.1, f p.2)) and compose once more to get our full proof.

example {X Y : Type*} [metric_space X] [metric_space Y] {f : X → Y} (hf : continuous f) :
  continuous (λ p : X × X, dist (f p.1) (f p.2)) :=
continuous_dist.comp ((hf.comp continuous_fst).prod_mk (hf.comp continuous_snd))

The combination of continuous.prod_mk and continuous_dist via continuous.comp feels clunky, even when heavily using dot notation as above. A more serious issue is that this nice proof requires a lot of planning. Lean accepts the above proof term because it is a full term proving a statement which is definitionally equivalent to our goal, the crucial definition to unfold being that of a composition of functions. Indeed our target function λ p : X × X, dist (f p.1) (f p.2) is not presented as a composition. The proof term we provided proves continuity of dist ∘ (λ p : X × X, (f p.1, f p.2)), which happens to be definitionally equal to our target function. But if we try to build this proof gradually using tactics, starting with apply continuous_dist.comp, then Lean’s elaborator will fail to recognize a composition and refuse to apply this lemma. It is especially bad at this when products of types are involved.

A better lemma to apply here is continuous.dist {f g : X → Y} : continuous f → continuous g → continuous (λ x, dist (f x) (g x)), which is nicer to Lean’s elaborator and also provides a shorter proof when directly providing a full proof term, as can be seen from the following two new proofs of the above statement:

example {X Y : Type*} [metric_space X] [metric_space Y] {f : X → Y} (hf : continuous f) :
  continuous (λ p : X × X, dist (f p.1) (f p.2)) :=
begin
  apply continuous.dist,
  exact hf.comp continuous_fst,
  exact hf.comp continuous_snd
end

example {X Y : Type*} [metric_space X] [metric_space Y] {f : X → Y} (hf : continuous f) :
  continuous (λ p : X × X, dist (f p.1) (f p.2)) :=
(hf.comp continuous_fst).dist (hf.comp continuous_snd)

Note that, without the elaboration issue coming from composition, another way to compress our proof would be to use continuous.prod_map, which is sometimes useful and gives an alternate proof term, continuous_dist.comp (hf.prod_map hf), which is even shorter to type.

Since it is sad to have to choose between a version which is better for elaboration and a version which is shorter to type, let us wrap up this discussion with a last bit of compression offered by continuous.fst', which allows us to compress hf.comp continuous_fst to hf.fst' (and similarly with snd') and get our final proof, now bordering on obfuscation.

example {X Y : Type*} [metric_space X] [metric_space Y] {f : X → Y} (hf : continuous f) :
  continuous (λ p : X × X, dist (f p.1) (f p.2)) :=
hf.fst'.dist hf.snd'

It’s your turn now to prove a continuity lemma. After trying the continuity tactic, you will need continuous.add, continuous_pow and continuous_id to do it by hand.

example {f : ℝ → X} (hf : continuous f) : continuous (λ x : ℝ, f (x^2 + x)) :=
sorry
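A possible hand-written solution composes hf with continuity of x ↦ x^2 + x, using the three lemmas mentioned above (a sketch):

example {f : ℝ → X} (hf : continuous f) : continuous (λ x : ℝ, f (x^2 + x)) :=
-- x ↦ x^2 is continuous_pow 2 and x ↦ x is continuous_id;
-- their sum is continuous by continuous.add, and we compose with f
hf.comp ((continuous_pow 2).add continuous_id)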

So far we saw continuity as a global notion, but one can also define continuity at a point.

example {X Y : Type*} [metric_space X] [metric_space Y] (f : X → Y) (a : X) :
  continuous_at f a ↔ ∀ ε > 0, ∃ δ > 0, ∀ {x}, dist x a < δ → dist (f x) (f a) < ε :=
metric.continuous_at_iff

7.2.2. Balls, open sets and closed sets

Once we have a distance function, the most important geometric definitions are (open) balls and closed balls.

variables (r : ℝ)

example : metric.ball a r = {b | dist b a < r} := rfl

example : metric.closed_ball a r = {b | dist b a  r} := rfl

Note that r is any real number here; there is no sign restriction. Of course, some statements do require a radius condition.

example (hr : 0 < r) : a ∈ metric.ball a r := metric.mem_ball_self hr

example (hr : 0 ≤ r) : a ∈ metric.closed_ball a r := metric.mem_closed_ball_self hr

Once we have balls, we can define open sets. They are actually defined in a more general setting covered in the next section, but we have lemmas recasting the definition in terms of balls.

example (s : set X) : is_open s ↔ ∀ x ∈ s, ∃ ε > 0, metric.ball x ε ⊆ s :=
metric.is_open_iff

Then closed sets are sets whose complement is open. Their important property is that they are stable under limits. The closure of a set is the smallest closed set containing it.

example {s : set X} : is_closed s ↔ is_open sᶜ :=
is_open_compl_iff.symm

example {s : set X} (hs : is_closed s) {u : ℕ → X} (hu : tendsto u at_top (𝓝 a))
  (hus : ∀ n, u n ∈ s) : a ∈ s :=
hs.mem_of_tendsto hu (eventually_of_forall hus)

example {s : set X} : a ∈ closure s ↔ ∀ ε > 0, ∃ b ∈ s, a ∈ metric.ball b ε :=
metric.mem_closure_iff

Do the next exercise without using mem_closure_iff_seq_limit.

example {u : ℕ → X} (hu : tendsto u at_top (𝓝 a)) {s : set X} (hs : ∀ n, u n ∈ s) :
  a ∈ closure s :=
sorry
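Here is one way to do it, going through the ε-characterizations metric.mem_closure_iff and metric.tendsto_at_top (a sketch):

example {u : ℕ → X} (hu : tendsto u at_top (𝓝 a)) {s : set X} (hs : ∀ n, u n ∈ s) :
  a ∈ closure s :=
begin
  rw metric.mem_closure_iff,
  intros ε ε_pos,
  -- from convergence, get N with dist (u N) a < ε
  rcases metric.tendsto_at_top.mp hu ε ε_pos with ⟨N, hN⟩,
  refine ⟨u N, hs N, _⟩,
  rw dist_comm,
  exact hN N le_rfl
end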

Remember from the filters section that neighborhood filters play a big role in mathlib. In the metric space context, the crucial point is that balls provide bases for those filters. The main lemmas here are metric.nhds_basis_ball and metric.nhds_basis_closed_ball, which claim this for open and closed balls with positive radius. The center point is an implicit argument, so we can invoke filter.has_basis.mem_iff as in the following example.

example {x : X} {s : set X} : s ∈ 𝓝 x ↔ ∃ ε > 0, metric.ball x ε ⊆ s :=
metric.nhds_basis_ball.mem_iff

example {x : X} {s : set X} : s ∈ 𝓝 x ↔ ∃ ε > 0, metric.closed_ball x ε ⊆ s :=
metric.nhds_basis_closed_ball.mem_iff

7.2.3. Compactness

Compactness is an important topological notion. It distinguishes subsets of a metric space that enjoy the same kind of properties as segments in the reals, as compared to other intervals:

  • Any sequence taking values in a compact set has a subsequence that converges in this set.

  • Any continuous function on a nonempty compact set with values in the real numbers is bounded and achieves its bounds somewhere (this is called the extreme value theorem).

  • Compact sets are closed sets.

Let us first check that the unit interval in reals is indeed a compact set, and then check the above claims for compact sets in general metric spaces. In the second statement we only need continuity on the given set so we will use continuous_on instead of continuous, and we will give separate statements for the minimum and the maximum. Of course all these results are deduced from more general versions, some of which will be discussed in later sections.

example : is_compact (set.Icc 0 1 : set ℝ) :=
is_compact_Icc

example {s : set X} (hs : is_compact s) {u : ℕ → X} (hu : ∀ n, u n ∈ s) :
  ∃ a ∈ s, ∃ φ : ℕ → ℕ, strict_mono φ ∧ tendsto (u ∘ φ) at_top (𝓝 a) :=
hs.tendsto_subseq hu

example {s : set X} (hs : is_compact s) (hs' : s.nonempty)
  {f : X → ℝ} (hfs : continuous_on f s) :
  ∃ x ∈ s, ∀ y ∈ s, f x ≤ f y :=
hs.exists_forall_le hs' hfs

example {s : set X} (hs : is_compact s) (hs' : s.nonempty)
  {f : X → ℝ} (hfs : continuous_on f s) :
  ∃ x ∈ s, ∀ y ∈ s, f y ≤ f x :=
hs.exists_forall_ge hs' hfs

example {s : set X} (hs : is_compact s) : is_closed s :=
hs.is_closed

We can also specify that a metric space is globally compact, using an extra Prop-valued type class:

example {X : Type*} [metric_space X] [compact_space X] : is_compact (univ : set X) :=
is_compact_univ

In a compact metric space any closed set is compact; this is is_closed.is_compact.

7.2.4. Uniformly continuous functions

We now turn to uniformity notions on metric spaces: uniformly continuous functions, Cauchy sequences and completeness. Again these are defined in a more general context, but we have lemmas in the metric namespace to access their elementary definitions. We start with uniform continuity.

example {X : Type*} [metric_space X] {Y : Type*} [metric_space Y] {f : X → Y} :
  uniform_continuous f ↔
    ∀ ε > 0, ∃ δ > 0, ∀ {a b : X}, dist a b < δ → dist (f a) (f b) < ε :=
metric.uniform_continuous_iff

In order to practice manipulating all those definitions, we will prove that continuous functions from a compact metric space to a metric space are uniformly continuous (we will see a more general version in a later section).

We will first give an informal sketch. Let f : X → Y be a continuous function from a compact metric space to a metric space. We fix ε > 0 and start looking for some δ.

Let φ : X × X → ℝ := λ p, dist (f p.1) (f p.2) and let K := { p : X × X | ε ≤ φ p }. Observe that φ is continuous since f and the distance function are continuous. And K is clearly closed (use is_closed_le), hence compact since X is compact.

Then we discuss the two possibilities using eq_empty_or_nonempty. If K is empty then we are clearly done (we can set δ = 1, for instance). So let's assume K is not empty, and use the extreme value theorem to choose (x₀, x₁) attaining the infimum of the distance function on K. We can then set δ = dist x₀ x₁ and check that everything works.

example {X : Type*} [metric_space X] [compact_space X] {Y : Type*} [metric_space Y]
  {f : X → Y} (hf : continuous f) : uniform_continuous f :=
sorry

7.2.5. Completeness

A Cauchy sequence in a metric space is a sequence whose terms get closer and closer to each other. There are a couple of equivalent ways to state this idea. In particular, converging sequences are Cauchy. The converse is true only in so-called complete spaces.

example (u : ℕ → X) :
  cauchy_seq u ↔ ∀ ε > 0, ∃ N : ℕ, ∀ m ≥ N, ∀ n ≥ N, dist (u m) (u n) < ε :=
metric.cauchy_seq_iff

example (u : ℕ → X) :
  cauchy_seq u ↔ ∀ ε > 0, ∃ N : ℕ, ∀ n ≥ N, dist (u n) (u N) < ε :=
metric.cauchy_seq_iff'


example [complete_space X] (u : ℕ → X) (hu : cauchy_seq u) : ∃ x, tendsto u at_top (𝓝 x) :=
cauchy_seq_tendsto_of_complete hu

We’ll practice using this definition by proving a convenient criterion which is a special case of a criterion appearing in mathlib. This is also a good opportunity to practice using big sums in a geometric context. In addition to the explanations from the filters section, you will probably need tendsto_pow_at_top_nhds_0_of_lt_1, tendsto.mul and dist_le_range_sum_dist.

lemma cauchy_seq_of_le_geometric_two' {u : ℕ → X}
  (hu : ∀ (n : ℕ), dist (u n) (u (n + 1)) ≤ (1 / 2) ^ n) :
  cauchy_seq u :=
begin
  rw metric.cauchy_seq_iff',
  intros ε ε_pos,
  obtain ⟨N, hN⟩ : ∃ N : ℕ, 1 / 2 ^ N * 2 < ε,
  { sorry },
  use N,
  intros n hn,
  obtain ⟨k, rfl : n = N + k⟩ := le_iff_exists_add.mp hn,
  calc dist (u (N + k)) (u N) = dist (u (N + 0)) (u (N + k)) : sorry
  ... ≤ ∑ i in range k, dist (u (N + i)) (u (N + (i + 1))) : sorry
  ... ≤ ∑ i in range k, (1 / 2 : ℝ) ^ (N + i) : sorry
  ... = 1 / 2 ^ N * ∑ i in range k, (1 / 2) ^ i : sorry
  ... ≤ 1 / 2 ^ N * 2 : sorry
  ... < ε : sorry
end

We are ready for the final boss of this section: Baire’s theorem for complete metric spaces! The proof skeleton below shows interesting techniques. It uses the choose tactic in its exclamation mark variant (you should experiment with removing this exclamation mark) and it shows how to define something inductively in the middle of a proof using nat.rec_on.

open metric

example [complete_space X] (f : ℕ → set X) (ho : ∀ n, is_open (f n)) (hd : ∀ n, dense (f n)) :
  dense (⋂ n, f n) :=
begin
  let B : ℕ → ℝ := λ n, (1 / 2) ^ n,
  have Bpos : ∀ n, 0 < B n, sorry,
  /- Translate the density assumption into two functions `center` and `radius` associating
  to any n, x, δ, δpos a center and a positive radius such that
  `closed_ball center radius` is included both in `f n` and in `closed_ball x δ`.
  We can also require `radius ≤ (1/2)^(n+1)`, to ensure we get a Cauchy sequence later. -/
  have : ∀ (n : ℕ) (x : X) (δ > 0), ∃ (y : X) (r > 0), r ≤ B (n + 1) ∧
    closed_ball y r ⊆ (closed_ball x δ) ∩ f n,
  { sorry },
  choose! center radius Hpos HB Hball using this,
  intros x,
  rw mem_closure_iff_nhds_basis nhds_basis_closed_ball,
  intros ε εpos,
  /- `ε` is positive. We have to find a point in the ball of radius `ε` around `x` belonging to all
  `f n`. For this, we construct inductively a sequence `F n = (c n, r n)` such that the closed ball
  `closed_ball (c n) (r n)` is included in the previous ball and in `f n`, and such that
  `r n` is small enough to ensure that `c n` is a Cauchy sequence. Then `c n` converges to a
  limit which belongs to all the `f n`. -/
  let F : ℕ → (X × ℝ) := λ n, nat.rec_on n (prod.mk x (min ε (B 0)))
                              (λ n p, prod.mk (center n p.1 p.2) (radius n p.1 p.2)),
  let c : ℕ → X := λ n, (F n).1,
  let r : ℕ → ℝ := λ n, (F n).2,
  have rpos : ∀ n, 0 < r n,
  { sorry },

  have rB : ∀ n, r n ≤ B n,
  { sorry },
  have incl : ∀ n, closed_ball (c (n + 1)) (r (n + 1)) ⊆ (closed_ball (c n) (r n)) ∩ (f n),
  { sorry },
  have cdist : ∀ n, dist (c n) (c (n + 1)) ≤ B n,
  { sorry },
  have : cauchy_seq c, from cauchy_seq_of_le_geometric_two' cdist,
  -- as the sequence `c n` is Cauchy in a complete space, it converges to a limit `y`.
  rcases cauchy_seq_tendsto_of_complete this with ⟨y, ylim⟩,
  -- this point `y` will be the desired point. We will check that it belongs to all
  -- `f n` and to `ball x ε`.
  use y,
  have I : ∀ n, ∀ m ≥ n, closed_ball (c m) (r m) ⊆ closed_ball (c n) (r n),
  { sorry },
  have yball : ∀ n, y ∈ closed_ball (c n) (r n),
  { sorry },
  sorry
end

7.3. Topological spaces

7.3.1. Fundamentals

We now go up in generality and introduce topological spaces. We will review the two main ways to define topological spaces and then explain how the category of topological spaces is much better behaved than the category of metric spaces. Note that we won't be using mathlib's category theory here; we only adopt a somewhat categorical point of view.

The first way to think about the transition from metric spaces to topological spaces is that we only remember the notion of open sets (or equivalently the notion of closed sets). From this point of view, a topological space is a type equipped with a collection of sets that are called open sets. This collection has to satisfy a number of axioms presented below (this collection is slightly redundant but we will ignore that).

section

variables {X : Type*} [topological_space X]

example : is_open (univ : set X) := is_open_univ

example : is_open (∅ : set X) := is_open_empty

example {ι : Type*} {s : ι → set X} (hs : ∀ i, is_open $ s i) :
  is_open (⋃ i, s i) :=
is_open_Union hs

example {ι : Type*} [fintype ι] {s : ι → set X} (hs : ∀ i, is_open $ s i) :
  is_open (⋂ i, s i) :=
is_open_Inter hs

Closed sets are then defined as sets whose complement is open. A function between topological spaces is (globally) continuous if all preimages of open sets are open.
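The first sentence can be checked against mathlib directly; a sketch, assuming the equivalence is recorded as is_open_compl_iff (stated with the complement on the open side):

```lean
-- a set is closed exactly when its complement is open
example {s : set X} : is_closed s ↔ is_open sᶜ :=
is_open_compl_iff.symm
```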

variables {Y : Type*} [topological_space Y]

example {f : X → Y} : continuous f ↔ ∀ s, is_open s → is_open (f ⁻¹' s) :=
continuous_def

With this definition we already see that, compared to metric spaces, topological spaces only retain enough information to talk about continuous functions: two topological structures on a type are the same if and only if they have the same continuous functions (indeed the identity function will be continuous in both directions if and only if the two structures have the same open sets).
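The parenthetical argument rests on the identity map being continuous when source and target carry the same topology, which mathlib records as continuous_id:

```lean
-- the identity is continuous from X to itself with the same topology
example : continuous (id : X → X) :=
continuous_id
```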

However, as soon as we move on to continuity at a point, we see the limitations of the approach based on open sets. In mathlib it is much more common to think of topological spaces as types equipped with a neighborhood filter 𝓝 x attached to each point x (the corresponding function X → filter X satisfies certain conditions explained further down). Remember from the filters section that these gadgets play two related roles. First, 𝓝 x is seen as the generalized set of points of X that are close to x. Then it is seen as giving a way to say, for any predicate P : X → Prop, that this predicate holds for points that are close enough to x. Let us state that f : X → Y is continuous at x. The purely filtery way is to say that the direct image under f of the generalized set of points that are close to x is contained in the generalized set of points that are close to f x. Recall this is spelled either map f (𝓝 x) ≤ 𝓝 (f x) or tendsto f (𝓝 x) (𝓝 (f x)).

example {f : X → Y} {x : X} : continuous_at f x ↔ map f (𝓝 x) ≤ 𝓝 (f x) :=
iff.rfl

One can also spell it using both neighborhoods seen as ordinary sets and a neighborhood filter seen as a generalized set: “for any neighborhood U of f x, all points close to x are sent to U”. Note that the proof is again iff.rfl; this point of view is definitionally equivalent to the previous one.

example {f : X → Y} {x : X} : continuous_at f x ↔ ∀ U ∈ 𝓝 (f x), ∀ᶠ x in 𝓝 x, f x ∈ U :=
iff.rfl

We now explain how to go from one point of view to the other. In terms of open sets, we can simply define members of 𝓝 x as sets that contain an open set containing x.

example {x : X} {s : set X} : s ∈ 𝓝 x ↔ ∃ t ⊆ s, is_open t ∧ x ∈ t :=
mem_nhds_iff

To go in the other direction we need to discuss the conditions that 𝓝 : X → filter X must satisfy in order to be the neighborhood function of a topology.

The first constraint is that 𝓝 x, seen as a generalized set, contains the set {x} seen as the generalized set pure x (explaining this weird name would be too much of a digression, so we simply accept it for now). Another way to say it is that if a predicate holds for points close to x then it holds at x.

example (x : X) : pure x ≤ 𝓝 x := pure_le_nhds x

example (x : X) (P : X → Prop) (h : ∀ᶠ y in 𝓝 x, P y) : P x :=
pure_le_nhds x h

Then a more subtle requirement is that, for any predicate P : X → Prop and any x, if P y holds for y close to x then for y close to x and z close to y, P z holds. More precisely we have:

example {P : X → Prop} {x : X} (h : ∀ᶠ y in 𝓝 x, P y) : ∀ᶠ y in 𝓝 x, ∀ᶠ z in 𝓝 y, P z :=
eventually_eventually_nhds.mpr h

These two results characterize the functions X → filter X that are neighborhood functions for a topological space structure on X. There is still a function topological_space.mk_of_nhds : (X → filter X) → topological_space X, but it will give back its input as a neighborhood function only if it satisfies the above two constraints. More precisely we have a lemma topological_space.nhds_mk_of_nhds saying that in a different way, and our next exercise deduces this different way from how we stated it above.

example {α : Type*} (n : α → filter α) (H₀ : ∀ a, pure a ≤ n a)
  (H : ∀ a : α, ∀ p : α → Prop, (∀ᶠ x in n a, p x) → (∀ᶠ y in n a, ∀ᶠ x in n y, p x)) :
  ∀ a, ∀ s ∈ n a, ∃ t ∈ n a, t ⊆ s ∧ ∀ a' ∈ t, s ∈ n a' :=
sorry

Note that topological_space.mk_of_nhds is not used very often, but it is still good to know in what precise sense the neighborhood filters are all there is to a topological space structure.

The next thing to know in order to use topological spaces efficiently in mathlib is that we use a lot of the formal properties of topological_space : Type u → Type u. From a purely mathematical point of view, those formal properties are a very clean way to explain how topological spaces solve issues that metric spaces have. From this point of view, the issue solved by topological spaces is that metric spaces enjoy very little functoriality, and have very bad categorical properties in general. This comes on top of the fact, already discussed, that metric spaces contain a lot of geometric information that is not topologically relevant.

Let us focus on functoriality first. A metric space structure can be induced on a subset or, equivalently, it can be pulled back by an injective map. But that's pretty much everything. Metric space structures cannot be pulled back by a general map or pushed forward, even by surjective maps.
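The induced structure on a subset is found automatically by the type class machinery; the following little check (relying only on instance search) illustrates it:

```lean
-- the metric on a subset is inferred from the ambient space
example {X : Type*} [metric_space X] (s : set X) : metric_space s :=
by apply_instance
```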

In particular there is no sensible distance to put on a quotient of a metric space or on an uncountable product of metric spaces. Consider for instance the type ℝ → ℝ, seen as a product of copies of ℝ indexed by ℝ. We would like to say that pointwise convergence of sequences of functions is a respectable notion of convergence. But there is no distance on ℝ → ℝ that gives this notion of convergence. Relatedly, there is no distance ensuring that a map f : X → (ℝ → ℝ) is continuous if and only if λ x, f x t is continuous for every t : ℝ.

We now review the data used to solve all those issues. First we can use any map f : X → Y to push or pull topologies from one side to the other. Those two operations form a Galois connection.

variables {X Y : Type*}

example (f : X → Y) : topological_space X → topological_space Y :=
topological_space.coinduced f

example (f : X → Y) : topological_space Y → topological_space X :=
topological_space.induced f

example (f : X → Y) (T_X : topological_space X) (T_Y : topological_space Y) :
  topological_space.coinduced f T_X ≤ T_Y ↔ T_X ≤ topological_space.induced f T_Y :=
coinduced_le_iff_le_induced

These operations are compatible with composition of functions. As usual, pushing forward is covariant and pulling back is contravariant; see coinduced_compose and induced_compose. On paper we will use the notations \(f_*T\) for topological_space.coinduced f T and \(f^*T\) for topological_space.induced f T.
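The two compatibility lemmas can be stated as follows (a sketch; mathlib's coinduced_compose and induced_compose state these equalities, possibly with slightly different implicit-argument conventions):

```lean
-- pushing forward along g after f is pushing forward along g ∘ f
example {Z : Type*} (f : X → Y) (g : Y → Z) (T_X : topological_space X) :
  topological_space.coinduced g (topological_space.coinduced f T_X) =
    topological_space.coinduced (g ∘ f) T_X :=
coinduced_compose

-- pulling back along f after g is pulling back along g ∘ f
example {Z : Type*} (f : X → Y) (g : Y → Z) (T_Z : topological_space Z) :
  topological_space.induced f (topological_space.induced g T_Z) =
    topological_space.induced (g ∘ f) T_Z :=
induced_compose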

Then the next big piece is a complete lattice structure on topological_space X for any given type X. If you think of topologies as being primarily the data of open sets, then you expect the order relation on topological_space X to come from set (set X), i.e. you expect t ≤ t' if a set u is open for t' as soon as it is open for t. However, we already know that mathlib focuses on neighborhoods more than open sets, so, for any x : X, we want λ T : topological_space X, @nhds X T x to be order preserving. And we know the order relation on filter X is designed to ensure an order preserving principal : set X → filter X, allowing us to see filters as generalized sets. So the order relation we actually use on topological_space X is opposite to the one coming from set (set X).

example {T T' : topological_space X} :
  T ≤ T' ↔ ∀ s, T'.is_open s → T.is_open s :=
iff.rfl

Now we can recover continuity by combining the push-forward (or pull-back) operation with the order relation.

example (T_X : topological_space X) (T_Y : topological_space Y) (f : X → Y) :
  continuous f ↔ topological_space.coinduced f T_X ≤ T_Y :=
continuous_iff_coinduced_le

With this definition and the compatibility of push-forward and composition, we get for free the universal property that, for any topological space \(Z\), a function \(g : Y → Z\) is continuous for the topology \(f_*T_X\) if and only if \(g ∘ f\) is continuous.

\[\begin{split}g \text{ continuous } &⇔ g_*(f_*T_X) ≤ T_Z \\ &⇔ (g ∘ f)_* T_X ≤ T_Z \\ &⇔ g ∘ f \text{ continuous}\end{split}\]
example {Z : Type*} (f : X  Y)
  (T_X : topological_space X) (T_Z : topological_space Z) (g : Y  Z) :
  @continuous Y Z (topological_space.coinduced f T_X) T_Z g ↔ @continuous X Z T_X T_Z (g ∘ f) :=
by rw [continuous_iff_coinduced_le, coinduced_compose, continuous_iff_coinduced_le]

So we already get quotient topologies (using the projection map as f). This wasn't using that topological_space X is a complete lattice for all X. Let's now see how all this structure proves the existence of the product topology by abstract nonsense. We considered the case of ℝ → ℝ above, but let's now consider the general case of Π i, X i for some ι : Type* and X : ι → Type*. We want, for any topological space Z and any function f : Z → Π i, X i, that f is continuous if and only if (λ x, x i) ∘ f is continuous for every i. Let us explore that constraint “on paper” using the notation \(p_i\) for the projection λ (x : Π i, X i), x i:

\[\begin{split}(∀ i, p_i ∘ f \text{ continuous}) &⇔ ∀ i, (p_i ∘ f)_* T_Z ≤ T_{X_i} \\ &⇔ ∀ i, (p_i)_* f_* T_Z ≤ T_{X_i}\\ &⇔ ∀ i, f_* T_Z ≤ (p_i)^*T_{X_i}\\ &⇔ f_* T_Z ≤ \inf \left[(p_i)^*T_{X_i}\right]\end{split}\]

So we see that the topology we want on Π i, X i is:

example (ι : Type*) (X : ι → Type*) (T_X : Π i, topological_space $ X i) :
  (Pi.topological_space : topological_space (Π i, X i)) =
    ⨅ i, topological_space.induced (λ x, x i) (T_X i) :=
rfl

This ends our tour of how mathlib views topological spaces as fixing defects of the theory of metric spaces: they form a more functorial theory, with a complete lattice structure for any fixed type.

7.3.2. Separation and countability

We saw that the category of topological spaces has very nice properties. The price to pay is the existence of rather pathological topological spaces. There are a number of assumptions you can make on a topological space to ensure its behavior is closer to that of metric spaces. The most important is t2_space, also called “Hausdorff”, which ensures that limits are unique. A stronger separation property is regularity, which ensures that each point has a basis of closed neighborhoods.

example [topological_space X] [t2_space X] {u : ℕ → X} {a b : X}
  (ha : tendsto u at_top (𝓝 a)) (hb : tendsto u at_top (𝓝 b)) : a = b :=
tendsto_nhds_unique ha hb

example [topological_space X] [regular_space X] (a : X) :
    (𝓝 a).has_basis (λ (s : set X), s ∈ 𝓝 a ∧ is_closed s) id :=
closed_nhds_basis a

Note that, in every topological space, each point has a basis of open neighborhoods, by definition.

example [topological_space X] {x : X} : (𝓝 x).has_basis (λ t : set X, t ∈ 𝓝 x ∧ is_open t) id :=
nhds_basis_opens' x

Our main goal is now to prove the basic theorem which allows extension by continuity. From Bourbaki’s general topology book, I.8.5, Theorem 1 (taking only the non-trivial implication):

Let \(X\) be a topological space, \(A\) a dense subset of \(X\), \(f : A → Y\) a continuous mapping of \(A\) into a regular space \(Y\). If, for each \(x\) in \(X\), \(f(y)\) tends to a limit in \(Y\) when \(y\) tends to \(x\) while remaining in \(A\) then there exists a continuous extension \(φ\) of \(f\) to \(X\).

Actually mathlib contains a more general version of the above lemma, dense_inducing.continuous_at_extend, but we’ll stick to Bourbaki’s version here.

Remember that, given A : set X, ↥A is the subtype associated to A, and Lean will automatically insert that funny up arrow when needed. And the (inclusion) coercion map is coe : A → X. The assumption “tends to \(x\) while remaining in \(A\)” corresponds to the pull-back filter comap coe (𝓝 x).

Let’s prove first an auxiliary lemma, extracted to simplify the context (in particular we don’t need Y to be a topological space here).

lemma aux {X Y A : Type*} [topological_space X] {c : A → X} {f : A → Y} {x : X} {F : filter Y}
  (h : tendsto f (comap c (𝓝 x)) F) {V' : set Y} (V'_in : V' ∈ F) :
  ∃ V ∈ 𝓝 x, is_open V ∧ c ⁻¹' V ⊆ f ⁻¹' V' :=
sorry

Let’s now turn to the main proof of the extension by continuity theorem.

When Lean needs a topology on ↥A it will use the induced topology, thanks to the instance subtype.topological_space. This all happens automatically. The only relevant lemma is nhds_induced coe : ∀ a : ↥A, 𝓝 a = comap coe (𝓝 ↑a) (this is actually a general lemma about induced topologies).
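As a self-contained sketch of how this lemma reads on a subtype (assuming nhds_induced applies to the coercion as stated above):

```lean
-- neighborhoods in the subtype are pulled back from the ambient space
example {X : Type*} [topological_space X] {A : set X} (a : A) :
  𝓝 a = comap (coe : A → X) (𝓝 ↑a) :=
nhds_induced coe a
```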

The proof outline is:

The main assumption and the axiom of choice give a function φ such that ∀ x, tendsto f (comap coe $ 𝓝 x) (𝓝 (φ x)) (because Y is Hausdorff, φ is entirely determined, but we won't need that until we try to prove that φ indeed extends f).

Let's first prove φ is continuous. Fix any x : X. Since Y is regular, it suffices to check that for every closed neighborhood V' of φ x, φ ⁻¹' V' ∈ 𝓝 x. The limit assumption gives (through the auxiliary lemma above) some V ∈ 𝓝 x such that is_open V ∧ coe ⁻¹' V ⊆ f ⁻¹' V'. Since V ∈ 𝓝 x, it suffices to prove V ⊆ φ ⁻¹' V', i.e. ∀ y ∈ V, φ y ∈ V'. Let's fix y in V. Because V is open, it is a neighborhood of y. In particular coe ⁻¹' V ∈ comap coe (𝓝 y) and a fortiori f ⁻¹' V' ∈ comap coe (𝓝 y). In addition comap coe (𝓝 y) ≠ ⊥ because A is dense. Because we know tendsto f (comap coe $ 𝓝 y) (𝓝 (φ y)), this implies φ y ∈ closure V' and, since V' is closed, we have proved φ y ∈ V'.

It remains to prove that φ extends f. This is where continuity of f enters the discussion, together with the fact that Y is Hausdorff.

example [topological_space X] [topological_space Y] [regular_space Y]
  {A : set X} (hA : ∀ x, x ∈ closure A)
  {f : A → Y} (f_cont : continuous f)
  (hf : ∀ x : X, ∃ c : Y, tendsto f (comap coe $ 𝓝 x) $ 𝓝 c) :
  ∃ φ : X → Y, continuous φ ∧ ∀ a : A, φ a = f a :=
sorry

In addition to separation properties, the main kind of assumption you can make on a topological space to bring it closer to metric spaces is a countability assumption. The main one is first countability, asking that every point has a countable neighborhood basis. In particular this ensures that the closure of a set can be understood using sequences.

example [topological_space X] [topological_space.first_countable_topology X] {s : set X} {a : X} :
  a ∈ closure s ↔ ∃ (u : ℕ → X), (∀ n, u n ∈ s) ∧ tendsto u at_top (𝓝 a) :=
mem_closure_iff_seq_limit

7.3.3. Compactness

Let us now discuss how compactness is defined for topological spaces. As usual there are several ways to think about it and mathlib goes for the filter version.

We first need to define cluster points of filters. Given a filter F on a topological space X, a point x : X is a cluster point of F if F, seen as a generalized set, has non-empty intersection with the generalized set of points that are close to x.

Then we can say that a set s is compact if every nonempty generalized set F contained in s, i.e. such that F ≤ 𝓟 s, has a cluster point in s.

variables [topological_space X]

example {F : filter X} {x : X} : cluster_pt x F ↔ ne_bot (𝓝 x ⊓ F) :=
iff.rfl

example {s : set X} :
  is_compact s ↔ ∀ (F : filter X) [ne_bot F], F ≤ 𝓟 s → ∃ a ∈ s, cluster_pt a F :=
iff.rfl

For instance, if F is map u at_top, the image under u : ℕ → X of at_top, the generalized set of very large natural numbers, then the assumption F ≤ 𝓟 s means that u n belongs to s for n large enough. Saying that x is a cluster point of map u at_top says that the image of very large numbers intersects the set of points that are close to x. In case 𝓝 x has a countable basis, we can interpret this as saying that u has a subsequence converging to x, and we get back what compactness looks like in metric spaces.

example [topological_space.first_countable_topology X]
  {s : set X} {u : ℕ → X} (hs : is_compact s) (hu : ∀ n, u n ∈ s) :
  ∃ (a ∈ s) (φ : ℕ → ℕ), strict_mono φ ∧ tendsto (u ∘ φ) at_top (𝓝 a) :=
hs.tendsto_subseq hu

Cluster points behave nicely with continuous functions.

variables [topological_space Y]

example {x : X} {F : filter X} {G : filter Y} (H : cluster_pt x F)
  {f : X → Y} (hfx : continuous_at f x) (hf : tendsto f F G) :
  cluster_pt (f x) G :=
cluster_pt.map H hfx hf

As an exercise, we will prove that the image of a compact set under a continuous map is compact. In addition to what we saw already, you should use filter.push_pull and ne_bot.of_map.

example [topological_space Y] {f : X → Y} (hf : continuous f)
  {s : set X} (hs : is_compact s) : is_compact (f '' s) :=
begin
  intros F F_ne F_le,
  have map_eq : map f (𝓟 s ⊓ comap f F) = 𝓟 (f '' s) ⊓ F,
  { sorry },
  haveI Hne : (𝓟 s ⊓ comap f F).ne_bot,
  { sorry },
  have Hle : 𝓟 s ⊓ comap f F ≤ 𝓟 s, from inf_le_left,
  sorry
end

One can also express compactness in terms of open covers: s is compact if every family of open sets that covers s has a finite covering sub-family.

example {ι : Type*} {s : set X} (hs : is_compact s)
  (U : ι → set X) (hUo : ∀ i, is_open (U i)) (hsU : s ⊆ ⋃ i, U i) :
  ∃ t : finset ι, s ⊆ ⋃ i ∈ t, U i :=
hs.elim_finite_subcover U hUo hsU

A topological space X is compact if (univ : set X) is compact.

example [compact_space X] : is_compact (univ : set X) :=
is_compact_univ