# Lean mathlib notes

Various implementation details are noted in the mathlib source, and referenced later on. We collect these notes here.

## classical lemma

We make decidability results that depends on `classical.choice`

noncomputable lemmas.

- We have to mark them as noncomputable, because otherwise Lean will try to generate bytecode
for them, and fail because it depends on
`classical.choice`

. - We make them lemmas, and not definitions, because otherwise later definitions will raise
\"failed to generate bytecode\" errors when writing something like
`letI := classical.dec_eq _`

. Cf. https://leanprover-community.github.io/archive/113488general/08268noncomputabletheorem.html

## coercion into rings

Coercions such as `nat.cast_coe`

that go from a concrete structure such as
`ℕ`

to an arbitrary ring `α`

should be set up as follows:

```
@[priority 900] instance : has_coe_t ℕ α := ⟨...⟩
```

It needs to be `has_coe_t`

instead of `has_coe`

because otherwise type-class
inference would loop when constructing the transitive coercion `ℕ → ℕ → ℕ → ...`

.
The reduced priority is necessary so that it doesn't conflict with instances
such as `has_coe_t α (option α)`

.

For this to work, we reduce the priority of the `coe_base`

and `coe_trans`

instances because we want the instances for `has_coe_t`

to be tried in the
following order:

`has_coe_t`

instances declared in mathlib (such as`has_coe_t α (with_top α)`

, etc.)`coe_base`

, which contains instances such as`has_coe (fin n) n`

`nat.cast_coe : has_coe_t ℕ α`

etc.`coe_trans`

If `coe_trans`

is tried first, then `nat.cast_coe`

doesn't get a chance to apply.

## custom simps projection

You can specify custom projections for the `@[simps]`

attribute.
To do this for the projection `my_structure.awesome_projection`

by adding a declaration
`my_structure.simps.awesome_projection`

that is definitionally equal to
`my_structure.awesome_projection`

but has the projection in the desired (simp-normal) form.

You can initialize the projections `@[simps]`

uses with `initialize_simps_projections`

(after declaring any custom projections). This is not necessary, it has the same effect
if you just add `@[simps]`

to a declaration.

If you do anything to change the default projections, make sure to call either `@[simps]`

or
`initialize_simps_projections`

in the same file as the structure declaration. Otherwise, you might
have a file that imports the structure, but not your custom projections.

## decidable namespace

In most of mathlib, we use the law of excluded middle (LEM) and the axiom of choice (AC) freely.
The `decidable`

namespace contains versions of lemmas from the root namespace that explicitly
attempt to avoid the axiom of choice, usually by adding decidability assumptions on the inputs.

You can check if a lemma uses the axiom of choice by using `#print axioms foo`

and seeing if
`classical.choice`

appears in the list.

## dsimp, simp

Many proofs in the category theory library use the `dsimp, simp`

pattern,
which typically isn't necessary elsewhere.

One would usually hope that the same effect could be achieved simply with `simp`

.

The essential issue is that composition of morphisms involves dependent types.
When you have a chain of morphisms being composed, say `f : X ⟶ Y`

and `g : Y ⟶ Z`

,
then `simp`

can operate succesfully on the morphisms
(e.g. if `f`

is the identity it can strip that off).

However if we have an equality of objects, say `Y = Y'`

,
then `simp`

can't operate because it would break the typing of the composition operations.
We rarely have interesting equalities of objects
(because that would be "evil" --- anything interesting should be expressed as an isomorphism
and tracked explicitly),
except of course that we have plenty of definitional equalities of objects.

`dsimp`

can apply these safely, even inside a composition.

After `dsimp`

has cleared up the object level, `simp`

can resume work on the morphism level ---
but without the `dsimp`

step, because `simp`

looks at expressions syntactically,
the relevant lemmas might not fire.

There's no bound on how many times you potentially could have to switch back and forth,
if the `simp`

introduced new objects we again need to `dsimp`

.
In practice this does occur, but only rarely, because `simp`

tends to shorten chains of compositions
(i.e. not introduce new objects at all).

## forgetful inheritance

Suppose that one can put two mathematical structures on a type, a rich one `R`

and a poor one
`P`

, and that one can deduce the poor structure from the rich structure through a map `F`

(called a
forgetful functor) (think `R = metric_space`

and `P = topological_space`

). A possible
implementation would be to have a type class `rich`

containing a field `R`

, a type class `poor`

containing a field `P`

, and an instance from `rich`

to `poor`

. However, this creates diamond
problems, and a better approach is to let `rich`

extend `poor`

and have a field saying that
`F R = P`

.

To illustrate this, consider the pair `metric_space`

/ `topological_space`

. Consider the topology
on a product of two metric spaces. With the first approach, it could be obtained by going first from
each metric space to its topology, and then taking the product topology. But it could also be
obtained by considering the product metric space (with its sup distance) and then the topology
coming from this distance. These would be the same topology, but not definitionally, which means
that from the point of view of Lean's kernel, there would be two different `topological_space`

instances on the product. This is not compatible with the way instances are designed and used:
there should be at most one instance of a kind on each type. This approach has created an instance
diamond that does not commute definitionally.

The second approach solves this issue. Now, a metric space contains both a distance, a topology, and a proof that the topology coincides with the one coming from the distance. When one defines the product of two metric spaces, one uses the sup distance and the product topology, and one has to give the proof that the sup distance induces the product topology. Following both sides of the instance diamond then gives rise (definitionally) to the product topology on the product space.

Another approach would be to have the rich type class take the poor type class as an instance parameter. It would solve the diamond problem, but it would lead to a blow up of the number of type classes one would need to declare to work with complicated classes, say a real inner product space, and would create exponential complexity when working with products of such complicated spaces, that are avoided by bundling things carefully as above.

Note that this description of this specific case of the product of metric spaces is oversimplified
compared to mathlib, as there is an intermediate typeclass between `metric_space`

and
`topological_space`

called `uniform_space`

. The above scheme is used at both levels, embedding a
topology in the uniform space structure, and a uniform structure in the metric space structure.

Note also that, when `P`

is a proposition, there is no such issue as any two proofs of `P`

are
definitionally equivalent in Lean.

To avoid boilerplate, there are some designs that can automatically fill the poor fields when
creating a rich structure if one doesn't want to do something special about them. For instance,
in the definition of metric spaces, default tactics fill the uniform space fields if they are
not given explicitly. One can also have a helper function creating the rich structure from a
structure with less fields, where the helper function fills the remaining fields. See for instance
`uniform_space.of_core`

or `real_inner_product.of_core`

.

For more details on this question, called the forgetful inheritance pattern, see Competing inheritance paths in dependent type theory: a case study in functional analysis.

## function coercion

Many structures such as bundled morphisms coerce to functions so that you can
transparently apply them to arguments. For example, if `e : α ≃ β`

and `a : α`

then you can write `e a`

and this is elaborated as `⇑e a`

. This type of
coercion is implemented using the `has_coe_to_fun`

type class. There is one
important consideration:

If a type coerces to another type which in turn coerces to a function,
then it **must** implement `has_coe_to_fun`

directly:

```
structure sparkling_equiv (α β) extends α ≃ β
-- if we add a `has_coe` instance,
instance {α β} : has_coe (sparkling_equiv α β) (α ≃ β) :=
⟨sparkling_equiv.to_equiv⟩
-- then a `has_coe_to_fun` instance **must** be added as well:
instance {α β} : has_coe_to_fun (sparkling_equiv α β) :=
⟨λ _, α → β, λ f, f.to_equiv.to_fun⟩
```

(Rationale: if we do not declare the direct coercion, then `⇑e a`

is not in
simp-normal form. The lemma `coe_fn_coe_base`

will unfold it to `⇑↑e a`

. This
often causes loops in the simplifier.)

## implicit instance arguments

There are places where typeclass arguments are specified with implicit `{}`

brackets instead of
the usual `[]`

brackets. This is done when the instances can be inferred because they are implicit
arguments to the type of one of the other arguments. When they can be inferred from these other
arguments, it is faster to use this method than to use type class inference.

For example, when writing lemmas about `(f : α →+* β)`

, it is faster to specify the fact that `α`

and `β`

are `semiring`

s as `{rα : semiring α} {rβ : semiring β}`

rather than the usual
`[semiring α] [semiring β]`

.

## is_R_or_C instance

This instance generates a type-class problem with a metavariable `?m`

that should satisfy
`is_R_or_C ?m`

. Since this can only be satisfied by `ℝ`

or `ℂ`

, this does not cause problems.

## likely generated binder names

In surface Lean, we can write anonymous Π binders (i.e. binders where the argument is not named) using the function arrow notation:

```
inductive test : Type
| intro : unit → test
```

After elaboration, however, every binder must have a name, so Lean generates
one. In the example, the binder in the type of `intro`

is anonymous, so Lean
gives it the name `ᾰ`

:

```
test.intro : ∀ (ᾰ : unit), test
```

When there are multiple anonymous binders, they are named `ᾰ_1`

, `ᾰ_2`

etc.

Thus, when we want to know whether the user named a binder, we can check whether the name follows this scheme. Note, however, that this is not reliable. When the user writes (for whatever reason)

```
inductive test : Type
| intro : ∀ (ᾰ : unit), test
```

we cannot tell that the binder was, in fact, named.

The function `name.is_likely_generated_binder_name`

checks if
a name is of the form `ᾰ`

, `ᾰ_1`

, etc.

## lower instance priority

Certain instances always apply during type-class resolution. For example, the instance
`add_comm_group.to_add_group {α} [add_comm_group α] : add_group α`

applies to all type-class
resolution problems of the form `add_group _`

, and type-class inference will then do an
exhaustive search to find a commutative group. These instances take a long time to fail.
Other instances will only apply if the goal has a certain shape. For example
`int.add_group : add_group ℤ`

or
`add_group.prod {α β} [add_group α] [add_group β] : add_group (α × β)`

. Usually these instances
will fail quickly, and when they apply, they are almost the desired instance.
For this reason, we want the instances of the second type (that only apply in specific cases) to
always have higher priority than the instances of the first type (that always apply).
See also #1561.

Therefore, if we create an instance that always applies, we set the priority of these instances to 100 (or something similar, which is below the default value of 1000).

## module definition

Modules are defined as an `abbreviation`

for semimodules,
if the base semiring is a ring.
(A previous definition made `module`

a structure
defined to be `semimodule`

.)
This has as advantage that modules are completely transparent
for type class inference, which means that all instances for semimodules
are immediately picked up for modules as well.
A cosmetic disadvantage is that one can not extend modules as such,
in definitions such as `normed_space`

.
The solution is to extend `semimodule`

instead.

## no instance on morphisms

We have lemmas stating that the composition of two morphisms is again a morphism.
Since composition is reducible, type class inference will always succeed in applying these instances.
For example when the goal is just `⊢ is_mul_hom f`

the instance `is_mul_hom.comp`

will still succeed, unifying `f`

with `f ∘ (λ x, x)`

. This causes type class inference to loop.
To avoid this, we do not make these lemmas instances.

## nolint_ge

Currently, the linter forbids the use of `>`

and `≥`

in definitions and
statements, as they cause problems in rewrites.
They are still allowed in statements such as `bounded (≥)`

or `∀ ε > 0`

or `⨆ n ≥ m`

,
and the linter allows that.
If you write a pattern where you bind two or more variables, like `∃ n m > 0`

, the linter will
flag this as illegal, but it is also allowed. In this case, add the line

```
@[nolint ge_or_gt] -- see Note [nolint_ge]
```

## open expressions

Some declarations work with open expressions, i.e. an expr that has free variables.
Terms will free variables are not well-typed, and one should not use them in tactics like
`infer_type`

or `unify`

. You can still do syntactic analysis/manipulation on them.
The reason for working with open types is for performance: instantiating variables requires
iterating through the expression. In one performance test `pi_binders`

was more than 6x
quicker than `mk_local_pis`

(when applied to the type of all imported declarations 100x).

## operator precedence of big operators

## Operator precedence of `∏`

and `∑`

There is no established mathematical convention
for the operator precedence of big operators like `∏`

and `∑`

.
We will have to make a choice.

Online discussions, such as https://math.stackexchange.com/q/185538/30839
seem to suggest that `∏`

and `∑`

should have the same precedence,
and that this should be somewhere between `*`

and `+`

.
The latter have precedence levels `70`

and `65`

respectively,
and we therefore choose the level `67`

.

In practice, this means that parentheses should be placed as follows:

```
∑ k in K, (a k + b k) = ∑ k in K, a k + ∑ k in K, b k →
∏ k in K, a k * b k = (∏ k in K, a k) * (∏ k in K, b k)
```

(Example taken from page 490 of Knuth's *Concrete Mathematics*.)

## partially-applied ext lemmas

When possible, `ext`

lemmas are stated without a full set of arguments. As an example, for bundled
homs `f`

, `g`

, and `of`

, `f.comp of = g.comp of → f = g`

is a better `ext`

lemma than
`(∀ x, f (of x) = g (of x)) → f = g`

, as the former allows a second type-specific extensionality
lemmas to be applied to `f.comp of = g.comp of`

.
If the domain of `of`

is `ℕ`

or `ℤ`

and `of`

is a `ring_hom`

, such a lemma could then make the goal
`f (of 1) = g (of 1)`

.

For bundled morphisms, there is a `ext`

lemma that always applies of the form
`(∀ x, ⇑f x = ⇑g x) → f = g`

. When adding type-specific `ext`

lemmas like the one above, we want
these to be tried first. This happens automatically since the type-specific lemmas are inevitably
defined later.

## simp-normal form

This note gives you some tips to debug any errors that the simp-normal form linter raises The reason that a lemma was considered faulty is because its left-hand side is not in simp-normal form. These lemmas are hence never used by the simplifier.

This linter gives you a list of other simp lemmas, look at them!

Here are some tips depending on the error raised by the linter:

'the left-hand side reduces to XYZ': you should probably use XYZ as the left-hand side.

'simp can prove this': This typically means that lemma is a duplicate, or is shadowed by another lemma:

2a. Always put more general lemmas after specific ones:

@[simp] lemma zero_add_zero : 0 + 0 = 0 := rfl @[simp] lemma add_zero : x + 0 = x := rfl

And not the other way around! The simplifier always picks the last matching lemma.

2b. You can also use @[priority] instead of moving simp-lemmas around in the file.

Tip: the default priority is 1000. Use

`@[priority 1100]`

instead of moving a lemma down, and`@[priority 900]`

instead of moving a lemma up.2c. Conditional simp lemmas are tried last, if they are shadowed just remove the simp attribute.

2d. If two lemmas are duplicates, the linter will complain about the first one. Try to fix the second one instead! (You can find it among the other simp lemmas the linter prints out!)

'try_for tactic failed, timeout': This typically means that there is a loop of simp lemmas. Try to apply squeeze_simp to the right-hand side (removing this lemma from the simp set) to see what lemmas might be causing the loop.

Another trick is to

`set_option trace.simplify.rewrite true`

and then apply`try_for 10000 { simp }`

to the right-hand side. You will see a periodic sequence of lemma applications in the trace message.

## use has_coe_t

We use the class `has_coe_t`

instead of `has_coe`

if the first argument is a variable,
or if the second argument is a variable not occurring in the first.
Using `has_coe`

would cause looping of type-class inference. See
https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/remove.20all.20instances.20with.20variable.20domain

## user attribute parameters

For performance reasons, it is inadvisable to use `user_attribute.get_param`

.
The parameter is stored as a reflected expression. When calling `get_param`

,
the stored parameter is evaluated using `eval_expr`

, which first compiles the
expression into VM bytecode. The unevaluated expression is available using
`user_attribute.get_param_untyped`

.

In particular, `user_attribute.get_param`

MUST NEVER BE USED in the
implementation of an attribute cache. This is because calling `eval_expr`

disables the attribute cache.

There are several possible workarounds:

- Set a different attribute depending on the parameter.
- Use your own evaluation function instead of
`eval_expr`

, such as e.g.`expr.to_nat`

. - Write your own
`has_reflect Param`

instance (using a more efficient serialization format). The`user_attribute`

code unfortunately checks whether the expression has the correct type, but you can use``(id %%e : Param)`

to pretend that your expression`e`

has type`Param`

.

## vector space definition

Vector spaces are defined as an `abbreviation`

for semimodules,
if the base ring is a field.
(A previous definition made `vector_space`

a structure
defined to be `module`

.)
This has as advantage that vector spaces are completely transparent
for type class inference, which means that all instances for semimodules
are immediately picked up for vector spaces as well.
A cosmetic disadvantage is that one can not extend vector spaces as such,
in definitions such as `normed_space`

.
The solution is to extend `semimodule`

instead.