Zulip Chat Archive

Stream: mathlib4

Topic: BoundingSieve is a class??

Yury G. Kudryashov (May 23 2025 at 02:24):

I've just noticed that we have typeclasses docs#BoundingSieve and docs#SelbergSieve. I'm not an expert in this area, but it seems to me like there aren't going to be any "canonical" instances of these typeclasses. Should we turn them into structures?

Mario Carneiro (May 23 2025 at 02:43):

that seems to be using the technique of nullary typeclasses as fake parameters

Mario Carneiro (May 23 2025 at 02:44):

if you want to keep this trick localized, you could make it a structure and then attribute [local class] it in the file

Yury G. Kudryashov (May 23 2025 at 04:14):

IMHO, if we want the main theorem of the file to be useable outside of the file, then we should turn these typeclasses into structures.

Yury G. Kudryashov (May 23 2025 at 04:15):

If we use local class, then chances are some of the external-facing lemmas will take it as [BoundingSieve] instead of (s : BoundingSieve).

Mario Carneiro (May 23 2025 at 04:20):

I didn't look carefully. Indeed the main theorem should take the structure explicitly or not at all

Mario Carneiro (May 23 2025 at 04:24):

It seems this file doesn't really have what looks like a main theorem, in which case it makes sense to keep it as a global typeclass and just keep using a variable in downstream files which continue developing the theory

Mario Carneiro (May 23 2025 at 04:24):

this file seems to be a leaf right now though

Yury G. Kudryashov (May 23 2025 at 04:29):

@Arend Mellendijk What are your plans about this file?

Michael Rothgang (May 23 2025 at 06:01):

IIUC, #20008 adds the remaining results (which are used in the PNT+ project). Review of that PR and its deps has been slow...

Arend Mellendijk (May 23 2025 at 08:03):

Mario Carneiro said:

if you want to keep this trick localized, you could make it a structure and then attribute [local class] it in the file

I tried this when I first made it a class, but:

import Mathlib

structure Foo where
/- invalid attribute 'class', must be global -/
attribute [local class] Foo

Arend Mellendijk (May 23 2025 at 08:19):

Yury G. Kudryashov said:

Arend Mellendijk What are your plans about this file?

The plan for this file is to have only the main result be user-facing (plus a couple of the definitions used in the main theorem). This is selberg_bound in #20008, and it takes a SelbergSieve explicitly.

You're right that there is never a canonical instance of this class, and in fact every application that I know of defines a family of sieves.The intent is for the user to define the family they care about by making it a def. There should never be an instance of this class. I thought I put that in a docstring, but it must have got lost somewhere.

Robin Carlier (May 23 2025 at 08:22):

@Mario Carneiro can you expand on the local class thing? I am also facing a case in #24991 where for practical reasons I had to make a class something that would make more sense as a structure, and that would probably be solved by a local class, but like @Arend Mellendijk lean is unhappy with the syntax you gave.

Jz Pan (May 23 2025 at 08:47):

Do we have local class? A search on mathlib reveals that there are only local instance.

Robin Carlier (May 23 2025 at 09:05):

The reference manual section on type classes does not mention it (or anything like it) :(

Kevin Buzzard (May 23 2025 at 10:38):

There should never be an instance of this class.

Is this another way of saying "I miss parameters"?

Arend Mellendijk (May 23 2025 at 11:49):

I don't think I've ever had that thought, but maybe I wasn't using parameters correctly when I used Lean 3.

Arend Mellendijk (May 23 2025 at 11:50):

Though here's an argument in favour of keeping it a class. Once your parameters are fixed, the SelbergSieve instance becomes canonical, so in theory one could wrap your parameters around a nullary typeclass and create an instance [Parameters] : SelbergSieve. That way you can apply all of the theorems and definitions about the SelbergSieve without passing any parameters explicitly. Here's what that might look like:

import Mathlib

open SelbergSieve ArithmeticFunction
open scoped Nat.Prime

class PrimeSieve where
  N : ℕ
  y : ℝ
  hy : 1 ≤ y

noncomputable section Proof
open PrimeSieve

private instance [PrimeSieve] : SelbergSieve where
  support := Finset.range (N + 1)
  prodPrimes := primorial (Nat.floor y)
  prodPrimes_squarefree := sorry
  weights := fun _ => 1
  weights_nonneg := fun _ => zero_le_one
  totalMass := N
  nu := (ζ : ArithmeticFunction ℝ).pdiv .id
  nu_mult := by arith_mult
  nu_pos_of_prime := fun p hp _ => by
    simp[if_neg hp.ne_zero, Nat.pos_of_ne_zero hp.ne_zero]
  nu_lt_one_of_prime := fun p hp _ => sorry
  level := y
  one_le_level := hy

variable [PrimeSieve]

open Classical in
/- We get to use `BoundingSieve.siftedSum` for free. -/
theorem siftedSum_eq :
    siftedSum = ((Finset.range (N+1)).filter fun n ↦ ∀ p : ℕ, p.Prime → p ≤ y → ¬ p ∣ n).card := by
  sorry

theorem pi_le_siftedSum  : π N ≤ siftedSum + y := by
  sorry

theorem siftedSum_le :
    siftedSum ≤ 2 * N / Real.log y + 2 * y * (1+Real.log y)^3 := by
  /- Apply the main theorem of the Selberg Sieve. -/
  sorry

end Proof

theorem Nat.primeCounting_le (N : ℕ) (y : ℝ) (hy : 1 ≤ y) :
    π N ≤ 2 * N / Real.log y + 2 * y * (1+Real.log y)^3 + y := by
  let sieve : PrimeSieve := ⟨N, y, hy⟩
  have := pi_le_siftedSum
  have := siftedSum_le
  simp [sieve] at *
  linarith

Yury G. Kudryashov (May 23 2025 at 19:45):

IMHO, it means one of:

you can define a structure that holds params, then a typeclass that takes this structure as an argument;
you can use it as a structure and define custom constructors.

Yury G. Kudryashov (May 23 2025 at 19:46):

In either case, I would prefer to avoid using classes in place of Lean 3 parameters.

Arend Mellendijk (May 23 2025 at 20:31):

I'm afraid I don't understand what you mean by the first bullet point.

The situation is this: I have a (structure/class) holding the data and running assumptions of the main proof. There are about 8 intermediate definitions that are used for the fundamental theorem that depend on the variables in the class. On paper these have names like S, g, R, etc. These definitions are not hidden: they are used in the statement of the final theorem.

By using a class I don't have to pass the structure every time I write down one of these definitions (which could easily be 5 times on a single line). I also get to create hygienic local notation that mimics the paper proof.

Kevin Buzzard (May 23 2025 at 21:06):

FWIW I had a situation in FLT where I had several intermediate definitions which in the blueprint I called $E$ , $X$ , $Y$ , $T$ and $C$ (see the definitions in the section), and in the Lean I've just stuffed them away in the namespace NumberField.AdeleRing.DivisionAlgebra.Aux (see e.g. here). I could even have made them private and if this proof is ever upstreamed to FLT there might be a call to do that.

Arend Mellendijk (May 23 2025 at 21:23):

But those intermediate definitions don't show up in the final theorem, right? My definitions do, and in practise people will want to prove things like "For this family of sieves I've just defined, R d ≤ 2 ^ ω d"

Kevin Buzzard (May 23 2025 at 22:11):

Yes that's correct, my final theorem was "some space or other is compact" and these were intermediate also-compact things that we found along the way.

Jz Pan (May 24 2025 at 03:35):

Yury G. Kudryashov said:

IMHO, it means one of:

you can define a structure that holds params, then a typeclass that takes this structure as an argument;

you can use it as a structure and define custom constructors.

But this reminds me of docs#StateT in lean4 functional programming. Maybe we can reuse/reinvent a similar thing?

Yury G. Kudryashov (May 24 2025 at 12:26):

Arend Mellendijk said:

But those intermediate definitions don't show up in the final theorem, right? My definitions do, and in practise people will want to prove things like "For this family of sieves I've just defined, R d ≤ 2 ^ ω d"

How do you want a user to use your notation for a family of sieves with the typeclass setup? Did you consider using something like abbrev Sieve.R := _, then s.R d?

Arend Mellendijk (May 24 2025 at 13:38):

You're right that if you define your family as a def the notation won't play nice, but consider the setup I gave above:

Code

import Mathlib

open SelbergSieve ArithmeticFunction
open scoped Nat.Prime SelbergSieve.Notation

class PrimeSieve where
  N : ℕ
  z : ℝ
  one_le_z : 1 ≤ z

noncomputable section Proof
open PrimeSieve

private instance [PrimeSieve] : SelbergSieve where
  support := Finset.range (N + 1)
  prodPrimes := primorial (Nat.floor z)
  prodPrimes_squarefree := sorry
  weights := fun _ => 1
  weights_nonneg := fun _ => zero_le_one
  totalMass := N
  nu := (ζ : ArithmeticFunction ℝ).pdiv .id
  nu_mult := by arith_mult
  nu_pos_of_prime := fun p hp _ => by
    simp[if_neg hp.ne_zero, Nat.pos_of_ne_zero hp.ne_zero]
  nu_lt_one_of_prime := fun p hp _ => sorry
  level := z
  one_le_level := one_le_z

variable [PrimeSieve]

open Classical in

theorem y_eq : y = z := rfl

theorem X_eq : X = N := rfl

theorem ν_eq : ν = (ζ : ArithmeticFunction ℝ).pdiv .id := by rfl

theorem ν_apply (d : ℕ) : ν d = 1 / d := by
  simp [ν_eq]
  split_ifs with h <;> simp [h]

@[local simp]
theorem a_apply (d : ℕ) : a d = 1 := rfl

theorem A_eq : A = Finset.range (N+1) := rfl

theorem 𝒜_apply (d : ℕ) : 𝒜 d = ⌊(N / d : ℝ)⌋₊ := by
  simp only [multSum, a_apply, Finset.sum_boole, Nat.floor_nat, id_eq, Nat.cast_inj, A_eq]
  sorry

theorem R_eq (d : ℕ) : R d = ⌊(N/d : ℝ)⌋₊ - N / d := by
  rw [rem, X_eq, ν_apply, 𝒜_apply]
  ring

theorem R_le (d : Nat) : |R d| ≤ 1 := by
  rw [R_eq]
  apply Nat.abs_floor_sub_le
  positivity

end Proof

Arend Mellendijk (May 24 2025 at 13:42):

Did you consider using something like abbrev Sieve.R := _, then s.R d?

I have, but I admit I haven't tried to implement it. The s. seems like visual clutter in expressions that are already fairly large, especially when notation lets you write it exactly like you would on paper.

Kevin Buzzard (May 24 2025 at 14:32):

In FLT I got annoyed with this kind of visual clutter (maths said "let's write $D_{\mathbb{A}}$ for $D\otimes_K\mathbb{A}_K$ " and then after using notation in the naive way I had to write D_𝔸 D K which was horrible). But I fixed this with

set_option quotPrecheck false in
/-- `D_𝔸` is notation for `D ⊗[K] 𝔸_K`. -/
notation "D_𝔸" => (D ⊗[K] AdeleRing (𝓞 K) K)

which hasn't caused me any problems at all so far (although I have only used this in files where $D$ and $K$ really are "parameters" in the sense that they never change, but it sounds like you're in that situation too).

Arend Mellendijk (May 24 2025 at 15:17):

I used to do something similar (that is, unhygienic notation), and maybe this is a reasonable compromise, but it has one big downside: the notation only refers to the specific variables it uses. That means you can't make the notation public, or split the file multiple parts:

import Mathlib

structure Foo where
  long_name_we_want_to_call_X : ℕ

section Theory
variable (f : Foo)

set_option quotPrecheck false in
notation "X" => Foo.long_name_we_want_to_call_X f

/- This works as expected -/
example : X = sorry := by
  sorry

end Theory

/- ... some results not about `f` ... -/

section Theory
variable (f : Foo)

/- unknown constant 'f✝' -/
example : X = f.long_name_we_want_to_call_X := by
  sorry

end Theory

Yury G. Kudryashov (May 24 2025 at 15:18):

I still can't see how it works for a family of sieves.

Arend Mellendijk (May 24 2025 at 15:21):

In the code here, R_le is a statement about a family of sieves parameterised by N and z.

Yury G. Kudryashov (May 24 2025 at 15:22):

I see what you mean.

Yury G. Kudryashov (May 24 2025 at 15:23):

But you can't have 2 sieves in the same statement.

Yury G. Kudryashov (May 24 2025 at 15:23):

(e.g., to say that some sieve is better in some sense)

Yury G. Kudryashov (May 24 2025 at 15:24):

I don't know if there are any statements of this sort in this theory.

Arend Mellendijk (May 24 2025 at 15:37):

I don't know that this happens in practise. I feel that if you want to compare two sieves in a sense what you do is you compare the results they produce, rather than the specific values of the instances.

Mario Carneiro (May 24 2025 at 15:38):

you can't really do that either though, all the functions defined on top take it as a typeclass

Arend Mellendijk (May 24 2025 at 15:50):

The purpose of a sieve is generally to get some concrete (but possibly rough) bound on a function you care about, like $\pi(x) = O(x / \log x)$ . I get the impression that usually we "compare" two sieves only in an informal sense to decide which one to apply to get the result we want.

Also the word 'sieve' is overloaded here: it refers to both instance and class. In the literature one might consider the bounds produced by the large sieve, Brun's sieve, Selberg's sieve, etc., but when applying a sieve result you consider a family of objects which you might also call "sieve"

Mario Carneiro (May 24 2025 at 16:04):

I don't know what this word sieve means here TBH, it seems to be used in a completely contentless way and I can't make heads or tails of the data in the structures. Is there a theorem which is motivating this?

Kevin Buzzard (May 24 2025 at 16:12):

https://en.m.wikipedia.org/wiki/Selberg_sieve

Kevin Buzzard (May 24 2025 at 16:13):

It's the name of a technique in analytic number theory

Arend Mellendijk (May 24 2025 at 16:13):

That's a good overview.

The specific result I've formalised is often called the "Fundamental Theorem for Selberg's Sieve". A variant of the specific statement I've formalised is outlined by Heath-Brown here: https://arxiv.org/abs/math/0209360 (after Lemma 2.2)

Arend Mellendijk (May 24 2025 at 16:16):

Also to understand the data as I've defined it maybe the documentation in Mathlib is helpful:
https://github.com/leanprover-community/mathlib4/blob/b057c67c0905df3e5eabf70651a42625f05a56b9/Mathlib/NumberTheory/SelbergSieve.lean#L50-L79

Kevin Buzzard (May 24 2025 at 16:21):

The technique says "given an annoyingly large amount of data we can prove a mathematical statement involving numbers which you can work out from the data" and the point is presumably that supplying all the data as an input to every lemma gets old very quickly

Mario Carneiro (May 24 2025 at 16:22):

TBH I kind of agree with Yury that this should be used as a structure and not a class, I find it quite confusing as written what is a definition and what is a local variable because everything has tiny names

Kevin Buzzard (May 24 2025 at 16:23):

...which are probably standard in the domain

Mario Carneiro (May 24 2025 at 16:23):

standard in a proof more like

Mario Carneiro (May 24 2025 at 16:24):

this has the look of the internals of some argument

Kevin Buzzard (May 24 2025 at 16:24):

I mean "in the literature"

Kevin Buzzard (May 24 2025 at 16:24):

Like I call my primes p

Mario Carneiro (May 24 2025 at 16:25):

I don't find this kind of literal transcription of the paper mathematics that useful, except during the informal -> formal process

Mario Carneiro (May 24 2025 at 16:26):

for standalone mathlib proof maintenance I would cut way back on the bespoke notations

Kevin Buzzard (May 24 2025 at 16:26):

The objection to using a structure is the same as the objection I would raise if you were to tell me that I had to call my prime s.p throughout my argument

Mario Carneiro (May 24 2025 at 16:26):

you could have a local variable and call it p

Mario Carneiro (May 24 2025 at 16:26):

but I don't like definitions named p

Mario Carneiro (May 24 2025 at 16:27):

I think this kind of technique is fine if used in very local situations where no one ever has to look at the code

Mario Carneiro (May 24 2025 at 16:27):

but it's bad API design

Arend Mellendijk (May 24 2025 at 16:29):

There's some truth to Mario's point: a lot of these variable names differ author by author, though they are invariably one letter long and are only referred to as such.

Arend Mellendijk (May 24 2025 at 16:32):

And sadly the theorem itself is rather leaky - you define the base data (rather annoyingly many of them), then you prove that the intermediate definitions happen to have these nice properties, and you get a theorem at the end stated in terms of these intermediate definitions.

Mario Carneiro (May 24 2025 at 16:33):

I think most of that can be stated as functions taking one structure argument

Mario Carneiro (May 24 2025 at 16:34):

You are already using notations heavily to avoid having to write it anyway, so I don't see that this would make any difference for you

Arend Mellendijk (May 24 2025 at 16:37):

Sorry I'm confused, wasn't your objection to the use of notation? Or do you mean switch to unhygienic local notation so it doesn't seep into the API?

Mario Carneiro (May 24 2025 at 16:37):

yes that

Mario Carneiro (May 24 2025 at 16:39):

aggressively using notation where it's all local and confined to a single proof or file is fine, but there is some gradient from that to multi-file proofs to scoped notations, "domain standard" notations, and global notations

Mario Carneiro (May 24 2025 at 16:39):

as the scope increases more users have to be concerned with learning your notation

Mario Carneiro (May 24 2025 at 16:40):

and in API theorems this is especially problematic because these aren't users that have just gone through the multi page bootstrap section of the proof because they want to learn the proof, they just want to use it and all of the layers get in the way at that point

Arend Mellendijk (May 24 2025 at 16:41):

I'll reiterate this annoyance, but I understand the maintenance burden here might outweigh it.

Mario Carneiro (May 24 2025 at 16:42):

did you try set_option hygiene false there?

Arend Mellendijk (May 24 2025 at 16:44):

Ah! I hadn't. That only works with autoImplicit, doesn't it?

Mario Carneiro (May 24 2025 at 16:44):

no, it's unrelated

Arend Mellendijk (May 24 2025 at 16:44):

oh sorry, yeah that works.

Mario Carneiro (May 24 2025 at 16:44):

an unhygienic notation there will pick up whatever happens to be called f in the local scope

Arend Mellendijk (May 24 2025 at 16:52):

and in API theorems this is especially problematic because these aren't users that have just gone through the multi page bootstrap section of the proof because they want to learn the proof, they just want to use it and all of the layers get in the way at that point

I don't think this happens to be much of a concern for this specific theorem: the actual result is quite technical and phrased multiple different ways in the literature. To apply it you have to look at a bunch of definitions anyway, and the notation is documented very clearly.

That said I understand your fears, and I'll make a PR to switch it back to a structure.

Alex Kontorovich (May 24 2025 at 18:23):

I'm just a bystander in this discussion but I find it fascinating. Is there somewhere one can read about all these different design and implementation decisions, where/why some things are better than others (structure vs class, abbrev vs notation, etc etc)? Is there a concrete "yoga", or are people mostly speaking from experience in similar circumstances (heavy preliminaries that are annoying to have to carry around in each lemma)?

Yury G. Kudryashov (May 24 2025 at 18:24):

Generally, classes are reserved for structures that have canonical instances.

Mario Carneiro (May 24 2025 at 18:24):

I'm not sure there is even a clear difference between those two. Guidelines are just peoples experiences with what works or doesn't written down

Mario Carneiro (May 24 2025 at 18:25):

There is a second use of classes though Yury, which is this parameter thing. It's qualitatively different from the usual use of typeclasses and it has different rules / best practices

Yury G. Kudryashov (May 24 2025 at 18:26):

I guess, I never used classes this way, so I don't know what are the best practices here.

Mario Carneiro (May 24 2025 at 18:26):

I do wonder whether we should just revisit the possibility of re-adding parameter and possibly redesigning it to suit our uses

Mario Carneiro (May 24 2025 at 18:32):

It seems unlikely that this will happen without FRO support though, and I think first we need to have clear problem cases where the (ab)use of typeclasses causes problems. It's not good enough to just be violating a guideline like "there should only be one instance of a class in scope", we need actual demonstrable problems in a practical situation

Mario Carneiro (May 24 2025 at 18:34):

That particular guideline is there because of issues involving non-commuting diamonds in the algebraic hierarchy, and it's not clear that it still applies with these kind of local typeclasses. It might, but we need more data

Alex Kontorovich (May 24 2025 at 18:36):

Has someone given a talk on this topic perhaps (that one can watch on, say, YouTube), like at a "Lean Together" or something ?

Robin Carlier (May 24 2025 at 18:37):

I guess this is not happening without FRO support, but turning local class into a thing would be a solution maybe?
(Edit: I have in mind one more "classes as parameters" example: CatCommSq in CategoryTheory, although in this case there are sometimes some "canonical" instances)

Mario Carneiro (May 24 2025 at 18:38):

@Alex Kontorovich as far as I am aware, no; I'm not sure it's ever been studied in a systematic way

Mario Carneiro (May 24 2025 at 18:39):

I am reminded of @Heather Macbeth 's article on how "good style" is different between formal proofs and informal though

Mario Carneiro (May 24 2025 at 18:39):

but it doesn't get into the nitty gritty of mathlib best practices

Arend Mellendijk (Aug 04 2025 at 16:25):

I know I'm rather late here but #27820 turns BoundingSieve into a struct. I got hung up deciding what to do with the notation, but it was too much effort to get it to pretty print correctly (I'd have to define custom unexpanders for each local notation) so I've just removed the notation completely.

Yaël Dillies (Aug 04 2025 at 16:28):

Stupid question: Can't notation3 define the unexpanders for you?

Arend Mellendijk (Aug 04 2025 at 16:37):

I couldn't get it to work. The difficulty is that SelbergSieve extends BoundingSieve and I want notation that works whichever one I'm working with. You can do this by making the notation unhygienic and using dot notation, but this breaks the unexpanders.

Last updated: Feb 28 2026 at 14:05 UTC