Zulip Chat Archive

Stream: Is there code for X?

Topic: proposed notation for `EqUpToLittleO` etc

Alex Kontorovich (Jan 04 2024 at 07:48):

In number theory (and branches of analysis), it is very standard to claim that two functions are asymptotically equal, up to some factors, usually phrased with big-O or little-o. For example, (a standard version of) the Prime Number Theorem is that the prime counting function $\Pi(x)$ is equal to the logarithmic integral $Li(x)$ up to $O(xe^{-c \sqrt{\log x}})$ , for some $c>0$ . What do you all think of introducing notation and supporting API of the following kind (written here for the case of little-o)? This would allow, e.g., transitivity (and then calc blocks!) to function, looking closer to how things are written in natural language...

import Mathlib

namespace Asymptotics

variable {α : Type*} {E : Type*} {F : Type*} [NormedAddGroup E] [Norm F]

variable {f g : α → E} {h : α → F} {l : Filter α}

/--
  We write `f =ᵤ g upto o[l] h` to mean that `f - g =o[l] h`. We call this `EqUpToLittleO`
-/
notation:100 f " =ᵤ" g "upto o[" l "]" h :100 => IsLittleO l (f - g) h

lemma EqUpToLittleO.trans {k : α → E}
    (hfg : f =ᵤ g upto o[l] h)
    (hgk : g =ᵤ k upto o[l] h) :
    f =ᵤ k upto o[l] h := by
  rw [IsLittleO] at hfg hgk ⊢
  intro ε ε_pos
  have hfgε := @hfg (ε/2) (by linarith)
  have hgkε := @hgk (ε/2) (by linarith)
  rw [IsBigOWith] at hfgε hgkε ⊢
  filter_upwards [hfgε, hgkε]
  intro x _ _
  calc
    _ = ‖(f - g) x + (g - k) x‖ := by simp
    _ ≤ ‖(f - g) x‖ + ‖(g - k) x‖ := by apply norm_add_le
    _ ≤ ε / 2 * ‖h x‖ + ε / 2 * ‖h x‖ := by linarith
    _ = _ := by ring

end Asymptotics

David Loeffler (Jan 04 2024 at 08:28):

+1 from me!

Incidentally, if you change [NormedAddGroup E] to [NormedAddCommGroup E] then the entire proof telescopes to

lemma EqUpToLittleO.trans {k : α → E} (hfg : f =ᵤ g upto o[l] h) (hgk : g =ᵤ k upto o[l] h) :
    f =ᵤ k upto o[l] h := by simpa using hfg.add hgk

The fact that the long-hand proof works without assuming commutativity, but the simp one apparently doesn't, suggests that some of our simp lemmas are maybe not quite optimal.

David Loeffler (Jan 04 2024 at 08:38):

Well, actually the problem is not the simp lemmas as such, but that docs#Asymptotics.IsLittleO.add assumes a SeminormedAddCommGroup when a SeminormedAddGroup would apparently suffice. I wonder if anyone cares enough about non-commutative additive groups to fix this?

Michael Stoll (Jan 04 2024 at 11:19):

We probably want more general lemmas, too, where the two hs need not be the same, but one is big-O of the other...

Anatole Dedecker (Jan 04 2024 at 12:29):

I think Patrick Massot did some experiments with something similar a few months ago. I don’t have a lot of experience with this kind of reasoning, but indeed it looked like it would make a bunch of things easier, so I would be in favor of giving it a try. I think it deserves a new definition though, not just notation, so that we can develop a true API for it.

Patrick Massot (Jan 04 2024 at 14:23):

Yes, this is needed. I think all my code ended up at https://github.com/teorth/symmetric_project/blob/master/Test/EqModBigO.lean#L147.

Patrick Massot (Jan 04 2024 at 14:24):

Please feel free to use anything you see in that file. I don't have time to PR this to Mathlib but I would be grateful if someone else does it.

Terence Tao (Jan 04 2024 at 17:22):

Ah, this would be a good place to mention something I have been quietly working on right now as a possible companion to asymptotic notation, which is asymptotic notation with explicit constants. I have started the project at https://github.com/teorth/asymptotic . It has the following notation (see https://github.com/teorth/asymptotic/blob/master/Asymptotic/majorize.lean ):

namespace Asymptotics

variable {E : Type*} [SeminormedAddCommGroup E]

def Ll (C:NNReal) (X:E) (Y:ℝ) : Prop := ‖X‖ ≤ C * Y

notation:10 X " ≪[" C "] " Y => Ll C X Y

notation:10 X " =[" C "] " Y " + O(" Z ")" => Ll C (X-Y) Z

together with a bunch of basic API. For instance one has

lemma Int.floor_eqPlusBigO {x: ℝ} : ⌊x⌋ =[1] x + O(1)

and I am currently working on establishing

noncomputable def discretize (I: Set ℝ) : Finset ℤ
   := if h : Set.Finite {n:ℤ | (n:ℝ) ∈ I } then h.toFinset else ∅

lemma sum_approx_eq_integral {a b c:ℝ} (h: a ≤ b) (f: ℝ → E)
  (hcont: ContinuousOn f (Set.Icc a b))
  (hderiv: DifferentiableOn ℝ f (Set.Ioo a b))
  (hcont': ContinuousOn (deriv f) (Set.Icc a b))
  (hc: c ∈ Set.Icc a b) (I: Set ℝ)
  (hI: Set.Ioo a b ⊆ I) (hI': I ⊆ Set.Icc a b) :
  ∑ n in discretize I, f n =[1] ∫ t in I, f t ∂ volume
  + O( ‖f c‖ + ∫ t in I, ‖deriv f t‖ ∂ volume)

which is a basic tool that would formalize a lot of elementary analytic number theory arguments and also provide a way to get basic estimates on zeta type functions. The notation is compatible with Mathlib's bigO notation:

lemma isBigO_iff_ll {α : Type*} (l : Filter α) (X : α → E)
  (Y : α → ℝ) (hY: ∀ x, 0 ≤ Y x) :
  X =O[l] Y ↔ ∃ (C:NNReal), ∀ᶠ x in l, X x ≪[C] Y x

so one way to prove bigO type statements is to convert them to explicit constant bigO statements, use the API for that notation, and then move back to bigO at the end. I'm not sure if this approach adapts well to littleO type notation, but in many cases one can replace a littleO statement with a bigO statement with an explicit decay rate in which case this approach might be feasible.

Alex Kontorovich (Jan 04 2024 at 18:26):

I see, so if I understand correctly, the difference between, say, Asymptotics.isBigOWith (e.g.: Asymptotics.isBigOWith_iff) and what you're proposing is that you don't need to speak of a filter (until you convert to bigO)? But I thought the whole point of bigO is not to carry such constants around, as much as possible?

Patrick Massot (Jan 04 2024 at 18:30):

The difference isn't so much about the constant. The difference is that Ll is asking for an inequality everywhere, whereas bigO asks it only on some unspecified set belonging to the given filter.

Terence Tao (Jan 04 2024 at 18:42):

Yes, this is important in elementary number theory calculations, because one often needs estimates that hold uniformly for both large and small values of a parameter, and not just for sufficiently large values. A simple example: to control the sum $\sum_{n \leq x} \tau(n)$ of the divisor function $\tau(n) = \sum_{d|n} 1$ , one can compute

$\sum_{n \leq x} \tau(n) = \sum_{n \leq x} \sum_{d,m: n=dm} 1$

$= \sum_{d \leq x} \sum_{m \leq x/d} 1$

$= \sum_{d \leq x} (\frac{x}{d} + O(1))$

$= x \log x + O(x)$

and it is important here that the asympotic $\sum_{m \leq M} 1 = M + O(1)$ is valid even for very small $M$ , even if one is only interested in the original sum $\sum_{n \leq x} \tau(n)$ for sufficiently large $x$ . One could of course work with the trivial filter rather than the Frechet filter to force uniformity for all choices of parameters, but if one also wants to manipulate constants directly I think it can be convenient to switch, at least temporarily, from the filter-based notation to something more explicit when needed.

Alex Kontorovich (Jan 04 2024 at 19:02):

Agreed. So then can we hide the constant here as well? We would routinely say, $\lfloor x \rfloor = x + O(1)$ , with the understanding that it's valid regardless of the range of $x$ . Then you could have, say,

notation:10 X " = " Y " + O(" Z ")" => ∃ (C:NNReal), Ll C (X-Y) Z

Terence Tao (Jan 04 2024 at 22:58):

Unfortunately this is a trivial notion: for any real numbers X, Y, Z with Z non-zero, we would have X = Y + O(Z) with this definition because we just use ‖ X - Y ‖/Z for C. The problem is that Lean doesn't easily support the notion of constants C that don't depend on one or more of the ambient variables (and the key point here is that you want C to be independent of the parameters that define X, Y, and Z). There may be some very complicated and overengineered topos-theoretic way to get around this (see this previous discussion) and there is also possibility of "procrastinating" the constant C(either through new Lean features, or by packaging all the hypotheses on constants into one giant structure and verifying existence of choice of constants at the end) but I think a workable short-term fix is to have notation to temporarily introduce and then generalize away the constant C as needed.

Alex Kontorovich (Jan 04 2024 at 23:30):

Hmm I thought the workaround (in practice) is to phrase everything in terms of functions. So you wouldn't literally say ⌊x⌋=x+O(1), which I agree can be meaningless for a particular value of $x$ , but rather:

(fun x ↦ Real.floor x) = id + O(1 : ℝ → ℝ)

Wouldn't that solve the problem you're pointing to?...

Terence Tao (Jan 05 2024 at 02:46):

Yeah that would technically work and I think is close to Patrick's code. The price one pays is that if one wants to apply pointwise manipulations to these sorts of functions one has to keep using function extensionality (except perhaps for basic ring operations). For instance the preceding manipulations now become

$x \mapsto \sum_{n \leq x} \tau(n) = x \mapsto \sum_{n \leq x} \sum_{d,m: n=dm} 1$

$= x \mapsto \sum_{d \leq x} \sum_{m \leq x/d} 1$

$= x \mapsto \sum_{d \leq x} (\frac{x}{d} + O(\frac{x}{d} \mapsto 1))$

$= x \mapsto \sum_{d \leq x} (\frac{x}{d} + O(x,d \mapsto 1))$

$= x \log x + O(x \mapsto x)$

(note the need to do a "base change" during these manipulations) and this can be a little tricky to formalize.

Last updated: May 02 2025 at 03:31 UTC