Zulip Chat Archive
Stream: Is there code for X?
Topic: Convexity in terms of gradients
Geoffrey Irving (May 02 2025 at 21:10):
There's a bunch of API connecting docs#ConvexOn to docs#deriv (e.g., docs#ConvexOn.slope_le_leftDeriv), but I can't find equivalent API for the higher-dimensional case. What I specifically want is (barring typos):
import Mathlib

lemma ConvexOn.le_inner_gradient {V : Type} [NormedAddCommGroup V] [InnerProductSpace ℝ V]
    [CompleteSpace V] {f : V → ℝ} {s : Set V} (sc : Convex ℝ s) (fc : ConvexOn ℝ s f)
    {x y : V} (fd : DifferentiableAt ℝ f x) (xs : x ∈ s) (ys : y ∈ s) :
    f x - f y ≤ inner (gradient f x) (x - y) := by
  sorry
Does something like that exist?
Sébastien Gouëzel (May 03 2025 at 08:14):
I don't think we have that formulation. Note that taking the inner product with the gradient is the same as applying the derivative, so an equivalent formulation would be f x - f y ≤ fderiv ℝ f x (x - y) (or possibly reversing the direction of the inequality if you want it to be correct :-), which has the advantage of making sense in any real normed space. This should follow readily from applying the one-dimensional version along the line from x to y. Tell me if you need a proof!
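Spelled out, that reformulation would be stated along these lines (statement only, with sorry as a placeholder; the name ConvexOn.sub_le_fderiv is made up). Note that no inner product or completeness is needed, only a real normed space:
import Mathlib

-- Sketch of the suggested fderiv-based reformulation; name is hypothetical.
lemma ConvexOn.sub_le_fderiv {V : Type} [NormedAddCommGroup V] [NormedSpace ℝ V]
    {f : V → ℝ} {s : Set V} (sc : Convex ℝ s) (fc : ConvexOn ℝ s f) {x y : V}
    (fd : DifferentiableAt ℝ f x) (xs : x ∈ s) (ys : y ∈ s) :
    f x - f y ≤ fderiv ℝ f x (x - y) := by
  sorry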
Geoffrey Irving (May 03 2025 at 09:40):
No need for a proof, as you say it's a trivial consequence of 1D. Thanks!
Geoffrey Irving (May 03 2025 at 11:41):
I expect this can be golfed a bunch, but here it is. I decided to keep the direction of the inequality since it was already correct, though I did swap the order of subtraction.
import Mathlib

open Set

lemma ConvexOn.fderiv_le {V : Type} [NormedAddCommGroup V] [NormedSpace ℝ V] {f : V → ℝ}
    {s : Set V} (sc : Convex ℝ s) (fc : ConvexOn ℝ s f) {x y : V} (fd : DifferentiableAt ℝ f x)
    (xs : x ∈ s) (ys : y ∈ s) : fderiv ℝ f x (y - x) ≤ f y - f x := by
  -- Restrict f to the segment from x to y, so that g 0 = f x and g 1 = f y
  set g : ℝ → ℝ := f ∘ AffineMap.lineMap x y
  -- g is convex on [0, 1], since lineMap x y maps [0, 1] into s
  have gc : ConvexOn ℝ (Icc 0 1) g := by
    refine (fc.comp_affineMap _).subset ?_ (convex_Icc 0 1)
    exact fun _ ↦ Convex.lineMap_mem sc xs ys
  -- g is differentiable at 0 by the chain rule, since f is differentiable at x
  have gd : DifferentiableAt ℝ g 0 := by
    apply DifferentiableAt.comp
    · simp [fd]
    · apply AffineMap.differentiableAt
  trans g 1 - g 0
  -- 1D case: the derivative at the left endpoint is at most the slope over [0, 1]
  · have le : deriv g 0 ≤ slope g 0 1 := gc.deriv_le_slope (by simp) (by simp) zero_lt_one gd
    simp only [slope, sub_zero, inv_one, vsub_eq_sub, smul_eq_mul, one_mul] at le
    refine le_trans (le_of_eq ?_) le
    -- deriv g 0 is fderiv ℝ f x applied to the direction y - x, again by the chain rule
    simp only [g, ← fderiv_deriv]
    rw [fderiv_comp]
    · simp
    · simpa
    · apply AffineMap.differentiableAt
  · simp [g]
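As a quick sanity check, it specializes to 1D as expected (untested; Even.convexOn_pow and differentiable_pow are lemma names from memory):
-- t ↦ t ^ 2 is convex on all of ℝ, so the lemma applies with s = univ.
example (x y : ℝ) :
    fderiv ℝ (fun t : ℝ => t ^ 2) x (y - x) ≤ y ^ 2 - x ^ 2 :=
  (Even.convexOn_pow even_two).fderiv_le convex_univ
    (differentiable_pow 2).differentiableAt (Set.mem_univ x) (Set.mem_univ y)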
Geoffrey Irving (May 03 2025 at 11:42):
(The reason for the weird order is that I need it for https://arxiv.org/pdf/1909.05207#page=63.46, and that's the order used there.)
Sébastien Gouëzel (May 03 2025 at 11:53):
This looks great!
Eric Wieser (May 03 2025 at 11:58):
I guess we want a HasDerivAt version too?
Eric Wieser (May 03 2025 at 11:58):
(or perhaps instead?)
Geoffrey Irving (May 03 2025 at 11:59):
Yeah, HasDerivAt is probably better.
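(For a general normed space that would be HasFDerivAt.) Something like this should work, just repackaging the lemma above (untested sketch; the name ConvexOn.hasFDerivAt_le is made up):
-- Sketch of a HasFDerivAt version; name is hypothetical.
lemma ConvexOn.hasFDerivAt_le {V : Type} [NormedAddCommGroup V] [NormedSpace ℝ V]
    {f : V → ℝ} {f' : V →L[ℝ] ℝ} {s : Set V} (sc : Convex ℝ s) (fc : ConvexOn ℝ s f)
    {x y : V} (fd : HasFDerivAt f f' x) (xs : x ∈ s) (ys : y ∈ s) :
    f' (y - x) ≤ f y - f x := by
  -- Rewrite fderiv ℝ f x to f' and reuse ConvexOn.fderiv_le
  simpa only [fd.fderiv] using fc.fderiv_le sc fd.differentiableAt xs ys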
Full disclosure: turns out I don't need this for online gradient descent after all, since I should be phrasing things in terms of subgradients (where the conclusion is the definition :)). Pesky absolute value losses...
Last updated: Dec 20 2025 at 21:32 UTC