Zulip Chat Archive
Stream: Is there code for X?
Topic: Convexity in terms of gradients
Geoffrey Irving (May 02 2025 at 21:10):
There's a bunch of API connecting docs#ConvexOn to docs#deriv (e.g., docs#ConvexOn.slope_le_leftDeriv), but I can't find equivalent API for the higher-dimensional case. What I specifically want is (barring typos):
import Mathlib

lemma ConvexOn.le_inner_gradient {V : Type} [NormedAddCommGroup V] [InnerProductSpace ℝ V]
    [CompleteSpace V] {f : V → ℝ} {s : Set V} (sc : Convex ℝ s) (fc : ConvexOn ℝ s f)
    {x y : V} (fd : DifferentiableAt ℝ f x) (xs : x ∈ s) (ys : y ∈ s) :
    f x - f y ≤ inner (gradient f x) (x - y) := by
  sorry
Does something like that exist?
Sébastien Gouëzel (May 03 2025 at 08:14):
I don't think we have that formulation. Note that taking the inner product with the gradient is the same as applying the derivative, so an equivalent formulation would be f x - f y ≤ fderiv ℝ f x (x - y) (or possibly reversing the direction of the inequality if you want it to be correct :-), which has the advantage of making sense in any real normed space. This should follow readily from applying the one-dimensional version along the line from x to y. Tell me if you need a proof!
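Spelled out, that reformulation would be stated along these lines (statement only, with sorry as a placeholder; the name ConvexOn.sub_le_fderiv is made up). Note that no inner product or completeness is needed, only a real normed space:
import Mathlib

-- Sketch of the suggested fderiv-based reformulation; name is hypothetical.
lemma ConvexOn.sub_le_fderiv {V : Type} [NormedAddCommGroup V] [NormedSpace ℝ V]
    {f : V → ℝ} {s : Set V} (sc : Convex ℝ s) (fc : ConvexOn ℝ s f) {x y : V}
    (fd : DifferentiableAt ℝ f x) (xs : x ∈ s) (ys : y ∈ s) :
    f x - f y ≤ fderiv ℝ f x (x - y) := by
  sorry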
Geoffrey Irving (May 03 2025 at 09:40):
No need for a proof, as you say it's a trivial consequence of 1D. Thanks!
Geoffrey Irving (May 03 2025 at 11:41):
I expect this can be golfed a bunch, but here it is. I decided to keep the direction of the inequality since it was already correct, though I did swap the order of subtraction.
import Mathlib

open Set

lemma ConvexOn.fderiv_le {V : Type} [NormedAddCommGroup V] [NormedSpace ℝ V] {f : V → ℝ}
    {s : Set V} (sc : Convex ℝ s) (fc : ConvexOn ℝ s f) {x y : V} (fd : DifferentiableAt ℝ f x)
    (xs : x ∈ s) (ys : y ∈ s) : fderiv ℝ f x (y - x) ≤ f y - f x := by
  -- Restrict f to the segment from x to y, so that g 0 = f x and g 1 = f y
  set g : ℝ → ℝ := f ∘ AffineMap.lineMap x y
  -- g is convex on [0, 1], since lineMap x y maps [0, 1] into s
  have gc : ConvexOn ℝ (Icc 0 1) g := by
    refine (fc.comp_affineMap _).subset ?_ (convex_Icc 0 1)
    exact fun _ ↦ Convex.lineMap_mem sc xs ys
  -- g is differentiable at 0 by the chain rule, since f is differentiable at x
  have gd : DifferentiableAt ℝ g 0 := by
    apply DifferentiableAt.comp
    · simp [fd]
    · apply AffineMap.differentiableAt
  trans g 1 - g 0
  -- 1D case: the derivative at the left endpoint is at most the slope over [0, 1]
  · have le : deriv g 0 ≤ slope g 0 1 := gc.deriv_le_slope (by simp) (by simp) zero_lt_one gd
    simp only [slope, sub_zero, inv_one, vsub_eq_sub, smul_eq_mul, one_mul] at le
    refine le_trans (le_of_eq ?_) le
    -- deriv g 0 is fderiv ℝ f x applied to the direction y - x, again by the chain rule
    simp only [g, ← fderiv_deriv]
    rw [fderiv_comp]
    · simp
    · simpa
    · apply AffineMap.differentiableAt
  · simp [g]
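As a quick sanity check, it specializes to 1D as expected (untested; Even.convexOn_pow and differentiable_pow are lemma names from memory):
-- t ↦ t ^ 2 is convex on all of ℝ, so the lemma applies with s = univ.
example (x y : ℝ) :
    fderiv ℝ (fun t : ℝ => t ^ 2) x (y - x) ≤ y ^ 2 - x ^ 2 :=
  (Even.convexOn_pow even_two).fderiv_le convex_univ
    (differentiable_pow 2).differentiableAt (Set.mem_univ x) (Set.mem_univ y)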
Geoffrey Irving (May 03 2025 at 11:42):
(The reason for the weird order is that I need it for https://arxiv.org/pdf/1909.05207#page=63.46, and that's the order used there.)
Sébastien Gouëzel (May 03 2025 at 11:53):
This looks great!
Eric Wieser (May 03 2025 at 11:58):
I guess we want a HasDerivAt version too?
Eric Wieser (May 03 2025 at 11:58):
(or perhaps instead?)
Geoffrey Irving (May 03 2025 at 11:59):
Yeah, HasDerivAt is probably better.
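(For a general normed space that would be HasFDerivAt.) Something like this should work, just repackaging the lemma above (untested sketch; the name ConvexOn.hasFDerivAt_le is made up):
-- Sketch of a HasFDerivAt version; name is hypothetical.
lemma ConvexOn.hasFDerivAt_le {V : Type} [NormedAddCommGroup V] [NormedSpace ℝ V]
    {f : V → ℝ} {f' : V →L[ℝ] ℝ} {s : Set V} (sc : Convex ℝ s) (fc : ConvexOn ℝ s f)
    {x y : V} (fd : HasFDerivAt f f' x) (xs : x ∈ s) (ys : y ∈ s) :
    f' (y - x) ≤ f y - f x := by
  -- Rewrite fderiv ℝ f x to f' and reuse ConvexOn.fderiv_le
  simpa only [fd.fderiv] using fc.fderiv_le sc fd.differentiableAt xs ys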
Full disclosure: turns out I don't need this for online gradient descent after all, since I should be phrasing things in terms of subgradients (where the conclusion is the definition :)). Pesky absolute value losses...
Last updated: Dec 20 2025 at 21:32 UTC