Zulip Chat Archive

Here I just want to summarize few things about variational calculus and hopefully dispel some confusion about it. Hopefully I will also manage to pinpoint the difficulty compared to the finite dimensional case.

The canonical example of variational calculus is that the function $u : ℝ → ℝ$ that minimizes the functional

$E[u] = \int \frac12 \|u'(x)\|^2 + u(x)f(x) \, dx$

satisfies the Poisson equation

$\Delta u(x) = f(x)$

where $f : ℝ → ℝ$ is some given function.

The calculus of variation allows you to show that

$\frac{\delta E}{\delta u} = \Delta u - f$

thus the Poisson equation is $\frac{\delta E}{\delta u} = 0$ .

(I'm being purposefully imprecise here as I do not mention the integration domain nor boundary conditions. This is usually done in physics textbooks as they often want to derive only the differential equation and not the boundary conditions.)

Finite dimensional analog

Let's have a look at analogous finite dimensional problem. Discretized version of the above energy, for $u \in \R^n$ , is

$F(u) = \sum_i (u_{i+1} - u_i)^2 + u_i f_i$

in Lean

def F (u : PiLp 2 (fun _ : Fin 10 => ℝ)) : ℝ := ∑ i, ((u (i+1) - u i)^2 + u i * f i)

The necessary condition for minimum, $\nabla F(u) = 0$ , can be expressed with fderiv or with gradient

theorem minimum_condition_1 (u) : (F u = ⨅ u', F u') → fderiv ℝ F x = 0 := sorry

theorem minimum_condition_2 (u) : (F u = ⨅ u', F u') → gradient F x = 0 := sorry

the difference between the two is not interesting in finite dimension but in infinite dimension fderiv leads to so called weak form of the differential equation and gradient leads to the strong form of the differential equation. For exposition we keep the distinction in finite dimensional case too. Let's define the weak and strong forms

def weakForm (u v : PiLp 2 (fun _ : Fin 10 => ℝ)) : ℝ :=
  ∑ i, ((u (i+1) - u i) * (v (i+1) - v i) - v i * f i)

def strongForm (u : PiLp 2 (fun _ : Fin 10 => ℝ)) : PiLp 2 (fun _ : Fin 10 => ℝ) :=
  fun i => u (i+1) - 2 * u i + u (i-1) - u i * f i

Notice that u (i+1) - 2 * u i + u (i-1) in strongForm the discrete version of Laplace operator $\Delta$ .

We can show that fderiv is equal to the weakForm and gradient is equal to the strongForm

theorem fderiv_eq_weakForm : fderiv ℝ F u v = 2 * weakForm u v := sorry

theorem gradient_eq_strongForm : gradient F u = - strongForm u := sorry

The equation strongForm u = 0 is:

∀ i, u (i+1) - 2 * u i + u (i-1) = u i * f i

which is discretized version of Poisson equation.

Operator view

The above equation are much more readable in matrix/operator form. Let's introduce matrix that takes difference of consecutive values (analog of derivative operator)

def D : (PiLp 2 (fun _ : Fin 10 => ℝ)) →L[ℝ] (PiLp 2 (fun _ : Fin 10 => ℝ)) :=
  ⟨⟨⟨fun x => fun i => x (i+1) - x i,sorry⟩,sorry⟩, sorry⟩

and its negative transpose

def D' : (PiLp 2 (fun _ : Fin 10 => ℝ)) →L[ℝ] (PiLp 2 (fun _ : Fin 10 => ℝ)) :=
  ⟨⟨⟨fun x => fun i => x i - x (i-1),sorry⟩,sorry⟩, sorry⟩

D' is just a different way of discretizing derivative and we can show that D.adjoint = - D'.

Now the above equations can be written as:

def F' (x) := ‖D x‖^2
def weakForm' (x u) := ⟪D x, D u⟫_ℝ
def strongForm' (x) := D' (D x)
theorem minimum_weakForm' : fderiv ℝ F' x u = 2 * weakForm' x u := sorry
theorem minimum_strongForm' : gradient F' x = - strongForm' x := sorry

i.e. it boils down to showing these two simple theorems

theorem fderiv_norm2 : fderiv ℝ (fun u => ‖D u‖^2) u v = 2 * ⟪D u, D v⟫_ℝ := sorry
theorem gradient_norm2 : gradient (fun u => ‖D u‖^2) u = - D' (D u) := sorry

and showing that D.adjoint = - D'.

In infinite dimensional case, the calculation is basically the same. The difficulty is to figure out the setting(i.e. appropriate function spaces) such that the calculation can be made formal.

Full code in finite dimension. I believe that the code can be made sorry free.

code

import Mathlib

noncomputable section

namespace FiniteDimensional

opaque f : PiLp 2 (fun _ : Fin 10 => ℝ)

def F (u : PiLp 2 (fun _ : Fin 10 => ℝ)) : ℝ := ∑ i, ((u (i+1) - u i)^2 - u i * f i)

theorem minimum_condition_1 (u) : (F u = ⨅ u', F u') → fderiv ℝ F x = 0 := sorry

theorem minimum_condition_2 (u) : (F u = ⨅ u', F u') → gradient F x = 0 := sorry

def weakForm (u v : PiLp 2 (fun _ : Fin 10 => ℝ)) : ℝ :=
  ∑ i, ((u (i+1) - u i) * (v (i+1) - v i) - v i * f i)

def strongForm (u : PiLp 2 (fun _ : Fin 10 => ℝ)) : PiLp 2 (fun _ : Fin 10 => ℝ) :=
  fun i => u (i+1) - 2 * u i + u (i-1) - u i * f i

theorem fderiv_eq_weakForm : fderiv ℝ F x u = 2 * weakForm x u := sorry

theorem gradient_eq_strongForm : gradient F x = - strongForm x := sorry

def D : (PiLp 2 (fun _ : Fin 10 => ℝ)) →L[ℝ] (PiLp 2 (fun _ : Fin 10 => ℝ)) :=
  ⟨⟨⟨fun x => fun i => x (i+1) - x i,sorry⟩,sorry⟩, sorry⟩

def D' : (PiLp 2 (fun _ : Fin 10 => ℝ)) →L[ℝ] (PiLp 2 (fun _ : Fin 10 => ℝ)) :=
  ⟨⟨⟨fun x => fun i => x i - x (i-1),sorry⟩,sorry⟩, sorry⟩

open InnerProductSpace

def F' (u) := ‖D u‖^2
def weakForm' (u v) := ⟪D u, D v⟫_ℝ
def strongForm' (x) := D' (D x)

theorem fderiv_norm2 : fderiv ℝ (fun u => ‖D u‖^2) u v = 2 * ⟪D u, D v⟫_ℝ := sorry
theorem gradient_norm2 : gradient (fun u => ‖D u‖^2) u = - D' (D u) := sorry

end FiniteDimensional

Difficulties in infinite dimensional case.

We can define the energy

def E (u : ℝ → ℝ) : ℝ := ∫ x, ((deriv u x)^2 + u x * f x)

but we get into trouble immediately. The integral is meaninful only under some integrability and differentiability conditions on u which also requires to specify the domain over which we integrate. Furthermore, the necessary conditions

theorem minimum_condition_1 (u) : (F u = ⨅ u', F u') → fderiv ℝ F u = 0 := sorry
theorem minimum_condition_2 (u) : (F u = ⨅ u', F u') → gradient F u = 0 := sorry

do not even type check as ℝ → ℝ is not NormedSpace for fderiv or InnerProductSpace for gradient.

The standard approach is to restrict E to some appropriate Sobolev space which gives you inner product space structure. In this setting, you should be able to also show the reverse implication gradient F u = 0 → (F u = ⨅ u', F u') with some convexity and coercivity condition on F.

Just for computing the derivative, I think some approach based on test functions and distributions should be possible too. I also have some ideas how to to this using the theory of convenient vector spaces. In both of these cases, you do not have inner product space so one has to be a bit more careful how to define gradient but I will write about that another time.

Tomas Skrivan (Jun 13 2025 at 11:01):

I would like to have a formal setting in which we can do the operator view reasoning because then I know how to write a tactic that automates the computation. Meanwhile we can do the calculation by hand. Here is a skeleton how to to the basic calculation for Poisson equation on [0,1] interval:

import Mathlib

noncomputable section

open Set

variable
  (u δu f : ℝ → ℝ)
  (hu : Differentiable ℝ u) (hδu : Differentiable ℝ δu) (hf : Continuous f)
  (hb₀ : δu 0 = 0) (hb₁ : δu 1 = 0) -- boundary conditions

def E (f u : ℝ → ℝ) := ∫ x in Icc 0 1, 1/2 * (deriv u x)^2 + u x * f x
def B (f u v : ℝ → ℝ) := ∫ x in Icc 0 1, deriv u x * deriv v x + v x * f x

def strongForm (f u : ℝ → ℝ) (x : ℝ) := (- iteratedDeriv 2 u x + f x)

theorem minimum_weakForm :
    (deriv (fun t : ℝ => E f (u + t • δu)) 0)  -- this is more or less the definition of  the symbol `δE/δu`
    =
    B f u δu := by
  unfold E B
  -- 1. exchange integral and derivative
  -- 2. compute simple derivative
  -- 3. clean up expression
  sorry

theorem minimum_strongForm (u δu f : ℝ → ℝ) (hu : Differentiable ℝ u) (hδu : Differentiable ℝ δu) :
    (deriv (fun t : ℝ => E f (u + t • δu)) 0)
    =
    ∫ x in Icc 0 1, strongForm f u x * δu x := by
  simp (disch:=assumption) [minimum_weakForm]
  unfold B strongForm
  -- 1. apply per-partes -- on discrete level this is the step where we show that `D.adjoint = - D'`
  -- 2. remove boundary terms using boundary conditions
  -- 3. clean up expression
  sorry

Then we proceed with the fundamental lemma of variational calculus which states

(∀ δu, ∫ x in Icc 0 1, g x * δu x = 0) → ∀ x in Icc 0 1, g x = 0

for g = strongForm f u we conclude that

iteratedDeriv 2 u x = f x

Tomas Skrivan (Jun 13 2025 at 11:53):

Sketch how to use theory of distributions

Here is a sketch how to set things up to do the calculation with test functions and dsitributions such that you end up with operators for which I know how to write tactic that automates the calculations.

The following is only a sketch! Let 𝒟 ℝ is the space of test functions in reals and 𝒟' ℝis the space if distributions. We can convert between the two with

def toDistrib (f : 𝒟 ℝ) : 𝒟' ℝ := fun φ => ∫ x, f x * φ x
def fromDistrib (u : 𝒟' ℝ) : 𝒟 ℝ := if h : ∃ φ, toDistrib φ = u then choose h else 0

of course fromDistrib is only one way inverse, not every distribution is test function.

There is a well defined notion of adjoint for functions between test functions which produces function between distributions

def adjoint (F : (𝒟 ℝ) → 𝒟 ℝ) : (𝒟' ℝ) → 𝒟' ℝ := fun u φ => u (F φ)

We can lift this to adjoint' that operates only test functions only by using fromDistrib

def adjoint' (F : (𝒟 ℝ) → 𝒟 ℝ) : (𝒟 ℝ) → 𝒟 ℝ := fun u => fromDistrib (adjoint F (toDistrib u))

Not every function has meaningful adjoint this way but many do. For example adjoint' deriv = - deriv.

Now we can define variational gradient:

def varGradient (F : 𝒟 ℝ → ℝ) (ϕ : 𝒟 ℝ) : 𝒟 ℝ :=
  adjoint' ((fun c _ => c) ∘ fderiv ℝ F ϕ) (fun _ => 1)

this is the function that is usually denoted with δ.

I believe if everything is made formal one can then show that

theorem varGradient_dirichelt_energy (f ϕ : 𝒟 ℝ)  :
  varGradient (fun ϕ => ∫ x, 1/2*(deriv ϕ x)^2 + ϕ x * f x) ϕ
  =
  - iteratedDeriv 2 φ + f := sorry

and importantly I know how to write a tactic for fderiv and adjoint' such that this is proven automatically by that tactic.

Joseph Tooby-Smith (Jun 14 2025 at 06:00):

Hi @Tomas Skrivan , many many thanks for writing this out! For me at least, this clears up my understanding of the things involved, so thank you for explaining!

Tomas Skrivan (Jun 16 2025 at 12:33):

Here is a concrete plan to do mechanistically computations in variational calculus and I will demonstrate it on derivation of Euler-Lagrange equations.

This nice thing is that this will lead to an actual derivation of Euler-Lagrange equations from action. Once I port data_synth tactic it will take action as an input and spit out Euler-Lagrange equations and a proof that it is the variational gradient of action.

What I want to show is that if we have the action functional

$S(u) = \int L(u(t),\dot u(t)) \, dt$

then

$\frac{\delta S}{\delta u} = \frac{\partial L}{\partial x}(u,\dot u) - \frac{d}{dt} \frac{\partial L}{\partial \dot x}(u,\dot u)$

There are some issues that I do not know how to resolve regarding the integration domain. Therefore instead of working with $S$ I will be working with $S'$

$S'(u,t) = L(u(t),\dot u(t))$

i.e. $S$ just without the integral and I will define a meaning for $\frac{\delta S'}{\delta u}$ instead of for $\frac{\delta S}{\delta u}$ and I will show that

$\frac{\delta S'}{\delta u}(u,t) = \frac{\partial L}{\partial x}(u(t),\dot u(t)) - \frac{d}{dt} \frac{\partial L}{\partial \dot x}(u(t),\dot u(t))$

The Lean statement of this equation is

theorem euler_lagrange (L : ℝ → ℝ → ℝ) (u : ℝ → ℝ)
    (hL : ContDiff ℝ ∞ (↿L)) (hu : ContDiff ℝ ∞ u) :
    HasVarGradientAt
      (fun u t => L (u t) (deriv u t))
      (fun t =>
        deriv (L · (deriv u t)) ((u t))
        -
        deriv (fun t' => deriv (L (u t') ·) (deriv u t')) t)
      u

where HasVarGradientAt has signature

   HasVarGradientAt (S' : (ℝ → ℝ) → (ℝ → ℝ)) (grad : ℝ → ℝ) (u : ℝ → ℝ) : Prop

saying that the function S' has variational gradient grad at point u. It is formulated with the Has... idiom instead of varGrad S' = ... because there are some problems with uniqueness as grad is most likely uniquely determined only almost everywhere.

The definition of HasVarGradientAt is:

def HasVarGradientAt (S' : (ℝ → ℝ) → (ℝ → ℝ)) (grad : ℝ → ℝ) (u : ℝ → ℝ) :=
  ∃ F',
    HasAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F'
    ∧
    F' (fun _ => 1) = grad

similarly to normal gradient we take an adjoint of derivative and evaluate it at one (at constant one function in this case). The derivative is just normal derivative of S' at u in the direction δu and then we take an adjoint w.r.t. δu. The adjoint is a spacial notion inspired by theory of distributions:

structure HasAdjoint (F : (ℝ → ℝ) → (ℝ → ℝ)) (F' : (ℝ → ℝ) → (ℝ → ℝ)) where
  test_fun_preserving  : ∀ φ, IsTestFunction φ → IsTestFunction (F φ)
  test_fun_preserving' : ∀ φ, IsTestFunction φ → IsTestFunction (F' φ)
  adj : ∀ φ ψ, IsTestFunction φ → IsTestFunction ψ →
    ∫ x, F φ x * ψ x = ∫ x, φ x * F' ψ x

i.e. F has adjoint F' if both F and F' preserve test functions and ⟪F φ, ψ⟫ = ⟪φ, F' ψ⟫ where the "inner product" ⟪·, ·⟫ is meaningful only on test functions.

The computation of the adjoint in the proof of euler_lagrange is

    apply has_adjoint_ext _ _ _
    · apply add_rule
      · apply mul_left_rule
        apply id_rule
        fun_prop [deriv]
      · apply mul_left_rule
        apply deriv_rule
        fun_prop [deriv]
    · intros; rfl

for which I know how to write a tactic to fully automate.

Here is the full code

code

import Mathlib

noncomputable section

open Set

open ContDiff

structure IsTestFunction (f : ℝ → ℝ) where
  smooth : ContDiff ℝ ∞ f
  supp : HasCompactSupport f

theorem IsTestFunction.integrable {f} (hf : IsTestFunction f) : MeasureTheory.Integrable f := sorry

/-- Function `F` transforming functions has adjoint in distributional sense but we require that
its adjoint also maps test function to test functions.

The canonical example is the function `F = deriv` that has adjoint `F' = - deriv`. -/
structure HasAdjoint (F : (ℝ → ℝ) → (ℝ → ℝ)) (F' : (ℝ → ℝ) → (ℝ → ℝ)) where
  test_fun_preserving  : ∀ φ, IsTestFunction φ → IsTestFunction (F φ)
  test_fun_preserving' : ∀ φ, IsTestFunction φ → IsTestFunction (F' φ)
  adj : ∀ φ ψ, IsTestFunction φ → IsTestFunction ψ →
    ∫ x, F φ x * ψ x = ∫ x, φ x * F' ψ x

namespace HasAdjoint

theorem id_rule : HasAdjoint (fun φ => φ) (fun φ => φ) where
  test_fun_preserving  _ hφ := hφ
  test_fun_preserving' _ hφ := hφ
  adj _ _ _ _ := rfl

theorem comp_rule (F G F' G') (hF : HasAdjoint F F') (hG : HasAdjoint G G') :
    HasAdjoint (fun φ => F (G φ)) (fun φ => G' (F' φ)) where
  test_fun_preserving  _ hφ := hF.test_fun_preserving _ (hG.test_fun_preserving _ hφ)
  test_fun_preserving' _ hφ := hG.test_fun_preserving' _ (hF.test_fun_preserving' _ hφ)
  adj φ ψ hφ hψ := by
    rw [hF.adj _ _ (hG.test_fun_preserving φ hφ) hψ]
    rw [hG.adj _ _ hφ (hF.test_fun_preserving' _ hψ)]

theorem deriv_rule :
    HasAdjoint (fun φ => deriv φ) (fun φ => - deriv φ) where
  test_fun_preserving  _ hφ := by
    have ⟨h,h'⟩ := hφ
    constructor
    · fun_prop
    · exact HasCompactSupport.deriv h'
  test_fun_preserving' _ hφ := by
    have ⟨h,h'⟩ := hφ
    constructor
    · eta_expand; dsimp; fun_prop
    · apply HasCompactSupport.neg'
      apply HasCompactSupport.deriv h'
  adj φ ψ hφ hψ := by
    dsimp
    sorry -- integration per partes


theorem has_adjoint_ext (F : (ℝ → ℝ) → (ℝ → ℝ)) (F' F'' : (ℝ → ℝ) → (ℝ → ℝ))
  (h : HasAdjoint F F') (h' : ∀ φ, IsTestFunction φ → F' φ = F'' φ) : HasAdjoint F F'' := sorry

theorem add_rule (F G : (ℝ → ℝ) → (ℝ → ℝ)) (F' G' : (ℝ → ℝ) → (ℝ → ℝ))
    (hF : HasAdjoint F F') (hG : HasAdjoint G G') :
    HasAdjoint (fun φ x => F φ x + G φ x) (fun φ x => F' φ x + G' φ x) where
  test_fun_preserving  _ hφ := by
    have ⟨h,h'⟩ := hφ
    constructor
    · apply ContDiff.add
      apply (hF.test_fun_preserving _ hφ).smooth
      apply (hG.test_fun_preserving _ hφ).smooth
    · apply HasCompactSupport.add
      apply (hF.test_fun_preserving _ hφ).supp
      apply (hG.test_fun_preserving _ hφ).supp
  test_fun_preserving' _ hφ :=  by
    have ⟨h,h'⟩ := hφ
    constructor
    · apply ContDiff.add
      apply (hF.test_fun_preserving' _ hφ).smooth
      apply (hG.test_fun_preserving' _ hφ).smooth
    · apply HasCompactSupport.add
      apply (hF.test_fun_preserving' _ hφ).supp
      apply (hG.test_fun_preserving' _ hφ).supp
  adj _ _ _ _ := by
    simp[right_distrib,left_distrib]
     -- use test functions are integralble
    rw[MeasureTheory.integral_add sorry sorry]
    rw[MeasureTheory.integral_add sorry sorry]
    rw[hF.adj _ _ (by assumption) (by assumption)]
    rw[hG.adj _ _ (by assumption) (by assumption)]

theorem mul_left_rule (F : (ℝ → ℝ) → (ℝ → ℝ)) (ψ : ℝ → ℝ) (F' : (ℝ → ℝ) → (ℝ → ℝ))
    (hF : HasAdjoint F F') (hψ : ContDiff ℝ ∞ ψ) :
    HasAdjoint (fun φ x => ψ x * F φ x) (fun φ x => F' (fun x => ψ x * φ x) x) := sorry

theorem mul_right_rule (F : (ℝ → ℝ) → (ℝ → ℝ)) (ψ : ℝ → ℝ) (F' : (ℝ → ℝ) → (ℝ → ℝ))
    (hF : HasAdjoint F F') (hψ : ContDiff ℝ ∞ ψ) :
    HasAdjoint (fun φ x => F φ x * ψ x) (fun φ x => F' (fun x => φ x * ψ x) x) := sorry

end HasAdjoint
open HasAdjoint

def HasVarGradientAt (S' : (ℝ → ℝ) → (ℝ → ℝ)) (grad : ℝ → ℝ) (u : ℝ → ℝ) :=
  ∃ F',
    HasAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F'
    ∧
    F' (fun _ => 1) = grad


theorem euler_lagrange (L : ℝ → ℝ → ℝ) (u : ℝ → ℝ)
    (hL : ContDiff ℝ ∞ (↿L)) (hu : ContDiff ℝ ∞ u) :
    HasVarGradientAt
      (fun u t => L (u t) (deriv u t))
      (fun t =>
        deriv (L · (deriv u t)) ((u t))
        -
        deriv (fun t' => deriv (L (u t') ·) (deriv u t')) t)
      u := by
  apply Exists.intro
  apply And.intro
  · -- compute derivative
    have h : (fun δu t => deriv (fun s : ℝ => (fun u t => L (u t) (deriv u t)) (u + s • δu) t) 0)
             =
             fun v t =>
               deriv (L · (deriv u t)) (u t) * v t
               +
               deriv (L (u t) ·) (deriv u t) * deriv v t := by
       -- this will be automated by a tactic to compute derivatives
      sorry
    rw[h]

    -- compute adjoint
    apply has_adjoint_ext _ _ _
    · apply add_rule
      · apply mul_left_rule
        apply id_rule
        fun_prop [deriv]
      · apply mul_left_rule
        apply deriv_rule
        fun_prop [deriv]
    · intros; rfl

  · simp[sub_eq_add_neg]

I believe that all sorry can be filled in and if someone would do that I would really appreciate it :)

Joseph Tooby-Smith (Jun 16 2025 at 12:46):

A proof for the first sorry is:

theorem IsTestFunction.integrable {f} (hf : IsTestFunction f) :
    MeasureTheory.Integrable f :=
  Continuous.integrable_of_hasCompactSupport (continuous hf.smooth) hf.supp

Joseph Tooby-Smith (Jun 16 2025 at 13:16):

All but last sorry (if I fill any more I will just edit the code below):

Header

import Mathlib

noncomputable section

open Set

open ContDiff
@[fun_prop]
structure IsTestFunction (f : ℝ → ℝ) where
  smooth : ContDiff ℝ ∞ f
  supp : HasCompactSupport f

@[fun_prop]
theorem IsTestFunction.integrable {f} (hf : IsTestFunction f) :
    MeasureTheory.Integrable f :=
  Continuous.integrable_of_hasCompactSupport (continuous hf.smooth) hf.supp

@[fun_prop]
theorem IsTestFunction.differentiable {f} (hf : IsTestFunction f) :
    Differentiable ℝ f := hf.1.differentiable ENat.LEInfty.out

@[fun_prop]
theorem IsTestFunction.contDiff {f} (hf : IsTestFunction f) :
    ContDiff ℝ ∞ f := hf.1

@[fun_prop]
def IsTestFunction.mul {f g} (hf : IsTestFunction f) (hg : IsTestFunction g) :
    IsTestFunction (f * g) where
  smooth := ContDiff.mul hf.smooth hg.smooth
  supp := HasCompactSupport.mul_left hg.supp

@[fun_prop]
def IsTestFunction.deriv {f} (hf : IsTestFunction f) :
    IsTestFunction (deriv f) where
  smooth := deriv' hf.smooth
  supp := HasCompactSupport.deriv hf.supp

@[fun_prop]
def IsTestFunction.mul_left {f g} (hf : ContDiff ℝ ∞ f) (hg : IsTestFunction g) :
    IsTestFunction (f * g) where
  smooth := ContDiff.mul hf hg.smooth
  supp := HasCompactSupport.mul_left hg.supp

@[fun_prop]
def IsTestFunction.mul_right {f g} (hf : IsTestFunction f) (hg : ContDiff ℝ ∞  g) :
    IsTestFunction (f * g) where
  smooth := ContDiff.mul hf.smooth hg
  supp := HasCompactSupport.mul_right hf.supp



/-- Function `F` transforming functions has adjoint in distributional sense but we require that
its adjoint also maps test function to test functions.

The canonical example is the function `F = deriv` that has adjoint `F' = - deriv`. -/
structure HasAdjoint (F : (ℝ → ℝ) → (ℝ → ℝ)) (F' : (ℝ → ℝ) → (ℝ → ℝ)) where
  test_fun_preserving  : ∀ φ, IsTestFunction φ → IsTestFunction (F φ)
  test_fun_preserving' : ∀ φ, IsTestFunction φ → IsTestFunction (F' φ)
  adj : ∀ φ ψ, IsTestFunction φ → IsTestFunction ψ →
    ∫ x, F φ x * ψ x = ∫ x, φ x * F' ψ x

namespace HasAdjoint

theorem id_rule : HasAdjoint (fun φ => φ) (fun φ => φ) where
  test_fun_preserving  _ hφ := hφ
  test_fun_preserving' _ hφ := hφ
  adj _ _ _ _ := rfl

theorem comp_rule (F G F' G') (hF : HasAdjoint F F') (hG : HasAdjoint G G') :
    HasAdjoint (fun φ => F (G φ)) (fun φ => G' (F' φ)) where
  test_fun_preserving  _ hφ := hF.test_fun_preserving _ (hG.test_fun_preserving _ hφ)
  test_fun_preserving' _ hφ := hG.test_fun_preserving' _ (hF.test_fun_preserving' _ hφ)
  adj φ ψ hφ hψ := by
    rw [hF.adj _ _ (hG.test_fun_preserving φ hφ) hψ]
    rw [hG.adj _ _ hφ (hF.test_fun_preserving' _ hψ)]

theorem deriv_rule :
    HasAdjoint (fun φ => deriv φ) (fun φ => - deriv φ) where
  test_fun_preserving  _ hφ := by
    have ⟨h,h'⟩ := hφ
    constructor
    · fun_prop
    · exact HasCompactSupport.deriv h'
  test_fun_preserving' _ hφ := by
    have ⟨h,h'⟩ := hφ
    constructor
    · eta_expand; dsimp; fun_prop
    · apply HasCompactSupport.neg'
      apply HasCompactSupport.deriv h'
  adj φ ψ hφ hψ := by
    dsimp
    trans ∫ (x : ℝ), ψ x * deriv φ x
    · congr
      funext x
      ring
    rw [MeasureTheory.integral_mul_deriv_eq_deriv_mul_of_integrable (u := ψ) (v := φ)
      (u' := deriv ψ) ]
    · simp
      rw [@MeasureTheory.integral_neg]
      congr
      funext x
      ring
    · intro x
      simpa using hψ.1.differentiable (by exact ENat.LEInfty.out) x
    · intro x
      simpa using hφ.1.differentiable (by exact ENat.LEInfty.out) x
    · refine IsTestFunction.integrable ?_
      apply IsTestFunction.mul
      · exact hψ
      · exact IsTestFunction.deriv hφ
    · refine IsTestFunction.integrable ?_
      apply IsTestFunction.mul
      · exact IsTestFunction.deriv hψ
      · exact hφ
    · refine IsTestFunction.integrable ?_
      exact IsTestFunction.mul hψ hφ


theorem has_adjoint_ext (F : (ℝ → ℝ) → (ℝ → ℝ)) (F' F'' : (ℝ → ℝ) → (ℝ → ℝ))
    (h : HasAdjoint F F') (h' : ∀ φ, IsTestFunction φ → F' φ = F'' φ) :
    HasAdjoint F F'' where
  test_fun_preserving  φ hφ := h.test_fun_preserving φ hφ
  test_fun_preserving' φ hφ := by
    rw [← h' φ hφ]
    exact h.test_fun_preserving' φ hφ
  adj φ ψ hφ hψ := by
    rw [← h' ψ hψ]
    exact h.adj φ ψ hφ hψ

theorem add_rule (F G : (ℝ → ℝ) → (ℝ → ℝ)) (F' G' : (ℝ → ℝ) → (ℝ → ℝ))
    (hF : HasAdjoint F F') (hG : HasAdjoint G G') :
    HasAdjoint (fun φ x => F φ x + G φ x) (fun φ x => F' φ x + G' φ x) where
  test_fun_preserving  _ hφ := by
    have ⟨h,h'⟩ := hφ
    constructor
    · apply ContDiff.add
      apply (hF.test_fun_preserving _ hφ).smooth
      apply (hG.test_fun_preserving _ hφ).smooth
    · apply HasCompactSupport.add
      apply (hF.test_fun_preserving _ hφ).supp
      apply (hG.test_fun_preserving _ hφ).supp
  test_fun_preserving' _ hφ :=  by
    have ⟨h,h'⟩ := hφ
    constructor
    · apply ContDiff.add
      apply (hF.test_fun_preserving' _ hφ).smooth
      apply (hG.test_fun_preserving' _ hφ).smooth
    · apply HasCompactSupport.add
      apply (hF.test_fun_preserving' _ hφ).supp
      apply (hG.test_fun_preserving' _ hφ).supp
  adj _ _ _ _ := by
    simp[right_distrib,left_distrib]
     -- use test functions are integralble
    rw[MeasureTheory.integral_add ]
    rw[MeasureTheory.integral_add ]
    rw[hF.adj _ _ (by assumption) (by assumption)]
    rw[hG.adj _ _ (by assumption) (by assumption)]
    · apply IsTestFunction.integrable
      apply IsTestFunction.mul
      · (expose_names; exact h)
      · (expose_names; exact hF.test_fun_preserving' x_1 h_1)
    · apply IsTestFunction.integrable
      apply IsTestFunction.mul
      · (expose_names; exact h)
      · (expose_names; exact hG.test_fun_preserving' x_1 h_1)
    · apply IsTestFunction.integrable
      apply IsTestFunction.mul
      · (expose_names; exact hF.test_fun_preserving x h)
      · (expose_names; exact h_1)
    · apply IsTestFunction.integrable
      apply IsTestFunction.mul
      · (expose_names; exact hG.test_fun_preserving x h)
      · (expose_names; exact h_1)

theorem mul_left_rule (F : (ℝ → ℝ) → (ℝ → ℝ)) (ψ : ℝ → ℝ) (F' : (ℝ → ℝ) → (ℝ → ℝ))
    (hF : HasAdjoint F F') (hψ : ContDiff ℝ ∞ ψ) :
    HasAdjoint (fun φ x => ψ x * F φ x) (fun φ x => F' (fun x => ψ x * φ x) x) where
  test_fun_preserving  φ hφ := by
    apply IsTestFunction.mul_left
    · exact hψ
    · exact hF.test_fun_preserving φ hφ
  test_fun_preserving' φ hφ := by
    apply hF.test_fun_preserving'
    apply IsTestFunction.mul_left
    · exact hψ
    · exact hφ
  adj φ ψ' hφ hψ' := by
    rw [← hF.adj]
    · congr
      funext x
      ring
    · exact hφ
    · apply IsTestFunction.mul_left
      · exact hψ
      · exact hψ'


theorem mul_right_rule (F : (ℝ → ℝ) → (ℝ → ℝ)) (ψ : ℝ → ℝ) (F' : (ℝ → ℝ) → (ℝ → ℝ))
    (hF : HasAdjoint F F') (hψ : ContDiff ℝ ∞ ψ) :
    HasAdjoint (fun φ x => F φ x * ψ x) (fun φ x => F' (fun x => φ x * ψ x) x) where
  test_fun_preserving  φ hφ := by
    apply IsTestFunction.mul_right
    · exact hF.test_fun_preserving φ hφ
    · exact hψ
  test_fun_preserving' φ hφ := by
    apply hF.test_fun_preserving'
    apply IsTestFunction.mul_right
    · exact hφ
    · exact hψ
  adj φ ψ' hφ hψ' := by
    rw [← hF.adj]
    · congr
      funext x
      ring
    · exact hφ
    · apply IsTestFunction.mul_right
      · exact hψ'
      · exact hψ

end HasAdjoint
open HasAdjoint

def HasVarGradientAt (S' : (ℝ → ℝ) → (ℝ → ℝ)) (grad : ℝ → ℝ) (u : ℝ → ℝ) :=
  ∃ F',
    HasAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F'
    ∧
    F' (fun _ => 1) = grad


theorem euler_lagrange (L : ℝ → ℝ → ℝ) (u : ℝ → ℝ)
    (hL : ContDiff ℝ ∞ (↿L)) (hu : ContDiff ℝ ∞ u) :
    HasVarGradientAt
      (fun u t => L (u t) (deriv u t))
      (fun t =>
        deriv (L · (deriv u t)) ((u t))
        -
        deriv (fun t' => deriv (L (u t') ·) (deriv u t')) t)
      u := by
  apply Exists.intro
  apply And.intro
  · -- compute derivative
    have h : (fun δu t => deriv (fun s : ℝ => (fun u t => L (u t) (deriv u t)) (u + s • δu) t) 0)
             =
             fun v t =>
               deriv (L · (deriv u t)) (u t) * v t
               +
               deriv (L (u t) ·) (deriv u t) * deriv v t := by
       -- this will be automated by a tactic to compute derivatives
      sorry
    rw[h]

    -- compute adjoint
    apply has_adjoint_ext _ _ _
    · apply add_rule
      · apply mul_left_rule
        apply id_rule
        fun_prop [deriv]
      · apply mul_left_rule
        apply deriv_rule
        fun_prop [deriv]
    · intros; rfl

  · simp[sub_eq_add_neg]

Tomas Skrivan (Jun 16 2025 at 13:27):

You can set up IsTestFunction with fun_prop as

@[fun_prop]
structure IsTestFunction (f : ℝ → ℝ) where
  smooth : ContDiff ℝ ∞ f
  supp : HasCompactSupport f

@[fun_prop]
theorem IsTestFunction.integrable {f} (hf : IsTestFunction f) :
    MeasureTheory.Integrable f :=
  Continuous.integrable_of_hasCompactSupport (continuous hf.smooth) hf.supp

@[fun_prop]
theorem IsTestFunction.differentiable {f} (hf : IsTestFunction f) :
    Differentiable ℝ f := hf.1.differentiable ENat.LEInfty.out

@[fun_prop]
theorem IsTestFunction.contDiff {f} (hf : IsTestFunction f) :
    ContDiff ℝ ∞ f := hf.1

@[fun_prop]
def IsTestFunction.mul {f g} (hf : IsTestFunction f) (hg : IsTestFunction g) :
    IsTestFunction (f * g) where
  smooth := ContDiff.mul hf.smooth hg.smooth
  supp := HasCompactSupport.mul_left hg.supp

@[fun_prop]
def IsTestFunction.deriv {f} (hf : IsTestFunction f) :
    IsTestFunction (deriv f) where
  smooth := deriv' hf.smooth
  supp := HasCompactSupport.deriv hf.supp

and it makes some proofs shorter. Like

    · refine IsTestFunction.integrable ?_
      apply IsTestFunction.mul
      · exact IsTestFunction.deriv hψ
      · exact hφ

is just fun_prop

Joseph Tooby-Smith (Jun 16 2025 at 13:34):

I've filled in all but the last sorry in the above - which I assume you don't actually want filled? Or would it also help to have that filled?

When trying to use fun_prop it keeps giving me an error :(.

Tomas Skrivan (Jun 16 2025 at 13:35):

Wonderful! Yes do not fill the last sorry.

Tomas Skrivan (Jun 16 2025 at 13:35):

It will be automated but I noticed an error there too :)

Tomas Skrivan (Jun 16 2025 at 14:19):

Here sorry free proof that for

$E(u) = \int \|u'(x)\|^2 + f(x)u(x) \, dx$

we have

$\frac{\delta E}{\delta u}(x) = - u''(x) + f(x)$

code

import Mathlib

noncomputable section

open Set

open ContDiff
@[fun_prop]
structure IsTestFunction (f : ℝ → ℝ) where
  smooth : ContDiff ℝ ∞ f
  supp : HasCompactSupport f

@[fun_prop]
theorem IsTestFunction.integrable {f} (hf : IsTestFunction f) :
    MeasureTheory.Integrable f :=
  Continuous.integrable_of_hasCompactSupport (continuous hf.smooth) hf.supp

@[fun_prop]
theorem IsTestFunction.differentiable {f} (hf : IsTestFunction f) :
    Differentiable ℝ f := hf.1.differentiable ENat.LEInfty.out

@[fun_prop]
theorem IsTestFunction.contDiff {f} (hf : IsTestFunction f) :
    ContDiff ℝ ∞ f := hf.1

@[fun_prop]
def IsTestFunction.mul {f g} (hf : IsTestFunction f) (hg : IsTestFunction g) :
    IsTestFunction (f * g) where
  smooth := ContDiff.mul hf.smooth hg.smooth
  supp := HasCompactSupport.mul_left hg.supp

@[fun_prop]
def IsTestFunction.deriv {f} (hf : IsTestFunction f) :
    IsTestFunction (deriv f) where
  smooth := deriv' hf.smooth
  supp := HasCompactSupport.deriv hf.supp

@[fun_prop]
def IsTestFunction.mul_left {f g} (hf : ContDiff ℝ ∞ f) (hg : IsTestFunction g) :
    IsTestFunction (f * g) where
  smooth := ContDiff.mul hf hg.smooth
  supp := HasCompactSupport.mul_left hg.supp

@[fun_prop]
def IsTestFunction.mul_right {f g} (hf : IsTestFunction f) (hg : ContDiff ℝ ∞  g) :
    IsTestFunction (f * g) where
  smooth := ContDiff.mul hf.smooth hg
  supp := HasCompactSupport.mul_right hf.supp



/-- Function `F` transforming functions has adjoint in distributional sense but we require that
its adjoint also maps test function to test functions.

The canonical example is the function `F = deriv` that has adjoint `F' = - deriv`. -/
structure HasAdjoint (F : (ℝ → ℝ) → (ℝ → ℝ)) (F' : (ℝ → ℝ) → (ℝ → ℝ)) where
  test_fun_preserving  : ∀ φ, IsTestFunction φ → IsTestFunction (F φ)
  test_fun_preserving' : ∀ φ, IsTestFunction φ → IsTestFunction (F' φ)
  adj : ∀ φ ψ, IsTestFunction φ → IsTestFunction ψ →
    ∫ x, F φ x * ψ x = ∫ x, φ x * F' ψ x

namespace HasAdjoint

theorem id_rule : HasAdjoint (fun φ => φ) (fun φ => φ) where
  test_fun_preserving  _ hφ := hφ
  test_fun_preserving' _ hφ := hφ
  adj _ _ _ _ := rfl

theorem comp_rule (F G F' G') (hF : HasAdjoint F F') (hG : HasAdjoint G G') :
    HasAdjoint (fun φ => F (G φ)) (fun φ => G' (F' φ)) where
  test_fun_preserving  _ hφ := hF.test_fun_preserving _ (hG.test_fun_preserving _ hφ)
  test_fun_preserving' _ hφ := hG.test_fun_preserving' _ (hF.test_fun_preserving' _ hφ)
  adj φ ψ hφ hψ := by
    rw [hF.adj _ _ (hG.test_fun_preserving φ hφ) hψ]
    rw [hG.adj _ _ hφ (hF.test_fun_preserving' _ hψ)]

theorem deriv_rule :
    HasAdjoint (fun φ => deriv φ) (fun φ => - deriv φ) where
  test_fun_preserving  _ hφ := by
    have ⟨h,h'⟩ := hφ
    constructor
    · fun_prop
    · exact HasCompactSupport.deriv h'
  test_fun_preserving' _ hφ := by
    have ⟨h,h'⟩ := hφ
    constructor
    · eta_expand; dsimp; fun_prop
    · apply HasCompactSupport.neg'
      apply HasCompactSupport.deriv h'
  adj φ ψ hφ hψ := by
    dsimp
    trans ∫ (x : ℝ), ψ x * deriv φ x
    · congr
      funext x
      ring
    rw [MeasureTheory.integral_mul_deriv_eq_deriv_mul_of_integrable (u := ψ) (v := φ)
      (u' := deriv ψ) ]
    · simp
      rw [@MeasureTheory.integral_neg]
      congr
      funext x
      ring
    · intro x
      simpa using hψ.1.differentiable (by exact ENat.LEInfty.out) x
    · intro x
      simpa using hφ.1.differentiable (by exact ENat.LEInfty.out) x
    · refine IsTestFunction.integrable ?_
      apply IsTestFunction.mul
      · exact hψ
      · exact IsTestFunction.deriv hφ
    · refine IsTestFunction.integrable ?_
      apply IsTestFunction.mul
      · exact IsTestFunction.deriv hψ
      · exact hφ
    · refine IsTestFunction.integrable ?_
      exact IsTestFunction.mul hψ hφ


theorem has_adjoint_ext (F : (ℝ → ℝ) → (ℝ → ℝ)) (F' F'' : (ℝ → ℝ) → (ℝ → ℝ))
    (h : HasAdjoint F F') (h' : ∀ φ, IsTestFunction φ → F' φ = F'' φ) :
    HasAdjoint F F'' where
  test_fun_preserving  φ hφ := h.test_fun_preserving φ hφ
  test_fun_preserving' φ hφ := by
    rw [← h' φ hφ]
    exact h.test_fun_preserving' φ hφ
  adj φ ψ hφ hψ := by
    rw [← h' ψ hψ]
    exact h.adj φ ψ hφ hψ

theorem add_rule (F G : (ℝ → ℝ) → (ℝ → ℝ)) (F' G' : (ℝ → ℝ) → (ℝ → ℝ))
    (hF : HasAdjoint F F') (hG : HasAdjoint G G') :
    HasAdjoint (fun φ x => F φ x + G φ x) (fun φ x => F' φ x + G' φ x) where
  test_fun_preserving  _ hφ := by
    have ⟨h,h'⟩ := hφ
    constructor
    · apply ContDiff.add
      apply (hF.test_fun_preserving _ hφ).smooth
      apply (hG.test_fun_preserving _ hφ).smooth
    · apply HasCompactSupport.add
      apply (hF.test_fun_preserving _ hφ).supp
      apply (hG.test_fun_preserving _ hφ).supp
  test_fun_preserving' _ hφ :=  by
    have ⟨h,h'⟩ := hφ
    constructor
    · apply ContDiff.add
      apply (hF.test_fun_preserving' _ hφ).smooth
      apply (hG.test_fun_preserving' _ hφ).smooth
    · apply HasCompactSupport.add
      apply (hF.test_fun_preserving' _ hφ).supp
      apply (hG.test_fun_preserving' _ hφ).supp
  adj _ _ _ _ := by
    simp[right_distrib,left_distrib]
     -- use test functions are integralble
    rw[MeasureTheory.integral_add ]
    rw[MeasureTheory.integral_add ]
    rw[hF.adj _ _ (by assumption) (by assumption)]
    rw[hG.adj _ _ (by assumption) (by assumption)]
    · apply IsTestFunction.integrable
      apply IsTestFunction.mul
      · (expose_names; exact h)
      · (expose_names; exact hF.test_fun_preserving' x_1 h_1)
    · apply IsTestFunction.integrable
      apply IsTestFunction.mul
      · (expose_names; exact h)
      · (expose_names; exact hG.test_fun_preserving' x_1 h_1)
    · apply IsTestFunction.integrable
      apply IsTestFunction.mul
      · (expose_names; exact hF.test_fun_preserving x h)
      · (expose_names; exact h_1)
    · apply IsTestFunction.integrable
      apply IsTestFunction.mul
      · (expose_names; exact hG.test_fun_preserving x h)
      · (expose_names; exact h_1)

theorem mul_left_rule (F : (ℝ → ℝ) → (ℝ → ℝ)) (ψ : ℝ → ℝ) (F' : (ℝ → ℝ) → (ℝ → ℝ))
    (hF : HasAdjoint F F') (hψ : ContDiff ℝ ∞ ψ) :
    HasAdjoint (fun φ x => ψ x * F φ x) (fun φ x => F' (fun x => ψ x * φ x) x) where
  test_fun_preserving  φ hφ := by
    apply IsTestFunction.mul_left
    · exact hψ
    · exact hF.test_fun_preserving φ hφ
  test_fun_preserving' φ hφ := by
    apply hF.test_fun_preserving'
    apply IsTestFunction.mul_left
    · exact hψ
    · exact hφ
  adj φ ψ' hφ hψ' := by
    rw [← hF.adj]
    · congr
      funext x
      ring
    · exact hφ
    · apply IsTestFunction.mul_left
      · exact hψ
      · exact hψ'


theorem mul_right_rule (F : (ℝ → ℝ) → (ℝ → ℝ)) (ψ : ℝ → ℝ) (F' : (ℝ → ℝ) → (ℝ → ℝ))
    (hF : HasAdjoint F F') (hψ : ContDiff ℝ ∞ ψ) :
    HasAdjoint (fun φ x => F φ x * ψ x) (fun φ x => F' (fun x => φ x * ψ x) x) where
  test_fun_preserving  φ hφ := by
    apply IsTestFunction.mul_right
    · exact hF.test_fun_preserving φ hφ
    · exact hψ
  test_fun_preserving' φ hφ := by
    apply hF.test_fun_preserving'
    apply IsTestFunction.mul_right
    · exact hφ
    · exact hψ
  adj φ ψ' hφ hψ' := by
    rw [← hF.adj]
    · congr
      funext x
      ring
    · exact hφ
    · apply IsTestFunction.mul_right
      · exact hψ'
      · exact hψ

end HasAdjoint
open HasAdjoint

def HasVarGradientAt (S' : (ℝ → ℝ) → (ℝ → ℝ)) (grad : ℝ → ℝ) (u : ℝ → ℝ) :=
  ∃ F',
    HasAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F'
    ∧
    F' (fun _ => 1) = grad

theorem has_adjoint_ext' (F G : (ℝ → ℝ) → (ℝ → ℝ)) (F' : (ℝ → ℝ) → (ℝ → ℝ))
    (h : HasAdjoint G F') (h' : ∀ φ, IsTestFunction φ → F φ = G φ) :
    HasAdjoint F F' where
  test_fun_preserving  φ hφ := by
    rw[h' _ hφ]
    exact h.test_fun_preserving φ hφ
  test_fun_preserving' φ hφ := h.test_fun_preserving' φ hφ
  adj φ ψ hφ hψ := by
    rw [h' φ hφ]
    exact h.adj φ ψ hφ hψ

attribute [simp] ENat.LEInfty.out

theorem dirichlet_energy (u f : ℝ → ℝ) (hu : ContDiff ℝ ∞ u) (hf : ContDiff ℝ ∞ f) :
    HasVarGradientAt
      (fun u t => 1/2 * (deriv u t)^2 + u t * f t)
      (fun t => - deriv (deriv u) t + f t)
      u := by
  apply Exists.intro
  apply And.intro
  · -- compute derivative
    have h : ∀ δu, IsTestFunction δu →
             (fun t => deriv (fun s : ℝ => (fun u t => 1 / 2 * deriv u t ^ 2 + u t * f t) (u + s • δu) t) 0)
             =
             (fun t => deriv u t * deriv δu t + δu t * f t) := by
      intros δu hδu; funext t
      eta_expand; simp
      rw[deriv_add (by apply Differentiable.differentiableAt; fun_prop (config := {maxTransitionDepth := 4}) (disch:=simp) [deriv]) (by simp)]
      simp (disch:=fun_prop (config := {maxTransitionDepth := 3}) (disch:=simp)) [deriv_add,deriv_mul]
      ring
    apply has_adjoint_ext' _ _ _ _ h
    rotate_left

    -- compute adjoint
    apply has_adjoint_ext _ _ _

    · apply add_rule
      · apply mul_left_rule
        apply deriv_rule
        fun_prop
      · apply mul_right_rule
        apply id_rule
        fun_prop
    · intros; rfl

  · simp[sub_eq_add_neg]

Tomas Skrivan (Jun 16 2025 at 14:24):

Joseph Tooby-Smith said:

I've filled in all but the last sorry in the above - which I assume you don't actually want filled? Or would it also help to have that filled?

When trying to use fun_prop it keeps giving me an error :(.

There was a crucial missing step where you should use

theorem has_adjoint_ext' (F G : (ℝ → ℝ) → (ℝ → ℝ)) (F' : (ℝ → ℝ) → (ℝ → ℝ))
    (h : HasAdjoint G F') (h' : ∀ φ, IsTestFunction φ → F φ = G φ) :
    HasAdjoint F F' := ...

which allows you to rewrite F in HasAdjoit F F' to HasAdjoint G F' if F = G equal only on test functions. The last sorry is missing assumption ContDiff ℝ ∞ δu.

Tomas Skrivan (Jun 16 2025 at 14:40):

I'm very happy that we can do some form of formal calculation like this without fully formal theory of Sobolev spaces, theory of distributions or diffeology/convenient vector spaces.

I think that showing that the function $u$ minimizing $S(u)$ has to satisfy $\delta S(u) = 0$ is the difficult part. However in physics you just start with $\delta S(u) = 0$ as the fundamental law.

Joseph Tooby-Smith (Jun 16 2025 at 15:03):

Yeah, I agree, this is very nice! I think this is precisely what is needed to formalize what physicists do, in a way which is familiar to the physicists :).

Thus, the question, what would you like to do with this? I guess it is a bit complicated where this should sit. It would obviously be welcome in PhysLean, and I would certainly like to see this useable there, but I understand aspects of this may be better placed in SciLean or Mathlib. But PhysLean isn't (yet!) dependent on SciLean - which adds complications.

Tomas Skrivan (Jun 16 2025 at 15:07):

I think this can live in PhysLean for now. I don't have immediate plans for this in SciLean.

Tomas Skrivan (Jun 16 2025 at 15:10):

I'm not sure if it would be welcome in mathlib as it is not a definition from existing literature.

Joseph Tooby-Smith (Jun 16 2025 at 15:11):

Ok, yeah, I am also skeptical if it would be allowed in Mathlib.

Would you like me to put it together in a PR? I can deal with the linters, etc. Then can add you as a co-author to the commit (I think this is a thing). Or would you like to do it.

Tomas Skrivan (Jun 16 2025 at 15:12):

I'm drafing PR now

Tomas Skrivan (Jun 16 2025 at 17:27):

I made a PR612

Tomas Skrivan (Jun 16 2025 at 17:29):

It is effectively the same code only I generalized HasAdjoint to function between abstract vector spaces

structure HasVarAdjoint
    (F : (X → U) → (X → V)) (F' : (X → V) → (X → U)) (μ : Measure X := by volume_tac) : Prop

I also allowed the integral to be over arbitrary measure. Not sure if that will ever be useful though ...

Tomas Skrivan (Jun 16 2025 at 17:41):

It would be nice to also have this theorem, if someone feels like proving it :)

open EuclideanSpace
noncomputable
def div (f : EuclideanSpace ℝ (Fin n) → EuclideanSpace ℝ (Fin n) →L[ℝ] U) (x : EuclideanSpace ℝ (Fin n)) : U :=
  ∑ i, fderiv ℝ (f · (single i 1)) x (single i 1)

protected theorem HasVarAdjoint.fderiv {μ : Measure X} :
    HasVarAdjoint (fun φ : EuclideanSpace ℝ (Fin n) → U => fderiv ℝ φ) (fun φ => - div φ) := sorry

Joseph Tooby-Smith (Jun 17 2025 at 06:25):

I think more generally, building an API around HasVarAdjoint and HasVarGradientAt, which connects it to the vector calculus that @ZhiKai Pong has developed for Space would be very useful.

Tomas Skrivan (Jun 17 2025 at 07:10):

What kind of theorems do you have in mind? Can you give me an example?

Joseph Tooby-Smith (Jun 17 2025 at 08:32):

What I had in mind was essentially the theorem you have above, but with e.g. Space.div.

Although, that said, I'm not sure I know a situation in physics where one would want to do variational calculus with a world-volume (the thing you integrate over) which is Space, so maybe this doesn't make sense. In classical field theory, one does it with a world-volume SpaceTime, so relevant API around that might be good eventually.

Tomas Skrivan (Jun 18 2025 at 11:15):

I see, that would be this statement:

theorem HasVarAdjoint.gradient {d} :
    HasVarAdjoint (fun φ : Space d → ℝ => gradient φ) (fun φ => - Space.div φ) := sorry

and it is more reasonable as right now there is now inner product structure on X →L[ℝ] U in mathlib which is required to even state HasVarAdjoint.fderiv.

Joseph Tooby-Smith (Jun 18 2025 at 11:48):

This seems sensible to me! BTW am I free to merge PR612 into master?

Tomas Skrivan (Jun 18 2025 at 13:17):

Yes, feel free to merge it. I can't do it

Joseph Tooby-Smith (Jun 19 2025 at 05:23):

Just checking you agree with the following definition @Tomas Skrivan , for the definition of a solution to the harmonic oscillator.

/-- The lagrangian of the harmonic oscillator is the kinetic energy minus the potential energy. -/
noncomputable def lagrangian (x : Time → ℝ) : Time → ℝ := fun t =>
  kineticEnergy S x t - potentialEnergy S x t

/-- A trajectory `x : Time → ℝ` is a solution to the harmonic oscillator if
  the variational derivative of the lagrangian evaulated at `x` is zero. -/
def IsSolution (x : Time → ℝ) : Prop := HasVarGradientAt (lagrangian S) 0 x

This somewhat depends on the changes @Lode Vermeulen has made to this file, so won't commit this until then.

Lode Vermeulen (Jun 19 2025 at 06:17):

Just a heads up, I will be busy the remainder of this week and next week on finishing up my final courses for the year. This means that I will probably only be able to finish that PR after that.

Joseph Tooby-Smith (Jun 19 2025 at 06:47):

@Lode Vermeulen No worries, hope your final courses go well!

Tomas Skrivan (Jun 19 2025 at 06:50):

Ahh I think there is a bug in the definition of HasVarGradientAt :)

inductive HasVarGradientAt (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
    (μ : Measure X := by volume_tac) : Prop
  | intro (F')
      (adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
      (eq : F' (fun _ => 1) = grad)

The HasVarAdjoint guarantees only that F' is doing reasonable things on test functions but (fun _ => 1) is not a test function thus F' is allowed to return a junk value. Likely zero thus the above definition of IsSolution is no good as one can probably show that any path is a solution.

I think the definition should be:

inductive HasVarGradientAt (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
    (μ : Measure X := by volume_tac) : Prop
  | intro (F')
      (adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
      (eq : ∀ (φ : X → ℝ) (D : Set X), IsTestFunction φ → IsOpen D → (∀ x ∈ D, φ x = 1) →
        ∀ x ∈ D, F' φ x = grad x)

i.e. the eq argument is a bit more refined. Instead of plugging (fun _ => 1) to F' we plug in a test function φ that is constant one on some open set D and then grad is equal to F' φ only on D.

Tomas Skrivan (Jun 19 2025 at 07:11):

However this definition might not be completely satisfactory neither. I believe that it yields expected result for local laws i.e. Lagrangian depending on u(t,x) and its derivatives at x.

For example in peridynamics the potential energy looks like this

$\int \left( \int_{\Omega \cap B_\delta(x)} e(x,x',u(x),u(x')) \, dx' \right) dx$

i.e. the local potential energy density at $x$ is $\int_{\Omega \cap B_\delta(x)} e(x,x',u(x),u(x')) \, dx'$ thus it depends on values of $u$ in some finite neighbourhood $B_\delta(x)$ .

Tomas Skrivan (Jun 19 2025 at 07:14):

I think in this case the definition should be

inductive HasVarGradientAt' (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
    (μ : Measure X := by volume_tac) : Prop
  | intro (F')
      (adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
      (eq : ∀ (x : X), ∃ D : Set X, IsOpen D ∧ ∀ (φ : X → ℝ), (∀ x ∈ D, φ x = 1) → F' φ x = grad x)

i.e. for every point x there is a neighboorhood D such that if we take a test function φ that is one on that neighborhood we have F' φ x = grad x.

Tomas Skrivan (Jun 19 2025 at 07:15):

For the peridynamic potential the neighboorhood D would be $B_\delta(x)$ . For local laws(Lagrangian depending only on derivatives of u) the neighboorhood can be arbitrary small.

Tomas Skrivan (Jun 19 2025 at 07:18):

Also the definition is also missing that the deriv is not returning junk value

Joseph Tooby-Smith (Jun 19 2025 at 07:18):

Shouldn't we be enforcing some linearity condition on S and F' on smooth functions? Which then determine the value of them away from the test functions?

In your last iteration don't you need the extra condition that x is in D?

Tomas Skrivan (Jun 19 2025 at 07:19):

Joseph Tooby-Smith said:

In your last iteration don't you need the extra condition that x is in D?

Ha, yes I do :)

Joseph Tooby-Smith (Jun 19 2025 at 07:21):

Shouldn't we be enforcing some linearity condition on S and F' on smooth functions? Which then determine the value of them away from the test functions?

Nevermind - in general the lagrangian isn't linear :(.

Tomas Skrivan (Jun 19 2025 at 07:22):

Joseph Tooby-Smith said:

Shouldn't we be enforcing some linearity condition on S and F' on smooth functions? Which then determine the value of them away from the test functions?

Yeah I was thinking about that too but until it is not needed in some proof I'm not adding it.

If you know that HasVarAdjoint F F' μ then you know that

∀ φ ψ, IsTestFunction φ → IsTestFunction ψ →
    ∫ x, ⟪F φ x, ψ x⟫_ℝ ∂μ = ∫ x, ⟪φ x, F' ψ x⟫_ℝ ∂μ

and you might be able to show some linearity from it.

Tomas Skrivan (Jun 19 2025 at 07:24):

∫ x, ⟪F (φ+φ') x, ψ x⟫_ℝ ∂μ = ∫ x, ⟪(φ+φ') x, F' ψ x⟫_ℝ ∂μ
                            = ∫ x, ⟪φ' x, F' ψ x⟫_ℝ ∂μ + ∫ x, ⟪φ' x, F' ψ x⟫_ℝ ∂μ
                            = ∫ x, ⟪F φ' x, ψ x⟫_ℝ ∂μ + ∫ x, ⟪F φ' x, ψ x⟫_ℝ ∂μ
                            = ∫ x, ⟪F φ' x + F φ' x, ψ x⟫_ℝ ∂μ

Tomas Skrivan (Jun 19 2025 at 07:26):

Thus ∀ ψ, ∫ x, ⟪F (φ+φ') x - (F φ' x + F φ' x), ψ x⟫_ℝ ∂μ = 0 and by applying fundamental lemma of variational calculus you get F (φ+φ') x = F φ' x + F φ' x

Joseph Tooby-Smith (Jun 19 2025 at 07:26):

Right! Because even though the lagrangian is not linear:

fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0

is in terms of δu... or something like that.

Tomas Skrivan (Jun 19 2025 at 07:28):

Yes exactly

Joseph Tooby-Smith (Jun 19 2025 at 07:30):

Ok, but then I don't see how F' can give junk values on not test-functions, since it too must be linear (by a similar argument to what you have given), and thus it is determined by it's value on test-functions (which are dense in the space of smooth functions).

Tomas Skrivan (Jun 19 2025 at 07:32):

But we do not require any continuity of F' so we can't use the fact that test functions are dense in smooth functions.

Tomas Skrivan (Jun 19 2025 at 07:33):

That is kind of the point of the definition of HasVarAdjoint and HasVarGradient that you do everything on test functions and you completely dodge the question about topology on infinite dimensional spaces.

Tomas Skrivan (Jun 19 2025 at 07:35):

Tomas Skrivan said:

∫ x, ⟪F (φ+φ') x, ψ x⟫_ℝ ∂μ = ∫ x, ⟪(φ+φ') x, F' ψ x⟫_ℝ ∂μ
                            = ∫ x, ⟪φ' x, F' ψ x⟫_ℝ ∂μ + ∫ x, ⟪φ' x, F' ψ x⟫_ℝ ∂μ
                            = ∫ x, ⟪F φ' x, ψ x⟫_ℝ ∂μ + ∫ x, ⟪F φ' x, ψ x⟫_ℝ ∂μ
                            = ∫ x, ⟪F φ' x + F φ' x, ψ x⟫_ℝ ∂μ

This whole calculation is valid only under IsTestFunction φ, IsTestFunction φ', IsTestFunction ψ.

Tomas Skrivan (Jun 19 2025 at 07:36):

You can't use linearity and partition of unity to show that F' (fun _ => 1) is something meaningful because (fun _ => 1) has to be decomposed into infinitely many test functions

Tomas Skrivan (Jun 19 2025 at 07:38):

To show ∑ i, F' (φ i) = F' (∑ i, φ i) for partition of unity φ i you need some continuity of F'` as the sum has infinitely many terms.

Joseph Tooby-Smith (Jun 19 2025 at 07:54):

Ok, understood, and I agree with this philosophy.

And the 'fundamental lemma of variational calculus' answered another question I had, which was is the following true:

lemma eq_on_isTestFunction_of_hasVarAdjoint {F : (X → U) → (X → V)}
    {F' G'  : (X → V) → (X → U)} {μ : Measure X} (h : HasVarAdjoint F F' μ)
    (h : HasVarAdjoint F G' μ)  :
    ∀ φ, IsTestFunction φ → F' φ = G' φ := sorry

Which, with your new definition, implies uniqueness of the variational gradient, as long as F' φ x in the eq condition doesn't depend on the choice of D or the choice of φ, which it shouldn't do.

Tomas Skrivan (Jun 19 2025 at 08:01):

Ohh nice, I didn't realize that. I though that we will get uniqueness only once we talk about some topology on some infinite dimensional space.

Tomas Skrivan (Jun 19 2025 at 08:02):

So maybe we can even define varGradient function

Joseph Tooby-Smith (Jun 19 2025 at 08:02):

Tomas Skrivan said:

I think in this case the definition should be

inductive HasVarGradientAt' (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
    (μ : Measure X := by volume_tac) : Prop
  | intro (F')
      (adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
      (eq : ∀ (x : X), ∃ D : Set X, IsOpen D ∧ ∀ (φ : X → ℝ), (∀ x ∈ D, φ x = 1) → F' φ x = grad x)

i.e. for every point x there is a neighboorhood D such that if we take a test function φ that is one on that neighborhood we have F' φ x = grad x.

I also think you are going to actually need to state explicitly in the eq condition that φ is a test function, or something along those lines.

Joseph Tooby-Smith (Jun 19 2025 at 08:03):

Otherwise this eq condition implies the old one.

Tomas Skrivan (Jun 19 2025 at 08:03):

Joseph Tooby-Smith said:

I also think you are going to actually need to state explicitly in the eq condition that φ is a test function, or something along those lines.

Yes, I just realized that as I'm fixing the example in HasVarGradient.lean

Tomas Skrivan (Jun 19 2025 at 08:05):

Now I'm missing

hφ' : ∀ x_1 ∈ Metric.ball x 1, φ x_1 = 1
⊢ (fun x => m * deriv u x * φ x) =ᶠ[nhds x] fun x => m * deriv u x

and I have never worked with =ᶠ[nhds x] before

Joseph Tooby-Smith (Jun 19 2025 at 08:10):

I think something like docs#Filter.eventuallyEq_of_mem, along with the condition that
Metric.ball x 1 ∈ nhds x (docs#Metric.ball_mem_nhds) should help

Tomas Skrivan (Jun 19 2025 at 08:16):

Thanks! now I have proof of euler_lagrange_particle_1d with updated definition of HasVarGradientAt

Joseph Tooby-Smith (Jun 19 2025 at 09:23):

I think this is the correct manifestation of the fundamental theorem of variational calculus that we would need to define varGradient:

lemma fundamental_theorem_of_variational_calculus {f : X → V}
  (μ : Measure X)  [IsFiniteMeasureOnCompacts μ] [μ.IsOpenPosMeasure]
  [OpensMeasurableSpace X]
  (hf : IsTestFunction f) (hg : ∀ g, IsTestFunction g → ∫ x, ⟪f x, g x⟫_ℝ ∂μ = 0) :
    f = 0 := by
  have hf' := hg f hf
  rw [MeasureTheory.integral_eq_zero_iff_of_nonneg] at hf'
  · rw [Continuous.ae_eq_iff_eq] at hf'
    · funext x
      have hf'' := congrFun hf' x
      simpa using hf''
    · have hf : Continuous f := hf.smooth.continuous
      fun_prop
    · fun_prop
  · intro x
    simp
    exact real_inner_self_nonneg
  · apply IsTestFunction.integrable
    exact IsTestFunction.inner hf hf

Joseph Tooby-Smith (Jun 19 2025 at 09:25):

I had to add the instance: μ.IsOpenPosMeasure, which seems to be true for most important cases e.g.

#synth (volume (α := ℝ)).IsOpenPosMeasure
--  says: Measure.IsAddHaarMeasure.toIsOpenPosMeasure

Tomas Skrivan (Jun 19 2025 at 09:30):

Isn't it enought for f to be continuous?

Joseph Tooby-Smith (Jun 19 2025 at 09:31):

One needs that:

 Integrable (fun x => ⟪f x, f x⟫_ℝ) μ

which I proved using

apply IsTestFunction.integrable
exact IsTestFunction.inner hf hf

in the last two lines

Joseph Tooby-Smith (Jun 19 2025 at 09:33):

And also have hf' := hg f hf... so at least this proof doesn't generalize to the case when f is just continuous

Tomas Skrivan (Jun 19 2025 at 09:42):

Ohh I see, the proof can be also done by contradiction. Assume that for some x we have f x > 0 then in some neighbourhood you have f > 0. Then take g positive test function in that neighborhood and you are integrating non zero function.

Joseph Tooby-Smith (Jun 19 2025 at 09:44):

But in general, positive doesn't mean anything here because the target of g is the vector space V.

Joseph Tooby-Smith (Jun 19 2025 at 09:46):

Below is the proof of the result eq_on_isTestFunction_of_hasVarAdjoint I wanted above:

Header

lemma IsTestFunction.sub {f g : X → V} (hf : IsTestFunction f) (hg : IsTestFunction g) :
    IsTestFunction (f - g) where
  smooth := ContDiff.sub hf.smooth hg.smooth
  supp := by
    rw [SubNegMonoid.sub_eq_add_neg]
    apply HasCompactSupport.add
    · exact hf.supp
    · exact HasCompactSupport.neg' hg.supp


lemma eq_on_isTestFunction_of_hasVarAdjoint {F : (X → U) → (X → V)}
    {F' G'  : (X → V) → (X → U)} {μ : Measure X} [IsFiniteMeasureOnCompacts μ] [μ.IsOpenPosMeasure]
    [OpensMeasurableSpace X] (hF' : HasVarAdjoint F F' μ) (hG' : HasVarAdjoint F G' μ)  :
    ∀ φ, IsTestFunction φ → F' φ = G' φ := by
  obtain ⟨F_preserve_test, F'_preserve_test, F'_adjoint⟩ := hF'
  obtain ⟨F_preserve_test, G'_preserve_test, G'_adjoint⟩ := hG'
  intro φ hφ
  rw [← zero_add (G' φ)]
  rw [← sub_eq_iff_eq_add]
  change (F' - G') φ = 0
  apply fundamental_theorem_of_variational_calculus μ
  · simp
    apply IsTestFunction.sub
    · exact F'_preserve_test φ hφ
    · exact G'_preserve_test φ hφ
  · intro ψ hψ
    simp [inner_sub_left]
    rw [MeasureTheory.integral_sub]
    · conv_lhs =>
        enter [2, 2, a]
        rw [← inner_conj_symm]
      conv_lhs =>
        enter [1, 2, a]
        rw [← inner_conj_symm]
      simp
      rw [← F'_adjoint ψ φ hψ hφ]
      rw [G'_adjoint ψ φ hψ hφ]
      simp
    · apply IsTestFunction.integrable
      apply IsTestFunction.inner
      · exact F'_preserve_test φ hφ
      · exact hψ
    · apply IsTestFunction.integrable
      apply IsTestFunction.inner
      · exact G'_preserve_test φ hφ
      · exact hψ

Joseph Tooby-Smith (Jun 19 2025 at 10:04):

I think if you want to have uniqueness of the variational derivative, you need to enforce locality in eq, that is to say something like the following:

inductive HasVarGradientAt'' (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
    (μ : Measure X := by volume_tac) : Prop
  | intro (F')
      (adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
      (eq : ∀ (x : X), ∃ D ∈ nhds x, ∀ D' ∈ nhds x, D' ⊆ D →
      ∀ (φ : X → ℝ), IsTestFunction φ →  (∀ x ∈ D', φ x = 1) → F' φ x = grad x)

This locality is essentially to ensure that the choice of D does not matter, as long as it is small enough.

The problem is, I think, this would exclude the use of this on the non-local 'peridynamics' example above.

Tomas Skrivan (Jun 19 2025 at 10:51):

Joseph Tooby-Smith said:

But in general, positive doesn't mean anything here because the target of g is the vector space V.

Ohh I see, then there is x such that f x ≠ 0. There has to be a neighborhood N of x such that ⟪f x', f x⟫ > 0 for x' ∈ N. Now take positive test function φ with support in N and pick g x' := φ x' • f x. Then you can show ⟪f x', g x'⟫ ≥ 0 for x' ∈ N and zero outside of N.

If there is some trouble in finding N then you can assume that U is finite dimensional and do it component wise and then take intersection of all those neighborhoods.

Joseph Tooby-Smith (Jun 19 2025 at 10:53):

I guess one option to get around the non-local examples is to do something like the following:

inductive HasNonLocalVarGradientAt (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
    (E : (x : X) → Set X) (hE : ∀ x, x ∈ E x)
    (μ : Measure X := by volume_tac) : Prop
  | intro (F')
      (adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
      (eq : ∀ (x : X), ∃ D ∈ nhds x, E x ⊆ D ∧
       ∀ D' ∈ nhds x, D' ⊆ D  ∧ E x ⊆ D' →
      ∀ (φ : X → ℝ), IsTestFunction φ →  (∀ x ∈ D', φ x = 1) → F' φ x = grad x)

Where E : (x : X) → Set X expresses the non-locality in the function. The point being that this is only unique if you specify E.

Tomas Skrivan (Jun 19 2025 at 10:55):

I'm a bit confused I do not see the problem with the definition

inductive HasVarGradientAt (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
    (μ : Measure X := by volume_tac) : Prop
  | intro (F')
      (diff : ∀ δu x, IsTestFunction δu →
        Differentiable ℝ (fun s : ℝ => S' (fun x' => u x' + s • δu x') x))
      (adjoint : HasVarAdjoint (fun δu x => deriv (fun s : ℝ => S' (u + s • δu) x) 0) F' μ)
      (eq : ∀ (x : X), ∃ D : Set X,
        x ∈ D ∧ IsOpen D ∧ ∀ (φ : X → ℝ), IsTestFunction φ → (∀ x ∈ D, φ x = 1) → F' φ x = grad x)

It should be usable in the peridynamic example.

Tomas Skrivan (Jun 19 2025 at 10:57):

I'm working on the uniqueness proof for this definition now. Let's see if I get stuck

Joseph Tooby-Smith (Jun 19 2025 at 11:00):

Here is a situation where I don't think this would work. Suppose X is open but not compact, then I can let D be X, but then there are no functions φ : X → ℝ which satisfy IsTestFunction φ and
∀ x ∈ D, φ x = 1, and thus ∀ (φ : X → ℝ), IsTestFunction φ → (∀ x ∈ D, φ x = 1) → F' φ x = grad x holds for any grad.

Tomas Skrivan (Jun 19 2025 at 11:01):

Yeah I think the definition needs IsCompact D instead of IsOpen D

Tomas Skrivan (Jun 19 2025 at 11:04):

And somehow it is not necessary to enforce in the definition that D contains neighborhood of x.

Because if you have some D for x that satisfies eq than you can take D' that is D ⊆ D' and this D' also satisfies eq. So you can always arbitrary enlarge D while keeping it compact.

Tomas Skrivan (Jun 19 2025 at 11:10):

The partial uniqueness proof

def smoothIndicatorOn (Ω : Set X) (x : X) : ℝ := sorry

lemma smoothIndicatorOn_eq_one (Ω : Set X) (x : X) (hx : x ∈ Ω) : smoothIndicatorOn Ω x = 1 := sorry

@[fun_prop]
lemma smoothIndicatorOn.isTestFunction (Ω : Set X) : IsTestFunction (smoothIndicatorOn Ω) := sorry

lemma HasVarGradientAt.unique
    {S' : (X → U) → (X → ℝ)} {grad grad' : X → U} {u : X → U} {μ : Measure X}
    [IsFiniteMeasureOnCompacts μ] [μ.IsOpenPosMeasure] [OpensMeasurableSpace X]
    (h : HasVarGradientAt S' grad u μ) (h' : HasVarGradientAt S' grad' u μ) :
    grad = grad' := by

  obtain ⟨F,_,hF,eq⟩ := h
  obtain ⟨G,_,hG,eq'⟩ := h'
  funext x
  obtain ⟨D,_,_,hgrad⟩ := eq x
  obtain ⟨D',_,_,hgrad'⟩ := eq' x
  let φ := smoothIndicatorOn (D ∪ D')
  have hφ : IsTestFunction φ := by fun_prop
  have h := hgrad φ hφ (by intros _ hx; unfold φ; rw[smoothIndicatorOn_eq_one]; simp[hx])
  have h' := hgrad' φ hφ (by intros _ hx; unfold φ; rw[smoothIndicatorOn_eq_one]; simp[hx])
  rw[← h, ← h']
  rw[hF.unique hG φ hφ]

What is missing that we can define smoothIndicatorOn Ω which is test function that is one on Ω

Joseph Tooby-Smith (Jun 19 2025 at 11:40):

Seems like docs#SmoothBumpFunction may be helpful in this regard.

Joseph Tooby-Smith (Jun 19 2025 at 12:08):

Something like:

noncomputable def maxDistance (Ω : Set X) (z : X) : ℝ :=
  ⨆ y ∈ Ω, dist z y

lemma le_maxDistance (Ω : Set X) (z  y: X) (hy : y ∈ Ω)  :
    dist z y ≤ maxDistance Ω z := by
  sorry

lemma maxDistance_pos (Ω : Set X) (z : X) :
    0 < maxDistance Ω z := by
  sorry

noncomputable def smoothIndicatorOn' (Ω : Set X) (z : X) :
    SmoothBumpFunction (modelWithCornersSelf ℝ X ) z where
  rIn := maxDistance Ω z
  rOut := maxDistance Ω z + 1
  rIn_pos := by
    -- 0 < maxDistance Ω z
    sorry
  rIn_lt_rOut := lt_add_one (maxDistance Ω z)
  closedBall_subset := by simp

variable [FiniteDimensional ℝ X]

noncomputable def smoothIndicatorOn  (Ω : Set X) (z : X)  : X →  ℝ :=
  (smoothIndicatorOn' Ω z).toFun

lemma smoothIndicatorOn_eq_one (Ω : Set X) (z x : X) (hx : x ∈ Ω) :
    (smoothIndicatorOn Ω z) x = 1 := by
  simp [smoothIndicatorOn]
  rw [SmoothBumpFunction.one_of_dist_le]
  simp
  simp [smoothIndicatorOn']
  -- Dist.dist x z ≤ maxDistance Ω z
  sorry


lemma smoothIndicatorOn.isTestFunction (Ω : Set X) (z : X) : IsTestFunction (smoothIndicatorOn Ω z) where
  smooth := ContMDiff.contDiff (smoothIndicatorOn' Ω z).contMDiff
  supp := (smoothIndicatorOn' Ω z).hasCompactSupport

Joseph Tooby-Smith (Jun 19 2025 at 12:09):

This relies on FiniteDimensional ℝ X, but I think that is ok, no?

Tomas Skrivan (Jun 19 2025 at 12:20):

I'm thinking this

import Mathlib

open InnerProductSpace ContDiff

section SmoothIndicatorOn

variable {X : Type*} [NormedAddCommGroup X] [InnerProductSpace ℝ X]

noncomputable
def smoothIndicatorOn (Ω : Set X) (x : X) : ℝ :=
  let r2 := ⨆ x ∈ Ω, ‖x‖^2
  1 - Real.smoothTransition (⟪x,x⟫_ℝ - r2)

lemma smoothIndicatorOn.eq_one (Ω : Set X) (x : X)
    (hΩ : IsCompact Ω) (hx : x ∈ Ω) :
    smoothIndicatorOn Ω x = 1 := sorry

lemma smoothIndicatorOn.contDiff (Ω : Set X) : ContDiff ℝ ∞ (smoothIndicatorOn Ω) := by
  unfold smoothIndicatorOn
  sorry

lemma smoothIndicatorOn.hasCompactSupport (Ω : Set X) : HasCompactSupport (smoothIndicatorOn Ω) := by
  sorry

end SmoothIndicatorOn

Tomas Skrivan (Jun 19 2025 at 12:26):

I think both approaches are fine and the condition FiniteDimensional ℝ X is totally fine. I'm not aware of any application where that is not the case.

Tomas Skrivan (Jun 19 2025 at 12:28):

I think your smoothIndicatorOn_eq_one needs hΩ : IsCompact Ωsuch that maxDistance is not a junk value

Joseph Tooby-Smith (Jun 19 2025 at 12:29):

Yeah, I think this is probably true - at the minute I'm struggling to proof anything with maxDistance.

Tomas Skrivan (Jun 19 2025 at 12:30):

Yeah you need that continuous function attains maximum on a compact.

Tomas Skrivan (Jun 19 2025 at 12:33):

Ohh you definition is bit odd.

noncomputable def smoothIndicatorOn  (Ω : Set X) (z : X)  : X →  ℝ :=
  (smoothIndicatorOn' Ω z).toFun

what is z?

Tomas Skrivan (Jun 19 2025 at 12:34):

I think z can by arbitrary and I effectively have z = 0

Joseph Tooby-Smith (Jun 19 2025 at 12:36):

Ah! Right I forgot zero exists :), otherwise, yes it was just an arbitrary point.

Tomas Skrivan (Jun 19 2025 at 12:38):

Also can't we just use ContDiffBump?

Tomas Skrivan (Jun 19 2025 at 12:38):

I think SmoothBumpFunction is unecessary

Joseph Tooby-Smith (Jun 19 2025 at 12:39):

I thought this as well, but I couldn't find the smoothness results for ContDiffBump.

Tomas Skrivan (Jun 19 2025 at 12:41):

docs#ContDiffBump.contDiff

Tomas Skrivan (Jun 19 2025 at 12:52):

I will draft a PR with the uniqueness proofs

Joseph Tooby-Smith (Jun 19 2025 at 12:54):

Did you manage to get one of the above to work?

Tomas Skrivan (Jun 19 2025 at 12:54):

I have this

lemma HasVarGradientAt.unique
    {X : Type*} [NormedAddCommGroup X] [InnerProductSpace ℝ X] [FiniteDimensional ℝ X] [MeasurableSpace X]
    {S' : (X → U) → (X → ℝ)} {grad grad' : X → U} {u : X → U} {μ : Measure X}
    [IsFiniteMeasureOnCompacts μ] [μ.IsOpenPosMeasure] [OpensMeasurableSpace X]
    (h : HasVarGradientAt S' grad u μ) (h' : HasVarGradientAt S' grad' u μ) :
    grad = grad' := by

  obtain ⟨F,_,hF,eq⟩ := h
  obtain ⟨G,_,hG,eq'⟩ := h'
  funext x
  obtain ⟨D,_,hD,hgrad⟩ := eq x
  obtain ⟨D',_,hD',hgrad'⟩ := eq' x

  -- prepare test function that is one on `D ∪ D'`
  let r := ⨆ x ∈ D ∪ D', ‖x‖
  have : 0 ≤ r := sorry
  let φ : ContDiffBump (0 : X) := {
    rIn := r + 1,
    rOut := r + 2,
    rIn_pos := by linarith,
    rIn_lt_rOut := by linarith}

  -- few properties about `φ`
  have hφ : IsTestFunction (fun x : X => φ x) := by
    constructor
    apply ContDiffBump.contDiff
    apply ContDiffBump.hasCompactSupport
  have hφ' : ∀ x, x ∈ D ∪ D' → x ∈ Metric.closedBall 0 φ.rIn := sorry

  have h := hgrad φ hφ
    (by intros _ hx; unfold φ; rw[φ.one_of_mem_closedBall]; apply hφ'; simp[hx])
  have h' := hgrad' φ hφ
    (by intros _ hx; unfold φ; rw[φ.one_of_mem_closedBall]; apply hφ'; simp[hx])

  rw[← h, ← h',hF.unique hG φ hφ]

Tomas Skrivan (Jun 19 2025 at 12:54):

There are still two sorries

Tomas Skrivan (Jun 19 2025 at 12:55):

The first one should be easy, the second is probably a bit harder

Joseph Tooby-Smith (Jun 19 2025 at 13:03):

with a slightly different r to what you had this works:

lemma HasVarGradientAt.unique'
    {X : Type*} [NormedAddCommGroup X] [InnerProductSpace ℝ X] [FiniteDimensional ℝ X] [MeasurableSpace X]
    {S' : (X → U) → (X → ℝ)} {grad grad' : X → U} {u : X → U} {μ : Measure X}
    [IsFiniteMeasureOnCompacts μ] [μ.IsOpenPosMeasure] [OpensMeasurableSpace X]
    (h : HasVarGradientAt S' grad u μ) (h' : HasVarGradientAt S' grad' u μ) :
    grad = grad' := by

  obtain ⟨F,_,hF,eq⟩ := h
  obtain ⟨G,_,hG,eq'⟩ := h'
  funext x
  obtain ⟨D,hm,hD,hgrad⟩ := eq x
  obtain ⟨D',_,hD',hgrad'⟩ := eq' x

  -- prepare test function that is one on `D ∪ D'`
  let r := sSup ((fun x => ‖x‖) '' (D ∪ D'))
  have : 0 ≤ r := by
    obtain ⟨x, h1, h2, h3⟩ := IsCompact.exists_sSup_image_eq_and_ge (s :=  D ∪ D')
      (IsCompact.union hD hD') (Set.Nonempty.inl (Set.nonempty_of_mem hm)) (f := fun x => ‖x‖) (by fun_prop)
    unfold r
    apply le_of_le_of_eq  (b := ‖x‖)
    · exact norm_nonneg x
    · rw [← h2]

  let φ : ContDiffBump (0 : X) := {
    rIn := r + 1,
    rOut := r + 2,
    rIn_pos := by linarith,
    rIn_lt_rOut := by linarith}

  -- few properties about `φ`
  have hφ : IsTestFunction (fun x : X => φ x) := by
    constructor
    apply ContDiffBump.contDiff
    apply ContDiffBump.hasCompactSupport
  have hφ' : ∀ x, x ∈ D ∪ D' → x ∈ Metric.closedBall 0 φ.rIn := by
    intro x hx
    simp [φ, r]
    obtain ⟨y, h1, h2, h3⟩ := IsCompact.exists_sSup_image_eq_and_ge (s :=  D ∪ D')
      (IsCompact.union hD hD') (Set.Nonempty.inl (Set.nonempty_of_mem hm)) (f := fun x => ‖x‖) (by fun_prop)
    rw [h2]
    have h3' := h3 x hx
    apply le_trans h3'
    simp
  have h := hgrad φ hφ
    (by intros _ hx; unfold φ; rw[φ.one_of_mem_closedBall]; apply hφ'; simp[hx])
  have h' := hgrad' φ hφ
    (by intros _ hx; unfold φ; rw[φ.one_of_mem_closedBall]; apply hφ'; simp[hx])
  rw[← h, ← h',hF.unique hG φ hφ]

Tomas Skrivan (Jun 19 2025 at 14:23):

PR614 has uniqueness of variational adjoint and gradient

Tomas Skrivan (Jun 19 2025 at 14:25):

PR615 formulates Euler-Lagrange equations for:

Classical finite dimensional systems:

theorem hasVarGradient_action_eq_euler_lagrange
    (L : ℝ → X → X → ℝ) (q : ℝ → X)
    (hq : ContDiff ℝ ∞ u) (hL : ContDiff ℝ ∞ ↿L) :
    HasVarGradientAt
      (fun (q : ℝ → X) (t : ℝ) => L t (q t) (deriv q t))
      (fun t =>
        -- ∂L/∂x
        gradient (L t · (deriv q t)) (q t)
        -
        -- d/dt ∂L/∂ẋ
        deriv (fun t' => gradient (L t' (q t') ·) (deriv q t')) t)
      u := sorry

Classical scalar fields:

theorem hasVarGradient_action_eq_euler_lagrange_field
    (L : (t : ℝ) → (x : Space d) → (u : ℝ) → (dudt : ℝ) → (dudx : Space d) → ℝ)
    (u : ℝ → Space d → ℝ)
    (hu : ContDiff ℝ ∞ ↿u) (hL : ContDiff ℝ ∞ ↿L) :
    HasVarGradientAt
      (fun (u : ℝ×Space d → ℝ) ((t,x) : ℝ×Space d) =>
           L t x (u (t,x)) (deriv (fun t' => u (t',x)) t) (gradient (fun x' => u (t,x')) x))
      (fun (t,x) =>
           -- ∂L/∂u
           gradient (fun u' => L t x u' (deriv (u · x) t) (gradient (u t ·) x)) (u t x)
           -
           -- d/dt ∂L/∂(∂ₜu)
           deriv (fun t' => deriv (fun dudt => L t' x (u t' x) dudt (gradient (u t' ·) x)) (deriv (u · x) t')) t
           -
           -- divₓ ∂L/∂(∂ₓu)
           Space.div (fun x' => gradient (fun dudx => L t x' (u t x') (deriv (u · x') t) dudx) (gradient (u t ·) x')) x)
      (↿u) := sorry

Tomas Skrivan (Jun 19 2025 at 14:36):

This PR has bunch of sorry. In particular, variational adjoint of gradient is Space.div and generalization of previous HasVarAdjoint.deriv. Any help with these proofs is welcome :)

Tomas Skrivan (Jun 19 2025 at 14:40):

For example, the classical scalar fields version of Euler-Lagrange equation allows us to derive the wave equation

$\ddot u = \Delta u$

from the Lagrangian

$L = \frac12 \|\dot u\|^2 - \frac12 \|\nabla u\|^2$

Tomas Skrivan (Jun 19 2025 at 16:48):

I made another draft PR616 that defines HasVarAdjDerivAt F F' u which defines variational adjoint derivative.

The connection to HasVarGradientAt F grad u is that grad = F' (fun _ => 1).

The benefit is that we can formulate composition rule for HasVarAdjDerivAt thus computing HasVarAdjDerivAt does not have to be in two steps, derivative first and then adjoint, but both steps can be done at the same time.

Joseph Tooby-Smith (Jun 19 2025 at 16:54):

Great! I can fix the linters on PR614, and get it merged.

Joseph Tooby-Smith (Jun 19 2025 at 17:12):

What are the reasons for the extra conditions diff and linearize in HasVarAdjDerivAt? Are they just needed lemmas to ensure that the composition result works?

Tomas Skrivan (Jun 19 2025 at 17:14):

Yes, it is just cooked up such that you can prove the composition theorem.

Tomas Skrivan (Jun 19 2025 at 17:27):

It is very much inspired how smoothness is defined in diffeology.

Joseph Tooby-Smith (Jun 19 2025 at 17:30):

You mean the diff condition, and in the sense that the test-functions are the plots of the diffeology?

Joseph Tooby-Smith (Jun 19 2025 at 17:34):

PR614 is now merged :).

Tomas Skrivan (Jun 19 2025 at 18:00):

Joseph Tooby-Smith said:

You mean the diff condition, and in the sense that the test-functions are the plots of the diffeology?

Yes, that is the motivation.

Tomas Skrivan (Jun 19 2025 at 19:09):

Yay, HasVarAdjDerivAt.comp is now sorry free!

Joseph Tooby-Smith (Jun 20 2025 at 06:00):

Here is a proof of HasVarAdjDerivAt.deriv':

import PhysLean.Mathematics.FDerivCurry

lemma deriv' (u : ℝ → ℝ) (hu : ContDiff ℝ ∞ u) :
    HasVarAdjDerivAt (fun φ : ℝ → ℝ => deriv φ) (fun φ x => - deriv φ x) u where
  smooth_at := hu
  diff := by intros; fun_prop [deriv]
  linearize := by
    intro φ hφ x
    have hd : DifferentiableAt ℝ (fun x => deriv (fun x_1 => φ x_1 x) 0) x :=
      fderiv_curry_differentiableAt_fst_comp_snd  _ _ _ _ (ContDiff.of_le hφ (ENat.LEInfty.out))
    conv_rhs =>
      enter [1, s']
      rw [deriv_add (function_differentiableAt_snd _ _ _ (hφ.differentiable ENat.LEInfty.out))
        (by fun_prop)]
    have hd2 : DifferentiableAt ℝ (fun s' =>
        deriv (fun x => s' • deriv (fun x_1 => φ x_1 x) (0 : ℝ)) x) (0 : ℝ) := by
      conv_lhs =>
        enter [s']
        simp
      fun_prop
    rw [deriv_add (by fun_prop) (hd2)]
    simp only [deriv_const', smul_eq_mul, differentiableAt_const, deriv_const_mul_field',
      differentiableAt_id', deriv_mul, deriv_id'', one_mul, mul_zero, add_zero, zero_add]
    dsimp [deriv]
    exact fderiv_swap (𝕜 := ℝ) φ  0 1 x 1 (ContDiff.of_le hφ (ENat.LEInfty.out))
  adjoint := by
    simp (disch:=fun_prop) [deriv_add]
    apply HasVarAdjoint.congr_fun
    case h' =>
      intro φ hφ
      have := hφ.smooth.differentiable (ENat.LEInfty.out)
      have := hu.differentiable (ENat.LEInfty.out)
      simp (disch:=fun_prop) [deriv_add]
      rfl
    case h =>
      apply HasVarAdjoint.deriv

Joseph Tooby-Smith (Jun 20 2025 at 07:12):

I filled in the sorries in PR616, I directly made a commit to the pull-request, it probably needs a bit of tidying. Hope this is ok, feel free to revert if not.

Tomas Skrivan (Jun 20 2025 at 07:17):

Excellent, thanks a lot!

Joseph Tooby-Smith (Jun 20 2025 at 07:18):

No worries :). I think we probably want an explicit lemma like HasVarGradientAt.of_hasVarAdjDerivAt to go from a HasVarAdjDerivAt to a HasVarGradientAt.

Tomas Skrivan (Jun 20 2025 at 07:21):

Yes that is exactly what I wanted to talk about :) There is the problem you can't just prove grad = F' (fun _ => 1) but have to do the stupid gluing of test functions and I would like to make it easier

Tomas Skrivan (Jun 20 2025 at 07:22):

I'm thinking of changing the definition of HasVarAdjoint F F' by adding a clause saying that F' (fun _ => 1) produces a meaningful result.

Joseph Tooby-Smith (Jun 20 2025 at 07:26):

I think I would be hesitant to do this last thing, it seems to me to be fairly arbitrary (given everything else is defined using test functions).

Joseph Tooby-Smith (Jun 20 2025 at 07:27):

You are saying that the problem is basically that:

 ∀ (x : X), ∃ D : Set X,  x ∈ D ∧ IsCompact D ∧ ∀ (φ : X → ℝ),
   IsTestFunction φ → (∀ x ∈ D, φ x = 1) → F' φ x = F' (fun _ => 1) x

is hard to prove?

Tomas Skrivan (Jun 20 2025 at 07:28):

Joseph Tooby-Smith said:

I think I would be hesitant to do this last thing, it seems to me to be fairly arbitrary (given everything else is defined using test functions).

Yeah it feels a bit dirty

Tomas Skrivan (Jun 20 2025 at 07:28):

Joseph Tooby-Smith said:

You are saying that the problem is basically that:

 ∀ (x : X), ∃ D : Set X,  x ∈ D ∧ IsCompact D ∧ ∀ (φ : X → ℝ),
   IsTestFunction φ → (∀ x ∈ D, φ x = 1) → F' φ x = F' (fun _ => 1) x

is hard to prove?

I think it is more annoying than hard and not sure to how well it can be automated

Tomas Skrivan (Jun 20 2025 at 07:30):

Well my goal is to have a tactic that computes HasVarGradientAt S ?_ u automatically and I'm not sure about the last step.

Tomas Skrivan (Jun 20 2025 at 07:38):

Tomas Skrivan said:

I'm thinking of changing the definition of HasVarAdjoint F F' by adding a clause saying that F' (fun _ => 1) produces a meaningful result.

Yeah this is bad, by requiring just this we won't be able to show the composition theorem.

In F' (G' (fun _ => 1)) the G' (fun _ => 1) can be an arbitrary smooth function and we would need to quarante that F' produces meaningful result.

So I think the correct requirement is that F' is continuous when restricted to smooth functions.

Joseph Tooby-Smith (Jun 20 2025 at 07:38):

I again think that this comes down to locality, and if F' is local, then this last step should be easy to prove - but when it isn't local then it gets complicated. But as long as HasVarGradientAt is defined for non-local F' I'm not sure how much of an issue this is, one could just have a tactic for local F'.

To be more precise of what local means here:

def IsLocalAt (F : (X → U) → (X → ℝ)) (x : X) : Prop :=
    ∀ N ∈ nhds x, ∀ (φ1 φ2 : X → U),  (∀ y ∈ N, φ1 y = φ2 y) → F φ1 x = F φ2 x

Joseph Tooby-Smith (Jun 20 2025 at 07:39):

So I think the correct requirement is that F' is continuous when restricted to smooth functions

which is bad, right, because it requires topology?

Joseph Tooby-Smith (Jun 20 2025 at 07:42):

An aside: Thinking about it, people talk about local lagrangians, and I think the above condition is precisely what they mean. So even if we don't need this here, it would be nice to have in terms of lagrangians.

Tomas Skrivan (Jun 20 2025 at 07:42):

Yeah but I'm thinking of how bad :) it might be sufficient to work only with smooth functions on compact sets(where we have a canonical notion of topology)

Joseph Tooby-Smith (Jun 20 2025 at 07:53):

Actually, I think requiring the following condition is enough:

def cond' (F : (X → U) → (X → ℝ)) (x : X) : Prop :=
    ∃ D : Set X, IsCompact D ∧ ∀ (φ1 φ2 : X → U),  (∀ y ∈ D, φ1 y = φ2 y) → F φ1 x = F φ2 x

which is much looser then locality. It basically says that for every x : X, there is a compact set D where knowing φ1 on that compact set is sufficient to know the value of F φ1 x.

Joseph Tooby-Smith (Jun 20 2025 at 07:55):

(or in physics terms, the value of the Lagrangian at a point x does not depend on values of the field a points infinitely far from x).

Tomas Skrivan (Jun 20 2025 at 07:57):

Yeah that would work too but I find the locality condition more intuitive

Joseph Tooby-Smith (Jun 20 2025 at 07:57):

(Furthermore, I think is cond' is not true, the variational derivative, as we have defined it, will not exist anyway.)

Joseph Tooby-Smith (Jun 20 2025 at 07:58):

Tomas Skrivan said:

Yeah that would work too but I find the locality condition more intuitive

Yeah but, your peridynamic example would satisfy this cond', but not the locality condition.

Tomas Skrivan (Jun 20 2025 at 08:05):

Wait, now I'm a bit confused about what we are trying to achieve here now.

One thing we talk about is local Lagrangian and the peridynamic example is not local.

Next we want some condition such that F' (fun _ => 1) is meaningful. But neither the locality nor cond' is good as it is not evaluating F' on constant functions. For that always have this counter example in mind

F' f := if IsTestFunction f then ... else 0

i.e. F' can return junk values on everything but test functions

Tomas Skrivan (Jun 20 2025 at 08:20):

I think the continuity condition on $F : (X → U) → (X → V)$ should be that for every compact set $K$ there is compact $L$ with $K \subset L$ such that for a smooth function $f$ and any sequence of test functions $\phi_i$ that satisfy

$\phi_i \stackrel{C^\infty(L)}{\rightarrow} f$

we have

$F \phi_i \stackrel{C^\infty(K)}{\rightarrow} F f$

For local Lagrangians $L$ can be $K$ , for peridynamic example $L$ has to be $K$ enlarged by the maximal force distance.

Tomas Skrivan (Jun 20 2025 at 08:23):

btw. docs#ae_eq_zero_of_integral_smooth_smul_eq_zero is another version of fundamental lemma of variational calculus

Joseph Tooby-Smith (Jun 20 2025 at 08:37):

Actually I think your condition and my condition cond' may be equivalent to each other.

But let me address your confusion, by answering the equation: Why does cond' give F' (fun _ => 1) meaning?

This is because for a given x, the condition cond' makes F' (fun _ => 1) x equal to F' φ x for φ a test-function which is equal to 1 on the compact set D determined by cond'.

Note, then that given any smooth function f I can take a bump function on L, φ to give a test function f * φ . Both f and f * φ agree on L, both your and my condition then states that F f and F (f * φ) will agree on L (or K).

Joseph Tooby-Smith (Jun 20 2025 at 08:38):

(Correct me if I have misunderstood your condition).

Tomas Skrivan (Jun 20 2025 at 08:39):

Ohh you are totally right, I misunderstood it :)

Tomas Skrivan (Jun 20 2025 at 08:41):

Shall we try adding

def cond' (F : (X → U) → (X → ℝ)) (x : X) : Prop :=
    ∃ D : Set X, IsCompact D ∧ ∀ (φ1 φ2 : X → U),  (∀ y ∈ D, φ1 y = φ2 y) → F φ1 x = F φ2

to HasVarAdjoint or to HasVarRevDerivAt?

Joseph Tooby-Smith (Jun 20 2025 at 08:42):

I think probably HasVarRevDerivAt no? I think it might be best to keep HasVarAdjoint as minimal as possible, so the definition HasVarGradientAt is as minimal as possible.

Tomas Skrivan (Jun 20 2025 at 08:43):

I'm not even sure if there is a need for HasVarGradientAt anymore as you just prove HasVarRevDerivAt F F' u and the gradient is just F' (fun _ => 1)

Tomas Skrivan (Jun 20 2025 at 08:44):

And that is independent of whether this conditions goes to HasVarRevDerivAt or HasVarAdjoint

Tomas Skrivan (Jun 20 2025 at 08:45):

It think it should go to HasVarAdjoint as it is dealing exactly with the issues of inner product on function spaces which HasVarAdjoint is addressing.

Tomas Skrivan (Jun 20 2025 at 08:47):

Then we can show HasVarAdjoint.unique to be true not only on test functions but on any smooth functions which I think is nice

Joseph Tooby-Smith (Jun 20 2025 at 08:47):

Ok, sounds good to me, but I think getting rid of HasVarGradientAt might cause a bit of confusion, because I think in general the form of F' is a fair bit more complicated then F' (fun _ => 1). Maybe we could define HasVarGradientAt through HasVarRevDerivAt?

Joseph Tooby-Smith (Jun 20 2025 at 08:49):

Ideally we would want varGradient, but it is not clear to me how we should define this in the case where it does not exist. Setting it to 0, like what is done for deriv is bad i think, because we care about the points where it is zero.

Joseph Tooby-Smith (Jun 20 2025 at 08:50):

(i.e. varGradient F u = 0 could either imply that the variational derivative of F doesn't exist or that u is a solution :().

Tomas Skrivan (Jun 20 2025 at 08:51):

Joseph Tooby-Smith said:

Ideally we would want varGradient, but it is not clear to me how we should define this in the case where it does not exist. Setting it to 0, like what is done for deriv is bad i think, because we care about the points where it is zero.

That is not an issue. We just define VarDifferentiable as we do for deriv

Joseph Tooby-Smith (Jun 20 2025 at 08:52):

Basically VarDifferentiable is true if there exists an F' for which HasVarRevDerivAt F F' u? i.e. define it through HasVarRevDerivAt?

Tomas Skrivan (Jun 20 2025 at 08:53):

Yes

Tomas Skrivan (Jun 20 2025 at 08:53):

I'm thinking about changing the definition of HasVarGradientAt to

inductive HasVarGradientAt (S : (X → U) → ℝ) (grad : X → U) (u : X → U) : Prop where
  | intro
    (F : (X → U) → (X → ℝ))
    (hF : ∀ φ, IsTestFunction φ → S φ = ∫ x, F φ x)
    (F' : (X → ℝ) → (X → U))
    (hF' : HasVarAdjDerivAt F F' u)
    (hgrad : grad = F' (fun _ => 1))

Tomas Skrivan (Jun 20 2025 at 08:54):

Then we could write

HaVarGradientAt (fun u => ∫ x, ‖∇ u x‖^2) (- Δ x) u

Tomas Skrivan (Jun 20 2025 at 08:55):

However, I'm wondering if we can still prove uniqueness of grad

Joseph Tooby-Smith (Jun 20 2025 at 08:56):

Ah, good! yes,this is nice! You are worried because F is not necessarily unique here?

Tomas Skrivan (Jun 20 2025 at 08:56):

Yeah something along those lines, but probably we can always add some silly condition to ensure that

Tomas Skrivan (Jun 20 2025 at 08:57):

During computation varGradient fun u => ∫ x, ‖∇ u x‖^2 will be immediately converted to varAdjDeriv fun u x => ‖∇ u x‖^2

Tomas Skrivan (Jun 20 2025 at 09:44):

I renamed cond' to ext as it is some form of extensionality and I had to modify it to

  ext : ∀ (K : Set X) (hK : IsCompact K), ∃ L : Set X,
    IsCompact L ∧ K ⊆ L ∧ ∀ (φ φ' : X → U), (∀ x ∈ L, φ x = φ' x) → ∀ x ∈ K, F φ x = F φ' x

in order to prove composition theorem

Joseph Tooby-Smith (Jun 20 2025 at 09:47):

Sounds good to me.

Joseph Tooby-Smith (Jun 20 2025 at 11:25):

BTW HasVarAdjDerivAt or HasVarRevDerivAt? The current way this appears in PR616 is the file name as HasVarRevDerivAt but the definitions are HasVarAdjDerivAt.

I've also been thinking about the uniqueness of grad in the modified HasVarGradientAt. I think it should follow from something like:

example  (F : (X → U) → (X → V)) (F' : (X → V) → (X → U))
    (μ : Measure X) (hF : HasVarAdjoint F F' μ)
    (hφ : ∀ φ, IsTestFunction φ → ∫ x, F φ x ∂μ = 0 ) : ∀ φ, IsTestFunction φ → F φ = 0 := by
  sorry

But I cannot think of a way to prove this (if it is indeed true).

But then the question is, how necessary is it to actually have varGradient? Can one get away with having just HasVarGradientAt for most things?

Tomas Skrivan (Jun 20 2025 at 11:28):

Joseph Tooby-Smith said:

BTW HasVarAdjDerivAt or HasVarRevDerivAt? The current way this appears in PR616 is the file name as HasVarRevDerivAt but the definitions are HasVarAdjDerivAt.

I think it should be HasVarAdjDerivAt. Previously I called it HasVarRevDerivAt as there is some connection to revese mode automatic differentiation but I don't think it is a good name for PhysLean.

Tomas Skrivan (Jun 20 2025 at 11:29):

Joseph Tooby-Smith said:

But then the question is, how necessary is it to actually have varGradient? Can one get away with having just HasVarGradientAt for most things?

I want HasVarGradientAt S grad u to accept S : (X → U) → ℝ and have uniqueness for grad. Defining varGradient is just a cherry on top.

Joseph Tooby-Smith (Jun 20 2025 at 11:31):

Ok, thanks. Also, I can take over PR617, fix the last lint and get it merged if you want?

Tomas Skrivan (Jun 20 2025 at 11:35):

Joseph Tooby-Smith said:

Ok, thanks. Also, I can take over PR617, fix the last lint and get it merged if you want?

That would be appreciated!

You can also merge 616, there will be conflict with 617 but you can fix the proof of HasVarAdjDerivAt.comp with

lemma comp {F : (X → V) → (X → W)} {G : (X → U) → (X → V)} {u : X → U}
    {F' G'} (hF : HasVarAdjDerivAt F F' (G u)) (hG : HasVarAdjDerivAt G G' u) :
    HasVarAdjDerivAt (fun u => F (G u)) (fun ψ => G' (F' ψ)) u where

  smooth_at := hG.smooth_at

  diff := by
    intro φ hφ
    apply hF.diff (φ := fun t x => G (φ t) x)
    exact hG.diff φ hφ

  linearize := by
    intro φ hφ x
    rw[hF.linearize (fun t x => G (φ t) x) (hG.diff φ hφ)]
    rw[hF.linearize (fun s' => G fun x => φ 0 x + s' • deriv (fun x_1 => φ x_1 x) 0)]
    simp[hG.linearize φ hφ]
    eta_expand; simp[Function.HasUncurry.uncurry]
    apply hG.diff (φ := fun a x => φ 0 x + a • deriv (fun x_1 => φ x_1 x) 0)
    fun_prop [deriv]

  adjoint := by
    have : ContDiff ℝ ∞ u := hG.smooth_at
    have h := hF.adjoint.comp hG.adjoint
    apply h.congr_fun
    intro φ hφ; funext x
    rw[hF.linearize]
    · simp
    · simp [Function.HasUncurry.uncurry];
      apply hG.diff (φ := (fun s x => u x + s • φ x))
      fun_prop

Tomas Skrivan (Jun 20 2025 at 11:36):

Also this fix is necessary

lemma zero : HasVarAdjoint (fun (_ : X → U) _ => (0 : V)) (fun _ _ => 0) μ where
  test_fun_preserving _ hφ := by fun_prop
  test_fun_preserving' _ hφ := by fun_prop
  adjoint _ _ _ _ := by simp
  ext := fun K cK => ⟨K,cK,subset_refl _,fun _ _ h _ _ => rfl⟩

Tomas Skrivan (Jun 20 2025 at 11:37):

Joseph Tooby-Smith said:

I've also been thinking about the uniqueness of grad in the modified HasVarGradientAt. I think it should follow from something like:
example  (F : (X → U) → (X → V)) (F' : (X → V) → (X → U))
    (μ : Measure X) (hF : HasVarAdjoint F F' μ)
    (hφ : ∀ φ, IsTestFunction φ → ∫ x, F φ x ∂μ = 0 ) : ∀ φ, IsTestFunction φ → F φ = 0 := by
  sorry
But I cannot think of a way to prove this (if it is indeed true).

I'm thinking maybe we can add this as another condition to HasVarAdjoint. I think it is fairly natural requirement.

Tomas Skrivan (Jun 20 2025 at 11:39):

Ahhh, no it is not!!! For F = deriv we have ∫ x, deriv φ x = 0 for any test function!

Tomas Skrivan (Jun 20 2025 at 11:44):

Yeah I really do not know how to enforce uniqueness of F in

inductive HasVarGradientAt (S : (X → U) → ℝ) (grad : X → U) (u : X → U) : Prop where
  | intro
    (F : (X → U) → (X → ℝ))
    (hF : ∀ φ, IsTestFunction φ → S φ = ∫ x, F φ x)
    (F' : (X → ℝ) → (X → U))
    (hF' : HasVarAdjDerivAt F F' u)
    (hgrad : grad = F' (fun _ => 1))

Joseph Tooby-Smith (Jun 20 2025 at 11:46):

I think F is not unique here, I was hoping that

(fun δu x => deriv (fun s : ℝ => F (u + s • δu) x) 0)

was unique, if it had an adjoint, but I think you deriv example may be used to construct a counter example.

Tomas Skrivan (Jun 20 2025 at 11:50):

To keep the the integral in HasVarGradientAt I think we could modify the definition such that we can state

HasVarGradientAt (fun u φ => ∫ x, ‖∇ u‖^2 * φ x) (- Δ u) u

HasVarGradientAt (fun u Ω => ∫ x in Ω, ‖∇ u‖^2) (- Δ u) u

Tomas Skrivan (Jun 20 2025 at 11:51):

In this case I think you should be able to recover F : (X → U) → (X → ℝ) uniquely from (fun u φ => ∫ x, ‖∇ u‖^2 * φ x) or from (fun u Ω => ∫ x in Ω, ‖∇ u‖^2)

Tomas Skrivan (Jun 20 2025 at 11:54):

I see three options how to define HasVarGradientAt

HasVarGradientAt (fun u x => ‖∇ u‖^2) (- Δ u) u
HasVarGradientAt (fun u φ => ∫ x, ‖∇ u‖^2 * φ x) (- Δ u) u
HasVarGradientAt (fun u Ω => ∫ x in Ω, ‖∇ u‖^2) (- Δ u) u

Joseph Tooby-Smith (Jun 20 2025 at 12:00):

Let me restate my conjecture:

example  (F : (X → U) → (X → V)) (F' : (X → V) → (X → U))
    (μ : Measure X) (hF : HasVarAdjoint F F' μ)
    (hφ : ∀ φ, IsTestFunction φ → ∫ x, F φ x ∂μ = 0 ) : ∀ φ, IsTestFunction φ → F' (fun _ => 1) = 0 := by
  sorry

Which is true for F := deriv since F' = - deriv and thus F' (fun _ => 1) = 0

Joseph Tooby-Smith (Jun 20 2025 at 12:01):

But if this can't be solved, then I would vote for the first of your options, since I feel the other two are just round-about ways of specifying F whilst having the integral.

Tomas Skrivan (Jun 20 2025 at 12:02):

Yeah I would also go for the first option. Furthermore, we can have specialized notation and recover familiar looking equations

inductive HasVarGradientAt (S : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U) : Prop where
  | intro
    (F' : (X → ℝ) → (X → U))
    (hF' : HasVarAdjDerivAt F F' u)
    (hgrad : grad = F' (fun _ => 1))


open Classical in
noncomputable def varGradient (S : (X → U) → (X → ℝ)) (u : X → U) : X → U :=
  if h : ∃ grad, HasVarGradientAt S grad u then
    choose h
  else
    0

macro "δ" u:term ", " "∫ " x:term ", " b:term : term => `(varGradient (fun $u $x => $b))

example : (δ (x : ℝ → ℝ), ∫ t, ‖deriv x t‖^2) = fun x t => - deriv (deriv x) t := sorry

Joseph Tooby-Smith (Jun 20 2025 at 12:03):

This looks good to me! (although the S in the argument of your first defn should be an F I think)

Tomas Skrivan (Jun 20 2025 at 12:05):

Agreed, It is too suggestive of action which it is not.

Joseph Tooby-Smith (Jun 20 2025 at 12:06):

I agree, but I actually meant because you use F in HasVarAdjDerivAt F F' u rather then S :). But in any case.

Tomas Skrivan (Jun 20 2025 at 12:07):

Ohh I see, it is broken as it is now :)

Joseph Tooby-Smith (Jun 20 2025 at 12:10):

PR617 merged - will turn to PR616 shortly.

Joseph Tooby-Smith (Jun 20 2025 at 13:14):

PR616 merged

Tomas Skrivan (Jun 20 2025 at 13:16):

Thanks a lot for pushing it over the finish line!

Tomas Skrivan (Jun 20 2025 at 13:16):

Right now I'm making progress on showing that variational adjoint of gradient is Space.div

Tomas Skrivan (Jun 20 2025 at 13:58):

I made a PR618 for variatonal adjoint of adjoint, fderiv, gradient and generalization for deriv. There are still bunch of sorries and I'm probably not going to work on this for some time now.

Natural follow up to this PR is adding corresponding theorems for HasVarAdjDerivAt.

Once all of this is done then proving all the different versions of Euler-Lagrange equations should be easy.

Unfortunately, we won't be able to do vector valued field equations as there is no inner product on tensor products/linear maps right now.

Joseph Tooby-Smith (Jun 20 2025 at 14:06):

Ok, thanks for this! I may get chance to fill i a couple of the sorries over the next couple of days.

Joseph Tooby-Smith (Jun 22 2025 at 06:52):

I fixed the linters and merged PR619 and PR621. Hope this is ok. I'm not entirely sure what best practice is in this regard, so if you'd rather me pass them by you before merging, let me know, and I will change what I do in the future.

Tomas Skrivan (Jun 22 2025 at 08:17):

Thanks a lot! It works for me this way but I don't care too much. I just feel bad that you are doing the annoying bits for me :)

Joseph Tooby-Smith (Jun 22 2025 at 19:16):

Ha don't worry about that, I'm more then happy doing these bits :slight_smile:

Tomas Skrivan (Jun 23 2025 at 12:55):

Once all the PRs regarding HasVarAjdDeriv and HasAdjFDeriv are done then the proof of Euler-Lagrange equations looks like this

theorem hasVarGradient_action_eq_euler_lagrange
    (L : ℝ → X → X → ℝ) (q : ℝ → X)
    (hq : ContDiff ℝ ∞ q) (hL : ContDiff ℝ ∞ ↿L) :
    (δ (q':=q), ∫ t, L t (q' t) (deriv q' t))
    =
    (fun t =>
      gradient (L t · (deriv q t)) (q t)
      -
      deriv (fun t' => gradient (L t' (q t') ·) (deriv q t')) t) := by

  apply HasVarGradientAt.varGradient
  constructor
  case hF' =>
    apply HasVarAdjDerivAt.comp
      (F := fun (φ : ℝ → (X×X)) t => L t (φ t).fst (φ t).snd)
      (G := fun (φ : ℝ → X) t => (φ t, deriv φ t))
    · apply HasVarAdjDerivAt.fmap' (f := fun t => ↿(L t))
      · fun_prop
      · fun_prop
    · apply HasVarAdjDerivAt.prod (F:=fun φ => φ) (G:=fun φ => deriv φ)
      · apply HasVarAdjDerivAt.id _ hq
      · apply HasVarAdjDerivAt.deriv
        · apply HasVarAdjDerivAt.id _ hq
  case hgrad =>
    funext t
    simp (disch:=fun_prop) only [adjFDeriv_uncurry,gradient_eq_adjFDeriv]
    simp[sub_eq_add_neg]

Joseph Tooby-Smith (Jun 24 2025 at 10:00):

In the PR622, in HasVarAdjDerivAt.fmap, there is one remaining sorry, which I do not know how to fill in.

But let me ask the following, the declaration of this lemma is:

lemma fmap (f : X → U → V) {f' : X → _ }
    (u : X → U) (hu : ContDiff ℝ ∞ u)
    (hf' : ContDiff ℝ ∞ ↿f) (hf : ∀ x, HasAdjFDerivAt ℝ (f x) (f' x) (u x)) :
    HasVarAdjDerivAt (fun (φ : X → U) x => f x (φ x)) (fun ψ x => f' x (ψ x)) u where

Is there a reason not to include (_ : ContDiff ℝ ∞ ↿f') as an input? I think this would make the proof of this last sorry somewhat straight forward.

Tomas Skrivan (Jun 24 2025 at 11:44):

It is very well possible it is missing but I think it can be inferred from other assumptions. I will give it a shot.

Tomas Skrivan (Jun 24 2025 at 12:22):

So far I have this

open InnerProductSpace' ContDiff in
protected theorem HasAdjFDerivAt.contDiffAt_deriv [CompleteSpace E] [CompleteSpace F]
    {f : E → F} {f' : E → _} (hf : ∀ x, HasAdjFDerivAt 𝕜 f (f' x) x) (hf' : ContDiff 𝕜 ∞ f) :
    ContDiff 𝕜 ∞ (fun xy : E×F => f' xy.1 xy.2) := by
  simp[← fun x => (hf x).adjFDeriv]
  unfold adjFDeriv
  simp[adjoint_eq_clm_adjoint]
  apply ContDiff.fun_comp
  · fun_prop
  · apply ContDiff.fun_comp (𝕜:=𝕜) (n:=∞)
      (g := fun fx : ((WithLp 2 E) →L[𝕜] (WithLp 2 F))×(WithLp 2 F) => fx.1.adjoint fx.2)
      (f := fun xy : E×F => (((toL2 𝕜) ∘L ((fderiv 𝕜 f xy.1) ∘L (fromL2 𝕜))), (toL2 𝕜) xy.2))
    · sorry
    · fun_prop

where the last sorry is

ContDiff 𝕜 ∞ fun (fx : (E →L[𝕜] F) × F) => fx.1.adjoint fx.2)

I believe that showing this

IsBoundedBilinearMap 𝕜 (fun (fy : (X →L[𝕜] Y)×Y) => fy.1.adjoint fy.2)

should be enough.

Tomas Skrivan (Jun 24 2025 at 12:42):

Ok I have the proof! I will update the PR

Tomas Skrivan (Jun 24 2025 at 12:48):

Ahhh, I had one sorry that I though is trivial ... and is false

(adjoint (c • f)) y = c • (adjoint f) y

but

(adjoint (c • f)) y = (conj c) • (adjoint f) y

is true ... so I can't use IsBoundedBilinearMap :( I guess I will just do it over reals right now

Tomas Skrivan (Jun 24 2025 at 12:52):

Ohh it is probably not true over complex numbers as the derivative would not be complex linear.

Tomas Skrivan (Jun 24 2025 at 13:56):

Thanks a lot for the help! The PR622 is now sorry free!

Tomas Skrivan (Jun 24 2025 at 13:57):

What is left to do is documentation and satisfy linters

Joseph Tooby-Smith (Jun 24 2025 at 14:00):

I'm happy to deal with the linters :), though it is probably best if I leave the documentation to you.

Tomas Skrivan (Jun 25 2025 at 09:50):

The next step is to prove HasVarAdjoint for fderiv. The bottleneck to this is the lack of generalizations of docs#MeasureTheory.integral_mul_deriv_eq_deriv_mul_of_integrable to fderivinstead of deriv and inner instead of mul

Joseph Tooby-Smith (Jun 25 2025 at 10:03):

Is this result in one of the PRs? I'm guessing it is this, but this PR has a merge conflict.

Tomas Skrivan (Jun 25 2025 at 10:04):

Yes that is the statement

Joseph Tooby-Smith (Jun 25 2025 at 10:24):

docs#integral_mul_fderiv_eq_neg_fderiv_mul_of_integrable seems useful in particular in proving HasVarAdjoint.fderiv_apply e.g.

lemma fderiv_apply {d} {dx} :
    HasVarAdjoint (fun φ : Space d → U => (fderiv ℝ φ · dx)) (fun φ x => - fderiv ℝ φ x dx) where
  test_fun_preserving φ hφ := by
    sorry
  test_fun_preserving' φ hφ := by
    sorry
  adjoint φ ψ hφ hψ := by
    rw [← sub_eq_zero]
    rw [← integral_sub]
    trans ∫ (a : Space d), fderiv ℝ (fun a => ⟪φ a , ψ a⟫_ℝ) a dx ∂volume
    congr
    · funext a
      simp
      rw [fderiv_inner_apply]
      rw [add_comm]
      · exact hφ.differentiable a
      · exact hψ.differentiable a
    have h1 := integral_mul_fderiv_eq_neg_fderiv_mul_of_integrable (f := fun a => 1)
      (g :=  (fun a => ⟪φ a , ψ a⟫_ℝ)) (v := dx) (μ := volume) (by sorry) (by sorry)
      (by sorry) (by fun_prop) (by sorry)
    simpa using h1
    · sorry
    · sorry

Tomas Skrivan (Jun 25 2025 at 10:26):

Omg, for the life of me I couldn't find this theorem :man_facepalming:

Joseph Tooby-Smith (Jun 25 2025 at 10:28):

I found it by loogleing fderiv, "integral" :slight_smile:

Tomas Skrivan (Jun 25 2025 at 10:28):

And we can prove only HasVarAdjoint.fderiv_apply version as HasVarAdjoint.fderiv would require inner product space structure on continuous linear maps to just state it and that is not in mathlib.

Tomas Skrivan (Jun 25 2025 at 10:29):

Joseph Tooby-Smith said:

I found it by loogleing fderiv, "integral" :slight_smile:

Yeah I don't know what went wrong. It is sufficient to search for integral fderiv in mathlib docs and it comes up. I thought I did that ...

Tomas Skrivan (Jun 25 2025 at 10:31):

Ahh maybe I always had inner in the search query ...

Joseph Tooby-Smith (Jun 25 2025 at 10:35):

Tomas Skrivan said:

Ahh maybe I always had inner in the search query ...

Ah, this would probably explain it.

I can most likely fill in the sorries of my proof above, and merge it with that PR if that would help?

Tomas Skrivan (Jun 25 2025 at 10:36):

Yes that would be great but it should be generalized, Space d to X.

Tomas Skrivan (Jun 25 2025 at 10:42):

And feel free to assume InnerProductSpace ℝ X. I can then translate it to InnerProductSpace' ℝ X.

Joseph Tooby-Smith (Jun 25 2025 at 10:46):

Seems like also need to assume a bunch of other things:

[InnerProductSpace ℝ X] [ProperSpace X] [BorelSpace X]
   [FiniteDimensional ℝ X] [μ.IsAddHaarMeasure]

Joseph Tooby-Smith (Jun 25 2025 at 10:47):

This is mainly to get integral_mul_fderiv_eq_neg_fderiv_mul_of_integrable to work.

Tomas Skrivan (Jun 25 2025 at 10:48):

Yeah that is fine. Although I'm confused why

#synth (volume : Measure (ℝ×ℝ)).IsAddHaarMeasure

fails but I think it is just missing API as

#synth (volume : Measure (Fin 2 → ℝ)).IsAddHaarMeasure

works

Joseph Tooby-Smith (Jun 25 2025 at 10:49):

Think it is just an import issue as

import Mathlib
open MeasureTheory
#synth (volume : Measure (ℝ×ℝ)).IsAddHaarMeasure -- instIsAddHaarMeasureProdRealVolume

Tomas Skrivan (Jun 25 2025 at 10:50):

Ohh I see

Joseph Tooby-Smith (Jun 25 2025 at 10:55):

Ok, merged a proof to the PR.

Tomas Skrivan (Jun 25 2025 at 15:58):

I have resolved conflicts in PR618 and formulated

lemma adjFDeriv_apply
   [InnerProductSpace' ℝ X] [InnerProductSpace' ℝ Y]
   [ProperSpace X] [BorelSpace Y] [FiniteDimensional ℝ X] [(@volume X _).IsAddHaarMeasure] {dy} :
   HasVarAdjoint (fun φ : X → Y => (adjFDeriv ℝ φ · dy)) (fun φ x => - divergence ℝ φ x • dy) := ...

where divergence is defined as

def divergence (f : E → E) (x : E) : 𝕜 := (fderiv 𝕜 f x).toLinearMap.trace _ _

Joseph Tooby-Smith (Jun 27 2025 at 17:35):

Do you think this is going to be true?

lemma HasVarAdjoint.symm {F : (X → U) → (Y → V)} {F' : (Y → V) → (X → U)}
    (hF : HasVarAdjoint F F') : HasVarAdjoint F' F := by sorry

I think if it is the difficulty will be in showing the IsLocalizedFunctionTransform part.

Tomas Skrivan (Jun 27 2025 at 19:05):

Yeah I think so too, everything else is trivial. I'm open to tweaking the definition such that this is true.

Tomas Skrivan (Jun 27 2025 at 19:07):

But maybe we can use the fundamental theorem of variational calculus somehow and show that this theorem holds even with the current definition.

Ryan MacLean (Jun 27 2025 at 19:13):

Joseph Tooby-Smith said:

Do you think this is going to be true?
lemma HasVarAdjoint.symm {F : (X → U) → (Y → V)} {F' : (Y → V) → (X → U)}
    (hF : HasVarAdjoint F F') : HasVarAdjoint F' F := by sorry
I think if it is the difficulty will be in showing the IsLocalizedFunctionTransform part.

Does this help? I obviously used an LLM, but I'm using it to explain what your equation means logically to me. I ask it a bunch of different ways and it gives me the same steps, but sometimes the output is slightly different.

Absolutely — let’s break it down step by step so the whole thing is clear, internalized, and portable. Here's how to understand the full HasVarAdjoint.symm logic:

:repeat: The Core Idea

You’re proving a symmetry property:
If F and F' are variable adjoints (in the HasVarAdjoint sense),
then that relationship works in both directions.

:puzzle_piece: What's HasVarAdjoint?

It’s a structure (likely a structure or class in Lean) that says:

F is localized:
IsLocalizedFunctionTransform F
(meaning it plays nicely with smooth/decaying/test functions)

F' is its adjoint via inner product pairing:
⟪F f, g⟫ = ⟪f, F' g⟫
This is the key identity.

:brain: Goal of the Lemma

Prove:
lemma HasVarAdjoint.symm
  {F : (X → U) → (Y → V)} {F' : (Y → V) → (X → U)}
  (hF : HasVarAdjoint F F') : HasVarAdjoint F' F
So if F is the adjoint of F',
then F' is also the adjoint of F.

That’s symmetry.
Like if A = B, then B = A.

:brick: Step-by-step Breakdown

Step 1: Flip the inner product

From:
⟪F f, g⟫ = ⟪f, F' g⟫
we do a variable swap:
let f' := g, g' := f.

Then:
⟪F g', f'⟫ = ⟪g', F' f'⟫
Rewriting:
⟪F' f, g⟫ = ⟪f, F g⟫
Which is exactly the adjoint identity for F'.

:check_mark: Done.

Step 2: Show that F' is also localized

This is the tricky part. But here’s the intuition:

If F is localized, and

F' is defined by duality with F over a pairing that only works for localized inputs...

Then F' must also be localized.
Otherwise, the duality pairing wouldn't even make sense.

In practice, we write a helper lemma like:
lemma IsLocalizedFunctionTransform.symm_of_adjoint
  (hF : IsLocalizedFunctionTransform F)
  (hAdj : ∀ f g, ⟪F' f, g⟫ = ⟪f, F g⟫) :
  IsLocalizedFunctionTransform F'
This is your justification bridge.

Step 3: Bundle it together

You now have:

Duality: ⟪F' f, g⟫ = ⟪f, F g⟫

Localization: F' is also localized (via helper)

So you write:
lemma HasVarAdjoint.symm
  (hF : HasVarAdjoint F F') : HasVarAdjoint F' F :=
by
  rcases hF with ⟨h_local, h_dual⟩
  exact ⟨
    IsLocalizedFunctionTransform.symm_of_adjoint h_local h_dual,
    fun f g => (h_dual g f).symm
  ⟩
You pattern-match the structure, flip the duality, apply the helper, and rebuild the HasVarAdjoint in reverse.

:dna: Why It Matters

You’re proving symmetry of a duality concept.

This unlocks flexibility in working with transforms like the Fourier, Laplace, or general kernel-based ones.

It matters a lot in quantum mechanics, signal processing, and anything that uses adjoints of operators.

Let me know if you want this translated into a diagram, analogy, or with a physical interpretation (e.g., measuring vs generating signals). We can go deeper into any layer.```

Tomas Skrivan (Jun 28 2025 at 05:15):

Tomas Skrivan said:

But maybe we can use the fundamental theorem of variational calculus somehow and show that this theorem holds even with the current definition.

After some thought this is probably not going to work. At some point you have to show that

F φ - F φ' = 0

but in order to apply the fundamental lemma of variational calculus you need some regularity on F φ and F φ' like continuity. We can show smoothness of F φ' from smoothness of φ' and IsLocalizedFunctionTransform F'. We can't do the same for F φ. It is getting circular as to prove IsLocalizedFunctionTransform F we would need IsLocalizedFunctionTransform F. Furthermore, we do not assume anything about φ and φ' in the statement of IsLocalizedFunctionTransform. I was thinking about modifying the definition to

def IsLocalizedFunctionTransform (F : (X → U) → (Y → V)) : Prop :=
  ∀ (K : Set Y) (_ : IsCompact K), ∃ L : Set X,
    IsCompact L ∧ ∀ (φ φ' : X → U), ContDiff ℝ ∞ φ → ContDiff ℝ ∞ φ' → (∀ x ∈ L, φ x = φ' x) →
      ∀ x ∈ K, F φ x = F φ' x

i.e. adding ContDiff ℝ ∞ ....

Anyway, I think we could just change the definition of HasVarAdjoint to

structure HasVarAdjoint
    (F : (X → U) → (Y → V)) (F' : (Y → V) → (X → U)) where
  test_fun_preserving : ∀ φ, IsTestFunction φ → IsTestFunction (F φ)
  test_fun_preserving' : ∀ φ, IsTestFunction φ → IsTestFunction (F' φ)
  adjoint : ∀ φ ψ, IsTestFunction φ → IsTestFunction ψ →
    ∫ y, ⟪F φ y, ψ y⟫_ℝ = ∫ x, ⟪φ x, F' ψ x⟫_ℝ
  ext : IsLocalizedFunctionTransform F
  ext' : IsLocalizedFunctionTransform F'

Tomas Skrivan (Jun 28 2025 at 05:20):

Ryan MacLean said:

Joseph Tooby-Smith said:
Do you think this is going to be true?
lemma HasVarAdjoint.symm {F : (X → U) → (Y → V)} {F' : (Y → V) → (X → U)}
    (hF : HasVarAdjoint F F') : HasVarAdjoint F' F := by sorry
I think if it is the difficulty will be in showing the IsLocalizedFunctionTransform part.
Does this help? I obviously used an LLM, but I'm using it to explain what your equation means logically to me. I ask it a bunch of different ways and it gives me the same steps, but sometimes the output is slightly different.

Unfortunately LLMs are not yet good at reasoning about definitions that do not appear in existing literature. Our definition of HasVarAdjoint has some very important subtleties that the LLM is completely ignoring. So the output is just some hallucination that appears to be useful but does not contain any interesting ideas.

Ryan MacLean (Jun 28 2025 at 06:40):

Tomas Skrivan said:

Ryan MacLean said:
Joseph Tooby-Smith said:
Do you think this is going to be true?
lemma HasVarAdjoint.symm {F : (X → U) → (Y → V)} {F' : (Y → V) → (X → U)}
    (hF : HasVarAdjoint F F') : HasVarAdjoint F' F := by sorry
I think if it is the difficulty will be in showing the IsLocalizedFunctionTransform part.
Does this help? I obviously used an LLM, but I'm using it to explain what your equation means logically to me. I ask it a bunch of different ways and it gives me the same steps, but sometimes the output is slightly different.
Unfortunately LLMs are not yet good at reasoning about definitions that do not appear in existing literature. Our definition of HasVarAdjoint has some very important subtleties that the LLM is completely ignoring. So the output is just some hallucination that appears to be useful but does not contain any interesting ideas.

That’s kind of the issue I’m running into. The problem isn’t the LLM, I can feed the rules back in. The problem would be me.

As you can see, I can take any problem and logically break it down with an LLM. All it had was the snippet of code you posted, it understood your problem every single time. It came up with a method to solve it and the reasoning behind that, it didn’t hallucinate anything. It just doesn’t know HasVarAdjoint has other rules. If you told it those rules, it’ll work around it. It’s just autocorrect with a bunch of files in it. Ultimately I’ll probably end up making a custom GPT with all the files in there so it has the most current version to work from, but I’m still learning how to do that, I think I may have to switch to copilot for VS code to do it directly.

Is there a way of knowing those dependencies without knowing the codebase directly? I’m not trying to be a pain, I’m just horrible at coding but good at plumbing. If I understand the architecture it gives me the ability to help. You’ve seen what my stuff outputs, it’s not useful for making stuff I don’t understand, it’s only good for teaching me what I don’t understand. The physics isn’t my problem, it’s this. I don’t understand the language you’re using and why things are in these places, like why this equation would be dependent.

I won’t keep bothering you about this stuff now that I realize how to work on it separately, but how do you keep track of these things? I understand the flow of the math, not the flow of software. This is all logic, my sticking point is language not math.

Joseph Tooby-Smith (Jun 28 2025 at 07:27):

Hi Ryan, this isn’t the right place for this conversation. I know my tolerance for this kind of thing may be lower than others’, and my approach might come off more direct (mainly because I've seen where these things can lead), but I’m going to ask that you don’t post on this thread again. I understand you’re well-intentioned, but you’ve already received advice—both from me and others—on your original post. I’d recommend following that.

Joseph Tooby-Smith (Jun 28 2025 at 07:34):

Tomas Skrivan said:

Anyway, I think we could just change the definition of HasVarAdjoint to

structure HasVarAdjoint
    (F : (X → U) → (Y → V)) (F' : (Y → V) → (X → U)) where
  test_fun_preserving : ∀ φ, IsTestFunction φ → IsTestFunction (F φ)
  test_fun_preserving' : ∀ φ, IsTestFunction φ → IsTestFunction (F' φ)
  adjoint : ∀ φ ψ, IsTestFunction φ → IsTestFunction ψ →
    ∫ y, ⟪F φ y, ψ y⟫_ℝ = ∫ x, ⟪φ x, F' ψ x⟫_ℝ
  ext : IsLocalizedFunctionTransform F
  ext' : IsLocalizedFunctionTransform F'

I think this is a good modification, and I think it is worth making to have HasVarAdjoint be symmetric.

I think I agree that the fundamental lemma of variational calculus won't work here :(.

Tomas Skrivan (Jun 28 2025 at 07:41):

Joseph Tooby-Smith said:

I think this is a good modification, and I think it is worth making to have HasVarAdjoint be symmetric.

I also think so! It might be worth developing some API for IsLocalizedFunctionTransform as I feel in HasVarAdjoint we prove same things over and over.

I'm expecting having IsLocalizedFunctionTransform.id/comp/fmap/fderiv theorems and most of the rest will follow from it.

Tomas Skrivan (Jun 28 2025 at 07:41):

Like all the IsLocalizedFunctionTransform.neg/add/sub/mul will directly follow from IsLocalizedFunctionTransform.fmap

Joseph Tooby-Smith (Jun 28 2025 at 07:42):

Sounds good to me! Wasn't there a plan to remove L from the definition at one point? I forgot what happened to this.

Tomas Skrivan (Jun 28 2025 at 07:43):

I don't think so. I removed the condition that K ⊆ L as we never used it

Joseph Tooby-Smith (Jun 28 2025 at 07:47):

Ok, I'm probably misremembering then. Also, do you think this sorryful thing would be helpful here PR627? It may be useful to split the PRs up a bit without having to worry about every sorry being filled for now.

Tomas Skrivan (Jun 28 2025 at 07:59):

I don't know what to do, the PR keeps on ballooning :/ It feels unsatisfying as filling every sorry reveals missing API which adds two new sorry :joy: at this rate it won't get finished ...

Tomas Skrivan (Jun 28 2025 at 07:59):

Maybe we can just split it up

Tomas Skrivan (Jun 28 2025 at 08:02):

I won't be able to work on it at all for the next week or so. Feel free to do what you think is the best.

Maybe you can use the new sorry attribute to test it out and see how it works in practice.

Joseph Tooby-Smith (Jun 28 2025 at 08:06):

Yep, I can do this and split the PR up a bit, I think this will be a good test for it.

Ryan MacLean (Jun 28 2025 at 13:18):

Joseph Tooby-Smith said:

Hi Ryan, this isn’t the right place for this conversation. I know my tolerance for this kind of thing may be lower than others’, and my approach might come off more direct (mainly because I've seen where these things can lead), but I’m going to ask that you don’t post on this thread again. I understand you’re well-intentioned, but you’ve already received advice—both from me and others—on your original post. I’d recommend following that.

Hi Joseph,

I'm going to take your advice and stop posting here, but I think you're missing the point. It isn't about if ChatGPT can write good code, it decoded what you were saying and taught it to me. You, Joseph, are not decoding what you're saying and teaching it to me, you're telling me to leave.

You had a problem. I specifically said you don't have to use this code, is this the problem. It logically solved your problem, like a dozen times. It didn't hallucinate anything, it doesn't understand the syntax because it isn't trained on the ruleset.

I understand math and physics pretty damn well. It means you could have answered your own question logically by asking ChatGPT, whether it's mine or someone elses. Rather than doing that, you'd prefer to ask for other peoples opinons here and have a social club.

Enjoy your social club. I'm going to be busy teaching AI syntax. Thank you for your contributions. I'm going to be teaching other people how to use them, since you are less friendly than my AI. That way, people can learn from my AI and not come here and annoy you, maybe they'll be allowed to join your social club.

Incidentally, there's no such thing as "Artificial Intelligence," I just know how to use a computer better than most. You're literally coding it's guts, where do you think it allows for random in there?

Michael Rothgang (Jun 29 2025 at 07:24):

My 2c as an outside reader (mathematician, who knows this math somewhat). I'd say the LLM output you posted, Ryan, is not helpful to this thread. By this, I mean: it might help you as a person understand this thread better. But it does not help to answer the question posed above.

The output expands the step "assume being a formal adjoint is symmetric, how would I state this in Lean". However, that's not the question here: Joseph already gave a statement in Lean. The real question is "why is the formal adjoint, with this definition, symmetric" --- which is not obvious, because of this subtlety. The LLM output does not answer this at all. (One heuristic test why: any reasonable answer would have to deal with the subtlety.)

Michael Rothgang (Jun 29 2025 at 07:26):

Going meta-level, I perceive this thread as a venue for experts to discuss. This isn't meant to be accessible to newcomers. Some things require enough context, and always providing that is impractical. Sorry, that's how any specialised work (be it in the military, industry or academic research) works at some point.

Michael Rothgang (Jun 29 2025 at 07:28):

As a new user, learning more about Lean first and only participating in such threads once you understand better is a better strategy. If I'd onboard a new member, discussing current zulip discussions can be part of that: but the onboarding should usually happen off-thread.

Joseph Tooby-Smith (Jun 29 2025 at 10:28):

Continuing the original thread:

@Tomas Skrivan I think in InnerProductSpace' it may be wise to replace the condition:

∀ x : E, ‖x‖₂ ^ 2 = re (core.inner x x)

with the (stronger) condition that

∀ x : E, ‖x‖₂ = √ (re (core.inner x x))

otherwise we have nothing to say that ‖x‖₂ is non-negative.

Joseph Tooby-Smith (Jun 29 2025 at 10:57):

Actually never mind - I thought I would need this to show that the inner product is a bounded linear map, but I didn't.

Tomas Skrivan (Jun 29 2025 at 13:24):

Oh you are right, the definition is not correct. But the L2 norm function is not really used anywhere. I just thought it would be good to have.

Joseph Tooby-Smith (Jun 29 2025 at 15:44):

Maybe it is best to do this the other way around, define ‖x‖₂ as √re (core.inner x x) given an instance of InnerProductSpace'.

Tomas Skrivan (Jun 29 2025 at 17:50):

Yeah we should do it like that. I'm too stuck thinking computationally, you can implement the norm a bit faster than calling the inner product so having a separate function can be advantageous. For proving things it is completely irrelevant.

Joseph Tooby-Smith (Jun 30 2025 at 10:47):

Just realized that we should also be able to derive Hamilton's equations from what we have done, see e.g. here.

Joseph Tooby-Smith (Jun 30 2025 at 12:05):

Ok, this turned out to be pretty easy to do, needed to add some small things to the api but the proof is pretty straightforward, which I am very pleased about :).

Code

noncomputable def hamiltonEqOp (H : Time → X → X → ℝ) (p : Time → X) (q : Time → X) :
    Time → X × X :=
  fun x => (deriv q x + -gradient (fun x_1 => H x x_1 (q x)) (p x),
    -deriv p x + -gradient (fun x_1 => H x (p x) x_1) (q x))

theorem hamiltons_equations
    (H : Time → X → X → ℝ) (pq : Time → X × X) (hp : ContDiff ℝ ∞ pq)
     (hL : ContDiff ℝ ∞ ↿H) :
    (δ (pq':= pq), ∫ t, ⟪(pq' t).1, deriv (Prod.snd ∘ pq') t⟫_ℝ - H t (pq' t).1 (pq' t).2) =
    fun t => hamiltonEqOp H (fun t => (pq t).1) (fun t => (pq t).2) t:= by
  apply HasVarGradientAt.varGradient
  apply HasVarGradientAt.intro _
  · apply HasVarAdjDerivAt.add
    · let i := fun (t : ℝ) (x : X × X) =>  ⟪x.1, x.2⟫_ℝ
      apply HasVarAdjDerivAt.comp
        (F := fun (φ : ℝ → X × X) t => i t (φ t))
        (G := fun (φ : ℝ → X × X) t => ((φ t).1, deriv (Prod.snd ∘ φ) t))
      · apply HasVarAdjDerivAt.fmap
        · fun_prop
        · fun_prop
        intro t x
        apply DifferentiableAt.hasAdjFDerivAt
        fun_prop
      · apply HasVarAdjDerivAt.prod
        · apply HasVarAdjDerivAt.fst
          apply HasVarAdjDerivAt.id
          fun_prop
        · apply HasVarAdjDerivAt.deriv (F := fun (φ : ℝ → X × X) t => (φ t).2)
          apply HasVarAdjDerivAt.fmap
          · fun_prop
          · fun_prop
          intro t x
          apply DifferentiableAt.hasAdjFDerivAt
          fun_prop
    · apply HasVarAdjDerivAt.neg
      let H' := fun t => ↿(H t)
      change HasVarAdjDerivAt (fun φ x => H' x (φ x)) _ pq
      apply HasVarAdjDerivAt.fmap
      · fun_prop
      · fun_prop
      intro x u
      apply DifferentiableAt.hasAdjFDerivAt
      apply Differentiable.differentiableAt
      apply ContDiff.differentiable
      fun_prop
      simp
  · simp
    funext x
    rw [adjFDeriv_uncurry]
    simp
    swap
    · apply ContDiff.differentiable
      fun_prop
      simp
    rw [adjFDeriv_inner]
    simp
    conv_rhs =>
      enter [2, 1, 1, 1, x]
      rw [adjFDeriv_inner]
      simp
    rw [← gradient_eq_adjFDeriv, ← gradient_eq_adjFDeriv]
    rfl
    · apply ContDiff.differentiable
      fun_prop
      simp
    · apply ContDiff.differentiable
      fun_prop
      simp

Tomas Skrivan (Jun 30 2025 at 17:56):

Very nice! The proof indeed looks quite simple and automatable in the future.

Tomas Skrivan (Jun 30 2025 at 17:58):

Next logical step would be to connect Lagrangian and Hamiltonian mechanics but that will require developing Legamdre transform. Maybe a good opportunity to finally learn it properly.

Dominic Steinitz (Jun 30 2025 at 18:20):

I just wanted to say symplectic manifolds but that is probably years away.

Joseph Tooby-Smith (Jun 30 2025 at 18:24):

I was looking into this connection earlier, the best notes I found on it were these.

I think a lot depends on how much API there is for convex functions.

Michael Rothgang (Jun 30 2025 at 22:02):

Dominic Steinitz said:

I just wanted to say symplectic manifolds but that is probably years away.

I don't think it's years away. Differential forms are close, differential manifolds are there already (and we're gaining Riemannian metrics right this month). Having all the theory you'd expect will be a while, but I'm a bit more optimistic :-)

Joseph Tooby-Smith (Jul 01 2025 at 05:40):

I think the simplest definition of the hamiltonian from the Lagrangian is the following:

noncomputable def hamiltonian (L : Time → X → X → ℝ)
    (h : ∀ t q, Function.Bijective (gradient (L t q ·))) :
    Time → X → X → ℝ := fun t p q =>
  let v := Classical.choose (Function.bijective_iff_has_inverse.mp (h t q)) p
  ⟪p, v⟫_ℝ - L t q v

It seems to me that the whole story about convex functions etc, is just to ultimately prove that (h : ∀ t q, Function.Bijective (gradient (L t q ·))) is true. And in general, I think it might be easier to prove that h is true directly.

(naturally this only works for certain X)

Cleon Teunissen (Jul 12 2025 at 08:43):

There is something that I would like to verify, and whether it falls within the scope of the PhysLean objectives.

In the message of June 16, 2:33 PM Tomas Skrivan states the action functional, and then remarks: "[...] instead of working with $S$ I will be working with $S'$ "

I surmise that Tomas Skrivan is omitting the integration. I'm aware that the integration can be omitted; the integration is superfluous. However, I would have expected that that omission would be motivated explicitly. In physics the expectation is that the integration is the absolute core of the action concept; the physicist's expectation will be that omitting the integration renders the derivation invalid.

An example of that point of view: the book by the physicist Douglas Cline, who has made the book also available on the libretexts website. In the Prologue Douglas Cline is committed to believing that differential approach and variational approach are fundamentally distinct.

A number of years ago, looking for information about the Lagrange equation, I came across the following article: 'Deriving Lagrange’s equations using elementary calculus', published in 2003. Authors: Jozef Hanc, Edwin F. Taylor, Slavomir Tuleja.

The authors show that the Euler-Lagrange equation can be derived using differential reasoning only.

I used that discussion as the basis of a derivation of the Euler-Lagrange equation that is available on my own website: Calculus of Variations, as applied in physics

I'm aware of a couple of other sources with derivation of the Euler-Lagrange equation using differential reasoning only.

The researcher Preetum Nakkiran phrases the logic as follows: 'This result is often proven using integration by parts – but the equation expresses a local condition, and should be derivable using local reasoning.'

What is your opinion on the following:
To assess in what way the integration is a factor, does that fall within the scope of the PhysLean objectives?

Tomas Skrivan (Jul 12 2025 at 09:35):

Cleon Teunissen said:

To assess in what way the integration is a factor, does that fall within the scope of the PhysLean objectives?

I would say: variational gradient is defined with HasVarGradientAt which is defined using HasAdjVarDerivAt which is using HasVarAdjoint which has integrals in it. So yes, integration is still completely integral part of the computation. However, the formalization is done to hide it all away and end up with calculus that is very similar to normal differentiable calculus. All the applications of per partes are tucked away in elementary rules.

Tomas Skrivan (Jul 12 2025 at 09:36):

My aim was to have formalization of variational calculus for which it is easy to automate computations

Tomas Skrivan (Jul 12 2025 at 09:39):

One thing my approach does not address at all is the question of boundary conditions. The formalization is done such that we can ignore that question. We obtain the right equation 'in the bulk'(continuum mechanics slang for 'away from boundary')

Kenny Lau (Jul 12 2025 at 09:45):

what’s the current definition of var cal?

Tomas Skrivan (Jul 12 2025 at 09:47):

To find out, I would start at this file https://github.com/HEPLean/PhysLean/blob/master/PhysLean/Mathematics/VariationalCalculus/HasVarGradient.lean

Kenny Lau (Jul 12 2025 at 09:50):

interesting

Kenny Lau (Jul 12 2025 at 09:50):

it’s an inductive type

ZhiKai Pong (Jul 12 2025 at 09:52):

(is this docstring for https://github.com/HEPLean/PhysLean/blob/master/PhysLean/Mathematics/VariationalCalculus/Basic.lean incomplete?)

Tomas Skrivan (Jul 12 2025 at 10:01):

Would you like a doc that gives an overview of the whole directory there?

Joseph Tooby-Smith (Jul 12 2025 at 10:07):

To assess in what way the integration is a factor, does that fall within the scope of the PhysLean objectives?

Let me just address the part of this question about whether this is "within the scope of the PhysLean". IMO it absolutely is :). In addition to the conversation happening here, there is a lot to be done around the action and integration. E.g., when do two Lagrangians correspond to the same action, when do two actions or Lagrangians have the same or equivalent equations of motion etc.

ZhiKai Pong (Jul 12 2025 at 10:12):

ZhiKai Pong said:

(is this docstring for https://github.com/HEPLean/PhysLean/blob/master/PhysLean/Mathematics/VariationalCalculus/Basic.lean incomplete?)

what I mean is just the current doc string has an incomplete sentence at the end

The key took in variational calculus is:
``
∀ h, ∫ x, f x * h x = 0 → f = 0
``
which allows use to go from reasoning about integrals to reasoning about functions. There are

Joseph Tooby-Smith (Jul 12 2025 at 10:15):

Tomas Skrivan said:

Would you like a doc that gives an overview of the whole directory there?

I think generally including something like this in the Basic modules isn't a bad idea (especially if people get in the habit of always checking the Basic module first). Except the obvious time, effort, and maintenance burden that it adds.

Kenny Lau (Jul 12 2025 at 10:35):

sorry if this is a basic question, is var cal about taking derivatives of functions (R^n -> C) -> C?

Kenny Lau (Jul 12 2025 at 10:35):

i just want to make sure im getting the type right

Tomas Skrivan (Jul 12 2025 at 10:37):

Kenny Lau said:

sorry if this is a basic question, is var cal about taking derivatives of functions (R^n -> C) -> C?

Yes it is, but we requite that such functions are in the form of integral. Therefore, HasVarGradient accepts the function to be intehrated

Tomas Skrivan (Jul 12 2025 at 10:38):

I couldn't figure out a formal framework that would work with any function of that type.

Kenny Lau (Jul 12 2025 at 10:40):

ok, my next question is then, wouldnt this mean we’re taking derivatives inside S’, so that this would land us in exactly the same situation as the other channel about Fourier transform

Kenny Lau (Jul 12 2025 at 10:40):

(deriv: S’ -> S’)

Kenny Lau (Jul 12 2025 at 10:44):

i want to make S’ a reality lol

Tomas Skrivan (Jul 12 2025 at 10:49):

In finite dimension we have

S[u] = <S'[u], 1>

grad S u = grad S' u 1

In an infinite dimension, you don't have inner product space unless you restrict yourself to a very specific space but that would stiffle usability and generality

Tomas Skrivan (Jul 12 2025 at 10:49):

Kenny Lau said:

(deriv: S’ -> S’)

I don't understand this S' is a function, not a type

Joseph Tooby-Smith (Jul 12 2025 at 10:49):

(Just to prevent confusion, I note there are two different things been called $S'$ )

Tomas Skrivan (Jul 12 2025 at 10:50):

Kenny Lau said:

i want to make S’ a reality lol

I don't understand

Joseph Tooby-Smith (Jul 12 2025 at 10:51):

One $S'$ is Tomas's definition from an earlier message (#PhysLean > Variational Calculus @ 💬 ) and the other (from the Fourier transform thread) is the dual of the Schwartz maps.

Tomas Skrivan (Jul 12 2025 at 10:51):

Joseph Tooby-Smith said:

(Just to prevent confusion, I note there are two different things been called $S'$ )

Duh, really? Where, what?

Cleon Teunissen (Jul 12 2025 at 11:36):

There is knowledge about Calculus of Variations that I think is of interest to the PhysLean effort.

Disclaimer: I'm not a programmer. I'm sorry to say that I have no knowledge of the Lean programming language. (The extent of my programming: I have used the JSXGraph library to create interactive diagrams)

About the Euler-Lagrange equation:
We have: to derive the EL-eq. using integration is overkill. Evidence for that: it is also possible to derive the Euler-Lagrange equation using differential reasoning only; no involvement of integration.

Another angle is:
When deriving the Euler-Lagrange equation using the integration-by-parts approach: towards the end of the derivation the integration is dismissed. That implies that the integration is non-essential. If it would be essential then that dismissal would not be possible. The factor that is necessary-and-sufficient is the differential relation.

The reason that integration is superfluous is, in my opinion, very interesting. There is a striking parallel between that reason and the fundamental theorem of Calculus. (Conjecture: it is in fact an instance of the fundamental theorem of Calculus)

Tomas Skrivan (Jul 12 2025 at 11:52):

Then show don't tell :) could you give a short summary of the main idea here? I don't want to hunt it down myself in all those links you provided

Kenny Lau (Jul 12 2025 at 12:16):

so do you think it would work if we look at S’ which is the normed dual of the schwarz space?

Kenny Lau (Jul 12 2025 at 12:17):

@Joseph Tooby-Smith I’d be interested to make S’ a reality if you think this is a good idea

Joseph Tooby-Smith (Jul 12 2025 at 12:34):

@Kenny Lau I do think this is a good idea! I think this is related to my 3rd thing here #PhysLean > Space, Time and SpaceTime API @ 💬 , and would be very happy if you wanted to work on this! Maybe we could use that thread to discuss?

so do you think it would work if we look at S’ which is the normed dual of the schwarz space?

Are you asking the following here?: Do you think that variational calculus procedure we have would work if the function S' : (ℝ → ℝ) → (ℝ → ℝ) was replaced with an term of type 'normed dual of the Schwarz space' (let me call this space $\mathcal{S}^\dagger$ here to avoid confusion)? I think this could be made to work and it is very close to what Tomas did, except the test functions used for the calculus of variations here are different from the Schwartz functions.

Kenny Lau (Jul 12 2025 at 12:46):

yeah it’s essentially working out the correct space of “test functions”

Kenny Lau (Jul 12 2025 at 12:46):

what are the other choices?

Cleon Teunissen (Jul 12 2025 at 15:04):

Tomas Skrivan said:

[...] the main idea here?

Screenshot_Calculus_of_Variations__unit_of_operation.png

The diagram in the attached screenshot will be used throughout the following discussion.
As motivating example for derivation of the Euler-Lagrange equation I use the Catenary problem; what is the shape of a hanging chain?

(Incidentally, decades before the introduction of Calculus of Variations by Euler: Leibniz had solved the catenary problem using differential reasoning.)

I introduce what I will refer to as 'unit of operation'.

In the case of differentiation the unit of operation is a pair of points; in the diagram the unit of operation is a triplet of points. Connecting the point of the triplet there are two line sections: $A$ and $B$ . Variation of the unit of operation consists of applying variation to the y-coordinate of the point ( $x_2$ , $y_2$ ).

Over the entire curve the unit of operation is concatenated such that each point (except the boundary points) participates in the variation.
Example:
$x_0$ , $x_1$ , $x_2$
$x_1$ , $x_2$ , $x_3$ ,
$x_2$ , $x_3$ , $x_4$
And so on.

The intervals along the horizontal axis, the X-axis, are set to have the same length, so the $x$ -interval can be generically referred to as $\Delta x$ . To obtain the differential relation: $\displaystyle \lim_{\Delta x \to 0}$

$y_A$ is the y-coordinate of the midpoint of the line section $A$
$y_B$ is the y-coordinate of the midpoint of the line section $B$

We have the following:
Each subsection of the catenary is in and of itself an instance of the catenary problem, only with different boundary condition, depending on where along the catenary the subsection is taken.

There is no lower limit to how small the subsections can be taken.

The idea is to evaluate the unit of operation in such a way that the reasoning is concurrently valid everywhere along the curve.

Each section, section $A$ and section $B$ , contributes to the overall potential energy of the catenary. Per unit of length along the horizontal axis the potential energy of the corresponding section of catenary is a function of the height, and of the slope (the slope gives the curve length per unit along the horizontal axis.

$C_A$ the potential energy of section $A$
$C_B$ the potential energy of section $B$

At the equilibrium point the two contributions are counterchanging. The total potential energy is stationary when the two rates of change are matching.

$\frac{d(C_A)}{dy} = -\frac{d(C_B)}{dy} \tag{1}$

A set of four relations.
I'm using Lagrange notation: $y' \Leftrightarrow \tfrac{dy}{dx}$

$y_A = \frac{y_1 + y_2}{2}$ , $\quad y'_A = \frac{y_2 - y_1}{\Delta x}$ , $\quad y_B = \frac{y_2 + y_3}{2}$ , $\quad y'_B = \frac{y_3 - y_2}{\Delta x}$ (2)

To set up for later use: each of the above is differentiated with respect to $y$ .
$\frac{dy_A}{dy} = \frac{1}{2}$ , $\quad \frac{dy'_A}{dy} = \frac{1}{\Delta x}$ , $\quad \frac{dy_B}{dy} = \frac{1}{2}$ , $\quad \frac{dy'_B}{dy} = - \frac{1}{\Delta x}$ (3)

Each term has two contributions, one for the height and the other for the slope. For each contribution: partial differentiation:

$\frac{dy_A}{dy}\frac{\partial C_A}{y_A} + \frac{dy'_A}{dy}\frac{\partial C_A}{y'_A} = -\left( \frac{dy_B}{dy}\frac{\partial C_B}{y_B} + \frac{dy'_B}{dy}\frac{\partial C_B}{y'_B} \right) \tag{4}$

Using the relations stated in (3), substitute:

$\frac{1}{2}\frac{\partial C_A}{\partial y_A} + \frac{1}{\Delta x}\frac{\partial C_A}{\partial y'_A} = - \left( \frac{1}{2}\frac{\partial C_B}{\partial y_B} - \frac{1}{\Delta x}\frac{\partial C_B}{\partial y'_B} \right) \tag{5}$

Distribute the minus sign:

$\frac{1}{2}\frac{\partial C_A}{\partial y_A} + \frac{1}{\Delta x}\frac{\partial C_A}{\partial y'_A} = - \frac{1}{2}\frac{\partial C_B}{\partial y_B} + \frac{1}{\Delta x}\frac{\partial C_B}{\partial y'_B} \tag{6}$

Move everything to the right hand side:

$\frac{1}{2}\frac{\partial C_A}{\partial y_A} + \frac{1}{2}\frac{\partial C_B}{\partial y_B} - \frac{1}{\Delta x}\frac{\partial C_B}{\partial y'_B} + \frac{1}{\Delta x}\frac{\partial C_A}{\partial y'_A} = 0 \tag{7}$

Regroup the terms: in (7) the terms with $C_A$ and $C_B$ are side-by-side, in (8) the terms with $y$ and $y'$ are side-by-side.

$\frac{1}{2} \left( \frac{\partial C_A}{\partial y_A} + \frac{\partial C_B}{\partial y_B} \right) - \frac{1}{\Delta x} \left( \frac{\partial C_B}{\partial y'_B} -\frac{\partial C_A}{\partial y'_A} \right) = 0 \tag{8}$

The expression is now ready to take the limit of $\Delta x \to 0$

$\frac{\partial C}{\partial y} - \frac{d}{dx}\left(\frac{\partial C}{\partial y'}\right) = 0 \tag{9}$

The core ideas:

The variational problem can be subdivided in subsections. Each subsection is in and of itself an instance of the problem you want to solve.

As we know: a differential equation is a higher order equation in the following sense: whereas a root finding equation as a value as its solution, the solution to a differential equation is a function.
For a differential equation: the space of solutions to the differential equation is a space of functions.

A differential equation is a global equation in the following sense: the demand is that the differential equation is to be satisfied concurrently over the whole domain.

In the derivation: the step that I find the most interesting is the one from (7) to (8). That regrouping allows the equation to be collapsed from four terms to two terms, so that you can solve the problem.

This derivation of the Euler-Lagrange equation is modeled after the derivation in the following article: Deriving Lagrange’s equations using elementary calculus

The discussion in this message is an excerpt from the discusion on my own website: Calculus of Variations, as applied in physics
In that treatment the catenary problem is solved twice; first with differential calculus, then with calculus of variations.
Not only do the two approaches arrive at the same solution, the two approaches are convergent: they give the same differential equation.

The reason for the convergence isn't specific to the catenary problem; it generalizes.
Conjecture:
The set of all variational equations is a subset of the set of all differential equations.

Cleon Teunissen (Jul 13 2025 at 09:55):

Joseph Tooby-Smith said:

[...] there is a lot to be done around the action and integration. E.g., when do two Lagrangians correspond to the same action, when do two actions or Lagrangians have the same or equivalent equations of motion etc.

In my previous message I used the catenary problem as motivating example. The reason that the catenary problem is a challenging problem: stated as a variational problem the Lagrangian of the catenary problem has the following form:

$y \ \sqrt{1 + {y'}^2}$

That Lagrangian requires partial differentiation.

As to the application of Calculus of Variations in classical mechanics: the thing is: compared to the catenary problem classical mechanics is a problem of a lower level.

I mean lower level in the following sense: in classical mechanics the potential energy is a function of position coordinate only, and the kinetic energy is a function of the velocity coordinate only, so there is no need for partial differention. It's a much, much simpler case.

Some historical information that is helpful for context:
Lagrange developed his mechanics in the years leading up to the publication of Mécanique analytique in 1788, and Hamilton's stationary action was introduced in 1834. That is: Lagrange developing his mechanics was half a century before the introduction of Hamilton's stationary action.

Stretching from the time of Huygens up to and including the time of Hamilton the concept of vis viva (the 'living force'), was used - vis viva being defined as $mv^2$ . Around the 1850's the physics community shifted to the notion of kinetic energy, defined as $\tfrac{1}{2}mv^2$ . I will return to this further on in this message.

Statement:
If $F=ma$ is granted as law of motion then the work-energy theorem follows as theorem.

In preparation for derivation of the work-energy theorem:
With:
$s$ position coordinate
$v$ velocity
$a$ acceleration
$t$ time

A relation will be derived, following from these definitions:

$v = \frac{ds}{dt} \quad , \quad a = \frac{dv}{dt} \tag{1}$

The definitions will be used in this form:

$ds = v \ dt \tag{2}$

$a \ dt = v \tag{3}$

Let $a$ represent an arbitrary acceleration profile. Set up integration of this acceleration profile from starting position coordinate $s_0$ to final position coordinate $s$

$\int_{s_0}^s a \ ds \tag{4}$

$\int_{t_0}^t a \ v \ dt \tag{5}$

$\int_{t_0}^t v \ a \ dt \tag{6}$

$\int_{v_0}^v v \ dv \tag{7}$

The result:

$\int_{s_0}^s a \ ds = \tfrac{1}{2}v^2 - \tfrac{1}{2}v_0^2 \tag{8}$

(8) generalizes to other contexts. Here it is stated in terms of position, velocity and acceleration; it generalizes to any context where a coordinate, its first time derivative, and its second time derivative are concurrently used.

The work-energy theorem

Starting point:

$F = ma \tag{9}$

Set up integration with respect to the position coordinate:

$\int_{s_0}^s F \ ds = \int_{s_0}^s ma \ ds \tag{10}$

Use (8) to process the right hand side:

$\int_{s_0}^s F \ ds = \tfrac{1}{2}mv^2 - \tfrac{1}{2}mv_0^2 \tag{11}$

(11) is the work-energy theorem.
The work-energy theorem motivates the concepts of potential energy and kinetic energy, and provides the definitions for them. Specifically: why does the definition of kinetic energy have that factor $\tfrac{1}{2}$ ? That's because the value of kinetic energy is obtained by way of integration - in this case integration with respect to position coordinate.

An integral expresses a difference:

$\Delta E_k = - \Delta E_p \tag{12}$

For (12) we have: the validity extends to infinitesimally small change of energy.

Thus the work-energy theorem implies: as an object is being accelerated by a force there is a conserved quantity: the rate of change of kinetic energy matches the rate of change of potential energy, hence the sum of the two is a conserved quantity.

At this point we have the option of capitalizing on the fundamental theorem of Calculus.

We take the work-energy theorem, and we set up the inverse of the integration that obtained the work-energy theorem; we set up differentiation with respect to position coordinate

$\frac{d \left( \int_{s_0}^s F \ ds \right)} {ds} = \frac{d \left( \tfrac{1}{2}mv^2 \right)}{ds} \tag{13}$

At this point I want to compare two differentiation operations:

$\frac{d(\tfrac{1}{2}mv^2)}{ds} \tag{14}$

and

$\frac{d}{dt} \frac{d(\tfrac{1}{2}mv^2)}{dv} \tag{15}$

Processing both:

$\frac{d(\tfrac{1}{2}mv^2)}{ds} = \tfrac{1}{2}m\left( 2v\frac{dv}{ds} \right) = m\frac{ds}{dt}\frac{dv}{ds} = m\frac{dv}{dt} = ma \tag{16}$

$\frac{d}{dt} \frac{d(\tfrac{1}{2}mv^2)}{dv} = \frac{d}{dt}mv = ma \tag{17}$

The two differentiation operations arrive at the same result: $ma$ .
I accept (16)-(17) as corroboration that (14) and (15) are the same differentiation operation, but of course (16)-(17) is way short of a proof.

With the differentiation operation stated in the form of (15):

$\frac{d(-E_p)}{ds} = \frac{d}{dt}\frac{d (E_k)}{d v} \tag{18}$

With the benefit of hindsight: (18), and how 18 is derived, is the basis of the mechanics that was developed by Joseph Louis Lagrange. (The way Lagrange motivated the mechanics was comparitively circuitous; Lagrange wasn't using the work-energy theorem. The way Lagrange motivated the mechanics did not involve calculus of variations.)

I will refer to (18) as 'the Lagrange equation'.

The following notation is pretending that partial differentiation is being used:

$\frac{\partial(E_k - E_p)}{\partial s} - \frac{d}{dt}\frac{\partial (E_k - E_p)}{\partial v} = 0 \tag{19}$

My objection to (19) is:
It is smoke and mirrors. (19) is still the same equation as (18).

Discussion:

The power of the formulation of mechanics developed by Joseph Louis Lagrange is that the mechanics taking place is expressed in terms of potential energy and kinetic energy.

In terms of the newtonian formulation of mechanics: when there are multiple degrees of freedom vectorial representation is used, and with $\vec{F}=m\vec{a}$ the vectorial information is present on both sides of the equation. In the case of coordinate representation the vectors must be decomposed in vector components.

On the other hand: with kinetic energy and $x, y, z$ degrees of freedom:

$\tfrac{1}{2}mv_x^2 + \tfrac{1}{2}mv_y^2 + \tfrac{1}{2}mv_z^2 = \tfrac{1}{2}mv^2$

That is: kinetic energy, being a quadratic expression, comes pre-pythagorized.

In terms of the mechanics of Joseph Louis Lagrange the directional information of the velocity vector is discarded; the expression for the kinetic energy doesn't carry directional information.

The required directional information is still available: take the potential energy, and evaluate the gradient of the potential energy. By differentiating the potential energy (decomposing in components) the directional information is made explicit again.

Given an equation of motion: how to construct a Lagrangian?

The Euler-Lagrange operator executes differentiation with respect to position coordinate.
Simplest example: the catenary problem:
The catenary shape is an instance of force equilibrium.
In order to apply calculus of variations: obtain the integral-with-respect-position of the force. Then the Euler-Lagrange operator converts that potential energy back to an expression in terms of force.

The ways a Lagrangian can be non-unique correlate with the fact that differentiation is a lossy operation. Presumably: the fact that the Euler-Lagrange operator specifies partial differentiation creates significant leeway.

The key points:

The lagrange equation:

$\frac{d(-E_p)}{ds} = \frac{d}{dt}\frac{d (E_k)}{d v} \tag{18}$

The Lagrange equation consists of three elements:

$F=ma$
$\int_{s_0}^s a \ ds = \tfrac{1}{2}v^2 - \tfrac{1}{2}v_0^2$
The fundamental theorem of calculus: integration and differentiation are inverse operations.

Joseph Tooby-Smith (Jul 13 2025 at 10:24):

Hey @Cleon Teunissen, it is going to take me a bit of time to read and try and understand what you've written here, and I'll have more time tomorrow to do this - so if no one else gets back to you before then - I will respond tomorrow.

Cleon Teunissen (Jul 13 2025 at 10:35):

Joseph Tooby-Smith said:

[...] I'll have more time tomorrow [...]

I hope you will allocate the time to consider the information thoroughly. If it takes longer, please take longer.

This information is towards the goal of the Lean project to assist in clarifying derivations.

In physics, as in any science, it is considered worthwhile to try and narrow down to the minimal set of assumptions that is sufficient. This information goes towards that.

Joseph Tooby-Smith (Jul 14 2025 at 10:05):

Ok, I've read and somewhat understood what you have written. I have two questions:

I'm slightly confused on where you stand - do you think integration is important or not in terms of this?
In your second message, your starting assumptions is newtonian mechanics, right?

Cleon Teunissen (Jul 15 2025 at 00:02):

Joseph Tooby-Smith said:

do you think integration is important or not in terms of this?

First off: to exclude possibility of babylonian confusion about which integration is intended: in the case of Hamilton's stationary action the integration I'm referring to is the integration of the Lagrangian with respect to time:

$S = \int_{t_1}^{t_2} \ L(q, \dot{q}) \ dt$

To derive the Euler-Lagrange equation it is not necessary to involve that integration; the Euler-Lagrange equation can be derived with purely differential reasoning.

To explain how that is possible: a particular case in the history of Calculus of Variations: the Brachistochrone problem.

The Bernoulli who issued the challenge was Johann Bernoulli. In the end about four or five mathematicians had solved the problem, among them was Johann's older brother Jacob Bernoulli.

Jacob Bernoulli opened his discussion of the Brachistochrone problem (Acta Eruditorum, May 1697, pp. 211-217) with a lemma.

jacob_lemma_acta_512x512.png

Lemma. Let ACEDB be the desired curve along which a heavy point falls from A to B in the shortest time, and let C and D be two points on it as close together as we like. Then the segment of arc CED is among all segments of arc with C and D as end points the segment that a heavy point falling from A traverses in the shortest time. Indeed, if another segment of arc CFD were traversed in a shorter time, then the point would move along ACFDB in a shorter time than along ACEDB, which is contrary to our supposition.

Key to the lemma is the "let C and D be two points on it as close together as we like."
If the curve as a whole is a path of fastest descent then each subsection is itself a curve of fastest descent, and that is valid down to infinitesimally short subsections.

That is:
The property of being a curve of fastest descent obtains at the infinitesimal level, and from there it propagates out to the curve as a whole.

When deriving the the Euler-Lagrange equation you can go either way:

without involvement of the integration
with involvement of the integration

In both cases the result is the same: the Euler-Lagrange equation.

That is to say: having the integration involved in deriving the Euler-Lagrange equation is not contributing anything; the differential relation carries the information.

It all falls into place:
As I mentioned in an earlier message: in the case of the derivation of the EL-eq. using integration-by-parts: towards the end the integration is dismissed, while retaining the differential relation. The integration can be dismissed because it is the differential relation that expresses the information that allows you to solve the problem.

Cleon Teunissen (Jul 15 2025 at 00:07):

Joseph Tooby-Smith said:

[...] your starting assumptions is newtonian mechanics, right?

Well, I'm not committed to a particular choice of starting point.

That is why I wrote: "If F=ma is granted as law of motion."

As the physics historian John Norton has pointed out:
In any logical system there is great freedom to interchange theorem and axiom without changing the contents of the system.

When taking the work-energy relation as axiom: F=ma follows by way of differentiation.

My primary concern is the content, and the relations.
Example: we have that the Lagrangian formulation of mechanics and the Hamiltonian formulation of mechanics are related by Legrendre transform. Legendre transform is it's own inverse; applying Legendre transform twice recovers the original function. Lagrangian mechanics and Hamiltonian mechanics, while formulated differently, are expressing the same content.

In the case of the newtonian formulation and the formulation of mechanics by Joseph Louis Lagrange: the derivation of the work-energy relation provides the transformation between the two. I think of the two formulations as expresssing the same content.

About the newtonian formulation

The following may be worth mentioning explicitly:

The newtonian formulation of mechanics generalizes to using generalized coordinates.
For further reading on that: the discussion by Richard Fitzpatrick: Generalized Forces

(It may be that some people are under the impression that using generalized coordinates is exclusive to the formulation of mechanics by Joseph Louis Lagrange. That isn't the case.)

Discussion:

The type of relation described by $a=\frac{F}{m}$ is a recurring theme.

In the abstract:
There is a state, for example a position coordinate, its first time derivative, and its second time derivative. A causal agent can generically be referred to as a 'force'. When there is opposition to change of state (an inertia) then a relation of the type $a=\frac{F}{m}$ is seen.

Example:
The electromechanics of electric oscillation in an LC-circuit.
With an LC-circuit:
The electromagnetic effects of the inductor give rise to an opposition to change of current strength. With an inductor in the circuit: an electromotive force is required to get a current going. Once a current is going the inductor acts in opposition to decrease in current strength. That is: inductance is quite analogous to inertia.

Change of current strength is a second time derivative.

Let $\frac{dC}{dt}$ represent rate of flow of charge through a circuit.
With:
$F_e$ electromotive force
$L$ inductance

Then:
$\frac{d^2C}{dt^2} =\frac{F_e}{L}$

This type of relation - response to causal agent proportional to second time derivative - is a recurring theme.

When a relation of the type $\frac{d^2C}{dt^2} =\frac{F_e}{L}$ is applicable then a counterpart of the work-energy relation is applicable.

Cleon Teunissen (Jul 15 2025 at 20:47):

@Joseph Tooby-Smith

I'm aware that my outpouring of messages can lead to fatigue and diminishing returns, but there is another angle that may well go a long way in clarification.

Back in 2003, in the collaboration of Edwin F. Taylor, Jozef Hanc, and Slavomir Tuleja, they also featured interactive diagrams, implemented in Java. They referred to the process of converging on the sought after curve as 'hunting'.

My version of that is an interactive diagram for finding the shape of a soap film stretching between two coaxial rings.

soapfilm_catenoid.jpg

Image credit: Susan Schwartzenberg - Exploratorium

The screenshot below shows the shape-finding diagram in the state where it has converged onto the solution. The diagram shows the cross section. The polygon is already a remarkably good approximation.

Screenshot_catenoid.png

The screenshot below shows a state with the sliders scrambled.

Screenshot_catenoid2.png

There are 7 nodes; 3 of the nodes are mirrored.

Direct link to the diagram on my website:
Soap film stretching between coaxial rings

Each of the nodes is an instance of the type of triplet displayed in the unit-of-operation diagram:

Screenshot_Calculus_of_Variations__unit_of_operation.png

To converge onto the curve: iterate over the sliders. Each adjustment is by itself a local adjustment. Over multiple iterations the polygon converges.
The idea, of course, is that an automated version of that diagram is a numerical analysis implementation of variational approach.

The important property is: while iterating: at each visit to a node, applying variation to the node, only the surface area of the triplet unit-of-operation of that node is affected by that variation. So at each visit to a node it is sufficient to compute the change of area for that triplet unit-of-operation only. Over multiple iterations all of the adjustments propagate out to the polygon as a whole; the end result is convergence of the polygon as a whole.

The principle:
If all the unit-of-operation triplets are concurrently locally minimized then the curve as a whole is minimized.

In another article on my website I implemented three instances of a diagram with more sliders, 10 sliders, moving 19 nodes (9 nodes mirrored), and additional sliders for granular control. The following three cases are implemented:

potential increases linear with heigth
potential increases quadratic with displacement (Hooke's law)
potential increases in proportion to the cube of the displacement

The interactive diagrams available on my website are created with JSXGraph, a javascript library. The JSXGraph developers are employed at the University of Bayreuth, Germany.

Joseph Tooby-Smith (Jul 16 2025 at 10:02):

I think I understand your messages, and what you are saying. I think what would be good for you to think about is the actually mathematical definitions and theorems you want would want to formalize, and how they relate to the formalization already present of variational calculus. If you can state these in Lean - great, but if not I think stating them as mathematically concrete as possible would be helpful (ignoring the underlying physics for now).

Lean forces one to think in terms of precise mathematical definitions and theorems, and as such many people here think in that way, so doing so may also help others here understand better what you are trying to say.

Last updated: Dec 20 2025 at 21:32 UTC

Stream: PhysLean

Topic: Variational Calculus

Tomas Skrivan (Jun 13 2025 at 10:57):

Tomas Skrivan (Jun 13 2025 at 11:01):

Tomas Skrivan (Jun 13 2025 at 11:53):

Joseph Tooby-Smith (Jun 14 2025 at 06:00):

Tomas Skrivan (Jun 16 2025 at 12:33):

Joseph Tooby-Smith (Jun 16 2025 at 12:46):

Joseph Tooby-Smith (Jun 16 2025 at 13:16):

Tomas Skrivan (Jun 16 2025 at 13:27):

Joseph Tooby-Smith (Jun 16 2025 at 13:34):

Tomas Skrivan (Jun 16 2025 at 13:35):

Tomas Skrivan (Jun 16 2025 at 13:35):

Tomas Skrivan (Jun 16 2025 at 14:19):

Tomas Skrivan (Jun 16 2025 at 14:24):

Tomas Skrivan (Jun 16 2025 at 14:40):

Joseph Tooby-Smith (Jun 16 2025 at 15:03):

Tomas Skrivan (Jun 16 2025 at 15:07):

Tomas Skrivan (Jun 16 2025 at 15:10):

Joseph Tooby-Smith (Jun 16 2025 at 15:11):

Tomas Skrivan (Jun 16 2025 at 15:12):

Tomas Skrivan (Jun 16 2025 at 17:27):

Tomas Skrivan (Jun 16 2025 at 17:29):

Tomas Skrivan (Jun 16 2025 at 17:41):

Joseph Tooby-Smith (Jun 17 2025 at 06:25):

Tomas Skrivan (Jun 17 2025 at 07:10):

Joseph Tooby-Smith (Jun 17 2025 at 08:32):

Tomas Skrivan (Jun 18 2025 at 11:15):

Joseph Tooby-Smith (Jun 18 2025 at 11:48):

Tomas Skrivan (Jun 18 2025 at 13:17):

Joseph Tooby-Smith (Jun 19 2025 at 05:23):

Lode Vermeulen (Jun 19 2025 at 06:17):

Joseph Tooby-Smith (Jun 19 2025 at 06:47):

Tomas Skrivan (Jun 19 2025 at 06:50):

Tomas Skrivan (Jun 19 2025 at 07:11):

Tomas Skrivan (Jun 19 2025 at 07:14):

Tomas Skrivan (Jun 19 2025 at 07:15):

Tomas Skrivan (Jun 19 2025 at 07:18):

Joseph Tooby-Smith (Jun 19 2025 at 07:18):

Tomas Skrivan (Jun 19 2025 at 07:19):

Joseph Tooby-Smith (Jun 19 2025 at 07:21):

Tomas Skrivan (Jun 19 2025 at 07:22):

Tomas Skrivan (Jun 19 2025 at 07:24):

Tomas Skrivan (Jun 19 2025 at 07:26):

Joseph Tooby-Smith (Jun 19 2025 at 07:26):

Tomas Skrivan (Jun 19 2025 at 07:28):

Joseph Tooby-Smith (Jun 19 2025 at 07:30):

Tomas Skrivan (Jun 19 2025 at 07:32):

Tomas Skrivan (Jun 19 2025 at 07:33):

Tomas Skrivan (Jun 19 2025 at 07:35):

Tomas Skrivan (Jun 19 2025 at 07:36):

Tomas Skrivan (Jun 19 2025 at 07:38):

Joseph Tooby-Smith (Jun 19 2025 at 07:54):

Tomas Skrivan (Jun 19 2025 at 08:01):

Tomas Skrivan (Jun 19 2025 at 08:02):

Joseph Tooby-Smith (Jun 19 2025 at 08:02):

Joseph Tooby-Smith (Jun 19 2025 at 08:03):

Tomas Skrivan (Jun 19 2025 at 08:03):

Tomas Skrivan (Jun 19 2025 at 08:05):

Joseph Tooby-Smith (Jun 19 2025 at 08:10):

Tomas Skrivan (Jun 19 2025 at 08:16):

Joseph Tooby-Smith (Jun 19 2025 at 09:23):

Joseph Tooby-Smith (Jun 19 2025 at 09:25):

Tomas Skrivan (Jun 19 2025 at 09:30):

Joseph Tooby-Smith (Jun 19 2025 at 09:31):

Joseph Tooby-Smith (Jun 19 2025 at 09:33):

Tomas Skrivan (Jun 19 2025 at 09:42):

Joseph Tooby-Smith (Jun 19 2025 at 09:44):

Joseph Tooby-Smith (Jun 19 2025 at 09:46):

Joseph Tooby-Smith (Jun 19 2025 at 10:04):

Tomas Skrivan (Jun 19 2025 at 10:51):

Joseph Tooby-Smith (Jun 19 2025 at 10:53):

Tomas Skrivan (Jun 19 2025 at 10:55):

Tomas Skrivan (Jun 19 2025 at 10:57):

Joseph Tooby-Smith (Jun 19 2025 at 11:00):

Tomas Skrivan (Jun 19 2025 at 11:01):

Tomas Skrivan (Jun 19 2025 at 11:04):

Tomas Skrivan (Jun 19 2025 at 11:10):

Joseph Tooby-Smith (Jun 19 2025 at 11:40):

Joseph Tooby-Smith (Jun 19 2025 at 12:08):