Zulip Chat Archive
Stream: PhysLean
Topic: Variational Calculus
Tomas Skrivan (Jun 13 2025 at 10:57):
Here I just want to summarize few things about variational calculus and hopefully dispel some confusion about it. Hopefully I will also manage to pinpoint the difficulty compared to the finite dimensional case.
The canonical example of variational calculus is that the function that minimizes the functional
satisfies the Poisson equation
where is some given function.
The calculus of variation allows you to show that
thus the Poisson equation is .
(I'm being purposefully imprecise here as I do not mention the integration domain nor boundary conditions. This is usually done in physics textbooks as they often want to derive only the differential equation and not the boundary conditions.)
Finite dimensional analog
Let's have a look at analogous finite dimensional problem. Discretized version of the above energy, for , is
in Lean
def F (u : PiLp 2 (fun _ : Fin 10 => ℝ)) : ℝ := ∑ i, ((u (i+1) - u i)^2 + u i * f i)
The necessary condition for minimum, , can be expressed with fderiv or with gradient
theorem minimum_condition_1 (u) : (F u = ⨅ u', F u') → fderiv ℝ F x = 0 := sorry
theorem minimum_condition_2 (u) : (F u = ⨅ u', F u') → gradient F x = 0 := sorry
the difference between the two is not interesting in finite dimension but in infinite dimension fderiv leads to so called weak form of the differential equation and gradient leads to the strong form of the differential equation. For exposition we keep the distinction in finite dimensional case too. Let's define the weak and strong forms
def weakForm (u v : PiLp 2 (fun _ : Fin 10 => ℝ)) : ℝ :=
∑ i, ((u (i+1) - u i) * (v (i+1) - v i) - v i * f i)
def strongForm (u : PiLp 2 (fun _ : Fin 10 => ℝ)) : PiLp 2 (fun _ : Fin 10 => ℝ) :=
fun i => u (i+1) - 2 * u i + u (i-1) - u i * f i
Notice that u (i+1) - 2 * u i + u (i-1) in strongForm the discrete version of Laplace operator .
We can show that fderiv is equal to the weakForm and gradient is equal to the strongForm
theorem fderiv_eq_weakForm : fderiv ℝ F u v = 2 * weakForm u v := sorry
theorem gradient_eq_strongForm : gradient F u = - strongForm u := sorry
The equation strongForm u = 0 is:
∀ i, u (i+1) - 2 * u i + u (i-1) = u i * f i
which is discretized version of Poisson equation.
Operator view
The above equation are much more readable in matrix/operator form. Let's introduce matrix that takes difference of consecutive values (analog of derivative operator)
def D : (PiLp 2 (fun _ : Fin 10 => ℝ)) →L[ℝ] (PiLp 2 (fun _ : Fin 10 => ℝ)) :=
⟨⟨⟨fun x => fun i => x (i+1) - x i,sorry⟩,sorry⟩, sorry⟩
and its negative transpose
def D' : (PiLp 2 (fun _ : Fin 10 => ℝ)) →L[ℝ] (PiLp 2 (fun _ : Fin 10 => ℝ)) :=
⟨⟨⟨fun x => fun i => x i - x (i-1),sorry⟩,sorry⟩, sorry⟩
D' is just a different way of discretizing derivative and we can show that D.adjoint = - D'.
Now the above equations can be written as:
def F' (x) := ‖D x‖^2
def weakForm' (x u) := ⟪D x, D u⟫_ℝ
def strongForm' (x) := D' (D x)
theorem minimum_weakForm' : fderiv ℝ F' x u = 2 * weakForm' x u := sorry
theorem minimum_strongForm' : gradient F' x = - strongForm' x := sorry
i.e. it boils down to showing these two simple theorems
theorem fderiv_norm2 : fderiv ℝ (fun u => ‖D u‖^2) u v = 2 * ⟪D u, D v⟫_ℝ := sorry
theorem gradient_norm2 : gradient (fun u => ‖D u‖^2) u = - D' (D u) := sorry
and showing that D.adjoint = - D'.
In infinite dimensional case, the calculation is basically the same. The difficulty is to figure out the setting(i.e. appropriate function spaces) such that the calculation can be made formal.
Full code in finite dimension. I believe that the code can be made sorry free.
code
Difficulties in infinite dimensional case.
We can define the energy
def E (u : ℝ → ℝ) : ℝ := ∫ x, ((deriv u x)^2 + u x * f x)
but we get into trouble immediately. The integral is meaninful only under some integrability and differentiability conditions on u which also requires to specify the domain over which we integrate. Furthermore, the necessary conditions
theorem minimum_condition_1 (u) : (F u = ⨅ u', F u') → fderiv ℝ F u = 0 := sorry
theorem minimum_condition_2 (u) : (F u = ⨅ u', F u') → gradient F u = 0 := sorry
do not even type check as ℝ → ℝ is not NormedSpace for fderiv or InnerProductSpace for gradient.
The standard approach is to restrict E to some appropriate Sobolev space which gives you inner product space structure. In this setting, you should be able to also show the reverse implication gradient F u = 0 → (F u = ⨅ u', F u') with some convexity and coercivity condition on F.
Just for computing the derivative, I think some approach based on test functions and distributions should be possible too. I also have some ideas how to to this using the theory of convenient vector spaces. In both of these cases, you do not have inner product space so one has to be a bit more careful how to define gradient but I will write about that another time.
Tomas Skrivan (Jun 13 2025 at 11:01):
I would like to have a formal setting in which we can do the operator view reasoning because then I know how to write a tactic that automates the computation. Meanwhile we can do the calculation by hand. Here is a skeleton how to to the basic calculation for Poisson equation on [0,1] interval:
import Mathlib
noncomputable section
open Set
variable
(u δu f : ℝ → ℝ)
(hu : Differentiable ℝ u) (hδu : Differentiable ℝ δu) (hf : Continuous f)
(hb₀ : δu 0 = 0) (hb₁ : δu 1 = 0) -- boundary conditions
def E (f u : ℝ → ℝ) := ∫ x in Icc 0 1, 1/2 * (deriv u x)^2 + u x * f x
def B (f u v : ℝ → ℝ) := ∫ x in Icc 0 1, deriv u x * deriv v x + v x * f x
def strongForm (f u : ℝ → ℝ) (x : ℝ) := (- iteratedDeriv 2 u x + f x)
theorem minimum_weakForm :
(deriv (fun t : ℝ => E f (u + t • δu)) 0) -- this is more or less the definition of the symbol `δE/δu`
=
B f u δu := by
unfold E B
-- 1. exchange integral and derivative
-- 2. compute simple derivative
-- 3. clean up expression
sorry
theorem minimum_strongForm (u δu f : ℝ → ℝ) (hu : Differentiable ℝ u) (hδu : Differentiable ℝ δu) :
(deriv (fun t : ℝ => E f (u + t • δu)) 0)
=
∫ x in Icc 0 1, strongForm f u x * δu x := by
simp (disch:=assumption) [minimum_weakForm]
unfold B strongForm
-- 1. apply per-partes -- on discrete level this is the step where we show that `D.adjoint = - D'`
-- 2. remove boundary terms using boundary conditions
-- 3. clean up expression
sorry
Then we proceed with the fundamental lemma of variational calculus which states
(∀ δu, ∫ x in Icc 0 1, g x * δu x = 0) → ∀ x in Icc 0 1, g x = 0
for g = strongForm f u we conclude that
iteratedDeriv 2 u x = f x
Tomas Skrivan (Jun 13 2025 at 11:53):
Sketch how to use theory of distributions
Here is a sketch how to set things up to do the calculation with test functions and dsitributions such that you end up with operators for which I know how to write tactic that automates the calculations.
The following is only a sketch! Let 𝒟 ℝ is the space of test functions in reals and 𝒟' ℝis the space if distributions. We can convert between the two with
def toDistrib (f : 𝒟 ℝ) : 𝒟' ℝ := fun φ => ∫ x, f x * φ x
def fromDistrib (u : 𝒟' ℝ) : 𝒟 ℝ := if h : ∃ φ, toDistrib φ = u then choose h else 0
of course fromDistrib is only one way inverse, not every distribution is test function.
There is a well defined notion of adjoint for functions between test functions which produces function between distributions
def adjoint (F : (𝒟 ℝ) → 𝒟 ℝ) : (𝒟' ℝ) → 𝒟' ℝ := fun u φ => u (F φ)
We can lift this to adjoint' that operates only test functions only by using fromDistrib
def adjoint' (F : (𝒟 ℝ) → 𝒟 ℝ) : (𝒟 ℝ) → 𝒟 ℝ := fun u => fromDistrib (adjoint F (toDistrib u))
Not every function has meaningful adjoint this way but many do. For example adjoint' deriv = - deriv.
Now we can define variational gradient:
def varGradient (F : 𝒟 ℝ → ℝ) (ϕ : 𝒟 ℝ) : 𝒟 ℝ :=
adjoint' ((fun c _ => c) ∘ fderiv ℝ F ϕ) (fun _ => 1)
this is the function that is usually denoted with δ.
I believe if everything is made formal one can then show that
theorem varGradient_dirichelt_energy (f ϕ : 𝒟 ℝ) :
varGradient (fun ϕ => ∫ x, 1/2*(deriv ϕ x)^2 + ϕ x * f x) ϕ
=
- iteratedDeriv 2 φ + f := sorry
and importantly I know how to write a tactic for fderiv and adjoint' such that this is proven automatically by that tactic.
Joseph Tooby-Smith (Jun 14 2025 at 06:00):
Hi @Tomas Skrivan , many many thanks for writing this out! For me at least, this clears up my understanding of the things involved, so thank you for explaining!
Tomas Skrivan (Jun 16 2025 at 12:33):
Here is a concrete plan to do mechanistically computations in variational calculus and I will demonstrate it on derivation of Euler-Lagrange equations.
This nice thing is that this will lead to an actual derivation of Euler-Lagrange equations from action. Once I port data_synth tactic it will take action as an input and spit out Euler-Lagrange equations and a proof that it is the variational gradient of action.
What I want to show is that if we have the action functional
then
There are some issues that I do not know how to resolve regarding the integration domain. Therefore instead of working with I will be working with
i.e. just without the integral and I will define a meaning for instead of for and I will show that
The Lean statement of this equation is
theorem euler_lagrange (L : ℝ → ℝ → ℝ) (u : ℝ → ℝ)
(hL : ContDiff ℝ ∞ (↿L)) (hu : ContDiff ℝ ∞ u) :
HasVarGradientAt
(fun u t => L (u t) (deriv u t))
(fun t =>
deriv (L · (deriv u t)) ((u t))
-
deriv (fun t' => deriv (L (u t') ·) (deriv u t')) t)
u
where HasVarGradientAt has signature
HasVarGradientAt (S' : (ℝ → ℝ) → (ℝ → ℝ)) (grad : ℝ → ℝ) (u : ℝ → ℝ) : Prop
saying that the function S' has variational gradient grad at point u. It is formulated with the Has... idiom instead of varGrad S' = ... because there are some problems with uniqueness as grad is most likely uniquely determined only almost everywhere.
The definition of HasVarGradientAt is:
def HasVarGradientAt (S' : (ℝ → ℝ) → (ℝ → ℝ)) (grad : ℝ → ℝ) (u : ℝ → ℝ) :=
∃ F',
HasAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F'
∧
F' (fun _ => 1) = grad
similarly to normal gradient we take an adjoint of derivative and evaluate it at one (at constant one function in this case). The derivative is just normal derivative of S' at u in the direction δu and then we take an adjoint w.r.t. δu. The adjoint is a spacial notion inspired by theory of distributions:
structure HasAdjoint (F : (ℝ → ℝ) → (ℝ → ℝ)) (F' : (ℝ → ℝ) → (ℝ → ℝ)) where
test_fun_preserving : ∀ φ, IsTestFunction φ → IsTestFunction (F φ)
test_fun_preserving' : ∀ φ, IsTestFunction φ → IsTestFunction (F' φ)
adj : ∀ φ ψ, IsTestFunction φ → IsTestFunction ψ →
∫ x, F φ x * ψ x = ∫ x, φ x * F' ψ x
i.e. F has adjoint F' if both F and F' preserve test functions and ⟪F φ, ψ⟫ = ⟪φ, F' ψ⟫ where the "inner product" ⟪·, ·⟫ is meaningful only on test functions.
The computation of the adjoint in the proof of euler_lagrange is
apply has_adjoint_ext _ _ _
· apply add_rule
· apply mul_left_rule
apply id_rule
fun_prop [deriv]
· apply mul_left_rule
apply deriv_rule
fun_prop [deriv]
· intros; rfl
for which I know how to write a tactic to fully automate.
Here is the full code
code
I believe that all sorry can be filled in and if someone would do that I would really appreciate it :)
Joseph Tooby-Smith (Jun 16 2025 at 12:46):
A proof for the first sorry is:
theorem IsTestFunction.integrable {f} (hf : IsTestFunction f) :
MeasureTheory.Integrable f :=
Continuous.integrable_of_hasCompactSupport (continuous hf.smooth) hf.supp
Joseph Tooby-Smith (Jun 16 2025 at 13:16):
All but last sorry (if I fill any more I will just edit the code below):
Header
Tomas Skrivan (Jun 16 2025 at 13:27):
You can set up IsTestFunction with fun_prop as
@[fun_prop]
structure IsTestFunction (f : ℝ → ℝ) where
smooth : ContDiff ℝ ∞ f
supp : HasCompactSupport f
@[fun_prop]
theorem IsTestFunction.integrable {f} (hf : IsTestFunction f) :
MeasureTheory.Integrable f :=
Continuous.integrable_of_hasCompactSupport (continuous hf.smooth) hf.supp
@[fun_prop]
theorem IsTestFunction.differentiable {f} (hf : IsTestFunction f) :
Differentiable ℝ f := hf.1.differentiable ENat.LEInfty.out
@[fun_prop]
theorem IsTestFunction.contDiff {f} (hf : IsTestFunction f) :
ContDiff ℝ ∞ f := hf.1
@[fun_prop]
def IsTestFunction.mul {f g} (hf : IsTestFunction f) (hg : IsTestFunction g) :
IsTestFunction (f * g) where
smooth := ContDiff.mul hf.smooth hg.smooth
supp := HasCompactSupport.mul_left hg.supp
@[fun_prop]
def IsTestFunction.deriv {f} (hf : IsTestFunction f) :
IsTestFunction (deriv f) where
smooth := deriv' hf.smooth
supp := HasCompactSupport.deriv hf.supp
and it makes some proofs shorter. Like
· refine IsTestFunction.integrable ?_
apply IsTestFunction.mul
· exact IsTestFunction.deriv hψ
· exact hφ
is just fun_prop
Joseph Tooby-Smith (Jun 16 2025 at 13:34):
I've filled in all but the last sorry in the above - which I assume you don't actually want filled? Or would it also help to have that filled?
When trying to use fun_prop it keeps giving me an error :(.
Tomas Skrivan (Jun 16 2025 at 13:35):
Wonderful! Yes do not fill the last sorry.
Tomas Skrivan (Jun 16 2025 at 13:35):
It will be automated but I noticed an error there too :)
Tomas Skrivan (Jun 16 2025 at 14:19):
Here sorry free proof that for
we have
code
Tomas Skrivan (Jun 16 2025 at 14:24):
Joseph Tooby-Smith said:
I've filled in all but the last sorry in the above - which I assume you don't actually want filled? Or would it also help to have that filled?
When trying to use
fun_propit keeps giving me an error :(.
There was a crucial missing step where you should use
theorem has_adjoint_ext' (F G : (ℝ → ℝ) → (ℝ → ℝ)) (F' : (ℝ → ℝ) → (ℝ → ℝ))
(h : HasAdjoint G F') (h' : ∀ φ, IsTestFunction φ → F φ = G φ) :
HasAdjoint F F' := ...
which allows you to rewrite F in HasAdjoit F F' to HasAdjoint G F' if F = G equal only on test functions. The last sorry is missing assumption ContDiff ℝ ∞ δu.
Tomas Skrivan (Jun 16 2025 at 14:40):
I'm very happy that we can do some form of formal calculation like this without fully formal theory of Sobolev spaces, theory of distributions or diffeology/convenient vector spaces.
I think that showing that the function minimizing has to satisfy is the difficult part. However in physics you just start with as the fundamental law.
Joseph Tooby-Smith (Jun 16 2025 at 15:03):
Yeah, I agree, this is very nice! I think this is precisely what is needed to formalize what physicists do, in a way which is familiar to the physicists :).
Thus, the question, what would you like to do with this? I guess it is a bit complicated where this should sit. It would obviously be welcome in PhysLean, and I would certainly like to see this useable there, but I understand aspects of this may be better placed in SciLean or Mathlib. But PhysLean isn't (yet!) dependent on SciLean - which adds complications.
Tomas Skrivan (Jun 16 2025 at 15:07):
I think this can live in PhysLean for now. I don't have immediate plans for this in SciLean.
Tomas Skrivan (Jun 16 2025 at 15:10):
I'm not sure if it would be welcome in mathlib as it is not a definition from existing literature.
Joseph Tooby-Smith (Jun 16 2025 at 15:11):
Ok, yeah, I am also skeptical if it would be allowed in Mathlib.
Would you like me to put it together in a PR? I can deal with the linters, etc. Then can add you as a co-author to the commit (I think this is a thing). Or would you like to do it.
Tomas Skrivan (Jun 16 2025 at 15:12):
I'm drafing PR now
Tomas Skrivan (Jun 16 2025 at 17:27):
I made a PR612
Tomas Skrivan (Jun 16 2025 at 17:29):
It is effectively the same code only I generalized HasAdjoint to function between abstract vector spaces
structure HasVarAdjoint
(F : (X → U) → (X → V)) (F' : (X → V) → (X → U)) (μ : Measure X := by volume_tac) : Prop
I also allowed the integral to be over arbitrary measure. Not sure if that will ever be useful though ...
Tomas Skrivan (Jun 16 2025 at 17:41):
It would be nice to also have this theorem, if someone feels like proving it :)
open EuclideanSpace
noncomputable
def div (f : EuclideanSpace ℝ (Fin n) → EuclideanSpace ℝ (Fin n) →L[ℝ] U) (x : EuclideanSpace ℝ (Fin n)) : U :=
∑ i, fderiv ℝ (f · (single i 1)) x (single i 1)
protected theorem HasVarAdjoint.fderiv {μ : Measure X} :
HasVarAdjoint (fun φ : EuclideanSpace ℝ (Fin n) → U => fderiv ℝ φ) (fun φ => - div φ) := sorry
Joseph Tooby-Smith (Jun 17 2025 at 06:25):
I think more generally, building an API around HasVarAdjoint and HasVarGradientAt, which connects it to the vector calculus that @ZhiKai Pong has developed for Space would be very useful.
Tomas Skrivan (Jun 17 2025 at 07:10):
What kind of theorems do you have in mind? Can you give me an example?
Joseph Tooby-Smith (Jun 17 2025 at 08:32):
What I had in mind was essentially the theorem you have above, but with e.g. Space.div.
Although, that said, I'm not sure I know a situation in physics where one would want to do variational calculus with a world-volume (the thing you integrate over) which is Space, so maybe this doesn't make sense. In classical field theory, one does it with a world-volume SpaceTime, so relevant API around that might be good eventually.
Tomas Skrivan (Jun 18 2025 at 11:15):
I see, that would be this statement:
theorem HasVarAdjoint.gradient {d} :
HasVarAdjoint (fun φ : Space d → ℝ => gradient φ) (fun φ => - Space.div φ) := sorry
and it is more reasonable as right now there is now inner product structure on X →L[ℝ] U in mathlib which is required to even state HasVarAdjoint.fderiv.
Joseph Tooby-Smith (Jun 18 2025 at 11:48):
This seems sensible to me! BTW am I free to merge PR612 into master?
Tomas Skrivan (Jun 18 2025 at 13:17):
Yes, feel free to merge it. I can't do it
Joseph Tooby-Smith (Jun 19 2025 at 05:23):
Just checking you agree with the following definition @Tomas Skrivan , for the definition of a solution to the harmonic oscillator.
/-- The lagrangian of the harmonic oscillator is the kinetic energy minus the potential energy. -/
noncomputable def lagrangian (x : Time → ℝ) : Time → ℝ := fun t =>
kineticEnergy S x t - potentialEnergy S x t
/-- A trajectory `x : Time → ℝ` is a solution to the harmonic oscillator if
the variational derivative of the lagrangian evaulated at `x` is zero. -/
def IsSolution (x : Time → ℝ) : Prop := HasVarGradientAt (lagrangian S) 0 x
This somewhat depends on the changes @Lode Vermeulen has made to this file, so won't commit this until then.
Lode Vermeulen (Jun 19 2025 at 06:17):
Just a heads up, I will be busy the remainder of this week and next week on finishing up my final courses for the year. This means that I will probably only be able to finish that PR after that.
Joseph Tooby-Smith (Jun 19 2025 at 06:47):
@Lode Vermeulen No worries, hope your final courses go well!
Tomas Skrivan (Jun 19 2025 at 06:50):
Ahh I think there is a bug in the definition of HasVarGradientAt :)
inductive HasVarGradientAt (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
(μ : Measure X := by volume_tac) : Prop
| intro (F')
(adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
(eq : F' (fun _ => 1) = grad)
The HasVarAdjoint guarantees only that F' is doing reasonable things on test functions but (fun _ => 1) is not a test function thus F' is allowed to return a junk value. Likely zero thus the above definition of IsSolution is no good as one can probably show that any path is a solution.
I think the definition should be:
inductive HasVarGradientAt (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
(μ : Measure X := by volume_tac) : Prop
| intro (F')
(adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
(eq : ∀ (φ : X → ℝ) (D : Set X), IsTestFunction φ → IsOpen D → (∀ x ∈ D, φ x = 1) →
∀ x ∈ D, F' φ x = grad x)
i.e. the eq argument is a bit more refined. Instead of plugging (fun _ => 1) to F' we plug in a test function φ that is constant one on some open set D and then grad is equal to F' φ only on D.
Tomas Skrivan (Jun 19 2025 at 07:11):
However this definition might not be completely satisfactory neither. I believe that it yields expected result for local laws i.e. Lagrangian depending on u(t,x) and its derivatives at x.
For example in peridynamics the potential energy looks like this
i.e. the local potential energy density at is thus it depends on values of in some finite neighbourhood .
Tomas Skrivan (Jun 19 2025 at 07:14):
I think in this case the definition should be
inductive HasVarGradientAt' (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
(μ : Measure X := by volume_tac) : Prop
| intro (F')
(adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
(eq : ∀ (x : X), ∃ D : Set X, IsOpen D ∧ ∀ (φ : X → ℝ), (∀ x ∈ D, φ x = 1) → F' φ x = grad x)
i.e. for every point x there is a neighboorhood D such that if we take a test function φ that is one on that neighborhood we have F' φ x = grad x.
Tomas Skrivan (Jun 19 2025 at 07:15):
For the peridynamic potential the neighboorhood D would be . For local laws(Lagrangian depending only on derivatives of u) the neighboorhood can be arbitrary small.
Tomas Skrivan (Jun 19 2025 at 07:18):
Also the definition is also missing that the deriv is not returning junk value
Joseph Tooby-Smith (Jun 19 2025 at 07:18):
Shouldn't we be enforcing some linearity condition on S and F' on smooth functions? Which then determine the value of them away from the test functions?
In your last iteration don't you need the extra condition that x is in D?
Tomas Skrivan (Jun 19 2025 at 07:19):
Joseph Tooby-Smith said:
In your last iteration don't you need the extra condition that
xis inD?
Ha, yes I do :)
Joseph Tooby-Smith (Jun 19 2025 at 07:21):
Shouldn't we be enforcing some linearity condition on
SandF'on smooth functions? Which then determine the value of them away from the test functions?
Nevermind - in general the lagrangian isn't linear :(.
Tomas Skrivan (Jun 19 2025 at 07:22):
Joseph Tooby-Smith said:
Shouldn't we be enforcing some linearity condition on
SandF'on smooth functions? Which then determine the value of them away from the test functions?
Yeah I was thinking about that too but until it is not needed in some proof I'm not adding it.
If you know that HasVarAdjoint F F' μ then you know that
∀ φ ψ, IsTestFunction φ → IsTestFunction ψ →
∫ x, ⟪F φ x, ψ x⟫_ℝ ∂μ = ∫ x, ⟪φ x, F' ψ x⟫_ℝ ∂μ
and you might be able to show some linearity from it.
Tomas Skrivan (Jun 19 2025 at 07:24):
∫ x, ⟪F (φ+φ') x, ψ x⟫_ℝ ∂μ = ∫ x, ⟪(φ+φ') x, F' ψ x⟫_ℝ ∂μ
= ∫ x, ⟪φ' x, F' ψ x⟫_ℝ ∂μ + ∫ x, ⟪φ' x, F' ψ x⟫_ℝ ∂μ
= ∫ x, ⟪F φ' x, ψ x⟫_ℝ ∂μ + ∫ x, ⟪F φ' x, ψ x⟫_ℝ ∂μ
= ∫ x, ⟪F φ' x + F φ' x, ψ x⟫_ℝ ∂μ
Tomas Skrivan (Jun 19 2025 at 07:26):
Thus ∀ ψ, ∫ x, ⟪F (φ+φ') x - (F φ' x + F φ' x), ψ x⟫_ℝ ∂μ = 0 and by applying fundamental lemma of variational calculus you get F (φ+φ') x = F φ' x + F φ' x
Joseph Tooby-Smith (Jun 19 2025 at 07:26):
Right! Because even though the lagrangian is not linear:
fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0
is in terms of δu... or something like that.
Tomas Skrivan (Jun 19 2025 at 07:28):
Yes exactly
Joseph Tooby-Smith (Jun 19 2025 at 07:30):
Ok, but then I don't see how F' can give junk values on not test-functions, since it too must be linear (by a similar argument to what you have given), and thus it is determined by it's value on test-functions (which are dense in the space of smooth functions).
Tomas Skrivan (Jun 19 2025 at 07:32):
But we do not require any continuity of F' so we can't use the fact that test functions are dense in smooth functions.
Tomas Skrivan (Jun 19 2025 at 07:33):
That is kind of the point of the definition of HasVarAdjoint and HasVarGradient that you do everything on test functions and you completely dodge the question about topology on infinite dimensional spaces.
Tomas Skrivan (Jun 19 2025 at 07:35):
Tomas Skrivan said:
∫ x, ⟪F (φ+φ') x, ψ x⟫_ℝ ∂μ = ∫ x, ⟪(φ+φ') x, F' ψ x⟫_ℝ ∂μ = ∫ x, ⟪φ' x, F' ψ x⟫_ℝ ∂μ + ∫ x, ⟪φ' x, F' ψ x⟫_ℝ ∂μ = ∫ x, ⟪F φ' x, ψ x⟫_ℝ ∂μ + ∫ x, ⟪F φ' x, ψ x⟫_ℝ ∂μ = ∫ x, ⟪F φ' x + F φ' x, ψ x⟫_ℝ ∂μ
This whole calculation is valid only under IsTestFunction φ, IsTestFunction φ', IsTestFunction ψ.
Tomas Skrivan (Jun 19 2025 at 07:36):
You can't use linearity and partition of unity to show that F' (fun _ => 1) is something meaningful because (fun _ => 1) has to be decomposed into infinitely many test functions
Tomas Skrivan (Jun 19 2025 at 07:38):
To show ∑ i, F' (φ i) = F' (∑ i, φ i) for partition of unity φ i you need some continuity of F'` as the sum has infinitely many terms.
Joseph Tooby-Smith (Jun 19 2025 at 07:54):
Ok, understood, and I agree with this philosophy.
And the 'fundamental lemma of variational calculus' answered another question I had, which was is the following true:
lemma eq_on_isTestFunction_of_hasVarAdjoint {F : (X → U) → (X → V)}
{F' G' : (X → V) → (X → U)} {μ : Measure X} (h : HasVarAdjoint F F' μ)
(h : HasVarAdjoint F G' μ) :
∀ φ, IsTestFunction φ → F' φ = G' φ := sorry
Which, with your new definition, implies uniqueness of the variational gradient, as long as F' φ x in the eq condition doesn't depend on the choice of D or the choice of φ, which it shouldn't do.
Tomas Skrivan (Jun 19 2025 at 08:01):
Ohh nice, I didn't realize that. I though that we will get uniqueness only once we talk about some topology on some infinite dimensional space.
Tomas Skrivan (Jun 19 2025 at 08:02):
So maybe we can even define varGradient function
Joseph Tooby-Smith (Jun 19 2025 at 08:02):
Tomas Skrivan said:
I think in this case the definition should be
inductive HasVarGradientAt' (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U) (μ : Measure X := by volume_tac) : Prop | intro (F') (adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ) (eq : ∀ (x : X), ∃ D : Set X, IsOpen D ∧ ∀ (φ : X → ℝ), (∀ x ∈ D, φ x = 1) → F' φ x = grad x)i.e. for every point
xthere is a neighboorhoodDsuch that if we take a test functionφthat is one on that neighborhood we haveF' φ x = grad x.
I also think you are going to actually need to state explicitly in the eq condition that φ is a test function, or something along those lines.
Joseph Tooby-Smith (Jun 19 2025 at 08:03):
Otherwise this eq condition implies the old one.
Tomas Skrivan (Jun 19 2025 at 08:03):
Joseph Tooby-Smith said:
I also think you are going to actually need to state explicitly in the
eqcondition thatφis a test function, or something along those lines.
Yes, I just realized that as I'm fixing the example in HasVarGradient.lean
Tomas Skrivan (Jun 19 2025 at 08:05):
Now I'm missing
hφ' : ∀ x_1 ∈ Metric.ball x 1, φ x_1 = 1
⊢ (fun x => m * deriv u x * φ x) =ᶠ[nhds x] fun x => m * deriv u x
and I have never worked with =ᶠ[nhds x] before
Joseph Tooby-Smith (Jun 19 2025 at 08:10):
I think something like docs#Filter.eventuallyEq_of_mem, along with the condition that
Metric.ball x 1 ∈ nhds x (docs#Metric.ball_mem_nhds) should help
Tomas Skrivan (Jun 19 2025 at 08:16):
Thanks! now I have proof of euler_lagrange_particle_1d with updated definition of HasVarGradientAt
Joseph Tooby-Smith (Jun 19 2025 at 09:23):
I think this is the correct manifestation of the fundamental theorem of variational calculus that we would need to define varGradient:
lemma fundamental_theorem_of_variational_calculus {f : X → V}
(μ : Measure X) [IsFiniteMeasureOnCompacts μ] [μ.IsOpenPosMeasure]
[OpensMeasurableSpace X]
(hf : IsTestFunction f) (hg : ∀ g, IsTestFunction g → ∫ x, ⟪f x, g x⟫_ℝ ∂μ = 0) :
f = 0 := by
have hf' := hg f hf
rw [MeasureTheory.integral_eq_zero_iff_of_nonneg] at hf'
· rw [Continuous.ae_eq_iff_eq] at hf'
· funext x
have hf'' := congrFun hf' x
simpa using hf''
· have hf : Continuous f := hf.smooth.continuous
fun_prop
· fun_prop
· intro x
simp
exact real_inner_self_nonneg
· apply IsTestFunction.integrable
exact IsTestFunction.inner hf hf
Joseph Tooby-Smith (Jun 19 2025 at 09:25):
I had to add the instance: μ.IsOpenPosMeasure, which seems to be true for most important cases e.g.
#synth (volume (α := ℝ)).IsOpenPosMeasure
-- says: Measure.IsAddHaarMeasure.toIsOpenPosMeasure
Tomas Skrivan (Jun 19 2025 at 09:30):
Isn't it enought for f to be continuous?
Joseph Tooby-Smith (Jun 19 2025 at 09:31):
One needs that:
Integrable (fun x => ⟪f x, f x⟫_ℝ) μ
which I proved using
apply IsTestFunction.integrable
exact IsTestFunction.inner hf hf
in the last two lines
Joseph Tooby-Smith (Jun 19 2025 at 09:33):
And also have hf' := hg f hf... so at least this proof doesn't generalize to the case when f is just continuous
Tomas Skrivan (Jun 19 2025 at 09:42):
Ohh I see, the proof can be also done by contradiction. Assume that for some x we have f x > 0 then in some neighbourhood you have f > 0. Then take g positive test function in that neighborhood and you are integrating non zero function.
Joseph Tooby-Smith (Jun 19 2025 at 09:44):
But in general, positive doesn't mean anything here because the target of g is the vector space V.
Joseph Tooby-Smith (Jun 19 2025 at 09:46):
Below is the proof of the result eq_on_isTestFunction_of_hasVarAdjoint I wanted above:
Header
Joseph Tooby-Smith (Jun 19 2025 at 10:04):
I think if you want to have uniqueness of the variational derivative, you need to enforce locality in eq, that is to say something like the following:
inductive HasVarGradientAt'' (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
(μ : Measure X := by volume_tac) : Prop
| intro (F')
(adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
(eq : ∀ (x : X), ∃ D ∈ nhds x, ∀ D' ∈ nhds x, D' ⊆ D →
∀ (φ : X → ℝ), IsTestFunction φ → (∀ x ∈ D', φ x = 1) → F' φ x = grad x)
This locality is essentially to ensure that the choice of D does not matter, as long as it is small enough.
The problem is, I think, this would exclude the use of this on the non-local 'peridynamics' example above.
Tomas Skrivan (Jun 19 2025 at 10:51):
Joseph Tooby-Smith said:
But in general,
positivedoesn't mean anything here because the target ofgis the vector spaceV.
Ohh I see, then there is x such that f x ≠ 0. There has to be a neighborhood N of x such that ⟪f x', f x⟫ > 0 for x' ∈ N. Now take positive test function φ with support in N and pick g x' := φ x' • f x. Then you can show ⟪f x', g x'⟫ ≥ 0 for x' ∈ N and zero outside of N.
If there is some trouble in finding N then you can assume that U is finite dimensional and do it component wise and then take intersection of all those neighborhoods.
Joseph Tooby-Smith (Jun 19 2025 at 10:53):
I guess one option to get around the non-local examples is to do something like the following:
inductive HasNonLocalVarGradientAt (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
(E : (x : X) → Set X) (hE : ∀ x, x ∈ E x)
(μ : Measure X := by volume_tac) : Prop
| intro (F')
(adjoint : HasVarAdjoint (fun δu t => deriv (fun s : ℝ => S' (u + s • δu) t) 0) F' μ)
(eq : ∀ (x : X), ∃ D ∈ nhds x, E x ⊆ D ∧
∀ D' ∈ nhds x, D' ⊆ D ∧ E x ⊆ D' →
∀ (φ : X → ℝ), IsTestFunction φ → (∀ x ∈ D', φ x = 1) → F' φ x = grad x)
Where E : (x : X) → Set X expresses the non-locality in the function. The point being that this is only unique if you specify E.
Tomas Skrivan (Jun 19 2025 at 10:55):
I'm a bit confused I do not see the problem with the definition
inductive HasVarGradientAt (S' : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U)
(μ : Measure X := by volume_tac) : Prop
| intro (F')
(diff : ∀ δu x, IsTestFunction δu →
Differentiable ℝ (fun s : ℝ => S' (fun x' => u x' + s • δu x') x))
(adjoint : HasVarAdjoint (fun δu x => deriv (fun s : ℝ => S' (u + s • δu) x) 0) F' μ)
(eq : ∀ (x : X), ∃ D : Set X,
x ∈ D ∧ IsOpen D ∧ ∀ (φ : X → ℝ), IsTestFunction φ → (∀ x ∈ D, φ x = 1) → F' φ x = grad x)
It should be usable in the peridynamic example.
Tomas Skrivan (Jun 19 2025 at 10:57):
I'm working on the uniqueness proof for this definition now. Let's see if I get stuck
Joseph Tooby-Smith (Jun 19 2025 at 11:00):
Here is a situation where I don't think this would work. Suppose X is open but not compact, then I can let D be X, but then there are no functions φ : X → ℝ which satisfy IsTestFunction φ and
∀ x ∈ D, φ x = 1, and thus ∀ (φ : X → ℝ), IsTestFunction φ → (∀ x ∈ D, φ x = 1) → F' φ x = grad x holds for any grad.
Tomas Skrivan (Jun 19 2025 at 11:01):
Yeah I think the definition needs IsCompact D instead of IsOpen D
Tomas Skrivan (Jun 19 2025 at 11:04):
And somehow it is not necessary to enforce in the definition that D contains neighborhood of x.
Because if you have some D for x that satisfies eq than you can take D' that is D ⊆ D' and this D' also satisfies eq. So you can always arbitrary enlarge D while keeping it compact.
Tomas Skrivan (Jun 19 2025 at 11:10):
The partial uniqueness proof
def smoothIndicatorOn (Ω : Set X) (x : X) : ℝ := sorry
lemma smoothIndicatorOn_eq_one (Ω : Set X) (x : X) (hx : x ∈ Ω) : smoothIndicatorOn Ω x = 1 := sorry
@[fun_prop]
lemma smoothIndicatorOn.isTestFunction (Ω : Set X) : IsTestFunction (smoothIndicatorOn Ω) := sorry
lemma HasVarGradientAt.unique
{S' : (X → U) → (X → ℝ)} {grad grad' : X → U} {u : X → U} {μ : Measure X}
[IsFiniteMeasureOnCompacts μ] [μ.IsOpenPosMeasure] [OpensMeasurableSpace X]
(h : HasVarGradientAt S' grad u μ) (h' : HasVarGradientAt S' grad' u μ) :
grad = grad' := by
obtain ⟨F,_,hF,eq⟩ := h
obtain ⟨G,_,hG,eq'⟩ := h'
funext x
obtain ⟨D,_,_,hgrad⟩ := eq x
obtain ⟨D',_,_,hgrad'⟩ := eq' x
let φ := smoothIndicatorOn (D ∪ D')
have hφ : IsTestFunction φ := by fun_prop
have h := hgrad φ hφ (by intros _ hx; unfold φ; rw[smoothIndicatorOn_eq_one]; simp[hx])
have h' := hgrad' φ hφ (by intros _ hx; unfold φ; rw[smoothIndicatorOn_eq_one]; simp[hx])
rw[← h, ← h']
rw[hF.unique hG φ hφ]
What is missing that we can define smoothIndicatorOn Ω which is test function that is one on Ω
Joseph Tooby-Smith (Jun 19 2025 at 11:40):
Seems like docs#SmoothBumpFunction may be helpful in this regard.
Joseph Tooby-Smith (Jun 19 2025 at 12:08):
Something like:
noncomputable def maxDistance (Ω : Set X) (z : X) : ℝ :=
⨆ y ∈ Ω, dist z y
lemma le_maxDistance (Ω : Set X) (z y: X) (hy : y ∈ Ω) :
dist z y ≤ maxDistance Ω z := by
sorry
lemma maxDistance_pos (Ω : Set X) (z : X) :
0 < maxDistance Ω z := by
sorry
noncomputable def smoothIndicatorOn' (Ω : Set X) (z : X) :
SmoothBumpFunction (modelWithCornersSelf ℝ X ) z where
rIn := maxDistance Ω z
rOut := maxDistance Ω z + 1
rIn_pos := by
-- 0 < maxDistance Ω z
sorry
rIn_lt_rOut := lt_add_one (maxDistance Ω z)
closedBall_subset := by simp
variable [FiniteDimensional ℝ X]
noncomputable def smoothIndicatorOn (Ω : Set X) (z : X) : X → ℝ :=
(smoothIndicatorOn' Ω z).toFun
lemma smoothIndicatorOn_eq_one (Ω : Set X) (z x : X) (hx : x ∈ Ω) :
(smoothIndicatorOn Ω z) x = 1 := by
simp [smoothIndicatorOn]
rw [SmoothBumpFunction.one_of_dist_le]
simp
simp [smoothIndicatorOn']
-- Dist.dist x z ≤ maxDistance Ω z
sorry
lemma smoothIndicatorOn.isTestFunction (Ω : Set X) (z : X) : IsTestFunction (smoothIndicatorOn Ω z) where
smooth := ContMDiff.contDiff (smoothIndicatorOn' Ω z).contMDiff
supp := (smoothIndicatorOn' Ω z).hasCompactSupport
Joseph Tooby-Smith (Jun 19 2025 at 12:09):
This relies on FiniteDimensional ℝ X, but I think that is ok, no?
Tomas Skrivan (Jun 19 2025 at 12:20):
I'm thinking this
import Mathlib
open InnerProductSpace ContDiff
section SmoothIndicatorOn
variable {X : Type*} [NormedAddCommGroup X] [InnerProductSpace ℝ X]
noncomputable
def smoothIndicatorOn (Ω : Set X) (x : X) : ℝ :=
let r2 := ⨆ x ∈ Ω, ‖x‖^2
1 - Real.smoothTransition (⟪x,x⟫_ℝ - r2)
lemma smoothIndicatorOn.eq_one (Ω : Set X) (x : X)
(hΩ : IsCompact Ω) (hx : x ∈ Ω) :
smoothIndicatorOn Ω x = 1 := sorry
lemma smoothIndicatorOn.contDiff (Ω : Set X) : ContDiff ℝ ∞ (smoothIndicatorOn Ω) := by
unfold smoothIndicatorOn
sorry
lemma smoothIndicatorOn.hasCompactSupport (Ω : Set X) : HasCompactSupport (smoothIndicatorOn Ω) := by
sorry
end SmoothIndicatorOn
Tomas Skrivan (Jun 19 2025 at 12:26):
I think both approaches are fine and the condition FiniteDimensional ℝ X is totally fine. I'm not aware of any application where that is not the case.
Tomas Skrivan (Jun 19 2025 at 12:28):
I think your smoothIndicatorOn_eq_one needs hΩ : IsCompact Ωsuch that maxDistance is not a junk value
Joseph Tooby-Smith (Jun 19 2025 at 12:29):
Yeah, I think this is probably true - at the minute I'm struggling to proof anything with maxDistance.
Tomas Skrivan (Jun 19 2025 at 12:30):
Yeah you need that continuous function attains maximum on a compact.
Tomas Skrivan (Jun 19 2025 at 12:33):
Ohh you definition is bit odd.
noncomputable def smoothIndicatorOn (Ω : Set X) (z : X) : X → ℝ :=
(smoothIndicatorOn' Ω z).toFun
what is z?
Tomas Skrivan (Jun 19 2025 at 12:34):
I think z can by arbitrary and I effectively have z = 0
Joseph Tooby-Smith (Jun 19 2025 at 12:36):
Ah! Right I forgot zero exists :), otherwise, yes it was just an arbitrary point.
Tomas Skrivan (Jun 19 2025 at 12:38):
Also can't we just use ContDiffBump?
Tomas Skrivan (Jun 19 2025 at 12:38):
I think SmoothBumpFunction is unecessary
Joseph Tooby-Smith (Jun 19 2025 at 12:39):
I thought this as well, but I couldn't find the smoothness results for ContDiffBump.
Tomas Skrivan (Jun 19 2025 at 12:41):
Tomas Skrivan (Jun 19 2025 at 12:52):
I will draft a PR with the uniqueness proofs
Joseph Tooby-Smith (Jun 19 2025 at 12:54):
Did you manage to get one of the above to work?
Tomas Skrivan (Jun 19 2025 at 12:54):
I have this
lemma HasVarGradientAt.unique
{X : Type*} [NormedAddCommGroup X] [InnerProductSpace ℝ X] [FiniteDimensional ℝ X] [MeasurableSpace X]
{S' : (X → U) → (X → ℝ)} {grad grad' : X → U} {u : X → U} {μ : Measure X}
[IsFiniteMeasureOnCompacts μ] [μ.IsOpenPosMeasure] [OpensMeasurableSpace X]
(h : HasVarGradientAt S' grad u μ) (h' : HasVarGradientAt S' grad' u μ) :
grad = grad' := by
obtain ⟨F,_,hF,eq⟩ := h
obtain ⟨G,_,hG,eq'⟩ := h'
funext x
obtain ⟨D,_,hD,hgrad⟩ := eq x
obtain ⟨D',_,hD',hgrad'⟩ := eq' x
-- prepare test function that is one on `D ∪ D'`
let r := ⨆ x ∈ D ∪ D', ‖x‖
have : 0 ≤ r := sorry
let φ : ContDiffBump (0 : X) := {
rIn := r + 1,
rOut := r + 2,
rIn_pos := by linarith,
rIn_lt_rOut := by linarith}
-- few properties about `φ`
have hφ : IsTestFunction (fun x : X => φ x) := by
constructor
apply ContDiffBump.contDiff
apply ContDiffBump.hasCompactSupport
have hφ' : ∀ x, x ∈ D ∪ D' → x ∈ Metric.closedBall 0 φ.rIn := sorry
have h := hgrad φ hφ
(by intros _ hx; unfold φ; rw[φ.one_of_mem_closedBall]; apply hφ'; simp[hx])
have h' := hgrad' φ hφ
(by intros _ hx; unfold φ; rw[φ.one_of_mem_closedBall]; apply hφ'; simp[hx])
rw[← h, ← h',hF.unique hG φ hφ]
Tomas Skrivan (Jun 19 2025 at 12:54):
There are still two sorries
Tomas Skrivan (Jun 19 2025 at 12:55):
The first one should be easy, the second is probably a bit harder
Joseph Tooby-Smith (Jun 19 2025 at 13:03):
with a slightly different r to what you had this works:
lemma HasVarGradientAt.unique'
{X : Type*} [NormedAddCommGroup X] [InnerProductSpace ℝ X] [FiniteDimensional ℝ X] [MeasurableSpace X]
{S' : (X → U) → (X → ℝ)} {grad grad' : X → U} {u : X → U} {μ : Measure X}
[IsFiniteMeasureOnCompacts μ] [μ.IsOpenPosMeasure] [OpensMeasurableSpace X]
(h : HasVarGradientAt S' grad u μ) (h' : HasVarGradientAt S' grad' u μ) :
grad = grad' := by
obtain ⟨F,_,hF,eq⟩ := h
obtain ⟨G,_,hG,eq'⟩ := h'
funext x
obtain ⟨D,hm,hD,hgrad⟩ := eq x
obtain ⟨D',_,hD',hgrad'⟩ := eq' x
-- prepare test function that is one on `D ∪ D'`
let r := sSup ((fun x => ‖x‖) '' (D ∪ D'))
have : 0 ≤ r := by
obtain ⟨x, h1, h2, h3⟩ := IsCompact.exists_sSup_image_eq_and_ge (s := D ∪ D')
(IsCompact.union hD hD') (Set.Nonempty.inl (Set.nonempty_of_mem hm)) (f := fun x => ‖x‖) (by fun_prop)
unfold r
apply le_of_le_of_eq (b := ‖x‖)
· exact norm_nonneg x
· rw [← h2]
let φ : ContDiffBump (0 : X) := {
rIn := r + 1,
rOut := r + 2,
rIn_pos := by linarith,
rIn_lt_rOut := by linarith}
-- few properties about `φ`
have hφ : IsTestFunction (fun x : X => φ x) := by
constructor
apply ContDiffBump.contDiff
apply ContDiffBump.hasCompactSupport
have hφ' : ∀ x, x ∈ D ∪ D' → x ∈ Metric.closedBall 0 φ.rIn := by
intro x hx
simp [φ, r]
obtain ⟨y, h1, h2, h3⟩ := IsCompact.exists_sSup_image_eq_and_ge (s := D ∪ D')
(IsCompact.union hD hD') (Set.Nonempty.inl (Set.nonempty_of_mem hm)) (f := fun x => ‖x‖) (by fun_prop)
rw [h2]
have h3' := h3 x hx
apply le_trans h3'
simp
have h := hgrad φ hφ
(by intros _ hx; unfold φ; rw[φ.one_of_mem_closedBall]; apply hφ'; simp[hx])
have h' := hgrad' φ hφ
(by intros _ hx; unfold φ; rw[φ.one_of_mem_closedBall]; apply hφ'; simp[hx])
rw[← h, ← h',hF.unique hG φ hφ]
Tomas Skrivan (Jun 19 2025 at 14:23):
PR614 has uniqueness of variational adjoint and gradient
Tomas Skrivan (Jun 19 2025 at 14:25):
PR615 formulates Euler-Lagrange equations for:
Classical finite dimensional systems:
theorem hasVarGradient_action_eq_euler_lagrange
(L : ℝ → X → X → ℝ) (q : ℝ → X)
(hq : ContDiff ℝ ∞ u) (hL : ContDiff ℝ ∞ ↿L) :
HasVarGradientAt
(fun (q : ℝ → X) (t : ℝ) => L t (q t) (deriv q t))
(fun t =>
-- ∂L/∂x
gradient (L t · (deriv q t)) (q t)
-
-- d/dt ∂L/∂ẋ
deriv (fun t' => gradient (L t' (q t') ·) (deriv q t')) t)
u := sorry
Classical scalar fields:
theorem hasVarGradient_action_eq_euler_lagrange_field
(L : (t : ℝ) → (x : Space d) → (u : ℝ) → (dudt : ℝ) → (dudx : Space d) → ℝ)
(u : ℝ → Space d → ℝ)
(hu : ContDiff ℝ ∞ ↿u) (hL : ContDiff ℝ ∞ ↿L) :
HasVarGradientAt
(fun (u : ℝ×Space d → ℝ) ((t,x) : ℝ×Space d) =>
L t x (u (t,x)) (deriv (fun t' => u (t',x)) t) (gradient (fun x' => u (t,x')) x))
(fun (t,x) =>
-- ∂L/∂u
gradient (fun u' => L t x u' (deriv (u · x) t) (gradient (u t ·) x)) (u t x)
-
-- d/dt ∂L/∂(∂ₜu)
deriv (fun t' => deriv (fun dudt => L t' x (u t' x) dudt (gradient (u t' ·) x)) (deriv (u · x) t')) t
-
-- divₓ ∂L/∂(∂ₓu)
Space.div (fun x' => gradient (fun dudx => L t x' (u t x') (deriv (u · x') t) dudx) (gradient (u t ·) x')) x)
(↿u) := sorry
Tomas Skrivan (Jun 19 2025 at 14:36):
This PR has bunch of sorry. In particular, variational adjoint of gradient is Space.div and generalization of previous HasVarAdjoint.deriv. Any help with these proofs is welcome :)
Tomas Skrivan (Jun 19 2025 at 14:40):
For example, the classical scalar fields version of Euler-Lagrange equation allows us to derive the wave equation
from the Lagrangian
Tomas Skrivan (Jun 19 2025 at 16:48):
I made another draft PR616 that defines HasVarAdjDerivAt F F' u which defines variational adjoint derivative.
The connection to HasVarGradientAt F grad u is that grad = F' (fun _ => 1).
The benefit is that we can formulate composition rule for HasVarAdjDerivAt thus computing HasVarAdjDerivAt does not have to be in two steps, derivative first and then adjoint, but both steps can be done at the same time.
Joseph Tooby-Smith (Jun 19 2025 at 16:54):
Great! I can fix the linters on PR614, and get it merged.
Joseph Tooby-Smith (Jun 19 2025 at 17:12):
What are the reasons for the extra conditions diff and linearize in HasVarAdjDerivAt? Are they just needed lemmas to ensure that the composition result works?
Tomas Skrivan (Jun 19 2025 at 17:14):
Yes, it is just cooked up such that you can prove the composition theorem.
Tomas Skrivan (Jun 19 2025 at 17:27):
It is very much inspired how smoothness is defined in diffeology.
Joseph Tooby-Smith (Jun 19 2025 at 17:30):
You mean the diff condition, and in the sense that the test-functions are the plots of the diffeology?
Joseph Tooby-Smith (Jun 19 2025 at 17:34):
PR614 is now merged :).
Tomas Skrivan (Jun 19 2025 at 18:00):
Joseph Tooby-Smith said:
You mean the
diffcondition, and in the sense that the test-functions are theplotsof the diffeology?
Yes, that is the motivation.
Tomas Skrivan (Jun 19 2025 at 19:09):
Yay, HasVarAdjDerivAt.comp is now sorry free!
Joseph Tooby-Smith (Jun 20 2025 at 06:00):
Here is a proof of HasVarAdjDerivAt.deriv':
import PhysLean.Mathematics.FDerivCurry
lemma deriv' (u : ℝ → ℝ) (hu : ContDiff ℝ ∞ u) :
HasVarAdjDerivAt (fun φ : ℝ → ℝ => deriv φ) (fun φ x => - deriv φ x) u where
smooth_at := hu
diff := by intros; fun_prop [deriv]
linearize := by
intro φ hφ x
have hd : DifferentiableAt ℝ (fun x => deriv (fun x_1 => φ x_1 x) 0) x :=
fderiv_curry_differentiableAt_fst_comp_snd _ _ _ _ (ContDiff.of_le hφ (ENat.LEInfty.out))
conv_rhs =>
enter [1, s']
rw [deriv_add (function_differentiableAt_snd _ _ _ (hφ.differentiable ENat.LEInfty.out))
(by fun_prop)]
have hd2 : DifferentiableAt ℝ (fun s' =>
deriv (fun x => s' • deriv (fun x_1 => φ x_1 x) (0 : ℝ)) x) (0 : ℝ) := by
conv_lhs =>
enter [s']
simp
fun_prop
rw [deriv_add (by fun_prop) (hd2)]
simp only [deriv_const', smul_eq_mul, differentiableAt_const, deriv_const_mul_field',
differentiableAt_id', deriv_mul, deriv_id'', one_mul, mul_zero, add_zero, zero_add]
dsimp [deriv]
exact fderiv_swap (𝕜 := ℝ) φ 0 1 x 1 (ContDiff.of_le hφ (ENat.LEInfty.out))
adjoint := by
simp (disch:=fun_prop) [deriv_add]
apply HasVarAdjoint.congr_fun
case h' =>
intro φ hφ
have := hφ.smooth.differentiable (ENat.LEInfty.out)
have := hu.differentiable (ENat.LEInfty.out)
simp (disch:=fun_prop) [deriv_add]
rfl
case h =>
apply HasVarAdjoint.deriv
Joseph Tooby-Smith (Jun 20 2025 at 07:12):
I filled in the sorries in PR616, I directly made a commit to the pull-request, it probably needs a bit of tidying. Hope this is ok, feel free to revert if not.
Tomas Skrivan (Jun 20 2025 at 07:17):
Excellent, thanks a lot!
Joseph Tooby-Smith (Jun 20 2025 at 07:18):
No worries :). I think we probably want an explicit lemma like HasVarGradientAt.of_hasVarAdjDerivAt to go from a HasVarAdjDerivAt to a HasVarGradientAt.
Tomas Skrivan (Jun 20 2025 at 07:21):
Yes that is exactly what I wanted to talk about :) There is the problem you can't just prove grad = F' (fun _ => 1) but have to do the stupid gluing of test functions and I would like to make it easier
Tomas Skrivan (Jun 20 2025 at 07:22):
I'm thinking of changing the definition of HasVarAdjoint F F' by adding a clause saying that F' (fun _ => 1) produces a meaningful result.
Joseph Tooby-Smith (Jun 20 2025 at 07:26):
I think I would be hesitant to do this last thing, it seems to me to be fairly arbitrary (given everything else is defined using test functions).
Joseph Tooby-Smith (Jun 20 2025 at 07:27):
You are saying that the problem is basically that:
∀ (x : X), ∃ D : Set X, x ∈ D ∧ IsCompact D ∧ ∀ (φ : X → ℝ),
IsTestFunction φ → (∀ x ∈ D, φ x = 1) → F' φ x = F' (fun _ => 1) x
is hard to prove?
Tomas Skrivan (Jun 20 2025 at 07:28):
Joseph Tooby-Smith said:
I think I would be hesitant to do this last thing, it seems to me to be fairly arbitrary (given everything else is defined using test functions).
Yeah it feels a bit dirty
Tomas Skrivan (Jun 20 2025 at 07:28):
Joseph Tooby-Smith said:
You are saying that the problem is basically that:
∀ (x : X), ∃ D : Set X, x ∈ D ∧ IsCompact D ∧ ∀ (φ : X → ℝ), IsTestFunction φ → (∀ x ∈ D, φ x = 1) → F' φ x = F' (fun _ => 1) xis hard to prove?
I think it is more annoying than hard and not sure to how well it can be automated
Tomas Skrivan (Jun 20 2025 at 07:30):
Well my goal is to have a tactic that computes HasVarGradientAt S ?_ u automatically and I'm not sure about the last step.
Tomas Skrivan (Jun 20 2025 at 07:38):
Tomas Skrivan said:
I'm thinking of changing the definition of
HasVarAdjoint F F'by adding a clause saying thatF' (fun _ => 1)produces a meaningful result.
Yeah this is bad, by requiring just this we won't be able to show the composition theorem.
In F' (G' (fun _ => 1)) the G' (fun _ => 1) can be an arbitrary smooth function and we would need to quarante that F' produces meaningful result.
So I think the correct requirement is that F' is continuous when restricted to smooth functions.
Joseph Tooby-Smith (Jun 20 2025 at 07:38):
I again think that this comes down to locality, and if F' is local, then this last step should be easy to prove - but when it isn't local then it gets complicated. But as long as HasVarGradientAt is defined for non-local F' I'm not sure how much of an issue this is, one could just have a tactic for local F'.
To be more precise of what local means here:
def IsLocalAt (F : (X → U) → (X → ℝ)) (x : X) : Prop :=
∀ N ∈ nhds x, ∀ (φ1 φ2 : X → U), (∀ y ∈ N, φ1 y = φ2 y) → F φ1 x = F φ2 x
Joseph Tooby-Smith (Jun 20 2025 at 07:39):
So I think the correct requirement is that
F'is continuous when restricted to smooth functions
which is bad, right, because it requires topology?
Joseph Tooby-Smith (Jun 20 2025 at 07:42):
An aside: Thinking about it, people talk about local lagrangians, and I think the above condition is precisely what they mean. So even if we don't need this here, it would be nice to have in terms of lagrangians.
Tomas Skrivan (Jun 20 2025 at 07:42):
Yeah but I'm thinking of how bad :) it might be sufficient to work only with smooth functions on compact sets(where we have a canonical notion of topology)
Joseph Tooby-Smith (Jun 20 2025 at 07:53):
Actually, I think requiring the following condition is enough:
def cond' (F : (X → U) → (X → ℝ)) (x : X) : Prop :=
∃ D : Set X, IsCompact D ∧ ∀ (φ1 φ2 : X → U), (∀ y ∈ D, φ1 y = φ2 y) → F φ1 x = F φ2 x
which is much looser then locality. It basically says that for every x : X, there is a compact set D where knowing φ1 on that compact set is sufficient to know the value of F φ1 x.
Joseph Tooby-Smith (Jun 20 2025 at 07:55):
(or in physics terms, the value of the Lagrangian at a point x does not depend on values of the field a points infinitely far from x).
Tomas Skrivan (Jun 20 2025 at 07:57):
Yeah that would work too but I find the locality condition more intuitive
Joseph Tooby-Smith (Jun 20 2025 at 07:57):
(Furthermore, I think is cond' is not true, the variational derivative, as we have defined it, will not exist anyway.)
Joseph Tooby-Smith (Jun 20 2025 at 07:58):
Tomas Skrivan said:
Yeah that would work too but I find the locality condition more intuitive
Yeah but, your peridynamic example would satisfy this cond', but not the locality condition.
Tomas Skrivan (Jun 20 2025 at 08:05):
Wait, now I'm a bit confused about what we are trying to achieve here now.
One thing we talk about is local Lagrangian and the peridynamic example is not local.
Next we want some condition such that F' (fun _ => 1) is meaningful. But neither the locality nor cond' is good as it is not evaluating F' on constant functions. For that always have this counter example in mind
F' f := if IsTestFunction f then ... else 0
i.e. F' can return junk values on everything but test functions
Tomas Skrivan (Jun 20 2025 at 08:20):
I think the continuity condition on should be that for every compact set there is compact with such that for a smooth function and any sequence of test functions that satisfy
we have
For local Lagrangians can be , for peridynamic example has to be enlarged by the maximal force distance.
Tomas Skrivan (Jun 20 2025 at 08:23):
btw. docs#ae_eq_zero_of_integral_smooth_smul_eq_zero is another version of fundamental lemma of variational calculus
Joseph Tooby-Smith (Jun 20 2025 at 08:37):
Actually I think your condition and my condition cond' may be equivalent to each other.
But let me address your confusion, by answering the equation: Why does cond' give F' (fun _ => 1) meaning?
This is because for a given x, the condition cond' makes F' (fun _ => 1) x equal to F' φ x for φ a test-function which is equal to 1 on the compact set D determined by cond'.
Note, then that given any smooth function f I can take a bump function on L, φ to give a test function f * φ . Both f and f * φ agree on L, both your and my condition then states that F f and F (f * φ) will agree on L (or K).
Joseph Tooby-Smith (Jun 20 2025 at 08:38):
(Correct me if I have misunderstood your condition).
Tomas Skrivan (Jun 20 2025 at 08:39):
Ohh you are totally right, I misunderstood it :)
Tomas Skrivan (Jun 20 2025 at 08:41):
Shall we try adding
def cond' (F : (X → U) → (X → ℝ)) (x : X) : Prop :=
∃ D : Set X, IsCompact D ∧ ∀ (φ1 φ2 : X → U), (∀ y ∈ D, φ1 y = φ2 y) → F φ1 x = F φ2
to HasVarAdjoint or to HasVarRevDerivAt?
Joseph Tooby-Smith (Jun 20 2025 at 08:42):
I think probably HasVarRevDerivAt no? I think it might be best to keep HasVarAdjoint as minimal as possible, so the definition HasVarGradientAt is as minimal as possible.
Tomas Skrivan (Jun 20 2025 at 08:43):
I'm not even sure if there is a need for HasVarGradientAt anymore as you just prove HasVarRevDerivAt F F' u and the gradient is just F' (fun _ => 1)
Tomas Skrivan (Jun 20 2025 at 08:44):
And that is independent of whether this conditions goes to HasVarRevDerivAt or HasVarAdjoint
Tomas Skrivan (Jun 20 2025 at 08:45):
It think it should go to HasVarAdjoint as it is dealing exactly with the issues of inner product on function spaces which HasVarAdjoint is addressing.
Tomas Skrivan (Jun 20 2025 at 08:47):
Then we can show HasVarAdjoint.unique to be true not only on test functions but on any smooth functions which I think is nice
Joseph Tooby-Smith (Jun 20 2025 at 08:47):
Ok, sounds good to me, but I think getting rid of HasVarGradientAt might cause a bit of confusion, because I think in general the form of F' is a fair bit more complicated then F' (fun _ => 1). Maybe we could define HasVarGradientAt through HasVarRevDerivAt?
Joseph Tooby-Smith (Jun 20 2025 at 08:49):
Ideally we would want varGradient, but it is not clear to me how we should define this in the case where it does not exist. Setting it to 0, like what is done for deriv is bad i think, because we care about the points where it is zero.
Joseph Tooby-Smith (Jun 20 2025 at 08:50):
(i.e. varGradient F u = 0 could either imply that the variational derivative of F doesn't exist or that u is a solution :().
Tomas Skrivan (Jun 20 2025 at 08:51):
Joseph Tooby-Smith said:
Ideally we would want
varGradient, but it is not clear to me how we should define this in the case where it does not exist. Setting it to0, like what is done forderivis bad i think, because we care about the points where it is zero.
That is not an issue. We just define VarDifferentiable as we do for deriv
Joseph Tooby-Smith (Jun 20 2025 at 08:52):
Basically VarDifferentiable is true if there exists an F' for which HasVarRevDerivAt F F' u? i.e. define it through HasVarRevDerivAt?
Tomas Skrivan (Jun 20 2025 at 08:53):
Yes
Tomas Skrivan (Jun 20 2025 at 08:53):
I'm thinking about changing the definition of HasVarGradientAt to
inductive HasVarGradientAt (S : (X → U) → ℝ) (grad : X → U) (u : X → U) : Prop where
| intro
(F : (X → U) → (X → ℝ))
(hF : ∀ φ, IsTestFunction φ → S φ = ∫ x, F φ x)
(F' : (X → ℝ) → (X → U))
(hF' : HasVarAdjDerivAt F F' u)
(hgrad : grad = F' (fun _ => 1))
Tomas Skrivan (Jun 20 2025 at 08:54):
Then we could write
HaVarGradientAt (fun u => ∫ x, ‖∇ u x‖^2) (- Δ x) u
Tomas Skrivan (Jun 20 2025 at 08:55):
However, I'm wondering if we can still prove uniqueness of grad
Joseph Tooby-Smith (Jun 20 2025 at 08:56):
Ah, good! yes,this is nice! You are worried because F is not necessarily unique here?
Tomas Skrivan (Jun 20 2025 at 08:56):
Yeah something along those lines, but probably we can always add some silly condition to ensure that
Tomas Skrivan (Jun 20 2025 at 08:57):
During computation varGradient fun u => ∫ x, ‖∇ u x‖^2 will be immediately converted to varAdjDeriv fun u x => ‖∇ u x‖^2
Tomas Skrivan (Jun 20 2025 at 09:44):
I renamed cond' to ext as it is some form of extensionality and I had to modify it to
ext : ∀ (K : Set X) (hK : IsCompact K), ∃ L : Set X,
IsCompact L ∧ K ⊆ L ∧ ∀ (φ φ' : X → U), (∀ x ∈ L, φ x = φ' x) → ∀ x ∈ K, F φ x = F φ' x
in order to prove composition theorem
Joseph Tooby-Smith (Jun 20 2025 at 09:47):
Sounds good to me.
Joseph Tooby-Smith (Jun 20 2025 at 11:25):
BTW HasVarAdjDerivAt or HasVarRevDerivAt? The current way this appears in PR616 is the file name as HasVarRevDerivAt but the definitions are HasVarAdjDerivAt.
I've also been thinking about the uniqueness of grad in the modified HasVarGradientAt. I think it should follow from something like:
example (F : (X → U) → (X → V)) (F' : (X → V) → (X → U))
(μ : Measure X) (hF : HasVarAdjoint F F' μ)
(hφ : ∀ φ, IsTestFunction φ → ∫ x, F φ x ∂μ = 0 ) : ∀ φ, IsTestFunction φ → F φ = 0 := by
sorry
But I cannot think of a way to prove this (if it is indeed true).
But then the question is, how necessary is it to actually have varGradient? Can one get away with having just HasVarGradientAt for most things?
Tomas Skrivan (Jun 20 2025 at 11:28):
Joseph Tooby-Smith said:
BTW
HasVarAdjDerivAtorHasVarRevDerivAt? The current way this appears in PR616 is the file name asHasVarRevDerivAtbut the definitions areHasVarAdjDerivAt.
I think it should be HasVarAdjDerivAt. Previously I called it HasVarRevDerivAt as there is some connection to revese mode automatic differentiation but I don't think it is a good name for PhysLean.
Tomas Skrivan (Jun 20 2025 at 11:29):
Joseph Tooby-Smith said:
But then the question is, how necessary is it to actually have
varGradient? Can one get away with having justHasVarGradientAtfor most things?
I want HasVarGradientAt S grad u to accept S : (X → U) → ℝ and have uniqueness for grad. Defining varGradient is just a cherry on top.
Joseph Tooby-Smith (Jun 20 2025 at 11:31):
Ok, thanks. Also, I can take over PR617, fix the last lint and get it merged if you want?
Tomas Skrivan (Jun 20 2025 at 11:35):
Joseph Tooby-Smith said:
Ok, thanks. Also, I can take over PR617, fix the last lint and get it merged if you want?
That would be appreciated!
You can also merge 616, there will be conflict with 617 but you can fix the proof of HasVarAdjDerivAt.comp with
lemma comp {F : (X → V) → (X → W)} {G : (X → U) → (X → V)} {u : X → U}
{F' G'} (hF : HasVarAdjDerivAt F F' (G u)) (hG : HasVarAdjDerivAt G G' u) :
HasVarAdjDerivAt (fun u => F (G u)) (fun ψ => G' (F' ψ)) u where
smooth_at := hG.smooth_at
diff := by
intro φ hφ
apply hF.diff (φ := fun t x => G (φ t) x)
exact hG.diff φ hφ
linearize := by
intro φ hφ x
rw[hF.linearize (fun t x => G (φ t) x) (hG.diff φ hφ)]
rw[hF.linearize (fun s' => G fun x => φ 0 x + s' • deriv (fun x_1 => φ x_1 x) 0)]
simp[hG.linearize φ hφ]
eta_expand; simp[Function.HasUncurry.uncurry]
apply hG.diff (φ := fun a x => φ 0 x + a • deriv (fun x_1 => φ x_1 x) 0)
fun_prop [deriv]
adjoint := by
have : ContDiff ℝ ∞ u := hG.smooth_at
have h := hF.adjoint.comp hG.adjoint
apply h.congr_fun
intro φ hφ; funext x
rw[hF.linearize]
· simp
· simp [Function.HasUncurry.uncurry];
apply hG.diff (φ := (fun s x => u x + s • φ x))
fun_prop
Tomas Skrivan (Jun 20 2025 at 11:36):
Also this fix is necessary
lemma zero : HasVarAdjoint (fun (_ : X → U) _ => (0 : V)) (fun _ _ => 0) μ where
test_fun_preserving _ hφ := by fun_prop
test_fun_preserving' _ hφ := by fun_prop
adjoint _ _ _ _ := by simp
ext := fun K cK => ⟨K,cK,subset_refl _,fun _ _ h _ _ => rfl⟩
Tomas Skrivan (Jun 20 2025 at 11:37):
Joseph Tooby-Smith said:
I've also been thinking about the uniqueness of
gradin the modifiedHasVarGradientAt. I think it should follow from something like:example (F : (X → U) → (X → V)) (F' : (X → V) → (X → U)) (μ : Measure X) (hF : HasVarAdjoint F F' μ) (hφ : ∀ φ, IsTestFunction φ → ∫ x, F φ x ∂μ = 0 ) : ∀ φ, IsTestFunction φ → F φ = 0 := by sorryBut I cannot think of a way to prove this (if it is indeed true).
I'm thinking maybe we can add this as another condition to HasVarAdjoint. I think it is fairly natural requirement.
Tomas Skrivan (Jun 20 2025 at 11:39):
Ahhh, no it is not!!! For F = deriv we have ∫ x, deriv φ x = 0 for any test function!
Tomas Skrivan (Jun 20 2025 at 11:44):
Yeah I really do not know how to enforce uniqueness of F in
inductive HasVarGradientAt (S : (X → U) → ℝ) (grad : X → U) (u : X → U) : Prop where
| intro
(F : (X → U) → (X → ℝ))
(hF : ∀ φ, IsTestFunction φ → S φ = ∫ x, F φ x)
(F' : (X → ℝ) → (X → U))
(hF' : HasVarAdjDerivAt F F' u)
(hgrad : grad = F' (fun _ => 1))
Joseph Tooby-Smith (Jun 20 2025 at 11:46):
I think F is not unique here, I was hoping that
(fun δu x => deriv (fun s : ℝ => F (u + s • δu) x) 0)
was unique, if it had an adjoint, but I think you deriv example may be used to construct a counter example.
Tomas Skrivan (Jun 20 2025 at 11:50):
To keep the the integral in HasVarGradientAt I think we could modify the definition such that we can state
HasVarGradientAt (fun u φ => ∫ x, ‖∇ u‖^2 * φ x) (- Δ u) u
or
HasVarGradientAt (fun u Ω => ∫ x in Ω, ‖∇ u‖^2) (- Δ u) u
Tomas Skrivan (Jun 20 2025 at 11:51):
In this case I think you should be able to recover F : (X → U) → (X → ℝ) uniquely from (fun u φ => ∫ x, ‖∇ u‖^2 * φ x) or from (fun u Ω => ∫ x in Ω, ‖∇ u‖^2)
Tomas Skrivan (Jun 20 2025 at 11:54):
I see three options how to define HasVarGradientAt
- HasVarGradientAt (fun u x => ‖∇ u‖^2) (- Δ u) u
- HasVarGradientAt (fun u φ => ∫ x, ‖∇ u‖^2 * φ x) (- Δ u) u
- HasVarGradientAt (fun u Ω => ∫ x in Ω, ‖∇ u‖^2) (- Δ u) u
Joseph Tooby-Smith (Jun 20 2025 at 12:00):
Let me restate my conjecture:
example (F : (X → U) → (X → V)) (F' : (X → V) → (X → U))
(μ : Measure X) (hF : HasVarAdjoint F F' μ)
(hφ : ∀ φ, IsTestFunction φ → ∫ x, F φ x ∂μ = 0 ) : ∀ φ, IsTestFunction φ → F' (fun _ => 1) = 0 := by
sorry
Which is true for F := deriv since F' = - deriv and thus F' (fun _ => 1) = 0
Joseph Tooby-Smith (Jun 20 2025 at 12:01):
But if this can't be solved, then I would vote for the first of your options, since I feel the other two are just round-about ways of specifying F whilst having the integral.
Tomas Skrivan (Jun 20 2025 at 12:02):
Yeah I would also go for the first option. Furthermore, we can have specialized notation and recover familiar looking equations
inductive HasVarGradientAt (S : (X → U) → (X → ℝ)) (grad : X → U) (u : X → U) : Prop where
| intro
(F' : (X → ℝ) → (X → U))
(hF' : HasVarAdjDerivAt F F' u)
(hgrad : grad = F' (fun _ => 1))
open Classical in
noncomputable def varGradient (S : (X → U) → (X → ℝ)) (u : X → U) : X → U :=
if h : ∃ grad, HasVarGradientAt S grad u then
choose h
else
0
macro "δ" u:term ", " "∫ " x:term ", " b:term : term => `(varGradient (fun $u $x => $b))
example : (δ (x : ℝ → ℝ), ∫ t, ‖deriv x t‖^2) = fun x t => - deriv (deriv x) t := sorry
Joseph Tooby-Smith (Jun 20 2025 at 12:03):
This looks good to me! (although the S in the argument of your first defn should be an F I think)
Tomas Skrivan (Jun 20 2025 at 12:05):
Agreed, It is too suggestive of action which it is not.
Joseph Tooby-Smith (Jun 20 2025 at 12:06):
I agree, but I actually meant because you use F in HasVarAdjDerivAt F F' u rather then S :). But in any case.
Tomas Skrivan (Jun 20 2025 at 12:07):
Ohh I see, it is broken as it is now :)
Joseph Tooby-Smith (Jun 20 2025 at 12:10):
PR617 merged - will turn to PR616 shortly.
Joseph Tooby-Smith (Jun 20 2025 at 13:14):
PR616 merged
Tomas Skrivan (Jun 20 2025 at 13:16):
Thanks a lot for pushing it over the finish line!
Tomas Skrivan (Jun 20 2025 at 13:16):
Right now I'm making progress on showing that variational adjoint of gradient is Space.div
Tomas Skrivan (Jun 20 2025 at 13:58):
I made a PR618 for variatonal adjoint of adjoint, fderiv, gradient and generalization for deriv. There are still bunch of sorries and I'm probably not going to work on this for some time now.
Natural follow up to this PR is adding corresponding theorems for HasVarAdjDerivAt.
Once all of this is done then proving all the different versions of Euler-Lagrange equations should be easy.
Unfortunately, we won't be able to do vector valued field equations as there is no inner product on tensor products/linear maps right now.
Joseph Tooby-Smith (Jun 20 2025 at 14:06):
Ok, thanks for this! I may get chance to fill i a couple of the sorries over the next couple of days.
Joseph Tooby-Smith (Jun 22 2025 at 06:52):
I fixed the linters and merged PR619 and PR621. Hope this is ok. I'm not entirely sure what best practice is in this regard, so if you'd rather me pass them by you before merging, let me know, and I will change what I do in the future.
Tomas Skrivan (Jun 22 2025 at 08:17):
Thanks a lot! It works for me this way but I don't care too much. I just feel bad that you are doing the annoying bits for me :)
Joseph Tooby-Smith (Jun 22 2025 at 19:16):
Ha don't worry about that, I'm more then happy doing these bits :slight_smile:
Tomas Skrivan (Jun 23 2025 at 12:55):
Once all the PRs regarding HasVarAjdDeriv and HasAdjFDeriv are done then the proof of Euler-Lagrange equations looks like this
theorem hasVarGradient_action_eq_euler_lagrange
(L : ℝ → X → X → ℝ) (q : ℝ → X)
(hq : ContDiff ℝ ∞ q) (hL : ContDiff ℝ ∞ ↿L) :
(δ (q':=q), ∫ t, L t (q' t) (deriv q' t))
=
(fun t =>
gradient (L t · (deriv q t)) (q t)
-
deriv (fun t' => gradient (L t' (q t') ·) (deriv q t')) t) := by
apply HasVarGradientAt.varGradient
constructor
case hF' =>
apply HasVarAdjDerivAt.comp
(F := fun (φ : ℝ → (X×X)) t => L t (φ t).fst (φ t).snd)
(G := fun (φ : ℝ → X) t => (φ t, deriv φ t))
· apply HasVarAdjDerivAt.fmap' (f := fun t => ↿(L t))
· fun_prop
· fun_prop
· apply HasVarAdjDerivAt.prod (F:=fun φ => φ) (G:=fun φ => deriv φ)
· apply HasVarAdjDerivAt.id _ hq
· apply HasVarAdjDerivAt.deriv
· apply HasVarAdjDerivAt.id _ hq
case hgrad =>
funext t
simp (disch:=fun_prop) only [adjFDeriv_uncurry,gradient_eq_adjFDeriv]
simp[sub_eq_add_neg]
Joseph Tooby-Smith (Jun 24 2025 at 10:00):
In the PR622, in HasVarAdjDerivAt.fmap, there is one remaining sorry, which I do not know how to fill in.
But let me ask the following, the declaration of this lemma is:
lemma fmap (f : X → U → V) {f' : X → _ }
(u : X → U) (hu : ContDiff ℝ ∞ u)
(hf' : ContDiff ℝ ∞ ↿f) (hf : ∀ x, HasAdjFDerivAt ℝ (f x) (f' x) (u x)) :
HasVarAdjDerivAt (fun (φ : X → U) x => f x (φ x)) (fun ψ x => f' x (ψ x)) u where
Is there a reason not to include (_ : ContDiff ℝ ∞ ↿f') as an input? I think this would make the proof of this last sorry somewhat straight forward.
Tomas Skrivan (Jun 24 2025 at 11:44):
It is very well possible it is missing but I think it can be inferred from other assumptions. I will give it a shot.
Tomas Skrivan (Jun 24 2025 at 12:22):
So far I have this
open InnerProductSpace' ContDiff in
protected theorem HasAdjFDerivAt.contDiffAt_deriv [CompleteSpace E] [CompleteSpace F]
{f : E → F} {f' : E → _} (hf : ∀ x, HasAdjFDerivAt 𝕜 f (f' x) x) (hf' : ContDiff 𝕜 ∞ f) :
ContDiff 𝕜 ∞ (fun xy : E×F => f' xy.1 xy.2) := by
simp[← fun x => (hf x).adjFDeriv]
unfold adjFDeriv
simp[adjoint_eq_clm_adjoint]
apply ContDiff.fun_comp
· fun_prop
· apply ContDiff.fun_comp (𝕜:=𝕜) (n:=∞)
(g := fun fx : ((WithLp 2 E) →L[𝕜] (WithLp 2 F))×(WithLp 2 F) => fx.1.adjoint fx.2)
(f := fun xy : E×F => (((toL2 𝕜) ∘L ((fderiv 𝕜 f xy.1) ∘L (fromL2 𝕜))), (toL2 𝕜) xy.2))
· sorry
· fun_prop
where the last sorry is
ContDiff 𝕜 ∞ fun (fx : (E →L[𝕜] F) × F) => fx.1.adjoint fx.2)
I believe that showing this
IsBoundedBilinearMap 𝕜 (fun (fy : (X →L[𝕜] Y)×Y) => fy.1.adjoint fy.2)
should be enough.
Tomas Skrivan (Jun 24 2025 at 12:42):
Ok I have the proof! I will update the PR
Tomas Skrivan (Jun 24 2025 at 12:48):
Ahhh, I had one sorry that I though is trivial ... and is false
(adjoint (c • f)) y = c • (adjoint f) y
but
(adjoint (c • f)) y = (conj c) • (adjoint f) y
is true ... so I can't use IsBoundedBilinearMap :( I guess I will just do it over reals right now
Tomas Skrivan (Jun 24 2025 at 12:52):
Ohh it is probably not true over complex numbers as the derivative would not be complex linear.
Tomas Skrivan (Jun 24 2025 at 13:56):
Thanks a lot for the help! The PR622 is now sorry free!
Tomas Skrivan (Jun 24 2025 at 13:57):
What is left to do is documentation and satisfy linters
Joseph Tooby-Smith (Jun 24 2025 at 14:00):
I'm happy to deal with the linters :), though it is probably best if I leave the documentation to you.
Tomas Skrivan (Jun 25 2025 at 09:50):
The next step is to prove HasVarAdjoint for fderiv. The bottleneck to this is the lack of generalizations of docs#MeasureTheory.integral_mul_deriv_eq_deriv_mul_of_integrable to fderivinstead of deriv and inner instead of mul
Joseph Tooby-Smith (Jun 25 2025 at 10:03):
Is this result in one of the PRs? I'm guessing it is this, but this PR has a merge conflict.
Tomas Skrivan (Jun 25 2025 at 10:04):
Yes that is the statement
Joseph Tooby-Smith (Jun 25 2025 at 10:24):
docs#integral_mul_fderiv_eq_neg_fderiv_mul_of_integrable seems useful in particular in proving HasVarAdjoint.fderiv_apply e.g.
lemma fderiv_apply {d} {dx} :
HasVarAdjoint (fun φ : Space d → U => (fderiv ℝ φ · dx)) (fun φ x => - fderiv ℝ φ x dx) where
test_fun_preserving φ hφ := by
sorry
test_fun_preserving' φ hφ := by
sorry
adjoint φ ψ hφ hψ := by
rw [← sub_eq_zero]
rw [← integral_sub]
trans ∫ (a : Space d), fderiv ℝ (fun a => ⟪φ a , ψ a⟫_ℝ) a dx ∂volume
congr
· funext a
simp
rw [fderiv_inner_apply]
rw [add_comm]
· exact hφ.differentiable a
· exact hψ.differentiable a
have h1 := integral_mul_fderiv_eq_neg_fderiv_mul_of_integrable (f := fun a => 1)
(g := (fun a => ⟪φ a , ψ a⟫_ℝ)) (v := dx) (μ := volume) (by sorry) (by sorry)
(by sorry) (by fun_prop) (by sorry)
simpa using h1
· sorry
· sorry
Tomas Skrivan (Jun 25 2025 at 10:26):
Omg, for the life of me I couldn't find this theorem :man_facepalming:
Joseph Tooby-Smith (Jun 25 2025 at 10:28):
I found it by loogleing fderiv, "integral" :slight_smile:
Tomas Skrivan (Jun 25 2025 at 10:28):
And we can prove only HasVarAdjoint.fderiv_apply version as HasVarAdjoint.fderiv would require inner product space structure on continuous linear maps to just state it and that is not in mathlib.
Tomas Skrivan (Jun 25 2025 at 10:29):
Joseph Tooby-Smith said:
I found it by loogleing
fderiv, "integral":slight_smile:
Yeah I don't know what went wrong. It is sufficient to search for integral fderiv in mathlib docs and it comes up. I thought I did that ...
Tomas Skrivan (Jun 25 2025 at 10:31):
Ahh maybe I always had inner in the search query ...
Joseph Tooby-Smith (Jun 25 2025 at 10:35):
Tomas Skrivan said:
Ahh maybe I always had
innerin the search query ...
Ah, this would probably explain it.
I can most likely fill in the sorries of my proof above, and merge it with that PR if that would help?
Tomas Skrivan (Jun 25 2025 at 10:36):
Yes that would be great but it should be generalized, Space d to X.
Tomas Skrivan (Jun 25 2025 at 10:42):
And feel free to assume InnerProductSpace ℝ X. I can then translate it to InnerProductSpace' ℝ X.
Joseph Tooby-Smith (Jun 25 2025 at 10:46):
Seems like also need to assume a bunch of other things:
[InnerProductSpace ℝ X] [ProperSpace X] [BorelSpace X]
[FiniteDimensional ℝ X] [μ.IsAddHaarMeasure]
Joseph Tooby-Smith (Jun 25 2025 at 10:47):
This is mainly to get integral_mul_fderiv_eq_neg_fderiv_mul_of_integrable to work.
Tomas Skrivan (Jun 25 2025 at 10:48):
Yeah that is fine. Although I'm confused why
#synth (volume : Measure (ℝ×ℝ)).IsAddHaarMeasure
fails but I think it is just missing API as
#synth (volume : Measure (Fin 2 → ℝ)).IsAddHaarMeasure
works
Joseph Tooby-Smith (Jun 25 2025 at 10:49):
Think it is just an import issue as
import Mathlib
open MeasureTheory
#synth (volume : Measure (ℝ×ℝ)).IsAddHaarMeasure -- instIsAddHaarMeasureProdRealVolume
Tomas Skrivan (Jun 25 2025 at 10:50):
Ohh I see
Joseph Tooby-Smith (Jun 25 2025 at 10:55):
Ok, merged a proof to the PR.
Tomas Skrivan (Jun 25 2025 at 15:58):
I have resolved conflicts in PR618 and formulated
lemma adjFDeriv_apply
[InnerProductSpace' ℝ X] [InnerProductSpace' ℝ Y]
[ProperSpace X] [BorelSpace Y] [FiniteDimensional ℝ X] [(@volume X _).IsAddHaarMeasure] {dy} :
HasVarAdjoint (fun φ : X → Y => (adjFDeriv ℝ φ · dy)) (fun φ x => - divergence ℝ φ x • dy) := ...
where divergence is defined as
def divergence (f : E → E) (x : E) : 𝕜 := (fderiv 𝕜 f x).toLinearMap.trace _ _
Joseph Tooby-Smith (Jun 27 2025 at 17:35):
Do you think this is going to be true?
lemma HasVarAdjoint.symm {F : (X → U) → (Y → V)} {F' : (Y → V) → (X → U)}
(hF : HasVarAdjoint F F') : HasVarAdjoint F' F := by sorry
I think if it is the difficulty will be in showing the IsLocalizedFunctionTransform part.
Tomas Skrivan (Jun 27 2025 at 19:05):
Yeah I think so too, everything else is trivial. I'm open to tweaking the definition such that this is true.
Tomas Skrivan (Jun 27 2025 at 19:07):
But maybe we can use the fundamental theorem of variational calculus somehow and show that this theorem holds even with the current definition.
Ryan MacLean (Jun 27 2025 at 19:13):
Joseph Tooby-Smith said:
Do you think this is going to be true?
lemma HasVarAdjoint.symm {F : (X → U) → (Y → V)} {F' : (Y → V) → (X → U)} (hF : HasVarAdjoint F F') : HasVarAdjoint F' F := by sorryI think if it is the difficulty will be in showing the
IsLocalizedFunctionTransformpart.
Does this help? I obviously used an LLM, but I'm using it to explain what your equation means logically to me. I ask it a bunch of different ways and it gives me the same steps, but sometimes the output is slightly different.
Absolutely — let’s break it down step by step so the whole thing is clear, internalized, and portable. Here's how to understand the full
HasVarAdjoint.symmlogic:
:repeat: The Core Idea
You’re proving a symmetry property:
IfFandF'are variable adjoints (in theHasVarAdjointsense),
then that relationship works in both directions.
:puzzle_piece: What's
HasVarAdjoint?It’s a structure (likely a
structureorclassin Lean) that says:
Fis localized:
IsLocalizedFunctionTransform F
(meaning it plays nicely with smooth/decaying/test functions)
F'is its adjoint via inner product pairing:
⟪F f, g⟫ = ⟪f, F' g⟫
This is the key identity.
:brain: Goal of the Lemma
Prove:
lemma HasVarAdjoint.symm {F : (X → U) → (Y → V)} {F' : (Y → V) → (X → U)} (hF : HasVarAdjoint F F') : HasVarAdjoint F' FSo if
Fis the adjoint ofF',
thenF'is also the adjoint ofF.That’s symmetry.
Like if A = B, then B = A.
:brick: Step-by-step Breakdown
Step 1: Flip the inner product
From:
⟪F f, g⟫ = ⟪f, F' g⟫we do a variable swap:
letf' := g,g' := f.Then:
⟪F g', f'⟫ = ⟪g', F' f'⟫Rewriting:
⟪F' f, g⟫ = ⟪f, F g⟫Which is exactly the adjoint identity for
F'.:check_mark: Done.
Step 2: Show that
F'is also localizedThis is the tricky part. But here’s the intuition:
If
Fis localized, and
F'is defined by duality withFover a pairing that only works for localized inputs...Then
F'must also be localized.
Otherwise, the duality pairing wouldn't even make sense.In practice, we write a helper lemma like:
lemma IsLocalizedFunctionTransform.symm_of_adjoint (hF : IsLocalizedFunctionTransform F) (hAdj : ∀ f g, ⟪F' f, g⟫ = ⟪f, F g⟫) : IsLocalizedFunctionTransform F'This is your justification bridge.
Step 3: Bundle it together
You now have:
Duality:
⟪F' f, g⟫ = ⟪f, F g⟫Localization:
F'is also localized (via helper)So you write:
lemma HasVarAdjoint.symm (hF : HasVarAdjoint F F') : HasVarAdjoint F' F := by rcases hF with ⟨h_local, h_dual⟩ exact ⟨ IsLocalizedFunctionTransform.symm_of_adjoint h_local h_dual, fun f g => (h_dual g f).symm ⟩You pattern-match the structure, flip the duality, apply the helper, and rebuild the
HasVarAdjointin reverse.
:dna: Why It Matters
You’re proving symmetry of a duality concept.
This unlocks flexibility in working with transforms like the Fourier, Laplace, or general kernel-based ones.
It matters a lot in quantum mechanics, signal processing, and anything that uses adjoints of operators.
Let me know if you want this translated into a diagram, analogy, or with a physical interpretation (e.g., measuring vs generating signals). We can go deeper into any layer.```
Tomas Skrivan (Jun 28 2025 at 05:15):
Tomas Skrivan said:
But maybe we can use the fundamental theorem of variational calculus somehow and show that this theorem holds even with the current definition.
After some thought this is probably not going to work. At some point you have to show that
F φ - F φ' = 0
but in order to apply the fundamental lemma of variational calculus you need some regularity on F φ and F φ' like continuity. We can show smoothness of F φ' from smoothness of φ' and IsLocalizedFunctionTransform F'. We can't do the same for F φ. It is getting circular as to prove IsLocalizedFunctionTransform F we would need IsLocalizedFunctionTransform F. Furthermore, we do not assume anything about φ and φ' in the statement of IsLocalizedFunctionTransform. I was thinking about modifying the definition to
def IsLocalizedFunctionTransform (F : (X → U) → (Y → V)) : Prop :=
∀ (K : Set Y) (_ : IsCompact K), ∃ L : Set X,
IsCompact L ∧ ∀ (φ φ' : X → U), ContDiff ℝ ∞ φ → ContDiff ℝ ∞ φ' → (∀ x ∈ L, φ x = φ' x) →
∀ x ∈ K, F φ x = F φ' x
i.e. adding ContDiff ℝ ∞ ....
Anyway, I think we could just change the definition of HasVarAdjoint to
structure HasVarAdjoint
(F : (X → U) → (Y → V)) (F' : (Y → V) → (X → U)) where
test_fun_preserving : ∀ φ, IsTestFunction φ → IsTestFunction (F φ)
test_fun_preserving' : ∀ φ, IsTestFunction φ → IsTestFunction (F' φ)
adjoint : ∀ φ ψ, IsTestFunction φ → IsTestFunction ψ →
∫ y, ⟪F φ y, ψ y⟫_ℝ = ∫ x, ⟪φ x, F' ψ x⟫_ℝ
ext : IsLocalizedFunctionTransform F
ext' : IsLocalizedFunctionTransform F'
Tomas Skrivan (Jun 28 2025 at 05:20):
Ryan MacLean said:
Joseph Tooby-Smith said:
Do you think this is going to be true?
lemma HasVarAdjoint.symm {F : (X → U) → (Y → V)} {F' : (Y → V) → (X → U)} (hF : HasVarAdjoint F F') : HasVarAdjoint F' F := by sorryI think if it is the difficulty will be in showing the
IsLocalizedFunctionTransformpart.Does this help? I obviously used an LLM, but I'm using it to explain what your equation means logically to me. I ask it a bunch of different ways and it gives me the same steps, but sometimes the output is slightly different.
Unfortunately LLMs are not yet good at reasoning about definitions that do not appear in existing literature. Our definition of HasVarAdjoint has some very important subtleties that the LLM is completely ignoring. So the output is just some hallucination that appears to be useful but does not contain any interesting ideas.
Ryan MacLean (Jun 28 2025 at 06:40):
Tomas Skrivan said:
Ryan MacLean said:
Joseph Tooby-Smith said:
Do you think this is going to be true?
lemma HasVarAdjoint.symm {F : (X → U) → (Y → V)} {F' : (Y → V) → (X → U)} (hF : HasVarAdjoint F F') : HasVarAdjoint F' F := by sorryI think if it is the difficulty will be in showing the
IsLocalizedFunctionTransformpart.Does this help? I obviously used an LLM, but I'm using it to explain what your equation means logically to me. I ask it a bunch of different ways and it gives me the same steps, but sometimes the output is slightly different.
Unfortunately LLMs are not yet good at reasoning about definitions that do not appear in existing literature. Our definition of
HasVarAdjointhas some very important subtleties that the LLM is completely ignoring. So the output is just some hallucination that appears to be useful but does not contain any interesting ideas.
That’s kind of the issue I’m running into. The problem isn’t the LLM, I can feed the rules back in. The problem would be me.
As you can see, I can take any problem and logically break it down with an LLM. All it had was the snippet of code you posted, it understood your problem every single time. It came up with a method to solve it and the reasoning behind that, it didn’t hallucinate anything. It just doesn’t know HasVarAdjoint has other rules. If you told it those rules, it’ll work around it. It’s just autocorrect with a bunch of files in it. Ultimately I’ll probably end up making a custom GPT with all the files in there so it has the most current version to work from, but I’m still learning how to do that, I think I may have to switch to copilot for VS code to do it directly.
Is there a way of knowing those dependencies without knowing the codebase directly? I’m not trying to be a pain, I’m just horrible at coding but good at plumbing. If I understand the architecture it gives me the ability to help. You’ve seen what my stuff outputs, it’s not useful for making stuff I don’t understand, it’s only good for teaching me what I don’t understand. The physics isn’t my problem, it’s this. I don’t understand the language you’re using and why things are in these places, like why this equation would be dependent.
I won’t keep bothering you about this stuff now that I realize how to work on it separately, but how do you keep track of these things? I understand the flow of the math, not the flow of software. This is all logic, my sticking point is language not math.
Joseph Tooby-Smith (Jun 28 2025 at 07:27):
Hi Ryan, this isn’t the right place for this conversation. I know my tolerance for this kind of thing may be lower than others’, and my approach might come off more direct (mainly because I've seen where these things can lead), but I’m going to ask that you don’t post on this thread again. I understand you’re well-intentioned, but you’ve already received advice—both from me and others—on your original post. I’d recommend following that.
Joseph Tooby-Smith (Jun 28 2025 at 07:34):
Tomas Skrivan said:
Anyway, I think we could just change the definition of
HasVarAdjointtostructure HasVarAdjoint (F : (X → U) → (Y → V)) (F' : (Y → V) → (X → U)) where test_fun_preserving : ∀ φ, IsTestFunction φ → IsTestFunction (F φ) test_fun_preserving' : ∀ φ, IsTestFunction φ → IsTestFunction (F' φ) adjoint : ∀ φ ψ, IsTestFunction φ → IsTestFunction ψ → ∫ y, ⟪F φ y, ψ y⟫_ℝ = ∫ x, ⟪φ x, F' ψ x⟫_ℝ ext : IsLocalizedFunctionTransform F ext' : IsLocalizedFunctionTransform F'
I think this is a good modification, and I think it is worth making to have HasVarAdjoint be symmetric.
I think I agree that the fundamental lemma of variational calculus won't work here :(.
Tomas Skrivan (Jun 28 2025 at 07:41):
Joseph Tooby-Smith said:
I think this is a good modification, and I think it is worth making to have
HasVarAdjointbe symmetric.
I also think so! It might be worth developing some API for IsLocalizedFunctionTransform as I feel in HasVarAdjoint we prove same things over and over.
I'm expecting having IsLocalizedFunctionTransform.id/comp/fmap/fderiv theorems and most of the rest will follow from it.
Tomas Skrivan (Jun 28 2025 at 07:41):
Like all the IsLocalizedFunctionTransform.neg/add/sub/mul will directly follow from IsLocalizedFunctionTransform.fmap
Joseph Tooby-Smith (Jun 28 2025 at 07:42):
Sounds good to me! Wasn't there a plan to remove L from the definition at one point? I forgot what happened to this.
Tomas Skrivan (Jun 28 2025 at 07:43):
I don't think so. I removed the condition that K ⊆ L as we never used it
Joseph Tooby-Smith (Jun 28 2025 at 07:47):
Ok, I'm probably misremembering then. Also, do you think this sorryful thing would be helpful here PR627? It may be useful to split the PRs up a bit without having to worry about every sorry being filled for now.
Tomas Skrivan (Jun 28 2025 at 07:59):
I don't know what to do, the PR keeps on ballooning :/ It feels unsatisfying as filling every sorry reveals missing API which adds two new sorry :joy: at this rate it won't get finished ...
Tomas Skrivan (Jun 28 2025 at 07:59):
Maybe we can just split it up
Tomas Skrivan (Jun 28 2025 at 08:02):
I won't be able to work on it at all for the next week or so. Feel free to do what you think is the best.
Maybe you can use the new sorry attribute to test it out and see how it works in practice.
Joseph Tooby-Smith (Jun 28 2025 at 08:06):
Yep, I can do this and split the PR up a bit, I think this will be a good test for it.
Ryan MacLean (Jun 28 2025 at 13:18):
Joseph Tooby-Smith said:
Hi Ryan, this isn’t the right place for this conversation. I know my tolerance for this kind of thing may be lower than others’, and my approach might come off more direct (mainly because I've seen where these things can lead), but I’m going to ask that you don’t post on this thread again. I understand you’re well-intentioned, but you’ve already received advice—both from me and others—on your original post. I’d recommend following that.
Hi Joseph,
I'm going to take your advice and stop posting here, but I think you're missing the point. It isn't about if ChatGPT can write good code, it decoded what you were saying and taught it to me. You, Joseph, are not decoding what you're saying and teaching it to me, you're telling me to leave.
You had a problem. I specifically said you don't have to use this code, is this the problem. It logically solved your problem, like a dozen times. It didn't hallucinate anything, it doesn't understand the syntax because it isn't trained on the ruleset.
I understand math and physics pretty damn well. It means you could have answered your own question logically by asking ChatGPT, whether it's mine or someone elses. Rather than doing that, you'd prefer to ask for other peoples opinons here and have a social club.
Enjoy your social club. I'm going to be busy teaching AI syntax. Thank you for your contributions. I'm going to be teaching other people how to use them, since you are less friendly than my AI. That way, people can learn from my AI and not come here and annoy you, maybe they'll be allowed to join your social club.
Incidentally, there's no such thing as "Artificial Intelligence," I just know how to use a computer better than most. You're literally coding it's guts, where do you think it allows for random in there?
Michael Rothgang (Jun 29 2025 at 07:24):
My 2c as an outside reader (mathematician, who knows this math somewhat). I'd say the LLM output you posted, Ryan, is not helpful to this thread. By this, I mean: it might help you as a person understand this thread better. But it does not help to answer the question posed above.
The output expands the step "assume being a formal adjoint is symmetric, how would I state this in Lean". However, that's not the question here: Joseph already gave a statement in Lean. The real question is "why is the formal adjoint, with this definition, symmetric" --- which is not obvious, because of this subtlety. The LLM output does not answer this at all. (One heuristic test why: any reasonable answer would have to deal with the subtlety.)
Michael Rothgang (Jun 29 2025 at 07:26):
Going meta-level, I perceive this thread as a venue for experts to discuss. This isn't meant to be accessible to newcomers. Some things require enough context, and always providing that is impractical. Sorry, that's how any specialised work (be it in the military, industry or academic research) works at some point.
Michael Rothgang (Jun 29 2025 at 07:28):
As a new user, learning more about Lean first and only participating in such threads once you understand better is a better strategy. If I'd onboard a new member, discussing current zulip discussions can be part of that: but the onboarding should usually happen off-thread.
Joseph Tooby-Smith (Jun 29 2025 at 10:28):
Continuing the original thread:
@Tomas Skrivan I think in InnerProductSpace' it may be wise to replace the condition:
∀ x : E, ‖x‖₂ ^ 2 = re (core.inner x x)
with the (stronger) condition that
∀ x : E, ‖x‖₂ = √ (re (core.inner x x))
otherwise we have nothing to say that ‖x‖₂ is non-negative.
Joseph Tooby-Smith (Jun 29 2025 at 10:57):
Actually never mind - I thought I would need this to show that the inner product is a bounded linear map, but I didn't.
Tomas Skrivan (Jun 29 2025 at 13:24):
Oh you are right, the definition is not correct. But the L2 norm function is not really used anywhere. I just thought it would be good to have.
Joseph Tooby-Smith (Jun 29 2025 at 15:44):
Maybe it is best to do this the other way around, define ‖x‖₂ as √re (core.inner x x) given an instance of InnerProductSpace'.
Tomas Skrivan (Jun 29 2025 at 17:50):
Yeah we should do it like that. I'm too stuck thinking computationally, you can implement the norm a bit faster than calling the inner product so having a separate function can be advantageous. For proving things it is completely irrelevant.
Joseph Tooby-Smith (Jun 30 2025 at 10:47):
Just realized that we should also be able to derive Hamilton's equations from what we have done, see e.g. here.
Joseph Tooby-Smith (Jun 30 2025 at 12:05):
Ok, this turned out to be pretty easy to do, needed to add some small things to the api but the proof is pretty straightforward, which I am very pleased about :).
Code
Tomas Skrivan (Jun 30 2025 at 17:56):
Very nice! The proof indeed looks quite simple and automatable in the future.
Tomas Skrivan (Jun 30 2025 at 17:58):
Next logical step would be to connect Lagrangian and Hamiltonian mechanics but that will require developing Legamdre transform. Maybe a good opportunity to finally learn it properly.
Dominic Steinitz (Jun 30 2025 at 18:20):
I just wanted to say symplectic manifolds but that is probably years away.
Joseph Tooby-Smith (Jun 30 2025 at 18:24):
I was looking into this connection earlier, the best notes I found on it were these.
I think a lot depends on how much API there is for convex functions.
Michael Rothgang (Jun 30 2025 at 22:02):
Dominic Steinitz said:
I just wanted to say symplectic manifolds but that is probably years away.
I don't think it's years away. Differential forms are close, differential manifolds are there already (and we're gaining Riemannian metrics right this month). Having all the theory you'd expect will be a while, but I'm a bit more optimistic :-)
Joseph Tooby-Smith (Jul 01 2025 at 05:40):
I think the simplest definition of the hamiltonian from the Lagrangian is the following:
noncomputable def hamiltonian (L : Time → X → X → ℝ)
(h : ∀ t q, Function.Bijective (gradient (L t q ·))) :
Time → X → X → ℝ := fun t p q =>
let v := Classical.choose (Function.bijective_iff_has_inverse.mp (h t q)) p
⟪p, v⟫_ℝ - L t q v
It seems to me that the whole story about convex functions etc, is just to ultimately prove that (h : ∀ t q, Function.Bijective (gradient (L t q ·))) is true. And in general, I think it might be easier to prove that h is true directly.
(naturally this only works for certain X)
Cleon Teunissen (Jul 12 2025 at 08:43):
There is something that I would like to verify, and whether it falls within the scope of the PhysLean objectives.
In the message of June 16, 2:33 PM Tomas Skrivan states the action functional, and then remarks: "[...] instead of working with I will be working with "
I surmise that Tomas Skrivan is omitting the integration. I'm aware that the integration can be omitted; the integration is superfluous. However, I would have expected that that omission would be motivated explicitly. In physics the expectation is that the integration is the absolute core of the action concept; the physicist's expectation will be that omitting the integration renders the derivation invalid.
An example of that point of view: the book by the physicist Douglas Cline, who has made the book also available on the libretexts website. In the Prologue Douglas Cline is committed to believing that differential approach and variational approach are fundamentally distinct.
A number of years ago, looking for information about the Lagrange equation, I came across the following article: 'Deriving Lagrange’s equations using elementary calculus', published in 2003. Authors: Jozef Hanc, Edwin F. Taylor, Slavomir Tuleja.
The authors show that the Euler-Lagrange equation can be derived using differential reasoning only.
I used that discussion as the basis of a derivation of the Euler-Lagrange equation that is available on my own website: Calculus of Variations, as applied in physics
I'm aware of a couple of other sources with derivation of the Euler-Lagrange equation using differential reasoning only.
The researcher Preetum Nakkiran phrases the logic as follows: 'This result is often proven using integration by parts – but the equation expresses a local condition, and should be derivable using local reasoning.'
What is your opinion on the following:
To assess in what way the integration is a factor, does that fall within the scope of the PhysLean objectives?
Tomas Skrivan (Jul 12 2025 at 09:35):
Cleon Teunissen said:
To assess in what way the integration is a factor, does that fall within the scope of the PhysLean objectives?
I would say: variational gradient is defined with HasVarGradientAt which is defined using HasAdjVarDerivAt which is using HasVarAdjoint which has integrals in it. So yes, integration is still completely integral part of the computation. However, the formalization is done to hide it all away and end up with calculus that is very similar to normal differentiable calculus. All the applications of per partes are tucked away in elementary rules.
Tomas Skrivan (Jul 12 2025 at 09:36):
My aim was to have formalization of variational calculus for which it is easy to automate computations
Tomas Skrivan (Jul 12 2025 at 09:39):
One thing my approach does not address at all is the question of boundary conditions. The formalization is done such that we can ignore that question. We obtain the right equation 'in the bulk'(continuum mechanics slang for 'away from boundary')
Kenny Lau (Jul 12 2025 at 09:45):
what’s the current definition of var cal?
Tomas Skrivan (Jul 12 2025 at 09:47):
To find out, I would start at this file https://github.com/HEPLean/PhysLean/blob/master/PhysLean/Mathematics/VariationalCalculus/HasVarGradient.lean
Kenny Lau (Jul 12 2025 at 09:50):
interesting
Kenny Lau (Jul 12 2025 at 09:50):
it’s an inductive type
ZhiKai Pong (Jul 12 2025 at 09:52):
(is this docstring for https://github.com/HEPLean/PhysLean/blob/master/PhysLean/Mathematics/VariationalCalculus/Basic.lean incomplete?)
Tomas Skrivan (Jul 12 2025 at 10:01):
Would you like a doc that gives an overview of the whole directory there?
Joseph Tooby-Smith (Jul 12 2025 at 10:07):
To assess in what way the integration is a factor, does that fall within the scope of the PhysLean objectives?
Let me just address the part of this question about whether this is "within the scope of the PhysLean". IMO it absolutely is :). In addition to the conversation happening here, there is a lot to be done around the action and integration. E.g., when do two Lagrangians correspond to the same action, when do two actions or Lagrangians have the same or equivalent equations of motion etc.
ZhiKai Pong (Jul 12 2025 at 10:12):
ZhiKai Pong said:
(is this docstring for https://github.com/HEPLean/PhysLean/blob/master/PhysLean/Mathematics/VariationalCalculus/Basic.lean incomplete?)
what I mean is just the current doc string has an incomplete sentence at the end
The key took in variational calculus is:
``
∀ h, ∫ x, f x * h x = 0 → f = 0
``
which allows use to go from reasoning about integrals to reasoning about functions. There are
Joseph Tooby-Smith (Jul 12 2025 at 10:15):
Tomas Skrivan said:
Would you like a doc that gives an overview of the whole directory there?
I think generally including something like this in the Basic modules isn't a bad idea (especially if people get in the habit of always checking the Basic module first). Except the obvious time, effort, and maintenance burden that it adds.
Kenny Lau (Jul 12 2025 at 10:35):
sorry if this is a basic question, is var cal about taking derivatives of functions (R^n -> C) -> C?
Kenny Lau (Jul 12 2025 at 10:35):
i just want to make sure im getting the type right
Tomas Skrivan (Jul 12 2025 at 10:37):
Kenny Lau said:
sorry if this is a basic question, is var cal about taking derivatives of functions (R^n -> C) -> C?
Yes it is, but we requite that such functions are in the form of integral. Therefore, HasVarGradient accepts the function to be intehrated
Tomas Skrivan (Jul 12 2025 at 10:38):
I couldn't figure out a formal framework that would work with any function of that type.
Kenny Lau (Jul 12 2025 at 10:40):
ok, my next question is then, wouldnt this mean we’re taking derivatives inside S’, so that this would land us in exactly the same situation as the other channel about Fourier transform
Kenny Lau (Jul 12 2025 at 10:40):
(deriv: S’ -> S’)
Kenny Lau (Jul 12 2025 at 10:44):
i want to make S’ a reality lol
Tomas Skrivan (Jul 12 2025 at 10:49):
In finite dimension we have
S[u] = <S'[u], 1>
Do
grad S u = grad S' u 1
In an infinite dimension, you don't have inner product space unless you restrict yourself to a very specific space but that would stiffle usability and generality
Tomas Skrivan (Jul 12 2025 at 10:49):
Kenny Lau said:
(deriv: S’ -> S’)
I don't understand this S' is a function, not a type
Joseph Tooby-Smith (Jul 12 2025 at 10:49):
(Just to prevent confusion, I note there are two different things been called )
Tomas Skrivan (Jul 12 2025 at 10:50):
Kenny Lau said:
i want to make S’ a reality lol
I don't understand
Joseph Tooby-Smith (Jul 12 2025 at 10:51):
One is Tomas's definition from an earlier message ( ) and the other (from the Fourier transform thread) is the dual of the Schwartz maps.
Tomas Skrivan (Jul 12 2025 at 10:51):
Joseph Tooby-Smith said:
(Just to prevent confusion, I note there are two different things been called )
Duh, really? Where, what?
Cleon Teunissen (Jul 12 2025 at 11:36):
There is knowledge about Calculus of Variations that I think is of interest to the PhysLean effort.
Disclaimer: I'm not a programmer. I'm sorry to say that I have no knowledge of the Lean programming language. (The extent of my programming: I have used the JSXGraph library to create interactive diagrams)
About the Euler-Lagrange equation:
We have: to derive the EL-eq. using integration is overkill. Evidence for that: it is also possible to derive the Euler-Lagrange equation using differential reasoning only; no involvement of integration.
Another angle is:
When deriving the Euler-Lagrange equation using the integration-by-parts approach: towards the end of the derivation the integration is dismissed. That implies that the integration is non-essential. If it would be essential then that dismissal would not be possible. The factor that is necessary-and-sufficient is the differential relation.
The reason that integration is superfluous is, in my opinion, very interesting. There is a striking parallel between that reason and the fundamental theorem of Calculus. (Conjecture: it is in fact an instance of the fundamental theorem of Calculus)
Tomas Skrivan (Jul 12 2025 at 11:52):
Then show don't tell :) could you give a short summary of the main idea here? I don't want to hunt it down myself in all those links you provided
Kenny Lau (Jul 12 2025 at 12:16):
so do you think it would work if we look at S’ which is the normed dual of the schwarz space?
Kenny Lau (Jul 12 2025 at 12:17):
@Joseph Tooby-Smith I’d be interested to make S’ a reality if you think this is a good idea
Joseph Tooby-Smith (Jul 12 2025 at 12:34):
@Kenny Lau I do think this is a good idea! I think this is related to my 3rd thing here , and would be very happy if you wanted to work on this! Maybe we could use that thread to discuss?
so do you think it would work if we look at S’ which is the normed dual of the schwarz space?
Are you asking the following here?: Do you think that variational calculus procedure we have would work if the function S' : (ℝ → ℝ) → (ℝ → ℝ) was replaced with an term of type 'normed dual of the Schwarz space' (let me call this space here to avoid confusion)? I think this could be made to work and it is very close to what Tomas did, except the test functions used for the calculus of variations here are different from the Schwartz functions.
Kenny Lau (Jul 12 2025 at 12:46):
yeah it’s essentially working out the correct space of “test functions”
Kenny Lau (Jul 12 2025 at 12:46):
what are the other choices?
Cleon Teunissen (Jul 12 2025 at 15:04):
Tomas Skrivan said:
[...] the main idea here?
Screenshot_Calculus_of_Variations__unit_of_operation.png
The diagram in the attached screenshot will be used throughout the following discussion.
As motivating example for derivation of the Euler-Lagrange equation I use the Catenary problem; what is the shape of a hanging chain?
(Incidentally, decades before the introduction of Calculus of Variations by Euler: Leibniz had solved the catenary problem using differential reasoning.)
I introduce what I will refer to as 'unit of operation'.
In the case of differentiation the unit of operation is a pair of points; in the diagram the unit of operation is a triplet of points. Connecting the point of the triplet there are two line sections: and . Variation of the unit of operation consists of applying variation to the y-coordinate of the point (, ).
Over the entire curve the unit of operation is concatenated such that each point (except the boundary points) participates in the variation.
Example:
, ,
, , ,
, ,
And so on.
The intervals along the horizontal axis, the X-axis, are set to have the same length, so the -interval can be generically referred to as . To obtain the differential relation:
is the y-coordinate of the midpoint of the line section
is the y-coordinate of the midpoint of the line section
We have the following:
Each subsection of the catenary is in and of itself an instance of the catenary problem, only with different boundary condition, depending on where along the catenary the subsection is taken.
There is no lower limit to how small the subsections can be taken.
The idea is to evaluate the unit of operation in such a way that the reasoning is concurrently valid everywhere along the curve.
Each section, section and section , contributes to the overall potential energy of the catenary. Per unit of length along the horizontal axis the potential energy of the corresponding section of catenary is a function of the height, and of the slope (the slope gives the curve length per unit along the horizontal axis.
the potential energy of section
the potential energy of section
At the equilibrium point the two contributions are counterchanging. The total potential energy is stationary when the two rates of change are matching.
A set of four relations.
I'm using Lagrange notation:
, , , (2)
To set up for later use: each of the above is differentiated with respect to .
, , , (3)
Each term has two contributions, one for the height and the other for the slope. For each contribution: partial differentiation:
Using the relations stated in (3), substitute:
Distribute the minus sign:
Move everything to the right hand side:
Regroup the terms: in (7) the terms with and are side-by-side, in (8) the terms with and are side-by-side.
The expression is now ready to take the limit of
The core ideas:
The variational problem can be subdivided in subsections. Each subsection is in and of itself an instance of the problem you want to solve.
As we know: a differential equation is a higher order equation in the following sense: whereas a root finding equation as a value as its solution, the solution to a differential equation is a function.
For a differential equation: the space of solutions to the differential equation is a space of functions.
A differential equation is a global equation in the following sense: the demand is that the differential equation is to be satisfied concurrently over the whole domain.
In the derivation: the step that I find the most interesting is the one from (7) to (8). That regrouping allows the equation to be collapsed from four terms to two terms, so that you can solve the problem.
This derivation of the Euler-Lagrange equation is modeled after the derivation in the following article: Deriving Lagrange’s equations using elementary calculus
The discussion in this message is an excerpt from the discusion on my own website: Calculus of Variations, as applied in physics
In that treatment the catenary problem is solved twice; first with differential calculus, then with calculus of variations.
Not only do the two approaches arrive at the same solution, the two approaches are convergent: they give the same differential equation.
The reason for the convergence isn't specific to the catenary problem; it generalizes.
Conjecture:
The set of all variational equations is a subset of the set of all differential equations.
Cleon Teunissen (Jul 13 2025 at 09:55):
Joseph Tooby-Smith said:
[...] there is a lot to be done around the action and integration. E.g., when do two Lagrangians correspond to the same action, when do two actions or Lagrangians have the same or equivalent equations of motion etc.
In my previous message I used the catenary problem as motivating example. The reason that the catenary problem is a challenging problem: stated as a variational problem the Lagrangian of the catenary problem has the following form:
That Lagrangian requires partial differentiation.
As to the application of Calculus of Variations in classical mechanics: the thing is: compared to the catenary problem classical mechanics is a problem of a lower level.
I mean lower level in the following sense: in classical mechanics the potential energy is a function of position coordinate only, and the kinetic energy is a function of the velocity coordinate only, so there is no need for partial differention. It's a much, much simpler case.
Some historical information that is helpful for context:
Lagrange developed his mechanics in the years leading up to the publication of Mécanique analytique in 1788, and Hamilton's stationary action was introduced in 1834. That is: Lagrange developing his mechanics was half a century before the introduction of Hamilton's stationary action.
Stretching from the time of Huygens up to and including the time of Hamilton the concept of vis viva (the 'living force'), was used - vis viva being defined as . Around the 1850's the physics community shifted to the notion of kinetic energy, defined as . I will return to this further on in this message.
Statement:
If is granted as law of motion then the work-energy theorem follows as theorem.
In preparation for derivation of the work-energy theorem:
With:
position coordinate
velocity
acceleration
time
A relation will be derived, following from these definitions:
The definitions will be used in this form:
Let represent an arbitrary acceleration profile. Set up integration of this acceleration profile from starting position coordinate to final position coordinate
The result:
(8) generalizes to other contexts. Here it is stated in terms of position, velocity and acceleration; it generalizes to any context where a coordinate, its first time derivative, and its second time derivative are concurrently used.
The work-energy theorem
Starting point:
Set up integration with respect to the position coordinate:
Use (8) to process the right hand side:
(11) is the work-energy theorem.
The work-energy theorem motivates the concepts of potential energy and kinetic energy, and provides the definitions for them. Specifically: why does the definition of kinetic energy have that factor ? That's because the value of kinetic energy is obtained by way of integration - in this case integration with respect to position coordinate.
An integral expresses a difference:
For (12) we have: the validity extends to infinitesimally small change of energy.
Thus the work-energy theorem implies: as an object is being accelerated by a force there is a conserved quantity: the rate of change of kinetic energy matches the rate of change of potential energy, hence the sum of the two is a conserved quantity.
At this point we have the option of capitalizing on the fundamental theorem of Calculus.
We take the work-energy theorem, and we set up the inverse of the integration that obtained the work-energy theorem; we set up differentiation with respect to position coordinate
At this point I want to compare two differentiation operations:
and
Processing both:
The two differentiation operations arrive at the same result: .
I accept (16)-(17) as corroboration that (14) and (15) are the same differentiation operation, but of course (16)-(17) is way short of a proof.
With the differentiation operation stated in the form of (15):
With the benefit of hindsight: (18), and how 18 is derived, is the basis of the mechanics that was developed by Joseph Louis Lagrange. (The way Lagrange motivated the mechanics was comparitively circuitous; Lagrange wasn't using the work-energy theorem. The way Lagrange motivated the mechanics did not involve calculus of variations.)
I will refer to (18) as 'the Lagrange equation'.
The following notation is pretending that partial differentiation is being used:
My objection to (19) is:
It is smoke and mirrors. (19) is still the same equation as (18).
Discussion:
The power of the formulation of mechanics developed by Joseph Louis Lagrange is that the mechanics taking place is expressed in terms of potential energy and kinetic energy.
In terms of the newtonian formulation of mechanics: when there are multiple degrees of freedom vectorial representation is used, and with the vectorial information is present on both sides of the equation. In the case of coordinate representation the vectors must be decomposed in vector components.
On the other hand: with kinetic energy and degrees of freedom:
That is: kinetic energy, being a quadratic expression, comes pre-pythagorized.
In terms of the mechanics of Joseph Louis Lagrange the directional information of the velocity vector is discarded; the expression for the kinetic energy doesn't carry directional information.
The required directional information is still available: take the potential energy, and evaluate the gradient of the potential energy. By differentiating the potential energy (decomposing in components) the directional information is made explicit again.
Given an equation of motion: how to construct a Lagrangian?
The Euler-Lagrange operator executes differentiation with respect to position coordinate.
Simplest example: the catenary problem:
The catenary shape is an instance of force equilibrium.
In order to apply calculus of variations: obtain the integral-with-respect-position of the force. Then the Euler-Lagrange operator converts that potential energy back to an expression in terms of force.
The ways a Lagrangian can be non-unique correlate with the fact that differentiation is a lossy operation. Presumably: the fact that the Euler-Lagrange operator specifies partial differentiation creates significant leeway.
The key points:
The lagrange equation:
The Lagrange equation consists of three elements:
- The fundamental theorem of calculus: integration and differentiation are inverse operations.
Joseph Tooby-Smith (Jul 13 2025 at 10:24):
Hey @Cleon Teunissen, it is going to take me a bit of time to read and try and understand what you've written here, and I'll have more time tomorrow to do this - so if no one else gets back to you before then - I will respond tomorrow.
Cleon Teunissen (Jul 13 2025 at 10:35):
Joseph Tooby-Smith said:
[...] I'll have more time tomorrow [...]
I hope you will allocate the time to consider the information thoroughly. If it takes longer, please take longer.
This information is towards the goal of the Lean project to assist in clarifying derivations.
In physics, as in any science, it is considered worthwhile to try and narrow down to the minimal set of assumptions that is sufficient. This information goes towards that.
Joseph Tooby-Smith (Jul 14 2025 at 10:05):
Ok, I've read and somewhat understood what you have written. I have two questions:
- I'm slightly confused on where you stand - do you think integration is important or not in terms of this?
- In your second message, your starting assumptions is newtonian mechanics, right?
Cleon Teunissen (Jul 15 2025 at 00:02):
Joseph Tooby-Smith said:
- do you think integration is important or not in terms of this?
First off: to exclude possibility of babylonian confusion about which integration is intended: in the case of Hamilton's stationary action the integration I'm referring to is the integration of the Lagrangian with respect to time:
To derive the Euler-Lagrange equation it is not necessary to involve that integration; the Euler-Lagrange equation can be derived with purely differential reasoning.
To explain how that is possible: a particular case in the history of Calculus of Variations: the Brachistochrone problem.
The Bernoulli who issued the challenge was Johann Bernoulli. In the end about four or five mathematicians had solved the problem, among them was Johann's older brother Jacob Bernoulli.
Jacob Bernoulli opened his discussion of the Brachistochrone problem (Acta Eruditorum, May 1697, pp. 211-217) with a lemma.
Lemma. Let ACEDB be the desired curve along which a heavy point falls from A to B in the shortest time, and let C and D be two points on it as close together as we like. Then the segment of arc CED is among all segments of arc with C and D as end points the segment that a heavy point falling from A traverses in the shortest time. Indeed, if another segment of arc CFD were traversed in a shorter time, then the point would move along ACFDB in a shorter time than along ACEDB, which is contrary to our supposition.
Key to the lemma is the "let C and D be two points on it as close together as we like."
If the curve as a whole is a path of fastest descent then each subsection is itself a curve of fastest descent, and that is valid down to infinitesimally short subsections.
That is:
The property of being a curve of fastest descent obtains at the infinitesimal level, and from there it propagates out to the curve as a whole.
When deriving the the Euler-Lagrange equation you can go either way:
- without involvement of the integration
- with involvement of the integration
In both cases the result is the same: the Euler-Lagrange equation.
That is to say: having the integration involved in deriving the Euler-Lagrange equation is not contributing anything; the differential relation carries the information.
It all falls into place:
As I mentioned in an earlier message: in the case of the derivation of the EL-eq. using integration-by-parts: towards the end the integration is dismissed, while retaining the differential relation. The integration can be dismissed because it is the differential relation that expresses the information that allows you to solve the problem.
Cleon Teunissen (Jul 15 2025 at 00:07):
Joseph Tooby-Smith said:
[...] your starting assumptions is newtonian mechanics, right?
Well, I'm not committed to a particular choice of starting point.
That is why I wrote: "If F=ma is granted as law of motion."
As the physics historian John Norton has pointed out:
In any logical system there is great freedom to interchange theorem and axiom without changing the contents of the system.
When taking the work-energy relation as axiom: F=ma follows by way of differentiation.
My primary concern is the content, and the relations.
Example: we have that the Lagrangian formulation of mechanics and the Hamiltonian formulation of mechanics are related by Legrendre transform. Legendre transform is it's own inverse; applying Legendre transform twice recovers the original function. Lagrangian mechanics and Hamiltonian mechanics, while formulated differently, are expressing the same content.
In the case of the newtonian formulation and the formulation of mechanics by Joseph Louis Lagrange: the derivation of the work-energy relation provides the transformation between the two. I think of the two formulations as expresssing the same content.
About the newtonian formulation
The following may be worth mentioning explicitly:
The newtonian formulation of mechanics generalizes to using generalized coordinates.
For further reading on that: the discussion by Richard Fitzpatrick: Generalized Forces
(It may be that some people are under the impression that using generalized coordinates is exclusive to the formulation of mechanics by Joseph Louis Lagrange. That isn't the case.)
Discussion:
The type of relation described by is a recurring theme.
In the abstract:
There is a state, for example a position coordinate, its first time derivative, and its second time derivative. A causal agent can generically be referred to as a 'force'. When there is opposition to change of state (an inertia) then a relation of the type is seen.
Example:
The electromechanics of electric oscillation in an LC-circuit.
With an LC-circuit:
The electromagnetic effects of the inductor give rise to an opposition to change of current strength. With an inductor in the circuit: an electromotive force is required to get a current going. Once a current is going the inductor acts in opposition to decrease in current strength. That is: inductance is quite analogous to inertia.
Change of current strength is a second time derivative.
Let represent rate of flow of charge through a circuit.
With:
electromotive force
inductance
Then:
This type of relation - response to causal agent proportional to second time derivative - is a recurring theme.
When a relation of the type is applicable then a counterpart of the work-energy relation is applicable.
Cleon Teunissen (Jul 15 2025 at 20:47):
@Joseph Tooby-Smith
I'm aware that my outpouring of messages can lead to fatigue and diminishing returns, but there is another angle that may well go a long way in clarification.
Back in 2003, in the collaboration of Edwin F. Taylor, Jozef Hanc, and Slavomir Tuleja, they also featured interactive diagrams, implemented in Java. They referred to the process of converging on the sought after curve as 'hunting'.
My version of that is an interactive diagram for finding the shape of a soap film stretching between two coaxial rings.
Image credit: Susan Schwartzenberg - Exploratorium
The screenshot below shows the shape-finding diagram in the state where it has converged onto the solution. The diagram shows the cross section. The polygon is already a remarkably good approximation.
The screenshot below shows a state with the sliders scrambled.
There are 7 nodes; 3 of the nodes are mirrored.
Direct link to the diagram on my website:
Soap film stretching between coaxial rings
Each of the nodes is an instance of the type of triplet displayed in the unit-of-operation diagram:
Screenshot_Calculus_of_Variations__unit_of_operation.png
To converge onto the curve: iterate over the sliders. Each adjustment is by itself a local adjustment. Over multiple iterations the polygon converges.
The idea, of course, is that an automated version of that diagram is a numerical analysis implementation of variational approach.
The important property is: while iterating: at each visit to a node, applying variation to the node, only the surface area of the triplet unit-of-operation of that node is affected by that variation. So at each visit to a node it is sufficient to compute the change of area for that triplet unit-of-operation only. Over multiple iterations all of the adjustments propagate out to the polygon as a whole; the end result is convergence of the polygon as a whole.
The principle:
If all the unit-of-operation triplets are concurrently locally minimized then the curve as a whole is minimized.
In another article on my website I implemented three instances of a diagram with more sliders, 10 sliders, moving 19 nodes (9 nodes mirrored), and additional sliders for granular control. The following three cases are implemented:
- potential increases linear with heigth
- potential increases quadratic with displacement (Hooke's law)
- potential increases in proportion to the cube of the displacement
The interactive diagrams available on my website are created with JSXGraph, a javascript library. The JSXGraph developers are employed at the University of Bayreuth, Germany.
Joseph Tooby-Smith (Jul 16 2025 at 10:02):
I think I understand your messages, and what you are saying. I think what would be good for you to think about is the actually mathematical definitions and theorems you want would want to formalize, and how they relate to the formalization already present of variational calculus. If you can state these in Lean - great, but if not I think stating them as mathematically concrete as possible would be helpful (ignoring the underlying physics for now).
Lean forces one to think in terms of precise mathematical definitions and theorems, and as such many people here think in that way, so doing so may also help others here understand better what you are trying to say.
Last updated: Dec 20 2025 at 21:32 UTC