Zulip Chat Archive

Stream: new members

Topic: differentiation in multivariate setting


Abu Al Hassan (Mar 14 2023 at 17:27):

Hello,
I am trying to prove :

Theorem (Necessary Second Order Optimality Conditions). Let f : U → R be a function
defined on an open set U ⊆ R^{n} Suppose that f is twice continuously differentiable over U
and that x∗ is a stationary point. Then

  1. if x∗ is a local minimum point, then ∇^{2}f (x∗) ≽ 0

In doing so I have been trying to set up the problem. I am having two problems. First I want to know what I can use in the mathlib documentation to define that the first derivative f' of f is 0 at x_star, meaning x_star is a stationary point of f. This could be a hypothesis to input into the theorem.

Secondly, I would like to know how to write that the hessian is a semi-positive definite in the Prop bit of the theorem (i imagine there is some built-in way to write this).

So far my code is:

import tactic
import data.real.basic
import analysis.inner_product_space.euclidean_dist
import analysis.calculus.deriv

variables {n : } {f f': (fin n  )  }

def open_set (U : set (fin n  )) : Prop :=
 x  U,  (ε : ), 0 < ε  euclidean.ball x ε  U

def local_minimum (x_star : fin n  ) (U : set (fin n  )) : Prop :=
 (ε : ), 0 < ε  ((euclidean.ball x_star ε  U)  ( x  euclidean.ball x_star ε, f x_star  f x))

variables {x_star : fin n  }

theorem necess_2nd_order_optimality_condit (U : set (fin n  )) (h_open : open_set U)
(h_f_twice_cont_diff : cont_diff  2 f) (h_xstar_in_U : x_star  U)
--{h_xstar_stationary : deriv f x_star = 0}  --not working
: local_minimum x_star U  Hessian is positive definite  :=
begin
  sorry,
end

Any help or hints will be much appreciated!

Alistair Tucker (Mar 14 2023 at 21:41):

I don't know the answers to your questions but I think where you've written ∃ (ε : ℝ), 0 < ε → ... you wanted to write ∃ ε (h : 0 < ε), ....

Alistair Tucker (Mar 14 2023 at 22:05):

Also fin n → ℝ probably doesn't have the norm you think it does so best to use euclidean_space ℝ (fin n) instead. Then your first question might be answered by including a hypothesis of the form (h : has_fderiv_at f (0 : euclidean_space ℝ (fin n) →L[ℝ] ℝ) x_star). To deal with the second it looks like you're going to have to delve into iterated_fderiv and multilinear_map.

Abu Al Hassan (Mar 15 2023 at 00:14):

Hi Alistair,
Thanks for pointing out my mistake with the \epsilon. I have made this mistake in the past but will be more careful going forward.

Abu Al Hassan (Mar 15 2023 at 00:19):

I am still unable to work out how to make progress on the second problem of writing out that the hessian is positive semi-definite.
I think using iterated_fderiv \R 2 f defines the hessian but I couldn't find anything in mathlib that defines such a multilinear map as being positive semi-definite. Maybe there is a way to define the hessian as a matrix so I can use matrix.pos_semidef

Abu Al Hassan (Mar 15 2023 at 00:20):

Updated code:

import tactic
import data.real.basic
import analysis.inner_product_space.euclidean_dist
import analysis.calculus.deriv
import analysis.calculus.cont_diff

def open_set (n : ) (U : set (fin n  )) : Prop :=
 x  U,  (ε : ), 0 < ε  euclidean.ball x ε  U

def local_minimum (n : ) (x_star : fin n  ) (U : set (fin n  )) (f : (fin n  )  ) : Prop :=
 ε (h : 0 < ε), (euclidean.ball x_star ε  U)  ( x  euclidean.ball x_star ε, f x_star  f x)

theorem necess_2nd_order_optimality_condit
(n : ) (U : set (fin n  )) (h_open : open_set n U)
(x_star : fin n  ) (U : set (fin n  )) (f f' : (fin n  )  )
(h_f_twice_cont_diff : cont_diff  2 f)
(h_xstar_in_U : x_star  U)
(h : has_fderiv_at f (0 : euclidean_space  (fin n) L[] ) x_star):
local_minimum n x_star U f  Hessian is positive semi definite  :=  --trying to fix this bit
begin
  sorry,
end

Eric Wieser (Mar 15 2023 at 00:45):

(docs#iterated_fderiv for reference)

Eric Wieser (Mar 15 2023 at 00:48):

You'll almost certainly need to uncurry the multilinear map into a bilinear map

Eric Wieser (Mar 15 2023 at 00:49):

Then we should have something about positive definiteness of bilinear maps

Eric Wieser (Mar 15 2023 at 00:51):

Though actually it might be better to avoid iterated_fderiv altogether, and just use fderiv twice.

Eric Wieser (Mar 15 2023 at 00:52):

Does seeing how docs#second_derivative_symmetric is stated help?

Abu Al Hassan (Mar 15 2023 at 12:43):

Hi @Eric Wieser , I don't think fderiv is what I want as it requires you to give a point where f has a derivative. So if i want to write second derivative at x_star, i will need to input an arbitrary point x and after doing fderiv twice evaluate it at x_star.
Something like fderiv ℝ (fderiv ℝ f x) x x_star which doesn't work

Kevin Buzzard (Mar 15 2023 at 12:44):

Can you summarise what the question is, perhaps with a much smaller example which doesn't depend on all the calculus? Ultimately is your question just about linear algebra? I feel that the question could do with a lot more minimising but maybe I'm wrong

Abu Al Hassan (Mar 15 2023 at 12:46):

Further on, hessian contains partial derivatives and not the standard derivative.

Abu Al Hassan (Mar 15 2023 at 12:47):

I dont think an MWE would help and I think the question is quite concise tbh. I will attempt to further break it down now and open another question. Hopefully that will help

Eric Wieser (Mar 15 2023 at 12:48):

Eric Wieser said:

Does seeing how docs#second_derivative_symmetric is stated help?

Did you look at this? It uses

variables {x : E} {f : E  F} {f' : E  (E L[] F)} {f'' : E L[] E L[] F}
variables (hf :  (y : E), has_fderiv_at f (f' y) y) (hx : has_fderiv_at f' f'' x)

Eric Wieser (Mar 15 2023 at 12:49):

I'm pretty sure that f'' is the hessian in this arrangement

Abu Al Hassan (Mar 15 2023 at 15:18):

Hi @Eric Wieser , thank you for the hint. Sorry for the late reply, i was trying to understand the way f' and f'' are defined in second_derivative_symmetric.

I would have thought that f' x is a row vector (1 by n) containing the partial derivatives of f evaluated at x. I think {f' : (fin n → ℝ) → ((fin n → ℝ) →L[ℝ] ℝ)} does that.

However, the hessian f'' is meant to be an n by n matrix (so later I can use matrix.pos_semidef). I am afraid I really dont understand {f'' : (fin n → ℝ) →L[ℝ] (fin n → ℝ) →L[ℝ] ℝ} achieves this. I am confused why the entries in this hessian are linear functions.

Abu Al Hassan (Mar 15 2023 at 15:20):

I am really sorry for maybe not asking this in the most intelligent manner. I am very new to this stuff in lean.

Eric Wieser (Mar 15 2023 at 15:21):

If I understand correctly, f' x dx gives the directional derivative at x in the direction dx

Eric Wieser (Mar 15 2023 at 15:21):

... and f'' dx dy gives you the entry of the hessian corresponding to the directions dx and dy. To recover the matrix you can set dx = single i 1 and dy = single j 1

Eric Wieser (Mar 15 2023 at 15:23):

Perhaps docs#bilin_form.to_matrix' makes that link clearer?

Kevin Buzzard (Mar 15 2023 at 18:11):

f '' eats two elements of Rn\R^n and returns a real number, in a bilinear way, and every bilinear form on Rn\R^n is represented by a matrix.

Abu Al Hassan (Mar 15 2023 at 22:29):

Hi Eric and Prof. Buzzard. Thank you for the help on this query, I know my questioning was a bit all over the place (I was feeling really out of my depth tbh with all the built-in libraries I was having to utilise). I have been able to simplify the problem and be able to define hessian and positive semi-definite from a more basic way. I will leave my code here in case it helps anyone in the future and mark this query resolved.

import analysis.calculus.deriv
import data.matrix.notation
import linear_algebra.matrix

open set

variables {n : } {U : set (fin n  )} {f : (fin n  )  }
variables (x_star : fin n  )

-- f is twice continuously differentiable over U
def twice_continuous_diff_on (f : (fin n  )  ) (U : set (fin n  )) : Prop :=
  continuous_on (fderiv  f) U  --checks if the first derivative of f is continuous on teh set U
  continuous_on (λ x, fderiv  (fderiv  f x)) U --checks if the second derivative of f is continuous on the set U

-- x_star is a stationary point
def stationary_point (f : (fin n  )  ) (x_star : fin n  ) : Prop :=
  fderiv  f x_star = 0

-- x_star is a local minimum point
def local_minimum  (x_star : fin n  ) (U : set (fin n  )) (f : (fin n  )  ) : Prop :=
   ε > 0,  y  U, dist y x_star < ε  f x_star  f y

-- Hessian matrix. This is a square matrix of second order partial derivatives of f.
-- computes the second order partial derivative of f at the ith and jth variables at a point x.
noncomputable def hessian (f : (fin n  )  ) (x : fin n  ) : matrix (fin n) (fin n)  :=
λ i j, (fderiv  (λ y, fderiv  f y i) x) j

-- Hessian matrix at x_star, H is positive semi-definite (∀ x, 0 ≤ xᵀ * H * x ).
def hessian_positive_semi_definite (f : (fin n  )  ) (x_star : fin n  ) : Prop :=
let H := hessian f x_star in
   (x : fin n  ), (matrix.dot_product x (matrix.mul_vec H x))  0

Notification Bot (Mar 15 2023 at 22:30):

Abu Al Hassan has marked this topic as resolved.

Eric Wieser (Mar 15 2023 at 22:47):

Your use of dist above is not the euclidean distance; that might be a problem

Eric Wieser (Mar 15 2023 at 23:02):

Note also that you should use docs#matrix.of to build a matrix from coefficients

Notification Bot (Mar 15 2023 at 23:02):

Abu Al Hassan has marked this topic as unresolved.

Abu Al Hassan (Mar 15 2023 at 23:02):

Yes you're right, thanks for the spot. I wasn't a fan of the earlier euclidean.ball but the below should be fine:

-- x_star is a local minimum point using Euclidean distance
def local_minimum  (x_star : fin n  ) (U : set (fin n  )) (f : (fin n  )  ) : Prop :=
   ε > 0,  y  U,  y - x_star  < ε  f x_star  f y

Abu Al Hassan (Mar 15 2023 at 23:03):

Eric Wieser said:

Note also that you should use docs#matrix.of to build a matrix from coefficients

Okay I will try to adapt it

Eric Wieser (Mar 15 2023 at 23:15):

That doesn't work either. That's not the euclidean norm you claim it is!

Alistair Tucker (Mar 15 2023 at 23:17):

I think I mentioned that first of all, but I guess it doesn't matter here

Alistair Tucker (Mar 15 2023 at 23:28):

Btw if f isn't differentiable then fderiv is defined to be 0. That's why I wrote has_fderiv_at f 0 x_star rather than fderiv ℝ f x_star = 0 for the first order condition.

Abu Al Hassan (Mar 15 2023 at 23:42):

Eric Wieser said:

That doesn't work either. That's not the euclidean norm you claim it is!

Does the following work?

-- x_star is a local minimum point using Euclidean distance
def local_minimum  (x_star : fin n  ) (U : set (fin n  )) (f : (fin n  )  ) : Prop :=
   ε > 0,  y  U, euclidean.dist y x_star < ε  f x_star  f y

Abu Al Hassan (Mar 15 2023 at 23:43):

Or shall i got back to :

def local_minimum (n : ) (x_star : fin n  ) (U : set (fin n  )) (f : (fin n  )  ) : Prop :=
 ε (h : 0 < ε), (euclidean.ball x_star ε  U)  ( x  euclidean.ball x_star ε, f x_star  f x)

Eric Wieser (Mar 15 2023 at 23:44):

I think you should use x_star : euclidean_space ℝ (fin n) isntead of x_star : fin n → ℝ, and similar throughout. Then norm, dist, ball etc all mean the right thing.

Eric Wieser (Mar 15 2023 at 23:44):

docs#euclidean.dist is likely to just add extra confusion

Alistair Tucker (Mar 15 2023 at 23:50):

Ouch in your definition of hessian you are coercing terms of fin n to fin n → ℝ. Pretty sure that's not what you want to be doing

Eric Wieser (Mar 15 2023 at 23:51):

I knew it looked fishy, but assumed it was fine because Lean accepted it and I must have misread ...

Eric Wieser (Mar 15 2023 at 23:51):

You don't want the coercion, you want docs#euclidean_space.single

Abu Al Hassan (Mar 15 2023 at 23:52):

Alistair Tucker said:

Ouch in your definition of hessian you are coercing terms of fin n to fin n → ℝ. Pretty sure that's not what you want to be doing

Yh i realised that was a error when i tried to implement matrix.of. Currently struggling to make that work

Eric Wieser (Mar 15 2023 at 23:54):

Eric Wieser said:

You don't want the coercion, you want docs#euclidean_space.single

Or better yet, use

Eric Wieser said:

Perhaps docs#bilin_form.to_matrix' makes that link clearer?

Eric Wieser (Mar 15 2023 at 23:55):

(combined with docs#linear_map.to_bilin)

Abu Al Hassan (Mar 16 2023 at 00:24):

Would you recommend that i do (f : (euclidean_space ℝ (fin n)) → ℝ) rather than (f : (fin n → ℝ) → ℝ) ?

Abu Al Hassan (Mar 16 2023 at 00:46):

Hi Eric, I tried what you suggested but can't get lean to be alright with the code. Can you please help with how to do it. Here is my MWE:

import analysis.calculus.deriv
import linear_algebra.matrix
import analysis.inner_product_space.euclidean_dist
import linear_algebra.bilinear_form
import linear_algebra.matrix.bilinear_form
open set

variables {n : } {f : (euclidean_space  (fin n))  }

def hessian (f : (euclidean_space  (fin n))  ) (x : euclidean_space  (fin n)) : matrix (fin n) (fin n)  :=
  let H := linear_map.to_bilin (fderiv  (λ y : euclidean_space  (fin n), fderiv  f y) x) in
  bilin_form.to_matrix' H
don't know how to synthesize placeholder
context:
n : ,
f : euclidean_space  (fin n)  ,
x : euclidean_space  (fin n)
 Sort ?
delete2.lean:11:11
type mismatch at application
  linear_map.to_bilin (fderiv  (λ (y : euclidean_space  (fin n)), fderiv  f y) x)
term
  fderiv  (λ (y : euclidean_space  (fin n)), fderiv  f y) x
has type
  euclidean_space  (fin n) L[] euclidean_space  (fin n) L[]  : Type
but is expected to have type
  ?m_3 →ₗ[?m_1] ?m_3 →ₗ[?m_1] ?m_1 : Type (max ? ?)
Additional information:
c:\Users\ahass\OneDrive\Desktop\formalising-mathematics-2023\src\coursework\delete2.lean:11:11: context: switched to simple application elaboration procedure because failed to use expected type to elaborate it, error message
  too many arguments

Abu Al Hassan (Mar 16 2023 at 00:49):

I am hoping my def hessian_positive_semi_definite will work in its current form later

Abu Al Hassan (Mar 16 2023 at 00:57):

Alistair Tucker said:

Ouch in your definition of hessian you are coercing terms of fin n to fin n → ℝ. Pretty sure that's not what you want to be doing

Can you please elaborate on this?

Eric Wieser (Mar 16 2023 at 01:18):

type mismatch at application
  linear_map.to_bilin (fderiv  (λ (y : euclidean_space  (fin n)), fderiv  f y) x)
term
  fderiv  (λ (y : euclidean_space  (fin n)), fderiv  f y) x
has type
  euclidean_space  (fin n) L[] euclidean_space  (fin n) L[]  : Type
but is expected to have type
  ?m_3 →ₗ[?m_1] ?m_3 →ₗ[?m_1] ?m_1 : Type (max ? ?)

This is telling you that you have a continuous bilinear map (→L), but lean is expecting you to provide a regular one (→ₗ)

Eric Wieser (Mar 16 2023 at 01:23):

This is of course rather irritating, and I don't think we have anything that directly converts between the two

Eric Wieser (Mar 16 2023 at 01:24):

You can change the left-most arrow with .to_linear_map, but the rightmost one needs to be eliminated with composition, and we don't have the map to compose with either

Alistair Tucker (Mar 16 2023 at 10:30):

Abu Al Hassan said:

Alistair Tucker said:

Ouch in your definition of hessian you are coercing terms of fin n to fin n → ℝ. Pretty sure that's not what you want to be doing

Can you please elaborate on this?

When applied to 𝕜 : Type* the underlying field, f : E → F the function to be differentiated and x : E the point at which to do so, fderiv will return a linear map of type E →L[𝕜] F. That linear map (let's call it f' x) also takes an argument which must be of type E or able to be coerced to type E. Otherwise Lean will complain. By accident you fooled Lean by providing a term with a valid but wholly unsuitable coercion to E. It surprises me, given that there is no automatic coercion from type to fin n → ℝ, but there is apparently one from type fin n to fin n → ℝ. Try it yourself by typing #eval ((2 : fin 3) : fin 3 → ℝ) in your editor. One has to look through some noise, but the answer it returns is the vector (2, 2, 2)! This is not what you wanted. The good news is that working with euclidean_space ℝ (fin n) you are less likely to fall into this trap again, because there is no automatic coercion from type fin n to euclidean_space ℝ (fin n).

Yaël Dillies (Mar 16 2023 at 11:42):

Yes, that is the ""canonical"" coercion from fin n to any ring. Of course this is absolutely not intended use. Intended use is fin a -> K where K is a characteristic a ring.

Yaël Dillies (Mar 16 2023 at 11:44):

I've argued on Xena that we should remove this coercion given its surprisiveness and the fact that the coercion fin a -> fin a is not defeq to id. IIUC @Eric Rodriguez wants to replace it with a fin_cast field in char_p, the same way we have docs#add_monoid_with_one.nat_cast.

Eric Wieser (Mar 16 2023 at 11:48):

Where's that coercion?

Eric Rodriguez (Mar 16 2023 at 11:55):

It was for zmod instead of fin, but the same idea stands, yea

Abu Al Hassan (Mar 16 2023 at 12:44):

Eric Wieser said:

You can change the left-most arrow with .to_linear_map, but the rightmost one needs to be eliminated with composition, and we don't have the map to compose with either

What do you suggest I do? It's sounding like a bit of a dead end.

Abu Al Hassan (Mar 16 2023 at 12:45):

I am doing this for a university module and setting this problem up has been a nightmare compared to the other coursework I have done in lean. Trying to weigh my option of whether I should tackle a different proof.

Eric Wieser (Mar 16 2023 at 12:46):

I think it's a good lean exercise to manually write something like

def continuous_linear_map.to_bilin (f : E L[] E L[] ) : bilin_form  E

Abu Al Hassan (Mar 16 2023 at 12:46):

Hi @Alistair Tucker, thank you for explaining the problem. I understand it much better now.

Alistair Tucker (Mar 16 2023 at 14:17):

Is it really necessary to do this through matrices? As I understand it (and I may not—e.g. I initially thought your ∇^{2} was the Laplacian) your and Eric's code can already be brought together into this problem statement:

import analysis.inner_product_space.euclidean_dist

notation `𝓔` n:50 := euclidean_space  (fin n)

def open_set {n : } (U : set (𝓔 n)) : Prop :=
   x  U,  ε (h : 0 < ε), euclidean.ball x ε  U

def local_minimum {n : } (x_star : 𝓔 n) (U : set (𝓔 n)) (f : 𝓔 n  ) : Prop :=
   ε (h : 0 < ε),  y  U, y - x_star < ε  f x_star  f y

theorem necess_2nd_order_optimality_condit
    {n : } {U : set (𝓔 n)} (h_open : open_set U)
    {x_star : 𝓔 n} (h_xstar_in_U : x_star  U)
    {f : 𝓔 n  } {f' : 𝓔 n  𝓔 n L[] } {f'' : 𝓔 n L[] 𝓔 n L[] }
    (hf :  y  U, has_fderiv_at f (f' y) y) (hx : has_fderiv_at f' f'' x_star)
    (h_stationary : f' x_star = 0) :
    local_minimum x_star U f   y, 0  f'' y y :=
begin
  sorry,
end

Eric Wieser (Mar 16 2023 at 14:22):

I think you might still see some advantage to going through bilin_form as f.to_bilin.to_quadratic_form.pos_def or something instead of ∀ y, 0 ≤ f'' y y , but I agree that matrices are probably a distraction

Eric Wieser (Mar 16 2023 at 14:22):

It's annoying that continuous_linear_map.to_bilin is missing, but it's super trivial to prove

Sebastien Gouezel (Mar 16 2023 at 15:13):

Note that you don't need the space to be finite-dimensional for the statement to hold. (And no need for Euclidean norm either). The statement is definitely easier to prove if you put it in the right generality, because you won't be deterred by irrelevant details.

Abu Al Hassan (Mar 16 2023 at 23:41):

Following on from @Alistair Tucker code above. it would be great if someone can explain what 0 ≤ f'' y y means in it.

Does 0 ≤ f'' y y mean we are saying y^{T}Hessian(x_star)y >= 0 for positive semi definiteness?
(So our goal is that for all y in R^n, y transpose times Hessian evaluated at x_star times y is greater or equal to 0)

Abu Al Hassan (Mar 16 2023 at 23:57):

Great, thank you sincerely to everyone that helped.

Notification Bot (Mar 17 2023 at 14:21):

Abu Al Hassan has marked this topic as resolved.

Abu Al Hassan (Mar 18 2023 at 00:29):

Hi, I have a follow-up from Alistairs's code. (hx: has_fderiv_at f' f'' x_star) isn't exactly what I want because f'' here is fixed here such that it's evaluated at x_star. I however want f'' where I can evaluate it at any point in U.

I think changing hx to (hx : ∀ x_star ∈ U, has_fderiv_at f' f'' x_star) is nonsense and I would probably need to change the structure of {f'' : 𝓔 n →L[ℝ] 𝓔 n →L[ℝ] ℝ}. Not sure how.

I do have the following which could help.
-- f is twice continuously differentiable over U def twice_continuous_diff_on {n : ℕ} (f : 𝓔 n → ℝ) (U : set (𝓔 n)) : Prop := continuous_on (fderiv ℝ f) U ∧ --checks if the first derivative of f is continuous on the set U continuous_on (λ x, fderiv ℝ (fderiv ℝ f x)) U --checks if the second derivative of f is continuous on the set U

I would really appreciate some further guidance

Notification Bot (Mar 18 2023 at 00:30):

Abu Al Hassan has marked this topic as unresolved.

Kevin Buzzard (Mar 18 2023 at 00:34):

Can you ask the full question again eg with a working example? I can try and have a look because I feel like we're converging now.

Kevin Buzzard (Mar 18 2023 at 00:35):

But I'm too lazy to read the entire thread and put everything together, I wasn't following at the beginning because I've had a really busy week

Abu Al Hassan (Mar 18 2023 at 11:08):

I have opened up another question Prof. Hopefully I have explained my problem there sufficiently well.


Last updated: Dec 20 2023 at 11:08 UTC