Zulip Chat Archive

Stream: lean4

Topic: Breaking instance resolution loop

Tomas Skrivan (Oct 20 2021 at 19:30):

I'm trying to do some proof automation with typeclasses. I have two theorems that can potentially prove one another and lean gets stuck in an infinite loop while trying to synthesize an instance.

The new and new goals look like:

(a : α) → IsLin fun u => f (g u) a
(a a : α) → IsLin fun u => f (g u) a
(a a a : α) → IsLin fun u => f (g u) a
(a a a a : α) → IsLin fun u => f (g u) a

They are clearly the same goals, they just differ by an additional unused argument. Is there a way to break out of this loop and try to use different instances or fail?

mwe:

def is_linear {α β} (f : α → β) : Prop := sorry

class IsLin {α β} (f : α → β) where
  {proof : is_linear f}

variable {U V W: Type}
instance swap (f : α → U → V) [∀ a, IsLin (f a)] : IsLin (λ (u : U) (a : α) => f a u) := sorry
instance parm (f : U → α → V) (a : α) [IsLin f] : IsLin (λ (u : U) => f u a) := sorry
instance comp (f : V → W) (g : U → V) [IsLin f] [IsLin g] : IsLin λ u => f (g u) := sorry

set_option trace.Meta.synthInstance true
theorem test1 (f : V → α → W) [IsLin f] (g : U → V) [IsLin g] (a : α) : IsLin (λ u => f (g u) a) := by infer_instance

When the loop is straightforward, the loop gets broken and error failed to synthesize instance is reported.

def is_linear {α β} (f : α → β) : Prop := sorry

class IsLin {α β} (f : α → β) where
  {proof : is_linear f}

variable {U V W: Type}

instance (f : U → V → W → α) [IsLin f] : IsLin (λ v w u => f u v w) := sorry
instance (f : U → V → W → α) [IsLin (λ w u v => f u v w)] : IsLin f := sorry

set_option trace.Meta.synthInstance true
def loop_break (f : U → V → W → α) : IsLin f := by infer_instance

Tomas Skrivan (Oct 21 2021 at 13:14):

More investigation on this, we can have a look what is going on when doing these proofs manually.

def is_linear {α β} (f : α → β) : Prop := sorry

class IsLin {α β} (f : α → β) where
  {proof : is_linear f}

variable {U V W}

theorem swap (f : α → U → V) (h : ∀ a, IsLin (f a)) : IsLin (λ u a => f a u) := sorry
theorem parm (f : U → α → V) (hf : IsLin f) (a : α) : IsLin (λ (u : U) => f u a) := sorry
theorem comp (f : V → W) (g : U → V) (hf : IsLin f) (hg : IsLin g) : IsLin λ u => f (g u) := sorry

theorem test1 (f : V → α → W) [IsLin f] (g : U → V) [IsLin g] (a : α) : IsLin (λ u => f (g u) a) :=
by
  apply parm
  apply comp; assumption; assumption; done

theorem test2 (f : V → α → W) [IsLin f] (g : U → V) [IsLin g] : IsLin (λ u a => f (g u) a) :=
by
  apply swap; intro; apply parm
  apply comp; assumption; assumption; done

theorem test3 (f : V → α → β → W) [IsLin f] (g : U → V) [IsLin g] : IsLin (λ u a b => f (g u) a b) :=
by
  apply swap; intro; apply swap; intro; apply parm; apply parm
  apply swap; intro; apply parm  -- this does nothing but introduces unused `α` variable
  apply comp; assumption; assumption; done

Playing around with these theorems reveals that applying apply swap; intro; apply parm at certain stage does absolutely nothing but adds a new unused variable a : α. The typeclass resolution gets stuck at applying this over and over.

It feels like that the typeclass resolution should be able to detect this and stop. Is that the case? Or is it more complicated and fundamentally it can not be done? Or any idea how to restructure these instances such that they do not get into the loop but still automatically imply the same set of theorems?

Tomas Skrivan (Oct 26 2021 at 13:41):

I have looked into the paper "Tabled Typeclass Resolution", the cycle problem is discussed there. I have tried to come up with the Coe vs CoeT trick, but ultimately failed in doing so.

Is there any hope of a fix for this? I'm happy to wait for it. On the other hand, It looks like that this problem totally breaks my project I'm working on and I have to rethink it from ground up if this won't get a fix.

Mac (Oct 26 2021 at 16:37):

@Tomas Skrivan the problem here is essentially as you surmised, the swap and param instances overlap in a way that produces an infinite cycle. You can fix your original MWE by removing the swap instance and making the comp instance use the actual composition function. That is, this works:

def is_linear {α β} (f : α → β) : Prop := sorry

class IsLin {α β} (f : α → β) where
  {proof : is_linear f}

variable {U V W}
instance parm (f : U → α → V) (a : α) [IsLin f] : IsLin (λ (u : U) => f u a) := sorry
instance comp (f : V → W) (g : U → V) [IsLin f] [IsLin g] : IsLin (f ∘ g) := sorry

theorem test1 (f : V → α → W) [IsLin f] (g : U → V) [IsLin g] (a : α) : IsLin (λ u => f (g u) a) := by infer_instance
theorem test2 (f : V → α → W) [IsLin f] (g : U → V) [IsLin g] : IsLin (λ u a => f (g u) a) := by infer_instance

Mac (Oct 26 2021 at 16:41):

Note that deleting the swap instance though will prevent it from being inferred, though:

theorem test3 (f : α → U → V) [∀ a, IsLin (f a)] : IsLin (λ (u : U) (a : α) => f a u) := inferInstance -- error

That is, you can only have one of param or swap. They don't work together. This makes some sense as they are both flip the arguments of f but are stated in two different ways that Lean has a hard time reconciling.

Tomas Skrivan (Oct 26 2021 at 19:30):

Unfortunately, not having test3 automatically proven by infer_instancecompletely breaks my project I'm working on. I really need that to work.

Note that lean can detect and work with some kind of cycles. See my example loop_break in the first post.

Tomas Skrivan (Oct 26 2021 at 19:35):

Here is my attempt at the Coe + CoeT trick. In this case, I have IsLin and IsLin₂ := ∀ a, IsLin (f a).

For some unimportant reasons I have IsSmooth here instead of IsLin but it is the same thing.

Now I can automatically do the examples above, but the problem just gets pushed down and ptest2, ptest3 and ptest4 get stuck in an infinite loop.

def is_smooth {α β} (f : α → β) : Prop := sorry

class IsSmooth {α β} (f : α → β) : Prop where
  (proof : is_smooth f)

class IsSmooth₂ {α β γ} (f : α → β → γ) : Prop where
  (proof : ∀ a, is_smooth (f a))

instance to_smooth₂ (f : α → β → γ) [inst : ∀ a, IsSmooth (f a)] : IsSmooth₂ f := ⟨λ a => (inst a).proof⟩

instance swap (f : α → β → γ) [IsSmooth₂ f] : IsSmooth (λ b a => f a b) := sorry
instance parm (f : α → β → γ) [IsSmooth f] (b : β) : IsSmooth (λ a => f a b) := sorry
instance comp (f : β → γ) (g : α → β) [IsSmooth f] [IsSmooth g] : IsSmooth (f ∘ g) := sorry
instance comp₂ (f : β → δ → γ) (g : α → β) [IsSmooth f] [IsSmooth g] : IsSmooth₂ (λ d a => f (g a) d) := sorry

theorem test1 (f : β → δ → γ) [IsSmooth f] (g : α → β) [IsSmooth g] (d : δ) : IsSmooth (λ a => f (g a) d) := by infer_instance
theorem test2 (f : β → δ → γ) [IsSmooth f] (g : α → β) [IsSmooth g] : IsSmooth (λ a d => f (g a) d) := by infer_instance
theorem test3 (f : δ → β → γ) [∀ d, IsSmooth (f d)] (g : α → β) [IsSmooth g] : IsSmooth (λ a d => (f d (g a))) := by infer_instance

theorem stest1 (f : α → β → γ) [∀ a, IsSmooth (f a)] : IsSmooth (λ b a => f a b) := by infer_instance
theorem stest2 (f : α → β → γ → δ) [∀ a b, IsSmooth (f a b)] : IsSmooth (λ c b a => f a b c) := by infer_instance
theorem stest3 (f : α → β → γ → δ) [∀ a b, IsSmooth (f a b)] : IsSmooth (λ c a b => f a b c) := by infer_instance
theorem stest4 (f : α → β → γ → δ → ε) [∀ a b c, IsSmooth (f a b c)] : IsSmooth (λ d a b c => f a b c d) := by infer_instance

theorem ptest1 (f : α → β → γ) [IsSmooth f] (b : β) : IsSmooth (λ a => f a b) := by infer_instance
theorem ptest2 (f : α → β → γ → δ) [IsSmooth f] (b : β) (c : γ) : IsSmooth (λ a => f a b c) := by infer_instance  --- (deterministic) timeout at 'typeclass'
theorem ptest3 (f : α → β → γ → δ) [IsSmooth f] (b : β) : IsSmooth (λ a c => f a b c) := by infer_instance  --- (deterministic) timeout at 'typeclass'
theorem ptest4 (f : α → β → γ → δ) [IsSmooth f] (c : γ) : IsSmooth (λ a b => f a b c) := by infer_instance  --- (deterministic) timeout at 'typeclass'

Mac (Oct 26 2021 at 19:56):

Tomas Skrivan said:

Note that lean can detect and work with some kind of cycles. See my example loop_break in the first post.

Yes this is true, but this only generally works when the loop cycles back to an equivalent term. As you mentioned:

Tomas Skrivan said:

Playing around with these theorems reveals that applying apply swap; intro; apply parm at certain stage does absolutely nothing but adds a new unused variable a : α. The typeclass resolution gets stuck at applying this over and over.

It feels like that the typeclass resolution should be able to detect this and stop. Is that the case? Or is it more complicated and fundamentally it can not be done?

The loop in your case produces a different term at each cycle. Thus, the resolution procedure has no way of knowing that this a pointless progression (as there could be a case where adding dozens of unused variables results in hitting a new instance). Also, it should be noted that type classes are not really intended for automated proving, that is generally suppose to be the realm of tactics. For one, type class resolution follows a generic procedure that thus can't rely on domain specific assumptions / heuristics. A specific tactic for a case like this, on the other hand, could detect such loops and intelligently assume they are futile and instead do something else.

Tomas Skrivan (Oct 26 2021 at 20:08):

as there could be a case where adding dozens of unused variables results in hitting a new instance

Ohh, this didn't cross my mind. Is that really true? I thought that the kind of a sequence of terms I'm getting is clearly a loop that should be detected and stopped. I didn't realize that you might want to continue. Do you have an example where it is necessary?

Also, it should be noted that type classes are not really intended for automated proving, that is generally suppose to be the realm of tactics.

Fair point, and maybe I will have to write my own tactic. On the other hand, all I want is to use simp tactic that conditions certain rewrite rules on whether the expression satisfies certain conditions. I can check these conditions automatically using type classes. So I feel writing my own tactic would require more or less rewriting simp and type class resolution ... I do not want to do that and I'm not capable of doing that.

Yakov Pechersky (Oct 26 2021 at 20:38):

You could just provide a set of lemmas to simp that are those rewrite rules. In mathlib3, those are called simp sets. You can take a look at the implementation of field_simp for example

Mac (Oct 26 2021 at 21:35):

Tomas Skrivan said:

Ohh, this didn't cross my mind. Is that really true? I thought that the kind of a sequence of terms I'm getting is clearly a loop that should be detected and stopped. I didn't realize that you might want to continue. Do you have an example where it is necessary? .

Here is an extremely contrived one:

def is_linear {α β} (f : α → β) : Prop := sorry

class IsLin {α β} (f : α → β) where
  {proof : is_linear f}

variable {U V W}

theorem swap (f : α → U → V) (h : ∀ a, IsLin (f a)) : IsLin (λ u a => f a u) := sorry
theorem parm (f : U → α → V) (hf : IsLin f) (a : α) : IsLin (λ (u : U) => f u a) := sorry
theorem comp (f : V → W) (g : U → V) (hf : IsLin f) (hg : IsLin g) : IsLin λ u => f (g u) := sorry

theorem foo (f : V → α → W) [IsLin f] (g : U → V) [IsLin g] (x y z : α)
: IsLin (λ u => f (g u) z) := sorry

theorem test1 (f : V → α → W) [IsLin f] (g : U → V) [IsLin g] (a : α)
: IsLin (λ u => f (g u) a) :=
by
  apply parm; apply swap; intro x;
  apply parm; apply swap; intro y;
  apply parm; apply swap; intro z;
  revert x y z
  exact foo f g

The take away here is that any new term could itself be an instance/theorem/axiom that would resolve the search.

Tomas Skrivan (Oct 27 2021 at 08:09):

It is really contrived and I would argue not a valid example because when you replace theorem by instance it does not work.

The thing is that when foo is declared as an instance it does not synthesize itself:

instance foo (f : V → α → W) [IsLin f] (g : U → V) [IsLin g] (x y z: α)
: IsLin (λ u => f (g u) z) := sorry

theorem test1 (f : V → α → W) [IsLin f] (g : U → V) [IsLin g] (x y z: α)
 : IsLin (λ u => f (g u) z) := by infer_instance   -- failed to synthesize instance

It leads to an answer that contains metavariable and is rejected.

A simple example

set_option trace.Meta.synthInstance true
class Foo (α : Type) where
 (a : α)

instance instFoo [Inhabited α] (a : α) : Foo α := ⟨arbitrary⟩

theorem test [Inhabited α] : Foo α := by infer_instance

The trace contains:

[Meta.synthInstance.newAnswer] val: instFoo ?m.344
[Meta.synthInstance] skip answer containing metavariables instFoo ?m.348
[Meta.synthInstance] failed

Gabriel Ebner (Oct 27 2021 at 08:16):

instFoo is a bad instance due to the argument a : α, which is a non-class argument that does not occur anywhere else and hence can't be found via unification.

Gabriel Ebner (Oct 27 2021 at 08:20):

In mathlib (Lean 3) we have a linter that checks for these kinds of issues. This is what it prints:

/- The `unused_arguments` linter reports: -/
/- UNUSED ARGUMENTS. -/
#check @instFoo /- argument 3: (a : α) -/

/- The `impossible_instance` linter reports: -/
/- IMPOSSIBLE INSTANCES FOUND.
These instances have an argument that cannot be found during type-class resolution, and therefore can never succeed. Either mark the arguments with square brackets (if it is a class), or don't make it an instance. -/
#check @instFoo /- Impossible to infer argument 3: (a : α) -/

/- The `instance_priority` linter reports: -/
/- DANGEROUS INSTANCE PRIORITIES.
The following instances always apply, and therefore should have a priority < 1000.
If you don't know what priority to choose, use priority 100.
See note [lower instance priority] for instructions to change the priority. -/
#check @instFoo /- set priority below 1000 -/

Tomas Skrivan (Oct 27 2021 at 08:21):

Yakov Pechersky said:

You could just provide a set of lemmas to simp that are those rewrite rules. In mathlib3, those are called simp sets. You can take a look at the implementation of field_simp for example

Very good point! Simplification in a field sometimes needs a proof that certain terms are non-zero. I will look into it and hopefully get inspired.

Tomas Skrivan (Oct 27 2021 at 08:23):

@Gabriel Ebner Yup, agree that it is a bad instance. Thus I would argue that detecting the infinite loop I have described in the first post can(should?) be detected and stopped.

Gabriel Ebner (Oct 27 2021 at 08:31):

Okay, looking at your original example. You should think a bit about how you want to define "linear". You probably want some kind of typeclass for the vector spaces. docs#module is how we define it in mathlib.

Gabriel Ebner (Oct 27 2021 at 08:31):

With that in mind, I can't make much sense of test2.

Gabriel Ebner (Oct 27 2021 at 08:33):

Surprisingly this works and the composition instances don't even cause nontermination:

def is_linear {α β} (f : α → β) : Prop := sorry

class IsLin {α β} (f : α → β) where
  {proof : is_linear f}

variable {U V W}
instance comp (f : V → W) (g : U → V) [IsLin f] [IsLin g] : IsLin λ u => f (g u) := sorry
instance comp2 (f : V → α → W) (g : U → V) [IsLin f] [IsLin g] : IsLin λ u => f (g u) a := sorry

theorem test1 (f : V → α → W) [IsLin f] (g : U → V) [IsLin g] : IsLin (λ u => f (g u) a) := by
  infer_instance
theorem test1' (f : V → α → W) [IsLin f] (g : V → V) [IsLin g] : IsLin (λ u => f (g (g u)) a) := by
  infer_instance

Tomas Skrivan (Oct 27 2021 at 09:04):

@Gabriel Ebner In my real code, I have a definition of vector space and so on. Here I'm just posting stripped down version of it to demonstrate my problem.

Concerning the test2, it is very subtle. Consider function f : V -> V -> R that is bilinear, e.g. inner product. There are kind of two ways you can ask about linearity in the first argument, either you fix the second argument as y and ask if fun x => f x y is linear? Or you can ask if f is linear as a function from vector space V to vector space V -> R.
This distinction is not very interesting for linearity but this is very crucial and subtle in variational calculus when dealing with differentiability.

See my other earlier post, for more tests I want to get working:
Tomas Skrivan said:

Here is my attempt at the Coe + CoeT trick. In this case, I have IsLin and IsLin₂ := ∀ a, IsLin (f a).

For some unimportant reasons I have IsSmooth here instead of IsLin but it is the same thing.

Now I can automatically do the examples above, but the problem just gets pushed down and ptest2, ptest3 and ptest4 get stuck in an infinite loop.

def is_smooth {α β} (f : α → β) : Prop := sorry

class IsSmooth {α β} (f : α → β) : Prop where
  (proof : is_smooth f)

class IsSmooth₂ {α β γ} (f : α → β → γ) : Prop where
  (proof : ∀ a, is_smooth (f a))

instance to_smooth₂ (f : α → β → γ) [inst : ∀ a, IsSmooth (f a)] : IsSmooth₂ f := ⟨λ a => (inst a).proof⟩

instance swap (f : α → β → γ) [IsSmooth₂ f] : IsSmooth (λ b a => f a b) := sorry
instance parm (f : α → β → γ) [IsSmooth f] (b : β) : IsSmooth (λ a => f a b) := sorry
instance comp (f : β → γ) (g : α → β) [IsSmooth f] [IsSmooth g] : IsSmooth (f ∘ g) := sorry
instance comp₂ (f : β → δ → γ) (g : α → β) [IsSmooth f] [IsSmooth g] : IsSmooth₂ (λ d a => f (g a) d) := sorry

theorem test1 (f : β → δ → γ) [IsSmooth f] (g : α → β) [IsSmooth g] (d : δ) : IsSmooth (λ a => f (g a) d) := by infer_instance
theorem test2 (f : β → δ → γ) [IsSmooth f] (g : α → β) [IsSmooth g] : IsSmooth (λ a d => f (g a) d) := by infer_instance
theorem test3 (f : δ → β → γ) [∀ d, IsSmooth (f d)] (g : α → β) [IsSmooth g] : IsSmooth (λ a d => (f d (g a))) := by infer_instance

theorem stest1 (f : α → β → γ) [∀ a, IsSmooth (f a)] : IsSmooth (λ b a => f a b) := by infer_instance
theorem stest2 (f : α → β → γ → δ) [∀ a b, IsSmooth (f a b)] : IsSmooth (λ c b a => f a b c) := by infer_instance
theorem stest3 (f : α → β → γ → δ) [∀ a b, IsSmooth (f a b)] : IsSmooth (λ c a b => f a b c) := by infer_instance
theorem stest4 (f : α → β → γ → δ → ε) [∀ a b c, IsSmooth (f a b c)] : IsSmooth (λ d a b c => f a b c d) := by infer_instance

theorem ptest1 (f : α → β → γ) [IsSmooth f] (b : β) : IsSmooth (λ a => f a b) := by infer_instance
theorem ptest2 (f : α → β → γ → δ) [IsSmooth f] (b : β) (c : γ) : IsSmooth (λ a => f a b c) := by infer_instance  --- (deterministic) timeout at 'typeclass'
theorem ptest3 (f : α → β → γ → δ) [IsSmooth f] (b : β) : IsSmooth (λ a c => f a b c) := by infer_instance  --- (deterministic) timeout at 'typeclass'
theorem ptest4 (f : α → β → γ → δ) [IsSmooth f] (c : γ) : IsSmooth (λ a b => f a b c) := by infer_instance  --- (deterministic) timeout at 'typeclass'

Gabriel Ebner (Oct 27 2021 at 09:25):

I still find this hard to parse. Some general advice:

Avoid instances like swap, they're not going to help you.
Compositionality instances like comp seem to work surprisingly well in Lean 4. Try to write more of them, and as general as possible. Instances like this: IsSmooth f → IsSmooth g → IsSmooth h → IsSmooth λ x => f (g x) (h x) or IsSmooth f → IsSmooth g → IsSmooth h → IsSmooth λ x y => f (g x y) (h x y).

Tomas Skrivan (Oct 27 2021 at 09:31):

Avoid instances like swap, they're not going to help you.

I would like to avoid it, I just do not see how.

Can you write down some small set of instances that will pass all tests in the example above? Furthermore, tests stest* and ptest* naturally extend to infinity for more and more arguments. Will all of those pass?

I'm trying to come up with this small set of instances for couple of weeks/months now and I'm still not there yet. Either I get into an infinite loop or I need to write down an infinite number of instances in order to pass all stest* for example.

Tomas Skrivan (Oct 27 2021 at 09:34):

Try to write more of them, and as general as possible.

Yes, I have plenty more. Mainly for doubling arguments as you have pointed out. But reordering arguments is the problem and I feel that something like swap is necessary.

Tomas Skrivan (Oct 27 2021 at 09:42):

they're not going to help you

In the example above, if you comment out swap then test3 and all stest* will fail.
ptest2 starts working but ptest3 and ptest4 are still not working because they need both swap and parm.

Gabriel Ebner (Oct 27 2021 at 09:55):

I'm still not sure I understand how you're trying to define IsSmooth but if IsSmooth (f : α → β → γ → δ) means f is smooth in the first argument, then this should work:

def is_smooth {α β} (f : α → β) : Prop := sorry

class IsSmooth {α β} (f : α → β) : Prop where
  (proof : is_smooth f)

instance : IsSmooth fun a : α => a := sorry
instance (b : β) : IsSmooth fun a : α => b := sorry
instance param (f : α → β → γ) [∀ a, IsSmooth fun b => f a b] : IsSmooth λ b a => f a b := sorry
instance comp (f : β → γ) (g : α → β) [IsSmooth f] [IsSmooth g] : IsSmooth fun x => f (g x) := sorry
instance comp2 (f : β → δ → γ) (g : α → β) (h : α → δ) [IsSmooth f] [IsSmooth g] [IsSmooth h] : IsSmooth (λ a => f (g a) (h a)) := sorry
instance comp3 (f : β → δ → ι → γ) (g : α → β) (h : α → δ) (i : α → ι) [IsSmooth f] [IsSmooth g] [IsSmooth h] [IsSmooth i] :
  IsSmooth (λ a => f (g a) (h a) (i a)) := sorry

theorem test1 (f : β → δ → γ) [IsSmooth f] (g : α → β) [IsSmooth g] (d : δ) : IsSmooth (λ a => f (g a) d) := by infer_instance
theorem test2 (f : β → δ → γ) [IsSmooth f] (g : α → β) [IsSmooth g] : IsSmooth (λ a d => f (g a) d) := by infer_instance
theorem test3 (f : δ → β → γ) [∀ d, IsSmooth (f d)] (g : α → β) [IsSmooth g] : IsSmooth (λ a d => (f d (g a))) := by infer_instance

example (f : β → γ) (g : β → β) [IsSmooth f] [IsSmooth g] : IsSmooth (fun x => f (g (g x))) := by infer_instance
theorem stest1 (f : α → β → γ) [∀ a, IsSmooth (f a)] : IsSmooth (λ b a => f a b) := by infer_instance
theorem stest2 (f : α → β → γ → δ) [∀ a b, IsSmooth (f a b)] : IsSmooth (λ c b a => f a b c) := by infer_instance
theorem stest3 (f : α → β → γ → δ) [∀ a b, IsSmooth (f a b)] : IsSmooth (λ c a b => f a b c) := by infer_instance
theorem stest4 (f : α → β → γ → δ → ε) [∀ a b c, IsSmooth (f a b c)] : IsSmooth (λ d a b c => f a b c d) := by infer_instance

theorem ptest1 (f : α → β → γ) [IsSmooth f] (b : β) : IsSmooth (λ a => f a b) := by infer_instance
theorem ptest2 (f : α → β → γ → δ) [IsSmooth f] (b : β) (c : γ) : IsSmooth (λ a => f a b c) := by infer_instance  --- (deterministic) timeout at 'typeclass'
theorem ptest3 (f : α → β → γ → δ) [IsSmooth f] (b : β) : IsSmooth (λ a c => f a b c) := by infer_instance  --- (deterministic) timeout at 'typeclass'
theorem ptest4 (f : α → β → γ → δ) [IsSmooth f] (c : γ) : IsSmooth (λ a b => f a b c) := by infer_instance  --- (deterministic) timeout at 'typeclass'

Tomas Skrivan (Oct 27 2021 at 11:39):

(deleted)

Tomas Skrivan (Oct 27 2021 at 12:04):

@Gabriel Ebner Very cool! I was unable to figure this out, it looks very promising. I have many more tests I will throw at it to see if it scales, but it looks really promising. Thank you very much!

The definition of IsSmooth is saying that f is smooth only in the first argument. Because of this, the example you gave is not entirely correct as comp2 additionally requires smoothness of f in the second argument and comp3 additionally requires smoothness of f in the second and third argument.

It also requires to add variants of comp2 and comp3, but to get these tests running only one additional variant is necessary. I will investigate if that is the case all the time.

instance comp2_00 (f : β → δ → γ) (g : α → β) (h : α → δ) [IsSmooth f] [∀ b, IsSmooth (f b)] [IsSmooth g] [IsSmooth h] : IsSmooth (λ a => f (g a) (h a)) := sorry
instance comp2_01 (f : β → δ → γ) (g : α → β) [IsSmooth f] [IsSmooth g] (d) : IsSmooth (λ a => f (g a) d) := sorry
-- instance comp2_10 (f : β → δ → γ) (h : α → δ) [IsSmooth f] [IsSmooth h] (b) : IsSmooth (λ a => f b (h a)) := sorry


instance comp3_000 (f : β → δ → ι → γ) (g : α → β) (h : α → δ) (i : α → ι) [IsSmooth f] [∀ b, IsSmooth (f b)] [∀ b d, IsSmooth (f b d)] [IsSmooth g] [IsSmooth h] [IsSmooth i] :
  IsSmooth (λ a => f (g a) (h a) (i a)) := sorry
-- instance comp3_001 (f : β → δ → ι → γ) (g : α → β) (h : α → δ) [IsSmooth f] [∀ b, IsSmooth (f b)] [IsSmooth g] [IsSmooth h] (i) :
--   IsSmooth (λ a => f (g a) (h a) i) := sorry
-- instance comp3_010 (f : β → δ → ι → γ) (g : α → β) (i : α → ι) [IsSmooth f] [∀ b d, IsSmooth (f b d)] [IsSmooth g] [IsSmooth i] (h) :
--   IsSmooth (λ a => f (g a) h (i a)) := sorry
instance comp3_011 (f : β → δ → ι → γ) (g : α → β) [IsSmooth f] [IsSmooth g] (d i) :
  IsSmooth (λ a => f (g a) d i) := sorry
-- instance comp3_100 (f : β → δ → ι → γ) (h : α → δ) (i : α → ι) [∀ b, IsSmooth (f b)] [∀ b d, IsSmooth (f b d)] [IsSmooth h] [IsSmooth i] (g) :
--   IsSmooth (λ a => f g (h a) (i a)) := sorry
-- instance comp3_101 (f : β → δ → ι → γ) (h : α → δ) [∀ b, IsSmooth (f b)] [IsSmooth h] (g i) :
--   IsSmooth (λ a => f g (h a) i) := sorry
-- instance comp3_110 (f : β → δ → ι → γ) (i : α → ι) [∀ b d, IsSmooth (f b d)] [IsSmooth i] (g h) :
--   IsSmooth (λ a => f g h (i a)) := sorry

It seems to work really nicely, for example the instance you suggested previously works automatically now

example (f : α → β → γ) (g : δ → ε → α) (h : δ → ε → β) [IsSmooth f] [∀ a, IsSmooth (f a)] [IsSmooth g] [IsSmooth h]
  : IsSmooth (λ x y => f (g x y) (h x y)) := by infer_instance

Tomas Skrivan (Oct 27 2021 at 12:11):

A quick investigation reveals that if we have comp2_** and comp3_0** then all comp3_1** follow automatically. It kind of make sense. Hopefully it means that there is no combinatirial blow up of instances when writing down all instance from comp1 tocomp10(10 should be enough for any practical purposes)

Tomas Skrivan (Oct 27 2021 at 12:20):

Super cool, this also works when we add an additional instance eval:

instance eval (x : α) : IsSmooth (λ f : α → β => f x) := sorry

example (f : β → δ → γ) (h : α → δ) [IsSmooth f] : IsSmooth (λ (g : α → β) a => f (g a) (h a)) := by infer_instance

Kevin Buzzard (Oct 27 2021 at 12:54):

Dare I ask why it is that from your description of the problem, you are doing what looks to me like mathematics, and yet you are writing things like comp10 which look to me like lemmas which we would never have in mathlib simply because they'd be of no use in practice? I think we rarely need to go beyond analogues of comp2 when proving the kinds of theorems we have in mathlib. I guess what I'm saying is that I'm interested in hearing a little more about what you're doing.

Tomas Skrivan (Oct 27 2021 at 13:14):

This is all for doing symbolic differentiation. The kicker is that I want to do symbilic differentiation in variational calculus, i.e. differentiate function w.r.t. to another functions. Ultimately, I want to write down the integral of Lagrangian, take gradient of it to get Euler-Lagrange equations. Once I have that, I can go further and do similar calculations using GENERIC formalism used in non-equlibrium thermodynamics to derive various PDEs for complicated fluids. Calculations there are huge pain and chore. I want a computer to do that for me.

I could just write down the rules for differentiation and how to manipulate those expressions, but I'm afraid of bugs and accidentally writing down incorrect simplification rule that I will not be able to track down later on. That is the reason I'm using proof assistance. Plus it is fun to learn.

From formal point of view. Variational calculus is famously subtle subject as space of differentiable function between Banach spaces is not a Banach space etc. For this reason I'm using the notion of convenient vector spaces, they form cartesian closed category, so I can do lambda calculus there. So the exercise here is more or less proving that a certain lambda function in Type category is also lambda function in category of Convenient Vector Spaces.

Tomas Skrivan (Oct 27 2021 at 13:24):

And I want to take this further, formalize many numerical algorithms and use Lean to help me to assemble programs that solve PDEs.

A simple example of what I'm trying to achieve. Simulation of harmonic oscillator:

First you define the Hamiltonian:

def H (m k : ℝ) (x p : V) := 1/(2*m) * ⟨p,p⟩ + k/2 * ⟨x, x⟩

Then you state that you want an implementation of ODE solver for this particular Hamiltonian system:

def solver (m k : ℝ) (steps : Nat) : Impl (ode_solve (HamiltonianSystem (H m k))) :=

Then you turn into a tactic mode where you guide the compute how to actually build the program:

by
  impl_check (m.toFloat>0) "Mass has to be non zero."

  simp [HamiltonianSystem, uncurry, H, gradient];
  autograd

  rw [ode_solve_fixed_dt runge_kutta4_step]
  lift_limit steps "Number of ODE solver steps."; admit; simp

  finish_impl

In order,

1. just preform a simple runtime check
2. expand some definitions and run automatic differentiation to arrive at Hamiltons equations
3. replace `ode_solve` with a particular numerical scheme
4. currently, replacing `ode_solve` with a numerical scheme stills leaves a limit there such that you  can  proof that the numerical scheme is actually doing what you want. The command `lift_limit` just says, ok at this point I'm giving up on implementing exactly the original goal, I'm going to implement just an approximation. 
5. `finish_impl` is just `done` tactic. In future I want to check that the goal is actually computable function that you can run.

An example how would you then use solver to actually run the simulation. It is hopefully quite self explanatory.

def harmonic_oscillator_main : IO Unit := do

  let steps := 100
  let m := 1.0
  let k := 10.0

  let evolve ← (solver m k steps).assemble

  let t := 1.0
  let x₀ := (1.0, 0.5)
  let p₀ := (0.0, 0.0)
  let (x,p) := evolve t (x₀, p₀)

  IO.println s!"In {t} seconds the harmonic oscillator evolved from ({x₀}, {p₀}) to ({x},{p})."

#eval harmonic_oscillator_main

Tomas Skrivan (Oct 27 2021 at 13:30):

One of my big goals is to create Finite Element library that can be differentiated w.r.t. almost anything you can think of. One big applications of this would be topology optimization . Currently, I'm just focusing on fundamentals and how to get all the core moving pieces right such that I can build something like that.

Tomas Skrivan (Oct 27 2021 at 13:35):

Also, my day job is to write physics simulation software that is used in visual effects. There you need to write incredibly complicated solvers that talk to each other. For example, you want to create a bear swimming in a river, you need to simulate his skin+flesh, his fur and the water. Ideally all of these elements should influence each another. Writing all of this is super hard, especially if all these elements are supposed to affect each another. So I just want to explain to the computer the mathematics behind these algorithms and then the computer will help me to write such a program.

Yakov Pechersky (Oct 27 2021 at 13:35):

In general, you probably don't want to rely solely on the TC inference system to provide proofs of behaviors of certain functions. You'll likely want to build a custom tactic to do so. For example, the "smoothness" of 1/x^2 might be used in a proof, but of course isn't smooth everywhere. Perhaps in the proof you do know you don't care about evaluating at 0. How would the TC know that constraint? You might try to bake [fact (0 < x)] into the TC in the proof context. But then you'll have way more instance searching in the cases when that fact isn't available.

Yakov Pechersky (Oct 27 2021 at 13:36):

An example of such a tactic is continuity.

Gabriel Ebner (Oct 27 2021 at 13:38):

I don't think it's such a crazy idea to try out the TC system for this. Type class synthesis is a lot nicer now, maybe we can get away without writing by continuity in many cases?

Tomas Skrivan (Oct 27 2021 at 13:39):

@Yakov Pechersky Yes, this is definitely a problem and something I should think about. So currently, I'm focusing only on functions that are smooth everywhere or not smooth at all. My current plan to deal with ||x|| is to approximate it with sqrt( ||x||^2 + eps^2 ) which is smooth and non-zero everywhere. Do all the automatic differentiations with this approximation and then set eps to zero or close to zero.

Note that this is for example commonly used in N-body simlation to improve numerical stability.

Tomas Skrivan (Oct 27 2021 at 13:41):

One, maybe not so good, reason why I'm using TC for this is if you can prove that a function is linear then it is super easy to differentiate it and you do not have to decompose it into smaller parts and differentiate those.

And I do not know how would I do that if I would have to manually invoke a tactic somewhere.

However, I should have a closed look to field_simp as you have mentioned. Unfortunately, I'm not sufficiently familiar with all the details of simp tactic and in its core.

Tomas Skrivan (Oct 27 2021 at 17:42):

Kevin Buzzard said:

Dare I ask why it is that from your description of the problem, you are doing what looks to me like mathematics, and yet you are writing things like comp10 which look to me like lemmas which we would never have in mathlib simply because they'd be of no use in practice? I think we rarely need to go beyond analogues of comp2 when proving the kinds of theorems we have in mathlib. I guess what I'm saying is that I'm interested in hearing a little more about what you're doing.

Ohh, I realized that I went on a rant without giving a concrete example.

Consider neural network with n layers. The type of the neural network f can be

f : W₁ × ... × Wₙ → X → Y

where Wᵢ are the weights for i-th layer, X is the image and Y is the output space.

To do backpropagation I will apply the bakckpropagation operator 𝓑

noncomputable
def backprop (f : U → V) : U → V×(V→U) := λ x => (f x, (δ f x)†)

prefix:max "𝓑" => backprop

where (δ f x)† is the adjoint of the differential of f at x

The main theorem for the backpropagation operator is:

@[simp]
theorem backprop_of_composition (f : V → W) [IsSmooth f] (g : U → V) [IsSmooth g]
    : 𝓑 (f ∘ g) = (𝓑 f • 𝓑 g)

where • is a special kind of composition such that the above is true, see The simple essence of automatic differentiation. Simply put, you just compose the functions, and compose the adjoints of their differentials.

What I fear that at some point you will work with curried version of the network

f : W₁ →  ... → Wₙ → X → Y

at that point theorems like comp10 might be necessary to automatically fetch instances of IsSmooth in backprop_of_composition.

Tomas Skrivan (Oct 27 2021 at 19:33):

@Gabriel Ebner The whole thing breaks again and loops indefinitely on the most simple example:

example (f : β → γ) (g : α → β) [IsSmooth f] [IsSmooth g] : IsSmooth (f ∘ g) = by infer_instance  -- (deterministic) timeout at 'typeclass'

The culprit is the instance comp3_000.

It is a very similar failure case as in my original example. Successive goals are:

∀ (b : β → γ), IsSmooth (Function.comp b)
∀ (b : β → γ), α → ∀ (b : β → γ), IsSmooth (Function.comp b)
∀ (b : β → γ), α → ∀ (b : β → γ), α → ∀ (b : β → γ), IsSmooth (Function.comp b)
∀ (b : β → γ), α → ∀ (b : β → γ), α → ∀ (b : β → γ), α → ∀ (b : β → γ), IsSmooth (Function.comp b)
....

I really feel that the typeclass resolution should stop when it tries to resolve a goal of type (a : α) → C where C is some class that does not contain a. As you mentioned, in Lean 3 providing such an instance is considered invalid/bad. Why should the typeclass resolution keep looking for such instances?

Tomas Skrivan (Oct 29 2021 at 10:12):

To further motivate what I'm trying to do: I want some kind of automatic type deduction for "lambda's arrows". In mathlib you have linear_map or smooth_map, I want to write a normal lambda and automatically deduce it is linear_map or smooth_map when I pass it to a function that is expecting one of those.

Something similar that is happening with types, when you write fun x => x and pass it somewhere, the type of x will be automatically deduced. You do not have to say somewhere "show that type of x is X" as it would be required for "lambda's arrow" when specialized tactic like by continuity is used(I might be wrong about this, happy to be proven incorrect)

I have actually tried working with morphisms like linear_map and smooth_map but I managed to hit a wall that I wasn't able to pass. Quite often I had affine_map that after couple of transformations actually happen to be linear_map and there wasn't a clear point where I should convert affine to linear.

When looking at this from the point of "typing lambda's arrow" then maybe this should all be done on the level of elaborator but I do not understand it and I would probably get nowhere when I would try to do so. Therefore, I'm trying to use typeclass system to build a prototype for this.

Tomas Skrivan (Oct 29 2021 at 10:22):

To give a "nonsense" description. If you have a category C and forgetful functor F : C → Type the typeclass system can happy tell you that for a type α : Obj(Type) if there is A : Obj(C) such that F(A) = α. This the standard trick in mathlib where you work with the unbundled types and the typeclass system on the fly fetches the structure of group, ring or whatever. What I'm asking is, given a function f : Morph(Type) is there a morphism m : Morph(C) such that F(m) = f?

If the typeclass system can do this lifting for objects, why not morphisms?

Tomas Skrivan (Oct 30 2021 at 19:31):

I have attempted to modify typeclass synthesis and it works!(partially). In src/Lean/Meta/SynthInstance.leanI have replaced

def mkTableKey (mctx : MetavarContext) (e : Expr) : Expr := do
  MkTableKey.normExpr e mctx |>.run' {}

with

def mkTableKey (mctx : MetavarContext) (e : Expr) : MetaM Expr := do
  pure (MkTableKey.normExpr (← removeUnusedArguments e) mctx |>.run' {})

where removeUnusedArguments, as the name suggests, removes unused arguments. For example, if e is (a : α) → b where b does not contain a then e is replaced with b.

Surprisingly, this modification breaks only 3 out of 1006 current unit tests. In order to keep Lean working as usual I remove unused arguments only when looking for an instance of IsSmooth. The implementaiont of removeUnusedArguments is:

def removeUnusedArguments (e : Expr) : MetaM Expr := do
match e with
  | Expr.forallE .. => forallTelescope e fun xs b => do
    if (b.getAppFn.constName! == `IsSmooth) then
      let mut ys : List Expr := []
      for x in xs do
        if (b.containsFVar x.fvarId!) then
          ys := ys.concat x
      mkForallFVars ys.toArray b
    else
      pure e
  | _ => e

With this modification, all the examples I have posted here are automatically solved when these six simple instances are provided:

instance identity : IsSmooth fun a : α => a := sorry
instance const (b : β) : IsSmooth fun a : α => b := sorry
instance swap (f : α → β → γ) [∀ a, IsSmooth (f a)] : IsSmooth (λ b a => f a b) := sorry
instance parm (f : α → β → γ) [IsSmooth f] (b : β) : IsSmooth (λ a => f a b) := sorry
instance diag (f : β → δ → γ) (g : α → β) (h : α → δ) [IsSmooth f] [∀ b, IsSmooth (f b)] : IsSmooth (λ a => f (g a) (h a)) := sorry
instance comp (f : β → γ) (g : α → β) [IsSmooth f] [IsSmooth g] : IsSmooth (f ∘ g) := sorry

Last updated: May 02 2025 at 03:31 UTC