Zulip Chat Archive

Stream: new members

Topic: Confusing issue with exact


Sam (Aug 07 2022 at 16:37):

Having just closed one help thread, I'm immediately back with another. Sorry!

I'm experimenting with defining findp (probably a bad name) on list. It's a version of find that requires a proof that a matching item exists, and is therefore able to return α instead of option α. I'm using it to define another function overlap, which gives the longest common tail and prefix between two lists. This is all just for learning purposes.

I've run into an issue where two lemmas that look very similar to me (findp_mem and findp_pred) seem to behave differently. One works fine with a simple exact, the other gives a confusing error.

Apologies for the long code sample! I've stripped this down to only the relevant lemmas, and replaced irrelevant bodies with sorry to try to keep things reasonably compact.

import data.list

universe u

namespace list

-- There exists some item that is in the list and matches a predicate.
def containsp
{α : Type u}
(l : list α)
(p : α  Prop) [decidable_pred p] :=
 (a : α), a  l  p a

-- find is some when containsp.
@[simp]
lemma find_containsp
{α : Type u}
(p : α  Prop) [decidable_pred p]
(l : list α)
(q : l.containsp p) :
(l.find p).is_some := sorry


-- Version of find that takes in a proof of containsp.
-- Avoids exposing option publicly, because we know containsp.
def findp
{α : Type u}
(p : α  Prop) [decidable_pred p]
(l : list α)
(q : l.containsp p) : α :=
@option.get _ (find p l) (find_containsp p l q)


-- The return value of findp is a member of the list.
@[simp]
lemma findp_mem
{α : Type u}
{p : α  Prop} [decidable_pred p]
{l : list α}
{q : l.containsp p} :
(l.findp p q)  l := sorry

-- The return value of findp matches the predicate.
@[simp]
lemma findp_pred
{α : Type u}
{p : α  Prop} [decidable_pred p]
{l : list α}
{q : l.containsp p} :
p (l.findp p q) := sorry


-- A common tail and suffix always exist between two lists.
-- (Allowing [])
lemma overlap_exists_aux
{α : Type u} [decidable_eq α]
(l₁ l₂ : list α) :
l₁.tails.containsp (<+: l₂) := sorry

-- The longest tail/suffix of l₁ that is also a prefix of l₂.
def overlap {α : Type u} [decidable_eq α] (l₁ l₂ : list α) : list α :=
  findp (<+: l₂) l₁.tails (overlap_exists_aux l₁ l₂)

-- The overlap is in l₁.tails by trivial application of findp_mem.
@[simp]
lemma overlap_tail
{α : Type u} [decidable_eq α]
{l₁ l₂ : list α} :
(overlap l₁ l₂)  l₁.tails :=
begin
  rw overlap,
  exact findp_mem,
end

set_option pp.proofs true

-- The overlap is a prefix of l₂ by trivial application of findp_pred.
-- Except that this doesn't work...
@[simp]
lemma overlap_prefix
{α : Type u} [decidable_eq α]
{l₁ l₂ : list α} :
(<+: l₂) (overlap l₁ l₂) :=
begin
  rw overlap,
  exact findp_pred, -- Apparently not
end

end list

The issue is at the end, in overlap_prefix.

The exact fails to match, and I don't understand why. I can sort of understand that it might be something to do with matching up the q proof parameter to findp, but I don't understand why it would work in overlap_tail and not in overlap_prefix. None of the standard auto solving tactics are getting anywhere to give me a clue here.

So a multipart question: What's the explanation for these behaving differently, and what can I do to fix it?

Sorry for spamming Zulip with so many questions these last few days! Lean is fun, but my debugging skills from traditional programming aren't getting me very far here yet!

Kyle Miller (Aug 07 2022 at 16:48):

It seems to need some help to be able to deduce what p is. That needs some higher-order unification I believe, or at least some unification that goes down the wrong path somewhere.

I didn't golf the proof here, I just left it in the form where I got it to work.

lemma overlap_prefix
{α : Type u} [decidable_eq α]
{l₁ l₂ : list α} :
(<+: l₂) (overlap l₁ l₂) :=
begin
  rw overlap,
  have := @findp_pred _ (<+: l₂),
  apply this,
end

Kyle Miller (Aug 07 2022 at 16:49):

lemma overlap_prefix
{α : Type u} [decidable_eq α]
{l₁ l₂ : list α} :
(<+: l₂) (overlap l₁ l₂) :=
@findp_pred _ (<+: l₂) _ _ _

Sam (Aug 07 2022 at 16:51):

Huh. Thanks!

It's cool that that works, but do we have an explanation for why that's needed for findp_pred and not for findp_mem?

Kyle Miller (Aug 07 2022 at 16:54):

I haven't dug into it, but my guess is it's the difference between a list argument appearing twice in the expression and a predicate argument.

Sam (Aug 07 2022 at 16:55):

I don't know how much this applies to Lean, but from my traditional programming background I care a lot about creating high quality APIs. It feels like I'm doing something wrong if findp_pred is hard to use in this way. Is there something I can change about the definition that will make the simple use (exact findp_pred) work? I don't like providing a lemma that library_search isn't even able to find for some reason.

Sam (Aug 07 2022 at 16:56):

And it seems problematic that simp also doesn't know how to use it.

Kyle Miller (Aug 07 2022 at 16:59):

simp can't apply lemmas unless they have a symbolic "head", so for example it has to be about f x y where f is not an argument, but an actual constant

Kyle Miller (Aug 07 2022 at 17:00):

That's because it stores all its lemmas in a table indexed by the head for performance

Kyle Miller (Aug 07 2022 at 17:01):

A simple change to findp_pred is to make p be an explicit argument, and then say that findp_pred is a technical lemma that you apply in specific situations. I'm not sure if there's a good way to ensure Lean will solve the unification constraints correctly on its own (maybe someone has a trick though)

Kyle Miller (Aug 07 2022 at 17:02):

By the way, you can write the goal as overlap l₁ l₂ <+: l₂, this isn't getting in the way of typechecking

Sam (Aug 07 2022 at 17:05):

Huh. So the reason findp_mem does work with simp is that mem is a constant head. That makes some sense.

Kyle Miller (Aug 07 2022 at 17:06):

I think generally findp_pred would have the l.containsp p argument be explicit, since in practice you probably would want it to be explicit (and a user can always supply a _ if not)

Kyle Miller (Aug 07 2022 at 17:08):

By the way, there's a more accurate rule for simp. Simp has two levels of tables: each relation can have simp lemmas. So, for example, it needs to be f x y = foo with f a constant. There is a special case that every Prop simp lemma that proves foo turns into foo = true and gets added to the = table.

Sam (Aug 07 2022 at 17:12):

Kyle Miller said:

A simple change to findp_pred is to make p be an explicit argument, and then say that findp_pred is a technical lemma that you apply in specific situations. I'm not sure if there's a good way to ensure Lean will solve the unification constraints correctly on its own (maybe someone has a trick though)

To clarify, you're not saying that this would enable simp and similar to use this lemma automatically, are you? I've been experimenting with making arguments explicit but it doesn't seem to have changed that behaviour.

Kyle Miller (Aug 07 2022 at 17:13):

You are correct, I'm not suggesting that simp would be able to handle it. It's just easier to use the lemma if it's an explicit argument.

Sam (Aug 07 2022 at 17:15):

Oh! I think maybe I've just understood something weird about the find (not findp) lemmas!

Sam (Aug 07 2022 at 17:15):

They're defined like this

list.find_some
{α : Type u}
{p : α  Prop} [decidable_pred p]
{l : list α}
{a : α}
(H : list.find p l = option.some a) :
p a

This implicit a feels weird, but is that a trick to get around this problem?

Kyle Miller (Aug 07 2022 at 17:15):

A way to think about the relationship between findp_pred and overlap_prefix is that you're instantiating a generic family of lemmas into a concrete case. (Some programming languages, like Ada, require you to instantiate generics explicitly, unlike languages like C++ or Java that try to be more automatic.)

Kyle Miller (Aug 07 2022 at 17:16):

I think you'd usually find a to be implicit like that in mathlib since it's determined by H in a straightforward way

Kyle Miller (Aug 07 2022 at 17:17):

but, depending on how you intend to use find_some, you might choose to make it explicit

Kyle Miller (Aug 07 2022 at 17:18):

Design consideration: do you think you'll have list.find p l = option.some a proofs in that form lying around? If so, then it'll be easy to use, if not, maybe make some of these arguments explicit so you can more easily provide a proof of H inline and have the type all set up for you, or accept that you'll write (begin ... end : list.find p l = option.some a) with the relevant p, l, and a

Kyle Miller (Aug 07 2022 at 17:19):

(In Lean 4 this is less of an issue because you can write list.find_some (l := [1,2,3]) my_H to specify l)

Sam (Aug 07 2022 at 17:20):

I was wondering whether something like this was somehow better, although it looks like it would still have the same head problem.

@[simp]
lemma findp_pred2
{α : Type u}
{p : α  Prop} [decidable_pred p]
{l : list α}
{q : l.containsp p}
{a : α}
(H : (l.findp p q) = a) :
p a := sorry

Kyle Miller (Aug 07 2022 at 17:22):

That's a useful pattern sometimes, for example if you want to apply findp_pred2 and have it generate an equality goal for you

Kyle Miller (Aug 07 2022 at 17:22):

But I'm not sure it's better or worse. You can also use docs#convert to generate those equality goals

Sam (Aug 07 2022 at 17:26):

Hmm, interesting.

Sam (Aug 07 2022 at 17:27):

So to summarise, probably the best thing that I can do is make p explicit, so that I can do exact findp_pred (<+: l₂),.

But there's nothing much I can do about library_search, simp, and other automation tactics not being able to use this automatically.

Sam (Aug 07 2022 at 17:29):

And maybe make q explicit too. Although so far I've not found a need to specify it explicitly, because it seems to always be inferred from findp.

Kyle Miller (Aug 07 2022 at 17:29):

Yeah, simp is impossible (though of course it will find overlap_prefix), and library_search is unlikely (since exact/apply fails, which is how it finds relevant lemmas -- it really tries to apply the entire library!)

Kyle Miller (Aug 07 2022 at 17:30):

The point about q being explicit is that you're not always going to have a goal to unify against -- you might want to create a p (l.findp p q) proof from scratch in the middle of a proof.

Sam (Aug 07 2022 at 17:33):

Oh. Right. I think that makes sense. So I guess I should make both p and q explicit in findp_mem too, right?

Sam (Aug 07 2022 at 17:34):

Yeah when I first realised how much of a blunt instrument library_search is that caught me by surprise! Surely there's some better way of indexing for that, but I've not given it any thought and the people who build it know far more about it than I do!

Kyle Miller (Aug 07 2022 at 17:37):

It turned out that anything you could write in Lean that could speed it up only slowed it down because then less was being done by the stuff written in C++

Kyle Miller (Aug 07 2022 at 17:40):

Somehow findp_mem doesn't lead the unifier down a bad path, so you might get away with just q explicit. I'd say, though, that you should make your decisions based on pain points in whatever you're writing. Choosing which arguments work well implicit seems to be an art, and there are no hard rules.

Kyle Miller (Aug 07 2022 at 17:43):

A good rule of thumb is that it's safe to start out making sure every implicit argument can be inferred from the explicit arguments.

But this rule doesn't always hold! docs#inhabited.default uses the return type to infer its implicit argument, and that tends to work fine. But it's not trying to infer an applied function, like in your findp_pred, which is more difficult for Lean.

Sam (Aug 07 2022 at 17:47):

Well for learning purposes, I've been trying to develop findp to a standard that could be included in mathlib, so it's hard to know what other people's pain points would be. For now, I'm tempted to make it explicit in findp_mem too simply for symmetry with findp_pred, but that's probably not a smart way to decide.

Sam (Aug 07 2022 at 17:48):

Out of interest, do you think findp would be a useful addition to mathlib? I've (mostly) found it easier to use than dealing with the option result when proving things about find, but I'm a clueless newb so I don't know about others.

Kyle Miller (Aug 07 2022 at 17:59):

Well for learning purposes, I've been trying to develop findp to a standard that could be included in mathlib

I think I'm telling you how mathlib standards come to be -- usually it's good if you have a project you're working on where these definitions and lemmas are used, you see what works and doesn't about them, they get refined through that experience, and then you contribute them to mathlib once they've proved to be useful (and then they get refined even more during code review). It's hard (or impossible) to know the right design ahead of time, and with experience you get more lucky.

Kyle Miller (Aug 07 2022 at 18:03):

findp seems like it could be useful -- I haven't done much with list.find or <+: though, so I can't say for sure.

I'm surprised there's not already a containsp...

Kyle Miller (Aug 07 2022 at 18:04):

docs#list.any is close

def list.containsp {α : Type u} [decidable_eq α]
  (l : list α)
  (p : α  Prop) [decidable_pred p] : bool :=
l.any (λ a, a  l  p a)

Sam (Aug 07 2022 at 18:06):

Yes, there was some reason that I didn't use any, but I can't remember what it was so I should probably revisit that!

Sam (Aug 07 2022 at 18:08):

Isn't it just l.any p? I think a ∈ l is implicitly true.

Sam (Aug 07 2022 at 18:10):

There was something about it handling bool instead of Prop that I decided was awkward for some reason, but that decision was made after midnight and I'm not sure what it was based on.

Kyle Miller (Aug 07 2022 at 18:13):

I'm not sure what the right thing to do is -- a lot of times it seems like it's better to stay in the Prop world and use decidable instances -- so having a function like

def list.any' {α : Type u} (l : list α) (p : α  Prop) [decidable_pred p] : Prop :=
l.any (λ a, p a)

can be useful. But then you need to add an instance that this is decidable, so it's a lot of boilerplate.

Kyle Miller (Aug 07 2022 at 18:15):

import data.list
import data.bool.all_any

universe u

-- There exists some item that is in the list and matches a predicate.
def list.containsp {α : Type u} [decidable_eq α]
  (l : list α)
  (p : α  Prop) [decidable_pred p] :=
 (a : α), a  l  p a

example {α : Type u} [decidable_eq α]
  (l : list α)
  (p : α  bool) :
  l.containsp (λ a, p a)  l.any p :=
begin
  rw list.any_iff_exists,
  simp [list.containsp],
end

Sam (Aug 07 2022 at 18:20):

Hmm. I don't think I fully understand the naming scheme with the apostrophe. Is there a good reason the Prop version of any would be called any', or is it just a hacky way of doing an overload?

Kyle Miller (Aug 07 2022 at 18:23):

Apostrophes are occasionally used to give alternative definitions/lemmas in mathlib, but I'm not following any scheme here really. It's just meant to be a prime symbol

Sam (Aug 07 2022 at 18:24):

Cool. For some reason, I'd assumed that the apostrophe versions were for internal use and should generally be avoided, so good to know it's just an alternative.


Last updated: Dec 20 2023 at 11:08 UTC