Zulip Chat Archive

Stream: general

Topic: unused arguments


Floris van Doorn (Jul 29 2019 at 22:14):

I created a script that prints all declarations with unused arguments:
https://github.com/leanprover-community/mathlib/blob/sanity_check/src/tactic/sanity_check.lean#L97

The output of this command is here:
https://gist.github.com/fpvandoorn/04837ea7f20479bcac9a99539d2c17b3

Many of these declarations are "wrong" in some way: maybe a type class argument occurs more than once, maybe a declaration is given with unnecessary assumptions (which is not always bad, but sometimes it is), maybe a declaration should really be in a different theory (e.g. this should probably be proven for an arbitrary type, not an arbitrary topological space)

Is anyone interested in going through this list and cleaning stuff up? If not, I'll do it at some point, but that will probably take a while.

Floris van Doorn (Jul 29 2019 at 22:17):

There is some noise in this list: there are some automatically generated declarations and there are some declarations where you really want an unused argument (for example a user command with argument (meta_info : decl_meta_info)).

Floris van Doorn (Jul 29 2019 at 22:24):

There are 653 declarations in mathlib that have unused arguments. This is after removing all declarations that are "internal" and most (but not all) automatically generated declarations.

Rob Lewis (Jul 29 2019 at 22:24):

Could you post an issue on GitHub with that list of declarations? I can imagine it will take a while to get it all fixed -- it would be easier to track incremental progress there.

Rob Lewis (Jul 29 2019 at 22:24):

I wonder what percentage are unnecessary type class arguments?

Floris van Doorn (Jul 29 2019 at 22:25):

From looking manually at a bunch of them: most of them are.

Simon Hudon (Jul 29 2019 at 22:35):

Any way for your tool to check lemmas that have too many or too few implicit arguments?

Kevin Buzzard (Jul 29 2019 at 22:57):

Looking at the algebra ones I think it's difficult to argue that these are "incorrect"; in fact for some of them I don't understand the output at all.

-- ring_theory\ideals.lean
#print ideal.is_coprime_self -- [2, 5]

what does [2,5] mean here? The theorem is about x being coprime to x and this is surely meaningless unless x is in an algebraic structure like a commutative ring. The [2] seems to point to comm_ring and the 5 to the coprime thing. I am confused.

-- ring_theory\localization.lean
#print localization.r -- [4]

This one is saying that you don't need a submonoid to define a certain relation. This is like saying that if we have a group G and a subgroup H then we can define a binary relation on G by g1 R g2 iff g1/g2 is in H, but ha ha, we don't strictly speaking need that H is a subgroup -- it's just that we will never use this relation unless H is a subgroup.

Similar story here, and probably for a lot of other algebra stuff:

-- ring_theory\ideal_operations.lean
#print ideal.map -- [6]

if you have a map between rings then you can push ideals forward, and yes, strictly speaking, it doesn't have to be an unbundled ring hom, but when everything is bundled do we want this map between rings to be a ring hom or a random map of sets? It will always be a ring hom. Is there still some CS-related justification for removing the assumption?

Mario Carneiro (Jul 29 2019 at 23:04):

It makes all theorems using the definition shorter, and it proves for free that the choice of witness to H being a subgroup doesn't matter

Mario Carneiro (Jul 29 2019 at 23:05):

So I would definitely try to avoid "hygiene arguments" that are there only because that's the area of math we are interested in today

Mario Carneiro (Jul 29 2019 at 23:07):

ideal.is_coprime_self takes an x and a y and proves x is coprime to x iff x is a unit. That's pretty clear cut

Mario Carneiro (Jul 29 2019 at 23:08):

it also has two comm_ring assumptions

Kevin Buzzard (Jul 29 2019 at 23:09):

aah I had misunderstood [2,5]

Kevin Buzzard (Jul 29 2019 at 23:09):

I don't see the two comm_ring assumptions

Kevin Buzzard (Jul 29 2019 at 23:10):

so maybe I'm miscounting instead

Mario Carneiro (Jul 29 2019 at 23:10):

theorem ideal.is_coprime_self :  {α : Type u} [_inst_1 : comm_ring α] [_inst_2 : comm_ring α]
  (x : α), α  (is_coprime x x  is_unit x)

Kevin Buzzard (Jul 29 2019 at 23:10):

I still don't see it :-/

Mario Carneiro (Jul 29 2019 at 23:10):

[_inst_1 : comm_ring α] [_inst_2 : comm_ring α]

Mario Carneiro (Jul 29 2019 at 23:12):

If you are asking why it acquired a duplicate assumption, it's probably because one is explicitly in the statement and the other is a variable in the section

Mario Carneiro (Jul 29 2019 at 23:12):

it's on line 15

Mario Carneiro (Jul 29 2019 at 23:13):

I can see that mem_span_pair, is_coprime_def, is_coprime_self all make the same mistake of writing [comm_ring α] even though it's already included

Floris van Doorn (Jul 29 2019 at 23:41):

what does [2,5] mean here?

From the Gist:

The list after each declaration gives all the positions of the unused arguments (starting at 1). E.g. [2] means that the second argument is unused.

Jeremy Avigad (Jul 29 2019 at 23:45):

Nice job, Floris. The list is really striking.

Floris van Doorn (Jul 29 2019 at 23:46):

This one is saying that you don't need a submonoid to define a certain relation. This is like saying that if we have a group G and a subgroup H then we can define a binary relation on G by g1 R g2 iff g1/g2 is in H, but ha ha, we don't strictly speaking need that H is a subgroup -- it's just that we will never use this relation unless H is a subgroup.

I'm ambivalent about this issue myself. I relate with both your and Mario's argument. If we're going through this list, we don't have to remove all the unused arguments.

Jeremy Avigad (Jul 29 2019 at 23:58):

Actually, Georges got a lot of mileage in the math components library by defining things in greater generality than you might think you would need. For example, you can take quotients G / H even if H is not a normal subgroup (maybe not even a subgroup?). IIRC it is defined as N_G(H) / H, where N_G(H) is the normalizer of H in G. So the construction makes sense and coincides with the usual one when H is normal in G. But doing things in greater generality meant that there were often less side conditions to check when writing proofs.

I am not saying that deleting unused arguments will be genuinely helpful in all cases on Floris' list, only that you might be surprised.

Floris van Doorn (Jul 30 2019 at 00:04):

Any way for your tool to check lemmas that have too many or too few implicit arguments?

I can try that next. The main problem is that we don't have universal rules for which arguments should be implicit, and it's hard to say which decisions are "mistakes". For example, we treat declarations proving iff different than declarations prove eq:

lemma sub_right_comm (a b c : α) : a - b - c = a - c - b
theorem inv_eq_one {a : α} : a⁻¹ = 1 ↔ a = 1
theorem mul_left_inj [left_cancel_semigroup α] (a : α) {b c : α} : a * b = a * c ↔ b = c

I can try to write a tactic that finds choices that are "obviously wrong". If you have ideas to decide which kinds of choices are wrong, then please tell me.

Mario Carneiro (Jul 30 2019 at 00:35):

The rule is that if it's an iff then it should be implicit if it appears on both sides of the iff. That is, iffs are treated as a pair of implications, and are implicit if they would be implicit in both one-directional versions

Mario Carneiro (Jul 30 2019 at 00:38):

However, there are a lot of equalities that have all implicit args as well - I think Johannes likes this style as it appears a lot in topology and lattices - and I don't know if we should take a stance for or against it

Mario Carneiro (Jul 30 2019 at 00:39):

stuff like inf_le_left are almost always used with inferred arguments, but the standard rule would have them take explicit args

Jeremy Avigad (Jul 30 2019 at 00:52):

I have also sometimes been surprised by choices regarding implicit / explicit variables in the library. I am curious to know if others think we should adopt more rigid policies.
Does Johannes like to use implicit arguments in identities so that he can call simp with foo.symm? Or write foo.trans bar? Or is there some other reason?

Kenny Lau (Jul 30 2019 at 06:29):

might this be the cause of some slowness?

Johan Commelin (Jul 30 2019 at 06:44):

Nice work @Floris van Doorn !

Johan Commelin (Jul 30 2019 at 06:44):

Btw, anyone else got stomach aches from the folder separators being slanted the wrong way?

Keeley Hoek (Jul 30 2019 at 09:30):

I think I might have been guilty of this myself, but I am pretty sure it is safe to delete (meta_info : decl_meta_info) from a user command---lean knows still understands, and it's less confusing. (Assuming the tactic doesn't use this information...)

Keeley Hoek (Jul 30 2019 at 09:31):

On the other hand an unused (_ : parse (tk "blah")) obviously can't be dropped---but probably your program can ignore arguments which have placeholder _ names?

Patrick Massot (Jul 30 2019 at 10:38):

Nice job, Floris. The list is really striking.

I'm not surprised at all. The variable mechanism is great, but it's clearly leading to this situation as soon as you start building a significant library about anything. I was already asking for this tool more than one year ago. Thanks Floris! I'll try to find time to participate in this argument hunting in topology at least.

Kevin Buzzard (Jul 30 2019 at 11:03):

I located the cause of my confusion -- I was checking these things in the schemes project, which has a version of mathlib from March. In March the definition looked like this:

ideal.is_coprime_self :
  ∀ {α : Type u_1} [_inst_2 : comm_ring α] (x : α), α → (ideal.is_coprime x x ↔ is_unit x)

hence why I asked about [2,5] -- there are only four inputs, I wondered if ideal.is_coprime counted somehow.

So presumably since March someone threw in an extra variable somewhere completely unrelated and the problem then appeared.

Floris van Doorn (Jul 30 2019 at 13:50):

@Kevin Buzzard On the March version of that declaration my output would probably be [4], since there is still a erroneous second argument of type α in the lemma.

Floris van Doorn (Jul 30 2019 at 13:51):

Kenny Lau: might this be the cause of some slowness?

Probably not in any significant way.

Floris van Doorn (Jul 30 2019 at 13:56):

@Keeley Hoek That's good to know. I always copy-pasted it thinking it was necessary. In that case, when cleaning up this list, we can also go through these arguments.

Kevin Buzzard (Jul 30 2019 at 14:24):

Kevin Buzzard On the March version of that declaration my output would probably be [4], since there is still a erroneous second argument of type α in the lemma.

Yes exactly. That's why I was totally confused earlier :-)

Kevin Buzzard (Jul 30 2019 at 14:28):

Re: the argument that in the odd order project they worked with more generality. It would not surprise me if, after removing the condition that S be a submonoid in the definition of localisation R S, one found several other lemmas which happened not to use the fact that S was a submonoid. I can see that this is a change which gets computer scientists excited, but after The Big Refactor, when submonoids are bundled, one will just end up bundling S back into being a submonoid -- either that or one will have a lot of coercions from submonoid to set (and some other thread started today seemed to indicate that there might be problems here, although they might be specific to submodules, I'm not sure).

Johan Commelin (Jul 30 2019 at 14:33):

On the other hand, in that example we do use that R is a ring, but we shouldn't. Because now we can't localise monoids )-;

Kevin Buzzard (Jul 30 2019 at 14:34):

@Amelia Livingston is working on localising monoids as we speak! She's preparing a PR for mathlib which localises a monoid at a submonoid.

Kevin Buzzard (Jul 30 2019 at 14:34):

She has everything working, we're just trying to figure out what a good universal predicate should be.

Kevin Buzzard (Jul 30 2019 at 14:35):

Our initial guess, using kernels, was wrong, because a surjective morphism of monoids is not determined by its kernel, in contrast to groups.

Kevin Buzzard (Jul 30 2019 at 14:37):

We need this for the perfectoid project :-)

Chris Hughes (Jul 30 2019 at 14:38):

Can we have a generic theory of congruence relations on monoids. I don't think it would be very long. But I think you can quotient by the smallest congruence relation such that everything that's meant to have an inverse in the quotient has an inverse in the quotient.

Kevin Buzzard (Jul 30 2019 at 14:39):

The fact that cosets of submonoids can be quite badly behaved (they can have non-trivial intersection) makes things a bit more delicate than you might think, but if you think about stuff the right way it all seems to work.

Chris Hughes (Jul 30 2019 at 14:40):

That's why you need to talk about congruence relations instead of submonoids. But a congruence relation is just a submonoid of the product, that's also an equivalence relation.

Kevin Buzzard (Jul 30 2019 at 14:40):

Here's a terrifying example; the map from (nat,+) to (Z/2Z),* sending 0 to 1 and everything else to 0 is a monoid hom.

Kevin Buzzard (Jul 30 2019 at 14:42):

and it has trivial kernel :-) We don't have the target of the map in the localisation story, we just have a monoid and a submonoid.

Chris Hughes (Jul 30 2019 at 14:42):

Congruence relations form a complete lattice.

Kevin Buzzard (Jul 30 2019 at 14:43):

but all I have is a monoid and a submonoid. I don't have the target.

Chris Hughes (Jul 30 2019 at 14:43):

What do you mean?

Kevin Buzzard (Jul 30 2019 at 14:44):

But a congruence relation is just a submonoid of the product,

What product?

Chris Hughes (Jul 30 2019 at 14:45):

For localisation of M by a submonoid S it will be (M /times S) \times (M \times S)

Kevin Buzzard (Jul 30 2019 at 14:46):

Aah I see, so a congruence relation is a way of taking a quotient of a general monoid perhaps, and M[1/S] is a quotient of M x S

Chris Hughes (Jul 30 2019 at 14:46):

And the relation is easy to find, because it's the smallest one that gives you inverses.

Kevin Buzzard (Jul 30 2019 at 14:48):

In contrast to groups, where we just remember the things which get mapped to 1, here we need to keep track of the things which get mapped to the same thing.

Kevin Buzzard (Jul 30 2019 at 14:48):

So that's called a congruence relation?

Chris Hughes (Jul 30 2019 at 14:48):

Yes.

Johan Commelin (Jul 30 2019 at 14:51):

@Kevin Buzzard Is she refactoring the file for rings?

Johan Commelin (Jul 30 2019 at 14:51):

I don't think we need/want two separate files, right?

Amelia Livingston (Jul 30 2019 at 14:54):

Not currently but what I've got so far would make it fairly easy to do so I think

Johan Commelin (Jul 30 2019 at 15:04):

Ok, cool!


Last updated: Dec 20 2023 at 11:08 UTC