Zulip Chat Archive

Stream: mathlib4

Topic: Rat in norm_num


Thomas Murrills (Jan 09 2023 at 23:56):

Is the definition of IsRat correct? Currently it seems to posit (given mouseovers and pp.all true) that denom is Invertible in Nat, which won't ever happen. (Or does it?)

structure IsRat [Ring α] (a : α) (num : ) (denom : ) where
  /-- The denominator is invertible. -/
  inv : Invertible denom
  /-- The element is equal to the fraction with the specified numerator and denominator. -/
  eq : a = num * denom

Yakov Pechersky (Jan 09 2023 at 23:58):

I think it is supposed to mean that Invertible (denom : \alpha).

Thomas Murrills (Jan 10 2023 at 00:01):

That works, afaict; nice. I was also wondering, though, if this is an artifact of some earlier time in the port, and if isRat should be rewritten to IsRat [Ring α] (a : α) (q : ℚ).

Yakov Pechersky (Jan 10 2023 at 00:02):

Why is that? Ah, right, Mario wanted to have it all be passing around

Thomas Murrills (Jan 10 2023 at 00:03):

Neat, I'll change it. :)

Thomas Murrills (Jan 10 2023 at 00:05):

Oops, I mean IsRat [DivisionRing α] (a : α) (q : ℚ).

Thomas Murrills (Jan 10 2023 at 00:19):

New question: what's the "right" way to write rationals in terms of numerators and denominators? Say I want to construct the most "general" form of a rational for use in e.g. matching.

Is it/? It seems a little high-level, but maybe it gets elaborated away or something. For example:

theorem IsRat.to_isNat {α} [DivisionRing α] :  {a : α} {n},
    IsRat a ((Int.ofNat n) / 1)  IsNat a n := ...

works, but feels heavy-handed.

Thomas Murrills (Jan 10 2023 at 00:20):

Should I go down to the structure level ({ num := n, den := 1, den_nz := ... })? (/. syntax rings a bell, too, but I don't seem to have access to it in this file/know where it is.)

Gabriel Ebner (Jan 10 2023 at 00:21):

Yes, I'd use /. And don't use Int.ofNat. Use coercions (or Int.cast) directly.

Gabriel Ebner (Jan 10 2023 at 00:22):

In this case you should be able to write IsRat a n.

Thomas Murrills (Jan 10 2023 at 00:23):

Ah, okay, great, thanks.

Thomas Murrills (Jan 10 2023 at 00:23):

Btw, I was patterning the ofNat bit off of the pre-existing IsInt.to_isNat above—is the use of ofNat there still okay, or should it be similarly changed?

Gabriel Ebner (Jan 10 2023 at 00:26):

Yes, I'd avoid Int.ofNat where possible.

Thomas Murrills (Jan 10 2023 at 00:39):

Hmm, just to be sure before I change the whole file: was it written this way because it has something to do with using literals in this context? All of the functions here in norm_num are meant to be passing around literals, I think, and I have a small note from a little while back written down saying "ofNat is for literals that lean uses; Nat.cast is for variables". Have things changed since this file was written such that casting is now preferred everywhere?

Gabriel Ebner (Jan 10 2023 at 00:40):

OfNat.ofNat is something entirely different. It's only for numeric literals. (and the argument should always be a nat_lit) OfNat.ofNat is ok.

Gabriel Ebner (Jan 10 2023 at 00:40):

Int.ofNat is the one that shouldn't be there imho.

Thomas Murrills (Jan 10 2023 at 00:40):

Interesting...that clarifies things a bit, thanks.

Mario Carneiro (Jan 10 2023 at 02:41):

The purpose of the IsNat and IsInt functions is to say a = n where n is a Nat or Int, but written using as few expr operators as possible: n is "raw" here. In the case of Nat, it's a raw nat literal, and in the case of Int it is a "raw int literal", which means a raw nat literal wrapped in Int.ofNat or Int.negOfNat. With IsRat the idea was to have a "raw rat literal" be represented as a numerator and denominator in separate arguments, where the numerator is a raw int literal and the denominator is a raw nat literal. This is the form that norm_num uses internally for subproofs to avoid the overhead of building all the typeclass arguments associated with numeral construction.

There are two ways to get out of this representation, depending on how it is being used.

  • Some tactics are still using numbers internally but can't deal with IsNat at the head, they want it to be = there so they can put the resulting expression in a subterm. In that case we apply Result.toRawEq , which produces a "raw nat cast expression" which is like an OfNat.ofNat application but with simplified typeclass arguments.
  • For the final result of the tactic to be seen by end-users, such as the result of norm_num after it has done normalization but has not closed the goal, we want to produce "canonical" numeral terms using the same constructors as users would normally use. This is done by the Result.toSimpResult function, which constructs a OfNat.ofNat term with the right typeclass arguments. For negative numbers, it applies - to an ofNat, and for rationals it uses / applied to two numerals. The result is expressions like 2, -2, 0, 1 / 2 and -(2 / 3) exactly like what you would get if you wrote those in a lean file (where the expected type is α).

Mario Carneiro (Jan 10 2023 at 02:55):

I don't think we need to change the definition of IsRat, except to fix the bug you noted about Invertible (denom : α). It does not take a Rat and this is intentional. Since the invertibility assumption is inside the structure, it doesn't affect most theorem statements: you can prove that you can add and multiply IsRat terms without assuming anything beyond Ring α. The only time you need a division ring assumption is to prove that the denominator is invertible, and if we search directly for Invertible denom in that case, with a fallback using DivisionRing α if it doesn't already work out of the box, then we get the ability to work in characteristic-n rings almost for free. (There is a bit more work to make the division operator work without assuming DivisionRing, so I'm fine if v1 just assumes DivisionRing in this lemma and doesn't bother with the invertible stuff.)

Thomas Murrills (Jan 10 2023 at 02:55):

Ohhkay, cool! So, just so I can get a better picture of things:

  1. what’s the underlying motivation for trying to reduce the number of expr operators? (performance somehow?)
  2. what counts as an expr operator? casting counts, I’m guessing, from context…but in general, what’s that mean? (elaboration-stage manipulation?)

As for how to proceed:

  1. I won’t change the ofNats, I’m guessing—just want to confirm we’re sticking with the original general approach.
  2. Likewise, should we stick with the original isRat? This is less clear to me—having a Rat guarantees the arguments are well-formed, and the literals seem like they’re still “there”.

However…I’m wondering if the previous-ish method of Invertible (d : α) ensured that the cast to αworks for division rings that aren’t char zero, and this feels like it might be important.

Thomas Murrills (Jan 10 2023 at 02:55):

Oops, sent at the same time, let me read that.

Mario Carneiro (Jan 10 2023 at 02:57):

what’s the underlying motivation for trying to reduce the number of expr operators? (performance somehow?)

Yes, performance. The expr construction is the main bottleneck when constructing huge norm_num proofs (and the performance of norm_num is quite often the limiting factor on numerical proofs like proving the first n digits of pi correct).

Mario Carneiro (Jan 10 2023 at 02:58):

what counts as an expr operator? casting counts, I’m guessing, from context…but in general, what’s that mean? (elaboration-stage manipulation?)

There is no elaboration-stage manipulation here, these terms are handed directly to the kernel. I'm talking about the number of calls to Expr.app and friends.

Mario Carneiro (Jan 10 2023 at 02:59):

It is essentially the size of the expression you would see if you pp.all the term, except long names mostly don't count against you because they are shared.

Thomas Murrills (Jan 10 2023 at 03:00):

Ahh, ok, gotcha!

Mario Carneiro (Jan 10 2023 at 03:01):

OfNat.ofNat has the big problem that OfNat is dependent, which means you need a different typeclass argument for every single number so they can't be shared. This basically doubles the number of terms we have to construct

Mario Carneiro (Jan 10 2023 at 03:02):

So IsNat uses a CommSemiring α argument which is cached so it can be used in each subterm and shared

Thomas Murrills (Jan 10 2023 at 03:04):

Hmm, ok…I’ll need to get back to my computer and look at the types of these to see where the dependence flows! Does lean cache automatically when it can?

Mario Carneiro (Jan 10 2023 at 03:04):

When you pass the same expr in multiple places, you construct a term with internal sharing

Mario Carneiro (Jan 10 2023 at 03:05):

like if I do (fun a => (a, a)) 17 there is only one 17 expr in memory and both members of the pair point to it

Thomas Murrills (Jan 10 2023 at 03:05):

Oh, that’s great

Mario Carneiro (Jan 10 2023 at 03:05):

If I do (17, 17) (i.e. elaborating 17 twice) there may or may not be two 17's in memory

Thomas Murrills (Jan 10 2023 at 03:06):

May not? :eyes:

Thomas Murrills (Jan 10 2023 at 03:06):

Like in the case where 17 is actually a generic expression, so to speak?

Mario Carneiro (Jan 10 2023 at 03:07):

Lean is not hash-consed, if that's what you mean: we do not have the guarantee that identical subterms are always shared

Mario Carneiro (Jan 10 2023 at 03:07):

so unless you do something in particular to cause the same expression to be copied around you can't rely on them being pointer-equal

Thomas Murrills (Jan 10 2023 at 03:08):

Oh, yeah, that’s what I expected: I was wondering how they could wind up shared if input separately (not appearing as intermediate terms)

Mario Carneiro (Jan 10 2023 at 03:08):

There is the shareCommon function in Init which will maximally deduplicate an expression, so if you gave it (17, 17) it would ensure that both components of the pair point to the same object

Mario Carneiro (Jan 10 2023 at 03:09):

Lean does have a variety of elaboration caches though so it is possible for things to become shared in the course of elaboration

Mario Carneiro (Jan 10 2023 at 03:09):

I don't think shareCommon is used in the elaborator

Mario Carneiro (Jan 10 2023 at 03:09):

and it's a linear time operation so not very usable for a program like norm_num to run after each operation

Thomas Murrills (Jan 10 2023 at 03:11):

I seeee, very cool…yeah, it felt possibly expensive-enough to be “opt-in only” even not knowing that it was linear-time, lol

Thomas Murrills (Jan 10 2023 at 03:15):

So when we have a nat literal, there’s some magic to make it not a chain of succ applications, right? Assuming so, do induction proofs/definitions break that, or is there magic to deal with matching on succ too?

Mario Carneiro (Jan 10 2023 at 03:18):

yes, raw nat literals are a concept core to lean itself, there is a constructor Expr.lit that has support for literal Nat and String, and the kernel uses bignum computation to do Nat.add/Nat.mul. This is really cool, it means that we don't have to do things one bit0 at a time anymore and it's also a lot faster. So norm_num is now delegated to handling all the other number arithmetic that isn't special cased by the kernel, and arithmetic on other types like Real where the kernel will do something really stupid if you just ask it to prove it by rfl

Thomas Murrills (Jan 10 2023 at 03:22):

Okay, gotcha, that is really cool! Makes sense!

Thomas Murrills (Jan 10 2023 at 20:01):

Result.toRawEq , which produces a "raw nat cast expression" which is like an OfNat.ofNat application but with simplified typeclass arguments.

So, okay...I'm trying to write the analogous rawCast expression for rationals. Thing is, it seems to depend on info stored in the IsRat structure, namely, that denom is invertible. Should these be instance arguments? E.g.

@[simp] def _root_.Rat.rawCast [Ring α] [Invertible (d : α)] (n : )  (d : ) : α := n * (d : α)

As-is, though, this fails with a "failed to synthesize instance Invertible ↑d" message which I find unusual—it's right there!

Or, am I totally misinterpreting what a rawCast is?

Thomas Murrills (Jan 10 2023 at 20:02):

Also, what's that _root_ doing there in all these rawCasts?

Mario Carneiro (Jan 11 2023 at 01:13):

It's to put the function in the Nat namespace

Mario Carneiro (Jan 11 2023 at 01:16):

The function should not take extra arguments, that would cause other problems. It could be a noncomputable definition, that is if h : Invertible (d : A) then n * ⅟(d : α) else 0, since it's only used in the middle of a proof where we don't care about computation

Mario Carneiro (Jan 11 2023 at 01:19):

actually it's probably best to just use n / d in a division ring for the definition of rawCast. The aforementioned expression does not have nice properties without more specific invertibility assumptions anyway. (That is, if you had two of those conditional expressions for different n,d it is not the case that rawCast n1 d1 + rawCast n2 d2 = rawCast (n1 * d2 + n2 * d1) (d1 * d2) without additional assumptions.)

Thomas Murrills (Jan 14 2023 at 00:54):

norm_num now works with rationals! mathlib4#1441

Thomas Murrills (Jan 14 2023 at 01:05):

Mario and I spent a few hours today hammering it out; we sorried the proofs of the internal theorems given that the functionality is time-critical, but the actual tactic nonetheless now works as expected with rationals. I’ll try to respond to the just-incoming review comments asap. :)

Mario Carneiro (Jan 14 2023 at 06:01):

I have a question for the mathematicians: Is it true in a ring A that if ((a : Nat) : A) is invertible and b divides a then ((b : Nat) : A) is also invertible? I think the answer is yes but there isn't anything obviously applicable from mathlib. This is needed for norm_num to reduce rational number expressions

Thomas Murrills (Jan 14 2023 at 06:14):

yes, by associativity for each side, e.g. 1 = a⁻¹a = a⁻¹(kb) = (a⁻¹k)b, yielding a⁻¹k as a left inverse and ka⁻¹ as a right one; then a⁻¹k = ka⁻¹ because natural numbers commute with everything in the ring

Mario Carneiro (Jan 14 2023 at 06:19):

oh interesting, that argument doesn't need k to be nonzero

Mario Carneiro (Jan 14 2023 at 06:20):

so the Nat.succ g in our lemmas isn't needed after all

Thomas Murrills (Jan 14 2023 at 06:28):

well…it kind of does, because if k is 0, you can’t get that it times anything is equal to 1 and that first chain of equalities doesn’t work anymore 🙃 but if we’re allowing contradictory hypotheses (as might be the case with g, idk)…

Thomas Murrills (Jan 14 2023 at 06:29):

(ok, unless you’re in the zero ring.)

Mario Carneiro (Jan 14 2023 at 06:41):

sure, but that's not "kind of works", that is "works". The hypotheses are the hypotheses, the whole point is that you don't have to think about whether they are true or not since you are assuming they are

Mario Carneiro (Jan 14 2023 at 06:43):

the reason it doesn't result in nonsense in the norm_num application is that we have a bundled invertibility assumption which implies that all denominators are nonzero (as natural numbers), so g is provably nonzero even if we don't take it as an assumption

Thomas Murrills (Jan 14 2023 at 07:09):

yeah, absolutely—all I mean by “doesn’t work” is that if k = 0 (and A is not the zero ring) the argument blows up and you’re not in any real situation (well, we hope…)

Kevin Buzzard (Jan 14 2023 at 08:39):

If A is the zero ring however then k=0 is fine and the nat a doesn't have to be zero

Mario Carneiro (Jan 14 2023 at 08:40):

cool, I incorporated this proof into mathlib4#1572 which kills all the remaining sorries

Mario Carneiro (Jan 14 2023 at 08:42):

probably the core pieces of those proofs should go somewhere in mathlib proper, it's a bit weird for those lemmas to actually have mathematical content beyond a library lemma

Eric Wieser (Jan 14 2023 at 08:47):

I assume we already have docs#is_unit_of_mul? Edit: docs#is_unit_of_dvd_unit

Mario Carneiro (Jan 14 2023 at 08:47):

Oh, this is interesting: norm_num is now unable to prove theorems like:

variable [Field α]
example : - - (16 / ((11 / (- - (6 * 19) + 12)) * 21)) = (96/11 : α) := by norm_num

because the denominators are not necessarily invertible unless you assume that alpha is CharZero

Mario Carneiro (Jan 14 2023 at 08:49):

Speaking of is_unit, why do we even have invertible since it's basically the same thing?

Eric Wieser (Jan 14 2023 at 08:50):

The latter is constructive

Mario Carneiro (Jan 14 2023 at 08:53):

Mario Carneiro said:

Oh, this is interesting: norm_num is now unable to prove theorems like:

variable [Field α]
example : - - (16 / ((11 / (- - (6 * 19) + 12)) * 21)) = (96/11 : α) := by norm_num

because the denominators are not necessarily invertible unless you assume that alpha is CharZero

Specifically, norm_num will reduce the goal to 16 / (11 / 126 * 21) = 96 / 11 and then stop because it can't prove that 126 is invertible so it doesn't know how to clear the denominator

Mario Carneiro (Jan 14 2023 at 08:55):

I'm confused how this ever worked

Mario Carneiro (Jan 14 2023 at 08:59):

Aha, it's an issue in the translation of the test - the original lean 3 test file was assuming LinearOrderedField α

Thomas Murrills (Jan 14 2023 at 09:43):

Ideally, we’d be able to look for the characteristic of the division ring, check coprimality to it, and if we have it (or if the characteristic is zero—not sure what convention is used) get an Invertible term, right? (I’m guessing we’d also want to normalize mod the characteristic, but I’d think this should be done in some basic evaluator way before the one for / if possible, and apply everywhere, not just when looking at inverses.)

But, I guess this would only be possible if we had either Char m α (is that how we spell it?) for some literal m, CharZero α, or found a way to make use of assumptions about m in the context, like if it was some unknown prime or something (but maybe we have special ways of spelling that too?)—but I don’t know if all that searching would be performant. Though, if we could find the relevant info once and keep it, maybe it wouldn’t be too bad. Just thinking out loud.

Mario Carneiro (Jan 14 2023 at 09:48):

I was thinking that (assuming there is no CharZero instance) you could look up and cache invertible instances for all the prime factors of the number. For the common case where you would want this kind of thing, it's always going to be 2 and/or 3 that you are searching for

Mario Carneiro (Jan 14 2023 at 09:50):

although I suppose for a ring of known characteristic that's less efficient than your approach if there are a lot of big primes being inverted

Mario Carneiro (Jan 14 2023 at 09:55):

It's basically two complementary situations you could be working in: if you have a known characteristic than everything is invertible except a fixed list of primes, or you might be working in a ring of "any characteristic except 2 or 3" in which case you know 2 and 3 are invertible but nothing else

Thomas Murrills (Jan 14 2023 at 09:57):

Oh interesting! How do we phrase that the ring has characteristic unequal to 2 or 3? Do we have some special way of writing it, or do we just have a hypothesis somewhere saying e.g. m > 2 or something? I’m wondering how we’ll tell which situation we’re in.

Thomas Murrills (Jan 14 2023 at 10:00):

(Also, nitpick: everything except a fixed list of *multiples of primes! unless I’m misinterpreting?)

Mario Carneiro (Jan 14 2023 at 10:10):

How do we phrase that the ring has characteristic unequal to 2 or 3?

[Invertible (2 : R)] [Invertible (3 : R)] I think

Thomas Murrills (Jan 14 2023 at 10:11):

That says a lot more—that it doesn’t have even characteristic/characteristic which is a multiple of 3

Yuyang Zhao (Jan 14 2023 at 10:13):

[NeZero (2 : R)] [NeZero (3 : R)]?

Eric Rodriguez (Jan 14 2023 at 10:13):

In mathlib I think this is what is used, anyways. I don't think people are overly concerned about the non-ID case

Thomas Murrills (Jan 14 2023 at 10:21):

If they aren’t already, some people will be eventually (hopefully!), so I feel like we ought to know what to do about it—it might also make other things easier indirectly if we’re able to know facts about the characteristic somehow

Thomas Murrills (Jan 14 2023 at 10:23):

NeZero is interesting! I wonder though if instead of looking for instances we can simply access outer hypotheses from within norm_num (without passing them in explicitly). Doing a single check through the context for relevant ones at one point might be useful.

Thomas Murrills (Jan 14 2023 at 10:32):

I guess basically what we’d be looking for in this case are explicit inequalities constraining the characteristic or primality/divisibility facts. I can’t think of much else that would be useful for this kind of normalization…

Kevin Buzzard (Jan 14 2023 at 11:44):

ne_zero is only useful in the presence of is_domain or whatever it's called. Then you can cancel nonzero things because ax=ay implies a(x-y)=0 so x-y=0

Reid Barton (Jan 14 2023 at 12:17):

If you are dividing by things then they should be invertible of course.

Mario Carneiro (Jan 14 2023 at 12:42):

yeah I think invertible is superior to ne_zero in this situation since being zero or not is only really interesting in domains and fields, whereas invertibility is what matters for rings

Thomas Murrills (Jan 14 2023 at 18:01):

Reid Barton said:

If you are dividing by things then they should be invertible of course.

See, that’s what I thought! It’s still so weird to me that a potential divide-by-zero expression even typechecks. I get why it’s good for functions to be total and not have to take extra assumptions as arguments, but…it also seems like we’re missing out on something types are usually meant to help with, right? I feel like mathlib’s convention of having x / 0 = 0 will somehow lead to human mistakes…

Thomas Murrills (Jan 14 2023 at 18:11):

Mario Carneiro said:

yeah I think invertible is superior to ne_zero in this situation since being zero or not is only really interesting in domains and fields, whereas invertibility is what matters for rings

I think this is backwards: invertibility tells you what you need to know about the characteristic for integral domains and fields, since the characteristic is necessarily prime and therefore saying that e.g. 2 is invertible only eliminates the case that the characteristic is 2.

But for rings, saying that 2 is invertible says that the characteristic is even, not that it’s not 2. If you want to make a statement about a general ring not having characteristic 2, you can’t do it with invertibility.

Mario Carneiro (Jan 14 2023 at 18:13):

But for rings, saying that 2 is invertible says that the characteristic is even, not that it’s not 2. If you want to make a statement about a general ring not having characteristic 2, you can’t do it with invertibility.

In this case I think the scenario would actually be even characteristic then, because the usual reason that kind of hypothesis pops up is because you want to invert 2.

Mario Carneiro (Jan 14 2023 at 18:14):

I'm thinking of stuff like the quadratic / cubic formula here, where you have to put 2 in the denominators of things and so you need an assumption that says that this is meaningful

Mario Carneiro (Jan 14 2023 at 18:16):

elliptic curve parameterization also involves a case split on the characteristic. I'm pretty sure in one of the cases we want to divide by 2 and not simply know that it is nonzero

Thomas Murrills (Jan 14 2023 at 18:17):

Yeah, that’s fair! Maybe knowing the characteristic is not one specific number is not really ever important for rings. All I can think of is the case where we know it’s not 2 because the ring doesn’t have some property char-2 rings have, but even then, knowing this doesn’t really help us normalize numbers.

Heather Macbeth (Jan 14 2023 at 19:15):

Wow, a lot has happened in 24 hours! Thank you so much @Thomas Murrills, @Mario Carneiro and @Gabriel Ebner.

Regarding your questions about the features of norm_num that I personally would like to use immediately:

  • I would need it to work for Real, so yes, I need the functionality to assess denominators using either a CharZero instance or (if it's easier) a LinearOrderedField instance.
  • no, I will never use fields of nonzero characteristic or general rings in which certain numbers like 2 or 3 are known to be invertible
  • yes, I would need it to cope with ≤, <, and ≠.
  • since it would be used for problems assessed by an autograder, there need to be no sorries in the supporting lemmas. But I can work on the sorries myself.

I am still reading through the various PRs and haven't caught up yet on exactly which of this functionality you've implemented already.

Notification Bot (Jan 14 2023 at 21:13):

9 messages were moved from this topic to #mathlib4 > false alarm about norm_num by Heather Macbeth.

Reid Barton (Jan 14 2023 at 22:16):

Regarding the autograder, you could potentially add the missing theorems as axioms (and add them to your list of allowed axioms)

Heather Macbeth (Jan 14 2023 at 22:32):

True, although it seems unprincipled to make a change like that to mathlib for one person's purposes. But in any case the sorries should be gone after mathlib4#1572.


Last updated: Dec 20 2023 at 11:08 UTC