Zulip Chat Archive
Stream: mathlib4
Topic: Norm num and junk values
Darylgolden (Feb 06 2026 at 15:18):
Hi all! I'm working on a proof-of-concept extension for simplifying trigonometric expressions (similar to grind and lean-egg in implementation). I noticed that integrating norm num into it results in a lot of (mathematically) incorrect simplifications, due to junk value conventions (e.g. 0 / 0 = 0). Does norm_num or Lean have a way to check whether junk value conventions are used? Or should I be building a verifier within my extension that makes sure no junk simplifications occur?
Riccardo Brasca (Feb 06 2026 at 15:28):
Can you give an example of an incorrect simplification? For Lean junk values are just values, it is a "human notion".
Yury G. Kudryashov (Feb 06 2026 at 15:35):
See also #backticks
James E Hanson (Feb 06 2026 at 15:50):
Riccardo Brasca said:
For Lean junk values are just values, it is a "human notion".
I.e., Lean cannot conveniently formalize basic mathematics as it is commonly understood.
James E Hanson (Feb 06 2026 at 15:52):
Junk values for things like 1/0 have nothing to do with humans vs. computers. They're a byproduct of the formalism used by Lean.
James E Hanson (Feb 06 2026 at 15:54):
And in fact there are many, many computer mathematics systems that handle division by zero correctly. If I boot up Sage and evaluate 1/0, I get an error.
Edward van de Meent (Feb 06 2026 at 16:16):
humans don't get an error when dividing by zero, they just say "don't do it in the first place". the fact that it errors is a byproduct of the formalism used by sage.
Edward van de Meent (Feb 06 2026 at 16:17):
basically, one can argue that erroring is not "handling correctly"
Ruben Van de Velde (Feb 06 2026 at 16:17):
And in any case, the wider question of junk values is off topic for @Darylgolden's question
Thomas Murrills (Feb 06 2026 at 16:18):
James E Hanson said:
And in fact there are many, many computer mathematics systems that handle division by zero correctly. If I boot up Sage and evaluate 1/0, I get an error.
Hmm, I think you’re conflating different sorts of computer mathematics systems. Computer algebra systems like sage can afford to just manipulate away and then throw an error at the last minute when something definitely bad is encountered. But formalizing math in a computer has a different set of needs, and is a different sense of “handling” than sage’s sense.
I think intrinsic to the needs of formalization is dealing with the fact that x / y might not even be well-defined if we were to take the natural language approach. Our options for formalization seem to be
- demand a proof that y is not 0 before you’re allowed to write x / y
- always allow writing x / y, but use a junk value when y = 0, and have this junk operation only be guaranteed to coincide with the familiar one when we know y is not zero
- allow writing x / y but interpret it as a “fallible number” and prove things about these, and only allow converting to actual numbers when it turns out that y is not zero
Which of these is most ergonomic is indeed going to be a byproduct of the features of your proof assistant, but all of them are pretty different from what humans do.
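As a minimal Lean sketch of option 2 (the convention Mathlib takes — assuming Mathlib's div_zero lemma and the field_simp tactic):

```lean
import Mathlib

-- The junk convention itself is a theorem:
example : (1 : ℝ) / 0 = 0 := div_zero 1

-- Familiar facts about `/` carry a nonvanishing hypothesis:
example (x y : ℝ) (hy : y ≠ 0) : x / y * y = x := by
  field_simp
```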
Thomas Murrills (Feb 06 2026 at 16:21):
To get back to the original question, @Darylgolden (hello!), I’d like to ask: does the fact that these are occurring mean that the relevant expressions have divisions by zero in them? If so, are you sure that the issue is the simplification, and not the formation of the expressions in the first place?
Basically, there is no mathematical problem with simplifying to junk values—you just wind up with junk facts. :sweat_smile: (I.e., facts that are still mathematically true about our extended / operation and do not contradict ordinary ones, but are useless.)
(Also, it might indeed be reasonable for norm_num to warn on simplifying to junk values! :) It does not currently.)
Darylgolden (Feb 06 2026 at 16:29):
Riccardo Brasca said:
Can you give an example of an incorrect simplification? For Lean junk values are just values, it is a "human notion".
I'm away from my computer, so I don't have the exact chain of steps, but something like sin(3)cos(1) + cos(1)sin(3), which I expect to be simplified to sin(4) by the addition identity, is instead simplified to 0. Note that the technique I'm using, equality saturation, is essentially a brute-force technique that can do a lot of things that a human would never do (for example, a human would never end up with a division by 0 in the process of simplifying this expression and most others, but a brute-force technique might). Because I'm optimizing for the smallest AST size, empirically it's often the case that the simplifier manages to find some convoluted way of simplifying many nonzero expressions to 0, using a junk value convention along the way. I'm aware that junk values are not "wrong" per se, but in a scenario where I'm attempting to work with standard operations over the reals, it seems there should be some way to at least warn the user that the result may not be what they expect :slight_smile:
Violeta Hernández (Feb 06 2026 at 16:32):
Lean allows you to prove 1/0 = 0, but it doesn’t allow you to prove sin(4) = 0. Something weird is going on in your code.
Thomas Browning (Feb 06 2026 at 16:32):
Junk values allow you to prove junk theorems, but not false theorems :slight_smile:
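For instance (a small sketch, assuming Mathlib for the real-number instance):

```lean
import Mathlib

-- "Junk" theorems: provable, and true of Lean's total operations.
example : (1 : ℝ) / 0 = 0 := div_zero 1
example : (0 : ℕ) - 1 = 0 := rfl   -- truncated natural subtraction

-- A genuinely false statement such as `(1 : ℝ) = 0` stays unprovable.
```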
Darylgolden (Feb 06 2026 at 16:34):
Violeta Hernández said:
Lean allows you to prove 1/0 = 0, but it doesn’t allow you to prove sin(4) = 0. Something weird is going on in your code.
I should mention I'm not converting my simplifications into Lean proofs yet. They exist as "explanations" in the egg Rust library. But making sure that the simplifications don't allow junk values would be a first step.
Thomas Murrills (Feb 06 2026 at 16:37):
Hmm, if you’re inclined to insert division by zero in the course of simplification, I suspect that doing so is the actual problem :grinning_face_with_smiling_eyes: I think you probably don’t want to do this, junk value or no. Is there another more formally correct operation that could replace these desired divisions?
Darylgolden (Feb 06 2026 at 16:42):
As mentioned, it's sort of a brute-force technique that essentially comes up with loads of equivalent expressions that no human would ever think of. There are of course more directed ways of simplifying trigonometric expressions specifically, for example modelling the process taken in SymPy more directly, but my current project aims to investigate e-graph methods, so that's slightly out of scope for me at the moment. I could of course write my own verifier to check that the calls to norm num aren't doing anything bad, but I was wondering if there are existing ways of doing it.
Henrik Böving (Feb 06 2026 at 16:45):
Darylgolden said:
Violeta Hernández said:
Lean allows you to prove 1/0 = 0, but it doesn’t allow you to prove sin(4) = 0. Something weird is going on in your code.
I should mention I'm not converting my simplifications into Lean proofs yet. They exist as "explanations" in the egg Rust library. But making sure that the simplifications don't allow junk values would be a first step.
You seem to misunderstand the notion of junk values here. Rewriting with n / 0 = 0 is correct in Lean; there is no question about it, and it is fully accepted and embraced by the community. It's just as valid a simplification as 1 + 1 = 2. If you are working from some expression A that is not equal to 0, via another expression B which then simplifies through a junk value rule to, say, C = 0, the issue is not the step B = C but the step A = B. You would find when converting them to a Lean proof that A = B is not going to be provable.
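Concretely (a sketch, assuming Mathlib):

```lean
import Mathlib

-- The junk step `B = C` is itself a perfectly valid theorem:
example : Real.sin 4 / 0 = 0 := div_zero _

-- But a chain concluding `Real.sin 4 = 0` must contain an earlier
-- step such as `Real.sin 4 = Real.sin 4 / 0`, which is simply false,
-- and it is that step which will fail to be provable.
```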
Thomas Murrills (Feb 06 2026 at 16:46):
(@Darylgolden) My point is that I think this is actually exposing some incorrect simplification step, and that a simplification which may introduce division by zero ought to probably be avoided, whether it involves a junk value or not: such an expression is not mathematically equivalent to the original one. That is, norm_num warning you that it was using a junk value would not save you! :grinning_face_with_smiling_eyes:
Darylgolden (Feb 06 2026 at 16:46):
I don't think I'm misunderstanding it. I understand that it's correct in Lean, but my intention is to work with real numbers with standard real operations.
James E Hanson (Feb 06 2026 at 16:46):
Thomas Murrills said:
Hmm, I think you’re conflating different sorts of computer mathematics systems. Computer algebra systems like sage can afford to just manipulate away and then throw an error at the last minute when something definitely bad is encountered. But formalizing math in a computer has a different set of needs, and is a different sense of “handling” than sage’s sense.
I just find it a little bit difficult to accept the lengths people will go to justify what is really a fundamental UX failure of Lean.
Compare the way Lean handles 1/0 and [0,1,2][5]:
/-- info: 0 -/
#guard_msgs in
#eval 1/0
/-- error: failed to prove index is valid, possible solutions:
- Use `have`-expressions to prove the index is valid
- Use `a[i]!` notation instead, runtime check is performed, and 'Panic' error message is produced if index is not valid
- Use `a[i]?` notation instead, result is an `Option` type
- Use `a[i]'h` notation instead, where `h` is a proof that index is valid
⊢ 5 < [0, 1, 2].length -/
#guard_msgs in
#eval [0,1,2][5]
With array indexing, Lean goes to great lengths to cater to the expectation among programmers that you will be able to write a[5] and have it just work when it's supposed to. By contrast, with division (and subtraction on the naturals), Lean just happily and silently behaves in a way that does not agree with what the symbols mean in all contexts in mathematics and computer science other than that of certain proof assistants.
I agree that on some formal level, the junk value approach is probably the most ergonomic given the type-theoretic formalism of Lean, but the way the symbols / and - are being used is just wrong, period. That's just a fact.
James E Hanson (Feb 06 2026 at 16:46):
There should at the very least be a warning that you're producing a non-standard junk value.
Kevin Buzzard (Feb 06 2026 at 16:46):
@Darylgolden Right now you are doing a really bad job of explaining your problem. Claims such as "Lean reduces sin(3)cos(1)+cos(1)sin(3) to 0" are a strong indication that you have not understood something properly, because it is not possible to make Lean do this. Sometimes asking a question with a #mwe is much better than words. Can you give an example of something you're concerned about using code?
Kevin Buzzard (Feb 06 2026 at 16:50):
@James E Hanson your comments, whilst valid, and also a pet peeve of yours, might be better expressed in another thread. It's not yet clear to me that they are relevant to the actual question here. If the question is not junk then the answer will not be junk so either the OP is asking junk questions (in which case they get the chaos you are describing) or they are not (in which case they should not be having problems), and my impression is that they are not asking junk questions and are claiming they're getting junk answers, so perhaps the real issue lies elsewhere.
Yury G. Kudryashov (Feb 06 2026 at 16:50):
Note that we have, e.g., docs#divp to encode the "mathematical" notion of division.
Thomas Murrills (Feb 06 2026 at 16:52):
@James E Hanson I don’t want to hijack this thread any further, but I’ll just quickly point out: (1) I think the UX being optimized for here first is proof, not evaluation, and the behavior you want is further out of reach due to intrinsic needs there (2) but yeah, why not warn when using a nonstandard junk value in #eval and norm_num? Let’s optimize both UX’s! But that’s the last I’ll comment on it here. :grinning_face_with_smiling_eyes:
Darylgolden (Feb 06 2026 at 16:54):
@Kevin Buzzard I'm not claiming that Lean reduces the given trigonometric expression to 0. I'm claiming that my simplification procedure, which calls norm num using some junk values (and to be fair, there could and likely are many bugs in this procedure that might have resulted in this, other than the junk value conventions) resulted in this incorrect simplification. I only mentioned this specific example because I was asked about it; I apologize if it seemed like I was accusing this is Lean's fault.
My initial question was simple, and I believe doesn't require an mwe: are there established methods of detecting, in norm_num, whether a junk value convention is used?
Yury G. Kudryashov (Feb 06 2026 at 16:55):
AFAIK, no. An #mwe would help to better understand what's happening.
Henrik Böving (Feb 06 2026 at 16:55):
So are you using Lean's decision procedures to power your external tool in hopes of getting more correct results?
Thomas Murrills (Feb 06 2026 at 16:56):
I really think the simplification steps prior to norm_num are mathematically incorrect, and that detecting norm_num’s behavior would not really help. (I’m trying to #xy a bit here, to be clear! :) ) Unless you want your simplification to be able to fail, and just want to filter out incorrect results?
Kevin Buzzard (Feb 06 2026 at 17:02):
Instead of asking Lean to tell you that you've used junk values, why don't you just check yourself? Here are the things which you shouldn't do: (1) divide by 0. (2) subtract naturals. (3) divide naturals or integers. Do you have any reason to do any of those things? I don't understand the difference in practice between "get Lean to tell me I just did one of those things" and "use Lean to check whether I am about to do one of those things".
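A checker along these lines could be quite small. Here is a toy sketch in Lean; the type RExpr and the function hasDivByZero are invented for illustration, and a real version would also have to catch denominators that merely evaluate to zero rather than being the literal 0:

```lean
/-- A toy expression type, for illustration only. -/
inductive RExpr where
  | num (n : Int)
  | add (a b : RExpr)
  | mul (a b : RExpr)
  | div (a b : RExpr)

/-- Flag any division whose denominator is syntactically the literal 0. -/
def hasDivByZero : RExpr → Bool
  | .num _ => false
  | .add a b | .mul a b => hasDivByZero a || hasDivByZero b
  | .div _ (.num 0) => true
  | .div a b => hasDivByZero a || hasDivByZero b

#eval hasDivByZero (.div (.num 1) (.num 0))                  -- true
#eval hasDivByZero (.div (.num 1) (.add (.num 0) (.num 0)))  -- false: syntactic only
```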
Darylgolden (Feb 06 2026 at 17:04):
Yury G. Kudryashov said:
AFAIK, no. An #mwe would help to better understand what's happening.
I suppose that's the answer I need. As for a #mwe, I'm not at my computer now (it's quite late for me), and am typing this on my phone. I can try to post one tomorrow, but I'm not sure in what form. The minimal case for my application would involve a plugin with a C and Rust FFI, not to mention it's still in a very messy state. I could also try to isolate the incorrect simplification and see if I can replicate the simplification proof in Lean itself (not in a Rust egg explanation, which is a separate DSL that has yet to be converted into a Lean proof), but I don't think that's of interest to either me or anyone else (since even if it works, it would be correct in a junk value sense, which isn't interesting mathematically)
Darylgolden (Feb 06 2026 at 17:09):
Kevin Buzzard said:
Instead of asking Lean to tell you that you've used junk values, why don't you just check yourself? Here are the things which you shouldn't do: (1) divide by 0. (2) subtract naturals. (3) divide naturals or integers. Do you have any reason to do any of those things? I don't understand the difference in practice between "get Lean to tell me I just did one of those things" and "use Lean to check whether I am about to do one of those things".
What does "check myself" mean? The code I'm running does lots of (mostly stupid and absurd) calls to norm num. These are computed at runtime and not something I can precheck by hand. Of course I could write code to check it, but my question is just precisely that: has anyone made something that checks it before, or is it already built into Lean or mathlib, and I've somehow missed it? It seems like the answer is no, so I think I'll proceed to write my own checker.
Kevin Buzzard (Feb 06 2026 at 17:13):
I guess I don't properly understand what you're doing. If I'm writing code manually then "check myself" means "don't press the / button on my keyboard unless the basic sanity checks are passed", and this would have some analogue in my naive mental model of what you are doing but perhaps my naive mental model is too naive.
Kevin Buzzard (Feb 06 2026 at 17:14):
norm_num is not going to do anything junky by itself; this is my main sticking point in trying to understand your question. If norm_num simplifies 1/0 to 0 then for me the question is how you ended up with 1/0 in the first place, not why norm_num simplified it to 0. The mathematically invalid step occurred earlier.
Darylgolden (Feb 06 2026 at 17:36):
There is very likely something going wrong with my code, but this seems orthogonal to the fact that there can still be a warning if a junk convention simplification is used, and it seems to me like this would be a helpful thing.
Thomas Browning (Feb 06 2026 at 17:38):
Well, a junk value simplification is only used if the input contains junk values, so the warning will only flag if you're already doing something silly like dividing by zero.
Thomas Murrills (Feb 06 2026 at 17:42):
(I do think a warning from norm_num that you’re likely doing something silly is probably helpful in general, though I’m (also) skeptical it’s what’s actually needed for this application, as it would merely repeat some of the advice on this zulip thread :) )
Andrés Goens (Feb 06 2026 at 17:42):
this is what some people call the "GIGO" principle: Garbage in, garbage out. It'd be interesting to understand where the "garbage in" part came from, e.g. by looking at the explanation. Equality saturation shouldn't unify the eclasses of your original expression with 0/0, so the junk values are a bit of a red herring here
Julian Berman (Feb 06 2026 at 17:51):
Thomas Murrills said:
(I do think a warning from norm_num that you’re likely doing something silly is probably helpful in general, though I’m (also) skeptical it’s what’s actually needed for this application, as it would merely repeat some of the advice on this zulip thread :) )
norm_num isn't the only tactic that might be presented with some expression which involves a junk value though, right? So it'd be broader than just one tactic -- and the list of junk values is not even finite: someone can at any time add some completely new definition somewhere which is undefined on paper at value foo of its domain, and then if you evaluate that thing there you're doing junk math but it all works out anyhow.
Thomas Murrills (Feb 06 2026 at 18:25):
Right, but this is no reason to avoid improving the UX in the common and feasible cases. :)
Julian Berman (Feb 06 2026 at 18:31):
Fair point!
Violeta Hernández (Feb 06 2026 at 18:33):
If we do make a norm_num warning for junk values, it should be strictly opt-in. There are legitimate reasons to prove "junk" theorems within Mathlib.
Thomas Murrills (Feb 06 2026 at 18:34):
Ah, okay, like what? I’m wondering what we would lose by having the warning be opt-out, so that it’s reflected in source that you’re depending on junk values.
Thomas Murrills (Feb 06 2026 at 18:35):
(That is, I’m imagining some syntax would be available, e.g. norm_num +junk, to proceed without warnings. Or maybe this is compatible with what you’re thinking about?)
Eric Wieser (Feb 06 2026 at 18:40):
It's not clear to me why junk values are norm_num's problem; if you want better UX for surprising junk values, this sounds like the job of a linter that looks at the generated terms
Thomas Murrills (Feb 06 2026 at 18:40):
(There is also the question of what to do, if anything, about simp theorems which exploit junk values. A different simp set? And what about simp theorems about e.g. / which are familiar and true everywhere, but only because we happen to have chosen _ / 0 = 0? Do we sequester those? This might be a nontrivial design choice.)
Thomas Murrills (Feb 06 2026 at 18:44):
Eric Wieser said:
It's not clear to me why junk values are norm_num's problem; if you want better UX for surprising junk values, this sounds like the job of a linter that looks at the generated terms
My thinking is basically because norm_num is the thing responsible for “actually applying” such a fact…in some sense. Also feasibility reasons: I’m not sure traversing proof terms with a linter would be performant enough, but it’s easy for norm_num to catch itself in the act.
It would be less of a principled approach to linting against junk values than “hey, I can definitely be sure you’re using them here, so I may as well let you know something’s fishy”. Maybe we don’t want this approach (and the warning itself would have to disclaim this), but I think it’s worth considering; it could be useful, especially for new users.
Ruben Van de Velde (Feb 06 2026 at 18:49):
But so is simp
Thomas Murrills (Feb 06 2026 at 18:53):
Ah, i.e. simp is also responsible for applying these facts? Indeed! Hence the message about simp above :grinning_face_with_smiling_eyes: (norm_num of course preprocesses via simp, so it’s not much use considering one without the other imo.)
Ruben Van de Velde (Feb 06 2026 at 18:59):
example : (1/0 : Nat) = 0 := by simp
for example
Thomas Murrills (Feb 06 2026 at 19:00):
Yes, I think we agree; did you see my message about simp above?
Aaron Liu (Feb 06 2026 at 19:03):
Thomas Murrills said:
Ah, okay, like what?
If you have theorems about functions that can return junk values then you usually have to add a side condition saying the input is not junk. But if the conclusion also happens to hold for junk inputs, then you can drop this assumption.
Thomas Murrills (Feb 06 2026 at 19:08):
Right, the latter is what I’m talking about here:
Thomas Murrills said:
And what about simp theorems about e.g. / which are familiar and true everywhere, but only because we happen to have chosen _ / 0 = 0? Do we sequester those? This might be a nontrivial design choice.
It’s indeed not clear to me what should happen in this case. The thing we care about is unexpected behavior, and I imagine sometimes this would produce unexpected behavior and sometimes not. I’m guessing we could cook up examples where the behavior is unexpected but nothing in the proof or term is untoward.
Thomas Murrills (Feb 06 2026 at 19:11):
(With norm_num, though, worth pointing out that it is very obvious whether the core normalization is doing something unexpected, given that it handles numerals. Can’t do anything if norm_num’s call to simp gets there first, though.)
Last updated: Feb 28 2026 at 14:05 UTC