Zulip Chat Archive
Stream: lean4
Topic: Separator character for numbers
Daniel Weber (Nov 22 2024 at 15:15):
Many languages (Java, Python, JavaScript, PHP, C#, C++, Rust, etc.) have some character they ignore in numbers, letting one write e.g. 12_345_678 = 12345678
, which can make large numbers more readable. Would it be hard/cause potential ambiguity to implement something similar in Lean? Should I make an RFC for this?
Jon Eugster (Nov 24 2024 at 10:31):
I personally find the Swiss locale style 12'345'678.02
visually the nicest and least intrusive, if any symbol could be implemented.
The worst locale is probably the German one: 12.345.678,02
:wink::rolling_eyes:
Kim Morrison (Nov 24 2024 at 12:11):
What is there for Rust? I can only find crates for formatting numbers in different locales, nothing about language support for entering numbers with thousands-separators.
Daniel Weber (Nov 24 2024 at 12:18):
fn main() {
println!("{}", 1_234_56_7.898_765);
}
Works for me
Ruben Van de Velde (Nov 24 2024 at 12:34):
https://doc.rust-lang.org/reference/tokens.html#numbers
Edward van de Meent (Nov 24 2024 at 12:34):
using underscores is also the standard for python, from what i recall.
Ruben Van de Velde (Nov 24 2024 at 12:36):
Indeed: https://docs.python.org/3/reference/lexical_analysis.html#integer-literals
Daniel Weber (Nov 24 2024 at 12:40):
IIRC All languages I mentioned use underscore except C++ which uses '
Daniel Weber (Nov 24 2024 at 12:44):
created lean4#6199
Edward Zhang (Nov 24 2024 at 14:01):
Kim Morrison said:
What is there for Rust? I can only find crates for formatting numbers in different locales, nothing about language support for entering numbers with thousands-separators.
fn main() {
let large_dec_number: i128 = 1_234_5678_90;
let large_hex_number: i128 = 0xFF_FF_FF_FF;
println!("large_dec_number (decimal):\t {}", large_dec_number);
println!("large_hex_number (decimal):\t {}", large_hex_number);
println!("large_hex_number (hexadecimal:\t {:#X})", large_hex_number);
// Output:
// large_dec_number (decimal): 1234567890
// large_hex_number (decimal): 4294967295
// large_hex_number (hexadecimal: 0xFFFFFFFF)
// BTW you can change the location of underscores
}
Just use underscores. Also work for hexadecimal numeric literals. I guess there is a "de facto standard" for using characters in number.
Edward Zhang (Nov 24 2024 at 14:21):
Confirm in Java SE 14 Edition. https://docs.oracle.com/javase/specs/jls/se14/jls14.pdf
Underscores are allowed as separators between digits that denote the integer.
What's for similar proof tools like Coq/Isabelle?
Edward Zhang (Nov 24 2024 at 14:22):
From the rust reference mentioned above:
All number literals allow _ as a visual separator: 1_234.0E+18f64`
Kyle Miller (Nov 24 2024 at 20:18):
I think the first appearance of using underscores in numeric literals was Ada in the 1980s, which probably influenced the rest.
Eric Wieser (Nov 24 2024 at 20:23):
To throw another language in, Verilog also uses _
s for visual grouping of bits in hex and binary literals.
Eric Wieser (Nov 24 2024 at 20:24):
(so is perhaps appealing to the bitvec users)
Kyle Miller (Nov 24 2024 at 20:44):
Do you happen to know if Verilog allows multiple _
s in sequence @Eric Wieser? The reference manual I found doesn't specify this in enough detail (page 190, 9-12)
Kyle Miller (Nov 24 2024 at 20:45):
Also, is 16'h_FF_FF
allowed?
Kyle Miller (Nov 24 2024 at 20:48):
Nevermind, I found an online verilog compiler and answered my own question. Multiple _
's are allowed (like 16'hFF__FF
) but an initial _
is not (like 16'h_FF_FF
). I guess this makes sense, because Verilog is very Ada-like (though Ada does not allow multiple _
s in sequence). Edit: found a copy of the specification, and this is consistent with what's written on pdf page 41, logical page 11.
Last updated: May 02 2025 at 03:31 UTC