Zulip Chat Archive

Stream: lean4

Topic: Separator character for numbers


Daniel Weber (Nov 22 2024 at 15:15):

Many languages (Java, Python, JavaScript, PHP, C#, C++, Rust, etc.) have some character they ignore in numbers, letting one write e.g. 12_345_678 = 12345678, which can make large numbers more readable. Would it be hard/cause potential ambiguity to implement something similar in Lean? Should I make an RFC for this?

Jon Eugster (Nov 24 2024 at 10:31):

I personally find the Swiss locale style 12'345'678.02 visually the nicest and least intrusive, if any symbol could be implemented.

The worst locale is probably the German one: 12.345.678,02 :wink::rolling_eyes:

Kim Morrison (Nov 24 2024 at 12:11):

What is there for Rust? I can only find crates for formatting numbers in different locales, nothing about language support for entering numbers with thousands-separators.

Daniel Weber (Nov 24 2024 at 12:18):

fn main() {
    println!("{}", 1_234_56_7.898_765);
}

Works for me

Ruben Van de Velde (Nov 24 2024 at 12:34):

https://doc.rust-lang.org/reference/tokens.html#numbers

Edward van de Meent (Nov 24 2024 at 12:34):

using underscores is also the standard for python, from what i recall.

Ruben Van de Velde (Nov 24 2024 at 12:36):

Indeed: https://docs.python.org/3/reference/lexical_analysis.html#integer-literals

Daniel Weber (Nov 24 2024 at 12:40):

IIRC All languages I mentioned use underscore except C++ which uses '

Daniel Weber (Nov 24 2024 at 12:44):

created lean4#6199

Edward Zhang (Nov 24 2024 at 14:01):

Kim Morrison said:

What is there for Rust? I can only find crates for formatting numbers in different locales, nothing about language support for entering numbers with thousands-separators.

fn main() {
    let large_dec_number: i128 = 1_234_5678_90;
    let large_hex_number: i128 = 0xFF_FF_FF_FF;
    println!("large_dec_number (decimal):\t {}", large_dec_number);
    println!("large_hex_number (decimal):\t {}", large_hex_number);
    println!("large_hex_number (hexadecimal:\t {:#X})", large_hex_number);
    // Output:
    // large_dec_number (decimal):      1234567890
    // large_hex_number (decimal):      4294967295
    // large_hex_number (hexadecimal:   0xFFFFFFFF)

    // BTW you can change the location of underscores
}

Just use underscores. Also work for hexadecimal numeric literals. I guess there is a "de facto standard" for using characters in number.

Edward Zhang (Nov 24 2024 at 14:21):

Confirm in Java SE 14 Edition. https://docs.oracle.com/javase/specs/jls/se14/jls14.pdf

Underscores are allowed as separators between digits that denote the integer.

What's for similar proof tools like Coq/Isabelle?

Edward Zhang (Nov 24 2024 at 14:22):

From the rust reference mentioned above:

All number literals allow _ as a visual separator: 1_234.0E+18f64`

Kyle Miller (Nov 24 2024 at 20:18):

I think the first appearance of using underscores in numeric literals was Ada in the 1980s, which probably influenced the rest.

Eric Wieser (Nov 24 2024 at 20:23):

To throw another language in, Verilog also uses _s for visual grouping of bits in hex and binary literals.

Eric Wieser (Nov 24 2024 at 20:24):

(so is perhaps appealing to the bitvec users)

Kyle Miller (Nov 24 2024 at 20:44):

Do you happen to know if Verilog allows multiple _s in sequence @Eric Wieser? The reference manual I found doesn't specify this in enough detail (page 190, 9-12)

Kyle Miller (Nov 24 2024 at 20:45):

Also, is 16'h_FF_FF allowed?

Kyle Miller (Nov 24 2024 at 20:48):

Nevermind, I found an online verilog compiler and answered my own question. Multiple _'s are allowed (like 16'hFF__FF) but an initial _ is not (like 16'h_FF_FF). I guess this makes sense, because Verilog is very Ada-like (though Ada does not allow multiple _s in sequence). Edit: found a copy of the specification, and this is consistent with what's written on pdf page 41, logical page 11.


Last updated: May 02 2025 at 03:31 UTC