Zulip Chat Archive

Stream: lean4

Topic: Float.floor


Marcus Rossel (Oct 16 2021 at 10:41):

Is there some kind of flooring function for Float (preferably returning a Nat)?
I'm trying and struggling to express the inverse to Cantor's pairing function.

Kevin Buzzard (Oct 16 2021 at 10:42):

Whatever does Float have to do with Cantor's pairing function? You know that there are infinitely many naturals but only finitely many floats, right?

Marcus Rossel (Oct 16 2021 at 10:45):

The inverse to Cantor's pairing function requires a square root and a floor. So I was just looking for something in Lean 4 that allows me to take a square root.
I'm not trying to prove any theorems about the definition.

Kevin Buzzard (Oct 16 2021 at 10:47):

The inverse to Cantor's pairing function does not require floats (or reals). Your reference for it does. Just use nat.sqrt. Port it from Lean 3, I doubt it's in Lean 4 yet but it would be a fun exercise.

František Silváši (Oct 16 2021 at 10:50):

For floor you can always cheat if you don't mind the linear complexity. def poorMansFloor (x : Float) := x.toString.takeWhile (·≠'.')

Kevin Buzzard (Oct 16 2021 at 10:55):

Just glancing over the Wikipedia ref you posted, it seems that the issue is that given a natural zz you want to find xx and yy such that z=π(x,y)z=\pi(x,y). The definition of π(x,y)\pi(x,y) is "some triangular number, + y", and the triangular number is the largest triangular number which is z\leq z. So what you _really_ need is a good algorithm to find the largest triangular number (i.e. number of the form w(w+1)/2w(w+1)/2) which is at most zz. So you need to find the largest natural ww such that (2w+1)28z+1(2w+1)^2\leq 8z+1. You don't need floats or reals for that, you could knock up a dumb algorithm easily and a smarter algorithm with some work.

Xubai Wang (Oct 16 2021 at 10:55):

You can also use

def Float.toNat (x : Float) : Nat :=
  x.toUInt64.toNat

but it seems that Float is not meant for mathematical use.

Kevin Buzzard (Oct 16 2021 at 10:56):

For example you could start with 1 and then keep adding powers of 2 and backtrack when you go over.

Kevin Buzzard (Oct 16 2021 at 10:57):

(the dumb algorithm is just to start with 1 and then keep adding 2 until you go over)

Mario Carneiro (Oct 16 2021 at 10:59):

FYI the inverse to cantor's pairing function is implemented in lean 3, see src#nat.unpair

Mario Carneiro (Oct 16 2021 at 11:00):

also src#nat.sqrt if you are interested in how to do sqrt without floats

Kevin Buzzard (Oct 16 2021 at 11:00):

Looking at the Lean code that looks to me at first sight like it's a different bijection to the one on the Wikipedia page.

Marcus Rossel (Oct 16 2021 at 11:00):

Thanks for all the feedback :blush:
My purpose for this is really just to have a method for iterating over all tuples in N26\mathbb{N}^{26}.
So I thought a generalized inverse pairing function would do the job :D

Mario Carneiro (Oct 16 2021 at 11:01):

Oh, actually that's true, this is the box-shaped bijection, not the diagonal one. It just changes a couple constants

Mario Carneiro (Oct 16 2021 at 11:02):

Note that for high dimensions, iterating the unpairing function results in (inverse) superexponential growth, so you get really biased coverage

Marcus Rossel (Oct 16 2021 at 11:02):

Awesome! unpair is perfect for my purposes.

Marcus Rossel (Oct 16 2021 at 11:03):

Mario Carneiro said:

Note that for high dimensions, iterating the unpairing function results in (inverse) superexponential growth, so you get really biased coverage

What does that mean?

Kevin Buzzard (Oct 16 2021 at 11:04):

The Lean bijection is more convenient than the Wikipedia definition because you don't need "largest odd integer whose square is <= n", you need "largest integer whose square is <= n" , and there's already a name for that.

Mario Carneiro (Oct 16 2021 at 11:06):

If you map n0, ..., n25 to (((n0, n1), ... n25) then if each ni is O(k) then (n0, n1) is O(k^2), ((n0, n1), n2) is O((k^2)^2), and so on so that (((n0, n1), ... n25) is O(k^(2^25))

Mario Carneiro (Oct 16 2021 at 11:07):

Which means that if you won't hit (3,3,3, ..., 3) until about 3^(2^25)

Marcus Rossel (Oct 16 2021 at 11:08):

Hmmm... I guess I didn't consider the number of combinations :sweat_smile:

Mario Carneiro (Oct 16 2021 at 11:09):

A much more balanced way to hit the numbers is to convert every number into binary and then read a sequence like 101,10,111,...,1000 as a number written in ternary on the alphabet 0 1 ,

Mario Carneiro (Oct 16 2021 at 11:10):

I think that is within a constant factor of optimal; you can do a bit better to encode the fact that there are exactly 25 ,'s

Mario Carneiro (Oct 16 2021 at 11:11):

but depending on what you are doing that may or may not be important

Marcus Rossel (Oct 16 2021 at 11:12):

Ok now I'm not sure if we're talking about the same thing :sweat_smile:
I have a function with 26 inputs f (a b c ... x y z : Nat) : Int and would like to systematically go through all input combinations (I'm aware that there are infinitely many).

Mario Carneiro (Oct 16 2021 at 11:12):

Yes, that's what I'm talking about too

Mario Carneiro (Oct 16 2021 at 11:13):

I'm pointing out that within such functions, there is still a notion of optimality, where we ask that for small inputs we want small outputs

Mario Carneiro (Oct 16 2021 at 11:14):

you get okay results with iterated unpairing if you split the coordinates in a balanced way

Marcus Rossel (Oct 16 2021 at 11:14):

Ahhh, now I get it

Mario Carneiro (Oct 16 2021 at 11:15):

for a lot of theoretical purposes, it literally doesn't matter at all what the growth rate of the function is, in which case iterated unpairing is the simplest implementation

Mario Carneiro (Oct 16 2021 at 11:15):

and in fact that's what mathlib does in its encodable instances

Marcus Rossel (Oct 16 2021 at 11:15):

Now I just have to figure out how to convert Nat into a list of ternary digits

Mario Carneiro (Oct 16 2021 at 11:16):

src#encodable.encode_list

Mario Carneiro (Oct 16 2021 at 11:17):

see src#nat.digits for converting a nat into a list of digits

Mario Carneiro (Oct 16 2021 at 11:18):

Another variation is the high-dimensional analogue of cantor's bijection, where you iterate over the level sets of x0+x1+x2+...+x25=kx_0+x_1+x_2+...+x_{25}=k

Mario Carneiro (Oct 16 2021 at 11:19):

I think you need inverse nth roots to express the inverse function though

Kevin Buzzard (Oct 16 2021 at 11:20):

Iterate over max<=k rather than sum<=k to make the inverse function nicer

Marcus Rossel (Oct 16 2021 at 11:23):

Actually, I don't get it ....
If I have a natural number n that I want to convert into a 26-tuple (a, b, c, ..., x, y, z), I first interpret n as ternary over symbols 0, 1 and ,. Let's say n = 01,1,001. How do I extract a to z from this? Since n's representation doesn't contain 25 ,s I don't know where to "split" n.

Mario Carneiro (Oct 16 2021 at 11:25):

Kevin Buzzard said:

Iterate over max<=k rather than sum<=k to make the inverse function nicer

I think you need the inverse nth root regardless, because the volume of a 26 dimensional box and a 26 dimensional simplex both grows as n^26

Kevin Buzzard (Oct 16 2021 at 11:25):

yes, but with the triangular one you need more than 26th root, you need "inverse of some random degree 26 polynomial"

Kevin Buzzard (Oct 16 2021 at 11:26):

like we needed inverse of triangular number for the Wikipedia version, but we could complete the square.

Mario Carneiro (Oct 16 2021 at 11:26):

aha

Mario Carneiro (Oct 16 2021 at 11:26):

yes, that will be a problem for d >= 5

Kevin Buzzard (Oct 16 2021 at 11:26):

:-)

Kevin Buzzard (Oct 16 2021 at 11:27):

Given that 3^26 is quite a big number, maybe a table of small values will be fine :-)

Mario Carneiro (Oct 16 2021 at 11:28):

@Marcus Rossel Are you allowed to have a partial inverse, because that will make things easier. You can just reject inputs that don't have 25 ,'s

Mario Carneiro (Oct 16 2021 at 11:29):

One way to enforce this is with some kind of stars-and-bars way of enumerating exactly where to place the commas. But at that point the high dimensional cantor is looking like a simpler solution

Mario Carneiro (Oct 16 2021 at 11:37):

Actually balanced unpair is much simpler. You are off from optimal by a polynomial factor depending on the amount of unbalance in the tree, which can't be perfect because 26 isn't a power of two, but that should only be something like O(n^0.4) or so from optimal, which is probably passable for practical use (certainly it's much better than n^(2^25))

Mario Carneiro (Oct 16 2021 at 11:40):

If you have 26 elements of O(k), then the result is O(k^32), where the optimal is O(k^26), so it's off by a factor O(n3/13)O(n0.24)O(n^{3/13}) \le O(n^{0.24})


Last updated: Dec 20 2023 at 11:08 UTC