Zulip Chat Archive

Stream: lean4

Topic: Variable width C types


Henrik Böving (Oct 19 2021 at 18:53):

Does there exist a mechanism or best practice to deal with variable width C types. Variable width in this context referring to time_t or other C types whose size is unknown for every platform but will be known during compilation time.

Mario Carneiro (Oct 20 2021 at 01:27):

In what sense is the size unknown then?

Mario Carneiro (Oct 20 2021 at 01:28):

Are you saying that the size is target dependent but otherwise fixed?

Henrik Böving (Oct 20 2021 at 06:01):

Exactly, in general the types in the sys/types.h header behave like this so C code can be written in a cross platform way without relying on any fixed size in the code itself. I could of course just pick the size that is most commonly aliased for these types but I figured this would lead to problems sooner or later.

Xubai Wang (Oct 20 2021 at 07:25):

You can use something like sizeof(time_t)

For example, these lines shows how Lean box size_t (USize) type, which is also a variable width type.

lean_obj_res lean_box_usize(size_t v) {
    lean_obj_res r = lean_alloc_ctor(0, 0, sizeof(size_t));
    lean_ctor_set_usize(r, 0, v);
    return r;
}

The only difference is that you have to manually box and unbox these types.

Gabriel Ebner (Oct 20 2021 at 07:36):

Does there exist a mechanism or best practice to deal with variable width C types. Variable width in this context referring to time_t or other C types whose size is unknown for every platform but will be known during compilation time.

The short answer is no. Everything in Lean is boxed unless it fits into a 63-bit integer. For time_t, you could also use def TimeT := Nat with overflow/bound checking happening at the interface.

For other structures, you should wrap them in a heap-allocated external object using lean_register_external_class, lean_alloc_external, etc.

Sebastian Ullrich (Oct 20 2021 at 07:45):

UInt*/USize are passed unboxed as long as they are not nested in other (non-trivial) types, so the current maximum is 64 bits or whatever your CPU word size is

Henrik Böving (Oct 20 2021 at 07:46):

Xubai Wang said:

You can use something like sizeof(time_t)

For example, these lines shows how Lean box size_t (USize) type, which is also a variable width type.

lean_obj_res lean_box_usize(size_t v) {
    lean_obj_res r = lean_alloc_ctor(0, 0, sizeof(size_t));
    lean_ctor_set_usize(r, 0, v);
    return r;
}

The only difference is that you have to manually box and unbox these types.

I'm aware how to do this in C, my question was more like how I could represent such a type on the lean side.

Gabriel Ebner said:

Does there exist a mechanism or best practice to deal with variable width C types. Variable width in this context referring to time_t or other C types whose size is unknown for every platform but will be known during compilation time.

The short answer is no. Everything in Lean is boxed unless it fits into a 63-bit integer. For time_t, you could also use def TimeT := Nat with overflow/bound checking happening at the interface.

For other structures, you should wrap them in a heap-allocated external object using lean_register_external_class, lean_alloc_external, etc.

So basically you are proposing that if i want to for example interface with clock_gettimeand return the timespec struct it fills for me into Lean (its defined like this in C:)

struct timespec {
    time_t  tv_sec;  /* Seconds */
    long    tv_nsec; /* Nanoseconds */
};

I should represent this like:

def TimeT := Nat
structure Timespec where
    sec : TimeT
    nsec : UInt64

?

Gabriel Ebner (Oct 20 2021 at 07:47):

Either that, or make an external type for timespec.

Sebastian Ullrich (Oct 20 2021 at 07:50):

See also IO.FS.SystemTime (which maybe shouldn't be in FS...)

Henrik Böving (Oct 20 2021 at 10:39):

Yeah I'm partially makng use of IO.FS.SystemTime as well (im more or less building a Haskell time clone)

Henrik Böving (Oct 21 2021 at 16:39):

I've settled on a mix of USize and the Nat suggestion for now and so far its working nicely

@[extern "lean_statvfs_fsblkcnt_nbits"] constant Fsblkcnt.getNumBits : Unit  { n : Nat // n = 32  n = 64 } :=
  fun _ => 64, Or.inr rfl -- inhabitant

def Fsblkcnt.numBits : Nat :=
  (getNumBits ()).val

theorem Fsblkcnt.numBits_eq : (numBits = 32)  (numBits = 64) :=
  (getNumBits ()).property

def Fsblkcnt.size : Nat := 2^numBits

abbrev Fsblkcnt := Fin Fsblkcnt.size

Mac (Oct 21 2021 at 20:14):

@Henrik Böving If the value in question can be 64 bits, you really should not be using Nat as it will be unnecessarily boxed . Why not just always use UInt64 -- casting to/from that is much cheaper? Or if the size just depends on the system word width, USize?

Henrik Böving (Oct 21 2021 at 20:23):

So instead just cast all my values of variable width into UInt64 (or the maximum size i suspect they can have)?

Mac (Oct 21 2021 at 21:02):

@Henrik Böving yes, that is a much more lightweight solution (along with using USize where feasible)


Last updated: Dec 20 2023 at 11:08 UTC