Zulip Chat Archive

Stream: general

Topic: c++ subset formalization

Huỳnh Trần Khanh (Jun 29 2021 at 15:13):

so... i'm trying to formalize a very restricted subset of c++, inspired by WASM. and my code looks somewhat... horrible. any tips to improve the code? https://gist.github.com/huynhtrankhanh/a39cf40c6c791b46b0bf2e87aadc19f6

Huỳnh Trần Khanh (Jun 29 2021 at 15:14):

like i don't really expect you to check the definitions and the generated code lol, i just want like general tips to improve readability

Yakov Pechersky (Jun 29 2021 at 15:17):

to reduce duplication, do you want to factor out the word size as a separate type (i8, i16, i32, i64)?

Huỳnh Trần Khanh (Jun 29 2021 at 15:25):

that sounds like a good idea... but i have a hunch that the refactor strategy is going to significantly affect the readability (and usability) of the code. so maybe... how would you factor out the word size and reduce duplication then? i'm p sure that i don't have enough experience to introduce more abstraction without turning my code into spaghetti

Huỳnh Trần Khanh (Jun 29 2021 at 15:42):

also, another question, which should be less open ended: how can i generate a random list instruction_t for testing?

Ryan Lahfa (Jun 29 2021 at 15:44):

Huỳnh Trần Khanh said:

also, another question, which should be less open ended: how can i generate a random list instruction_t for testing?

I think that sampling arbitrary GADTs is possible, there are some papers on the subject

Yakov Pechersky (Jun 29 2021 at 15:44):

Here's an example:

inductive word
| i8
| i16
| i32
| i64

inductive variable_t (w : word)
| var : variable_t

inductive value_t (w : word)
| val : value_t

inductive instruction_t
| const (w : word) (value : value_t w) (output : variable_t w) : instruction_t
| duplicate (w : word) (input : variable_t w) (output : variable_t w) : instruction_t
| add (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| sub (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| mul (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| unsigned_div (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| signed_div (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| unsigned_mod (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| signed_mod (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| and (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| or (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| xor (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| shift_left (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| shift_right (w : word) (input1 : variable_t w) (input2 : variable_t w) (output : variable_t w) : instruction_t
| i32_to_i64 (input : variable_t word.i32) (output : variable_t word.i64) : instruction_t
| i64_to_i32 (input : variable_t word.i64) (output : variable_t word.i32) : instruction_t
| unsigned_load (address : variable_t word.i32) (output : variable_t word.i32) : instruction_t
| signed_load (address : variable_t word.i32) (output : variable_t word.i32) : instruction_t
| store (w : word) (address : variable_t word.i32) (output : variable_t w) : instruction_t
| break : instruction_t
| break_if (input : variable_t word.i32) : instruction_t
| loop (instrs : list instruction_t) : instruction_t

Yakov Pechersky (Jun 29 2021 at 15:45):

You can check out slim_check and the docs#slim_check.sampleable docs for arbitrary sampling.

Yakov Pechersky (Jun 29 2021 at 15:49):

I don't know exactly what your goal is. For extract_body, first write your own instruction_t -> str function. You can use sformat like so:

#eval sformat!"1 + 1 = {1 + 1}" -- "1 + 1 = 2"

Mario Carneiro (Jun 29 2021 at 19:22):

You should also factor out the binops themselves into an inductive. This will decrease both the size and visual complexity of the inductive as well as the printer

Yakov Pechersky (Jun 29 2021 at 19:34):

That was my hope, that my example would inspire further abstraction and refactoring.

Huỳnh Trần Khanh (Jun 30 2021 at 05:04):

this is not related to lean but is this a good way to force 8 bytes alignment?

union {uint8_t x[1000];uint64_t y;} m;

m.x is the program's linear memory, which is a fancy term for the heap

Mario Carneiro (Jun 30 2021 at 09:36):

yes, that will work

Last updated: May 02 2025 at 03:31 UTC