## Stream: Program verification

### Topic: RISC-V ISA in Lean

#### SnowFox (Oct 02 2020 at 06:38):

(Migrating from 'Is there code for X?" "bidirectional maps?" = https://leanprover.zulipchat.com/#narrow/stream/217875-Is-there.20code.20for.20X.3F/topic/bidirectional.20maps.3F/near/212013731) @Mario Carneiro

#### SnowFox (Oct 02 2020 at 07:18):

@Mario Carneiro Can you explain why you're using inductive Props in your x86.lean?

#### Mario Carneiro (Oct 02 2020 at 07:18):

because it's a spec

#### Mario Carneiro (Oct 02 2020 at 07:18):

it's not intended to be directly executable

#### Mario Carneiro (Oct 02 2020 at 07:18):

but the advantage is that you don't need separate encode/decode

#### SnowFox (Oct 02 2020 at 07:26):

Hmm. Not something I can use then.

#### Mario Carneiro (Oct 02 2020 at 07:26):

No, I don't expect that aspect will translate

#### Mario Carneiro (Oct 02 2020 at 07:27):

I wrote a proof producing disassembler for x86/ARM before which is closer to what you need, but it is at least for now closed source

Okay.

#### SnowFox (Oct 02 2020 at 07:31):

I've heard good things about metaprogramming in Lean but ... the documentation doesn't demonstrate how. Do you have any reference I can use for implementing make_defs? https://leanprover.github.io/reference/metaprogramming.html

#### Mario Carneiro (Oct 02 2020 at 07:34):

for the most part it's just functional programming

#### Mario Carneiro (Oct 02 2020 at 07:34):

you have to interact with the lean API to make definitions

#### Mario Carneiro (Oct 02 2020 at 07:34):

and the API is admittedly not so well documented

#### Mario Carneiro (Oct 02 2020 at 07:35):

Rob Lewis made a metaprogramming tutorial that is pretty good

#### SnowFox (Oct 02 2020 at 07:36):

Thanks. I'll watch the series.

#### Jannis Limperg (Oct 02 2020 at 11:31):

There's also the metaprogramming section of the Hitchhiker's guide and, if I may be so bold, my blog post on some gotchas of the API (but that'll only make sense once you have a bit of an overview). Beyond that, you'll have to read init/meta/expr.lean and init/meta/tactic.lean, which are reasonably documented.

#### SnowFox (Oct 02 2020 at 12:57):

Okay. I've watched the short series linked above and bookmarked the blog post. I've adjusted the python scripts to generate at least a decent mostly-correct subset of RISC-V instruction to format with embedded constants list. add_inductive and add_defn_equations appear to be my next steps, which look straight forward. Afterwards, I may open an issue in the riscv-opcode repo proposing an alternative format and a new code generator to make certain things cleaner and better reflect the specification.

Thanks for all the links everyone. I'll resume this tomorrow morning. :)

#### SnowFox (Oct 02 2020 at 13:52):

Okay, I couldn't resist but... ouch. add_inductive, as the documentation for its sibling add_ginductive suggests, doesn't generate the auxiliary definitions necessary for matching against it... I couldn't figure out how to use add_ginductive however. Could someone correct this code?

meta def make_defs (l : list (name × type)) : tactic unit :=
tactic.add_inductive rvim [] 0 (Type) (l.map (λ ⟨n, t⟩, ⟨n, expr.const rvim []⟩))


#### SnowFox (Oct 12 2020 at 06:11):

@Mario Carneiro Any insight on this issue?

#### Mario Carneiro (Oct 12 2020 at 06:20):

meta def make_defs' (l : list (name × type)) : tactic unit :=
do env ← tactic.get_env,
[] [] [((rvim', (Type)),
l.map (λ ⟨n, t⟩,
{ constr := rvim' ++ n,
type := expr.const rvim' [] }))]
ff,
tactic.set_env env'

run_cmd make_defs' [(a, ()), (b, ())]
#print rvim'


#### SnowFox (Oct 12 2020 at 06:25):

There we go, thanks again!

#### Mario Carneiro (Oct 12 2020 at 06:27):

Note that I changed your code a bit to prefix rvim on the names of the constructors. Lean will let you name the constructors something else, but the front end inductive command doesn't, so it seems ill-advised

#### SnowFox (Oct 12 2020 at 06:47):

@Mario Carneiro Noticed. Though now I'm stuck trying to control the types of the constructors. As these need the registers and immediate values. I was expecting something like type := (nat -> rvim) to work. But that says "invalid return type". As it does with just  (rvim).

#### Mario Carneiro (Oct 12 2020 at 06:48):

rvim needs to be a local constant

#### SnowFox (Oct 12 2020 at 06:49):

Like let rvim := rvim in (rvim)?

#### SnowFox (Oct 12 2020 at 06:50):

Or  let rvim := @expr.const ff rvim [] in (rvim) ? Neither worked.

#### Mario Carneiro (Oct 12 2020 at 06:51):

The tricky part is that rvim doesn't exist yet, so (rvim) isn't going to work

#### Mario Carneiro (Oct 12 2020 at 06:51):

You can use let rvim : expr := expr.const rvim' [], and then splice it in

#### Mario Carneiro (Oct 12 2020 at 06:51):

i.e. ((%%rvim : Type) → (%%rvim : Type))

#### Mario Carneiro (Oct 12 2020 at 06:52):

actually, it seems it should be a constant not a local constant

#### Mario Carneiro (Oct 12 2020 at 06:53):

which is extra tricky because that definitely doesn't typecheck

#### SnowFox (Oct 12 2020 at 06:53):

Well, this worked. let rvim : expr := expr.const rvim [] in (nat -> (%%rvim : Type))

#### Mario Carneiro (Oct 12 2020 at 06:53):

I think the usual approach if you need more advanced inference would be to create a new local constant with the appropriate type, and then substitute the constant for it at the end

#### SnowFox (Oct 12 2020 at 07:03):

Okay, I have rvim expanded with the correct number of natural arguments for each instruction. :)

#### SnowFox (Oct 12 2020 at 07:03):

But it is getting slow... which is worrying considering how early this is still.

#### SnowFox (Oct 12 2020 at 07:05):

What is the vscodium trick to expand all the variants on a def? def q : rvim -> nat := _

#### SnowFox (Oct 12 2020 at 07:06):

I thought it was under ctrl+dot with the _ in focus. But that doesn't include the option for some reason.

#### SnowFox (Oct 12 2020 at 07:07):

I've seen a different method on a video

#### SnowFox (Oct 12 2020 at 07:11):

Video used {! x -> y} and that worked. I guess the language server just has a hole..

#### Mario Carneiro (Oct 12 2020 at 07:17):

processing new inductives is a little sluggish, especially if you are using nested or mutual inductives

#### Mario Carneiro (Oct 12 2020 at 07:19):

You can double check if you have accidentally defined a nested inductive because if you #print rvim it will say def rvim := ..  instead of inductive rvim ...

#### SnowFox (Oct 12 2020 at 07:20):

It says inductive.

#### Mario Carneiro (Oct 12 2020 at 07:20):

how many variants are we talking here?

#### SnowFox (Oct 12 2020 at 07:21):

I have 63 instructions.

#### Mario Carneiro (Oct 12 2020 at 07:23):

hm, I wonder whether no_confusion is as expensive as I suspected

#### SnowFox (Oct 12 2020 at 07:23):

inductive fmt : Type
| R     (opcode7 funct3 funct7 : nat) -- (rd rs1 rs2     : nat)
| I     (opcode7 funct3        : nat) -- (rd rs1     imm : nat)
| ISH   (opcode7 funct3 funct7 : nat) -- (rd rs1     imm : nat)
| ISHW  (opcode7 funct3 funct7 : nat) -- (rd rs1     imm : nat)
| S     (opcode7 funct3        : nat) -- (   rs1 rs2 imm : nat)
| B     (opcode7 funct3        : nat) -- (   rs1 rs2 imm : nat)
| U     (opcode7               : nat) -- (rd         imm : nat)
| J     (opcode7               : nat) -- (rd         imm : nat)
| FENCE (opcode7               : nat)

meta def make_defs (l : list (name × fmt)) : tactic unit :=
tactic.updateex_env λ env, env.add_ginductive options.mk [] [] [( (rvim, (Type)), l.map (λ ⟨n, t⟩, { constr := rvim ++ n, type := let rvim : expr := expr.const rvim [] in match t with | (fmt.R _ _ _) := (nat -> nat -> nat -> (%%rvim : Type)) | (fmt.I _ _) := (nat -> nat -> nat -> (%%rvim : Type)) | (fmt.ISH _ _ _) := (nat -> nat -> nat -> (%%rvim : Type)) | (fmt.ISHW _ _ _) := (nat -> nat -> nat -> (%%rvim : Type)) | (fmt.S _ _) := (nat -> nat -> nat -> (%%rvim : Type)) | (fmt.B _ _) := (nat -> nat -> nat -> (%%rvim : Type)) | (fmt.U _) := (nat -> nat -> (%%rvim : Type)) | (fmt.J _) := (nat -> nat -> (%%rvim : Type)) | (fmt.FENCE _) := ((%%rvim : Type)) end } ) )] ff open fmt -- rv{32,64}[im] sans pseudo instructions run_cmd make_defs [ --#eval list.length [ -- rv32i (BEQ, B 0b1100011 0b000), (BNE, B 0b1100011 0b001), (BLT, B 0b1100011 0b100), (BGE, B 0b1100011 0b101), (BLTU, B 0b1100011 0b110), (BGEU, B 0b1100011 0b111), (JALR, I 0b1100111 0b000), (JAL, J 0b1101111), (LUI, U 0b0110111), (AUIPC, U 0b0010111), (ADDI, I 0b0010011 0b000), (SLLI, ISH 0b0010011 0b001 0b0000000), (SLTI, I 0b0010011 0b010), (SLTIU, I 0b0010011 0b011), (XORI, I 0b0010011 0b100), (SRLI, ISH 0b0010011 0b101 0b0000000), (SRAI, ISH 0b0010011 0b101 0b0100000), (ORI, I 0b0010011 0b110), (ANDI, I 0b0010011 0b111), (ADD, R 0b0110011 0b000 0b0000000), (SUB, R 0b0110011 0b000 0b0100000), (SLL, R 0b0110011 0b001 0b0000000), (SLT, R 0b0110011 0b010 0b0000000), (SLTU, R 0b0110011 0b011 0b0000000), (XOR, R 0b0110011 0b100 0b0000000), (SRL, R 0b0110011 0b101 0b0000000), (SRA, R 0b0110011 0b101 0b0100000), (OR, R 0b0110011 0b110 0b0000000), (AND, R 0b0110011 0b111 0b0000000), (LB, I 0b0000011 0b000), (LH, I 0b0000011 0b001), (LW, I 0b0000011 0b010), (LBU, I 0b0000011 0b100), (LHU, I 0b0000011 0b101), (SB, S 0b0100011 0b000), (SH, S 0b0100011 0b001), (SW, S 0b0100011 0b010), (FENCE, I 0b0001111 0b000), -- FENCEI belongs in Zifencei... why is it here? -- (FENCE_I, I 0b0001111 0b001), -- Shouldn't we have ECALL and EBREAK? -- rv32m (MUL, R 0b0110011 0b000 0b0000001), (MULH, R 0b0110011 0b001 0b0000001), (MULHSU, R 0b0110011 0b010 0b0000001), (MULHU, R 0b0110011 0b011 0b0000001), (DIV, R 0b0110011 0b100 0b0000001), (DIVU, R 0b0110011 0b101 0b0000001), (REM, R 0b0110011 0b110 0b0000001), (REMU, R 0b0110011 0b111 0b0000001), -- rv64i (ADDIW, I 0b0011011 0b000), (SLLIW, ISHW 0b0011011 0b001 0b0000000), (SRLIW, ISHW 0b0011011 0b101 0b0000000), (SRAIW, ISHW 0b0011011 0b101 0b0100000), (ADDW, R 0b0111011 0b000 0b0000000), (SUBW, R 0b0111011 0b000 0b0100000), (SLLW, R 0b0111011 0b001 0b0000000), (SRLW, R 0b0111011 0b101 0b0000000), (SRAW, R 0b0111011 0b101 0b0100000), (LD, I 0b0000011 0b011), (LWU, I 0b0000011 0b110), (SD, S 0b0100011 0b011), -- rv64m (MULW, R 0b0111011 0b000 0b0000001), (DIVW, R 0b0111011 0b100 0b0000001), (DIVUW, R 0b0111011 0b101 0b0000001), (REMW, R 0b0111011 0b110 0b0000001), (REMUW, R 0b0111011 0b111 0b0000001) ] def is_16_bit_instruction (n : ℕ) : Prop := n.land 0b11 ≠ 0b11 def is_32_bit_instruction (n : ℕ) : Prop := n.land 0b11 = 0b11 ∧ n.land 0b11100 ≠ 0b11100 def is_valid_instruction (n : ℕ) : Prop := is_16_bit_instruction n ∨ is_32_bit_instruction n  #### Mario Carneiro (Oct 12 2020 at 07:28): one thing you can do to make the ISA easier to analyze is to bucket the instructions, for example putting all binops in one variant #### Mario Carneiro (Oct 12 2020 at 07:29): It wouldn't be hard to bucket these by type now, although that may not be exactly the layout you want #### SnowFox (Oct 12 2020 at 07:34): RISC-V uses tri-ops generally; though compressed ops do include binary ops where the first source register is repeated as the destination. ADD rd rs1 rs2 is rd = rs1 + rs2. I like the current order, ordered by XLEN and ISA extensions; because these are the units they are introduced. If I were to add more order, it'd probably be either (opcode, funct3, funct7)-order or format-shape order. #### SnowFox (Oct 12 2020 at 07:34): reg[rd] = reg[rs1] + reg[rs2] ** #### SnowFox (Oct 12 2020 at 07:35): Oh, you meant reduce the inductive variants and use defs to map the instructions to the fewer variants? #### Mario Carneiro (Oct 12 2020 at 07:36): If you have an inductive type with 63 constructors, proving anything by cases on the type will be an exercise in patience (particularly if you have two variables, e.g. to prove determinism) #### SnowFox (Oct 12 2020 at 07:37): I'm open to any way to reduce the complexity. #### Mario Carneiro (Oct 12 2020 at 07:39): e.g. inductive binop | ADD | SUB | SLL -- | ... inductive loadop | LUI | LD -- | ... inductive rvim | binop (b : binop) (rd rs rt : ℕ) : rvim | load (b : loadop) (rd imm : ℕ) : rvim  #### Mario Carneiro (Oct 12 2020 at 07:41): Bucketing based on format shape is the obvious thing to do, but you should also separate out variants that treat their arguments significantly differently in the same format-shape, not sure if this is a major concern in riscv #### Mario Carneiro (Oct 12 2020 at 07:43): Note that proving no_confusion with a bucketed inductive like this is literally lower asymptotic complexity, since if you have a 7 * 7 bucketed inductive then proving that different variants are different is 8 * 7 ^ 2 large (8 inductives of 7 variants each) instead of 49 ^ 2 with a 49 variant inductive #### SnowFox (Oct 12 2020 at 07:49): Note all of these instructions are either a pick-3 of rd, rs1, rs2, imm; or both rd, imm; or zero (in the single case of FENCE), so they could all be reduced to 3 cases (so far; though there are more variations with more ISA extensions active). The format constants (opcode7, funct3, funct7) could be flattened into a single natural of 17 bits. The opcode7 internally embeds the tag for the format variant. #### SnowFox (Oct 12 2020 at 07:50): Rather, it doesn't explicitly embed the tag but is sufficient to reconstruct it. #### SnowFox (Oct 12 2020 at 07:51): The pick-3 can be seen at the top, in the comments of the inductive fmt section. #### SnowFox (Oct 12 2020 at 07:53): One deviating hack I've considered is to allow the rd/rs1/rs2 to exceed their 5-bit limit to represent pseudo-registers a la SSA. But this adds a mode where assembly is illegal; which is fine. Assembly should happen only after register allocation anyway and these instructions are useful before register allocation. #### SnowFox (Oct 12 2020 at 07:53): and imm, for offset symbols. #### Mario Carneiro (Oct 12 2020 at 07:56): The format constants (opcode7, funct3, funct7) could be flattened into a single natural of 17 bits. I would suggest translating these to variants as early as possible though, because lean's handling of 17 bit numbers will not be satisfactory to you #### SnowFox (Oct 12 2020 at 07:56): @Mario Carneiro BTW: "LUI" is load upper-immediate; isn't a load involving memory so I wouldn't call it a "load operation". It only writes the immediate to the upper bits of a register. #### Mario Carneiro (Oct 12 2020 at 07:56): yeah you can tell I'm mocking things quickly #### Mario Carneiro (Oct 12 2020 at 07:57): it's not actually about the type of operation that it does but rather it's "mode of operation", i.e. the way it handles its arguments #### Mario Carneiro (Oct 12 2020 at 07:57): just imagine you are doing a proof by cases on these variants, what is the most useful categorization? #### Mario Carneiro (Oct 12 2020 at 07:58): if some of these only interact with certain parts of the machine state (e.g. ALU ops vs memory ops) that's another possible categorization #### SnowFox (Oct 12 2020 at 07:59): RISC-V is indeed a load-store architecture; which means that the ALU operations do not touch memory at all; and memory operations do no arithmetic (besides immediate offsets added to the register for the address) #### Mario Carneiro (Oct 12 2020 at 08:00): One deviating hack I've considered is to allow the rd/rs1/rs2 to exceed their 5-bit limit to represent pseudo-registers a la SSA. But this adds a mode where assembly is illegal; which is fine. Assembly should happen only after register allocation anyway and these instructions are useful before register allocation. This should be fine; you can have an assembly function that turns an instruction into a bit pattern, and it can fail on some arguments #### Mario Carneiro (Oct 12 2020 at 08:02): although it sounds like you might need an extended semantics with a machine with infinite registers to make sense of the extra instructions #### Mario Carneiro (Oct 12 2020 at 08:02): unless they are just garbage instructions with no semantics #### Mario Carneiro (Oct 12 2020 at 08:04): I know some compilers use a v-code style approach, where you have the same instructions as the machine but with infinite registers; I prefer to just have an IR for that, where I get to pick a nicer set of high level instructions #### SnowFox (Oct 12 2020 at 08:08): Regarding proofs; all will target the non-{pseudo,compressed} instructions. Pseudoinstructions won't exist in my compiler. Compressed instructions are only abbreviations of the larger instructions. Compression is applied when certain registers are used and when the destination is the first source register and in a few other cases. Currently all instructions are 32-bits. Compressed instructions are 16. Higher sizes exist in the specification but have no members and their format isn't frozen. #### SnowFox (Oct 12 2020 at 08:09): v-code? #### SnowFox (Oct 12 2020 at 08:10): My IR will be RVSDG-based enriched with equivalence graphs a la eqsat. #### SnowFox (Oct 12 2020 at 08:11): RVSDG is very lambda calculus ish. My source language will be RDT based. #### Mario Carneiro (Oct 12 2020 at 08:11): well you also need a low level IR for doing register allocation #### Mario Carneiro (Oct 12 2020 at 08:12): unless you are going to go straight from there to machine code #### SnowFox (Oct 12 2020 at 08:12): #### Mario Carneiro (Oct 12 2020 at 08:14): I forget where I heard about v-code architecture, possibly cranelift #### SnowFox (Oct 12 2020 at 08:15): My understanding of RVSDG is that register colouring and instruction selection can be done straight from it without additional IRs. #### Mario Carneiro (Oct 12 2020 at 08:17): Is there a more complete reference for RVSDG than the poster? #### SnowFox (Oct 12 2020 at 08:17): BTW: eqsat enriched here means that the RVSDG IR is non-destructively optimized by adding equivalent programs in an equivalence graph. All optimizations are applied in all orders until the resource bounds are reached. #### SnowFox (Oct 12 2020 at 08:17): Yes. #### SnowFox (Oct 12 2020 at 08:18): I thought it had a link at the bottom but it just says the reference. #### SnowFox (Oct 12 2020 at 08:19): And that didn't seem to reference what I expected.. This is a great start. https://arxiv.org/abs/1912.05036 #### Mario Carneiro (Oct 12 2020 at 08:20): I also found Reissmann's thesis #### SnowFox (Oct 12 2020 at 08:22): Also a good resource. Though this arxiv paper is the GOTO reference IIUC. The poster is just a nice dense visual overview. :) #### SnowFox (Oct 12 2020 at 08:30): As an added bonus; with RDT as the source and RVSDG in the middle; I'll have linearity throughout the whole compiler. Which should have very good implications for memory usage. #### SnowFox (Oct 12 2020 at 08:47): Re splitting up rvim. If I split by these categories; the largest I have is 21. First by XLEN {32, 64}, then by ISA extension {i, m}, then into {control flow (8), arithmetic (21), memory (8)} with these numbers for rv32i. rv64i adds (0, 9, 3) respectively. rv32m +8 arith, rv64m +5 arith. #### SnowFox (Oct 12 2020 at 08:47): But this might be a bit too fine.. ? #### Mario Carneiro (Oct 12 2020 at 08:51): The XLEN thing can hopefully be parameterized #### Mario Carneiro (Oct 12 2020 at 08:52): ISA extension is probably not worth it unless you intend to do something in particular with e.g. compiling to just rv64i #### Mario Carneiro (Oct 12 2020 at 08:53): For arithmetic ops in particular I would try my best to get uniform handling of a large number of those #### Mario Carneiro (Oct 12 2020 at 08:53): i.e. operations which differ only in the mathematical binary function being applied #### SnowFox (Oct 12 2020 at 08:54): I will be compiling to restrictive configurations; but most extensions can be emulated so are safe to allow so long that they do get reduced to the target. I.e. multiplication without the M extension. #### Mario Carneiro (Oct 12 2020 at 08:54): Another way to represent that is to have a predicate that tells you whether an instruction is in the configuration #### Mario Carneiro (Oct 12 2020 at 08:55): which seems like a good idea especially for risc-v because it has so many extensions #### SnowFox (Oct 12 2020 at 08:55): You'll love the vector extension ;) #### Mario Carneiro (Oct 12 2020 at 08:56): egh.. throw those all in one variant so they don't pollute the rest of the code #### SnowFox (Oct 12 2020 at 08:57): Note these aren't all R, the format detection code I wrote in python to generate these doesn't yet understand the vectors. https://gist.github.com/4fd94ffef4917b2f42d9bdb48e72fa8b #### SnowFox (Oct 12 2020 at 08:57): 441 #### Mario Carneiro (Oct 12 2020 at 08:58): I mean, this is just a bad representation, total abuse of the concept of an instruction #### SnowFox (Oct 12 2020 at 08:58): Certainly a lot more consumable than Intel's many many thousands of instructions. #### Mario Carneiro (Oct 12 2020 at 08:58): #### Mario Carneiro (Oct 12 2020 at 08:59): in lean you can at least separate those mostly into orthogonal representation pieces (like size + op) #### Mario Carneiro (Oct 12 2020 at 08:59): there are 441 things you can do, in the same sense that there are 441 kinds of burgers I can get at the local burger shop #### SnowFox (Oct 12 2020 at 09:00): Very important note for RISC-V vectors; they are hardware configuration agnostic. I.e. unlike SSE, SSE3, AVX, AVX2, AVX512; RISC-V's V extension covers all of them generically. The vector lengths are user controlled and the hardware implements and optimizes however it wants to, if at all. #### Mario Carneiro (Oct 12 2020 at 09:01): I find 37 occurrences of ADD in that list. clearly it's not orthogonal enough #### SnowFox (Oct 12 2020 at 09:04): https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc #### Mario Carneiro (Oct 12 2020 at 09:08): One thing I notice in the docs is that instructions are indeed grouped, and they even use "dot notation" for separating out the orthogonal parts, e.g. vw(add|sub)(u?).{vw}{vx} has some clear orthogonality going on #### SnowFox (Oct 12 2020 at 09:08): Note that many of these instructions really are just names for variants of the same instruction. #### SnowFox (Oct 12 2020 at 09:08): As you just pointed out. :) #### Mario Carneiro (Oct 12 2020 at 09:09): I think it is very important that a formalization of the spec makes as much orthogonality explicit as possible #### SnowFox (Oct 12 2020 at 09:09): Agreed and this is the goal. #### Mario Carneiro (Oct 12 2020 at 09:10): I still don't have a great grasp on how many instructions x86 has if you do this properly, but it's a lot less than 1500 #### SnowFox (Oct 12 2020 at 09:11): Imagine how low that'd be for RISC-V ;) #### Mario Carneiro (Oct 12 2020 at 09:11): it's just unfortunate that so many ISA specs "case bash" in their documentation #### Mario Carneiro (Oct 12 2020 at 09:12): so it's not clear when something is one part of a family or a special exception until you read all the exponentially many doc strings to ensure they are all the same #### SnowFox (Oct 12 2020 at 09:33): @Mario Carneiro Not sure if you remember; but I started with only the "opcode map" and not the instructions themselves. Note that most opcodes are the same. The 32-bit instructions have 7 bits, where the low two are both high. Thus 5 bits, 32 opcodes. Note that SUB is really an ADD with one of its basically unused bits set high. #### SnowFox (Oct 12 2020 at 09:35): #### SnowFox (Oct 12 2020 at 09:37): All the branch instructions have the BRANCH opcode, row = 11, column = 000, size = 11 (32-bit length). 1100011, #### Mario Carneiro (Oct 12 2020 at 09:37): For the most part the decoding is less important than the semantic classes of the instructions. It's not clear to me if the difference between ADD and SUB is any less/more than the difference between ADD and XOR, except in the encoding #### SnowFox (Oct 12 2020 at 09:38): ADD and XOR have the same opcode, but are different functions. See their funct3 bits. ADD is 000, XOR is 100. #### Mario Carneiro (Oct 12 2020 at 09:38): I see that #### Mario Carneiro (Oct 12 2020 at 09:39): ADD and SUB have the same funct3 but differ in the extra bits #### SnowFox (Oct 12 2020 at 09:39): Yes. Similarly for shifting left and right. #### Mario Carneiro (Oct 12 2020 at 09:39): but that seems like they just wanted 10 ALU ops and ran out of bits for the encoding #### SnowFox (Oct 12 2020 at 09:41): Not quite; SUB is exactly the same hardware as ADD with one minor tweak. SUB is effectively a fused instruction to invert the sign, then add. Thus subtraction is an argument of addition.... ish #### Mario Carneiro (Oct 12 2020 at 09:42): hm, where's add-carry? #### SnowFox (Oct 12 2020 at 09:42): Er, the left/right shifts use funct3. The right shifts logical vs arithmetic use the extra bit in funct7. Which just decides if it sign-extends or not. #### SnowFox (Oct 12 2020 at 09:42): No add-carry because there are no carry flags or overflow exceptions ;) #### Mario Carneiro (Oct 12 2020 at 09:42): I guess a SUB is an ADC + NOT #### Mario Carneiro (Oct 12 2020 at 09:42): well I mean the operation x , y -> x + y + 1 #### Mario Carneiro (Oct 12 2020 at 09:43): or x + y + c if possible #### Mario Carneiro (Oct 12 2020 at 09:43): it's pretty important to have an op for that if you want to do e.g. bignum addition #### SnowFox (Oct 12 2020 at 09:44): Right, not an instruction. Easy to emulate of course. Just as rv32i does multiplication and division without multiplication or division. #### SnowFox (Oct 12 2020 at 09:44): Bignum implementations should defer carries anyway. Familiar with radix 2^51? #### SnowFox (Oct 12 2020 at 09:45): https://www.chosenplaintext.ca/articles/radix-2-51-trick.html #### SnowFox (Oct 12 2020 at 09:52): RISC-V avoids exceptions and flags. There are no arithmetic flags. I.e. division by zero acts like it does in Lean, but in hardware! There are floating point flags, available in the control and status registers (CSR); fflags. #### Mario Carneiro (Oct 12 2020 at 09:55): I don't object to that, but I hope there is still a way to perform a 65-bit addition #### SnowFox (Oct 12 2020 at 10:02): Lets ask godbolt. https://godbolt.org/z/WPs5Mq Add then set less-than unsigned. widdening_add: # @widdening_add add a0, a0, a1 sltu a1, a0, a1 ret  #### Mario Carneiro (Oct 12 2020 at 10:09): clever #### SnowFox (Oct 12 2020 at 10:10): These are equivalent. uint128 widdening_add(uint64 a, uint64 b) { return (uint128) a + (uint128) b; } uint128 widdening_add2(uint64 a, uint64 b) { uint64 c = a + b; return (uint128) c + (((uint128) (c < b)) << 64); }  #### SnowFox (Oct 12 2020 at 10:12): To contrast to x86... widdening_add: mov rax, rdi xor edx, edx xor edi, edi add rax, rsi adc rdx, rdi ret widdening_add2: add rdi, rsi mov edx, 0 mov rax, rdi jnc .L3 add rdx, 1 .L3: ret  #### SnowFox (Oct 12 2020 at 10:13): Register movement noise and it doesn't even generate the same code. #### Mario Carneiro (Oct 12 2020 at 10:14): I don't think that's optimal code though #### Mario Carneiro (Oct 12 2020 at 10:15): I'm pretty sure you can get it down to one instruction if you handcode the assembly #### SnowFox (Oct 12 2020 at 10:15): BTW: I use RISC-V's assembly to debug my x86 code... I avoid x86. :D #### Mario Carneiro (Oct 12 2020 at 10:16): I don't want to suggest that x86 is a good ISA by any stretch, but ADC looks like a useful instruction to me #### Mario Carneiro (Oct 12 2020 at 10:18): Aside: even norm_num likes adc #### SnowFox (Oct 12 2020 at 10:23): I believe that RISC-V is the best ISA option there is. It may not be perfect but the complaints are small and few. I think I can trim RISC-V down by using the real opcode map instead of the instruction list; then generalize over the formats and get something pretty small. With this, I guess I'm going to set aside our progress so far and start fresh. Focusing on keeping the inductives reasonable. An unfortunate reason to push towards this design, though I hope it'll work out better anyway. #### Mario Carneiro (Oct 12 2020 at 10:27): The main reason I chose x86 instead of RISC-V for the MM0 project was because x86 hardware is more accessible #### Mario Carneiro (Oct 12 2020 at 10:28): I think that from a technical perspective RISC-V is an obvious improvement on all axes, because it was actually designed instead of accrued #### SnowFox (Oct 12 2020 at 10:32): I don't know the implications on hardware demands for your project; but for mine, where there are sensitive security critical details involved... I'd really rather start with a well designed ISA I can memorize and reason with. I'm "okay" with running my code in an emulator or using a JIT to x86 recompiler to run my code. Simply because it will take time to implement my compiler and language, to implement what I want to in it, and verify the whole chain there. This is a project for the future, where I can only see success for RISC-V and x86 necessarily dying out. Worst case, someone else can just extend my compiler with an x86 backend. ;) #### SnowFox (Oct 12 2020 at 10:33): The emulator and JIT obviously ruin the security properties of the language and compiler; but are fine to just run code. #### Mario Carneiro (Oct 12 2020 at 10:33): Well, as long as it can be soundly specified one ISA is as good as another #### Mario Carneiro (Oct 12 2020 at 10:35): but other arches are on the todo list, after the initial MVP is done #### SnowFox (Oct 12 2020 at 10:36): What are your targets? #### Mario Carneiro (Oct 12 2020 at 10:36): But for the purpose of minimizing the chain of trust, it's no good to emulate the hardware because then you have to trust the emulator and the x86 machine it's running on #### Mario Carneiro (Oct 12 2020 at 10:37): Right now, I target a simple subset of x86-64 + a simple subset of posix #### Mario Carneiro (Oct 12 2020 at 10:38): the output is an ELF executable file, and the spec describes how the bits of that file turn into operational behavior #### SnowFox (Oct 12 2020 at 10:39): Right; I can't trust x86 at all and I certainly can't trust my RISC-V code when emulating it. The emulator is exclusively to just run it on legacy hardware until both all my code is ready, and RISC-V is cheap enough and has taken over the world. Yes, I'm calling all x86 legacy. :) #### SnowFox (Oct 12 2020 at 10:42): Kernel-wise; I'll target seL4 and will write a basic wrapper to run it on POSIX environments. OS-wise, I'll target Robigalia. Which is barely a thing so far.... but we're passionate to have a safe operating system. :) #### SnowFox (Oct 12 2020 at 10:45): I'm focusing on cryptography, capabilities, and verified distributed protocols. #### SnowFox (Oct 12 2020 at 10:46): Do you plan to implement all of x86? #### SnowFox (Oct 12 2020 at 10:47): I'd like to at least handle all of RV{32,64}IMCV. Note the lack of atomics and floats. Though they could come later. #### SnowFox (Oct 12 2020 at 10:48): Might throw in B too, the bit manipulation extension can be good for cryptography. But that is much less ready than V. V might even be stable soon. #### SnowFox (Oct 12 2020 at 10:50): I believe that SiFive is waiting on V to be stable before they offer a cheap Linux-ready SBC. Holding back with the expensive devboards until then. #### Mario Carneiro (Oct 12 2020 at 10:50): SnowFox said: Do you plan to implement all of x86? Certainly not. x86 is the means not the end. I've got about 30 instructions and even that might have been more than needed #### Mario Carneiro (Oct 12 2020 at 10:52): https://github.com/digama0/mm0/blob/master/examples/x86.mm0 is about 1600 lines to specify x86 + linux enough to compile a C-like programming language #### Mario Carneiro (Oct 12 2020 at 10:55): The main inductive type for instructions has 5+6+7+4=22 variants #### SnowFox (Oct 12 2020 at 10:56): aside: My godbolt test was a widening addition with 2x 64 bit numbers, not two 65 bit numbers. Not sure which you intended. If you meant the 2x 65-bit addition, then it is equivalent to 2x 128-bit addition. 3x 64-bit adds + one sltu to extract the carry bit. #### Mario Carneiro (Oct 12 2020 at 10:56): The unit of work in a bignum addition is 64 + 64 + 1 bit addition to produce a 65 bit result #### SnowFox (Oct 12 2020 at 10:59): Then I guess that'd be 3x add + 2x sltu. #### SnowFox (Oct 12 2020 at 11:00): But I think that'd be a silly method. When working with lots of or large numbers, you really want to defer the carries. Carries are too slow to handle immediately. #### SnowFox (Oct 12 2020 at 11:01): Especially if you have lots of operations to do. #### Mario Carneiro (Oct 12 2020 at 11:05): For verification though, with a single add loop, you probably don't want the complexity of deferred carries #### Mario Carneiro (Oct 12 2020 at 11:05): like in the norm_num example, I don't know if deferred carries help at all #### Mario Carneiro (Oct 12 2020 at 11:05): because it's serial #### SnowFox (Oct 12 2020 at 11:09): The difference is when you do something like adding 50 numbers. With the smaller radix; you split each number into smaller parts and let them grow until they could overflow, and finally apply the carries all at once. For radix 2^51 with 64-bit words, you can add 2^13 pre-normalized numbers before you need to normalize the carries. Next with associativity, you can see how parallel you can make this go. Not serial. #### Mario Carneiro (Oct 12 2020 at 11:10): No I mean proof checking is serial #### Mario Carneiro (Oct 12 2020 at 11:11): so something like norm_num has no advantage producing proofs of numerical facts with deferred carries #### SnowFox (Oct 12 2020 at 11:13): /me shrugs; just not where you'd use either then :P #### Mario Carneiro (Oct 12 2020 at 11:14): well, lean does addition in base 2 instead of 2^64 but the principle is the same #### Mario Carneiro (Oct 12 2020 at 11:15): amusingly, I've written norm_num three times now in three different proof systems and each time the base is different: lean's is base 2, metamath uses base 10, and MM0 uses base 16 #### SnowFox (Oct 12 2020 at 11:15): Heh #### SnowFox (Oct 12 2020 at 11:19): @Mario Carneiro You mentioned "nested inductives" earlier, what does this mean? I understand mutually inductive types, haven't seen nested. Is there a syntax for it or just a thing you can do with the low level tools? #### Mario Carneiro (Oct 12 2020 at 11:19): inductive foo | mk : list foo -> foo  #### Mario Carneiro (Oct 12 2020 at 11:20): you use the inductive you are defining as a parameter to another inductive type #### SnowFox (Oct 12 2020 at 11:20): Ah. Okay. #### Mario Carneiro (Oct 12 2020 at 11:20): functions don't count - this is a regular inductive inductive foo | mk : (nat -> foo) -> foo  #### SnowFox (Oct 12 2020 at 11:21): Noted. What is special about the nested variant that makes it a def instead of an inductive; in so far as #print sees it? #### Mario Carneiro (Oct 12 2020 at 11:25): lean's kernel doesn't actually support nested inductives, so instead it does some smoke and mirrors around a definition like mutual inductive foo, list_foo with foo : Type | mk : list_foo -> foo with list_foo : Type | nil : list_foo | cons : foo -> list_foo -> list_foo  which is itself defined in terms of inductive foo' : bool -> Type | mk : foo' tt -> foo' ff | nil : foo' tt | cons : foo' ff -> foo' tt -> foo' tt  which is a regular (indexed) inductive type that lean's kernel understands. Then you have to define all the maps from this type to the original type you wanted, and prove the recursion principle, and here the support is a little lacking, and you get a weak recursion principle that isn't good enough to prove all the facts you would like #### Mario Carneiro (Oct 12 2020 at 11:27): So I advise you to avoid the conditions under which lean triggers the nested/mutual compilation strategy, and write foo' directly if you need to #### SnowFox (Oct 12 2020 at 11:28): Huh, sounds unfortunate.. Noted. #### SnowFox (Oct 12 2020 at 11:50): @Mario Carneiro What do you think of this so far? abbreviation register := nat inductive reg_imm | reg (reg : register) | imm (imm : nat) inductive compare | eq | ne | lt | ge | ltu | geu inductive operation | add (invert : opt_param bool ff) | shift (amount : ℤ) (arith : opt_param bool ff) | xor | or | and inductive ast | branch (c : compare) (rs1 rs2 : register) (offset : nat) | jump (reg : register) (base : option nat) (offset : nat) | arith (op : operation) (rd rs1 : register) (rs2_imm : reg_imm) | load (dest : register) (width : nat) (base : register) (offset : nat) | store (src : register) (width : nat) (base : register) (offset : nat)  #### Mario Carneiro (Oct 12 2020 at 11:51): is the shift amount always an immediate? #### SnowFox (Oct 12 2020 at 11:52): Er, that amount shouldn't be in that position. #### SnowFox (Oct 12 2020 at 11:52): But it is the only integer here, so replace that with direction : bool #### Mario Carneiro (Oct 12 2020 at 11:53): looks pretty good. Is this the whole thing? #### SnowFox (Oct 12 2020 at 11:53): Or just two separate operations. #### SnowFox (Oct 12 2020 at 11:53): This covers RV32I. #### Mario Carneiro (Oct 12 2020 at 11:53): if so it's a pretty massive reduction from the original #### SnowFox (Oct 12 2020 at 11:54): Very! #### SnowFox (Oct 12 2020 at 11:56): There is also what I started with; before we derailed with the raw instructions. Which I'll probably use this round. inductive rv_base_opcode | load | load_fp | custom_0 | misc_mem | op_imm | auipc | op_imm_32 | store | store_fp | custom_1 | amo | op | lui | op_32 | madd | msub | nmsub | nmadd | op_fp | reserved_0 | custom_2 | branch | jalr | reserved_1 | jal | system | reserved_2 | custom_3  #### Mario Carneiro (Oct 12 2020 at 11:56): This doesn't cover encoding though; hopefully these classes align well with the opcodes #### Mario Carneiro (Oct 12 2020 at 11:56): right, that #### SnowFox (Oct 12 2020 at 11:58): They do. All operations over immediates are rv_base_opcode.op_imm. All operations over registers are rv_base_opcode.op. The specifics only map to funct3 and funct7 where funct7 is basically one bit rarely set, occasionally a different bit. :P #### SnowFox (Oct 12 2020 at 11:58): ... mostly lots of zeros :) #### Mario Carneiro (Oct 12 2020 at 11:59): I mean ast #### Mario Carneiro (Oct 12 2020 at 11:59): clearly rv_base_opcode maps well to the opcodes, AFAICT it's 1-1 #### SnowFox (Oct 12 2020 at 11:59): Right, I meant that ast.arith has only two opcodes. #### SnowFox (Oct 12 2020 at 12:00): ast.branch maps to opcodes.branch, ast.jump with jal and jalr. load and store, are well, exactly those. #### Mario Carneiro (Oct 12 2020 at 12:01): you should try writing the function ast -> (opc, funct3, funct7), hopefully you can pattern match your way to victory #### Mario Carneiro (Oct 12 2020 at 12:03): or I guess that could even map directly to nat (really fin (2^32)) using helper functions like fmt_R (opcode7 funct3 funct7 rd rs1 rs2 : nat) : nat #### Mario Carneiro (Oct 12 2020 at 12:04): that seems like the cleanest way to write down all that encoding information on this ast basis #### SnowFox (Oct 12 2020 at 12:10): Why does vscodium not always offer to expand case splits for you? o.o #### SnowFox (Oct 12 2020 at 13:16): Hmm. Does it make sense to expand these out or use "modes" of the base operation? inductive operation | add | sub | addw | sll | srl | sra | xor | or | and | mul | mulh | mulhs | mulhsu | mulw | div | divu | divw | divuw | rem | remu | remw | remuw  #### Mario Carneiro (Oct 12 2020 at 13:17): how are they specified? #### SnowFox (Oct 12 2020 at 13:17): rv64i adds addw. rvm adds the last 3 rows. #### SnowFox (Oct 12 2020 at 13:18): Eh, RV64I adds a few more. #### Mario Carneiro (Oct 12 2020 at 13:18): The main question is whether you are always going to have to case split the subparts anyway #### SnowFox (Oct 12 2020 at 13:19): Yes. #### Mario Carneiro (Oct 12 2020 at 13:19): if there are functions where you can say (mul _) := stuff where _ is the subop, then it's a win #### SnowFox (Oct 12 2020 at 13:19): The Ws are operations over sign-extended 32-bits on 64-bit systems. the rest are XLEN-wise. #### SnowFox (Oct 12 2020 at 13:20): All but the logical operations have a W variant. #### SnowFox (Oct 12 2020 at 13:20): So I guess I should split arithmetic into two modes and split logic out. #### SnowFox (Oct 12 2020 at 13:21): I also forgot about set less than (unsigned) which I guess is logic. #### SnowFox (Oct 12 2020 at 13:23): However, there is no mul[h[[s]u]]w. #### SnowFox (Oct 12 2020 at 13:23): Only mulw, of all those options. #### SnowFox (Oct 12 2020 at 13:24): Which I guess is fine to "allow" and later mark as invalid. #### SnowFox (Oct 12 2020 at 13:24): As I'd do for subtraction with an immediate. #### SnowFox (Oct 12 2020 at 13:25): (Because addition with an immediate has signed immediates already) #### SnowFox (Oct 12 2020 at 13:39): RV{32,64}IM with a few illegal cases, but easy to normalize (subi -> addi) or mark as invalid (no mulhw et al). abbreviation register := nat inductive reg_imm | reg (reg : register) | imm (imm : nat) inductive compare | eq | ne | lt | ge | ltu | geu inductive arithmetic | add | sub | sll | srl | sra | slt | sltu | mul | mulh | mulhu | mulhsu | div | divu | rem | remu inductive logical | xor | or | and inductive ast | arith (op : arithmetic) (w : bool) (rd rs1 : register) (rs2_imm : reg_imm) | logic (op : logical) (rd rs1 : register) (rs2_imm : reg_imm) | lui (rd : register) (offset : nat) | auipc (rd : register) (offset : nat) | jump (reg : register) (base : option nat) (offset : nat) | branch (c : compare) (rs1 rs2 : register) (offset : nat) | load (dest : register) (width : fin 4) (base : register) (offset : nat) | store (src : register) (width : fin 4) (base : register) (offset : nat) | fence (mode pred succ : nat) | call | break  #### SnowFox (Oct 12 2020 at 14:03): @Mario Carneiro get_opcode and get_funct namespace ast def get_opcode : ast -> opcode | (branch _ _ _ _) := opcode.branch | (jump _ (some _) _) := opcode.jalr | (jump _ none _) := opcode.jal | (arith _ ff _ _ (reg_imm.reg _)) := opcode.op | (arith _ ff _ _ (reg_imm.imm _)) := opcode.op_imm | (arith _ tt _ _ (reg_imm.reg _)) := opcode.op_32 | (arith _ tt _ _ (reg_imm.imm _)) := opcode.op_imm_32 | (logic _ _ _ (reg_imm.reg _)) := opcode.op_32 | (logic _ _ _ (reg_imm.imm _)) := opcode.op_imm_32 | (lui _ _) := opcode.lui | (auipc _ _) := opcode.auipc | (load _ _ _ _) := opcode.load | (store _ _ _ _) := opcode.store | (fence _ _ _) := opcode.misc_mem | call := opcode.system | break := opcode.system open arithmetic open logical open compare -- funct7 << 3 + funct3 def get_funct : ast -> nat -- rvi | (branch c _ _ _) := match c with | eq := 0b000 | ne := 0b001 | lt := 0b100 | ge := 0b101 | ltu := 0b110 | gtu := 0b111 end | (jump _ _ _) := 0 | (arith add _ _ _ _) := 0b000 | (arith sub _ _ _ _) := 0b000 + nat.shiftl 3 0b0100000 | (arith sll _ _ _ _) := 0b001 | (arith slt _ _ _ _) := 0b010 | (arith sltu _ _ _ _) := 0b011 | (arith srl _ _ _ _) := 0b101 + nat.shiftl 3 0b0100000 | (arith sra _ _ _ _) := 0b101 | (logic xor _ _ _) := 0b100 | (logic or _ _ _) := 0b110 | (logic and _ _ _) := 0b111 | (load _ width _ _) := width.val | (store _ width _ _) := width.val | (lui _ _) := 0 | (auipc _ _) := 0 | (fence _ _ _) := 0 | call := 0 | break := 0 -- rvm | (arith mul _ _ _ _) := 0b000 + nat.shiftl 3 0b0000001 | (arith mulh _ _ _ _) := 0b001 + nat.shiftl 3 0b0000001 | (arith mulhsu _ _ _ _) := 0b010 + nat.shiftl 3 0b0000001 | (arith mulhu _ _ _ _) := 0b011 + nat.shiftl 3 0b0000001 | (arith div _ _ _ _) := 0b100 + nat.shiftl 3 0b0000001 | (arith divu _ _ _ _) := 0b101 + nat.shiftl 3 0b0000001 | (arith rem _ _ _ _) := 0b110 + nat.shiftl 3 0b0000001 | (arith remu _ _ _ _) := 0b111 + nat.shiftl 3 0b0000001 end ast  #### SnowFox (Oct 12 2020 at 14:04): @Mario Carneiro How does this partitioning look? #### SnowFox (Oct 12 2020 at 14:14): s/ast/inst/g. Not quite a tree... #### Mario Carneiro (Oct 12 2020 at 14:17): why the match on branch conditions, instead of putting it in the top level? #### Mario Carneiro (Oct 12 2020 at 14:19): Unless funct7 is actually 3 bits above funct3, you probably shouldn't mix them like that, and instead either return an inductive type (with a case for funct3 and another for funct3 + funct7), or a precomposed nat containing the entire instruction (using auxiliary functions type_1 : funct3 -> nat and type_2 : funct3 -> funct7 -> nat) #### SnowFox (Oct 12 2020 at 14:21): Actually I should shift it more than 3, as they are a fixed distance across. This is consistent for all 32-bit instructions. #### SnowFox (Oct 12 2020 at 14:21): That match will be moved. I'm currently enumerating the opcodes of interest to map them to formats; then I'll write an encoder and decoder which masks appropriately. #### Mario Carneiro (Oct 12 2020 at 14:22): I mean you have those instruction formats, right? you should keep them separate until you have all the fields #### SnowFox (Oct 12 2020 at 14:25): The format only decides the layout of the immediates and which are present between rd rs1 rs2 funct3 funct7. funct7 "doesn't exist for immediates" but there are a few immediates which effectively use it. Specifically the shift right arithmetic. So it "acts" half-way between I and R. #### Mario Carneiro (Oct 12 2020 at 14:27): that is, you can define def fmt_R (opcode7 funct3 funct7 rd rs1 rs2 : nat) : nat := opcode7 + rd.shiftl 7 + funct3.shiftl 12 + rs1.shiftl 15 + rs2.shiftl 20 + funct7.shiftl 25  and then write encode : ast -> nat that calls fmt_R for R-type instructions #### Mario Carneiro (Oct 12 2020 at 14:28): and fmt_I doesn't have a funct7 argument so there is no bad typing #### SnowFox (Oct 12 2020 at 14:42): The format only influences decoding (as we need to know how to mask the correct values out); and for the immediate layout while encoding. The encoder can assume every instruction has an opcode, has all the registers, and has both function parts without caring about the format. #### SnowFox (Oct 12 2020 at 14:44): Only the immediate explosion depends on the format -- for encoding. Actually decoding could probably be pretty lazy too. #### Mario Carneiro (Oct 12 2020 at 14:54):  def opcode.encode : opcode → nat := sorry def branch_funct3 : compare → nat | eq := 0b000 | ne := 0b001 | lt := 0b100 | ge := 0b101 | ltu := 0b110 | gtu := 0b111 def B_type (opcode funct3 rs1 rs2 imm : nat) : nat := sorry def J_type (opcode rd imm : nat) : nat := sorry def I_type (opcode funct3 rd rs1 imm : nat) : nat := sorry def R_type (opcode funct3 funct7 rd rs1 rs2 : nat) : nat := sorry def encode_arith : arithmetic → ℕ × ℕ | add := (0b000, 0) | sub := (0b000, 0b0100000) | sll := (0b001, 0) | slt := (0b010, 0) | sltu := (0b011, 0) | srl := (0b101, 0b0100000) | sra := (0b101, 0) | _ := sorry def encode : ast -> nat | (branch c rs1 rs2 off) := B_type opcode.branch.encode (branch_funct3 c) rs1 rs2 off | (jump reg (some rs1) off) := I_type opcode.jalr.encode 0b000 reg rs1 off | (jump reg none off) := J_type opcode.jal.encode reg off | (arith op w rd rs1 (reg_imm.reg rs2)) := let (funct3, funct7) := encode_arith op in R_type (if w then opcode.op_32 else opcode.op).encode funct3 funct7 rd rs1 rs2 | (arith op w rd rs1 (reg_imm.imm imm)) := let (funct3, _) := encode_arith op in I_type (if w then opcode.op_imm_32 else opcode.op_imm).encode funct3 rd rs1 imm | _ := sorry  #### Mario Carneiro (Oct 12 2020 at 14:57): I'm ignoring the funct7 field of arith with immediate here, I suppose that means that subi doesn't exist? #### SnowFox (Oct 12 2020 at 14:58): Well, subi doesn't exist anyway. #### Mario Carneiro (Oct 12 2020 at 14:59): yeah, the type of this function doesn't really allow for error handling so I guess it doesn't matter what it does in that case #### SnowFox (Oct 12 2020 at 14:59): But srli does. #### Mario Carneiro (Oct 12 2020 at 14:59): oh? #### SnowFox (Oct 12 2020 at 15:00): Yes, the shifts are the exception to the format rule here; they don't fit in any of them. :P #### Mario Carneiro (Oct 12 2020 at 15:00): no ISA is complete without a few exceptions to the rule #### SnowFox (Oct 12 2020 at 15:00): Because you encode the shift amount in the lower bits of the immediate; the upper bits are all zeros except for the single bit which indicates if a right shift is arithmetic vs logical. #### SnowFox (Oct 12 2020 at 15:00): arithmetic means sign-extended shift. #### Mario Carneiro (Oct 12 2020 at 15:03): ah, the funct7 bits are switched between srl and sra #### SnowFox (Oct 12 2020 at 15:04): Yeah I noticed that while splitting the funct7 out to its own function. #### SnowFox (Oct 12 2020 at 15:18):  def encode (i : inst) : ℕ := i.get_opcode.to_nat + i.rd.shiftl 7 + i.funct3.shiftl 12 + i.rs1.shiftl 15 + i.rs2.shiftl 20 + i.funct7.shiftl 25 + i.imm  #### SnowFox (Oct 12 2020 at 15:18): This works for all 32-bit instructions. #### SnowFox (Oct 12 2020 at 15:19): The format only influences the positions of the immediates. :) #### SnowFox (Oct 12 2020 at 15:21): Returning to the magic metaprogramming we started with; implemented for all instructions. I'll adapt this to the opcode, funct3, and funct7 mapping to get bidirectional maps. #### SnowFox (Oct 12 2020 at 15:23): Actually. I may take it a step further regarding the opcode map. To actually use the rows/columns to encode them. :) Which is basically just the binary count of the position +3 while skipping any where the lower 5 bits are all high. #### SnowFox (Oct 12 2020 at 15:24): The condition map is at least a direct count. Is there a @[derive to_nat] for inductives? #### Mario Carneiro (Oct 12 2020 at 15:27): The point of the different functions is to make it easier to express which format a given instruction has, so that you have to pass 6 arguments vs 7 and such #### Mario Carneiro (Oct 12 2020 at 15:28): Regarding the opcode inductive, you could just skip it and put numbers in where I had things like opcode.op.encode #### Mario Carneiro (Oct 12 2020 at 15:30): There is no derive to_nat, although it wouldn't be hard to autogenerate #### SnowFox (Oct 12 2020 at 15:30): Could. #### SnowFox (Oct 12 2020 at 15:30): Probably will. #### SnowFox (Oct 12 2020 at 15:31): But it is nice to have the full list and to use them by-name to avoid mistakes. But this'll be verified eventually anyway... So it would be fine to just put the numbers. #### Mario Carneiro (Oct 12 2020 at 15:31): You can also skip the inductive and instead do def opcode.op := 0b0100011 or whatever #### SnowFox (Oct 12 2020 at 15:33): True. Although the biggest detail for annoyance will be flipping the encoder to decode; ideally using the metaprogramming trick to avoid the duplication. #### SnowFox (Oct 12 2020 at 15:51): !! Lean's language server appears to have a memory leak? #### SnowFox (Oct 12 2020 at 15:51): Filled my ram to 100% started using emergency swap. o.o #### Mario Carneiro (Oct 12 2020 at 17:42): keep your finger on the Lean: Restart button #### SnowFox (Oct 13 2020 at 13:08): @Mario Carneiro I'm stuck again with generating definitions. Is there anything stopping an interface for constructing these a la («def» q := 3) or just with def as it is obvious that we're inside an opened parenthesis so it mustn't be a command anyway. run_cmd [ [load, load_fp, custom_0, misc_mem, op_imm, auipc, op_imm_32], [store, store_fp, custom_1, amo, op, lui, op_32 ], [madd, msub, nmsub, nmadd, op_fp, reserved_0, custom_2 ], [branch, jalr, reserved, jal, system, reserved_1, custom_3 ] ].enum >>= λ ⟨i, n⟩, n.enum >>= λ ⟨j, n⟩, /- def n := 3 + j.shiftl 2 + i.shiftl 5 -/  #### Mario Carneiro (Oct 13 2020 at 13:09): well commands are not expressions, so that doesn't work literally #### SnowFox (Oct 13 2020 at 13:10): or any similar interface if not exactly using the expr-quotes. #### Mario Carneiro (Oct 13 2020 at 13:12): I mean for the longest time there wasn't even an API to add defs at all (with all the bells and whistles) #### Mario Carneiro (Oct 13 2020 at 13:13): you can use tactic.add_decl, I think that might not add equation lemmas though #### Mario Carneiro (Oct 13 2020 at 13:15): there is also environment.add_defn_eqns for more advanced definitions #### SnowFox (Oct 13 2020 at 13:16): Re reducing variations. I've arrived at embedding the width in the register and using an "alt" flag. Usually to distinguish between (un)signed operations, though also to select the high half of mults (which then respects source register (un)sign flags). This lines up well for tuning the operations. #### Mario Carneiro (Oct 13 2020 at 13:16): Anyway lean 4 will add something a lot like you are proposing; quotation and antiquotation now works on every syntactic class #### SnowFox (Oct 13 2020 at 13:18): Awesome. When I started with Lean, I dove into Lean 4's code ... then I found myself writing Lean 4 by mistake. :P #### SnowFox (Oct 13 2020 at 13:19): As in, while using lean3. #### SnowFox (Oct 13 2020 at 13:25): I was trying to use tactic.add_defn_equations before; though with add_decl, I feel much closer but have hit at least an issue with lists/tactics/monad switching... Not sure how to fix this in Lean...  run_cmd [ [load, load_fp, custom_0, misc_mem, op_imm, auipc, op_imm_32], [store, store_fp, custom_1, amo, op, lui, op_32 ], [madd, msub, nmsub, nmadd, op_fp, reserved_0, custom_2 ], [branch, jalr, reserved, jal, system, reserved_1, custom_3 ] ].enum >>= λ ⟨i, n⟩, n.enum >>= λ ⟨j, n⟩, tactic.add_decl (declaration.defn n [] (ℕ) (3 + j.shiftl 2 + i.shiftl 5) reducibility_hints.abbrev tt)  #### SnowFox (Oct 13 2020 at 13:31): There we go. .mmap  run_cmd [ [load, load_fp, custom_0, misc_mem, op_imm, auipc, op_imm_32], [store, store_fp, custom_1, amo, op, lui, op_32 ], [madd, msub, nmsub, nmadd, op_fp, reserved_0, custom_2 ], [branch, jalr, reserved, jal, system, reserved_1, custom_3 ] ].enum.mmap λ ⟨i, n⟩, n.enum.mmap \$ λ ⟨j, n⟩,
tactic.add_decl (declaration.defn n [] (ℕ) (3 + j.shiftl 2 + i.shiftl 5)
reducibility_hints.abbrev tt)


#### Mario Carneiro (Oct 13 2020 at 13:32):

you probably want mmap' there since you aren't returning anything

#### Mario Carneiro (Oct 13 2020 at 13:33):

you may or may not want to precompute those numbers btw, by using let n := 3 + j.shiftl 2 + i.shiftl 5 in reflect n

#### Mario Carneiro (Oct 13 2020 at 13:34):

oh, actually what you have definitely won't work because you haven't antiquoted i and j

#### SnowFox (Oct 13 2020 at 13:34):

/-
invalid field notation, 'branch' is not a valid "field" because environment does not contain 'nat.branch'
op
which has type
ℕ
-/


#### Mario Carneiro (Oct 13 2020 at 13:35):

did you write op.branch somewhere?

#### SnowFox (Oct 13 2020 at 13:35):

Yes. And I'm running the run_cmd inside the op namespace.

#### Mario Carneiro (Oct 13 2020 at 13:36):

you might want to qualify those names, op is kind of general for an unqualified name

#### SnowFox (Oct 13 2020 at 13:36):

It seems to be clobbering the namespace... does the add_decl bypass the namespace?

#### SnowFox (Oct 13 2020 at 13:37):

Seem so. (op ++ n)  worked.

#### Mario Carneiro (Oct 13 2020 at 13:37):

Yeah, you said add_decl op so it's making a definition called  op

#### Mario Carneiro (Oct 13 2020 at 13:38):

you can sometimes use  op to do name resolution but it won't work here because the thing doesn't exist yet

#### SnowFox (Oct 13 2020 at 13:38):

Makes sense. I assumed the namespace would inject itself but I guess that only affects the commands. The tactic just bypasses it.

#### SnowFox (Oct 13 2020 at 13:39):

Current code -> https://gist.github.com/4d56511885f5b44514b0fcf006a5546a

#### SnowFox (Oct 13 2020 at 13:50):

Any more tips @Mario Carneiro?

#### Mario Carneiro (Oct 13 2020 at 13:52):

It seems a bit odd to put the alt flag in the register instead of the instruction

#### SnowFox (Oct 13 2020 at 13:54):

The flag basically means that the register itself is treated as signed in most cases. The mul is the only deviation, where rd.alt means it acts on the high half of the result, but then the rs1.alt and rs2.alt represent their own signs.

#### SnowFox (Oct 13 2020 at 15:15):

I found one edge case with my approach with the type tagged registers (the alt flag). I can't encode one variant of the the SRAI hint, where the immediate is zero; because I use the sign of the immediate to differentiate between left and right shifts. The alt in this case applies to the destination, to distinguish logical vs signed arithmetic shifts. One minor edge case!

#### SnowFox (Oct 13 2020 at 15:16):

I could specially encode this with its own hack... but I'll leave it out for now.

#### SnowFox (Oct 13 2020 at 15:23):

Considering it is a single unused custom hint; which may never be used... pretty safe to ignore. :)

#### SnowFox (Oct 13 2020 at 15:28):

Actually... I just overconstrained it. I don't care about the direction in this position because it has already been declared that it is an arithmetic shift; which is only valid for right shifts.

#### SnowFox (Oct 13 2020 at 15:45):

Actually the hint issue does exist; SRL with rd = x0 and imm = 0 is inexpressible.

#### SnowFox (Oct 15 2020 at 09:04):

@Mario Carneiro Major update. Still a lot left to do. Now the registers may only be legally composed for each of the instructions. This should cover all of RV{32,64,128}IMFDQZicsr`, or soon. I need to do another pass over all the variations. The core instruction type unifies all these extensions well. Abstracting over int vs float, (un)signed, and all widths. https://gist.github.com/4d56511885f5b44514b0fcf006a5546a

Last updated: May 08 2021 at 23:10 UTC