Zulip Chat Archive

Stream: Program verification

Topic: RISC-V ISA in Lean


view this post on Zulip SnowFox (Oct 02 2020 at 06:38):

(Migrating from 'Is there code for X?" "bidirectional maps?" = https://leanprover.zulipchat.com/#narrow/stream/217875-Is-there.20code.20for.20X.3F/topic/bidirectional.20maps.3F/near/212013731) @Mario Carneiro

view this post on Zulip SnowFox (Oct 02 2020 at 07:18):

@Mario Carneiro Can you explain why you're using inductive Props in your x86.lean?

view this post on Zulip Mario Carneiro (Oct 02 2020 at 07:18):

because it's a spec

view this post on Zulip Mario Carneiro (Oct 02 2020 at 07:18):

it's not intended to be directly executable

view this post on Zulip Mario Carneiro (Oct 02 2020 at 07:18):

but the advantage is that you don't need separate encode/decode

view this post on Zulip SnowFox (Oct 02 2020 at 07:26):

Hmm. Not something I can use then.

view this post on Zulip Mario Carneiro (Oct 02 2020 at 07:26):

No, I don't expect that aspect will translate

view this post on Zulip Mario Carneiro (Oct 02 2020 at 07:27):

I wrote a proof producing disassembler for x86/ARM before which is closer to what you need, but it is at least for now closed source

view this post on Zulip SnowFox (Oct 02 2020 at 07:29):

Okay.

view this post on Zulip SnowFox (Oct 02 2020 at 07:31):

I've heard good things about metaprogramming in Lean but ... the documentation doesn't demonstrate how. Do you have any reference I can use for implementing make_defs? https://leanprover.github.io/reference/metaprogramming.html

view this post on Zulip Mario Carneiro (Oct 02 2020 at 07:34):

for the most part it's just functional programming

view this post on Zulip Mario Carneiro (Oct 02 2020 at 07:34):

you have to interact with the lean API to make definitions

view this post on Zulip Mario Carneiro (Oct 02 2020 at 07:34):

and the API is admittedly not so well documented

view this post on Zulip Mario Carneiro (Oct 02 2020 at 07:35):

Rob Lewis made a metaprogramming tutorial that is pretty good

view this post on Zulip Mario Carneiro (Oct 02 2020 at 07:35):

https://www.youtube.com/playlist?list=PLlF-CfQhukNnq2kDCw2P_vI5AfXN7egP2

view this post on Zulip SnowFox (Oct 02 2020 at 07:36):

Thanks. I'll watch the series.

view this post on Zulip Jannis Limperg (Oct 02 2020 at 11:31):

There's also the metaprogramming section of the Hitchhiker's guide and, if I may be so bold, my blog post on some gotchas of the API (but that'll only make sense once you have a bit of an overview). Beyond that, you'll have to read init/meta/expr.lean and init/meta/tactic.lean, which are reasonably documented.

view this post on Zulip SnowFox (Oct 02 2020 at 12:57):

Okay. I've watched the short series linked above and bookmarked the blog post. I've adjusted the python scripts to generate at least a decent mostly-correct subset of RISC-V instruction to format with embedded constants list. add_inductive and add_defn_equations appear to be my next steps, which look straight forward. Afterwards, I may open an issue in the riscv-opcode repo proposing an alternative format and a new code generator to make certain things cleaner and better reflect the specification.

Thanks for all the links everyone. I'll resume this tomorrow morning. :)

view this post on Zulip SnowFox (Oct 02 2020 at 13:52):

Okay, I couldn't resist but... ouch. add_inductive, as the documentation for its sibling add_ginductive suggests, doesn't generate the auxiliary definitions necessary for matching against it... I couldn't figure out how to use add_ginductive however. Could someone correct this code?

meta def make_defs (l : list (name × type)) : tactic unit :=
tactic.add_inductive `rvim [] 0 `(Type) (l.map (λ n, t⟩, n, expr.const `rvim []⟩))

view this post on Zulip SnowFox (Oct 12 2020 at 06:11):

@Mario Carneiro Any insight on this issue?

view this post on Zulip Mario Carneiro (Oct 12 2020 at 06:20):

meta def make_defs' (l : list (name × type)) : tactic unit :=
do env  tactic.get_env,
  env'  env.add_ginductive options.mk
    [] [] [((`rvim', `(Type)),
      l.map (λ n, t⟩,
        { constr := `rvim' ++ n,
          type := expr.const `rvim' [] }))]
    ff,
  tactic.set_env env'

run_cmd make_defs' [(`a, ()), (`b, ())]
#print rvim'

view this post on Zulip SnowFox (Oct 12 2020 at 06:25):

There we go, thanks again!

view this post on Zulip Mario Carneiro (Oct 12 2020 at 06:27):

Note that I changed your code a bit to prefix rvim on the names of the constructors. Lean will let you name the constructors something else, but the front end inductive command doesn't, so it seems ill-advised

view this post on Zulip SnowFox (Oct 12 2020 at 06:47):

@Mario Carneiro Noticed. Though now I'm stuck trying to control the types of the constructors. As these need the registers and immediate values. I was expecting something like type := `(nat -> `rvim) to work. But that says "invalid return type". As it does with just `(`rvim).

view this post on Zulip Mario Carneiro (Oct 12 2020 at 06:48):

rvim needs to be a local constant

view this post on Zulip SnowFox (Oct 12 2020 at 06:49):

Like let rvim := `rvim in `(rvim)?

view this post on Zulip SnowFox (Oct 12 2020 at 06:50):

Or let rvim := @expr.const ff `rvim [] in `(rvim) ? Neither worked.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 06:51):

The tricky part is that rvim doesn't exist yet, so `(rvim) isn't going to work

view this post on Zulip Mario Carneiro (Oct 12 2020 at 06:51):

You can use let rvim : expr := expr.const `rvim' [], and then splice it in

view this post on Zulip Mario Carneiro (Oct 12 2020 at 06:51):

i.e. `((%%rvim : Type) → (%%rvim : Type))

view this post on Zulip Mario Carneiro (Oct 12 2020 at 06:52):

actually, it seems it should be a constant not a local constant

view this post on Zulip Mario Carneiro (Oct 12 2020 at 06:53):

which is extra tricky because that definitely doesn't typecheck

view this post on Zulip SnowFox (Oct 12 2020 at 06:53):

Well, this worked. let rvim : expr := expr.const `rvim [] in `(nat -> (%%rvim : Type))

view this post on Zulip Mario Carneiro (Oct 12 2020 at 06:53):

I think the usual approach if you need more advanced inference would be to create a new local constant with the appropriate type, and then substitute the constant for it at the end

view this post on Zulip SnowFox (Oct 12 2020 at 07:03):

Okay, I have rvim expanded with the correct number of natural arguments for each instruction. :)

view this post on Zulip SnowFox (Oct 12 2020 at 07:03):

But it is getting slow... which is worrying considering how early this is still.

view this post on Zulip SnowFox (Oct 12 2020 at 07:05):

What is the vscodium trick to expand all the variants on a def? def q : rvim -> nat := _

view this post on Zulip SnowFox (Oct 12 2020 at 07:06):

I thought it was under ctrl+dot with the _ in focus. But that doesn't include the option for some reason.

view this post on Zulip SnowFox (Oct 12 2020 at 07:07):

I've seen a different method on a video

view this post on Zulip SnowFox (Oct 12 2020 at 07:11):

Video used {! x -> y} and that worked. I guess the language server just has a hole..

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:17):

processing new inductives is a little sluggish, especially if you are using nested or mutual inductives

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:19):

You can double check if you have accidentally defined a nested inductive because if you #print rvim it will say def rvim := .. instead of inductive rvim ...

view this post on Zulip SnowFox (Oct 12 2020 at 07:20):

It says inductive.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:20):

how many variants are we talking here?

view this post on Zulip SnowFox (Oct 12 2020 at 07:21):

I have 63 instructions.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:23):

hm, I wonder whether no_confusion is as expensive as I suspected

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:23):

it's basically quadratic in size

view this post on Zulip SnowFox (Oct 12 2020 at 07:23):

inductive fmt : Type
| R     (opcode7 funct3 funct7 : nat) -- (rd rs1 rs2     : nat)
| I     (opcode7 funct3        : nat) -- (rd rs1     imm : nat)
| ISH   (opcode7 funct3 funct7 : nat) -- (rd rs1     imm : nat)
| ISHW  (opcode7 funct3 funct7 : nat) -- (rd rs1     imm : nat)
| S     (opcode7 funct3        : nat) -- (   rs1 rs2 imm : nat)
| B     (opcode7 funct3        : nat) -- (   rs1 rs2 imm : nat)
| U     (opcode7               : nat) -- (rd         imm : nat)
| J     (opcode7               : nat) -- (rd         imm : nat)
| FENCE (opcode7               : nat)

meta def make_defs (l : list (name × fmt)) : tactic unit :=
  tactic.updateex_env $ λ env, env.add_ginductive options.mk
    [] [] [(
      (`rvim, `(Type)),
      l.map (λ n, t⟩,
        { constr := `rvim ++ n,
          type := let rvim : expr := expr.const `rvim [] in
                  match t with
                  | (fmt.R _ _ _) := `(nat -> nat -> nat -> (%%rvim : Type))
                  | (fmt.I _ _) := `(nat -> nat -> nat -> (%%rvim : Type))
                  | (fmt.ISH _ _ _) := `(nat -> nat -> nat -> (%%rvim : Type))
                  | (fmt.ISHW _ _ _) := `(nat -> nat -> nat -> (%%rvim : Type))
                  | (fmt.S _ _) := `(nat -> nat -> nat -> (%%rvim : Type))
                  | (fmt.B _ _) := `(nat -> nat -> nat -> (%%rvim : Type))
                  | (fmt.U _) := `(nat -> nat -> (%%rvim : Type))
                  | (fmt.J _) := `(nat -> nat -> (%%rvim : Type))
                  | (fmt.FENCE _) := `((%%rvim : Type))
                  end
        }
      )
    )]
  ff

open fmt

-- rv{32,64}[im] sans pseudo instructions
run_cmd make_defs [
--#eval list.length [
  -- rv32i
  (`BEQ,        B    0b1100011 0b000),
  (`BNE,        B    0b1100011 0b001),
  (`BLT,        B    0b1100011 0b100),
  (`BGE,        B    0b1100011 0b101),
  (`BLTU,       B    0b1100011 0b110),
  (`BGEU,       B    0b1100011 0b111),
  (`JALR,       I    0b1100111 0b000),
  (`JAL,        J    0b1101111),
  (`LUI,        U    0b0110111),
  (`AUIPC,      U    0b0010111),
  (`ADDI,       I    0b0010011 0b000),
  (`SLLI,       ISH  0b0010011 0b001 0b0000000),
  (`SLTI,       I    0b0010011 0b010),
  (`SLTIU,      I    0b0010011 0b011),
  (`XORI,       I    0b0010011 0b100),
  (`SRLI,       ISH  0b0010011 0b101 0b0000000),
  (`SRAI,       ISH  0b0010011 0b101 0b0100000),
  (`ORI,        I    0b0010011 0b110),
  (`ANDI,       I    0b0010011 0b111),
  (`ADD,        R    0b0110011 0b000 0b0000000),
  (`SUB,        R    0b0110011 0b000 0b0100000),
  (`SLL,        R    0b0110011 0b001 0b0000000),
  (`SLT,        R    0b0110011 0b010 0b0000000),
  (`SLTU,       R    0b0110011 0b011 0b0000000),
  (`XOR,        R    0b0110011 0b100 0b0000000),
  (`SRL,        R    0b0110011 0b101 0b0000000),
  (`SRA,        R    0b0110011 0b101 0b0100000),
  (`OR,         R    0b0110011 0b110 0b0000000),
  (`AND,        R    0b0110011 0b111 0b0000000),
  (`LB,         I    0b0000011 0b000),
  (`LH,         I    0b0000011 0b001),
  (`LW,         I    0b0000011 0b010),
  (`LBU,        I    0b0000011 0b100),
  (`LHU,        I    0b0000011 0b101),
  (`SB,         S    0b0100011 0b000),
  (`SH,         S    0b0100011 0b001),
  (`SW,         S    0b0100011 0b010),
  (`FENCE,      I    0b0001111 0b000),
  -- FENCEI belongs in Zifencei... why is it here?
  -- (`FENCE_I,    I    0b0001111 0b001),
  -- Shouldn't we have ECALL and EBREAK?
  -- rv32m
  (`MUL,        R    0b0110011 0b000 0b0000001),
  (`MULH,       R    0b0110011 0b001 0b0000001),
  (`MULHSU,     R    0b0110011 0b010 0b0000001),
  (`MULHU,      R    0b0110011 0b011 0b0000001),
  (`DIV,        R    0b0110011 0b100 0b0000001),
  (`DIVU,       R    0b0110011 0b101 0b0000001),
  (`REM,        R    0b0110011 0b110 0b0000001),
  (`REMU,       R    0b0110011 0b111 0b0000001),
  -- rv64i
  (`ADDIW,      I    0b0011011 0b000),
  (`SLLIW,      ISHW 0b0011011 0b001 0b0000000),
  (`SRLIW,      ISHW 0b0011011 0b101 0b0000000),
  (`SRAIW,      ISHW 0b0011011 0b101 0b0100000),
  (`ADDW,       R    0b0111011 0b000 0b0000000),
  (`SUBW,       R    0b0111011 0b000 0b0100000),
  (`SLLW,       R    0b0111011 0b001 0b0000000),
  (`SRLW,       R    0b0111011 0b101 0b0000000),
  (`SRAW,       R    0b0111011 0b101 0b0100000),
  (`LD,         I    0b0000011 0b011),
  (`LWU,        I    0b0000011 0b110),
  (`SD,         S    0b0100011 0b011),
  -- rv64m
  (`MULW,       R    0b0111011 0b000 0b0000001),
  (`DIVW,       R    0b0111011 0b100 0b0000001),
  (`DIVUW,      R    0b0111011 0b101 0b0000001),
  (`REMW,       R    0b0111011 0b110 0b0000001),
  (`REMUW,      R    0b0111011 0b111 0b0000001)
]

def is_16_bit_instruction (n : ) : Prop :=
n.land 0b11  0b11

def is_32_bit_instruction (n : ) : Prop :=
n.land 0b11 = 0b11  n.land 0b11100  0b11100

def is_valid_instruction (n : ) : Prop :=
is_16_bit_instruction n 
is_32_bit_instruction n

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:28):

one thing you can do to make the ISA easier to analyze is to bucket the instructions, for example putting all binops in one variant

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:29):

It wouldn't be hard to bucket these by type now, although that may not be exactly the layout you want

view this post on Zulip SnowFox (Oct 12 2020 at 07:34):

RISC-V uses tri-ops generally; though compressed ops do include binary ops where the first source register is repeated as the destination. ADD rd rs1 rs2 is rd = rs1 + rs2. I like the current order, ordered by XLEN and ISA extensions; because these are the units they are introduced. If I were to add more order, it'd probably be either (opcode, funct3, funct7)-order or format-shape order.

view this post on Zulip SnowFox (Oct 12 2020 at 07:34):

reg[rd] = reg[rs1] + reg[rs2] **

view this post on Zulip SnowFox (Oct 12 2020 at 07:35):

Oh, you meant reduce the inductive variants and use defs to map the instructions to the fewer variants?

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:36):

If you have an inductive type with 63 constructors, proving anything by cases on the type will be an exercise in patience (particularly if you have two variables, e.g. to prove determinism)

view this post on Zulip SnowFox (Oct 12 2020 at 07:37):

I'm open to any way to reduce the complexity.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:39):

e.g.

inductive binop
| ADD | SUB | SLL -- | ...

inductive loadop
| LUI | LD -- | ...

inductive rvim
| binop (b : binop) (rd rs rt : ) : rvim
| load (b : loadop) (rd imm : ) : rvim

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:41):

Bucketing based on format shape is the obvious thing to do, but you should also separate out variants that treat their arguments significantly differently in the same format-shape, not sure if this is a major concern in riscv

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:43):

Note that proving no_confusion with a bucketed inductive like this is literally lower asymptotic complexity, since if you have a 7 * 7 bucketed inductive then proving that different variants are different is 8 * 7 ^ 2 large (8 inductives of 7 variants each) instead of 49 ^ 2 with a 49 variant inductive

view this post on Zulip SnowFox (Oct 12 2020 at 07:49):

Note all of these instructions are either a pick-3 of rd, rs1, rs2, imm; or both rd, imm; or zero (in the single case of FENCE), so they could all be reduced to 3 cases (so far; though there are more variations with more ISA extensions active). The format constants (opcode7, funct3, funct7) could be flattened into a single natural of 17 bits. The opcode7 internally embeds the tag for the format variant.

view this post on Zulip SnowFox (Oct 12 2020 at 07:50):

Rather, it doesn't explicitly embed the tag but is sufficient to reconstruct it.

view this post on Zulip SnowFox (Oct 12 2020 at 07:51):

The pick-3 can be seen at the top, in the comments of the inductive fmt section.

view this post on Zulip SnowFox (Oct 12 2020 at 07:53):

One deviating hack I've considered is to allow the rd/rs1/rs2 to exceed their 5-bit limit to represent pseudo-registers a la SSA. But this adds a mode where assembly is illegal; which is fine. Assembly should happen only after register allocation anyway and these instructions are useful before register allocation.

view this post on Zulip SnowFox (Oct 12 2020 at 07:53):

and imm, for offset symbols.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:56):

The format constants (opcode7, funct3, funct7) could be flattened into a single natural of 17 bits.

I would suggest translating these to variants as early as possible though, because lean's handling of 17 bit numbers will not be satisfactory to you

view this post on Zulip SnowFox (Oct 12 2020 at 07:56):

@Mario Carneiro BTW: "LUI" is load upper-immediate; isn't a load involving memory so I wouldn't call it a "load operation". It only writes the immediate to the upper bits of a register.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:56):

yeah you can tell I'm mocking things quickly

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:57):

it's not actually about the type of operation that it does but rather it's "mode of operation", i.e. the way it handles its arguments

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:57):

just imagine you are doing a proof by cases on these variants, what is the most useful categorization?

view this post on Zulip Mario Carneiro (Oct 12 2020 at 07:58):

if some of these only interact with certain parts of the machine state (e.g. ALU ops vs memory ops) that's another possible categorization

view this post on Zulip SnowFox (Oct 12 2020 at 07:59):

RISC-V is indeed a load-store architecture; which means that the ALU operations do not touch memory at all; and memory operations do no arithmetic (besides immediate offsets added to the register for the address)

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:00):

One deviating hack I've considered is to allow the rd/rs1/rs2 to exceed their 5-bit limit to represent pseudo-registers a la SSA. But this adds a mode where assembly is illegal; which is fine. Assembly should happen only after register allocation anyway and these instructions are useful before register allocation.

This should be fine; you can have an assembly function that turns an instruction into a bit pattern, and it can fail on some arguments

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:02):

although it sounds like you might need an extended semantics with a machine with infinite registers to make sense of the extra instructions

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:02):

unless they are just garbage instructions with no semantics

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:04):

I know some compilers use a v-code style approach, where you have the same instructions as the machine but with infinite registers; I prefer to just have an IR for that, where I get to pick a nicer set of high level instructions

view this post on Zulip SnowFox (Oct 12 2020 at 08:08):

Regarding proofs; all will target the non-{pseudo,compressed} instructions. Pseudoinstructions won't exist in my compiler. Compressed instructions are only abbreviations of the larger instructions. Compression is applied when certain registers are used and when the destination is the first source register and in a few other cases. Currently all instructions are 32-bits. Compressed instructions are 16. Higher sizes exist in the specification but have no members and their format isn't frozen.

view this post on Zulip SnowFox (Oct 12 2020 at 08:09):

v-code?

view this post on Zulip SnowFox (Oct 12 2020 at 08:10):

My IR will be RVSDG-based enriched with equivalence graphs a la eqsat.

view this post on Zulip SnowFox (Oct 12 2020 at 08:11):

RVSDG is very lambda calculus ish. My source language will be RDT based.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:11):

well you also need a low level IR for doing register allocation

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:12):

unless you are going to go straight from there to machine code

view this post on Zulip SnowFox (Oct 12 2020 at 08:12):

RVSDG = https://www.sintef.no/contentassets/11da6d67207348db98a30ddbdf3b0bba/reissmann_poster.pdf
Equality Saturation = https://www.cs.cornell.edu/~ross/publications/eqsat/
Resourceful Dependent Types = http://www.cse.chalmers.se/~abela/talkTYPES18.pdf + http://www.cse.chalmers.se/~abela/types18.pdf

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:14):

I forget where I heard about v-code architecture, possibly cranelift

view this post on Zulip SnowFox (Oct 12 2020 at 08:15):

My understanding of RVSDG is that register colouring and instruction selection can be done straight from it without additional IRs.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:17):

Is there a more complete reference for RVSDG than the poster?

view this post on Zulip SnowFox (Oct 12 2020 at 08:17):

BTW: eqsat enriched here means that the RVSDG IR is non-destructively optimized by adding equivalent programs in an equivalence graph. All optimizations are applied in all orders until the resource bounds are reached.

view this post on Zulip SnowFox (Oct 12 2020 at 08:17):

Yes.

view this post on Zulip SnowFox (Oct 12 2020 at 08:18):

I thought it had a link at the bottom but it just says the reference.

view this post on Zulip SnowFox (Oct 12 2020 at 08:19):

And that didn't seem to reference what I expected.. This is a great start. https://arxiv.org/abs/1912.05036

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:20):

I also found Reissmann's thesis

view this post on Zulip SnowFox (Oct 12 2020 at 08:22):

Also a good resource. Though this arxiv paper is the GOTO reference IIUC. The poster is just a nice dense visual overview. :)

view this post on Zulip SnowFox (Oct 12 2020 at 08:30):

As an added bonus; with RDT as the source and RVSDG in the middle; I'll have linearity throughout the whole compiler. Which should have very good implications for memory usage.

view this post on Zulip SnowFox (Oct 12 2020 at 08:47):

Re splitting up rvim. If I split by these categories; the largest I have is 21. First by XLEN {32, 64}, then by ISA extension {i, m}, then into {control flow (8), arithmetic (21), memory (8)} with these numbers for rv32i. rv64i adds (0, 9, 3) respectively. rv32m +8 arith, rv64m +5 arith.

view this post on Zulip SnowFox (Oct 12 2020 at 08:47):

But this might be a bit too fine.. ?

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:51):

The XLEN thing can hopefully be parameterized

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:52):

ISA extension is probably not worth it unless you intend to do something in particular with e.g. compiling to just rv64i

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:53):

For arithmetic ops in particular I would try my best to get uniform handling of a large number of those

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:53):

i.e. operations which differ only in the mathematical binary function being applied

view this post on Zulip SnowFox (Oct 12 2020 at 08:54):

I will be compiling to restrictive configurations; but most extensions can be emulated so are safe to allow so long that they do get reduced to the target. I.e. multiplication without the M extension.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:54):

Another way to represent that is to have a predicate that tells you whether an instruction is in the configuration

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:55):

which seems like a good idea especially for risc-v because it has so many extensions

view this post on Zulip SnowFox (Oct 12 2020 at 08:55):

You'll love the vector extension ;)

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:56):

egh.. throw those all in one variant so they don't pollute the rest of the code

view this post on Zulip SnowFox (Oct 12 2020 at 08:57):

Note these aren't all R, the format detection code I wrote in python to generate these doesn't yet understand the vectors. https://gist.github.com/4fd94ffef4917b2f42d9bdb48e72fa8b

view this post on Zulip SnowFox (Oct 12 2020 at 08:57):

441

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:58):

I mean, this is just a bad representation, total abuse of the concept of an instruction

view this post on Zulip SnowFox (Oct 12 2020 at 08:58):

Certainly a lot more consumable than Intel's many many thousands of instructions.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:58):

see also https://www.sigarch.org/simd-instructions-considered-harmful/

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:59):

in lean you can at least separate those mostly into orthogonal representation pieces (like size + op)

view this post on Zulip Mario Carneiro (Oct 12 2020 at 08:59):

there are 441 things you can do, in the same sense that there are 441 kinds of burgers I can get at the local burger shop

view this post on Zulip SnowFox (Oct 12 2020 at 09:00):

Very important note for RISC-V vectors; they are hardware configuration agnostic. I.e. unlike SSE, SSE3, AVX, AVX2, AVX512; RISC-V's V extension covers all of them generically. The vector lengths are user controlled and the hardware implements and optimizes however it wants to, if at all.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:01):

I find 37 occurrences of ADD in that list. clearly it's not orthogonal enough

view this post on Zulip SnowFox (Oct 12 2020 at 09:04):

https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:08):

One thing I notice in the docs is that instructions are indeed grouped, and they even use "dot notation" for separating out the orthogonal parts, e.g. vw(add|sub)(u?).{vw}{vx} has some clear orthogonality going on

view this post on Zulip SnowFox (Oct 12 2020 at 09:08):

Note that many of these instructions really are just names for variants of the same instruction.

view this post on Zulip SnowFox (Oct 12 2020 at 09:08):

As you just pointed out. :)

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:09):

I think it is very important that a formalization of the spec makes as much orthogonality explicit as possible

view this post on Zulip SnowFox (Oct 12 2020 at 09:09):

Agreed and this is the goal.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:10):

I still don't have a great grasp on how many instructions x86 has if you do this properly, but it's a lot less than 1500

view this post on Zulip SnowFox (Oct 12 2020 at 09:11):

Imagine how low that'd be for RISC-V ;)

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:11):

it's just unfortunate that so many ISA specs "case bash" in their documentation

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:12):

so it's not clear when something is one part of a family or a special exception until you read all the exponentially many doc strings to ensure they are all the same

view this post on Zulip SnowFox (Oct 12 2020 at 09:33):

@Mario Carneiro Not sure if you remember; but I started with only the "opcode map" and not the instructions themselves. Note that most opcodes are the same. The 32-bit instructions have 7 bits, where the low two are both high. Thus 5 bits, 32 opcodes. Note that SUB is really an ADD with one of its basically unused bits set high.

view this post on Zulip SnowFox (Oct 12 2020 at 09:35):

See the top of page 129. https://github.com/riscv/riscv-isa-manual/releases/download/draft-20200727-8088ba4/riscv-spec.pdf

view this post on Zulip SnowFox (Oct 12 2020 at 09:37):

All the branch instructions have the BRANCH opcode, row = 11, column = 000, size = 11 (32-bit length). 1100011,

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:37):

For the most part the decoding is less important than the semantic classes of the instructions. It's not clear to me if the difference between ADD and SUB is any less/more than the difference between ADD and XOR, except in the encoding

view this post on Zulip SnowFox (Oct 12 2020 at 09:38):

ADD and XOR have the same opcode, but are different functions. See their funct3 bits. ADD is 000, XOR is 100.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:38):

I see that

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:39):

ADD and SUB have the same funct3 but differ in the extra bits

view this post on Zulip SnowFox (Oct 12 2020 at 09:39):

Yes. Similarly for shifting left and right.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:39):

but that seems like they just wanted 10 ALU ops and ran out of bits for the encoding

view this post on Zulip SnowFox (Oct 12 2020 at 09:41):

Not quite; SUB is exactly the same hardware as ADD with one minor tweak. SUB is effectively a fused instruction to invert the sign, then add. Thus subtraction is an argument of addition.... ish

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:42):

hm, where's add-carry?

view this post on Zulip SnowFox (Oct 12 2020 at 09:42):

Er, the left/right shifts use funct3. The right shifts logical vs arithmetic use the extra bit in funct7. Which just decides if it sign-extends or not.

view this post on Zulip SnowFox (Oct 12 2020 at 09:42):

No add-carry because there are no carry flags or overflow exceptions ;)

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:42):

I guess a SUB is an ADC + NOT

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:42):

well I mean the operation x , y -> x + y + 1

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:43):

or x + y + c if possible

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:43):

it's pretty important to have an op for that if you want to do e.g. bignum addition

view this post on Zulip SnowFox (Oct 12 2020 at 09:44):

Right, not an instruction. Easy to emulate of course. Just as rv32i does multiplication and division without multiplication or division.

view this post on Zulip SnowFox (Oct 12 2020 at 09:44):

Bignum implementations should defer carries anyway. Familiar with radix 2^51?

view this post on Zulip SnowFox (Oct 12 2020 at 09:45):

https://www.chosenplaintext.ca/articles/radix-2-51-trick.html

view this post on Zulip SnowFox (Oct 12 2020 at 09:52):

RISC-V avoids exceptions and flags. There are no arithmetic flags. I.e. division by zero acts like it does in Lean, but in hardware! There are floating point flags, available in the control and status registers (CSR); fflags.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 09:55):

I don't object to that, but I hope there is still a way to perform a 65-bit addition

view this post on Zulip SnowFox (Oct 12 2020 at 10:02):

Lets ask godbolt. https://godbolt.org/z/WPs5Mq

Add then set less-than unsigned.

widdening_add:                          # @widdening_add
        add     a0, a0, a1
        sltu    a1, a0, a1
        ret

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:09):

clever

view this post on Zulip SnowFox (Oct 12 2020 at 10:10):

These are equivalent.

uint128 widdening_add(uint64 a, uint64 b) {
    return (uint128) a + (uint128) b;
}

uint128 widdening_add2(uint64 a, uint64 b) {
    uint64 c = a + b;
    return (uint128) c +
      (((uint128) (c < b)) << 64);
}

view this post on Zulip SnowFox (Oct 12 2020 at 10:12):

To contrast to x86...

widdening_add:
        mov     rax, rdi
        xor     edx, edx
        xor     edi, edi
        add     rax, rsi
        adc     rdx, rdi
        ret
widdening_add2:
        add     rdi, rsi
        mov     edx, 0
        mov     rax, rdi
        jnc     .L3
        add     rdx, 1
.L3:
        ret

view this post on Zulip SnowFox (Oct 12 2020 at 10:13):

Register movement noise and it doesn't even generate the same code.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:14):

I don't think that's optimal code though

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:15):

I'm pretty sure you can get it down to one instruction if you handcode the assembly

view this post on Zulip SnowFox (Oct 12 2020 at 10:15):

BTW: I use RISC-V's assembly to debug my x86 code... I avoid x86. :D

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:16):

I don't want to suggest that x86 is a good ISA by any stretch, but ADC looks like a useful instruction to me

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:18):

Aside: even norm_num likes adc

view this post on Zulip SnowFox (Oct 12 2020 at 10:23):

I believe that RISC-V is the best ISA option there is. It may not be perfect but the complaints are small and few. I think I can trim RISC-V down by using the real opcode map instead of the instruction list; then generalize over the formats and get something pretty small.

With this, I guess I'm going to set aside our progress so far and start fresh. Focusing on keeping the inductives reasonable. An unfortunate reason to push towards this design, though I hope it'll work out better anyway.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:27):

The main reason I chose x86 instead of RISC-V for the MM0 project was because x86 hardware is more accessible

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:28):

I think that from a technical perspective RISC-V is an obvious improvement on all axes, because it was actually designed instead of accrued

view this post on Zulip SnowFox (Oct 12 2020 at 10:32):

I don't know the implications on hardware demands for your project; but for mine, where there are sensitive security critical details involved... I'd really rather start with a well designed ISA I can memorize and reason with.

I'm "okay" with running my code in an emulator or using a JIT to x86 recompiler to run my code. Simply because it will take time to implement my compiler and language, to implement what I want to in it, and verify the whole chain there. This is a project for the future, where I can only see success for RISC-V and x86 necessarily dying out.

Worst case, someone else can just extend my compiler with an x86 backend. ;)

view this post on Zulip SnowFox (Oct 12 2020 at 10:33):

The emulator and JIT obviously ruin the security properties of the language and compiler; but are fine to just run code.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:33):

Well, as long as it can be soundly specified one ISA is as good as another

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:35):

but other arches are on the todo list, after the initial MVP is done

view this post on Zulip SnowFox (Oct 12 2020 at 10:36):

What are your targets?

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:36):

But for the purpose of minimizing the chain of trust, it's no good to emulate the hardware because then you have to trust the emulator and the x86 machine it's running on

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:37):

Right now, I target a simple subset of x86-64 + a simple subset of posix

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:38):

the output is an ELF executable file, and the spec describes how the bits of that file turn into operational behavior

view this post on Zulip SnowFox (Oct 12 2020 at 10:39):

Right; I can't trust x86 at all and I certainly can't trust my RISC-V code when emulating it. The emulator is exclusively to just run it on legacy hardware until both all my code is ready, and RISC-V is cheap enough and has taken over the world. Yes, I'm calling all x86 legacy. :)

view this post on Zulip SnowFox (Oct 12 2020 at 10:42):

Kernel-wise; I'll target seL4 and will write a basic wrapper to run it on POSIX environments. OS-wise, I'll target Robigalia. Which is barely a thing so far.... but we're passionate to have a safe operating system. :)

view this post on Zulip SnowFox (Oct 12 2020 at 10:45):

I'm focusing on cryptography, capabilities, and verified distributed protocols.

view this post on Zulip SnowFox (Oct 12 2020 at 10:46):

Do you plan to implement all of x86?

view this post on Zulip SnowFox (Oct 12 2020 at 10:47):

I'd like to at least handle all of RV{32,64}IMCV. Note the lack of atomics and floats. Though they could come later.

view this post on Zulip SnowFox (Oct 12 2020 at 10:48):

Might throw in B too, the bit manipulation extension can be good for cryptography. But that is much less ready than V. V might even be stable soon.

view this post on Zulip SnowFox (Oct 12 2020 at 10:50):

I believe that SiFive is waiting on V to be stable before they offer a cheap Linux-ready SBC. Holding back with the expensive devboards until then.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:50):

SnowFox said:

Do you plan to implement all of x86?

Certainly not. x86 is the means not the end. I've got about 30 instructions and even that might have been more than needed

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:52):

https://github.com/digama0/mm0/blob/master/examples/x86.mm0 is about 1600 lines to specify x86 + linux enough to compile a C-like programming language

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:55):

The main inductive type for instructions has 5+6+7+4=22 variants

view this post on Zulip SnowFox (Oct 12 2020 at 10:56):

aside: My godbolt test was a widening addition with 2x 64 bit numbers, not two 65 bit numbers. Not sure which you intended. If you meant the 2x 65-bit addition, then it is equivalent to 2x 128-bit addition. 3x 64-bit adds + one sltu to extract the carry bit.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 10:56):

The unit of work in a bignum addition is 64 + 64 + 1 bit addition to produce a 65 bit result

view this post on Zulip SnowFox (Oct 12 2020 at 10:59):

Then I guess that'd be 3x add + 2x sltu.

view this post on Zulip SnowFox (Oct 12 2020 at 11:00):

But I think that'd be a silly method. When working with lots of or large numbers, you really want to defer the carries. Carries are too slow to handle immediately.

view this post on Zulip SnowFox (Oct 12 2020 at 11:01):

Especially if you have lots of operations to do.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:05):

For verification though, with a single add loop, you probably don't want the complexity of deferred carries

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:05):

like in the norm_num example, I don't know if deferred carries help at all

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:05):

because it's serial

view this post on Zulip SnowFox (Oct 12 2020 at 11:09):

The difference is when you do something like adding 50 numbers. With the smaller radix; you split each number into smaller parts and let them grow until they could overflow, and finally apply the carries all at once. For radix 2^51 with 64-bit words, you can add 2^13 pre-normalized numbers before you need to normalize the carries. Next with associativity, you can see how parallel you can make this go. Not serial.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:10):

No I mean proof checking is serial

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:11):

so something like norm_num has no advantage producing proofs of numerical facts with deferred carries

view this post on Zulip SnowFox (Oct 12 2020 at 11:13):

/me shrugs; just not where you'd use either then :P

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:14):

well, lean does addition in base 2 instead of 2^64 but the principle is the same

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:15):

amusingly, I've written norm_num three times now in three different proof systems and each time the base is different: lean's is base 2, metamath uses base 10, and MM0 uses base 16

view this post on Zulip SnowFox (Oct 12 2020 at 11:15):

Heh

view this post on Zulip SnowFox (Oct 12 2020 at 11:19):

@Mario Carneiro You mentioned "nested inductives" earlier, what does this mean? I understand mutually inductive types, haven't seen nested. Is there a syntax for it or just a thing you can do with the low level tools?

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:19):

inductive foo
| mk : list foo -> foo

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:20):

you use the inductive you are defining as a parameter to another inductive type

view this post on Zulip SnowFox (Oct 12 2020 at 11:20):

Ah. Okay.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:20):

functions don't count - this is a regular inductive

inductive foo
| mk : (nat -> foo) -> foo

view this post on Zulip SnowFox (Oct 12 2020 at 11:21):

Noted. What is special about the nested variant that makes it a def instead of an inductive; in so far as #print sees it?

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:25):

lean's kernel doesn't actually support nested inductives, so instead it does some smoke and mirrors around a definition like

mutual inductive foo, list_foo
with foo : Type
| mk : list_foo -> foo
with list_foo : Type
| nil : list_foo
| cons : foo -> list_foo -> list_foo

which is itself defined in terms of

inductive foo' : bool -> Type
| mk : foo' tt -> foo' ff
| nil : foo' tt
| cons : foo' ff -> foo' tt -> foo' tt

which is a regular (indexed) inductive type that lean's kernel understands. Then you have to define all the maps from this type to the original type you wanted, and prove the recursion principle, and here the support is a little lacking, and you get a weak recursion principle that isn't good enough to prove all the facts you would like

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:27):

So I advise you to avoid the conditions under which lean triggers the nested/mutual compilation strategy, and write foo' directly if you need to

view this post on Zulip SnowFox (Oct 12 2020 at 11:28):

Huh, sounds unfortunate.. Noted.

view this post on Zulip SnowFox (Oct 12 2020 at 11:50):

@Mario Carneiro What do you think of this so far?

abbreviation register := nat

inductive reg_imm
| reg (reg : register)
| imm (imm : nat)

inductive compare | eq | ne | lt | ge | ltu | geu

inductive operation
| add (invert : opt_param bool ff)
| shift (amount : ) (arith : opt_param bool ff)
| xor
| or
| and

inductive ast
| branch (c : compare) (rs1 rs2 : register) (offset : nat)
| jump (reg : register) (base : option nat) (offset : nat)
| arith (op : operation) (rd rs1 : register) (rs2_imm : reg_imm)
| load (dest : register) (width : nat) (base : register) (offset : nat)
| store (src : register) (width : nat) (base : register) (offset : nat)

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:51):

is the shift amount always an immediate?

view this post on Zulip SnowFox (Oct 12 2020 at 11:52):

Er, that amount shouldn't be in that position.

view this post on Zulip SnowFox (Oct 12 2020 at 11:52):

But it is the only integer here, so replace that with direction : bool

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:53):

looks pretty good. Is this the whole thing?

view this post on Zulip SnowFox (Oct 12 2020 at 11:53):

Or just two separate operations.

view this post on Zulip SnowFox (Oct 12 2020 at 11:53):

This covers RV32I.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:53):

if so it's a pretty massive reduction from the original

view this post on Zulip SnowFox (Oct 12 2020 at 11:54):

Very!

view this post on Zulip SnowFox (Oct 12 2020 at 11:56):

There is also what I started with; before we derailed with the raw instructions. Which I'll probably use this round.

inductive rv_base_opcode
| load   | load_fp  | custom_0   | misc_mem | op_imm | auipc      | op_imm_32
| store  | store_fp | custom_1   | amo      | op     | lui        | op_32
| madd   | msub     | nmsub      | nmadd    | op_fp  | reserved_0 | custom_2
| branch | jalr     | reserved_1 | jal      | system | reserved_2 | custom_3

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:56):

This doesn't cover encoding though; hopefully these classes align well with the opcodes

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:56):

right, that

view this post on Zulip SnowFox (Oct 12 2020 at 11:58):

They do. All operations over immediates are rv_base_opcode.op_imm. All operations over registers are rv_base_opcode.op. The specifics only map to funct3 and funct7 where funct7 is basically one bit rarely set, occasionally a different bit. :P

view this post on Zulip SnowFox (Oct 12 2020 at 11:58):

... mostly lots of zeros :)

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:59):

I mean ast

view this post on Zulip Mario Carneiro (Oct 12 2020 at 11:59):

clearly rv_base_opcode maps well to the opcodes, AFAICT it's 1-1

view this post on Zulip SnowFox (Oct 12 2020 at 11:59):

Right, I meant that ast.arith has only two opcodes.

view this post on Zulip SnowFox (Oct 12 2020 at 12:00):

ast.branch maps to opcodes.branch, ast.jump with jal and jalr. load and store, are well, exactly those.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 12:01):

you should try writing the function ast -> (opc, funct3, funct7), hopefully you can pattern match your way to victory

view this post on Zulip Mario Carneiro (Oct 12 2020 at 12:03):

or I guess that could even map directly to nat (really fin (2^32)) using helper functions like fmt_R (opcode7 funct3 funct7 rd rs1 rs2 : nat) : nat

view this post on Zulip Mario Carneiro (Oct 12 2020 at 12:04):

that seems like the cleanest way to write down all that encoding information on this ast basis

view this post on Zulip SnowFox (Oct 12 2020 at 12:10):

Why does vscodium not always offer to expand case splits for you? o.o

view this post on Zulip SnowFox (Oct 12 2020 at 13:16):

Hmm. Does it make sense to expand these out or use "modes" of the base operation?

inductive operation
| add | sub | addw
| sll | srl | sra
| xor
| or
| and
| mul | mulh | mulhs | mulhsu | mulw
| div | divu | divw | divuw
| rem | remu | remw | remuw

view this post on Zulip Mario Carneiro (Oct 12 2020 at 13:17):

how are they specified?

view this post on Zulip SnowFox (Oct 12 2020 at 13:17):

rv64i adds addw. rvm adds the last 3 rows.

view this post on Zulip SnowFox (Oct 12 2020 at 13:18):

Eh, RV64I adds a few more.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 13:18):

The main question is whether you are always going to have to case split the subparts anyway

view this post on Zulip SnowFox (Oct 12 2020 at 13:19):

Yes.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 13:19):

if there are functions where you can say (mul _) := stuff where _ is the subop, then it's a win

view this post on Zulip SnowFox (Oct 12 2020 at 13:19):

The Ws are operations over sign-extended 32-bits on 64-bit systems. the rest are XLEN-wise.

view this post on Zulip SnowFox (Oct 12 2020 at 13:20):

All but the logical operations have a W variant.

view this post on Zulip SnowFox (Oct 12 2020 at 13:20):

So I guess I should split arithmetic into two modes and split logic out.

view this post on Zulip SnowFox (Oct 12 2020 at 13:21):

I also forgot about set less than (unsigned) which I guess is logic.

view this post on Zulip SnowFox (Oct 12 2020 at 13:23):

However, there is no mul[h[[s]u]]w.

view this post on Zulip SnowFox (Oct 12 2020 at 13:23):

Only mulw, of all those options.

view this post on Zulip SnowFox (Oct 12 2020 at 13:24):

Which I guess is fine to "allow" and later mark as invalid.

view this post on Zulip SnowFox (Oct 12 2020 at 13:24):

As I'd do for subtraction with an immediate.

view this post on Zulip SnowFox (Oct 12 2020 at 13:25):

(Because addition with an immediate has signed immediates already)

view this post on Zulip SnowFox (Oct 12 2020 at 13:39):

RV{32,64}IM with a few illegal cases, but easy to normalize (subi -> addi) or mark as invalid (no mulhw et al).

abbreviation register := nat

inductive reg_imm
| reg (reg : register)
| imm (imm : nat)

inductive compare | eq | ne | lt | ge | ltu | geu

inductive arithmetic
| add | sub
| sll | srl | sra
| slt | sltu
| mul | mulh | mulhu | mulhsu
| div | divu
| rem | remu

inductive logical
| xor
| or
| and

inductive ast
| arith (op : arithmetic) (w : bool) (rd rs1 : register) (rs2_imm : reg_imm)
| logic (op : logical)  (rd rs1 : register) (rs2_imm : reg_imm)
| lui (rd : register) (offset : nat)
| auipc (rd : register) (offset : nat)
| jump (reg : register) (base : option nat) (offset : nat)
| branch (c : compare) (rs1 rs2 : register) (offset : nat)
| load (dest : register) (width : fin 4) (base : register) (offset : nat)
| store (src : register) (width : fin 4) (base : register) (offset : nat)
| fence (mode pred succ : nat)
| call
| break

view this post on Zulip SnowFox (Oct 12 2020 at 14:03):

@Mario Carneiro get_opcode and get_funct

namespace ast
  def get_opcode : ast -> opcode
  | (branch _ _ _ _) := opcode.branch
  | (jump _ (some _) _) := opcode.jalr
  | (jump _ none _) := opcode.jal
  | (arith _ ff _ _ (reg_imm.reg _)) := opcode.op
  | (arith _ ff _ _ (reg_imm.imm _)) := opcode.op_imm
  | (arith _ tt _ _ (reg_imm.reg _)) := opcode.op_32
  | (arith _ tt _ _ (reg_imm.imm _)) := opcode.op_imm_32
  | (logic _ _ _ (reg_imm.reg _)) := opcode.op_32
  | (logic _ _ _ (reg_imm.imm _)) := opcode.op_imm_32
  | (lui _ _) := opcode.lui
  | (auipc _ _) := opcode.auipc
  | (load _ _ _ _) := opcode.load
  | (store _ _ _ _) := opcode.store
  | (fence _ _ _) := opcode.misc_mem
  | call := opcode.system
  | break := opcode.system

  open arithmetic
  open logical
  open compare

  -- funct7 << 3 + funct3
  def get_funct : ast -> nat
  -- rvi
  | (branch c _ _ _) :=
    match c with
    | eq := 0b000
    | ne := 0b001
    | lt := 0b100
    | ge := 0b101
    | ltu := 0b110
    | gtu := 0b111
    end
  | (jump _ _ _) := 0
  | (arith add  _ _ _ _) := 0b000
  | (arith sub  _ _ _ _) := 0b000 + nat.shiftl 3 0b0100000
  | (arith sll  _ _ _ _) := 0b001
  | (arith slt  _ _ _ _) := 0b010
  | (arith sltu _ _ _ _) := 0b011
  | (arith srl  _ _ _ _) := 0b101 + nat.shiftl 3 0b0100000
  | (arith sra  _ _ _ _) := 0b101
  | (logic xor  _ _ _)   := 0b100
  | (logic or   _ _ _)   := 0b110
  | (logic and  _ _ _)   := 0b111
  | (load _ width _ _)   := width.val
  | (store _ width _ _)  := width.val
  | (lui _ _) := 0
  | (auipc _ _) := 0
  | (fence _ _ _) := 0
  | call := 0
  | break := 0
  -- rvm
  | (arith mul    _ _ _ _) := 0b000 + nat.shiftl 3 0b0000001
  | (arith mulh   _ _ _ _) := 0b001 + nat.shiftl 3 0b0000001
  | (arith mulhsu _ _ _ _) := 0b010 + nat.shiftl 3 0b0000001
  | (arith mulhu  _ _ _ _) := 0b011 + nat.shiftl 3 0b0000001
  | (arith div    _ _ _ _) := 0b100 + nat.shiftl 3 0b0000001
  | (arith divu   _ _ _ _) := 0b101 + nat.shiftl 3 0b0000001
  | (arith rem    _ _ _ _) := 0b110 + nat.shiftl 3 0b0000001
  | (arith remu   _ _ _ _) := 0b111 + nat.shiftl 3 0b0000001
end ast

view this post on Zulip SnowFox (Oct 12 2020 at 14:04):

@Mario Carneiro How does this partitioning look?

view this post on Zulip SnowFox (Oct 12 2020 at 14:14):

s/ast/inst/g. Not quite a tree...

view this post on Zulip Mario Carneiro (Oct 12 2020 at 14:17):

why the match on branch conditions, instead of putting it in the top level?

view this post on Zulip Mario Carneiro (Oct 12 2020 at 14:19):

Unless funct7 is actually 3 bits above funct3, you probably shouldn't mix them like that, and instead either return an inductive type (with a case for funct3 and another for funct3 + funct7), or a precomposed nat containing the entire instruction (using auxiliary functions type_1 : funct3 -> nat and type_2 : funct3 -> funct7 -> nat)

view this post on Zulip SnowFox (Oct 12 2020 at 14:21):

Actually I should shift it more than 3, as they are a fixed distance across. This is consistent for all 32-bit instructions.

view this post on Zulip SnowFox (Oct 12 2020 at 14:21):

That match will be moved. I'm currently enumerating the opcodes of interest to map them to formats; then I'll write an encoder and decoder which masks appropriately.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 14:22):

I mean you have those instruction formats, right? you should keep them separate until you have all the fields

view this post on Zulip SnowFox (Oct 12 2020 at 14:25):

The format only decides the layout of the immediates and which are present between rd rs1 rs2 funct3 funct7. funct7 "doesn't exist for immediates" but there are a few immediates which effectively use it. Specifically the shift right arithmetic. So it "acts" half-way between I and R.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 14:27):

that is, you can define

def fmt_R (opcode7 funct3 funct7 rd rs1 rs2 : nat) : nat :=
opcode7 + rd.shiftl 7 + funct3.shiftl 12 +
rs1.shiftl 15 + rs2.shiftl 20 + funct7.shiftl 25

and then write encode : ast -> nat that calls fmt_R for R-type instructions

view this post on Zulip Mario Carneiro (Oct 12 2020 at 14:28):

and fmt_I doesn't have a funct7 argument so there is no bad typing

view this post on Zulip SnowFox (Oct 12 2020 at 14:42):

The format only influences decoding (as we need to know how to mask the correct values out); and for the immediate layout while encoding. The encoder can assume every instruction has an opcode, has all the registers, and has both function parts without caring about the format.

view this post on Zulip SnowFox (Oct 12 2020 at 14:44):

Only the immediate explosion depends on the format -- for encoding. Actually decoding could probably be pretty lazy too.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 14:54):

  def opcode.encode : opcode  nat := sorry

  def branch_funct3 : compare  nat
  | eq := 0b000
  | ne := 0b001
  | lt := 0b100
  | ge := 0b101
  | ltu := 0b110
  | gtu := 0b111

  def B_type (opcode funct3 rs1 rs2 imm : nat) : nat := sorry
  def J_type (opcode rd imm : nat) : nat := sorry
  def I_type (opcode funct3 rd rs1 imm : nat) : nat := sorry
  def R_type (opcode funct3 funct7 rd rs1 rs2 : nat) : nat := sorry

  def encode_arith : arithmetic   × 
  | add := (0b000, 0)
  | sub := (0b000, 0b0100000)
  | sll := (0b001, 0)
  | slt := (0b010, 0)
  | sltu := (0b011, 0)
  | srl := (0b101, 0b0100000)
  | sra := (0b101, 0)
  | _ := sorry

  def encode : ast -> nat
  | (branch c rs1 rs2 off) := B_type opcode.branch.encode (branch_funct3 c) rs1 rs2 off
  | (jump reg (some rs1) off) := I_type opcode.jalr.encode 0b000 reg rs1 off
  | (jump reg none off) := J_type opcode.jal.encode reg off
  | (arith op w rd rs1 (reg_imm.reg rs2)) :=
    let (funct3, funct7) := encode_arith op in
    R_type (if w then opcode.op_32 else opcode.op).encode funct3 funct7 rd rs1 rs2
  | (arith op w rd rs1 (reg_imm.imm imm)) :=
    let (funct3, _) := encode_arith op in
    I_type (if w then opcode.op_imm_32 else opcode.op_imm).encode funct3 rd rs1 imm
  | _ := sorry

view this post on Zulip Mario Carneiro (Oct 12 2020 at 14:57):

I'm ignoring the funct7 field of arith with immediate here, I suppose that means that subi doesn't exist?

view this post on Zulip SnowFox (Oct 12 2020 at 14:58):

Well, subi doesn't exist anyway.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 14:59):

yeah, the type of this function doesn't really allow for error handling so I guess it doesn't matter what it does in that case

view this post on Zulip SnowFox (Oct 12 2020 at 14:59):

But srli does.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 14:59):

oh?

view this post on Zulip SnowFox (Oct 12 2020 at 15:00):

Yes, the shifts are the exception to the format rule here; they don't fit in any of them. :P

view this post on Zulip Mario Carneiro (Oct 12 2020 at 15:00):

no ISA is complete without a few exceptions to the rule

view this post on Zulip SnowFox (Oct 12 2020 at 15:00):

Because you encode the shift amount in the lower bits of the immediate; the upper bits are all zeros except for the single bit which indicates if a right shift is arithmetic vs logical.

view this post on Zulip SnowFox (Oct 12 2020 at 15:00):

arithmetic means sign-extended shift.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 15:03):

ah, the funct7 bits are switched between srl and sra

view this post on Zulip SnowFox (Oct 12 2020 at 15:04):

Yeah I noticed that while splitting the funct7 out to its own function.

view this post on Zulip SnowFox (Oct 12 2020 at 15:18):

  def encode (i : inst) :  :=
  i.get_opcode.to_nat + i.rd.shiftl 7 + i.funct3.shiftl 12
  + i.rs1.shiftl 15 + i.rs2.shiftl 20 + i.funct7.shiftl 25 + i.imm

view this post on Zulip SnowFox (Oct 12 2020 at 15:18):

This works for all 32-bit instructions.

view this post on Zulip SnowFox (Oct 12 2020 at 15:19):

The format only influences the positions of the immediates. :)

view this post on Zulip SnowFox (Oct 12 2020 at 15:21):

Returning to the magic metaprogramming we started with; implemented for all instructions. I'll adapt this to the opcode, funct3, and funct7 mapping to get bidirectional maps.

view this post on Zulip SnowFox (Oct 12 2020 at 15:23):

Actually. I may take it a step further regarding the opcode map. To actually use the rows/columns to encode them. :) Which is basically just the binary count of the position +3 while skipping any where the lower 5 bits are all high.

view this post on Zulip SnowFox (Oct 12 2020 at 15:24):

The condition map is at least a direct count. Is there a @[derive to_nat] for inductives?

view this post on Zulip Mario Carneiro (Oct 12 2020 at 15:27):

The point of the different functions is to make it easier to express which format a given instruction has, so that you have to pass 6 arguments vs 7 and such

view this post on Zulip Mario Carneiro (Oct 12 2020 at 15:28):

Regarding the opcode inductive, you could just skip it and put numbers in where I had things like opcode.op.encode

view this post on Zulip Mario Carneiro (Oct 12 2020 at 15:30):

There is no derive to_nat, although it wouldn't be hard to autogenerate

view this post on Zulip SnowFox (Oct 12 2020 at 15:30):

Could.

view this post on Zulip SnowFox (Oct 12 2020 at 15:30):

Probably will.

view this post on Zulip SnowFox (Oct 12 2020 at 15:31):

But it is nice to have the full list and to use them by-name to avoid mistakes. But this'll be verified eventually anyway... So it would be fine to just put the numbers.

view this post on Zulip Mario Carneiro (Oct 12 2020 at 15:31):

You can also skip the inductive and instead do def opcode.op := 0b0100011 or whatever

view this post on Zulip SnowFox (Oct 12 2020 at 15:33):

True. Although the biggest detail for annoyance will be flipping the encoder to decode; ideally using the metaprogramming trick to avoid the duplication.

view this post on Zulip SnowFox (Oct 12 2020 at 15:51):

!! Lean's language server appears to have a memory leak?

view this post on Zulip SnowFox (Oct 12 2020 at 15:51):

Filled my ram to 100% started using emergency swap. o.o

view this post on Zulip Mario Carneiro (Oct 12 2020 at 17:42):

keep your finger on the Lean: Restart button

view this post on Zulip SnowFox (Oct 13 2020 at 13:08):

@Mario Carneiro I'm stuck again with generating definitions. Is there anything stopping an interface for constructing these a la `(«def» q := 3) or just with def as it is obvious that we're inside an opened parenthesis so it mustn't be a command anyway.

run_cmd [
    [`load,   `load_fp,  `custom_0, `misc_mem, `op_imm,   `auipc,      `op_imm_32],
    [`store,  `store_fp, `custom_1, `amo,      `op,       `lui,        `op_32    ],
    [`madd,   `msub,     `nmsub,    `nmadd,    `op_fp,    `reserved_0, `custom_2 ],
    [`branch, `jalr,     `reserved, `jal,      `system,   `reserved_1, `custom_3 ]
  ].enum >>= λ i, n⟩, n.enum >>= λ j, n⟩, /- def n := 3 + j.shiftl 2 + i.shiftl 5 -/

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:09):

well commands are not expressions, so that doesn't work literally

view this post on Zulip SnowFox (Oct 13 2020 at 13:10):

or any similar interface if not exactly using the expr-quotes.

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:12):

I mean for the longest time there wasn't even an API to add defs at all (with all the bells and whistles)

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:13):

you can use tactic.add_decl, I think that might not add equation lemmas though

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:15):

there is also environment.add_defn_eqns for more advanced definitions

view this post on Zulip SnowFox (Oct 13 2020 at 13:16):

Re reducing variations. I've arrived at embedding the width in the register and using an "alt" flag. Usually to distinguish between (un)signed operations, though also to select the high half of mults (which then respects source register (un)sign flags). This lines up well for tuning the operations.

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:16):

Anyway lean 4 will add something a lot like you are proposing; quotation and antiquotation now works on every syntactic class

view this post on Zulip SnowFox (Oct 13 2020 at 13:18):

Awesome. When I started with Lean, I dove into Lean 4's code ... then I found myself writing Lean 4 by mistake. :P

view this post on Zulip SnowFox (Oct 13 2020 at 13:19):

As in, while using lean3.

view this post on Zulip SnowFox (Oct 13 2020 at 13:25):

I was trying to use tactic.add_defn_equations before; though with add_decl, I feel much closer but have hit at least an issue with lists/tactics/monad switching... Not sure how to fix this in Lean...

  run_cmd [
    [`load,   `load_fp,  `custom_0, `misc_mem, `op_imm,   `auipc,      `op_imm_32],
    [`store,  `store_fp, `custom_1, `amo,      `op,       `lui,        `op_32    ],
    [`madd,   `msub,     `nmsub,    `nmadd,    `op_fp,    `reserved_0, `custom_2 ],
    [`branch, `jalr,     `reserved, `jal,      `system,   `reserved_1, `custom_3 ]
  ].enum >>= λ i, n⟩, n.enum >>= λ j, n⟩,
  tactic.add_decl (declaration.defn n [] `() `(3 + j.shiftl 2 + i.shiftl 5)
                     reducibility_hints.abbrev tt)

view this post on Zulip SnowFox (Oct 13 2020 at 13:31):

There we go. .mmap

  run_cmd [
    [`load,   `load_fp,  `custom_0, `misc_mem, `op_imm,   `auipc,      `op_imm_32],
    [`store,  `store_fp, `custom_1, `amo,      `op,       `lui,        `op_32    ],
    [`madd,   `msub,     `nmsub,    `nmadd,    `op_fp,    `reserved_0, `custom_2 ],
    [`branch, `jalr,     `reserved, `jal,      `system,   `reserved_1, `custom_3 ]
  ].enum.mmap $ λ i, n⟩, n.enum.mmap $ λ j, n⟩,
  tactic.add_decl (declaration.defn n [] `() `(3 + j.shiftl 2 + i.shiftl 5)
                     reducibility_hints.abbrev tt)

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:32):

you probably want mmap' there since you aren't returning anything

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:33):

you may or may not want to precompute those numbers btw, by using let n := 3 + j.shiftl 2 + i.shiftl 5 in reflect n

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:34):

oh, actually what you have definitely won't work because you haven't antiquoted i and j

view this post on Zulip SnowFox (Oct 13 2020 at 13:34):

/-
invalid field notation, 'branch' is not a valid "field" because environment does not contain 'nat.branch'
  op
which has type

-/

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:35):

did you write op.branch somewhere?

view this post on Zulip SnowFox (Oct 13 2020 at 13:35):

Yes. And I'm running the run_cmd inside the op namespace.

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:36):

you might want to qualify those names, op is kind of general for an unqualified name

view this post on Zulip SnowFox (Oct 13 2020 at 13:36):

It seems to be clobbering the namespace... does the add_decl bypass the namespace?

view this post on Zulip SnowFox (Oct 13 2020 at 13:37):

Seem so. (`op ++ n) worked.

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:37):

Yeah, you said add_decl `op so it's making a definition called `op

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:38):

you can sometimes use ``op to do name resolution but it won't work here because the thing doesn't exist yet

view this post on Zulip SnowFox (Oct 13 2020 at 13:38):

Makes sense. I assumed the namespace would inject itself but I guess that only affects the commands. The tactic just bypasses it.

view this post on Zulip SnowFox (Oct 13 2020 at 13:39):

Current code -> https://gist.github.com/4d56511885f5b44514b0fcf006a5546a

view this post on Zulip SnowFox (Oct 13 2020 at 13:50):

Any more tips @Mario Carneiro?

view this post on Zulip Mario Carneiro (Oct 13 2020 at 13:52):

It seems a bit odd to put the alt flag in the register instead of the instruction

view this post on Zulip SnowFox (Oct 13 2020 at 13:54):

The flag basically means that the register itself is treated as signed in most cases. The mul is the only deviation, where rd.alt means it acts on the high half of the result, but then the rs1.alt and rs2.alt represent their own signs.

view this post on Zulip SnowFox (Oct 13 2020 at 15:15):

I found one edge case with my approach with the type tagged registers (the alt flag). I can't encode one variant of the the SRAI hint, where the immediate is zero; because I use the sign of the immediate to differentiate between left and right shifts. The alt in this case applies to the destination, to distinguish logical vs signed arithmetic shifts. One minor edge case!

view this post on Zulip SnowFox (Oct 13 2020 at 15:16):

I could specially encode this with its own hack... but I'll leave it out for now.

view this post on Zulip SnowFox (Oct 13 2020 at 15:23):

Considering it is a single unused custom hint; which may never be used... pretty safe to ignore. :)

view this post on Zulip SnowFox (Oct 13 2020 at 15:28):

Actually... I just overconstrained it. I don't care about the direction in this position because it has already been declared that it is an arithmetic shift; which is only valid for right shifts.

view this post on Zulip SnowFox (Oct 13 2020 at 15:45):

Actually the hint issue does exist; SRL with rd = x0 and imm = 0 is inexpressible.

view this post on Zulip SnowFox (Oct 15 2020 at 09:04):

@Mario Carneiro Major update. Still a lot left to do. Now the registers may only be legally composed for each of the instructions. This should cover all of RV{32,64,128}IMFDQZicsr, or soon. I need to do another pass over all the variations. The core instruction type unifies all these extensions well. Abstracting over int vs float, (un)signed, and all widths. https://gist.github.com/4d56511885f5b44514b0fcf006a5546a


Last updated: May 08 2021 at 23:10 UTC