Zulip Chat Archive

Stream: general

Topic: hardware verification


Tim Daly (Feb 15 2020 at 13:38):

I've been looking at "verification down to the metal". Robert Baruch has been designing a CPU in an FPGA and documenting each step with video. One of his videos is "Very Basic Introduction to Formal Verification" which involves formal verification of Verilog (hardware design language) modules. See (https://www.youtube.com/watch?v=9e7F1XhjhKw) and also the paper "Veriflog Synthesis and Formal Verification with Yosys" (http://www.clifford.at/papers/2016/yosys-synth-formal/slides.pdf)

Tim Daly (Feb 15 2020 at 13:49):

Yosys is using things like BDDs or model checking. But a lot of "metal" level "modules" in Verilog are basically either Mealy or Moore state machines. It seems to me that these could be modelled in logic and Lean-like proof machinery could be applied. I'm reading a book on hardware state machines at the moment with an eye towards a potential implementation. That would mean that, for instance, microcode implementations would be fully verifiable at the hardware level.

Tim Daly (Feb 15 2020 at 14:04):

For those who want to play along see Volnei Pedroni "Finite State Machines in Hardware" (https://www.amazon.com/Finite-State-Machines-Hardware-SystemVerilog/dp/0262019663)

Tim Daly (Feb 15 2020 at 14:12):

Lean seems to have an advantage over, say ACL2, in that we have dependent types. It should be possible to declare a 'register type' that knows the size of the register, for example.

Tim Daly (Feb 15 2020 at 22:50):

Lets play 'assume'. Assume you can prove that a hardware finite state machine (FSM) implements a specification. (Since the implementation is finite state then it seems you can use a SAT solver?). Assume that the specification is Lean's proof checker (so the FSM can check a Lean proof). Assume the state machine(s) is small enough to fit in an Field Programmable Gate Array (FPGA) (which are huge these days). Then there is a piece of hardware to proof check. Note that the large data center companies (but not ordinary mortals) have access to the new Intel CPU which has a built-in FPGA. So... one could have a hardware "instruction" to check a proof.

Tim Daly (Feb 15 2020 at 22:54):

(sorry, wrote thread). A more (mealy?) interesting question: Can CIC be implemented in finite state machines?

Mario Carneiro (Feb 15 2020 at 23:08):

yes?

Mario Carneiro (Feb 15 2020 at 23:08):

What's the actual question? Of course everything is possible in principle

Tim Daly (Feb 15 2020 at 23:14):

The X86 instruction set is beyond my powers to fully specify (I've tried). So trying to validate an implementation is also outside the possible (at least for me). However, an "instruction set" that is specific to proof checking and/or CIC and is well designed (say, as an FSM) then it would be easier to validate.

Tim Daly (Feb 15 2020 at 23:16):

In addition, data centers used FPGAs all over the place (e.g. they run compression algorithms "on the wire" so the CPU can send/receive at higher bandwidth, they run encryption/decryption "on the wire" so data in flight can't be read or modified). So if code "carried proofs" then you could check, in parallel, that the code is valid.

Tim Daly (Feb 15 2020 at 23:18):

Modern Intel CPUs contain an FPGA which, in principle, allows you to define your own instructions. So you could "side load" a proof checker.

Tim Daly (Feb 15 2020 at 23:21):

So you package the proof checker FPGA binfile, the code, the code proof, all "under signed encryption", and then I (well, not I since I don't have access to the new CPUs) can send a "proven GCD" algorithm.

Tim Daly (Feb 15 2020 at 23:24):

I have a couple FPGAs and I'm looking at the question of validating FSMs for my own chosen instruction set. The game is to design the instruction set to underlie the proof checker.

Tim Daly (Feb 15 2020 at 23:33):

Rather than a Peano succ(succ(succ zero))) computation you'd rather work in normal machine binary (which, as I understand Univalence, says they are "equal"). So Lean can use much higher level (h-level) kinds of reasoning if you can "insert a computer algebra system" into Lean. But you need proven algorithms, presumably "down to the metal".

Mario Carneiro (Feb 15 2020 at 23:34):

Yes, this is possible, but lean is not optimally positioned for such an effort. You really want a simplified kernel that doesn't take tens of thousands of lines to write and especially to specify

Tim Daly (Feb 15 2020 at 23:35):

I've been looking at Andromeda (Bauer) which could (possibly) use Lean's CIC rules in declarative form.

Mario Carneiro (Feb 15 2020 at 23:37):

FYI, in case you weren't aware, you are describing my current project almost exactly

Mario Carneiro (Feb 15 2020 at 23:37):

I'm not targeting an FPGA, although it could be done

Tim Daly (Feb 15 2020 at 23:38):

Sorry, I'm not trying to step on your turf. I'm coming at it from the question of "computational mathematics", merging proof and computer algebra.

Mario Carneiro (Feb 15 2020 at 23:38):

I just wasn't sure if you were trolling me ;)

Chris B (Feb 15 2020 at 23:39):

It's too late for that now. Pistols at dawn.

Tim Daly (Feb 15 2020 at 23:39):

I assume you've seen my "paper" on the Intel instruction semantics.

Mario Carneiro (Feb 15 2020 at 23:39):

The 7000 page one?

Tim Daly (Feb 15 2020 at 23:40):

Yeah (Intel is big, the Intel reference book pile is a couple feet tall)

Mario Carneiro (Feb 15 2020 at 23:40):

You don't need the entire instruction set to write a proof checker though

Tim Daly (Feb 15 2020 at 23:40):

Leave Dawn out of this . She is innocent.

Tim Daly (Feb 15 2020 at 23:43):

Yep, you don't need all that semantics. You only (maybe) need a finite state machine with a few, validated, and carefully crafted instructions.

Tim Daly (Feb 15 2020 at 23:45):

I tried to do an FPGA startup so I have some Altera FPGA hardware (which is the company Intel bought) and I've been playing around "in the sand", thinking about a validated proof checker.

Tim Daly (Feb 15 2020 at 23:45):

I will look deeper into what you're doing. I'm seriously not trying to step on your turf.

Mario Carneiro (Feb 15 2020 at 23:46):

you know, there are other options than stepping on turf, you could come in the front door

Tim Daly (Feb 15 2020 at 23:46):

Rumor has it someone has a bullet with my name on it :-)

Mario Carneiro (Feb 15 2020 at 23:46):

I'm sure there are ways to collaborate

Mario Carneiro (Feb 15 2020 at 23:46):

without the pistols

Tim Daly (Feb 15 2020 at 23:49):

Let me get up to speed on your links and see if I can help

Mario Carneiro (Feb 15 2020 at 23:49):

This might be a place to start: https://arxiv.org/pdf/1907.01283.pdf

Tim Daly (Feb 15 2020 at 23:50):

Printing it now. Thanks.

Tim Daly (Feb 15 2020 at 23:53):

I've been looking at your paper on the Type theory of Lean, trying to figure out how to wedge that into Andromeda

Mario Carneiro (Feb 15 2020 at 23:58):

The last question at http://www.andromeda-prover.org/answers.html suggests that Andromeda is too "opinionated" to be general purpose, and in particular cannot support CIC

Mario Carneiro (Feb 16 2020 at 00:00):

There is an MM0 implementation of lean at lean.mm1

Simon Cruanes (Feb 16 2020 at 00:02):

@Mario Carneiro that's the perfect moment to ask @Tim Daly to write another mm0 proof checker :upside_down:
Or maybe two checkers, the second of which would be on his FPGA.

Tim Daly (Feb 16 2020 at 00:20):

Comments on your x86 paper.

(1) "information on Linux system calls"... yeah, I had to skip over system calls in the 7000 page paper. System calls were "on the list" to be thought about but our Malware project got cancelled.

(2) "bootstrapping a theorem prover"... I can see where we've crossed paths. I gave up on the x86 path and decided to look at FPGA finite state machines (which a SAT solver can check).

(3) The link I gave above to the Baruch video "Very Basic Introduction to Formal Verification" shows a "prove" mechanism that can be applied to Verilog (hardware spec) code. It includes inductive proofs.

Mario Carneiro (Feb 16 2020 at 00:24):

I don't handle very many linux system calls, but you need a few in order to do anything. I have open, read, write, fstat and mmap. In addition to that, you need to axiomatize the program loading behavior, which is surprisingly subtle and context dependent. I think it's not actually possible to guarantee that your program will not run out of memory and access out of bounds before it gets to the first line of the actual program (the ELF _start label)

Tim Daly (Feb 16 2020 at 00:24):

(4) The 7k paper includes information on the ELF format, so we've both suffered through that.

Tim Daly (Feb 16 2020 at 00:29):

The linking loader is a self-modifying program. It takes "data" of your program and "data" of libraries, constructs a program, and then "passes execution" to the constructed binary. I'm outlining a paper on "Self Modifying Code" using the linking loader as an example. The idea is to view the linking loader in a "continuation passing style" so the loaded program "returns to the linker.

The Tail-Called Linker
It isn't what it was because it is gone.
It isn't what it will be because it is becoming.
But it alway is.

Tim Daly (Feb 16 2020 at 00:30):

The 7k semantics program was hand-checked against the C standard library, instruction by instruction.

Tim Daly (Feb 16 2020 at 00:33):

(I have a morbid fascination with self-modifying code. I believe it is a way out of Turing's box but that's just a belief so far)

Chris B (Feb 16 2020 at 00:33):

Not sure how much overlap there is with what you want to do, but I saw a talk once by Adam Chlipala about work his team was doing on verifying hardware/ISA stuff for riscv using coq and a tool they built in coq called Kami. The repo is still pretty active and seems to have examples, it might give you some insight into how people have applied a DTT-based prover to that domain.
https://github.com/sifive/Kami

Tim Daly (Feb 16 2020 at 00:35):

Hmpf. I've read a bit of Chlipala's work but this is the first I've heard of Kami. Thanks.

Tim Daly (Feb 16 2020 at 00:36):

@Simon Cruanes re: re-implementing... not a bad idea but first I have to reverse-engineer mm0 :-)

Chris B (Feb 16 2020 at 00:39):

Looks like there are some papers on the main page, they might be more informative than the repo. http://plv.csail.mit.edu/kami/

Tim Daly (Feb 16 2020 at 00:42):

Their Bluespec work looks dead center on what I've been trying to do. Pistols at dawn. It's all her fault anyway.

Tim Daly (Feb 16 2020 at 00:48):

Google search "zipcpu formal" for a lot of interesting threads

Tim Daly (Feb 23 2020 at 09:01):

There is some hardware state machine work at York: Foster, Baxter, Cavalcanti, Miyazawa, and Woodcock "Automating Verification of State Machines with Reactive Design in Isabelle/UTP" (https://arxiv.org/pdf/1807.08588.pdf). This is based on Hoare and He "Unified Theory of Programming" (UTP) in Isabelle.

In pursuit of a "proven hardware proof checker" I bought a Diligent CmodA7 Field Programmable Gate Array and have the Xilinx Vivado software. I'm looking for papers providing formal proof techniques for finite state machines. The idea is that the hardware proof checker has its own proof which can then proof-check Lean proofs.

Tim Daly (Feb 23 2020 at 09:39):

Hoare "Unified Theories of Programming" (https://fi.ort.edu.uy/innovaportal/file/20124/1/04-hoare_unified_theories.pdf) merging Denotational, Algebraic, and Operational semantics.

Tim Daly (Feb 25 2020 at 08:41):

Since Lean and Coq implement the same logic it seems to me that their proof checker should be the same code. Is this reasonable?

Tim Daly (Feb 25 2020 at 08:43):

Further, since the proof checker basically takes an input and returns either true or false, it is essentially a Binary Decision Diagram (BDD). Knuth probably wrote one on a napkin over lunch for his book. Given that, it seems that the instruction set for a proof checker ought to support the decisions.

Donald Sebastian Leung (Feb 25 2020 at 08:43):

Hmmm ... I'm no expert but I would disagree. Even though their base calculus (CoIC) is the same, they may extend it in different ways. For example, Haskell is based on System-F but it adds general recursion and a whole host of other things on top of it.

Tim Daly (Feb 25 2020 at 08:44):

Shouldn't that be "reasoned away" by the time you get to the bottom of the trusted core?

Donald Sebastian Leung (Feb 25 2020 at 08:45):

Maybe, IDK, I'm no expert so I might just be making a fool out of myself :stuck_out_tongue:

Tim Daly (Feb 25 2020 at 08:46):

Oh, not to worry. I'm certainly making a fool of myself.

Tim Daly (Feb 25 2020 at 09:01):

There is too much to know and not enough time to know it all. It helps to ask (stupid) questions so someone with expertise can explain why I'm wrong.

Patrick Massot (Feb 25 2020 at 09:05):

You could still take a couple of days off trolling and actually try to learn some Lean by writing Lean code.

Mario Carneiro (Feb 25 2020 at 09:10):

The short answer is no, they do not have the same kernel, nor even compatible kernels, for a host of insignificant but technically complicated reasons

Kevin Buzzard (Feb 25 2020 at 09:12):

I'm also not an expert, but my impression is that Mario "wrote down what Lean's logic was" in his MSc thesis, and nobody has done the analogous thing for Coq's logic?

Tim Daly (Feb 25 2020 at 09:13):

I am spending time writing Lean code (working thru TPIL). I just don't have anything worth saying about it. I'm sorry you feel that I'm "trolling". The whole Lean community only seems to care about issues related to mathlib. So, obviously, trying to introduce topics of program proof and "down to the metal" are "off topic". Note that these are active topics in Coq (e.g. Bluespec (Chlipala), Deep Specification (Appel, Pierce, et al), and PhD work on the subject.

Mario Carneiro (Feb 25 2020 at 09:13):

And yet @Bas Spitters says that Coq's theory is more well understood than lean's

Tim Daly (Feb 25 2020 at 09:14):

The computer algebra community also seems me at "trolling" because nobody believes that any mathematician writing computer programs would waste time on a proof.

Tim Daly (Feb 25 2020 at 09:16):

I guess its time to shut down this thread.

Kevin Buzzard (Feb 25 2020 at 09:23):

It's certainly true that the two provers seem to have attracted two different communities. I think Leo wants to attract more program-proof people with Lean 4, I get the impression that certain design decisions have been made in order to tempt these people over.

Tim Daly (Feb 25 2020 at 09:31):

mathlib is certainly useful. Its results need to be woven into the inheritance hierarchy of Axiom so individual programs can use the theorems. I'm working on that "weaving" (hence my discussion about parsing and comments about naming conventions and non-unicode). With a focus on proving (math) programs correct it is also vital that proofs have execution semantics (hence my "trolling" on proglib). But since I"m not "contributing" to mathlib development it is clear that any other topic is "off topic". So I don't think Leo will attract any program-proof people, especially from computer algebra who already think proofs are not worth writing.

Tim Daly (Feb 25 2020 at 09:34):

I speak from experience since "proving programs correct" was one of the main issues that caused a hard fork of the Axiom project. You would think that mathematicians who write algebra would want proofs but apparently not. And mathematicians who write proofs don't want programs.

Sebastian Ullrich (Feb 25 2020 at 09:35):

If you want executable proofs, a system with built-in proof irrelevance does not seem like the best choice, yes. Nothing of that will change in Lean 4.

Tim Daly (Feb 25 2020 at 09:49):

Why on earth would you write a new kernel for a proof system and not prove it correct? The most trusted part of the trusted system should be proven "down to the metal". Program proofs are central to the whole issue. I'd expect the kernel to be written starting from the judgments with incremental development down to code with each step proven correct. This is computational mathematics, not a game design.

Tim Daly (Feb 25 2020 at 10:03):

But I am "singing the dual of that argument" in the computer algebra clade. Why on earth would you write a GCD algorithm and not prove it correct, down to the metal? This is computational mathematics, not game design. Somehow computational mathematics has forked into two clades that don't have the same notion of "computational mathematics", although it is probably just me thinking they are related, given the evidence.

Tim Daly (Feb 25 2020 at 10:04):

Anyway, I'm clearly off topic again. Sigh.

Bas Spitters (Feb 25 2020 at 10:10):

@Mario Carneiro I was referring to your post on coq-club: "the VM is undergoing some significant changes".
Ignoring the computational/meta-theoretical aspects of syntax, semantically, the theories seem to the similarly worked out. E.g. the set theoretical model for pCuIC:
https://hal.inria.fr/hal-01952037/document
@Gaëtan Gilbert 's thesis contains quite a detailed overview of the implementation of both Coq and lean.

Mario Carneiro (Feb 25 2020 at 10:16):

@Bas Spitters Do you have a link to @Gaëtan Gilbert 's thesis? I'm not finding it in the obvious places online

Gaëtan Gilbert (Feb 25 2020 at 10:23):

I talked a bit about type theory which can be applied to lean but I would say it's the theoretical side rather than the implementation side.

https://gitlab.com/SkySkimmer/thesis/-/jobs/artifacts/master/download?job=build
I need to find a better place to put it.

Mario Carneiro (Feb 25 2020 at 10:25):

I'm looking for a theoretical account of the complete kernel

Mario Carneiro (Feb 25 2020 at 10:26):

For instance this would have to include both the fixed universes and also parametric universes, because the coq kernel deals with both systems simultaneously

Mario Carneiro (Feb 25 2020 at 10:27):

Also it has to define recursion using fix and match rather than elimination, because that's what the kernel does

Mario Carneiro (Feb 25 2020 at 10:28):

(I have had difficulty finding the complete set of rules for the structural less than relation)

Johan Commelin (Feb 25 2020 at 12:55):

Tim Daly said:

mathlib is certainly useful. Its results need to be woven into the inheritance hierarchy of Axiom so individual programs can use the theorems.

(emphasis mine)

Dear Tim, I think it's statements like this that rub some of us the wrong way. You are highly invested in Axiom, and that's fine. But often you write as if there is a logical necessity that Lean and Axiom work together. Or that Lean is designed to be some piece in the big grand toolchain of Axiom. That's just not true.

Working on building interaction between Lean and Axiom is a great idea. Very important. But both systems are also valuable (maybe less, but still) without that interaction.

Simon Cruanes (Feb 25 2020 at 14:33):

Tim Daly said:

Further, since the proof checker basically takes an input and returns either true or false, it is essentially a Binary Decision Diagram (BDD).

by that argument, most programs in logic are "basically" a BDD. But inputs are of arbitrary size, so it doesn't work that way!
A kernel for CiC is a complicated thing to write (afaik, the most complicated kind of kernel of trust found in proof assistants). Doing it purely in hardware would be a research problem.

Reid Barton (Feb 25 2020 at 14:49):

And not just programs in logic but also all decision problems, which means basically all problems considered in complexity theory. Suggesting they all reduce to something Knuth wrote on a napkin cannot be taken seriously.


Last updated: Dec 20 2023 at 11:08 UTC