Zulip Chat Archive

Stream: maths

Topic: Formalizing a theoretical AI safety result

Geoffrey Irving (Nov 27 2023 at 11:54):

I put https://leanprover.zulipchat.com/#narrow/stream/236449-Program-verification/topic/Formalizing.20a.20theoretical.20AI.20safety.20result/near/404387500 in "Program verification", but since it's a theoretical complexity theory result rather than an normal program I'll mention it here as well.

https://github.com/google-deepmind/debate formalizes a theoretical result related to AI safety in Lean 4 (the paper is https://arxiv.org/abs/2311.14125). It is a fairly simple theorem, but was a fun exercise to formalize (and was useful in learning the differences between Lean 3 and Lean 4). And now we have the basic setting down in case we want to formalize follow-up results (this paper is just a baby step).

debate.png

Kevin Buzzard (Nov 27 2023 at 12:06):

completeness 3/5 and soundness 3/5 -- are these marks out of 5?

Geoffrey Irving (Nov 27 2023 at 12:06):

They’re probabilities of success. The only important thing is that they’re > 1/2, similar to the definition of the complexity class BPP. (I should probably fix that theorem statement to be more closed: that version is pretty lazy.)

Last updated: May 02 2025 at 03:31 UTC