Zulip Chat Archive

Stream: lean4

Topic: quickchecks `Gen` monad


Henrik Böving (Feb 26 2022 at 15:17):

I've been looking into building a quickcheck implementation for Lean and I noticed that in the original quickcheck paper they mention that the Gen monad is in fact not a lawful monad due to the way it is implemented by splitting the RNG into two seperate ones and passing one down to each computation in the bind implementation.

Now I've been wondering why exactly it is done like this and not with a State monad wrapping around an RNG? It would be a lawful monad like this and you'd still be carrying an RNG around with you. It would also mimick the behaviour of an RNG in the imperative world since there it usually has global state (well here it is not global but you get the point)

so my question: Why exactly is it done via the split method on RNG instead of via a state monad? If they worry about efficiency they could also use some sort of ref as well?

Kevin Buzzard (Feb 26 2022 at 15:19):

(Is this related to slim_check in Lean 3? Sorry for the noise if not.)

Henrik Böving (Feb 26 2022 at 15:24):

Yes it could be used to implement a tactic like this. In general quickcheck is a tool for what software engineers calls "property based testing" so instead of specifying test cases yourself, e.g. 1 + 2 == 2 + 1 you tell quickcheck, here I have some function with inputs x y z, generate n input pairs for me and check whether this function always returns true. If it doesn't (this is new in quickcheck v2) try to figure out a minimal counter example based on the inputs you found.

And of course this could be used as a tactic to quickly check whether a theorem is bogus as well, just like slim_check.

Reid Barton (Feb 26 2022 at 15:28):

I don't actually know the reason--I would guess it is related to laziness or possibly parallelism somehow.

Reid Barton (Feb 26 2022 at 15:28):

But, I do know that this caused the following fun bug: https://gitlab.haskell.org/ghc/ghc/-/issues/3620

Henrik Böving (Feb 26 2022 at 15:29):

Yeah something like this was what I was curious about as well, the split doc string mentions that it has to ensure both new RNGs are independent but in a verified world we'd actually like a proof for what as well which seems very complicated.

Henrik Böving (Feb 26 2022 at 15:32):

At least very complicated compared to just going the state monad approach

Reid Barton (Feb 26 2022 at 15:40):

I'm not sure how (or why) you would make slimcheck verified anyways--I guess you could try to state that the PRNG is indistinguishable from randomness under some computational complexity assumption, but surely you can't prove it

Henrik Böving (Feb 26 2022 at 15:42):

The Coq folks at least seem to be interested in verifying parts of their quickcheck variant (they very creatively called it QuickChick, if someone has an idea for Lean feel free to tell me :p) although it is not concerned with properties of the RNG but rather with properties of the value generators themselves.

Chris B (Feb 26 2022 at 15:45):

I think Simon Hudon was looking at something like the inductive relation generator thing, he might have some advice if you go in that direction.

Henrik Böving (Feb 26 2022 at 15:45):

Thats still far far in the future but eventually sure!

Chris B (Feb 26 2022 at 15:49):

Do you know if quickchick (the coq one) also uses type-based shrinking? I was reading some property testing stuff a few weeks ago, and the Haskell people seem to believe that doing shrinking by type was a mistake, but it seems like there would be ways around that in Coq/Lean, maybe with subtyping.

Matthew Ballard (Feb 26 2022 at 15:49):

I had to quickly google the difference between PBT and fuzzing. I wonder how much of the work could be transferable to a theorem fuzzer?

Henrik Böving (Feb 26 2022 at 15:52):

Chris B said:

Do you know if quickchick (the coq one) also uses type-based shrinking? I was reading some property testing stuff a few weeks ago, and the Haskell people seem to believe that doing shrinking by type was a mistake, but it seems like there would be ways around that in Coq/Lean, maybe with subtyping.

I havent fully read through their paper yet and admitedly my Coq skills are kinda non existent? I'm just reading it like I would read Lean code really haha. But I'm definitely planning to understand what both quick check and chick are doing before actually getting this done, the question in the thread just sprang to my attention very early on since the Gen monad is so fundamental to this.

Henrik Böving (Feb 26 2022 at 15:59):

Matthew Ballard said:

I had to quickly google the difference between PBT and fuzzing. I wonder how much of the work could be transferable to a theorem fuzzer?

What would a theorem fuzzer do?

Mario Carneiro (Feb 26 2022 at 16:00):

Wouldn't it be simpler to port slim_check? Or is there something about the design you don't like

Henrik Böving (Feb 26 2022 at 16:09):

What exactly is the difference between slim_check and quickcheck? It seems to me like it is very closely modeled after it? we have the gen monad, the testable typeclass, sampleable instead of arbitrary (?)

Henrik Böving (Feb 26 2022 at 16:13):

I can of course see that slim_check is meant to actually operate on stuff of type Prop while quickcheck is of course very much not capable of doing this due to Haskell limitations but that I was definitely planning to look into as well

Julian Berman (Feb 26 2022 at 16:17):

(I'm no expert certainly but maybe https://hypothesis.works/articles/quickcheck-in-every-language/ and the Hypothesis docs are worth a glance which is what we use in Python-land. David is not a huge fan of QuickCheck I think but I've seen lots of quickcheck users show up and like Hypothesis)

Henrik Böving (Feb 26 2022 at 16:24):

Seems to me like their idea here is to explicitly pass generators instead of letting them be automatically infered by type class resolution?

Mario Carneiro (Feb 26 2022 at 16:25):

Henrik Böving said:

What exactly is the difference between slim_check and quickcheck? It seems to me like it is very closely modeled after it? we have the gen monad, the testable typeclass, sampleable instead of arbitrary (?)

I'm saying that slim_check is a port of quickcheck to lean 3, so if you are working on a port to lean 4 then the existing thing should at least be considered rather than redoing the work

Henrik Böving (Feb 26 2022 at 16:25):

Ah of course yes

Mario Carneiro (Feb 26 2022 at 16:27):

And of course you can have it in lean 4 syntax at https://github.com/leanprover-community/mathlib3port/blob/master/Mathbin/Testing/SlimCheck/Functions.lean

Matthew Ballard (Feb 26 2022 at 17:21):

Henrik Böving said:

What would a theorem fuzzer do?

Good question. Add random results to the context and try to prove false?

Henrik Böving (Feb 26 2022 at 17:22):

And what would be the benefit of that?

Kevin Buzzard (Feb 26 2022 at 17:28):

Well I guess we've got to figure out how to make these things prove theorems somehow...

Jay Sulzberger (Feb 26 2022 at 17:41):

Look at the proof. At the proof, ah, syntax, of various lemmas. Then, bearing in mind the local topology of these lemmas, and the various topologies induced by various variants of already proved theorems used, cunningly relax the lemmas. That is, remove or weaken things used in the proof. So, not add stuff, but remove. Obviously Math Zero will both weaken and strengthen, and deform sideways, everything in sight.

Simon Hudon (Feb 26 2022 at 20:09):

@Henrik Böving I'd be happy to see slim_check ported to Lean 4. Let me know if you want to discuss some of the design decisions

Henrik Böving (Feb 26 2022 at 20:11):

Sure! I'm still reading into all the three variants I've found interesting so far (Haskell Coq Lean) but I'll get back here with more questions sooner or later probably, but so far I'm very much liking the slim_check variant, are there some things you were limited in by 3 that you would've done in another way if possible?

Simon Hudon (Feb 26 2022 at 20:14):

I can't think of any particular pain points in what is up there. I started experimenting with generators on my own but I never got to something I found usable. The generators seemed too slow to be used within Lean. I was trying it out on a typing relation for a programming language. I remember that proving the termination of those generators would get really hairy.

Simon Hudon (Feb 26 2022 at 20:16):

I'm hoping that ramping up the performances in Lean 4 could help make the library more usable. For the termination problem, it might just be that staying out of meta-land was overly ambitious and that something more useful could be produced by cutting some corners.

Simon Hudon (Feb 26 2022 at 20:18):

I was especially attached to the fact that the type of testing functions would promise that every counterexample was actually sound. I don't know if that would have to be compromised to improve the usability.


Last updated: Dec 20 2023 at 11:08 UTC