Zulip Chat Archive
Stream: Is there code for X?
Topic: Vacuous implication from plausible
Siddhartha Gadgil (Nov 13 2024 at 08:39):
If we can show (part of) the hypothesis cannot be satisfied using plausible
, is there a tactic that proves the result by vacuous implication. What I am missing is extracting the proof of the contradiction from plausible
.
Siddhartha Gadgil (Nov 13 2024 at 08:57):
@Henrik Böving Can I do this?
Siddhartha Gadgil (Nov 13 2024 at 09:03):
I see Testable.checkIO
with which I should be able to work.
Henrik Böving (Nov 13 2024 at 09:16):
You can't use that to actually construct a proof from the meta level that you can put to close a goal. Instead what you should do is extract the counter example values from a TestResult and then using that write some meta code that does the proof that False is in the context. The counter example that is presented there is not really meant for...any but the infoviews consumption really, given that it is just a List String
. So you would probably need to modify TestResult to contain a more expressive representation of counter examples
Siddhartha Gadgil (Nov 13 2024 at 09:32):
Henrik Böving said:
You can't use that to actually construct a proof from the meta level that you can put to close a goal. Instead what you should do is extract the counter example values from a TestResult and then using that write some meta code that does the proof that False is in the context. The counter example that is presented there is not really meant for...any but the infoviews consumption really, given that it is just a
List String
. So you would probably need to modify TestResult to contain a more expressive representation of counter examples
Thanks. Yes, I was thinking in terms of extracting from .faliure pf ..
.
It seems it would be convenient to have a vacuous
tactic proving results by using plausible
to contradict hypothesis. Will experiment.
Siddhartha Gadgil (Nov 18 2024 at 08:38):
@Henrik Böving I wrote some code that does work in principle, but (as it involves using reduce
on a large scale`) causes crashes in larger cases. I am copying it below.
I could not figure out a better way to lift the term proof in TestResult
to an expression proof. I see that you mentioned modifying TestResult
to have more. Is that what you meant? Is there an option short of lifting everything to Meta level, requiring ToExpr
and constructing proof expressions at each stage?
Here is my code in case anyone can see improvements.
import Lean
import Plausible
open Lean Meta Elab Tactic Plausible
def Plausible.Testable.checkDirect (p : Prop) [Testable p] (cfg : Configuration := {})(defaultSeed : Nat) : TestResult p :=
let seed := cfg.randomSeed.getD defaultSeed
let suite : RandT Id (TestResult p) := Testable.runSuite p cfg
let res := runRandWith seed suite
res.run
def getProof? (p e: Expr) : MetaM <| Option Expr := do
let n ← mkFreshExprMVar <| mkConst ``Nat
let s ← mkFreshExprMVar <| ← mkAppM ``List #[mkConst ``String]
let np ← mkAppM ``Not #[p]
let contra ← mkFreshExprMVar np
let fExpr ← mkAppOptM ``TestResult.failure #[p, contra, s, n]
if ← isDefEq e fExpr then
return some contra
else
return none
def findDisproof? (p: Expr) : MetaM <| Option Expr := do
try
unless ← isProp p do
return none
let p' ← Decorations.addDecorations p
let inst ← synthInstance (← mkAppM ``Testable #[p'])
let cfg : Configuration := {}
let cfg : Configuration := {cfg with
traceDiscarded := cfg.traceDiscarded || (← isTracingEnabledFor `plausible.discarded),
traceSuccesses := cfg.traceSuccesses || (← isTracingEnabledFor `plausible.success),
traceShrink := cfg.traceShrink || (← isTracingEnabledFor `plausible.shrink.steps),
traceShrinkCandidates := cfg.traceShrinkCandidates
|| (← isTracingEnabledFor `plausible.shrink.candidates) }
let defaultSeed ← IO.rand 0 1000000
let testResult ← mkAppOptM ``Testable.checkDirect #[p, inst, toExpr cfg, toExpr defaultSeed]
let testResult ← reduce testResult -- times out
getProof? p testResult
catch e =>
logWarning m!"Failed to find disproof: {e.toMessageData}"
return none
def proveVacuous? (p: Expr) : MetaM <| Option Expr := do
match p with
| .forallE n d b bi =>
withLocalDecl n bi d fun x => do
let b := b.instantiate1 x
let contra? ← findDisproof? d -- proof of ¬d
match contra? with
| some contra =>
let pfFalse ← mkAppM' contra #[x]
let pfBody ← mkAppOptM ``False.elim #[b, pfFalse]
mkLambdaFVars #[x] pfBody
| none =>
return none
| _ =>
return none
elab "find_disproof" type:term : term => do
let p ← Term.elabType type
-- logInfo m!"Finding disproof of {p}"
let disproof ← findDisproof? p
match disproof with
| some contra =>
-- logInfo m!"Found disproof: {contra}"
return contra
| none =>
logWarning m!"No disproof found"
return mkConst ``False
#check find_disproof ((2 : Nat) < 1)
elab "prove_vacuous" type:term : term => do
let p ← Term.elabType type
let vacuous ← proveVacuous? p
match vacuous with
| some pf =>
return pf
| none =>
logWarning m!"No vacuous proof found"
return mkConst ``False
/-
fun a => False.elim (Nat.not_le_of_not_ble_eq_true (fun h => Bool.noConfusion h) a) : 2 < 1 → 1 ≤ 3
-/
#check prove_vacuous ((2 : Nat) < 1) → 1 ≤ 3
#check find_disproof (∀ n: Nat, n < (4: Nat)) -- times out
Henrik Böving (Nov 18 2024 at 08:43):
Running plausible within reduction is not a good idea. You should instead modify TestResult
to actually return you the values of the counter example, so something along the lines of HashMap FVarId Expr
most likely. Then use that counter example to build a proof term at the meta level and submit that proof term to the kernel.
Siddhartha Gadgil (Nov 18 2024 at 08:47):
Henrik Böving said:
Running plausible within reduction is not a good idea. You should instead modify
TestResult
to actually return you the values of the counter example, so something along the lines ofHashMap FVarId Expr
most likely. Then use that counter example to build a proof term at the meta level and submit that proof term to the kernel.
Thanks. I take it that means copying and modifying most of the file Testable.lean
. Is that correct?
Siddhartha Gadgil (Nov 18 2024 at 08:47):
I suppose I can avoid stuff not needed for counterexamples.
Henrik Böving (Nov 18 2024 at 08:48):
I think it mostly comes down to modifying the addVarInfo
function there, you probably want to do this on a fork of plausible
Siddhartha Gadgil (Nov 18 2024 at 08:51):
Do you mean that the failure case should have additional info besides the string, and this is added in addVarInfo
?
Henrik Böving (Nov 18 2024 at 08:53):
It should record the Expr
that is being assigned to the variable at the very least.
Siddhartha Gadgil (Nov 18 2024 at 08:54):
It seems to me that just recording the proof expression may be easiest.
Siddhartha Gadgil (Nov 18 2024 at 08:55):
There may even be no variables, for instance when proved using decide
.
Henrik Böving (Nov 18 2024 at 08:56):
Also an option yes
Siddhartha Gadgil (Nov 18 2024 at 08:57):
Let me work on this then.
Siddhartha Gadgil (Nov 25 2024 at 03:07):
@Henrik Böving I have opened a PR at https://github.com/leanprover-community/plausible/pull/7. This lifts (some of) the code to expression level and uses this to introduce two new tactics:
- The
vacuous
tactic tries to negate (a part of the) hypothesis to prove by vacuous implication. - The
random_search
tactic negates the goal and tries to disprove it.
These will probably rarely come up in human proofs but I feel will be very much needed for AI generated code.
The design is a little complex but is the best I could come up with. Suggestions on improving the code are very welcome (@Kim Morrison and @Eric Wieser - please let me know if you have any design suggestions).
Henrik Böving (Nov 25 2024 at 08:47):
I don't think that duplicating what basically amounts to the entire code base is a reasonable thing to do as a design. We should instead work to integrate this with the already existing code base through a refactoring. Currently there aren't that many implementors of typeclasses in plausible so we can still change the interface .
Siddhartha Gadgil (Nov 25 2024 at 08:49):
Henrik Böving said:
I don't think that duplicating what basically amounts to the entire code base is a reasonable thing to do as a design. We should instead work to integrate this with the already existing code base through a refactoring. Currently there aren't that many implementors of typeclasses in plausible so we can still change the interface .
One issue is that to construct proofs we need a ToExpr
for the proxies of what is sampled. So, for example, the MyType
does not work as is.
Siddhartha Gadgil (Nov 25 2024 at 08:49):
So there will be fewer instances of MetaTestable
than of Testable
.
Siddhartha Gadgil (Nov 25 2024 at 08:52):
Other than that, I agree that duplication is not good. Part of the reason for duplication was just not to break things.
Henrik Böving (Nov 25 2024 at 08:53):
The specific MyType
example can be made to work with a decide
as all values are going to be concrete but the point is correct yes. Still I don't think that splitting up the plausible eco system into two type classes is something that can be sustainable in the long term, surely there must be some good design alternative here somewhere.
Siddhartha Gadgil (Nov 25 2024 at 08:57):
I agree that splitting into two type classes is not desirable.
Siddhartha Gadgil (Nov 25 2024 at 09:05):
One option I can think of is replacing the Expr
arguments in .failure
as well as run
with Option Expr
. We then have lower priority instances in the absence of ToExpr
and higher priority ones in the presence of these.
Siddhartha Gadgil (Nov 25 2024 at 09:05):
This would mean generation of proof expressions is best-effort.
Eric Wieser (Nov 25 2024 at 12:01):
Siddhartha Gadgil said:
One issue is that to construct proofs we need a
ToExpr
for the proxies of what is sampled. So, for example, theMyType
does not work as is.
One proposed refactor is that everything should be sampling from Expr
in the first place, rather than sampling from MyType
and converting to Expr
Siddhartha Gadgil (Nov 25 2024 at 12:10):
It seems reasonable to me. No need of a proxy and just generate expressions. One can try to have elaborators to construct new instances
Henrik Böving (Nov 25 2024 at 12:11):
Sure we can try that approach as well
Siddhartha Gadgil (Nov 26 2024 at 13:07):
I have redesigned the code so that SampleableExt
has a field proxyExpr?
to optionally map proxy types to expressions. With this I could get rid of most of the differences between MetaTestable
and Testable
, so that the former can hopefully replace the latter.
There is unfortunately one regression, which is the instance Pi.sampleableExt. The new field is adding universe constraints which Lean is not able to solve. I really do not understand these things well (never used ULift
), so help is greatly appreciated : @Henrik Böving @Eric Wieser
I did not go with the switching to expressions only as I find it much harder to get expressions correct without the typed code to guide me.
Besides the regression, does it look reasonable to try to switch Testable
to MetaTestable
? I can make replacements in the PR.
Eric Wieser (Nov 26 2024 at 13:45):
Regarding type-correct expressions, Qq is a good option
Eric Wieser (Nov 26 2024 at 13:45):
For switching to exprs generally, I suspect that we want to drop typeclasses entirely and switch to a more simproc/norm_num/positivity-style registration mechanism
Siddhartha Gadgil (Nov 29 2024 at 13:41):
The good news is that the regression is fixed. So carrying along expressions has parity with the original plausible code. Unfortunately it is a lot slower.
@Eric Wieser I should start using Qq - it looks really nice. The other approach is currently out of my experience/skill
@Henrik Böving What seems the best plan. I can make MetaTestable
to be the new Testable
with the new Sampleable
. This allows for disproofs and hence the new tactics. The only catch is the slow running, as can be seen at https://github.com/siddhartha-gadgil/plausible/blob/2217b44d355f131ff73875ddea577d440838f0a6/Test/Testable.lean#L30.
Henrik Böving (Nov 29 2024 at 15:39):
Have you collected any profile data on what it is doing that is slower?
Siddhartha Gadgil (Nov 29 2024 at 15:40):
Not yet
Siddhartha Gadgil (Nov 29 2024 at 15:41):
What is the best way to profile
Henrik Böving (Nov 29 2024 at 15:43):
Given that you are now doing meta programming I would suggest set_option trace.profiler true
as a first step.
Siddhartha Gadgil (Nov 29 2024 at 16:32):
Below is what I got. I don't understand it:
[Elab.command] [21.713837] #eval MetaTestable.check <| ∀ a b : MyType, a.y ≤ b.x → a.x ≤ b.y ▼
[Meta.synthInstance] [0.012368] ✅️ Plausible.MetaTestable
(Plausible.NamedBinder "a"
(∀ (a : MyType),
Plausible.NamedBinder "b"
(∀ (b : MyType), Plausible.NamedBinder "a._@.Test.Testable._hyg.275" (a.y ≤ b.x → a.x ≤ b.y))))
[def.processPreDef] [21.581367] process pre-definitions ▼
[compiler] [21.580126] compiling old: [_eval]
Seems like "process pre-definitions" are taking all the time.
Michael Stoll (Nov 29 2024 at 16:38):
Perhaps clicking on the ▼
will give you more details?
Henrik Böving (Nov 29 2024 at 16:38):
It seems more like compiling the code is taking the whole time here
Henrik Böving (Nov 29 2024 at 16:38):
The downward arrow is already clicked, it's the compiler node below, if its a right pointing arrow it's not yet clicked
Siddhartha Gadgil (Nov 29 2024 at 16:41):
It may not be a regression then. I was puzzled because if I do not pass a type expression as an argument, the MetaTestable.check
function should behave just like the Testable.check
function
Henrik Böving (Nov 29 2024 at 16:43):
The reason it is taking so long to compile is most likely that because there is a type class mechanism involved the entire hierarchy of functions that are involved are getting specialized and compiled on the fly here. This is also an issue that was already present in the regular Testable approach. However now that you extended the type class to contain expression generating code as well you need to compile even more code.
This issue would indeed be solved by:
Eric Wieser said:
For switching to exprs generally, I suspect that we want to drop typeclasses entirely and switch to a more simproc/norm_num/positivity-style registration mechanism
as there would be much less specialisation left to do
Siddhartha Gadgil (Nov 29 2024 at 16:45):
I see. Is there a way to make them compile in advance
Henrik Böving (Nov 29 2024 at 16:46):
No that's not how specialisation works
Henrik Böving (Nov 29 2024 at 16:52):
When Lean encounters a function that has a type class argument it explicitly substitutes in whatever TC synthesis figured out for that argument for compilation. So whenever you present the compiler with a new call to check with a new p
it is going to make a completely new set of compiled definitions for all of the things involved in executing the code of that [Testable p]
instance. As you cannot know a priori what p
are going to be there you cannot precompile.
If instead we used an attribute like @[plausible MyType]
on functions of type SomeMetaGenMonad Expr
and then call these functions when we need to generate a value of type MyType
that function does not have any TC argument so there is no need to be specialised -> we can use an already compiled version from the library itself.
Eric Wieser (Nov 29 2024 at 19:06):
While what Henrik says is true, it isn't in my mind the primary reason that typeclasses are the wrong tool here; using a registration mechanism instead lets us match on partial expressions, or lets us restrict to things like "is an expression in a ring" (by just failing the handler if this condition isn't met) which can't be captured by typeclass alone
Siddhartha Gadgil (Feb 10 2025 at 09:56):
@Eric Wieser @Henrik Böving I am thinking of giving this a shot. I am asking for suggestions because I am a bit confused about some things even if I am willing to give up some generality (which I am for a first pass).
- Have an attribute @[plausible MyType]
for functions that will be
SomeMetaGenMonad Expr`. - The attribute is registered in some environment extension similar to
symm
except byMyType
. - As a first pass, one can simply try to match
MyType
withisDefEq
as there are not that many generators. - However,
MyType
may really be a type family. So it seems one should instantiateMVar
's and fill in to get a type. ThenisDefEq
will solve instantiating some of them. - In case some of the metavariables are not assigned by
isDefEq
, one should assume these are meant to be generated. So we recurse here (with some bound on depth). Some may be assigned involving other metavariables adding a twist to the tale. - It seems that as generators will not be common, one can store them without indexing in an environment extension. Also, it is not obvious to me how to use discriminant trees here.
- One should have tactics
plausible using
to try to close using some tactic, instead of depending on decidable equality. - As I implemented earlier, the same mechanism can be used to prove existential results by a search.
Siddhartha Gadgil (Feb 10 2025 at 11:36):
The above will not work, depending on too much magic. It seems that, just as instances are often based on the presence of other typeclasses and use samples from them, the attribute should be in general more like @[plausible MyType from ...]
, and the generated expression depending on other expressions.
Eric Wieser (Feb 10 2025 at 11:38):
For simprocs and norm_num, this is handled by just allowing a handler to say "actually I can't help here after all"
Siddhartha Gadgil (Feb 10 2025 at 11:39):
I mean what is the way to annotate: can generate List A
if we have a way to generate A
Eric Wieser (Feb 10 2025 at 11:39):
Annotate with List _
Eric Wieser (Feb 10 2025 at 11:39):
Then in the handler, pattern match and recurse like we do in the norm_num handlers
Siddhartha Gadgil (Feb 10 2025 at 11:41):
I see. Thanks. I have not done pattern matching with underscores before, but will learn (starting with the basic cases where they are arguments).
But does the expression not depend on the value in the underscore?
Siddhartha Gadgil (Feb 10 2025 at 11:44):
I suppose the expressions can be lambdas with generated arguments filled in.
Siddhartha Gadgil (Feb 10 2025 at 11:48):
Am I correct that when elaborating underscores just become meta-variables?
If I have an expression like _ + _
, we should get meta-variables for both the underscores as well as all the HAdd
parameters. Are these distinguished by one of them being "synthetic"?
Eric Wieser (Feb 10 2025 at 12:10):
I don't remember exactly how the @[norm_num patt]
attribute works. I think the pattern elaboration is handled by the discrtree machinery. To actually extract the value of the _ for the case in question, I would usually recommend ~q(List $a)
matching
Eric Wieser (Feb 10 2025 at 12:15):
I think building a dumb set of machinery by cargo-culting positivity
would be reasonable straightforward
Eric Wieser (Feb 10 2025 at 12:16):
But if you want to do things like "sample lists of primes of length 4" efficiently, expressed via separate hypotheses, you might need a different design
Siddhartha Gadgil (Feb 10 2025 at 12:19):
I will start with trying to make something that works, though may not be optimal in brevity, syntax or coverage. If this works I can try to refine.
Eric Wieser (Feb 10 2025 at 12:51):
I guess for the above example, sampling from List.Vector Nat.Primes 4
would be enough for a first attempt
Last updated: May 02 2025 at 03:31 UTC