Zulip Chat Archive

Stream: Is there code for X?

Topic: Vacuous implication from plausible

Siddhartha Gadgil (Nov 13 2024 at 08:39):

If we can show (part of) the hypothesis cannot be satisfied using plausible, is there a tactic that proves the result by vacuous implication. What I am missing is extracting the proof of the contradiction from plausible.

Siddhartha Gadgil (Nov 13 2024 at 08:57):

@Henrik Böving Can I do this?

Siddhartha Gadgil (Nov 13 2024 at 09:03):

I see Testable.checkIO with which I should be able to work.

Henrik Böving (Nov 13 2024 at 09:16):

You can't use that to actually construct a proof from the meta level that you can put to close a goal. Instead what you should do is extract the counter example values from a TestResult and then using that write some meta code that does the proof that False is in the context. The counter example that is presented there is not really meant for...any but the infoviews consumption really, given that it is just a List String. So you would probably need to modify TestResult to contain a more expressive representation of counter examples

Siddhartha Gadgil (Nov 13 2024 at 09:32):

Henrik Böving said:

You can't use that to actually construct a proof from the meta level that you can put to close a goal. Instead what you should do is extract the counter example values from a TestResult and then using that write some meta code that does the proof that False is in the context. The counter example that is presented there is not really meant for...any but the infoviews consumption really, given that it is just a List String. So you would probably need to modify TestResult to contain a more expressive representation of counter examples

Thanks. Yes, I was thinking in terms of extracting from .faliure pf ...

It seems it would be convenient to have a vacuous tactic proving results by using plausible to contradict hypothesis. Will experiment.

Siddhartha Gadgil (Nov 18 2024 at 08:38):

@Henrik Böving I wrote some code that does work in principle, but (as it involves using reduce on a large scale`) causes crashes in larger cases. I am copying it below.

I could not figure out a better way to lift the term proof in TestResult to an expression proof. I see that you mentioned modifying TestResult to have more. Is that what you meant? Is there an option short of lifting everything to Meta level, requiring ToExpr and constructing proof expressions at each stage?

Here is my code in case anyone can see improvements.

import Lean
import Plausible

open Lean Meta Elab Tactic Plausible

def Plausible.Testable.checkDirect (p : Prop) [Testable p] (cfg : Configuration := {})(defaultSeed : Nat) : TestResult p :=
  let seed := cfg.randomSeed.getD defaultSeed
  let suite : RandT Id (TestResult p) := Testable.runSuite p cfg
  let res := runRandWith seed suite
  res.run


def getProof? (p e: Expr) : MetaM <| Option Expr := do
  let n ← mkFreshExprMVar <| mkConst ``Nat
  let s ← mkFreshExprMVar <| ← mkAppM ``List #[mkConst ``String]
  let np ← mkAppM ``Not #[p]
  let contra ← mkFreshExprMVar np
  let fExpr ← mkAppOptM ``TestResult.failure #[p, contra, s, n]
  if ← isDefEq e fExpr then
    return some contra
  else
    return none

def findDisproof? (p: Expr) : MetaM <| Option Expr := do
  try
    unless ← isProp p do
      return none
    let p' ← Decorations.addDecorations p
    let inst ← synthInstance (← mkAppM ``Testable #[p'])
    let cfg : Configuration := {}
    let cfg : Configuration := {cfg with
    traceDiscarded := cfg.traceDiscarded || (← isTracingEnabledFor `plausible.discarded),
    traceSuccesses := cfg.traceSuccesses || (← isTracingEnabledFor `plausible.success),
    traceShrink := cfg.traceShrink || (← isTracingEnabledFor `plausible.shrink.steps),
    traceShrinkCandidates := cfg.traceShrinkCandidates
      || (← isTracingEnabledFor `plausible.shrink.candidates) }
    let defaultSeed ← IO.rand 0 1000000
    let testResult ← mkAppOptM ``Testable.checkDirect #[p, inst, toExpr cfg, toExpr defaultSeed]
    let testResult ← reduce testResult -- times out
    getProof? p testResult
  catch e =>
    logWarning m!"Failed to find disproof: {e.toMessageData}"
    return none

def proveVacuous? (p: Expr) : MetaM <| Option Expr := do
  match p with
  | .forallE n d b bi =>
    withLocalDecl n bi d fun x => do
      let b := b.instantiate1 x
      let contra? ← findDisproof? d -- proof of ¬d
      match contra? with
      | some contra =>
        let pfFalse ← mkAppM' contra #[x]
        let pfBody ← mkAppOptM ``False.elim #[b, pfFalse]
        mkLambdaFVars #[x] pfBody
      | none =>
        return none
  | _ =>
    return none

elab "find_disproof" type:term : term => do
  let p ← Term.elabType type
  -- logInfo m!"Finding disproof of {p}"
  let disproof ← findDisproof? p
  match disproof with
  | some contra =>
    -- logInfo m!"Found disproof: {contra}"
    return contra
  | none =>
    logWarning m!"No disproof found"
    return mkConst ``False

#check find_disproof ((2 : Nat) < 1)

elab "prove_vacuous" type:term : term => do
  let p ← Term.elabType type
  let vacuous ← proveVacuous? p
  match vacuous with
  | some pf =>
    return pf
  | none =>
    logWarning m!"No vacuous proof found"
    return mkConst ``False

/-
fun a => False.elim (Nat.not_le_of_not_ble_eq_true (fun h => Bool.noConfusion h) a) : 2 < 1 → 1 ≤ 3
-/
#check prove_vacuous ((2 : Nat) < 1) → 1 ≤ 3

#check find_disproof (∀ n: Nat, n < (4: Nat)) -- times out

Henrik Böving (Nov 18 2024 at 08:43):

Running plausible within reduction is not a good idea. You should instead modify TestResult to actually return you the values of the counter example, so something along the lines of HashMap FVarId Expr most likely. Then use that counter example to build a proof term at the meta level and submit that proof term to the kernel.

Siddhartha Gadgil (Nov 18 2024 at 08:47):

Henrik Böving said:

Running plausible within reduction is not a good idea. You should instead modify TestResult to actually return you the values of the counter example, so something along the lines of HashMap FVarId Expr most likely. Then use that counter example to build a proof term at the meta level and submit that proof term to the kernel.

Thanks. I take it that means copying and modifying most of the file Testable.lean. Is that correct?

Siddhartha Gadgil (Nov 18 2024 at 08:47):

I suppose I can avoid stuff not needed for counterexamples.

Henrik Böving (Nov 18 2024 at 08:48):

I think it mostly comes down to modifying the addVarInfo function there, you probably want to do this on a fork of plausible

Siddhartha Gadgil (Nov 18 2024 at 08:51):

Do you mean that the failure case should have additional info besides the string, and this is added in addVarInfo?

Henrik Böving (Nov 18 2024 at 08:53):

It should record the Expr that is being assigned to the variable at the very least.

Siddhartha Gadgil (Nov 18 2024 at 08:54):

It seems to me that just recording the proof expression may be easiest.

Siddhartha Gadgil (Nov 18 2024 at 08:55):

There may even be no variables, for instance when proved using decide.

Henrik Böving (Nov 18 2024 at 08:56):

Also an option yes

Siddhartha Gadgil (Nov 18 2024 at 08:57):

Let me work on this then.

Siddhartha Gadgil (Nov 25 2024 at 03:07):

@Henrik Böving I have opened a PR at https://github.com/leanprover-community/plausible/pull/7. This lifts (some of) the code to expression level and uses this to introduce two new tactics:

The vacuous tactic tries to negate (a part of the) hypothesis to prove by vacuous implication.
The random_search tactic negates the goal and tries to disprove it.

These will probably rarely come up in human proofs but I feel will be very much needed for AI generated code.

The design is a little complex but is the best I could come up with. Suggestions on improving the code are very welcome (@Kim Morrison and @Eric Wieser - please let me know if you have any design suggestions).

Henrik Böving (Nov 25 2024 at 08:47):

I don't think that duplicating what basically amounts to the entire code base is a reasonable thing to do as a design. We should instead work to integrate this with the already existing code base through a refactoring. Currently there aren't that many implementors of typeclasses in plausible so we can still change the interface .

Siddhartha Gadgil (Nov 25 2024 at 08:49):

Henrik Böving said:

I don't think that duplicating what basically amounts to the entire code base is a reasonable thing to do as a design. We should instead work to integrate this with the already existing code base through a refactoring. Currently there aren't that many implementors of typeclasses in plausible so we can still change the interface .

One issue is that to construct proofs we need a ToExpr for the proxies of what is sampled. So, for example, the MyType does not work as is.

Siddhartha Gadgil (Nov 25 2024 at 08:49):

So there will be fewer instances of MetaTestable than of Testable.

Siddhartha Gadgil (Nov 25 2024 at 08:52):

Other than that, I agree that duplication is not good. Part of the reason for duplication was just not to break things.

Henrik Böving (Nov 25 2024 at 08:53):

The specific MyType example can be made to work with a decide as all values are going to be concrete but the point is correct yes. Still I don't think that splitting up the plausible eco system into two type classes is something that can be sustainable in the long term, surely there must be some good design alternative here somewhere.

Siddhartha Gadgil (Nov 25 2024 at 08:57):

I agree that splitting into two type classes is not desirable.

Siddhartha Gadgil (Nov 25 2024 at 09:05):

One option I can think of is replacing the Expr arguments in .failure as well as run with Option Expr. We then have lower priority instances in the absence of ToExpr and higher priority ones in the presence of these.

Siddhartha Gadgil (Nov 25 2024 at 09:05):

This would mean generation of proof expressions is best-effort.

Eric Wieser (Nov 25 2024 at 12:01):

Siddhartha Gadgil said:

One issue is that to construct proofs we need a ToExpr for the proxies of what is sampled. So, for example, the MyType does not work as is.

One proposed refactor is that everything should be sampling from Expr in the first place, rather than sampling from MyType and converting to Expr

Siddhartha Gadgil (Nov 25 2024 at 12:10):

It seems reasonable to me. No need of a proxy and just generate expressions. One can try to have elaborators to construct new instances

Henrik Böving (Nov 25 2024 at 12:11):

Sure we can try that approach as well

Siddhartha Gadgil (Nov 26 2024 at 13:07):

I have redesigned the code so that SampleableExt has a field proxyExpr?to optionally map proxy types to expressions. With this I could get rid of most of the differences between MetaTestable and Testable, so that the former can hopefully replace the latter.

There is unfortunately one regression, which is the instance Pi.sampleableExt. The new field is adding universe constraints which Lean is not able to solve. I really do not understand these things well (never used ULift), so help is greatly appreciated : @Henrik Böving @Eric Wieser

I did not go with the switching to expressions only as I find it much harder to get expressions correct without the typed code to guide me.

Besides the regression, does it look reasonable to try to switch Testable to MetaTestable? I can make replacements in the PR.

Eric Wieser (Nov 26 2024 at 13:45):

Regarding type-correct expressions, Qq is a good option

Eric Wieser (Nov 26 2024 at 13:45):

For switching to exprs generally, I suspect that we want to drop typeclasses entirely and switch to a more simproc/norm_num/positivity-style registration mechanism

Siddhartha Gadgil (Nov 29 2024 at 13:41):

The good news is that the regression is fixed. So carrying along expressions has parity with the original plausible code. Unfortunately it is a lot slower.

@Eric Wieser I should start using Qq - it looks really nice. The other approach is currently out of my experience/skill

@Henrik Böving What seems the best plan. I can make MetaTestable to be the new Testable with the new Sampleable. This allows for disproofs and hence the new tactics. The only catch is the slow running, as can be seen at https://github.com/siddhartha-gadgil/plausible/blob/2217b44d355f131ff73875ddea577d440838f0a6/Test/Testable.lean#L30.

Henrik Böving (Nov 29 2024 at 15:39):

Have you collected any profile data on what it is doing that is slower?

Siddhartha Gadgil (Nov 29 2024 at 15:40):

Not yet

Siddhartha Gadgil (Nov 29 2024 at 15:41):

What is the best way to profile

Henrik Böving (Nov 29 2024 at 15:43):

Given that you are now doing meta programming I would suggest set_option trace.profiler true as a first step.

Siddhartha Gadgil (Nov 29 2024 at 16:32):

Below is what I got. I don't understand it:

[Elab.command] [21.713837] #eval MetaTestable.check <| ∀ a b : MyType, a.y ≤ b.x → a.x ≤ b.y ▼
  [Meta.synthInstance] [0.012368] ✅️ Plausible.MetaTestable
      (Plausible.NamedBinder "a"
        (∀ (a : MyType),
          Plausible.NamedBinder "b"
            (∀ (b : MyType), Plausible.NamedBinder "a._@.Test.Testable._hyg.275" (a.y ≤ b.x → a.x ≤ b.y))))
  [def.processPreDef] [21.581367] process pre-definitions ▼
  [compiler] [21.580126] compiling old: [_eval]

Seems like "process pre-definitions" are taking all the time.

Michael Stoll (Nov 29 2024 at 16:38):

Perhaps clicking on the ▼ will give you more details?

Henrik Böving (Nov 29 2024 at 16:38):

It seems more like compiling the code is taking the whole time here

Henrik Böving (Nov 29 2024 at 16:38):

The downward arrow is already clicked, it's the compiler node below, if its a right pointing arrow it's not yet clicked

Siddhartha Gadgil (Nov 29 2024 at 16:41):

It may not be a regression then. I was puzzled because if I do not pass a type expression as an argument, the MetaTestable.check function should behave just like the Testable.check function

Henrik Böving (Nov 29 2024 at 16:43):

The reason it is taking so long to compile is most likely that because there is a type class mechanism involved the entire hierarchy of functions that are involved are getting specialized and compiled on the fly here. This is also an issue that was already present in the regular Testable approach. However now that you extended the type class to contain expression generating code as well you need to compile even more code.

This issue would indeed be solved by:

Eric Wieser said:

For switching to exprs generally, I suspect that we want to drop typeclasses entirely and switch to a more simproc/norm_num/positivity-style registration mechanism

as there would be much less specialisation left to do

Siddhartha Gadgil (Nov 29 2024 at 16:45):

I see. Is there a way to make them compile in advance

Henrik Böving (Nov 29 2024 at 16:46):

No that's not how specialisation works

Henrik Böving (Nov 29 2024 at 16:52):

When Lean encounters a function that has a type class argument it explicitly substitutes in whatever TC synthesis figured out for that argument for compilation. So whenever you present the compiler with a new call to check with a new p it is going to make a completely new set of compiled definitions for all of the things involved in executing the code of that [Testable p] instance. As you cannot know a priori what p are going to be there you cannot precompile.

If instead we used an attribute like @[plausible MyType] on functions of type SomeMetaGenMonad Expr and then call these functions when we need to generate a value of type MyType that function does not have any TC argument so there is no need to be specialised -> we can use an already compiled version from the library itself.

Eric Wieser (Nov 29 2024 at 19:06):

While what Henrik says is true, it isn't in my mind the primary reason that typeclasses are the wrong tool here; using a registration mechanism instead lets us match on partial expressions, or lets us restrict to things like "is an expression in a ring" (by just failing the handler if this condition isn't met) which can't be captured by typeclass alone

Siddhartha Gadgil (Feb 10 2025 at 09:56):

@Eric Wieser @Henrik Böving I am thinking of giving this a shot. I am asking for suggestions because I am a bit confused about some things even if I am willing to give up some generality (which I am for a first pass).

Have an attribute @[plausible MyType] for functions that will be SomeMetaGenMonad Expr`.
The attribute is registered in some environment extension similar to symm except by MyType.
As a first pass, one can simply try to match MyType with isDefEq as there are not that many generators.
However, MyType may really be a type family. So it seems one should instantiate MVar's and fill in to get a type. Then isDefEq will solve instantiating some of them.
In case some of the metavariables are not assigned by isDefEq, one should assume these are meant to be generated. So we recurse here (with some bound on depth). Some may be assigned involving other metavariables adding a twist to the tale.
It seems that as generators will not be common, one can store them without indexing in an environment extension. Also, it is not obvious to me how to use discriminant trees here.
One should have tactics plausible using to try to close using some tactic, instead of depending on decidable equality.
As I implemented earlier, the same mechanism can be used to prove existential results by a search.

Siddhartha Gadgil (Feb 10 2025 at 11:36):

The above will not work, depending on too much magic. It seems that, just as instances are often based on the presence of other typeclasses and use samples from them, the attribute should be in general more like @[plausible MyType from ...], and the generated expression depending on other expressions.

Eric Wieser (Feb 10 2025 at 11:38):

For simprocs and norm_num, this is handled by just allowing a handler to say "actually I can't help here after all"

Siddhartha Gadgil (Feb 10 2025 at 11:39):

I mean what is the way to annotate: can generate List A if we have a way to generate A

Eric Wieser (Feb 10 2025 at 11:39):

Annotate with List _

Eric Wieser (Feb 10 2025 at 11:39):

Then in the handler, pattern match and recurse like we do in the norm_num handlers

Siddhartha Gadgil (Feb 10 2025 at 11:41):

I see. Thanks. I have not done pattern matching with underscores before, but will learn (starting with the basic cases where they are arguments).

But does the expression not depend on the value in the underscore?

Siddhartha Gadgil (Feb 10 2025 at 11:44):

I suppose the expressions can be lambdas with generated arguments filled in.

Siddhartha Gadgil (Feb 10 2025 at 11:48):

Am I correct that when elaborating underscores just become meta-variables?
If I have an expression like _ + _, we should get meta-variables for both the underscores as well as all the HAdd parameters. Are these distinguished by one of them being "synthetic"?

Eric Wieser (Feb 10 2025 at 12:10):

I don't remember exactly how the @[norm_num patt] attribute works. I think the pattern elaboration is handled by the discrtree machinery. To actually extract the value of the _ for the case in question, I would usually recommend ~q(List $a) matching

Eric Wieser (Feb 10 2025 at 12:15):

I think building a dumb set of machinery by cargo-culting positivity would be reasonable straightforward

Eric Wieser (Feb 10 2025 at 12:16):

But if you want to do things like "sample lists of primes of length 4" efficiently, expressed via separate hypotheses, you might need a different design

Siddhartha Gadgil (Feb 10 2025 at 12:19):

I will start with trying to make something that works, though may not be optimal in brevity, syntax or coverage. If this works I can try to refine.

Eric Wieser (Feb 10 2025 at 12:51):

I guess for the above example, sampling from List.Vector Nat.Primes 4 would be enough for a first attempt

Last updated: May 02 2025 at 03:31 UTC