Zulip Chat Archive

Stream: mathlib4

Topic: Model Theory: BoundedFormula, Formula and Sentence

Chris Hughes (Feb 16 2025 at 11:54):

There's been some discussion of this on github, and I thought I'd summarise the key points of this discussion in one place.

Currently we have three type of first-order formula in the model theory library. We have BoundedFormula, Formula, and Sentence. Formula and Sentence are both defined as abbreviations in terms of BoundedFormula.

BoundedFormula L α n is the type of formulas with free variables indexed by α and n additional free variables. The n additional free variables are the variables that may later be bounded by applying an all constructor. This implementation is forced because of universe issues, you could try making a constructor all : Formula (Option α) -> Formula α, but then this forces Formula to go up a universe as one of the constructors takes a type as an argument.
Formula L α is the type of formulas with free variables indexed by α, defined to be BoundedFormula L α 0
Sentence L is the type of formulas with no free variables, defined to be Formula L Empty.

There are different functions for talking about the semantics of each of these formulas. BoundedFormula.Realize, Formula.Realize and Sentence.Realize. My understanding is that the main motivation for having these three different functions for talking about the semantics of formulas is that in the special case that n = 0, BoundedFormula.Realize takes an argument Fin 0 -> M, and these arguments are always equal of course, but may not be syntactically or definitionally equal.

This duplication leads to a lot of duplication in the model theory library. For example we have bothTerm.equal which is a Formula and Term.bdEqual which is a BoundedFormula. These both need simp lemmas about their semantics, and additionally, because Formula is reducible, I could apply BoundedFormula.Realizeto Term.equal and then maybe we need a third simp lemma for that.

I don't have a good answer for what the solution is here. I suspect that making BoundedFormula a implementation detail that should often be mentioned by users and adding a definition Formula.all : Formula L (Option α) -> Formula L α or similar might be a good idea. I think better support for equality of terms Fin 0 -> _ or Empty -> _ will also help here.

Chris Hughes (Feb 16 2025 at 12:03):

@Yaël Dillies @Aaron Anderson

Chris Hughes (Feb 16 2025 at 12:17):

Maybe @Jesse Michael Han and @Floris van Doorn have something to say here

Bhavik Mehta (Feb 16 2025 at 20:20):

Some functions notably missing from this setup are substitution of terms (which may have free variables) into other terms, and into formulae: my impression, last I looked, is that this is all much easier to work with if you use _purely_ BoundedFormula.

Johan Commelin (Feb 17 2025 at 13:21):

This is maybe a bit ugly, but would it help to completely eliminate $\alpha$ from the story, and just work with a hard-coded Var := Nat or something like that?

Johan Commelin (Feb 17 2025 at 13:21):

That would eliminate the universe bump in all, right?

Chris Hughes (Feb 17 2025 at 16:05):

Bhavik Mehta said:

Some functions notably missing from this setup are substitution of terms (which may have free variables) into other terms, and into formulae: my impression, last I looked, is that this is all much easier to work with if you use _purely_ BoundedFormula.

This doesn't look particularly hard to me. The interface is not there, but it's not hard to define.

import Mathlib.ModelTheory.Satisfiability
set_option linter.style.header false
namespace FirstOrder
namespace Language

variable {α β : Type*} {L : Language}

def Formula.subst (φ : L.Formula α) (f : α → L.Term β) : L.Formula β :=
  Language.BoundedFormula.subst φ f

@[simp]
theorem realize_subst {M : Type} [L.Structure M] (f : α → L.Term β) (φ : L.Formula α)
    (x : β → M) : Formula.Realize (φ.subst f) x = φ.Realize (fun a => (f a).realize x) := by
  simp [Formula.subst, Formula.Realize]

end Language
end FirstOrder

Chris Hughes (Feb 17 2025 at 16:15):

Johan Commelin said:

This is maybe a bit ugly, but would it help to completely eliminate $\alpha$ from the story, and just work with a hard-coded Var := Nat or something like that?

I think this would make life very difficult, the type α is usually a sum or sigma type, or combination thereof, and working with this is a lot easier than whatever embedding into nat you chose. You would also have the problem that the semantics would then depend on the semantics of variables not in the image of this map (which wouldn't appear in the formula, but you have to deal with reasoning about that). You also have the problem of variables being bound in one position in the formula, but free in another, which seems like it would make things like relabelling quite complicated.

Bhavik Mehta (Feb 17 2025 at 16:50):

Chris Hughes said:

Bhavik Mehta said:

Some functions notably missing from this setup are substitution of terms (which may have free variables) into other terms, and into formulae: my impression, last I looked, is that this is all much easier to work with if you use _purely_ BoundedFormula.

This doesn't look particularly hard to me. The interface is not there, but it's not hard to define.
import Mathlib.ModelTheory.Satisfiability
set_option linter.style.header false
namespace FirstOrder
namespace Language

variable {α β : Type*} {L : Language}

def Formula.subst (φ : L.Formula α) (f : α → L.Term β) : L.Formula β :=
  Language.BoundedFormula.subst φ f

@[simp]
theorem realize_subst {M : Type} [L.Structure M] (f : α → L.Term β) (φ : L.Formula α)
    (x : β → M) : Formula.Realize (φ.subst f) x = φ.Realize (fun a => (f a).realize x) := by
  simp [Formula.subst, Formula.Realize]

end Language
end FirstOrder

Ah! For some reason I thought BoundedFormula.subst didn't exist; you're completely right and this is easy.

Aaron Anderson (Feb 17 2025 at 17:15):

I think for a lot of applications, we want some form of the ability to just use an arbitrary type as variables. However, there are potentially ways to have the basic definition of a formula have just Fin n as its variable type.
At a most basic level, the information of a Formula α can always be captured with a BoundedFormula Empty n and a function f : Fin n → α.

Aaron Anderson (Feb 17 2025 at 17:22):

One idea I've had has to do with the fact that there is yet another layer of definition that is relevant - formulas modulo some theory. These are closer to what is actually talked about in model theory (nobody really cares about the difference between ∀ ¬ φ(x) and ¬ ∃ φ(x)). I tried to implement these in #16801, but in addition to me getting caught up with teaching, one of the big issues was that this doubles the number of definitions, when there are already too many!

Aaron Anderson (Feb 17 2025 at 17:23):

These would be really nice to have though, because then, for instance, formulas actually form a meaningful boolean algebra.

Aaron Anderson (Feb 17 2025 at 17:29):

As an experiment, I think it would be interesting to create a branch where there are just two definitions of a formula, a low-level one and a high-level one:

Formula n is basically the current BoundedFormula Empty n, with variables indexed by Fin n, it plays well with quantifiers, and is good for setting up the basic recursive definitions.
For a theory T, T.Formula α is what you would want to work with for applications. This could be implemented as an equivalence class of pairs ⟨n, φ, f⟩, where φ : Formula n and
f : Fin n → α.

Aaron Anderson (Mar 10 2025 at 14:03):

Chris Hughes said:

Johan Commelin said:

This is maybe a bit ugly, but would it help to completely eliminate $\alpha$ from the story, and just work with a hard-coded Var := Nat or something like that?

I think this would make life very difficult, the type α is usually a sum or sigma type, or combination thereof, and working with this is a lot easier than whatever embedding into nat you chose. You would also have the problem that the semantics would then depend on the semantics of variables not in the image of this map (which wouldn't appear in the formula, but you have to deal with reasoning about that). You also have the problem of variables being bound in one position in the formula, but free in another, which seems like it would make things like relabelling quite complicated.

What applications do we actually have using variables indexed by α?
I think @Chris Hughes has added all those applications as part of Ax-Grothendieck.

I ask because I'd like to see how much (outside of the basic model theory library itself) would be changed by altering the way those variables are introduced.

Aaron Anderson (Mar 10 2025 at 14:04):

Aaron Anderson said:

As an experiment, I think it would be interesting to create a branch where there are just two definitions of a formula, a low-level one and a high-level one:

Formula n is basically the current BoundedFormula Empty n, with variables indexed by Fin n, it plays well with quantifiers, and is good for setting up the basic recursive definitions.

For a theory T, T.Formula α is what you would want to work with for applications. This could be implemented as an equivalence class of pairs ⟨n, φ, f⟩, where φ : Formula n and
f : Fin n → α.

I can think of another way to introduce variables indexed by α - technically don't, and instead define Formula αas a sentence in a language with extra constants indexed by α.

Aaron Anderson (Mar 10 2025 at 14:05):

As far as I know, this could be difficult to deal with, but it has some advantages, such as easier inductive definitions and the fact that then a type is technically just a theory, so we can use the same API for both.

Chris Hughes (Mar 10 2025 at 14:11):

For ax Grothendieck you have to quantify over the coefficients of the MV polynomials in question, so that's one variable for each monomial. And then for injectivity and surjectivity you need two elements of C^n as well, so overall alpha is the sum of all that.

Aaron Anderson (Mar 10 2025 at 15:13):

Chris Hughes said:

For ax Grothendieck you have to quantify over the coefficients of the MV polynomials in question, so that's one variable for each monomial. And then for injectivity and surjectivity you need two elements of C^n as well, so overall alpha is the sum of all that.

Yes, I think the Ax-Grothendieck application is proof enough that we want some kind of Formula α. I just think at the moment this is the only use case where Formula α obviously wins over BoundedFormula Empty n.

Aaron Anderson (Mar 10 2025 at 15:18):

Furthermore, I think that anything to do with rings would be easier to work with in the context of formulas modulo a theory. If we defined T.Term α to be the type of all terms mod equivalence under the theory T, and similarly T.Formula α to be the type of all formulas mod T-equivalence, then T.Term α would be in natural bijection with MVPolynomial α.

Chris Hughes (Mar 10 2025 at 16:32):

I still don't see why we need bounded formula. Doesn't Formula (Fin n) work just fine?

For terms modulo a theory I have a PR open that defines FunctionalFormula which would be a generalization of this that also included all definable functions, so for example you could have a sqrt function in real closed fields, or a division function in fields without having to make this part of the theory.

I agree that formulas modulo equivalence is a good idea. My only concern is about properties that aren't preserved under equivalence, like being quantifier free, but I don't think this is much of an issue as it's rare to need to distinguish between quantifier free and equivalent to a quantifier free formula.

The last thing I would change is that I think Theories should be sets of formulas closed under deduction. This is how I was taught it, and you get a sensible partial order with the right notion of equality. You can of course then talk about the theory generated by a set.

Aaron Anderson (Mar 10 2025 at 17:23):

Chris Hughes said:

I still don't see why we need bounded formula. Doesn't Formula (Fin n) work just fine?

Personally, I think that enough of the basic definitions involve recursion on formulas that BoundedFormulas are probably essential.

Chris Hughes (Mar 10 2025 at 22:14):

You can define custom induction principles for this. I think this might be necessary anyway as I think different theorems might require different induction principles anyway.

Aaron Anderson (Mar 11 2025 at 20:48):

Chris Hughes said:

The last thing I would change is that I think Theories should be sets of formulas closed under deduction. This is how I was taught it, and you get a sensible partial order with the right notion of equality. You can of course then talk about the theory generated by a set.

Some version of this should happen. I worry firstly that defining theories could be mildly inconvenient. The perhaps bigger issue is the order.
If we provide a SetLike instance, then T ≤ T' means that T ⊆ T', or that T' implies T. This seems reasonable on the face of it, but some things feel a bit backward to me:

⊤ is the inconsistent theory.
There is a natural boolean algebra structure (which I desperately want to implement) on formulas, where ⊤ is true and ⊥ is false. The map from φ to the theory generated by {φ} then becomes antimonotone.

Aaron Anderson (Mar 11 2025 at 21:01):

Chris Hughes said:

I agree that formulas modulo equivalence is a good idea. My only concern is about properties that aren't preserved under equivalence, like being quantifier free, but I don't think this is much of an issue as it's rare to need to distinguish between quantifier free and equivalent to a quantifier free formula.

I think that equivalent to a quantifier-free formula is really what we want. For this, I think we define quantifier complexity first on BoundedFormulas, as is currently done, in the way that is easiest to compute with. Then we define sublattices of the boolean algebra of formulas mod T for different complexity levels - equivalent to quantifier-free, equivalent to universal, equivalent to existential, etc.

Aaron Anderson (Mar 11 2025 at 21:02):

Then φ is in the lattice, say, of quantifier-free formulas mod T if and only if embeddings of models of T preserve φ in both directions.

Chris Hughes (Mar 11 2025 at 21:03):

Aaron Anderson said:

Chris Hughes said:

The last thing I would change is that I think Theories should be sets of formulas closed under deduction. This is how I was taught it, and you get a sensible partial order with the right notion of equality. You can of course then talk about the theory generated by a set.

Some version of this should happen. I worry firstly that defining theories could be mildly inconvenient. The perhaps bigger issue is the order.
If we provide a SetLike instance, then T ≤ T' means that T ⊆ T', or that T' implies T. This seems reasonable on the face of it, but some things feel a bit backward to me:

⊤ is the inconsistent theory.

There is a natural boolean algebra structure (which I desperately want to implement) on formulas, where ⊤ is true and ⊥ is false. The map from φ to the theory generated by {φ} then becomes antimonotone.

I agree about the order. A Theory is a bit like a filter, and we have the backwards order on filters, so that could work on theories as well

Aaron Anderson (Mar 11 2025 at 21:07):

You're totally right - the order on Theories should absolutely match Filters, and should match that API.

Kevin Buzzard (Mar 11 2025 at 21:19):

the order is also "backwards" on topologies if you believe that a topology is a collection of open sets (as opposed to this being an implementation detail), and T1 <= T2 means that the identity function from (X,T1) to (X,T2) is continuous.

Kevin Buzzard (Mar 11 2025 at 21:19):

If you imagined a topology to be "equal" to the collection of sets which aren't open then the order wouldn't be "backwards".

Kevin Buzzard (Mar 11 2025 at 21:23):

If you imagine a filter on a type just to be some abstract category-theoretic completion of the category of subsets of the type (there is such a way of describing it but I forgot the details, it is something like the pro-category?) then again you don't come away with the impression that "the order is backwards" because with this implementation you think of a filter as being a generalized subset.

Chris Hughes (Mar 11 2025 at 21:29):

With filters, you also have this complement isomorphism with sets closed under subset, finite union and empty set. And then the order would be correct.

Chris Hughes (Mar 11 2025 at 21:29):

So you can think about the Finite Cofilter instead of the Cofinite Filter

Aaron Liu (Mar 11 2025 at 21:31):

Chris Hughes said:

With filters, you also have this complement isomorphism with sets closed under subset, finite union and empty set. And then the order would be correct.

So... order ideals?

Chris Hughes (Mar 11 2025 at 21:33):

Yes, after looking up the definition of order ideal, it seems so

Bhavik Mehta (Mar 11 2025 at 21:36):

Chris Hughes said:

Aaron Anderson said:

Chris Hughes said:

The last thing I would change is that I think Theories should be sets of formulas closed under deduction. This is how I was taught it, and you get a sensible partial order with the right notion of equality. You can of course then talk about the theory generated by a set.

Some version of this should happen. I worry firstly that defining theories could be mildly inconvenient. The perhaps bigger issue is the order.
If we provide a SetLike instance, then T ≤ T' means that T ⊆ T', or that T' implies T. This seems reasonable on the face of it, but some things feel a bit backward to me:

⊤ is the inconsistent theory.

There is a natural boolean algebra structure (which I desperately want to implement) on formulas, where ⊤ is true and ⊥ is false. The map from φ to the theory generated by {φ} then becomes antimonotone.

I agree about the order. A Theory is a bit like a filter, and we have the backwards order on filters, so that could work on theories as well

I think it would be best to pick the order which makes the syntax-semantics adjunction for first-order logic a galois connection, without inserting OrderDual a bunch of times

Kevin Buzzard (Mar 11 2025 at 21:40):

If one argues like that then you might want to deduce "I think it would be best to pick the order which makes Galois theory a Galois connection, given that this is what it's named after, without inserting OrderDual a bunch of times", and then you would necessarily have to choose a very odd-looking order on either subgroups of a group or subfields of a field :-(

Kevin Buzzard (Mar 11 2025 at 21:41):

Chris Hughes said:

With filters, you also have this complement isomorphism with sets closed under subset, finite union and empty set. And then the order would be correct.

Rather annoyingly, I don't think this is correct. You have to take "sets not in the filter" rather than "sets whose complement is in the filter". That's why I said "sets which are not open" rather than "closed sets" in the topology example. If you have loads of open sets, then you have hardly any non-open sets, but you also have loads of closed sets.

Aaron Liu (Mar 11 2025 at 21:42):

If you take subsets not in the filter then its not closed under finite union :(

Chris Hughes (Mar 11 2025 at 21:42):

Yeah, there is that. I think that `T₁ ≤ T₂ → T₁ ⊨ φ → T₂ ⊨ φ might be the thing you end up needing the most.

Chris Hughes (Mar 11 2025 at 21:43):

Kevin Buzzard said:

Chris Hughes said:

With filters, you also have this complement isomorphism with sets closed under subset, finite union and empty set. And then the order would be correct.

Rather annoyingly, I don't think this is correct. You have to take "sets not in the filter" rather than "sets whose complement is in the filter". That's why I said "sets which are not open" rather than "closed sets" in the topology example. If you have loads of open sets, then you have hardly any non-open sets, but you also have loads of closed sets.

I should have said image under complement I think

Kevin Buzzard (Mar 11 2025 at 21:43):

Aaron Liu said:

If you take subsets not in the filter then its not closed under finite union

Yes, "subsets not in the filter" stinks as a mathematical property, it's just an artificial way of giving a definition which is mathematically equivalent to a filter and for which the ordering we have in mathlib is not "backwards", so it's really just an indication that this whole concept of whether an ordering is "backwards" is not really well-defined.

Chris Hughes (Mar 11 2025 at 21:44):

It doesn't stink that hard if you say subsets in the order ideal instead. I don't know that it's that artificial

Kevin Buzzard (Mar 11 2025 at 21:45):

Yes, but the order on ideals is still backwards

Kevin Buzzard (Mar 11 2025 at 21:47):

I'm arguing that this concept of "backwards" only comes from having a map from Filter X or Theory X to some other type which you are convinced has a natural ordering (e.g. Set (Set X)) in the filter case), and this map might be "the standard implementation of the concept" but what I'm arguing is that there may be other implementations.

Chris Hughes (Mar 11 2025 at 21:48):

I was thinking about this recently. There's this fact about polynomials which says that they "go to +-infinity"at +-infinity, which in terms of filters and Tendsto would be Filter.Tendsto f.eval Filter.Cobounded Filter.Cobounded. But on this particular filter it feels more natural to say the preimage of any bounded set is bounded.

Kevin Buzzard (Mar 11 2025 at 21:49):

Kevin Buzzard said:

Yes, but the order on ideals is still backwards

If you implement topologies as closed sets rather than open sets, then it's still true that T1 <= T2 iff T1 has more closed sets than T2, even though closed sets are the opposite of open sets. The more open sets you have, the more closed sets you have. Similarly if you implement a filter as the sets whose complement is in the filter (i.e. the ideal) then you still have F1 <= F2 iff F1 has more sets in it than F2.

Chris Hughes (Mar 11 2025 at 21:50):

Oh, yes, you're right

Aaron Liu (Mar 11 2025 at 21:52):

If you implement the order on ideals sanely (such that s ⊆ t ↔ principal s ≤ principal t) then we have principal s ≤ i ↔ s ∈ i, so I think the order is very much not backwards.

Aaron Anderson (Mar 11 2025 at 21:53):

Kevin Buzzard said:

I'm arguing that this concept of "backwards" only comes from having a map from Filter X or Theory X to some other type which you are convinced has a natural ordering (e.g. Set (Set X)) in the filter case), and this map might be "the standard implementation of the concept" but what I'm arguing is that there may be other implementations.

You're right that neither order is really "forwards" or "backwards" - one could order theories based on having more or fewer axioms, or one could order theories based on having more or fewer models. The question is just which of these dual orders is more mathlib-appropriate, and I'm convinced now it should be the latter. (One could also view closed-under-deduction theories as the closed sets in a type space, in which case they also get this ordering.)

Kevin Buzzard (Mar 11 2025 at 22:16):

In geometry there is this fundamental construction which goes from commutative rings to spaces; if I speak vaguely then this can apply to all kinds of geometric objects, but the construction is that given a space, you consider the ring of all functions on the space, and conversely given a ring, you consider the prime or maximal ideals of the ring (which choice you make depends on which century you're in and also which kind of geometric objects you're considering). These are adjoint functors between "algebra" and "geometry" but in this example I've always felt that you simply cannot get away from the fact that it's contravariant. Given a map of spaces X -> Y, then for a function on X you can't push it forwards to a function on Y (because it might take different values at different element of X which map to the same element of Y) but given a function on Y you can always pull it back to a function on X, so you get a map Ring(Y) -> Ring(X). This is an example of a beautiful adjunction for which it really is impossible to remove the ^ops. I don't know if the syntax/semantics duality has the same properties though.

Last updated: May 02 2025 at 03:31 UTC