Zulip Chat Archive
Stream: mathlib4
Topic: Many-sorted model theory
Mathias Stout (Oct 06 2025 at 16:29):
With a couple of model theorists, we're working to formalize the basics of the model theory of valued fields.
Most theorems we want to state require many-sorted logic.
As far as we know, this has not been formally implemented in Lean 4.
We are having some success translating the very basics of the one-sorted material in Mathlib essentially by
- keeping track of a type
Sof Sorts (in the Model Theory sense), - encoding signatures of function/relation symbols as objects of type
List S(compared to just having to keep track of the arityn : ℕin the one-sorted case), - encoding structures as dependent types
M : S → Type u, - encoding "tuples with signature
σ : List Sin a structureM : S → Type u" as dependent lists, represented by e.g.(i : Fin σ.length) → (M σ[i]).
Ideally, this such basic material on many-sorted logic would eventually end up in the Mathlib, so we wanted to reach to the community about basic design decisions. In particular
- Are there any opinions on using Lists for signatures rather than e.g.
Σ n, Fin n → Sor a separate datatype? - It feels to us like the "dependent lists" should probably be wrapped in a structure or definition, with definitions and theorems to convert between the many equivalent representations, rather than the explicit Pi type above.
- In any case, these dependent lists seem like a rather general concept. Are there any existing lemmas in Mathlib or other standard repositories that already provide such an interface?
- Any other piece of advice welcome!
Aaron Liu (Oct 06 2025 at 16:55):
Mathias Stout said:
- Are there any opinions on using Lists for signatures rather than e.g.
Σ n, Fin n → Sor a separate datatype?
I would suggest using Multiset S (unless you have a reason to do lists?)
Aaron Liu (Oct 06 2025 at 16:57):
Mathias Stout said:
- In any case, these dependent lists seem like a rather general concept. Are there any existing lemmas in Mathlib or other standard repositories that already provide such an interface?
I am using docs#List.TProd, which is not very well supported (and other people have told me to not use it)
Luigi Massacci (Oct 06 2025 at 17:01):
Aaron Liu said:
I would suggest using
Multiset S(unless you have a reason to do lists?)
How would you keep track of which sort is attached to which function parameter with a multiset?
Aaron Liu (Oct 06 2025 at 17:06):
Edward van de Meent (Oct 06 2025 at 17:34):
i'm not sure that works? since Actually, it works, but morally you should be using Multiset is List quotiented by permutation, you wouldn't be able to track it. (i.e. you wont be able to define things like Nat.sub)Finsupp α Nat if you want to go that route
Edward van de Meent (Oct 06 2025 at 17:37):
where α is your type of sorts
Edward van de Meent (Oct 06 2025 at 17:40):
that aside, i'm not convinced that forgetting the order in which arguments of different types appear is beneficial
Edward van de Meent (Oct 06 2025 at 17:41):
i have no doubt it will lead to some difficult to work with dependent types
Aaron Liu (Oct 06 2025 at 18:50):
You just have the sort of the argument along with a number telling you which one it is
Dexin Zhang (Oct 06 2025 at 18:57):
When Finsupp S Nat works, S → Nat should also work, I guess
Aaron Liu (Oct 06 2025 at 19:02):
then you get infinitary functions but not all the infinitary functions so it becomes a bit complicated
Dexin Zhang (Oct 06 2025 at 19:03):
I think we can expect S be finite for practical first-order languages
Dexin Zhang (Oct 06 2025 at 19:04):
It's not infinitary function, any term or formula can only use finite indices from its signature
Dexin Zhang (Oct 06 2025 at 19:27):
Oh, I realized. If we define Func : (S → Nat) → Type in Language then it is indeed infinitary
Dexin Zhang (Oct 06 2025 at 19:27):
So Finsupp does make sense
Mathias Stout (Oct 06 2025 at 19:28):
Thanks a lot for the input!
Mathias Stout (Oct 06 2025 at 19:28):
We did actually start out using Finsupp S Nat, which is super convenient, up to a point. Essentially many of the very basic definitions/lemmas of the Mathlib go through almost immediately.
For practical applications, it seems better to keep track of the order of the variables. Basically, having all the variables ordered gives you more information to work with.
Mathias Stout (Oct 06 2025 at 19:28):
For example, suppose I did not order my variables "between sorts" and I want to convert a many-sorted structure to a one-sorted one (where the different sorts now become relation symbols representing disjoint sets).
Since in the one-sorted setting the variables are totally ordered, such a conversion would depend on a choice of total order on the set of sorts, which seems like it could have undesirable consequences.
Luigi Massacci (Oct 06 2025 at 19:53):
By the way, if it's not an issue to ask, who's "we" in "we're working to formalize...."? I currently have all my lean time taken up by PRing another project, but I'd be very happy to collaborate on this in the near future. I actually wanted to formalize some model theoretic algebra last year (I was thinking about QE for RCFs) but then other stuff got in the way : (
Mathias Stout (Oct 06 2025 at 20:06):
Luigi Massacci said:
By the way, if it's not an issue to ask, who's "we" in "we're working to formalize...."? I currently have all my lean time taken up by PRing another project, but I'd be very happy to collaborate on this in the near future. I actually wanted to formalize some model theoretic algebra last year (I was thinking about QE for RCFs) but then other stuff got in the way : (
No issue! This is a project initiated by Deirdre Haskell in which Greg Cousins, Aaron Crighton, John Nicholson and I are currently involved with (to varying degrees). In general, we are very happy to see that other people might be interested in this as well. Feel free to direct message me when you have more time :)
Jasper Mulder-Sohn (Oct 07 2025 at 05:04):
Cool that someone is working on this! I've played around locally in a pathetic attempt to formalize my master's thesis. (As I'm not a Lean expert I quickly realized that it would be a tough journey and put it aside for some other day when the foundations are laid.)
Indeed I also went the route of using List S. One of the issues I encountered quickly is that it quickly makes you lose defeqs. I.e. you need to provide (annoying/tedious) proofs that the arguments line up, like so:
def SortedPreterm.apps {SL : SortedLanguage sorts} {l : List sorts} {s : sorts}
(t : SortedPreterm SL l s) (lst' : List (SortedTerm' SL)) (hlst' : lst'.map (·.sort) = l := by rfl) :
SortedTerm SL s :=
Here the problem lies in needing hlst' in the definition. I tried to trick around it using rfl but I quickly came to think the problem is with List. Since equality of Lists requires length comparison (hence proof) and there is no common quantification over two Lists.
My next attempt would be to refactor to Vector to have a shared length between tuples of sorts, but I haven't gotten round to it.
Sorry for the wall of text, this just seemed like a logical place to share my experiences before anyone else pours more time into it. With again the caveat that I have limited experience with the "setting up the right definition" aspect of Lean. Hope it helps.
Jasper Mulder-Sohn (Oct 07 2025 at 05:06):
Dexin Zhang said:
I think we can expect
Sbe finite for practical first-order languages
2 cents: In categorical logic, it is most definitely not the case that S is always finite, because many techniques involve S = ob <cat>. I would strongly advocate for a setup that does not assume S is finite, even if the focus is not on categorical logic right now.
Jasper Mulder-Sohn (Oct 07 2025 at 05:09):
Mathias Stout said:
No issue! This is a project initiated by Deirdre Haskell in which Greg Cousins, Aaron Crighton, John Nicholson and I are currently involved with (to varying degrees). In general, we are very happy to see that other people might be interested in this as well. Feel free to direct message me when you have more time :slight_smile:
I will follow your project with interest, though I am only a 'casual mathematician' and cannot commit reliable time. But would be happy to contribute on select topics and/or provide thoughts/feedback!
Jasper Mulder-Sohn (Oct 07 2025 at 05:11):
Final remark, there is also this thread: #maths > Categorical logic in Lean? (algebraic theories, etc.)
Mathias Stout (Oct 07 2025 at 13:52):
Thanks for sharing your experience! These are the kind of responses I was hoping for :).
Jasper Mulder-Sohn said:
Indeed I also went the route of using
List S. One of the issues I encountered quickly is that it quickly makes you lose defeqs. I.e. you need to provide (annoying/tedious) proofs that the arguments line up, like so:def SortedPreterm.apps {SL : SortedLanguage sorts} {l : List sorts} {s : sorts} (t : SortedPreterm SL l s) (lst' : List (SortedTerm' SL)) (hlst' : lst'.map (·.sort) = l := by rfl) : SortedTerm SL s :=
I think right here we can get around this problem by using a a dependent list/vector for lst' instead: i.e. let lst' is an object of some type like DVec l (SortedTerm SL) := (i : Fin l.length) → (SortedTerm SL l[i]) instead.
However, this only partially solves this problem and kicks the can down the road.
Even if you have a proof that l = l', you cannot directly compare DVec l with DVec l'. You have to cast over l=l', which complicates things.
These feel like subtle, but general issues that maybe other people have already given more thought to.
Jasper Mulder-Sohn said:
My next attempt would be to refactor to
Vectorto have a shared length between tuples of sorts, but I haven't gotten round to it.
If I understand correctly, you would suggest rather working with Fin n → Sorts or Vector, rather than lists for signatures?
Jasper Mulder-Sohn said:
2 cents: In categorical logic, it is most definitely not the case that `S` is always finite, because many techniques involve `S = ob <cat>`. I would strongly advocate for a setup that does not assume `S` is finite, even if the focus is not on categorical logic right now.
Yes, I fully agree with this. In addition, we would eventually want to talk about the T_eq construction and elimination of imaginaries, which would require infinitely many sorts.
Oh, and thanks for the link to the categorical logic thread!
Aaron Liu (Oct 07 2025 at 13:59):
Mathias Stout said:
I think right here we can get around this problem by using a a dependent list/vector for lst' instead: i.e. let lst' is an object of some type like
DVec l (SortedTerm SL) := (i : Fin l.length) → (SortedTerm SL l[i])instead.
However, this only partially solves this problem and kicks the can down the road.
Even if you have a proof that l = l', you cannot directly compareDVec lwithDVec l'. You have to cast overl=l', which complicates things.
These feel like subtle, but general issues that maybe other people have already given more thought to.
I had this exact problem! Here's what I did
Aaron Liu said:
I'm working with
List.TProdand there's barely any API. Is there anything that I can use to index into aList.TProd? Something like this:import Mathlib def List.TProd.get {ι : Type u} {α : ι → Type v} {l : List ι} (t : l.TProd α) (n : Nat) (i : ι) (hi : i ∈ l[n]?) : α i := match l, n, hi with | _ :: _, 0, rfl => t.1 | _ :: _, n + 1, hi => List.TProd.get t.2 n i hi
Jasper Mulder-Sohn (Oct 07 2025 at 15:01):
Mathias Stout said:
Jasper Mulder-Sohn said:
My next attempt would be to refactor to
Vectorto have a shared length between tuples of sorts, but I haven't gotten round to it.If I understand correctly, you would suggest rather working with
Fin n → SortsorVector, rather than lists for signatures?
I actually started out with Fin n but it had problems even earlier, in the declaration of terms/formulas. IIRC you get issues that there is no defeq/pattern-match on the relevant Fin n. That got resolved nicely with List (with a result more similar to the current one-sorted situation). My hope is that Vector would work better still (with the length explicit) but I didn't actually get around to trying it.
Indeed maybe a hybrid approach mixing low-level types would work better, but it does raise the question how we can a. make clear to others when to do what; and b. successfully abstract away from the concrete implementation asap. With a possible future c. without killing performance.
I think the key to doing it all successfully is to have as many nice defeqs for as long as possible.
Kenny Lau (Oct 07 2025 at 15:08):
I would suggest independently you work with Fin n; it has no pattern matching, but it has custom recursors, and it also has a custom notation ![3, 5, 1]
Kenny Lau (Oct 07 2025 at 15:09):
see e.g. Fin.snocInduction
Jasper Mulder-Sohn (Oct 07 2025 at 15:12):
One of the powerful things in the current one-sorted formalization is that well-formedness is implicitly baked into the type itself through inductive definition. Is it possible to retain that with this type of construction? If so, how?
Jasper Mulder-Sohn (Oct 07 2025 at 15:15):
My current local stuff is this:
universe u v
structure SortedLanguage (sorts : Type u) where
functions : List sorts → sorts → Type v
relations : List sorts → Type v
variable {sorts : Type u}
inductive SortedPreterm (SL : SortedLanguage sorts) : List sorts → sorts → Type u where
| var (k : Nat) (s : sorts) : SortedPreterm SL List.nil s
| func {l : List sorts} {s : sorts} (f : SL.functions l s) : SortedPreterm SL l s
| app {l : List sorts} {s s' : sorts} (t : SortedPreterm SL (s' :: l) s)
(t' : SortedPreterm SL List.nil s') : SortedPreterm SL l s
How could one do that with the Fin n recursors?
Yaël Dillies (Oct 07 2025 at 15:15):
Jasper Mulder-Sohn said:
As I'm not a Lean expert I quickly realized that it would be a tough journey
It's a tough journey for everyone :joy:
Kenny Lau (Oct 07 2025 at 15:17):
how is it done currently?
Kenny Lau (Oct 07 2025 at 15:17):
Jasper Mulder-Sohn (Oct 07 2025 at 15:19):
Kenny Lau said:
how is it done currently?
In one-sorted FOL the power of Nat's formalization is used and we need a many-sorted equally powerful method.
My question was: how can you define the app inductive (which now uses the pattern s' :: l) using the Fin n recursors?
Kenny Lau (Oct 07 2025 at 15:21):
I know nothing about multisorted model theory but I feel like sortedness maybe should not be checked on the level of parsing
Kenny Lau (Oct 07 2025 at 15:21):
to me just intuitively I feel like functions : List sorts → sorts → Type v is unusable
Kenny Lau (Oct 07 2025 at 15:22):
intuitively I want to just have function symbols (basically copy what FirstOrder.Language is doing) and then later assign sorts to each input of each function
Kenny Lau (Oct 07 2025 at 15:22):
hmm maybe this is ok
Kenny Lau (Oct 07 2025 at 15:23):
sorry i have not thought about this very much
Jasper Mulder-Sohn (Oct 07 2025 at 15:24):
Kenny Lau said:
to me just intuitively I feel like
functions : List sorts → sorts → Type vis unusable
I had no problem formalizing ZFC language to the point of valid formulas...
Kenny Lau (Oct 07 2025 at 15:24):
and your elements and sets are different sorts?
Kenny Lau (Oct 07 2025 at 15:25):
(sorry i only know one-sorted zfc)
Jasper Mulder-Sohn (Oct 07 2025 at 15:26):
I plainly added a noset sort and checked that stuff broke down if I used a noset sort somewhere. Kind of toy model but at least the many sorts should reduce to the one sort right?
Kenny Lau (Oct 07 2025 at 15:29):
what is a noset?
Jasper Mulder-Sohn (Oct 07 2025 at 15:29):
inductive ZFC_sorts where
| set
| noset
A "noset" is, well, not a set, so shouldn't allow set operations. I'll attach the code to avoid this back-and-forth
Jasper Mulder-Sohn (Oct 07 2025 at 15:30):
Jasper Mulder-Sohn (Oct 07 2025 at 15:31):
It also includes some ideas for fragments of logic baked into the formula type.
Kenny Lau (Oct 07 2025 at 15:34):
well, it does seem to work for you so far, i'm just worried that it will be hard to convert from list to a function, and you might as well need functions
Jasper Mulder-Sohn (Oct 07 2025 at 15:40):
Yeah well as the thread above shows it quickly starts to become difficult to have well-behaving definitions on top of this basis. So I was hoping you would have concrete ideas on how to leverage Fin n in a better way than List.
Jasper Mulder-Sohn (Oct 07 2025 at 15:41):
Or maybe someone else will swoop in and save the day :slight_smile:
Anyway, I have to attend to IRL business now, so my responses will be delayed.
Kenny Lau (Oct 07 2025 at 15:50):
import Mathlib
universe v w u
structure MultiSorted.Language (Sorts : Type u) where
/-- For every arity, a `Type*` of functions of that arity -/
Functions : ℕ → Type v
functionDomain : {n : ℕ} → Functions n → Fin n → Sorts
functionTarget : {n : ℕ} → Functions n → Sorts
/-- For every arity, a `Type*` of relations of that arity -/
Relations : ℕ → Type w
relationDomain : {n : ℕ} → Relations n → Fin n → Sorts
variable {sorts : Type u}
/-- `SortedPreterm SL s` means "a term of sort `s`" -/
inductive SortedPreterm (SL : MultiSorted.Language sorts) : sorts → Type _ where
| var (k : Nat) (s : sorts) : SortedPreterm SL s
| app {n : ℕ} (f : SL.Functions n)
(args : (i : Fin n) → SortedPreterm SL (SL.functionDomain f i)) :
SortedPreterm SL (SL.functionTarget f)
Kenny Lau (Oct 07 2025 at 15:50):
@Jasper Mulder-Sohn this is my attempt; idk what the list sort input was doing in your preterm and you had no docstring so i removed it
Kenny Lau (Oct 07 2025 at 15:50):
i assume it means "current context" but i don't need context
Kenny Lau (Oct 07 2025 at 15:50):
i also don't know what func is doing, clearly a function is not a term
Aaron Liu (Oct 07 2025 at 16:01):
Jasper Mulder-Sohn said:
inductive ZFC_sorts where | set | nosetA "noset" is, well, not a set, so shouldn't allow set operations. I'll attach the code to avoid this back-and-forth
Like ZFC with atoms?
Kenny Lau (Oct 07 2025 at 16:03):
i think they just wanted to attach a nonfunctional sort to test it out
Mathias Stout (Oct 07 2025 at 17:52):
@Kenny Lau That seems interesting, I had not considered decoupling the signature of a function from its type. On a purely mathematical level, this feels a bit strange/new to me. Usually I think of multisorted languages as having a set of function or relation symbols, for each possible signature.
Do you see any Lean-theoretical advantages to such an approach?
Mathias Stout (Oct 07 2025 at 17:54):
Also, for completeness, here is our attempt at Terms, as based on the Mathlib and using Lists
structure MSLanguage (Sorts : Type z) where
Functions : List Sorts → Sorts → Type u
Relations : List Sorts → Type v
-- The sorts and a collection of variable symbols for each sort
variable {Sorts : Type z} {α : Sorts → Type u'}
/-- A term is either a variable symbol, or a valid function application to terms-/
inductive Term {L : MSLanguage Sorts} (α : Sorts → Type u') : Sorts → Type _
| var s : (α s) → Term α s
| func (σ : List (Sorts)) s : ∀ (_ : L.Functions σ s),
((i : Fin σ.length) → Term α σ[i]) → Term α s
Mathias Stout (Oct 07 2025 at 17:56):
It seems relatively similar, except for that we call σ .length and that σ already knows the domain of the function.
Kenny Lau (Oct 07 2025 at 18:36):
your (α : Sorts → Type u') is trying to interpret the language, i think you should keep language and evaluation separate
Kenny Lau (Oct 07 2025 at 18:37):
oh nvm they are variables, sorry
Kenny Lau (Oct 07 2025 at 18:37):
what happened to the previous idea of using Nat-indexed variables?
Mathias Stout (Oct 07 2025 at 19:37):
Kenny Lau said:
oh nvm they are variables, sorry
Still, I think you have a good point there: with the set-up for languages you suggested, it could make sense to separate it into α : Type and varDomain : α → Sorts.
The reason why we have not gone with this set-up so far is that the former type seems to come up more naturally, e.g. Term α itself is of this type.
Mathias Stout (Oct 07 2025 at 19:44):
Kenny Lau said:
what happened to the previous idea of using Nat-indexed variables?
Essentially, we are trying to follow the very general one-sorted Mathlib set-up, which has a type α for "truly free" variables and indexes the variables that one can quantify over by Nat.
Terms which have a mix of these variables are then encoded as L.Term (α ⊕ (Fin n)) (again, in the one-sorted case).
Jasper Mulder-Sohn (Oct 08 2025 at 07:36):
Kenny Lau said:
Jasper Mulder-Sohn this is my attempt; idk what the list sort input was doing in your preterm and you had no docstring so i removed it
I was following the "partial application" paradigm from Flypitch, which has advantages of naturally having proper types for incomplete terms/formulas (hence Preterm and Preformula). One could also argue that it leads to more "Lean-ish" behaviour of the function symbols. That might simplify defining interpretations into other structures using actual functions.
Both in your and Mathias' approach there is instead a dependent type formulation for immediate full application. I guess the question is if in practice explicitly providing an element of that dependent type will be annoying. Or if it is easy to do so inline.
Kenny Lau (Oct 08 2025 at 07:37):
you're dealing with sorts, you can't escape the dependent type
Jasper Mulder-Sohn (Oct 08 2025 at 07:39):
My impression from this thread so far (FWIW) is that from the current position it is difficult to estimate which approach will work out best. Maybe we ought to try and determine some hallmark features that the formalization should exhibit and support well, and proceed to try and make those work first?
Jasper Mulder-Sohn (Oct 08 2025 at 07:41):
Kenny Lau said:
you're dealing with sorts, you can't escape the dependent type
With partial application paradigm (i.e. this), you can deal with the 'right' sort one at a time, thus embedding the dependent nature in the inductive definition. Or am I misunderstanding?
Kenny Lau (Oct 08 2025 at 07:41):
i think the amount of headache you experience while developing examples will be a good enough indicator
Kenny Lau (Oct 08 2025 at 07:42):
if that escapes the dependent type then sure use it
Kenny Lau (Oct 08 2025 at 07:42):
again i don't have any expert opinion on this, i have never used multi sorted
Jasper Mulder-Sohn (Oct 08 2025 at 07:56):
I guess my suggestion for a way forward would be to set some goals like:
- simple toy examples/statements (e.g. one-sorted reduction, vector space + field)
- key definitions (e.g. interpretation into a structure)
as well as keep an inventory of possible approaches:
- dependent type formulation
Fin norList - inductive formulation with
ListorVector - searching for/introducing some more effective base type that solves the issues
That is, some sort of framework that helps to assess whether the formalization is going in a good direction, instead of 'just' plowing on.
Aaron Liu (Oct 08 2025 at 10:12):
my opinion is that you will encounter a lot of casts so you better be comfortable with them
Adrian Marti (Oct 08 2025 at 11:22):
This is a pretty interesting thread to me, since I've been thinking quite a bit of how one would formalize terms in the context of categorical logic (and the terms and formulas are really the same as in model theory). Ideally, one would get an approach to terms that is general enough to provide a unified API for terms that works for sorted first-order logic (or any of its fragments or extensions) and also in contexts where composition of terms is constrained in some way, as it is in (possibly symmetric) linear logic, where we can't use the same variable twice (this happens when we construct terms in monoidal categories such as vector spaces).
Please excuse the abstract nonsense.
Header
Unfortunately, depending of how one formalizes things, a term would be of the type , where are the input sorts and is an output sort, which runs us into the topic you're talking about here. Composing these might be tricky in dependent type theory.
However, mathlib's category API does exactly the same thing, right? Why did mathlib even choose to formalize categories in this dependent way? There certainly are ways to define them as a type of arrows and a type of objects together with maps determining the source and the target of an object, like in a quiver (mathlib also does quivers dependently though). Now all arrows are in the same type and we could provide notation for composing them that simply yields junk values when the target of one arrow does not match the source of the other arrow. Is there any reason we're not doing this? Cause we could certainly do the same thing for terms.
Kenny Lau (Oct 08 2025 at 11:25):
you don't really gain anything by unbundling the sources and targets ime, because to get a non-junk value out of a valid function composition might be the same amount of work
Kenny Lau (Oct 08 2025 at 11:25):
but this comparison to category is useful
Kenny Lau (Oct 08 2025 at 11:27):
or maybe i'm wrong, i don't really know
Adrian Marti (Oct 08 2025 at 11:51):
I mean it certainly wouldn't be clean, since the junk value would be one of the identities on one of the objects (there is no "0" we can choose like for reals). Also, it would require decidable equality on objects, which is also not ideal. However, it would certainly keep tricky unification problems out of definitions and offload them to theorem proofs.
Adam Topaz (Oct 08 2025 at 12:19):
FWIW, I have a different point of view on doing multi-sorted model theory (or really logic more generally) in Lean. Namely, I think the objects should be as close as possible to the corresponding syntax we would write in the surface type theory. In other words, my priorities are in making it easy to apply the internalized logical framework to contexts outside the framework.
In the case of multi-sorted theories, my inclination would be to do something like the following:
universe u
inductive ProdExpr (S : Type u) where
| of : S → ProdExpr S
| prod : ProdExpr S → ProdExpr S → ProdExpr S
| nil : ProdExpr S
structure MSLanguage (S : Type u) where
Function : ProdExpr S → S → Type u
Relation : ProdExpr S → Type u
inductive MSLanguage.Term {S : Type u} (L : MSLanguage S) (α : S → Type u) :
ProdExpr S → Type _ where
-- Vars of type `s : S` give terms of type `s : S`.
| var (s : S) : α s → L.Term α (.of s)
-- If we have a term of type `s : ProdExpr S` and a function in the language to `t : S`, then
-- applying the function results in a term of type `t : S`.
| func {s : ProdExpr S} {t : S} (f : L.Function s t) (r : L.Term α s) : L.Term α (.of t)
-- If we have terms of type `s` and `t` then combining them results in a term of type `s.prod t`.
| prod {s t : ProdExpr S} : L.Term α s → L.Term α t → L.Term α (s.prod t)
Adam Topaz (Oct 08 2025 at 12:23):
The way to think about it is that ProdExpr S corresponds to the syntax of writing iterated products of sorts s : S, which is not associative by default, whereas using Multiset S would make it associative from the start. IMO associativity should be taken care of downstream, either by hand or using automation similar to prod_assoc%.
MSLanguage.Term in this case is the type of terms on an iterated product of sorts, not just on a single sort.
Aaron Liu (Oct 08 2025 at 12:24):
That sounds similar to what I was doing with typed lambda calculus
Adam Topaz (Oct 08 2025 at 12:24):
(NB. Maybe the inductive type for terms needs an additional constructor |unit : L.Term \alpha .nil)
Aaron Liu (Oct 08 2025 at 12:25):
It was much easier to work with when I had the well-formedness separately
Adam Topaz (Oct 08 2025 at 12:25):
Sounds probable. Do you have a link?
Aaron Liu (Oct 08 2025 at 13:06):
It's on a branch on my mathlib fork
Adam Topaz (Oct 08 2025 at 13:40):
To expand a bit given what was written above. Once you have a language L in the sense of the code block above, you can construct inductively the type of functions of type A -> B where A and B are ProdExpr Ss. Then composition could be done easily.
Aaron Liu (Oct 08 2025 at 14:38):
Adam Topaz said:
Sounds probable. Do you have a link?
you can see it at my branch aliu/ccc
Mathias Stout (Oct 08 2025 at 15:21):
Jasper Mulder-Sohn said:
I guess my suggestion for a way forward would be to set some goals like:
- simple toy examples/statements (e.g. one-sorted reduction, vector space + field)
- key definitions (e.g. interpretation into a structure)
Yes, I definitely agree. We are testing out approaches specifically on vector spaces + fields to see what works and what additional API and lemmas are required to
- make it as smooth as possible to actually define theories and structures and state theorems about them
- help Lean unfold operations on signatures and variable tuples, as to be able to prove those theorems.
Additionally, it felt prudent to check with more experienced Lean people if there could be an obviously wrong/right approach. Hence this thread, which has been extremely interesting :)
Mathias Stout (Oct 08 2025 at 15:29):
Adrian Marti said:
I mean it certainly wouldn't be clean, since the junk value would be one of the identities on one of the objects (there is no "0" we can choose like for reals). Also, it would require decidable equality on objects, which is also not ideal. However, it would certainly keep tricky unification problems out of definitions and offload them to theorem proofs.
That is a very interesting perspective.
From our perspective, dependent types seemed quite nice (initially), precisely because they naturally capture the idea of disjointness of sorts at the type level.
Do you know any examples in the Mathlib or other projects where this less dependent approach reduces the overhead?
Mathias Stout (Oct 08 2025 at 15:34):
@Adam Topaz Interesting!
So if I understand correctly, a Term α s actually represents a (nonassociative) product of terms, with targets in the sorts described by s?
Mathias Stout (Oct 08 2025 at 15:40):
In the context that we want to work in, all signatures are associative. Do I understand correctly that you believe one should build a sufficiently general framework where also nonassociative signatures
are allowed?
Mathias Stout (Oct 08 2025 at 15:46):
What would you think of instead defining an Dependent List inductive type e.g. as follows
inductive DList (S : Type*) (f : S → Type*) where
| nil : DList S f
| cons s : f s → DList S f → DList S f
And then rather passing on a DList S (Term α) to a function symbol to construct terms?
Aaron Liu (Oct 08 2025 at 15:47):
this looks like the same as List (Sigma f)
Adam Topaz (Oct 08 2025 at 16:08):
Mathias Stout said:
Do I understand correctly that you believe one should build a sufficiently general framework where also nonassociative signatures
are allowed?
No, not quite. I'm thinking about the usual model-theoretic context. The point of ProdExpr S is that if you have some function X : S -> Type, then you can associate to P : ProdExpr S a type as follows:
inductive ProdExpr (S : Type u) where
| of : S → ProdExpr S
| prod : ProdExpr S → ProdExpr S → ProdExpr S
| nil : ProdExpr S
def ProdExpr.Interpret {S : Type} (X : S → Type v) : ProdExpr S → Type v
| of s => X s
| prod a b => a.Interpret X × b.Interpret X
| nil => PUnit
And such "interpretations" of ProdExpr S are exactly the kinds of types where you consider elements if you have some concrete interpretation of your language with sorts S (which would involve such a function X that interprets elements of S as types). The ProdExpr S then keeps track of how such a type is constructed out of the sorts in a syntactic way, which means that you could more easily relate the logic which is simulated internally to the type-theoretic logic you get in such an interpretation.
Mathias Stout (Oct 08 2025 at 16:19):
Aaron Liu said:
this looks like the same as
List (Sigma f)
Yes, I believe it essentially the same thing, but packed in a nonstandard way (which may make it worse than List (Sigma f)).
Adrian Marti (Oct 08 2025 at 16:21):
Mathias Stout said:
That is a very interesting perspective.
From our perspective, dependent types seemed quite nice (initially), precisely because they naturally capture the idea of disjointness of sorts at the type level.
Do you know any examples in the Mathlib or other projects where this less dependent approach reduces the overhead?
Honestly, I am also naturally more drawn to the dependent version, though I really don't know how well unification of list types, or of the Fin n -> A works out when trying to compose terms. I think the partial application idea might be a compromise between the fully dependent version and the fully untyped version.
Mathias Stout (Oct 08 2025 at 16:30):
@Adam Topaz
Oh, I see. Interesting!
It seems like I should try to better understand how this contrasts to e.g. List (Sigma X).
(Don't have time right now, but will think it over later)
Aaron Liu (Oct 08 2025 at 16:52):
Adrian Marti said:
Honestly, I am also naturally more drawn to the dependent version, though I really don't know how well unification of list types, or of the
Fin n -> Aworks out when trying to compose terms. I think the partial application idea might be a compromise between the fully dependent version and the fully untyped version.
What is the partial application idea
Adrian Marti (Oct 08 2025 at 16:57):
I meant something like this, where you apply one variable at the time. I haven't thought about whether this actually works out or not though.
Adrian Marti (Oct 08 2025 at 17:03):
To me a "compromise" is anything where we construct terms where we check that they have the right number of inputs through types, but we don't check that the inputs have the right sorts.
Jasper Mulder-Sohn (Oct 08 2025 at 19:04):
Adam Topaz said:
The
ProdExpr Sthen keeps track of how such a type is constructed out of the sorts in a syntactic way, which means that you could more easily relate the logic which is simulated internally to the type-theoretic logic you get in such an interpretation.
I think in that sense the ProdExpr approach is comparable to partial application in that it allows to incrementally define the interpretation rather than having to ex-machina interpret the entire domain of a function symbol.
Conceptually though it seems at least inelegant to have a base type which is nonassociative. After all, a function symbol must a priori specify its "association" so it feels kind of moot.
Do you know if it really behaves any different than choosing an arbitrary yet consistent variant (classical iterated binary product)?
What your approach did trigger in me is that of course the interpretation and actual model theory will be easier when there is a more natural correspondence between the ways the syntax and the model are formalized. Stuff will carry over in the "natural/obvious" way. So maybe the approaches considered here can be compared with how the foundations are formalized on the model side (notably sets and categories).
Aaron Liu (Oct 08 2025 at 19:06):
this feels really similar to when I did the lambda calculus
Adam Topaz (Oct 08 2025 at 19:08):
My point of view is that no what what you do, you will have to deal with the fact that products in type theory are not definitionally associative. If you base your work on lists L, you will probably end up defining a type associated to a list L, and you wouldn't have (L ++ L').type = L.type x L'.type, so then you end up dealing with dependent type theory hell if you associated (A ++ B) ++ C to A ++ (B ++C). The approach using ProdExpr takes the point of view that you should deal with this hell as early as possible (which in my experience makes it much easier downstream, at the expense of a bit more work initially).
Aaron Liu (Oct 08 2025 at 19:11):
what I did is that I have the terms separate from their typing judgements
Aaron Liu (Oct 08 2025 at 19:11):
so you can put a cast on the typing judgement without doing anything to the term
Aaron Liu (Oct 08 2025 at 19:13):
this worked great until I encountered a situation where the form of the terms depended on the types and then it was annoying and tedious (but not difficult)
Adam Topaz (Oct 08 2025 at 19:14):
I think for such annoyances one should do a bit of metaprogramming
Aaron Liu (Oct 08 2025 at 19:17):
what would the metaprogram do
Aaron Liu (Oct 08 2025 at 19:18):
do you mean a metaprogram like file#Tactic/DepRewrite
Adam Topaz (Oct 08 2025 at 19:18):
Maybe I don't have a concrete enough idea of what sorts of issues you encountered. What I envision is something like prod_assoc%.
Aaron Liu (Oct 08 2025 at 19:20):
Adam Topaz said:
If you base your work on lists
L, you will probably end up defining a type associated to a listL, and you wouldn't have(L ++ L').type = L.type x L'.type, so then you end up dealing with dependent type theory hell if you associated(A ++ B) ++ CtoA ++ (B ++C).
exactly this sort of issues
Adam Topaz (Oct 08 2025 at 19:20):
Yeah, take a look at what prod_assoc does.
Aaron Liu (Oct 08 2025 at 19:20):
where can I look at it
Adam Topaz (Oct 08 2025 at 19:20):
Aaron Liu (Oct 08 2025 at 19:21):
this technically won't work in my situation because I don't have a definite amount of products
Aaron Liu (Oct 08 2025 at 19:21):
since it's a docs#List.TProd
Aaron Liu (Oct 08 2025 at 19:21):
but I see the idea
Adam Topaz (Oct 08 2025 at 19:22):
Oh, I see if you use TProd then you won't be able to do something similar indeed.
Adam Topaz (Oct 08 2025 at 19:22):
One of the benefits of things like ProdExpr is that doing an analogous thing should work.
Aaron Liu (Oct 08 2025 at 19:22):
my case is a bit trickier because with products you don't need coherence lemmas since they're all definitional equalities but I need coherences
Jasper Mulder-Sohn (Oct 08 2025 at 19:28):
Adam Topaz said:
My point of view is that no what what you do, you will have to deal with the fact that products in type theory are not definitionally associative. If you base your work on lists
L, you will probably end up defining a type associated to a listL, and you wouldn't have(L ++ L').type = L.type x L'.type, so then you end up dealing with dependent type theory hell if you associated(A ++ B) ++ CtoA ++ (B ++C). The approach usingProdExprtakes the point of view that you should deal with this hell as early as possible (which in my experience makes it much easier downstream, at the expense of a bit more work initially).
I fully agree with this sentiment, that the foundations should cater for this type of issue. Maybe I am just still resistant to the idea that the work must actually be done and must be 'hell' :slight_smile:
I would be happy to defer to people actually having gone through that in Lean. I guess it is all ok as long as there is an end to the 'hell' phase...
Mathias Stout (Oct 09 2025 at 00:54):
Adam Topaz said:
My point of view is that no what what you do, you will have to deal with the fact that products in type theory are not definitionally associative. If you base your work on lists
L, you will probably end up defining a type associated to a listL, and you wouldn't have(L ++ L').type = L.type x L'.type, so then you end up dealing with dependent type theory hell if you associated(A ++ B) ++ CtoA ++ (B ++C). The approach usingProdExprtakes the point of view that you should deal with this hell as early as possible (which in my experience makes it much easier downstream, at the expense of a bit more work initially).
Yes, this is definitely currently an annoyance. Even if L = L', then annoying casts are needed between L.type and L'.type. This 'hell' seems like it could be overcome by providing definitions along the lines of Fin.castand then proving lots of lemmas 'flattening' repeated applications. If something along the lines of ProdExpr would clean this up (possibly at the cost of proving lots lemmas), that would be an improvement.
Mathias Stout (Oct 09 2025 at 00:58):
Still, I don't think I fully grasp the idea behind ProdExpr. Given just the definition of ProdExpr and ProdExpr.Interpret, my first attempt at defining terms would be like so:
import Mathlib.Tactic
universe u v w z
inductive ProdExpr (S : Type u) where
| of : S → ProdExpr S
| prod : ProdExpr S → ProdExpr S → ProdExpr S
| nil : ProdExpr S
structure MSLanguage (S : Type u) where
Function : ProdExpr S → S → Type v
Relation : ProdExpr S → Type w
def ProdExpr.Interpret {S : Type u} (X : S → Type z) : ProdExpr S → Type z
| of s => X s
| prod a b => a.Interpret X × b.Interpret X
| nil => PUnit
inductive MSLanguage.Term {S : Type u} (L : MSLanguage S) (α : S → Type u) : S → Type _ where
-- Vars of type `s : S` give terms of type `s : S`.
| var (s : S) : α s → L.Term α s
-- If we have a product of terms of type in the language to `t : S`, then
-- applying the function results in a term of type `t : S`.
| func {s : ProdExpr S} {t : S} (f : L.Function s t) (r : ProdExpr.Interpret (Term L α) s) : L.Term α t
This inductive definition fails with the error (kernel) arg #7 of 'MSLanguage.Term.func' contains a non valid occurrence of the datatypes being declared. This is new to me. It seems that the mutual induction on ProdExpr.Interpret and Term L α is causing some Type issues?
Aaron Liu (Oct 09 2025 at 01:06):
ok I see what you want to do and I see why it doesn't work
Aaron Liu (Oct 09 2025 at 01:06):
maybe you can make ProdExpr.Interpret an inductive type instead?
Adam Topaz (Oct 09 2025 at 01:16):
That’s not the purpose of ProdExpr.Interpret, rather its purpose would be to define a type-theoretic interpretation of your language. The inductive type in my original code block should correspond to terms with this approach (modulo the missing unit constructor)
Adam Topaz (Oct 09 2025 at 01:18):
I guess mathlib calls it Structure in the single sorted case: https://leanprover-community.github.io/mathlib4_docs/Mathlib/ModelTheory/Basic.html#FirstOrder.Language.Structure
Mathias Stout (Oct 09 2025 at 01:24):
Oh ok, thanks for helping me out in understanding this!
I figured I went wrong somewhere when I tried to use this in the same way that I would use the dependent types that were discussed earlier.
Mathias Stout (Oct 09 2025 at 01:24):
Would it be fair to say that if I have a dependent type, say X : S → Type, and some s : ProdExpr S, then X s represents a product (X s₁) × (X s₂) × ... × (X sₙ) (modulo associativity issues)?
Mathias Stout (Oct 09 2025 at 01:27):
Adam Topaz said:
I guess mathlib calls it
Structurein the single sorted case: https://leanprover-community.github.io/mathlib4_docs/Mathlib/ModelTheory/Basic.html#FirstOrder.Language.Structure
Oh, I see that makes a lot of sense now, and is surprisingly elegant.
Aaron Liu (Oct 09 2025 at 01:29):
Mathias Stout said:
Would it be fair to say that if I have a dependent type, say
X : S → Type, and somes : ProdExpr S, thenX srepresents a product(X s₁) × (X s₂) × ... × (X sₙ)(modulo associativity issues)?
I think this would be
Application type mismatch: The argument
s
has type
ProdExpr S
but is expected to have type
S
in the application
X s
Adam Topaz (Oct 09 2025 at 01:30):
Right it’s not exactly X s rather s.Interpret X
Aaron Liu (Oct 09 2025 at 01:33):
yeah then that seems to would be correct
Mathias Stout (Oct 09 2025 at 02:02):
Oh yes, indeed.
I guess then throughout the development of the basics, one would use e.g.Term α s for some s : ProdExpr S where the List approach would use dependent objects like (i : Fin l.length) → (Term' α l[i]) with l : List S and Term' α : S → Type v.
Mathias Stout (Oct 09 2025 at 02:03):
At the very least, I feel like this nice idea should be properly tested. It deviates quite a bit from the "dependent listlike" approaches, which essentially all are isomorphic to a List of Sigma types (but might have different advantages in Lean). I currently am quite far from having any sufficient amount of experience to guess to the eventual effects and benefits.
Aaron Liu (Oct 09 2025 at 02:05):
Mathias Stout said:
which essentially all are isomorphic to a
ListofSigmatypes
sometimes you have a list of indices in mind and you want to restrict the list to be "over" your indices
Aaron Liu (Oct 09 2025 at 02:05):
so like a list of sigmas but separating the fst and snd into separate lists
Mathias Stout (Oct 09 2025 at 02:35):
Yeah, exactly. I guess it would have been more precise to state that you have a List of Sigma types + a proof that the projection onto the first components gives you the list of indices that you have in mind.
Jasper Mulder-Sohn (Oct 09 2025 at 05:40):
@Mathias Stout I am curious where your developments will take you; is there some place I can keep tabs on the progress, other than this thread? (And possibly chip in for some contributions if I find the opportunity?)
Adam Topaz (Oct 09 2025 at 12:25):
By the way, I heard rumors that some folks are interested in a formalization of the model theory of valued fields. Is this related?
Kenny Lau (Oct 09 2025 at 12:27):
Mathias Stout said:
With a couple of model theorists, we're working to formalize the basics of the model theory of valued fields.
@Adam Topaz it seems like we've gone full circle :slight_smile:
Adam Topaz (Oct 09 2025 at 12:27):
Aha!
Adam Topaz (Oct 09 2025 at 12:27):
Ok I guess I didn’t scroll up high enough
Mathias Stout (Oct 09 2025 at 14:43):
Jasper Mulder-Sohn said:
Mathias Stout I am curious where your developments will take you; is there some place I can keep tabs on the progress, other than this thread? (And possibly chip in for some contributions if I find the opportunity?)
Awesome, really cool to hear that you're interested.
There is currently no public-facing Github repo or anything like that. But I'll discuss with the rest of our group and definitely get back to you :)
Mathias Stout (Nov 18 2025 at 19:02):
Jasper Mulder-Sohn said:
Mathias Stout I am curious where your developments will take you; is there some place I can keep tabs on the progress, other than this thread? (And possibly chip in for some contributions if I find the opportunity?)
Took quite some time to get back to this, sorry for that. But there is a now a public GitHub, containing a partial proof of concept, along the lines of the first post in this thread. However, we're interested in finetuning the definition and trying in more detail some of the suggestions that were made in this thread. Feel free to point out any issues and give feedback.
https://github.com/Mathias-Stout/Many-sorted-model-theory
Jasper Mulder-Sohn (Nov 19 2025 at 14:12):
Mathias Stout said:
Took quite some time to get back to this, sorry for that. But there is a now a public GitHub, containing a partial proof of concept, along the lines of the first post in this thread. However, we're interested in finetuning the definition and trying in more detail some of the suggestions that were made in this thread. Feel free to point out any issues and give feedback.
Thanks for the effort and heads up. I'll check it out and report back :thumbs_up:
Aaron Anderson (Nov 21 2025 at 18:07):
Adam Topaz said:
Mathias Stout said:
Do I understand correctly that you believe one should build a sufficiently general framework where also nonassociative signatures
are allowed?No, not quite. I'm thinking about the usual model-theoretic context. The point of
ProdExpr Sis that if you have some functionX : S -> Type, then you can associate toP : ProdExpr Sa type as follows:inductive ProdExpr (S : Type u) where | of : S → ProdExpr S | prod : ProdExpr S → ProdExpr S → ProdExpr S | nil : ProdExpr S def ProdExpr.Interpret {S : Type} (X : S → Type v) : ProdExpr S → Type v | of s => X s | prod a b => a.Interpret X × b.Interpret X | nil => PUnitAnd such "interpretations" of
ProdExpr Sare exactly the kinds of types where you consider elements if you have some concrete interpretation of your language with sorts S (which would involve such a functionXthat interprets elements ofSas types). TheProdExpr Sthen keeps track of how such a type is constructed out of the sorts in a syntactic way, which means that you could more easily relate the logic which is simulated internally to the type-theoretic logic you get in such an interpretation.
I'm not really sure what the consequences of this will be, but I at least have one reason not to want this to be the core definition of first-order logic for mathlib: In variations on first-order logic, such as continuous logic (which I very much want to formalize at some point), we work with formulas that have infinitely many free variables, and unless I'm missing something, this construction can't account for that.
Adam Topaz (Nov 21 2025 at 19:58):
I should clarify that ProdExpr S here is the syntactic representation of the types where the free (or bound) variables live, not something representing the variables themselves.
Mathias Stout (Nov 24 2025 at 20:46):
Just to add a little bit more: in the end, our interest is in building out the basics of model theory of valued fields (say, at least up to Ax-Kochen/Ersov) principles, on which more advanced (possibly research-level) mathematics can be built, which would be of interest to fellow model theorists.
So for the formalization of many-sorted logic, we're looking for the following qualities
1) The basic definitions should be (able to be) integrated into Mathlib, so that the rest of the project is on stable footing.
2) (Related to 1) Experienced contributors to Mathlib/Mathlib.ModelTheory should be happy with the choice of formalization.
3) It should be as accessible as possible to do actual model theory (of say, valued fields) with the end product.
So to that end, we're very interested in people telling us now that the current approach is fundamentally flawed, or likely to be OK (with maybe some small changes later, informed by the issues we face when doing more advanced stuff).
Mathias Stout (Nov 24 2025 at 20:46):
Also, knowing a little bit more about the inner workings of Lean than last time, I see the appeal of ProdExpr more clearly.
Still, I find it hard to see the long-term impact on e.g. how it might detach formalized model theory from the way it is informally written (if it impacts this at all).
It also seems a bit harder to go back on using/not using ProdExpr than for the other suggestions (such as partial application, or swapping out List for a different structure).
While I don't have big issues with rewriting our foundations now, I'd rather not have to go back while we are doing more advanced mathematics.
Adam Topaz (Nov 24 2025 at 20:52):
My point of view on this is that if you want to do abstract model theory, then it doesn't matter too much whether you use lists, multisets, or whatever other representation you choose. However, if the plan is to do applied model theory (for example, the model theory of valued fields ;)) then you probably want your model-theoretic foundations to be as close as possible to Lean's foundations, so that you have an easier time actually applying model theory to, say, valued fields defined in terms docs#ValuativeRel . IMO, this ProdExpr S approach is an approach that would give something closer to the syntactic behavior of the surface type theory.
At the end of the day, the language, terms, etc. are syntactic objects. By working with List/Multiset/... you're mixing syntax with semantics a little bit.
Mathias Stout (Nov 28 2025 at 23:10):
Yes, I (am starting to) see that now :). Thanks for the suggestions and all the clarifications!
We've been experimenting a bit more with ProdExpr S and will be uploading the intermediate results to the GitHub repo soon.
It does raise the question of whether you could push this even further first work e.g. with 'FisrtOrderExpressions' rather than Formula, where the former might be potentially ill-formed, and if such a thing would even be desirable (for an appropriate definition of desirable)
Last updated: Dec 20 2025 at 21:32 UTC