Zulip Chat Archive

Stream: mathlib4

Topic: Content-hashing declarations

Johan Commelin (Apr 14 2025 at 12:22):

Intuitively, I'm a big proponent of everything that increases the decoupling of names from lambda expressions.

Mario Carneiro (Apr 14 2025 at 12:45):

I think there is a meaningful intermediary here, i.e. it would not be a good idea to go "full unison" and say that names are just resolving to lambda expressions. Rather names resolve to definitions which have lambda expression bodies

Mario Carneiro (Apr 14 2025 at 12:46):

otherwise we don't have anywhere to hang definition unfolding heuristics

Mario Carneiro (Apr 14 2025 at 12:46):

But I see no innate issue with a definition having multiple names like a linux hard link

Johan Commelin (Apr 14 2025 at 12:57):

Mario Carneiro said:

otherwise we don't have anywhere to hang definition unfolding heuristics

You can hang those in a separate db, indexed by hashes, right?

Mario Carneiro (Apr 14 2025 at 13:04):

hashes of what?

Mario Carneiro (Apr 14 2025 at 13:04):

if you are hashing the lambda term then you would not be able to have two definitions with the same content and different metadata

Mario Carneiro (Apr 14 2025 at 13:05):

basically I'm saying that if we erase all the names and just look at it as a soup of nodes, then there is a node type representing expressions but there is also a node type representing definitions together with all their metadata (except the name)

Mario Carneiro (Apr 14 2025 at 13:06):

and constant references in expressions refer to definitions, not the body expression directly

Mario Carneiro (Apr 14 2025 at 13:07):

and then you can layer names on top of that and content hash to your heart's content

Mario Carneiro (Apr 14 2025 at 13:08):

Another case of content hashing causing problems is if you erase names in inductive types, it would make structure Foo where n : Nat and structure Bar where n : Nat the same type

Mario Carneiro (Apr 14 2025 at 13:09):

some things have to be "generative"

Mario Carneiro (Apr 14 2025 at 13:09):

in the soup of nodes picture these would be distinct nodes that have the same subnodes

Johan Commelin (Apr 14 2025 at 13:10):

Mario Carneiro said:

Another case of content hashing causing problems is if you erase names in inductive types, it would make structure Foo where n : Nat and structure Bar where n : Nat the same type

Is that a feature or a bug?

Johan Commelin (Apr 14 2025 at 13:10):

Anyway, maybe we should split this thread.

Notification Bot (Apr 14 2025 at 15:09):

15 messages were moved here from #mathlib maintainers > Deprecating field names by Mario Carneiro.

Mario Carneiro (Apr 14 2025 at 15:12):

On the subject of nominal types, the #leantt formalization does not use nominal inductive types, it has a $\mu T:F. K$ term constructor for inductive types instead of an inductive declaration form

Mario Carneiro (Apr 14 2025 at 15:12):

which has the side effect that you can prove the equality of types like Foo and Bar

Mario Carneiro (Apr 14 2025 at 15:13):

And the way you can "fix" that is to put the name in the constructor, i.e. $\mu[Foo]T:F.K$ , or an auto-incrementing counter

Johan Commelin (Apr 15 2025 at 04:37):

Maybe I'm naive, but to me it seems that names are strictly an elaborator feature.
I don't see harm in being able to prove that Foo and Bar are equal.

Mario Carneiro (Apr 15 2025 at 13:13):

The standard argument for newtypes is things like structure Meters where n : Real and structure Feet where n : Real

Mario Carneiro (Apr 15 2025 at 13:14):

if I have such types and have a function def tooBig : Meters -> Bool it should be a type error to pass Feet.mk 37 to it

Mario Carneiro (Apr 15 2025 at 13:16):

Put another way, a type is not just a set of values but also an interpretation of those values, even though only the set of values is explicitly expressed in the syntax. If the interpretation changes then using a different type puts guard rails avoiding accidental misinterpretation

Mario Carneiro (Apr 15 2025 at 13:20):

Mario Carneiro said:

And the way you can "fix" that is to put the name in the constructor, i.e. $\mu[Foo]T:F.K$ , or an auto-incrementing counter

When I call it an auto-incrementing counter it makes this sound rather unnatural, but there are other ways of presenting type declarations that are more like variable bindings, and it should not be any surprise that when you are in a scope x : Nat, y : Nat |- ..., x and y do not magically become equal simply because they have the same type. They are different variables and the type theory respects this, and in the underlying system this is accomplished using de bruijn variables, i.e. "an auto-incrementing counter". Note that this really has nothing to do with the names x and y, rather just that they are different variables because the declaration operation was performed twice.

Johan Commelin (Apr 15 2025 at 16:34):

But that auto-incrementing counter does imply that you need some global entity that knows about all declarations out there, and then assigns them a "de bruijn index".

Johan Commelin (Apr 15 2025 at 16:35):

Mario Carneiro said:

if I have such types and have a function def tooBig : Meters -> Bool it should be a type error to pass Feet.mk 37 to it

Can't it be enough from a UX POV to have this be a type error at the elaboration level?

Mario Carneiro (Apr 15 2025 at 16:36):

Johan Commelin said:

But that auto-incrementing counter does imply that you need some global entity that knows about all declarations out there, and then assigns them a "de bruijn index".

That depends on how you conceptualize it, I suppose. You need them to be recognized as distinct, but there are ways to get that without central coordination points, e.g. UUIDs

Mario Carneiro (Apr 15 2025 at 16:38):

Johan Commelin said:

Mario Carneiro said:

if I have such types and have a function def tooBig : Meters -> Bool it should be a type error to pass Feet.mk 37 to it

Can't it be enough from a UX POV to have this be a type error at the elaboration level?

:shrug: I think all typing can be done at the elaboration level

Mario Carneiro (Apr 15 2025 at 16:39):

but if you do that then it's not clear how you've done anything other than shift the burden of checking to a different subsystem and still having nominal semantics

Johan Commelin (Apr 15 2025 at 16:42):

Well, I'm not doing all typing at the elaboration level.

My main application is mathematical consistency, proof correctness. Plugging Feet.mk 37 into tooBig will not give you a proof of False.

Johan Commelin (Apr 15 2025 at 16:43):

Mario Carneiro said:

That depends on how you conceptualize it, I suppose. You need them to be recognized as distinct, but there are ways to get that without central coordination points, e.g. UUIDs

But then you lose the feature that if person Foo defines Nat in Greece in 370 BC and person Bar defines Nat in India in 530 BC, that they automatically end up with the same Nat.
And so now person Foo can't use theorems by Bar, when they discovers them.

Mario Carneiro (Apr 15 2025 at 16:49):

yeah I think that is by design, that's the same thing you would get if both the ancient Greek and Indian lean users used the inductive command separately

Mario Carneiro (Apr 15 2025 at 16:56):

Johan Commelin said:

My main application is mathematical consistency, proof correctness. Plugging Feet.mk 37 into tooBig will not give you a proof of False.

It can lead to proofs by explosion though

Mario Carneiro (Apr 15 2025 at 17:01):

I think it is more appropriate to have separate types but use something HoTT-styled to mediate the various ways that the types can be identified. Just because they have the same structure doesn't mean the "identity function" induced by this is the right isomorphism to apply to turn Greek mathematics into Indian mathematics

Mario Carneiro (Apr 15 2025 at 17:01):

Nat is a special case because it has no nontrivial isomorphisms

Aaron Liu (Apr 15 2025 at 17:04):

Mario Carneiro said:

Nat is a special case because it has no nontrivial isomorphisms

What kind of isomorphisms?

Kevin Buzzard (Apr 15 2025 at 17:06):

Does every inductive have no nontrivial automorphisms if by "automorphism" you been "constructor-preserving map"?

Aaron Liu (Apr 15 2025 at 17:07):

Then Bool would seem to have Bool.not

Mario Carneiro (Apr 15 2025 at 17:08):

I think kevin means mapping each constructor to the corresponding constructor

Aaron Liu (Apr 15 2025 at 17:08):

Preserves constructors up to renaming

Kevin Buzzard (Apr 15 2025 at 17:08):

That doesn't preserve the constructor Bool.true. This is why your question is pertinent!

Mario Carneiro (Apr 15 2025 at 17:08):

and I think the answer is yes

Aaron Liu (Apr 15 2025 at 17:08):

What about Prod.swap?

Aaron Liu (Apr 15 2025 at 17:08):

It preserves the only constructor Prod.mk

Anatole Dedecker (Apr 15 2025 at 17:09):

No, because this takes two arguments in order

Mario Carneiro (Apr 15 2025 at 17:09):

(I also assume that preserving a constructor means mapping the arguments in the same order :stuck_out_tongue_closed_eyes: )

Aaron Liu (Apr 15 2025 at 17:10):

Actually, what does "preserves constructors" even mean?

Mario Carneiro (Apr 15 2025 at 17:11):

a homomorphism of Nat is a function f such that f zero = zero and f (succ n) = succ (f n)

Mario Carneiro (Apr 15 2025 at 17:12):

and since this has clauses mirroring those of a definition by recursion, you can prove by induction that it's the identity function

Mario Carneiro (Apr 15 2025 at 17:12):

so this is true for all inductive types

Mario Carneiro (Apr 15 2025 at 17:13):

This is interesting, it's a kind of "rigidity" I associate with ZFC set theory, "junk theorems" and all that

Aaron Liu (Apr 15 2025 at 17:13):

Mario Carneiro said:

a homomorphism of Nat is a function f such that f zero = zero and f (succ n) = succ (f n)

So how does this definition apply to Prod (or Prod Nat Nat if you want to be concrete)?

Aaron Liu (Apr 15 2025 at 17:14):

Would it just be f (a, b) = (a, b)?

Mario Carneiro (Apr 15 2025 at 17:14):

a homomorphism of h : A x B -> A' x B' relative to morphisms f : A -> A', g : B -> B' is a function such that h (a, b) = (f a, g b)

Yaël Dillies (Apr 15 2025 at 17:14):

(misread)

Mario Carneiro (Apr 15 2025 at 17:15):

the "relative to" part is because Prod is a parameterized type

Mario Carneiro (Apr 15 2025 at 17:15):

but if f and g are identity functions then you can prove by recursion that h is also an identity function

Aaron Liu (Apr 15 2025 at 17:16):

What about arrow types?

Mario Carneiro (Apr 15 2025 at 17:18):

I think you need isomorphisms to do contravariant things, a homomorphism of f : (A -> B) -> (A' -> B') over a : A ~= A' and b : B ~= B' satisfies f g = fun x => b (g (a^-1 x))

Aaron Liu (Apr 15 2025 at 17:19):

What about universe types, like Prop and Type? Those don't seem to have any constructors.

Mario Carneiro (Apr 15 2025 at 17:19):

HoTT can basically compute these expressions for any type former

Mario Carneiro (Apr 15 2025 at 17:19):

I was just getting to that: the universe is an example of a non-rigid type since it is not inductively defined. It actually has nontrivial isomorphisms

Mario Carneiro (Apr 15 2025 at 17:20):

and those isomorphisms are precisely the things that e.g. interchange Meters and Feet

Aaron Liu (Apr 15 2025 at 17:21):

So what's a homomorphism of Type have to satisfy?

Mario Carneiro (Apr 15 2025 at 17:23):

In HoTT, it would be a function F : Type -> Type such that (A = B) ~= (F A ~= F B)

Mario Carneiro (Apr 15 2025 at 17:23):

that really doesn't make a lot of sense without HoTT though, since applied to the identity function it's literally univalence

Mario Carneiro (Apr 15 2025 at 17:24):

but the simple consequence would be that isomorphic types are mapped to isomorphic types

Mario Carneiro (Apr 15 2025 at 17:25):

(Don't quote me on that one though, I haven't actually worked it out)

Kevin Buzzard (Apr 15 2025 at 17:52):

Type is a category and you could ask for an equivalence from Type to itself (or slightly more ridiculously for an isomorphism). One such isomorphism could be "swap Meters and Feet and fix everything else" although of course there is a model of Lean where this is the identity and another model where it's not ;-)

Last updated: May 02 2025 at 03:31 UTC