Zulip Chat Archive

Stream: mathlib4

Topic: Some observations on eta experiment

Anatole Dedecker (May 01 2023 at 21:33):

I tried to understand why we need the eta experiment so much in e.g !4#3708, and I've managed to minimize it to this:

import Mathlib.Algebra.Module.LinearMap

-- succeeds
example {R₁ R₂ M₁ M₂ : Type _} [Semiring R₁] [Semiring R₂] [AddCommMonoid M₁] [AddCommMonoid M₂]
  [Module R₁ M₁] [Module R₂ M₂] (σ : R₁ →+* R₂) :
  SemilinearMapClass (M₁ →ₛₗ[σ] M₂) σ M₁ M₂ :=
inferInstance

-- fails
example {R₁ R₂ M₁ M₂ : Type _} [Ring R₁] [Semiring R₂] [AddCommMonoid M₁] [AddCommMonoid M₂]
  [Module R₁ M₁] [Module R₂ M₂] (σ : R₁ →+* R₂) :
  SemilinearMapClass (M₁ →ₛₗ[σ] M₂) σ M₁ M₂ :=
inferInstance

-- succeeds
set_option synthInstance.etaExperiment true in
example {R₁ R₂ M₁ M₂ : Type _} [Ring R₁] [Semiring R₂] [AddCommMonoid M₁] [AddCommMonoid M₂]
  [Module R₁ M₁] [Module R₂ M₂] (σ : R₁ →+* R₂) :
  SemilinearMapClass (M₁ →ₛₗ[σ] M₂) σ M₁ M₂ :=
inferInstance

I believe that this explains quite a lot of troubles we've had when working with (continuous) semilinear maps, and looking at it closer it seems that the problem comes from unifying the NonAssocSemiring instances appearing in σ : R₁ →+* R₂: in one case it's Ring -> NonAssocRing -> NonAssocSemiring while in the other it's Ring -> Semiring -> NonAssocSemiring. So it would be really nice to try and optimize this diamond: of course there will still be cases where we need eta reduction, but right now we are at the point where some files need synthInstance.etaExperiment for every lemma, which slows down typeclass search a lot, so we could try to optimize that.

Anatole Dedecker (May 01 2023 at 21:36):

I've tried to play a bit with the precise definition of all the Ring-like structure, but I didn't find how to make it work yet. I guess we could just make all of them extend NonUnitalNonAssocSemiring and have all fields be added to that, but at that point it's not clear if we get any of the advantages of new structures. Anyway, if someone wants to play with that, I thought this was a good test.

Jireh Loreaux (May 01 2023 at 21:46):

You should read Eric's paper if he'll send it to you. It's quite good.

Eric Wieser (May 01 2023 at 21:47):

Maybe I should just send it to arXiv, I just wasn't ready to fight their TeX compiler just yet

Eric Wieser (May 01 2023 at 21:47):

(I already sent it to @Anatole Dedecker :smile:)

Anatole Dedecker (May 01 2023 at 21:56):

Yes I've read it, and I understood from it that the only two ways to get to a situation where everything works is either to flatten everything back of to enable the etaExperiment. My point is, even if we have to manually do it for now in some places, since there are so many cases that could be solved all at once, maybe it would be worth trying to change this one diamond (between Ring and NonAssocSemiring), at least to make porting easier (in its current state, Analysis.NormedSpace.OperatorNorm is very slow to compile because of etaExperiment being enabled for most lemmas). Sorry if I'm juste re-discovering well-known facts.

Eric Wieser (May 01 2023 at 22:07):

I agree that some small-scale optimization of the "preferred" paths could work as an interim solution

Eric Wieser (May 01 2023 at 22:07):

I suspect something else that would help is changing the priority of the first parent of structures to be higher than that of subsequent parents; but I think there is a lean bug that makes that impossible

Kevin Buzzard (May 01 2023 at 22:15):

Did you open an issue? Sounds like it might be more easily fixed than type class inference!

Eric Wieser (May 01 2023 at 22:19):

Yes, there's already an issue about not being able to change extends priority

Eric Wieser (May 01 2023 at 22:19):

lean4#2115

Scott Morrison (May 01 2023 at 22:26):

Could @Mario Carneiro or @Gabriel Ebner put a mathlib-high-prio label on that one?

Jeremy Tan (May 02 2023 at 00:32):

(deleted)

Jeremy Tan (May 02 2023 at 00:33):

(deleted)

Sebastien Gouezel (May 03 2023 at 06:52):

I think I have finally understood something that Eric Wieser had already explained to me, that there is no way to avoid either eta or flat structures, and that I had not fullly grasped at the time. Let me record this for my future self, and maybe others.

When a structure B extends A, then it has normally a field toA (and additional fields).

Suppose now that both B and C extend A, and that D extends B and C (think A = NonUnitalNonAssocSemiring and B = NonUnitalSemiring and C = NonAssocSemiring and D = Semiring). Then one can not put in D two fields toC and toB, as the two maps one would get to A would be unrelated. Instead, one makes an arbitrary choice that there is a field toC, say, and additional fields that correspond to the missing fields of B. And the map from D to B is an ad-hoc construction, such that D.toB foo has a field toA given by foo.toC.toA and the other missing fields added by hand. In this way, both ways to go from D to A commute definitionally, and there is no diamond issue.

Note a big difference between the two maps D.toC and D.toB above. The map D.toC foo produces a whole structure C (without defining fields individually) (I will say this is a whole map) while D.toB foo produces a structure by filling in individual fields (I will say this is an opening map).

For now, everything is fine. Assume now that there are other structures with a prime, that follow the same inheritance as above (D' extends C' and B', which both extend A'), and moreover each primed version extends the non-primed one (remove Semi in the examples above). How should one set the maps in the cube A, B, C, D, A', B', C', D'? Which ones should be whole and which ones should be opening?

The claim is that there is no solution for this diagram (which shows up in mathlib) where all arrows commute definitionally if we don't use eta or flat structures. Flat structures ensures that all the maps are opening, and would solve all the issues. Assume instead we want to use whole maps whenever possible for performance reasons, say one whole map going out of each vertex (property (I)). Without eta, something obtained from a whole map can not be defeq to something obtained from an opening map if I understand right, so if we want defeq everywhere, each vertex in the cube should be the target only of opening maps, or only of whole maps (property (II)).

Then one cheks that there is no labelling of the oriented edges of the cube A, B, C, D, A', B', C', D' by whole maps or opening maps satisfying the properties (I) and (II). Too bad...

Kevin Buzzard (May 03 2023 at 07:14):

And because turning on structure eta in typeclass search makes things slower, so is not acceptable, the only solution is flat structures, right? @Gabriel Ebner

PS thanks a lot Sebastien -- your explanation has finally made the penny drop about what the issue is.

Johan Commelin (May 03 2023 at 07:18):

Well, the other option is to redesign the hierarchy with a different amount of bundling, by going full mixin on notation typeclasses. But that will be a huge refactor. (And we don't have a solid amount of data to know if the result will actually be usable in practice.)

Patrick Massot (May 03 2023 at 07:22):

I know it's probably way too late but remember than some people have been telling us for five years that the solution is to use unification hints instead of type classes.

Kevin Buzzard (May 03 2023 at 07:26):

I don't understand what you're proposing Johan (and I'd be interested to understand). Sebastien is saying that we can't have property (I) which means we can't have "whole maps" (forgetful functors) at all if we want to preserve defeq (which we do if eta is off).

Kevin Buzzard (May 03 2023 at 07:27):

Oh, you mean [ring_notation R] to be +-*01?

Eric Wieser (May 03 2023 at 07:34):

@Sebastien Gouezel, your explanation is missing an important piece; the problem only arises when you add typeclasses that depend on other typeclasses to the mix, like module, star_ring, and archimedean

Eric Wieser (May 03 2023 at 07:35):

I will try to submit my CICM preprint to arXiv today, as it includes that note

Sebastien Gouezel (May 03 2023 at 08:30):

Can you explain why? I had the impression that the cube example gives rise to non-defeq diamonds whatever you do, even without additional typeclasses that depend on those, unless you use eta or flat structures.

Eric Wieser (May 03 2023 at 08:35):

Those non-defeq diamonds don't matter if in the end you just use the raw data projections, as composed with those projections the results are defeq again

Eric Wieser (May 03 2023 at 08:36):

I suppose probably you hit the same problem with structures that consume other typeclasses, not just typeclasses consuming typeclasses

Sebastien Gouezel (May 03 2023 at 08:38):

Sure, you need something that uses the full A, and not subfields of A.

Eric Wieser (May 03 2023 at 08:40):

In practice I think that something has to be an inductive, though I suppose an irreducible def might do the trick too

Kevin Buzzard (May 03 2023 at 09:30):

Patrick Massot said:

I know it's probably way too late but remember than some people have been telling us for five years that the solution is to use unification hints instead of type classes.

What does the unification hints solution look like?

Note that we used flat structures for 5 years and also had very few problems.

Jireh Loreaux (May 03 2023 at 12:13):

Kevin, you forgot: "... and a slow mathlib." Of course, I'm not claiming that this is a major contributing factor to that, but maybe it played some role.

Scott Morrison (May 03 2023 at 12:17):

We may just be able to turn on eta globally. There is a patch to Lean4 + mathlib4 by Gabriel that does this (and which I've just updated to current master), and it's faster than the current mathlib4. It may take a while to work out how / if / when it could be merged, but don't despair that we're stuck with etaExperiment quite yet. :-)

Eric Wieser (Jun 03 2023 at 17:52):

Here's a diagrammatic representation of @Sebastien Gouezel's post above:

image.png

Yury G. Kudryashov (Jun 04 2023 at 00:43):

Is there an algebraic hierarchy design that works better with Lean 4?

Kevin Buzzard (Jun 04 2023 at 08:44):

Flat structures like lean 3. But then they'll become gigantic and slow in the long run

Eric Wieser (Jun 04 2023 at 09:19):

It's not clear to me that flat structures have any reason to be less performant than nested structures, in the presence of eta expansion

Kevin Buzzard (Jun 04 2023 at 10:34):

Aren't the main problems these defeq pseudo-diamonds which lean 4 spends forever fussing about?

Eric Wieser (Jun 04 2023 at 11:26):

I don't think switching between the two approaches has much effect on whether it should spend forever fussing

Eric Wieser (Jun 04 2023 at 11:26):

Though it's certainly easier to reason about the flat case!

Kevin Buzzard (Jun 04 2023 at 11:32):

Why did Lean 3's typeclass system have no problems unifying Field.toSemiring and Ring.toSemiring and CommSemiring.toSemiring?

Yury G. Kudryashov (Jun 04 2023 at 17:43):

Will we reduce the number of diamonds if we somehow decouple data and properties?

Yury G. Kudryashov (Jun 04 2023 at 17:43):

What is the trade-off here?

Eric Wieser (Sep 01 2023 at 20:50):

Eric Wieser said:

Here's a diagrammatic representation of Sebastien Gouezel's post above:

Having now published this, I realize that this claim is false; in particularly, this plausible characterization turns out not to the the case

Sebastien Gouezel said:

Without eta, something obtained from a chain of whole maps can not be defeq to something obtained from an opening map if I understand right

The true statement is that something obtained only from whole maps cannot be defeq to something not obtained this way. Once both paths include an open step, then all bets are off unless you can see the full graph.

The claim is that there is no solution for this diagram

The figure in my post above is precisely such a solution, I apparently did not check it correctly in Lean

Kevin Buzzard (Sep 01 2023 at 20:54):

So we should go back to carefully choosing spanning trees a la #6011 ?

Eric Wieser (Sep 01 2023 at 21:03):

Well, we still don't know that it's possible for sure; all we know is that the counterexample we thought we had wasn't good enough

Eric Wieser (Sep 01 2023 at 21:04):

If a solution is possible, I think the above is pretty strong evidence that we can't be trusted to find it by hand

Eric Wieser (Sep 01 2023 at 21:05):

It definitely still isn't possible to solve it for cyclic typeclass graphs, but we pretty much never use those anyway

Kevin Buzzard (Sep 01 2023 at 21:06):

that picture is: one down-arrow from nu_semi to nu_na_semi, and then two right arrows (adding nu) and four left arrows (dropping semi). Is that what the pattern is supposed to be?

Eric Wieser (Sep 01 2023 at 21:08):

That picture is neither intended to represent reality, or to represent what mathlib should change to. It was supposed to be a counterexample, but... At any rate, we're not solving for just that cube in mathlib; we're solving for the entire graph of typeclasses

Mario Carneiro (Sep 01 2023 at 21:08):

I'm not sure I understand whether this problem is expressible using just graph theory

Mario Carneiro (Sep 01 2023 at 21:08):

is there some property of the dotted edges that we want to ensure?

Mario Carneiro (Sep 01 2023 at 21:09):

not to mention that if you include eta then everything becomes a defeq so I don't know any more what is being measured from a theory perspective

Eric Wieser (Sep 01 2023 at 21:10):

not to mention that if you include eta then everything becomes a defeq

Right, to test these things you have to use Lean 3.

Mario Carneiro (Sep 01 2023 at 21:10):

okay so the claim is about defeq without eta?

Eric Wieser (Sep 01 2023 at 21:10):

Yes, exactly

Mario Carneiro (Sep 01 2023 at 21:11):

and you want to evaluate whether any two paths are defeq?

Eric Wieser (Sep 01 2023 at 21:11):

So this isn't a problem with us using eta, but it is a problem in my/our understanding of why (and whether) it was necessary

Mario Carneiro (Sep 01 2023 at 21:12):

I just want to understand what the theoretical version of the problem is, and to what extent the picture above is or isn't a counterexample and what properties you need to find a real one

Mario Carneiro (Sep 01 2023 at 21:12):

What does it mean, precisely, that the given square does not commute?

Mario Carneiro (Sep 01 2023 at 21:13):

are you saying that given a variable of type ring, the two paths yield terms of type na_semiring that are not defeq without eta?

Eric Wieser (Sep 01 2023 at 21:13):

That is what the original (false) claim was, yes

Mario Carneiro (Sep 01 2023 at 21:14):

are these non-flat structures?

Eric Wieser (Sep 01 2023 at 21:14):

Yes, it's all trivial for flat structures

Mario Carneiro (Sep 01 2023 at 21:27):

Here's a model of that situation if I understand correctly:

opaque NUNASemiring : Type
structure NUSemiring where
  to : NUNASemiring
  assoc : Bool
structure NASemiring where
  to : NUNASemiring
  unit : Bool
structure Semiring where
  to : NUSemiring
  unit : Bool
structure NARing where
  to : NASemiring
  neg : Bool
structure NUNARing where
  to : NUNASemiring
  neg : Bool
structure NURing where
  to : NUSemiring
  neg : Bool
structure Ring where
  to : Semiring
  neg : Bool

def Ring.toNA (r : Ring) : NARing where
  to.to := r.to.to.to
  to.unit := r.to.unit
  neg := r.neg
def Ring.toNU (r : Ring) : NURing where
  to := r.to.to
  neg := r.neg
def Semiring.toNA (r : Semiring) : NASemiring where
  to := r.to.to
  unit := r.unit
def Semiring.toNU (r : Semiring) : NUSemiring where
  to := r.to.to
  assoc := r.to.assoc
def NURing.toNA (r : NURing) : NUNARing where
  to := r.to.to
  neg := r.neg
def NARing.toNU (r : NARing) : NUNARing where
  to := r.to.to
  neg := r.neg

def Eq' := @Eq

example (r : Semiring) : Eq' r.toNA.to r.to.to := by
  dsimp only [Semiring.toNA]
  guard_target =ₛ Eq' r.to.to r.to.to; rfl
example (r : NURing) : Eq' r.toNA.to r.to.to := by
  dsimp only [NURing.toNA]
  guard_target =ₛ Eq' r.to.to r.to.to; rfl
example (r : NARing) : Eq' r.toNU.to r.to.to := by
  dsimp only [NARing.toNU]
  guard_target =ₛ Eq' r.to.to r.to.to; rfl
example (r : Ring) : Eq' r.toNU.to r.to.to := by
  dsimp only [Ring.toNU]
  guard_target =ₛ Eq' r.to.to r.to.to; rfl
example (r : Ring) : Eq' r.toNA.to r.to.toNA := by -- the one in the paper
  dsimp only [Ring.toNA, Semiring.toNA]
  guard_target =ₛ @Eq' NASemiring ⟨r.to.to.to, r.to.unit⟩ ⟨r.to.to.to, r.to.unit⟩; rfl
example (r : Ring) : Eq' r.toNA.toNU r.toNU.toNA := by
  dsimp only [Ring.toNU, NARing.toNU, NURing.toNA, Ring.toNA]
  guard_target =ₛ @Eq' NUNARing ⟨r.to.to.to, r.neg⟩ ⟨r.to.to.to, r.neg⟩; rfl

-- fail test
def Ring.toNU' (r : Ring) : NURing where
  to.to := r.to.to.to
  to.assoc := r.to.to.assoc
  neg := r.neg
example (r : Ring) : Eq' r.toNU'.to r.to.to := by
  dsimp only [Ring.toNU']
  fail_if_success guard_target =ₛ Eq' r.to.to r.to.to
  rfl

Note that dsimp only will not use eta here

Eric Wieser (Sep 01 2023 at 21:29):

Note I edited a link to a (lean3) model in my top post; it looks like your model is indeed the same

Eric Wieser (Sep 01 2023 at 21:30):

Note the challenge is not to show that a situation with failing defeq exists (the paper already contains a genuine one), but to show that for any unlabelled graph, it's possible to label the edges as to or toX (to use your naming) such that no such problem is present

Mario Carneiro (Sep 01 2023 at 21:39):

wait, so your (new) claim is that it is always possible to make a DAG commute by some spanning tree?

Mario Carneiro (Sep 01 2023 at 21:41):

I did actually insert a bit of "domain knowledge" into the model: it's a cube graph, and I added fields representing each thing that was lost in each forgetful functor

Mario Carneiro (Sep 01 2023 at 21:41):

in a general DAG it's not immediately obvious how to determine these extra fields, and they are relevant to the definitions of the non-parent projections

Eric Wieser (Sep 01 2023 at 21:42):

I'm pretty sure the extra fields are a distraction (beyond using "forgetful inheritance" to populate them)

Mario Carneiro (Sep 01 2023 at 21:43):

maybe it would be good to have an example of a failing defeq in the same framework

Eric Wieser (Sep 01 2023 at 21:43):

https://github.com/eric-wieser/lean-multiple-inheritance has such an example, admittedly without the clear statement of defeq

Mario Carneiro (Sep 01 2023 at 21:44):

which one?

Eric Wieser (Sep 01 2023 at 21:47):

this one

Eric Wieser (Sep 01 2023 at 21:47):

(this doesn't confirm the fields are a distraction, but it's easy to eliminate them and check)

Mario Carneiro (Sep 01 2023 at 21:53):

ah, my dsimp only trick doesn't work, here's a variant that should fail but doesn't

def Ring.toNU' (r : Ring) : NURing where
  to.to := r.to.to.to
  to.assoc := r.to.to.assoc
  neg := r.neg
example (r : Ring) : r.toNU'.to = r.to.to := by
  unfold Ring.toNU'
  dsimp only

Eric Wieser (Sep 01 2023 at 21:58):

Mario Carneiro said:

wait, so your (new) claim is that it is always possible to make a DAG commute by some spanning tree?

Yes, that's my (tentative) claim

Mario Carneiro (Sep 01 2023 at 22:02):

okay, I updated the code above with an indirection to avoid the accidental eta application

Mario Carneiro (Sep 01 2023 at 22:03):

and indeed you can remove all the auxiliary fields while retaining the same behavior

Mario Carneiro (Sep 01 2023 at 22:16):

Okay, so I think it can be phrased as a graph theory problem thusly: Given a DAG and a spanning tree of that DAG, given any two points u and v in the graph there is a unique path from u to v in the spanning tree, and a unique minimum element in that path wrt the DAG order, call that lca(u, v) (least common ancestor). We can extend this to any sequence of vertices a1, ..., an via lca(a1, ..., an) = lca(a1, lca(a2, ..., an)). Then we say a pair of paths a1 -...-> an, b1 -...-> bm (with a1 = b1 and an = bm) in the original graph commutes if lca(a1, ..., an) = lca(b1, ..., bn). Does every DAG admit a spanning tree such that every pair of paths with the same source and target in the original graph commutes?

Sebastien Gouezel (Sep 02 2023 at 06:30):

Eric Wieser said:

The true statement is that something obtained only from whole maps cannot be defeq to something not obtained this way. Once both paths include an open step, then all bets are off unless you can see the full graph.

Very good point!

Mario Carneiro (Sep 02 2023 at 20:44):

Oh by the way @Eric Wieser , I checked the graph theoretic statement for all dags with up to 7 points, and about a 20% random sampling of all 8 point configurations, for commuting triangles and squares. So that's some evidence that the statement is true, maybe it's worth trying to find a proof

Mario Carneiro (Sep 02 2023 at 20:46):

there are about 2^28 dags to check for 8 points and 2^36 for 9 points, if I let it run for a few hours I could probably get all 8 point configurations

Kevin Buzzard (Sep 02 2023 at 20:48):

@Bhavik Mehta or @Yaël Dillies can you prove it?

Bhavik Mehta (Sep 02 2023 at 22:04):

Mario Carneiro said:

Okay, so I think it can be phrased as a graph theory problem thusly: Given a DAG and a spanning tree of that DAG, given any two points u and v in the graph there is a unique path from u to v in the spanning tree, and a unique minimum element in that path wrt the DAG order, call that lca(u, v) (least common ancestor). We can extend this to any sequence of vertices a1, ..., an via lca(a1, ..., an) = lca(a1, lca(a2, ..., an)). Then we say a pair of paths a1 -...-> an, b1 -...-> bm (with a1 = b1 and an = bm) in the original graph commutes if lca(a1, ..., an) = lca(b1, ..., bn). Does every DAG admit a spanning tree such that every pair of paths with the same source and target in the original graph commutes?

I think I'm misunderstanding something about the statement (although I haven't read the earlier context) - given a path, lca on the path doesn't depend on the tree, only on the DAG order, I think. So the commuting property shouldn't depend on the choice of tree either...

Jireh Loreaux (Sep 02 2023 at 22:17):

Yes, it does. The lca is the least element on the unique path in the spanning tree.

Bhavik Mehta (Sep 02 2023 at 22:25):

Jireh Loreaux said:

Yes, it does. The lca is the least element on the unique path in the spanning tree.

Ah, I was confusing the path which is an input to the lca and the path used in its computation, thanks

Yaël Dillies (Sep 03 2023 at 05:14):

/me is thinking

Damiano Testa (Sep 03 2023 at 06:43):

Mario, could you check if in your examples, a spanning tree that works can be obtained as follows:

Among all pairs of paths with same beginning and end in the DAG, find one where the length of one of the two paths is maximal.
Remove the last edge of that longest path.
Repeat until there are no more pairs of paths joining two vertices.

Does the resulting spanning tree have the right property ?

Eric Wieser (Sep 03 2023 at 09:37):

I was about to say that paths lengths can't help, because you could split any edge into a series of edges; but I would have been incorrect, as when you do this all the new edges are solid lines

Arend Mellendijk (Sep 03 2023 at 10:18):

What's the lca of a and c in a -> b <- c? Does "minimal" here depend on the chosen topological ordering?

Arend Mellendijk (Sep 03 2023 at 10:23):

Either way, here are some thoughts that may or may not be useful or correct.
This only concerns the case where the DAG has a unique initial element. Take a DFS tree (respecting the DAG order) starting at the initial element. I think this tree makes the DAG commute.

Note that in this construction the LCA of a set of nodes can be found by moving the deepest node "upwards" towards the root until they all coincide.
All triangles commute. Consider a triangle a -> b -> c. If a is found before b by the DFS then lca(a,b,c) = a. If b is found first then c is a descendant of b and the path from c to a passes through b. If c is found first then draw (squiggly) paths from each point going to the root, and consider where they can intersect given the DFS property.
Consider the transitive closure G of the DAG. Our DFS tree is still a valid DFS tree in G. Hence triangles commute in G and G commutes. Then so does our original DAG.

Mario Carneiro (Sep 03 2023 at 15:21):

Sorry, I should have specified better. I am assuming that the dag is directed, which implies for finite dags that there is a unique minimal element. The case of non-directed dags is not interesting because there are no spanning trees in that case (probably this is also a bit of underspecification; the spanning trees have to have the property that every vertex has in-degree at most 1). To have a spanning tree in that case, since there is one more vertex than edge, there must be exactly one node which has no parent, and that node must be the minimum of the dag because nothing else can point to the minimum element of the dag.

Mario Carneiro (Sep 03 2023 at 15:22):

Does "minimal" here depend on the chosen topological ordering?

Eric Wieser (Sep 03 2023 at 15:38):

I don't think the in-degree is limited to one in the real case; though I also don't know which way is in and which way is out in your description!

Ruben Van de Velde (Sep 03 2023 at 17:02):

(doesn't the "d" in "dag" stand for "directed"?)

Kyle Miller (Sep 03 2023 at 22:05):

There's also directed in the order sense, versus directed in the directed graph sense. The DAG induces an order (say if u -> v then u <= v), and it's directed if every finite subset of vertices has an upper bound. (I'm not sure which conventions are being used here, maybe Mario's using lower bound.) At least this is what I think is going on, I'm not following too closely.

Mario Carneiro (Sep 04 2023 at 03:22):

oops, I mean downward directed (in the order sense) of course

Mario Carneiro (Sep 04 2023 at 03:39):

Eric Wieser said:

I don't think the in-degree is limited to one in the real case; though I also don't know which way is in and which way is out in your description!

Hm, this does complicate matters if we are considering multiple "preferred parents" in the real case. If in the tree you have a -> b <- c, i.e. b has two parents, then it must be the case that a and c are incomparable - they have no common lower bound - in which case they have no lca. However this doesn't come up when considering pairs of paths with a common source and target, because in that case we only ever take the lca of comparable nodes (i.e. related in the transitive closure one way or the other).

Eric Wieser (Sep 04 2023 at 04:38):

I think your arrows are backwards with respect to my diagram above (so indeed in and out are swapped)

Eric Wieser (Sep 04 2023 at 04:38):

But yes, that case can happen if Semiring extends monoid and add_comm_monoid, which would both be preferred/whole maps

Mario Carneiro (Sep 04 2023 at 04:44):

I was using the same convention as Arend Mellendijk and Kyle Miller there, so a root has in-degree zero

Mario Carneiro (Sep 04 2023 at 04:44):

oh yeah your diagram is the other way around

Mario Carneiro (Sep 04 2023 at 04:45):

I guess the usual convention for trees in CS gives the root in-degree 0

Last updated: May 02 2025 at 03:31 UTC