Zulip Chat Archive

Stream: general

Topic: CFP: improved Blueprint UX


Jesse Michael Han (Dec 06 2023 at 22:40):

Hey all, the team at Morph Labs has been extremely impressed by the formalization projects (PFR, FLT-regular) that have been completed using Blueprint. That seems like the best extant solution for coordinating large-scale formalizations (cc @Kevin Buzzard). It is also, as @Terence Tao has pointed out, a promising locus for applying LLMs and proof automation. We'd like to apply our design and engineering skills to make the Blueprint experience even better and significantly lower the barrier of entry for both experienced mathematicians and formalizers to start and finish formalizations. We would love to hear from the community about ways that Blueprint could improved.

From some recent discussion in the PFR stream as well as DMs, it seems like some features that people like to see include:

  • Integration with GitHub issues or some other kind of issue tracking / todo item management system
  • Automated linking and generation of dependencies instead of \uses{} in the LaTeX
  • Faster updates on the status of nodes in the blueprint (e.g. faster CI for checking updated proofs)
  • Bidirectional integration with GitHub PRs / issues (related to the above, i.e. automatic insertion of \leanoks)
  • More expressive status indicators for blueprint nodes to indicate help wanted on e.g. informal text or formal proof

Some features that I would be excited to see would include:

  • AI-driven suggestions of informal and formal proof sketches for blueprint items which are missing either;
  • Automated creation of blueprint scaffolding from plain LaTeX
  • Support for Markdown and community-driven edits to individual sections of the blueprint, with a way of keeping a LaTeX source of truth in sync
  • Automated generation of a digest of all necessary information for getting a contributor up to speed without necessarily pulling in the entire dependency graph or forcing them to jump through the LaTeX themselves.

We're excited to expand this list and to hear what you all have to say.

Jesse Michael Han (Dec 06 2023 at 22:43):

and of course Blueprint wouldn't exist without @Patrick Massot's amazing efforts of developing it in the first place!

Johan Commelin (Dec 07 2023 at 03:11):

I think there should be an explicit source of truth for the dependency tree. And I think \uses is a pretty good way of doing that.
So have automation help with suggesting new links is of course a good idea. But I would be sad to see \uses disappear.

Patrick Massot (Dec 07 2023 at 03:22):

Automated linking and generation of dependencies instead of \uses{} in the LaTeX

Not having this is an explicit very important feature of the blueprint plugin. It would be easy to collect uses of \ref in the code source, but it would miss a lot of links.

Mario Carneiro (Dec 07 2023 at 03:23):

you could do both, no?

Patrick Massot (Dec 07 2023 at 03:23):

Faster updates on the status of nodes in the blueprint would be nice but this is only relevant from Scholze or Tao starts a project. For normal projects, the project speed does not require faster updates.

Patrick Massot (Dec 07 2023 at 03:24):

Mario Carneiro said:

you could do both, no?

It would be very brittle if people start omitting \uses because there is a \ref and then later remove the \ref.

Mario Carneiro (Dec 07 2023 at 03:25):

Maybe I don't have enough experience with blueprints but it seems strange that there would be a use that is not a \ref

Mario Carneiro (Dec 07 2023 at 03:25):

is this some abstract notion of "use"?

Patrick Massot (Dec 07 2023 at 03:25):

I think you don't have enough experience with reading math papers.

Mario Carneiro (Dec 07 2023 at 03:26):

Sure but I'm not talking about math papers I'm talking about blueprints

Patrick Massot (Dec 07 2023 at 03:26):

It is very important that math papers can be gradually turned into blueprints.

Mario Carneiro (Dec 07 2023 at 03:27):

it still seems like this would be served by \uses as an extra thing on top of \ref

Mario Carneiro (Dec 07 2023 at 03:28):

if someone removes a \ref then I guess it's not being used any more, I would not immediately assume that the link remains in the face of such a refactor

Johan Commelin (Dec 07 2023 at 03:28):

Having a script that parses \refs and suggests \uses that way is certainly possible and a good idea.
But \ref and \uses serve different purposes.

Johan Commelin (Dec 07 2023 at 03:28):

You want \uses when using a definition. But \ref is almost never used that way.

Mario Carneiro (Dec 07 2023 at 03:29):

is it safe to say that \ref is a subset of \uses?

Terence Tao (Dec 07 2023 at 03:29):

Yeah, having some tool to suggest \uses based on \refs or other contexts, but does not replace them, is what I had in mind also.

Mario Carneiro (Dec 07 2023 at 03:31):

some more formalized notion of using a definition would be useful here, like a semantic markup on the notation or so

Johan Commelin (Dec 07 2023 at 03:31):

Mario Carneiro said:

is it safe to say that \ref is a subset of \uses?

You can have forward \refs.

"We now prove a result in complex analysis. We will prove the $p$-adic analytic analogue of this in Proposition~\ref{fancy-label}."

Mario Carneiro (Dec 07 2023 at 03:32):

Are blueprints ordered in dependency order?

Terence Tao (Dec 07 2023 at 03:32):

Mario Carneiro said:

is it safe to say that \ref is a subset of \uses?

For lemmas, that would probably be the case. For definitions... I ended up removing some \uses leading back to core definitions (like the definition of entropy) because it that really cluttered up the dependency graph (this could in part because the dependency graph does not eliminate transitive edges).

Johan Commelin (Dec 07 2023 at 03:32):

Mario Carneiro said:

Are blueprints ordered in dependency order?

No, in whatever order the author organizes the LaTeX

Mario Carneiro (Dec 07 2023 at 03:32):

are they acyclic?

Johan Commelin (Dec 07 2023 at 03:32):

\uses is. But \ref doesn't have to be.

Johan Commelin (Dec 07 2023 at 03:33):

There are no formal guarantees that the \uses-graph is a DAG, atm. (@Patrick Massot correct me if I'm wrong.)

Mario Carneiro (Dec 07 2023 at 03:34):

so if you are suggesting \uses based on \ref you will have to break cycles somehow

Johan Commelin (Dec 07 2023 at 03:34):

Mario Carneiro said:

some more formalized notion of using a definition would be useful here, like a semantic markup on the notation or so

This would probably be a nice feature. But you quickly run into natural-language pretty printing issues. So I think it's great to just have a non-pretty-printing \uses.

Johan Commelin (Dec 07 2023 at 03:34):

Mario Carneiro said:

so if you are suggesting \uses based on \ref you will have to break cycles somehow

Sure, by not accepting some of the suggestions.

Mario Carneiro (Dec 07 2023 at 03:35):

well it's a bit noisy to be suggesting things that break the dag property

Johan Commelin (Dec 07 2023 at 03:36):

You could filter out all forward refs in the LaTeX ordering (possibly behind an --option to the suggestion tool).

Johan Commelin (Dec 07 2023 at 03:36):

But you can also have backrefs that aren't mathematical dependencies.

Mario Carneiro (Dec 07 2023 at 03:37):

I think you will need a \notuses in order to permanently reject suggestions or else they will keep coming back

Johan Commelin (Dec 07 2023 at 03:38):

That might be a good idea, yes.

Patrick Massot (Dec 07 2023 at 03:38):

Johan Commelin said:

There are no formal guarantees that the \uses-graph is a DAG, atm. (Patrick Massot correct me if I'm wrong.)

This isn't checked, but it would be easy to check.

Terence Tao (Dec 07 2023 at 03:38):

Tbh managing the \uses was not too bad to do manually. One could imagine one way in which casual participants of a project could help out would be to find out minor issues in the blueprint such as a misplaced \uses. Having some GUI to add or remove a \uses with a few mouseclicks (e.g., adding a \use autosuggested coming from a \ref) rather than manually editing the LaTeX code would be nice, but certainly not essential.

Mario Carneiro (Dec 07 2023 at 03:39):

How important is it that the code is latex + stuff rather than lean + latex or some other arrangement?

Patrick Massot (Dec 07 2023 at 03:39):

I'm a bit worried that the suggestion thing is over-engineered. I don't think that managing those \uses can be a significant amount of work. Updating them when Lean name change is painful but this has nothing to do with \ref.

Mario Carneiro (Dec 07 2023 at 03:40):

(embedding in lean + stuff would solve the name change issue)

Patrick Massot (Dec 07 2023 at 03:41):

I think it is important to keep LaTeX as the primary input. It allows anyone to write a blueprint.

Mario Carneiro (Dec 07 2023 at 03:41):

it could be 99% latex with some #lang thing at the beginning

Mario Carneiro (Dec 07 2023 at 03:42):

downside is that you won't be able to directly process it with a latex tool unless you get clever with a polyglot

Patrick Massot (Dec 07 2023 at 03:42):

Mario I think you overestimate what average mathematicians can do with a computer.

Mario Carneiro (Dec 07 2023 at 03:43):

What is the workflow for writing these blueprints? Throw it in overleaf / your favorite latex editor and use a no-op macro for \uses?

Patrick Massot (Dec 07 2023 at 03:44):

Yes, they come with no-op macros.

Mario Carneiro (Dec 07 2023 at 03:45):

the thing about using latex is that it seriously constrains our ability to improve the experience. I totally get the motivation, but it is a limitation

Terence Tao (Dec 07 2023 at 03:45):

I do like that the blueprint compiles both in regular LaTeX as a printed form, and in the blueprint web page. Particularly since I never bothered to install blueprint software on my local laptop, so it made it possible to check that the blueprint didn't have really egregious build errors without having to push to Github.

Mario Carneiro (Dec 07 2023 at 03:48):

not being able to check the validity of lean references is the worst part of this workflow IMO. Especially if people don't even run the blueprint software locally, it's all just down to copy paste

Patrick Massot (Dec 07 2023 at 03:49):

This limitation is a trade-off. My conjecture was that it was a good one, and the real world agrees so far.

Mario Carneiro (Dec 07 2023 at 03:50):

I think this is a good approach, but I'm not sure we have much data on the alternatives, they might be palatable too. AFAICT we're just projecting

Patrick Massot (Dec 07 2023 at 03:50):

Not being able to check validity of Lean references is a regression that happened when Utensil quickly hacked some Lean 4 support. There were no problem in Lean 3. However it did require running the blueprint locally.

Mario Carneiro (Dec 07 2023 at 03:50):

I meant for the case where you only run blueprint in CI

Terence Tao (Dec 07 2023 at 03:50):

So maybe it's not so much the \uses that should be moved off of the LaTeX platform, but rather the \lean and \leanok macros?

Mario Carneiro (Dec 07 2023 at 03:51):

and assuming all the bugs are fixed

Patrick Massot (Dec 07 2023 at 03:51):

Oh sure, it was caught by CI even if you don't run the blueprint locally.

Mario Carneiro (Dec 07 2023 at 03:51):

yeah but that's way too long an editing cycle

Patrick Massot (Dec 07 2023 at 03:52):

Moving the \lean out of the TeX code would probably be less convenient.

Mario Carneiro (Dec 07 2023 at 03:52):

but maybe the people who are allergic to lean are willing to deal with that editing cycle since it is what they are used to?

Mario Carneiro (Dec 07 2023 at 03:53):

We can check lean references even in the tex code if needed, but something has to run locally

Patrick Massot (Dec 07 2023 at 03:54):

One thing that is definitely possible if you have the Lean code and the TeX file is to check that every declaration mentioned in a \lean exists. Those can be extracted without compiling the TeX code. We can even change the no-op macros used by the regular TeX compiler into something that logs the list of Lean declarations to a file (this would be more robust).

Mario Carneiro (Dec 07 2023 at 03:54):

Is it too overengineered to have a vscode extension which does the lean reference check in-editor?

Mario Carneiro (Dec 07 2023 at 03:55):

of course I have no idea what editors the target audience is using

Mario Carneiro (Dec 07 2023 at 03:55):

so this may not be sufficient

Patrick Massot (Dec 07 2023 at 03:55):

I need to go now, but I promise I will devote time to this around Christmas.

Terence Tao (Dec 07 2023 at 03:59):

If we end up having something like a github issues page for each blueprint node in order to handle comments, claims, updates, etc., this would be the logical home to store \lean and \leanok type information, rather than the LaTeX, I think.

Terence Tao (Dec 07 2023 at 04:00):

(assuming that the issues page can somehow store metadata)

Johan Commelin (Dec 07 2023 at 04:04):

I don't think you can attach metadata in the usual sense of the word. But you can programmatically edit and parse the comments in an issue, and the topmost comment is often used to store some extra data. This is how we manage dependency chains of PRs for mathlib, for example.

Mario Carneiro (Dec 07 2023 at 04:07):

What is \leanok actually doing? Does the blueprint repo have any documentation?

Mario Carneiro (Dec 07 2023 at 04:08):

(I'm looking at https://github.com/PatrickMassot/leanblueprint, it tells me very little about how to install it)

Terence Tao (Dec 07 2023 at 04:08):

It changes the color of the dependency graph bubble and adds a checkmark to the blueprint printout, and I think that's about it.

Mario Carneiro (Dec 07 2023 at 04:09):

oh so you have to manually indicate that proofs are done?

Terence Tao (Dec 07 2023 at 04:09):

yeah

Mario Carneiro (Dec 07 2023 at 04:09):

...yeah that doesn't seem necessary, CI can calculate that itself

Jesse Michael Han (Dec 07 2023 at 04:11):

my current understanding (wearing my CS hat) is that Blueprint is a static website generator which uses a LaTeX document as a source of truth; it calculates a color coded dependency graph, an XML render of the LaTeX document with markup controlled by macros like /leanok, and can export to PDF as well

Terence Tao (Dec 07 2023 at 04:12):

\leanok is used both for proofs and for statements. The \leanok flag for statements is saying that the statement of the lemma is satisfactory and not a placeholder. That flag may be a little harder to automate, because one may have an initial spelling of the lemma in Lean that technically compiles, but might not be what one actually wants to prove

Mario Carneiro (Dec 07 2023 at 04:14):

you could indicate that on the lean side with something like @[provisional] def foo

Terence Tao (Dec 07 2023 at 04:16):

Yes, that sounds reasonable to me.

Terence Tao (Dec 07 2023 at 04:19):

One slight advantage of relying on the lean side to supply the \leanok information is that it can possibly describe partial lean formalization in which multiple Lean lemmas are assigned to one Blueprint node, but only some of them are fully formalized at a given point in time. The Latex \leanok macro can only supply one boolean bit of information in this regard.

Jesse Michael Han (Dec 07 2023 at 04:25):

even in the 1-to-1 case, we could even add a no-op tactic combinator to annotate steps with missing proofs on the Lean side, with a progress indicator (e.g. 5/8 steps discharged) updated in the blueprint node by CI

Mario Carneiro (Dec 07 2023 at 04:26):

The social organization parts seem like they would be difficult to do in a static web site, should "claims" appear on the graph?

Mac Malone (Dec 07 2023 at 04:38):

Mario Carneiro said:

it could be 99% latex with some #lang thing at the beginning

downside is that you won't be able to directly process it with a latex tool unless you get clever with a polyglot

Why not have it just be 100% LaTeX and yet still processable by Lean (e.g., via a custom frotend)? For instance, Alloy can process a pure C file as a Lean file.

Johan Commelin (Dec 07 2023 at 05:25):

@Mac Malone Because parsing LaTeX is hard. And mathematicians exploit every crazy feature/bug that LaTeX has.

Johan Commelin (Dec 07 2023 at 05:25):

And you don't want to put restrictions on the kind of LaTeX that can be used in blueprints.

Utensil Song (Dec 07 2023 at 06:27):

It's dead simple to check the validity of Lean declarations, that's what inv check did. It can also check sorries in them, I just didn't have enough spare time to finish it. It's not a reason to switch away from the LaTeX form of blueprint that mathematicians feel more comfortable than other technical tools. The division of labor is the essential idea here.

Observing the generated dep graph then modifying a \use, then just regenerate the blueprint in a few seconds (locally) is dead simple too, much easier than to fire up Lean, edit some code (like stating the lemma and writing the proof referencing another lemma) and wait for some automation completes. The point is to make the blueprint to be the source of the truth (of mathematical interpretation), not the other way around. Anything checking the Lean side came later.

\lean also has to be manually maintained, it's how to refer to a formal thing from the informal world, and it can't be automated unless we fire up LLMs to guess the correspondence, which is also over engineering.

\leanok can be automated easily, as a bot to check the Lean side, hinted by \leanin LaTeX and submit PRs to the blueprint. It doesn't need to alter the blueprint itself as well.

Utensil Song (Dec 07 2023 at 06:30):

Issues can also be created based on the nodes in the blueprint in a CI, the CI can also monitor the status of the issues, or be requested like bors comments to change the status of claims etc.

Utensil Song (Dec 07 2023 at 08:14):

Also there is actually poor correspondence between \uses and \ref. \uses highlights the hard dependency, ignores transitive dependency, and less important dependencies, so if a \ref is mentioned in the proof, it doesn't mean it should make into \uses.

ref doesn't necessarily cover all uses too. Many proofs in blueprints are completely ignored, they might be too obvious or the author just went to the Lean code to get the work done, and the blueprint is left with an empty proof.

These observations came from reading the original math paper, the formalization paper, the blueprint, and the Lean code of these real formalization projects. It's far from "writing in the programming language called Undergraduate", generally it's somewhat "sloppy" as it's informal, and it's a sketch.

If a mathematician believes another mathematician has all the details, they'll be reluctant to elaborate further, and if all goes well in the Lean side, the other mathematician will never come back to ask for more details on the blueprint. (This also explain the lack of documentation of the blueprint, once everyone involved in the infra side of the formalization project got the hang of it, and it doesn't break, no one is asking for more documents on it, and everyone is busy doing the real work, and the source code in the repo is the de fecto doc, and actually @Yaël Dillies has written a blueprint usage doc for PFR, so I can simplify it a bit, but I don't think a doc PR to leanblueprint is going to help anyone immediately so I postponed improving it)

That's why there is the distinction between \uses and \ref, and automation is only helpful to check after the fact, but that might also be annoying if mathematicians believe they have communicated the intentions clearly with the union set of both, and they don't need to be notified to make the correspondence perfect.

Yaël Dillies (Dec 07 2023 at 08:59):

(this was stolen from sphere-eversion!)

Ruben Van de Velde (Dec 07 2023 at 09:57):

I was surprised about \leanok as well; before using the blueprint, I would have imagined that lean declarations were tagged with the identifier given in the blueprint, and we'd colour the graph based on whether such a tag existed and whether its proof is sorry-free. Not sure if there's good reasons to avoid that

Yaël Dillies (Dec 07 2023 at 09:58):

What do you mean exactly by "whether the proof is sorry-free"?

Yaël Dillies (Dec 07 2023 at 09:59):

If it's the naïve understanding of "#print axioms my_decl doesn't contain sorryAx", then Kevin will never get the FLT node green during his 5 years grant!

Ruben Van de Velde (Dec 07 2023 at 09:59):

#print axioms?

Ruben Van de Velde (Dec 07 2023 at 09:59):

Should it be?

Ruben Van de Velde (Dec 07 2023 at 10:00):

I may be misunderstanding what green is

Yaël Dillies (Dec 07 2023 at 10:00):

Well, nodes in the blueprint can go green before their prerequisites. A (lemma) node corresponds to a proof, not to the completion of everything above it.

Yaël Dillies (Dec 07 2023 at 10:01):

This happens fairly often, when a prerequisite turns out to be harder than expected, or when someone writes the corollaries of a big theorem before the big theorem itself is formalised.

Ruben Van de Velde (Dec 07 2023 at 10:01):

Yeah, then I misunderstood. I suppose there's no obvious way of checking that a proof is sorry-free itself but not transitively

Yaël Dillies (Dec 07 2023 at 10:02):

I believe it's metaprogrammatically possible, but certainly not as easy as #print axioms.

Yaël Dillies (Dec 07 2023 at 10:02):

(and note that currently the blueprint doesn't even know what Lean is!)

Ruben Van de Velde (Dec 07 2023 at 10:03):

Yeah, this is what was unexpected to me

Utensil Song (Dec 07 2023 at 10:17):

Yaël Dillies said:

(and note that currently the blueprint doesn't even know what Lean is!)

I believe this is the way it should be. There should be no dependency on Lean for authoring the blueprint. Everything Lean should be checking or automatically (help to via PRs by bots from CI) maintaining the status synchronization from the Lean side afterward.

No checking is a regression caused by me, and by then I didn't realize how important and helpful it is, but it's never technically difficult or intentionally designed this way.

The Lean 3 blueprint depends on Lean to collect the declarations to point to the source code location, it will also do the checking (admittedly, only the existence of Lean proof but NOT whether it's correctly proved, that means examining whether it's directly using sorries, even run it through lean4checker /lean4lean etc. to be strict).

In Patrick's Christmas (present?) version, I'll imagine this will revive, and personally I believe the dependency on Lean can be relaxed somehow, at least for a mathematician who is editing the blueprint but doesn't want to bother with the Lean side, at least locally (this is also the scaling problem on the mathematician side).

In PFR, most people simply edits the blueprint, without building it to a web version locally, then commit it, waiting for the CI to finish, then examine the result on the website (which would be based on doc-gen4 from the Lean side), this workflow is not ideal, compared to the need to see the result of LaTeX authoring immediately. Some does build it locally without the hassle or time cost from Lean side, because my regressed version doesn't depend on Lean at all, and I consider it a feature and believes this is useful.

Ruben Van de Velde (Dec 07 2023 at 10:20):

This should be possible in an approach similar to the one I hallucinated too; you just wouldn't get the dependency graph

Utensil Song (Dec 07 2023 at 10:21):

The dependency graph purely depends on human annotated \uses, not Lean.

Ruben Van de Velde (Dec 07 2023 at 10:22):

I feel like we're talking past each other, and I'm not likely to be able to work on this, so I'll just shut up now :)

Mario Carneiro (Dec 07 2023 at 10:28):

Yaël Dillies said:

I believe it's metaprogrammatically possible, but certainly not as easy as #print axioms.

It's actually simpler to do this than to go transitively. It's a one liner, just find the contained constants and see if sorryAx is mentioned

Mario Carneiro (Dec 07 2023 at 10:30):

I was definitely imagining that one would check whether the declaration itself uses sorry, looking transitively would be pointless

Mario Carneiro (Dec 07 2023 at 10:30):

It's also possible to see whether a definition non-transitively uses sorry without metaprogramming: lean will throw a warning on every declaration that directly uses sorry

Mario Carneiro (Dec 07 2023 at 10:34):

Utensil Song said:

Yaël Dillies said:

(and note that currently the blueprint doesn't even know what Lean is!)

I believe this is the way it should be. There should be no dependency on Lean for authoring the blueprint. Everything Lean should be checking or automatically (help to via PRs by bots from CI) maintaining the status synchronization from the Lean side afterward.

The blueprint software itself should definitely know about lean (and ideally would be implemented in lean itself), but authoring the blueprint can be done by a plain tex installation from my understanding. I don't see a need to involve python though, I can only assume this is just because it's what Patrick is used to

Utensil Song (Dec 07 2023 at 10:43):

The color of the nodes also doesn't depend on Lean, but also human annotated \leanok, but this can also be annotated by a Lean aware bot. The status of a node should be persisted in a human readable way, ideally just in the LaTeX like now (instead of some binary cache like the Python pickle in Lean 3 blueprint, or a too technical JSON manifest), so a blueprint can be built without involving Lean.

Mario Carneiro (Dec 07 2023 at 10:45):

why "a blueprint can be built without involving Lean"? Specifically, you need to use some language to implement this and I don't see why not to use lean

Mario Carneiro (Dec 07 2023 at 10:46):

it can be done without building the project, sure

Utensil Song (Dec 07 2023 at 10:46):

I do agree that if blueprint software is written in Lean, the benefit is that it's a built executable, so there would be no hassle with installing Python like now.

Mario Carneiro (Dec 07 2023 at 10:47):

persisting the \leanoks in a json file written by CI seems like the most natural option

Yaël Dillies (Dec 07 2023 at 10:48):

This has the advantage of being machine-readable and allow one to "replay" the dependency graph. cc @Julian Berman

Utensil Song (Dec 07 2023 at 10:49):

In order for the model to scale, the authors of the blueprint should include some mathematicians who don't need to install Lean (or Python).

Mario Carneiro (Dec 07 2023 at 10:49):

they can just edit the tex

Mario Carneiro (Dec 07 2023 at 10:50):

but I don't think it is unreasonable to say you need to install lean (or python) to run blueprint locally

Utensil Song (Dec 07 2023 at 10:52):

If the status of the nodes are persistented in the Github repository, a mathematician can just get the repo, then start building and editing the blueprint with their working LaTeX installation, this would be ideal.

Utensil Song (Dec 07 2023 at 10:52):

including the dep graph

Mario Carneiro (Dec 07 2023 at 11:00):

is latex the one rendering the dep graph?

Mario Carneiro (Dec 07 2023 at 11:00):

I thought it was just handling the mathematics content

Yaël Dillies (Dec 07 2023 at 11:02):

No indeed, the rendering is done by a python script.

Mario Carneiro (Dec 07 2023 at 11:06):

I really hope we don't go the route of getting latex to do the rendering "for the convenience of users"

Utensil Song (Dec 07 2023 at 11:22):

Anyway, observing the PFR project makes me believe that involving more Lean in the blueprint software, is the wrong way. At least, if Lean is indeed involved, it should not be noticeable, like installing Lean, running Lean, writing in Lean DSL, are not ideal, and should be avoided. Of course, I might be wrong (in this role play trying to take my CS hat off) and mathematicians are more resilient (I can't recall who, but one told me that they are quite resilient after the torture of LaTeX :joy: ).

Mario Carneiro (Dec 07 2023 at 11:25):

I'm not sure exactly how you derived that from the PFR project, everyone involved was quite adept at lean and probably significantly less so wrt python

Utensil Song (Dec 07 2023 at 11:27):

Mario Carneiro said:

is latex the one rendering the dep graph?

If the dep graph is persisted, not only the node status, but also the dot file generated by graphviz (which is just a tool easily installed, no Lean or Python involved). LaTeX can render it via dot2texi, a LaTeX package, free of dependencies on Python.

Mario Carneiro (Dec 07 2023 at 11:28):

We should certainly lean on the strengths of the target audience, and keep things familiar, but a lot of what is being discussed is backend tech, it doesn't matter what that is written in

Mario Carneiro (Dec 07 2023 at 11:29):

the frontend is currently 100% latex and that doesn't seem to be negotiable AFAICT

Utensil Song (Dec 07 2023 at 11:29):

Yes, I'm not against about how it's implemented, only believing that authoring a blueprint should not be writing a Lean DSL.

Mario Carneiro (Dec 07 2023 at 11:29):

I would like to know exactly how crazy the latex actually is in practice though

Mario Carneiro (Dec 07 2023 at 11:30):

I realize it can in principle be arbitrarily complex but we have a few data points now, it might be interesting to see what is actually required of the latex

Sky Wilshaw (Dec 07 2023 at 11:31):

Just throwing my two cents out here, I think that using Python isn't a great solution, even for backend code. While I was setting up the infrastructure for the Lean3 version of Con(NF), I found it quite difficult to get all of the Python dependencies to work properly. In particular, mathlibtools would only work with a version of Python that isn't available via apt!

Sky Wilshaw (Dec 07 2023 at 11:32):

This isn't necessarily a problem with Python, of course, but I think Lean 4 tends to be better with version management.

Mario Carneiro (Dec 07 2023 at 11:32):

heh...

Mario Carneiro (Dec 07 2023 at 11:32):

for some sense of "better"

Sky Wilshaw (Dec 07 2023 at 11:33):

Lean versions only break if you lake update them, python versions break whenever the apt maintainers decide to.

Utensil Song (Dec 07 2023 at 11:34):

It's at least as crazy as how the python part of leanblueprint can't handle, it partly relies on a working LaTeX installation. Also, for example, if there is something like a commutative diagram involved, it completely relies on the LaTeX package to generate the diagram.

Patrick Massot said:

It is very important that math papers can be gradually turned into blueprints.

Mauricio Collares (Dec 07 2023 at 11:34):

Sky Wilshaw said:

Lean versions only break if you lake update them, python versions break whenever the apt maintainers decide to.

Weird, I can't find the Nix emoji in Zulip's list. A snowflake will have to do.

Sky Wilshaw (Dec 07 2023 at 11:35):

Do commutative diagrams even get typeset properly on the HTML side?

Utensil Song (Dec 07 2023 at 11:37):

Sky Wilshaw said:

Do commutative diagrams even get typeset properly on the HTML side?

It can, the result is a SVG diagram which renders perfectly and natively on the HTML side.

Sky Wilshaw (Dec 07 2023 at 11:38):

Ah nice! Is this for xypic or tikzcd?

Anand Rao Tadipatri (Dec 07 2023 at 11:40):

Regarding meta-programmatically checking whether a theorem is sorry-free, I have written a short script on @Yaël Dillies's fork of LeanBlueprint to extract the declarations from a file together with their line numbers and whether they contain a sorry. This uses InfoTrees and is loosely based on LeanInk.

The main limitation of this code is that it operates on individual files, while the blueprint requires information about specific constants in the environment; but maybe with some caching it wouldn't be too inefficient to find the file containing a declaration using Environment.getModuleIdxFor?and then run this script on it.

Utensil Song (Dec 07 2023 at 11:40):

Sky Wilshaw said:

This isn't necessarily a problem with Python, of course, but I think Lean 4 tends to be better with version management.

Lean is less stable on version management, especially when the latest Mathlib is involved, which is essential for formalization projects. That's one of the reasons I was voicing out in this (long) thread.

Utensil Song (Dec 07 2023 at 11:42):

Sky Wilshaw said:

Ah nice! Is this for xypic or tikzcd?

Yes. leanblueprint (and its backbone plastex) uses LaTeX to generate a PDF of the diagram (no matter how crazy in LaTeX), covert it to SVG, then display it in the web version.

Utensil Song (Dec 07 2023 at 11:50):

Anand Rao Tadipatri said:

Regarding meta-programmatically checking whether a theorem is sorry-free, I have written a short script on Yaël Dillies's fork of LeanBlueprint to extract the declarations from a file together with their line numbers and whether they contain a sorry. This uses InfoTrees and is loosely based on LeanInk.

The main limitation of this code is that it operates on individual files, while the blueprint requires information about specific constants in the environment; but maybe with some caching it wouldn't be too inefficient to find the file containing a declaration using Environment.getModuleIdxFor?and then run this script on it.

In principle, it's rather easy in Lean meta programming to collect things from the environment, all the way down to the imported dependencies, beyond the one file limit. I had an attempt earlier and it's feasible. Lean experts already involved in PFR can certainly make one more elegantly. Also it doesn't need to depend on the now less maintained LeanInk, it's also feasible that doesn't require the caching idea.

Utensil Song (Dec 07 2023 at 11:59):

This information is already available to Lean when one opens a Lean file in VS Code, so one should expect these information collected in a few seconds (unlike how long it takes doc-gen4 to run, it did much much more), provided that the Mathlib cache (olean files) is available.

Sky Wilshaw (Dec 07 2023 at 12:01):

Utensil Song said:

Lean is less stable on version management, especially when the latest Mathlib is involved, which is essential for formalization projects.

I hope that this won't be as big of a problem when only Std is involved.

Anand Rao Tadipatri (Dec 07 2023 at 12:34):

Utensil Song said:

In principle, it's rather easy in Lean meta programming to collect things from the environment, all the way down to the imported dependencies, beyond the one file limit. I had an attempt earlier and it's feasible. Lean experts already involved in PFR can certainly make one more elegantly. Also it doesn't need to depend on the now less maintained LeanInk, it's also feasible that doesn't require the caching idea.

For extracting the line numbers and file positions of declarations, the information contained in the environment may not be enough (but I could be wrong). I would like to avoid the caching idea as well and have something more direct.

As you mention, the relevant information is available in the VS Code session (for example, Ctrl+Click-ing on definitions uses file position information), so there may be a quick way of accessing that. It may be possible to extract details about declarations in imported dependencies through the InfoTree, since that is where a lot of the information about a document generated during elaboration is stored, but I'm not aware of a way.

Utensil Song (Dec 07 2023 at 12:42):

Anand Rao Tadipatri said:

Utensil Song said:

In principle, it's rather easy in Lean meta programming to collect things from the environment, all the way down to the imported dependencies, beyond the one file limit. I had an attempt earlier and it's feasible. Lean experts already involved in PFR can certainly make one more elegantly. Also it doesn't need to depend on the now less maintained LeanInk, it's also feasible that doesn't require the caching idea.

For extracting the line numbers and file positions of declarations, the information contained in the environment may not be enough (but I could be wrong). I would like to avoid the caching idea as well and have something more direct.

As you mention, the relevant information is available in the VS Code session (for example, Ctrl+Click-ing on definitions uses file position information), so there may be a quick way of accessing that. It may be possible to extract details about declarations in imported dependencies through the InfoTree, since that is where a lot of the information about a document generated during elaboration is stored, but I'm not aware of a way.

Some more information related was discussed in this thread. Note that olean contains the code position and the module name of the file, but not the file name since it's platform specific, and determined by lake not lean. Also there are naming differences between the directory name, the Github repo name, and the package name, which is a pain point. From the thread, you can see it's very much feasible.

But right now I'm only happily waiting for Patrick's Christmas present package :wink:

Johan Commelin (Dec 07 2023 at 13:39):

Sky Wilshaw said:

Just throwing my two cents out here, I think that using Python isn't a great solution, even for backend code. While I was setting up the infrastructure for the Lean3 version of Con(NF), I found it quite difficult to get all of the Python dependencies to work properly. In particular, mathlibtools would only work with a version of Python that isn't available via apt!

I want to repeat once more that it is extremely unlikely that Python will be replaced for the backend work. Turning LaTeX into HTML is very non-trivial. And PlasTeX is an awesome solution for that.

Johan Commelin (Dec 07 2023 at 13:40):

Mario Carneiro said:

I would like to know exactly how crazy the latex actually is in practice though

Our current blueprints have probably been pretty mild. But mathematicians typically load lots of custom macros. And when you add diagrams into the mix, it gets nasty quickly.

Utensil Song (Dec 07 2023 at 13:50):

Mario Carneiro said:

I'm not sure exactly how you derived that from the PFR project, everyone involved was quite adept at lean and probably significantly less so wrt python

To be clear, because that's probably the upper limit of the count of Lean experts can work on a formalization project concurrently on earth. And yes, they are adept at proving with Lean, how many are they good at writing Lean tactics? How many are they good at writing Lean for general programming? Particularly handling issues that involve Lean internals and Lake internals? Now divide this number by the formalization project counts to meet the goal "formalizing math research in real time" (with tooling support requests coming from them). That's why the model doesn't scale, yet.

This is why I reach the conclusion that the less Lean in the tooling for the blueprint, the better, at least this is true for the status quo of Lean at general programming. I know a dozen of programming languages, and Lean (as a general purpose programming) is a few levels more difficult to master (or get things done due to the lack of packages in this regard) than a dead simple and ecosystem-rich programming language like Python. (Not that I'm upvoting for Python, just comparing the complexity level of learning, at least in the midterm. In the long run, Lean choice is going to be better, but does it need to happen now?)

Terence Tao (Dec 07 2023 at 16:58):

Patrick Massot said:

It is very important that math papers can be gradually turned into blueprints.

I just want to say that when I started working on the blueprint, I first wrote it from scratch (or more precisely, from an extremely bare bones template that @Yaël Dillies provided with like three dummy lemmas), consulting the original paper but writing things out again by hand, but about halfway through I realized that I could cut and paste (and then edit) large portions of the actual paper to create the blueprint, which was much faster. One could imagine eventually having some AI tool to take a random paper in LaTeX and make a first pass at chopping it up into a blueprint; with enough tools to edit dependency graphs in ways that don't require manually moving around dozens of \uses, this could potentially be faster still than the cut and paste method.

Terence Tao (Dec 07 2023 at 18:02):

One other thought: for high profile projects especially, I imagine there will be a large number of casual participants who may not be motivated enough to download lean, github, etc. and make a formal Lean contribution, but they could still help with for instance pointing out a typo in the dependency graph or blueprint. Having a mechanism in the blueprint for casual users to make these suggestions, and have administrators be able to easily act on them, would be helpful and also be a way to broaden participation and exploit the "long tail" of such projects. This may well be better than trying to develop overengineered automated tools to do this sort of minor blueprint maintainance. It's possible that the proposal of having a github issues page for each node already goes a long way towards doing this.

Terence Tao (Dec 07 2023 at 18:31):

Another thought in the same vein: it might be nice to also have support for MWE Lean requests that are self-contained enough to place in the Lean playground rather than in the full Github repository, again so that casual participants may be able to contribute. A recent organically emerging example of this was in https://leanprover.zulipchat.com/#narrow/stream/412902-Polynomial-Freiman-Ruzsa-conjecture/topic/Balog-Szemer.C3.A9di-Gowers/near/405989888 when I needed one simple self-contained lemma as part of filling in the final sorry to PFR, and @Johan Commelin was able to supply it in a few minutes. (See also https://leanprover.zulipchat.com/#narrow/stream/412902-Polynomial-Freiman-Ruzsa-conjecture/topic/Balog-Szemer.C3.A9di-Gowers/near/405690547 for a significantly larger, but still self contained, lemma that required quite a bit more effort to fill, but could still be done completely independently of the rest of the PFR project.) But perhaps again the issues page already handles this, if it is easy to support MWEs within the comment system of an issue page.

Mario Carneiro (Dec 08 2023 at 08:52):

Agreed that problems that don't require the context of the repo are much easier to jump in and do on short notice

Mario Carneiro (Dec 08 2023 at 08:56):

I think there is a lake request in here too @Mac Malone : imagine if someone could have a MWE example which just contains import PFR.Foo and I paste this into my project (which is not PFR related) and it just gets the project data and makes it work (possibly with a prompt but not more than that)? That would be a true portable MWE

Mario Carneiro (Dec 08 2023 at 08:57):

(I'm aware there are many blockers to such a thing working, but it sounds like a good flow to me)

Terence Tao (Dec 08 2023 at 16:04):

Mario Carneiro said:

I think there is a lake request in here too Mac Malone : imagine if someone could have a MWE example which just contains import PFR.Foo and I paste this into my project (which is not PFR related) and it just gets the project data and makes it work (possibly with a prompt but not more than that)? That would be a true portable MWE

I don't know if this is way too resource hungry to be realistic, but another option is to somehow attach a custom Lean playground to each formalization project that has access to the master build and can import from it just as easily as from Mathlib. (Such a playground could then also have additional features to integrate it with the blueprint / Github issues page, so that if a casual contributor manages to complete a MWE in the local playground, they could automatically send it back for review to the page issuing that MWE with a click of a button.)

Terence Tao (Dec 08 2023 at 16:13):

With such a framework, one could also automatically convert an entire node formalization request (i.e., proving the full proposition(s) related to a Blueprint node) to a MWE: by following a link from that node, the contributor would be sent to the custom playground in which the proposition to be proven is stated as an example with a sorry (with all relevant prior import, variable, def, abbrev, open, etc. context added automatically; with a more intelligent system, one might also include (or maybe just #check) explicit versions of all the propositions that this node \uses), and the contributor can then try to prove it using whatever imports from the project and from Mathlib as they deem fit. (One trivially handle this task by importing the file containing the original sorried proposition and just exact that proposition, or by importing more advanced results that rely on the proposition, so some human review (or some fancy linter) may be needed to avoid frivolous circular logic submissions.)

Mario Carneiro (Dec 08 2023 at 22:58):

I think this is essentially what you already get with gitpod, but maybe someone knows what additional work would be needed to have the right cache available

Kevin Buzzard (Dec 08 2023 at 23:08):

One issue with gitpod is that it takes several minutes to be up and running.

Terence Tao (Dec 08 2023 at 23:32):

Kevin Buzzard said:

One issue with gitpod is that it takes several minutes to be up and running.

The PFR project front page https://teorth.github.io/pfr/ ostensibly has a gitpod button in it, but I have not yet been able to get it to work.

Utensil Song (Dec 09 2023 at 04:06):

It seems that proposals so far fall into 4 categories:

1. Authoring and organizing the informal part of the blueprint better

For example, with the help of LLMs to chop a math paper into nodes in a blueprint, possibly annotate the nodes with dependencies hinted be \ref or the text, the resulting dep graph could be making too much connections, so an algorithm can eliminate some of these edges, particularly the transitive ones (like \uses{A} % ,B, C } so mathematicians can do less copy-pasting but focusing on elaborate on necessary details, adding intermediate lemmas, and organize the dep graph etc. Preferably this can be done incrementally as there might be more than one math paper involved, it would be ideal to add a new paper without messing up existing blueprint. This is where natural language comprehension and generation technologies like LLMs shine, together with some deterministic algorithms.

This part is preferably based on the original vision by Patrick, and reuse most of the current implantation in Python/LaTeX. In the long run (but ROI-wise not now), they can be all rewritten in Lean, and it's very feasible, e.g. Lean can have an FFI binding to tectonic written in Rust which has great compatibility with existing LaTeX paper no matter how crazy the LaTeX might be, and it aims to have a Web output, but it hasn't got to a point that can mirror Web and PDF so well like plastex (the base of leanblueprint) did so far. The AI part of the implementation can also be done in Python or C++ that also binds to Lean via FFI, like LeanInfer did.

Utensil Song (Dec 09 2023 at 04:07):

2. General project management

There could be an issue of each of the blueprint nodes, to track the claims, milestones, tags etc. And it will also allow casual participants to make suggestions.

This part can be implemented by Github issues, Github actions, bots, Github projects, and has tight integration of the existing workflow around the Github repo. The bot makes use of the metadata leanblueprint deduced from the LaTeX source, raise these issues, track their status, and reflect them in to the dep graph, possibly with extra metadata persistence that the bot is allowed to commit into the repo automatically or via PRs if we wish to keep it simple as the statistically generated Web pages that can be easily hosted on Github Pages or test locally.

Utensil Song (Dec 09 2023 at 04:08):

3. Automatically check and sync Lean status back to blueprint

This will need to check whether Lean has the statement declaration, has a proof without sorry, passes strict checking by lean4check/lean4lean or external checkers etc. and sync the state back to blueprint.

This part could be mostly written in Lean via meta-programming, and reuse the syncing mechanisms in 2. I still prefer this part to be organized as PRs as humans should decide whether a node actually turns green (maybe there are issues of the proof, or there should be more \uses etc.), but this can also be done automatically like described in 2.

I wish to emphasize the importance of that this information is stored in the repo as human readable and editable files, instead of binary data or technical format stored on some servers or databases, this will make the blueprint reproducible by anyone who has the code, without the need to access anything hosted on Github or any other servers. This is how a digitalized and formalized piece of Math can be kept in a long period of time.

Utensil Song (Dec 09 2023 at 04:08):

4. How to break things down on the Lean side into MWEs and make casual participants can jump in to help

This is actually quite essential. Gitpod or Codespaces still takes too long to fire up. A MWE and a link to working on it on live.lean-lang.org is instant. This makes contributions and discussion more efficient to a new level. The idea of making live.lean-lang.org to be able to import from ongoing formalization projects is feasible, at least this works for a limited amount of formalization projects, like the long running FLT project.

Extracting a not necessarily too minimal MWE with the help of Lean meta-programming is also feasible. We already have things like extract_goal , imagine a help tactic, which collects everything needed from the environment, delaborate it into a MWE with a wide import like import Mathlib + import PFR but minimal code for that specific goal.

One can further imagine that extracting a need of tactics, like giving a few example proofs, then Lean can automatically make an MWE, so a tactic expert can start developing one with the MWE as test cases. It's quite important to develop tactics to a specific math area , just like ring, field_simp etc. but they are just for undergraduate level abstract algebra, imagine a tactic called sheaf or cat or something less general, they can be as easy as just aesop plus some rulesets but greatly helpful to proof authoring and reproduce the efficient informal reasoning logic flow (although it might be less friendly to, say, a graduate student to drill down to the machinery of the proof). Developing tactics with the help from LLMs is one more thing to imagine, it would be nice to have more training towards this direction.

Utensil Song (Dec 09 2023 at 04:37):

(disclaimer: unfortunately I can only volunteer to do a tiny fraction of all these due to the limitation of my Lean time, just discussing the possible big picture, feasibility, and some personal take on them here).

Siddhartha Gadgil (Dec 09 2023 at 15:04):

(deleted)

Scott Morrison (Dec 09 2023 at 17:29):

Terence Tao said:

Kevin Buzzard said:

One issue with gitpod is that it takes several minutes to be up and running.

The PFR project front page https://teorth.github.io/pfr/ ostensibly has a gitpod button in it, but I have not yet been able to get it to work.

It works for me in gitpod, although the link for the gitpod button at https://teorth.github.io/pfr/ needs to be https://gitpod.io/#https://github.com/teorth/pfr rather than the current link to the github repo itself!

Utensil Song (Dec 09 2023 at 17:56):

The link in the project page is incorrect, but the one in README is correct, and works for me (for Lean, not blueprint):

image.png

Terence Tao (Dec 09 2023 at 21:45):

Utensil Song said:

The link in the project page is incorrect, but the one in README is correct, and works for me (for Lean, not blueprint):

image.png

OK, hopefully it is fixed now.

As a separate (and rather easy) feature request - I wonder if there is a way to add an additional coloring to the dependency graph (maybe some sort of "really green") that indicates not only that a node has been locally formalized, but that all of the nodes it depends on transitively are also formalized? Right now for instance we are adding a few extensions to the PFR project, which has temporarily uglified our dependency graph, but it would be nice to tell at a glance that the original result PFR is still completely proven (or at least, that blueprint thinks that it is).

Utensil Song (Dec 10 2023 at 06:33):

Another issue is that all nodes with ' in their name are not responding to clicks (i.e. showing the statement in an overlay at the top).

Alok Singh (S1'17) (Dec 10 2023 at 07:17):

Terence Tao said:

Utensil Song said:

The link in the project page is incorrect, but the one in README is correct, and works for me (for Lean, not blueprint):

image.png

OK, hopefully it is fixed now.

As a separate (and rather easy) feature request - I wonder if there is a way to add an additional coloring to the dependency graph (maybe some sort of "really green") that indicates not only that a node has been locally formalized, but that all of the nodes it depends on transitively are also formalized? Right now for instance we are adding a few extensions to the PFR project, which has temporarily uglified our dependency graph, but it would be nice to tell at a glance that the original result PFR is still completely proven (or at least, that blueprint thinks that it is).

as a slightly fancier thing on top, use opacity to show what fraction of the transitive deps are formalized?

Utensil Song (Dec 10 2023 at 07:37):

Terence Tao said:

Utensil Song said:

The link in the project page is incorrect, but the one in README is correct, and works for me (for Lean, not blueprint):

image.png

OK, hopefully it is fixed now.

As a separate (and rather easy) feature request - I wonder if there is a way to add an additional coloring to the dependency graph (maybe some sort of "really green") that indicates not only that a node has been locally formalized, but that all of the nodes it depends on transitively are also formalized? Right now for instance we are adding a few extensions to the PFR project, which has temporarily uglified our dependency graph, but it would be nice to tell at a glance that the original result PFR is still completely proven (or at least, that blueprint thinks that it is).

Filed as part of PatrickMassot/leanblueprint#9 .

Utensil Song (Dec 10 2023 at 07:48):

Utensil Song said:

Filed as part of PatrickMassot/leanblueprint#9 .

I'm a little confused by "temporarily uglified". pfr is very green, both boarder and background:

image.png

Only pfr' is having a green border, and a blue background (because at least entropy-pfr' is not ready), which seems to be working as intended:

image.png

How exactly is a more green node defined? (You may refer to other definitions in PatrickMassot/leanblueprint#9 or the Legend of dep graph (click on top-left corner).

Shreyas Srinivas (Dec 10 2023 at 11:40):

  1. I hope blueprint remains fully open source. This also means that, like now, the possibility of self hosting remains.
  2. The dependency graph is currently a static pdf. It would be nice if it were web based. That's to say one could customise the UI of the graph to focus on various nodes or types of nodes by clicking on them or ticking off checkboxes. This might also allow a 3D layout and allow nodes to be pushed into the background if the user doesn't wish to see them.

Ruben Van de Velde (Dec 10 2023 at 11:43):

What do you mean by static pdf? https://teorth.github.io/pfr/blueprint/dep_graph_document.html

Utensil Song (Dec 10 2023 at 11:51):

Yes, it's web-based and interactive. Actually, I was planning to make it available to the PDF too (but it seems to be a much less needed use case).

Focusing on specific types of nodes or some sort of filtering is feasible. Just need to see what are the exact frequent use cases because there are so many ways to do it.

One feature that's already available is that it supports displaying the dep graph by sections, so it would have less nodes on each graph. One can always write a summary section to get the big picture and navigate to the graphs of separate sections to check the local dependencies.

Utensil Song (Dec 10 2023 at 12:00):

Shreyas Srinivas said:

  1. I hope blueprint remains fully open source. This also means that, like now, the possibility of self hosting remains.

Same hope here, especially if Morph Labs have enhanced it significantly (like, most of that have been discussed here become a reality) .

Shreyas Srinivas (Dec 10 2023 at 12:18):

Utensil Song said:

Focusing on specific types of nodes or some sort of filtering is feasible. Just need to see what are the exact frequent use cases because there are so many ways to do it.

Suppose I click on a node, it would be nice to see only the dependencies of that node in focus and foregrounded (to avoid the convoluted arcs of the graph) while other nodes get pushed to the background and get de-emphasised

Shreyas Srinivas (Dec 10 2023 at 12:23):

Another improvement would be the way the initial rendering of the graph works. Currently one sees the entire graph with all the nodes in it with teeny tiny bits of text. It is really hard to tell where to even begin navigating through it. Ideally the top level theorem should be front and center and in reasonable font size, with dependencies less focussed and placed behind. If required, a 2D layout like the current one could be shown as well.

Utensil Song (Dec 10 2023 at 12:48):

Shreyas Srinivas said:

Utensil Song said:

Focusing on specific types of nodes or some sort of filtering is feasible. Just need to see what are the exact frequent use cases because there are so many ways to do it.

Suppose I click on a node, it would be nice to see only the dependencies of that node in focus and foregrounded (to avoid the convoluted arcs of the graph) while other nodes get pushed to the background and get de-emphasised

This reminds me of @Eric Wieser 's mathlib-import-graph which has dynamic layout and supports highlight/de-highlight etc. which used sigmajs.

Patrick Massot (Dec 10 2023 at 16:10):

Shreyas Srinivas said:

Another improvement would be the way the initial rendering of the graph works. Currently one sees the entire graph with all the nodes in it with teeny tiny bits of text. It is really hard to tell where to even begin navigating through it. Ideally the top level theorem should be front and center and in reasonable font size, with dependencies less focussed and placed behind. If required, a 2D layout like the current one could be shown as well.

There is already a solution to that: having several graphs, just like in LTE. In the PFR case the extension should have gotten a new chapter in the blueprint and their own graph. There is simply nothing you can do to get a huge graph where you can read labels, unless you force all your user to have 50in screen.

Yaël Dillies (Dec 10 2023 at 16:12):

The problem with the multiple graphs solution is that you lose the dependencies that go across several graphs. I would much prefer to be able to filter out nodes on a single graph.

Patrick Massot (Dec 10 2023 at 16:13):

What filtering? In the case at hand, all extension would still depend on PFR, right?

Yaël Dillies (Dec 10 2023 at 16:15):

Well typically you could have one toggle for each chapter. If you turn one of the chapters off, all relevant nodes disappear.

Yaël Dillies (Dec 10 2023 at 16:16):

A naïve implementation of that would create 2n2^n dependency graphs, where nn is the number of chapters, but hopefully we don't need to see all possible filterings ever.

Utensil Song (Dec 10 2023 at 16:19):

or rather, collapse into a node representing that chapter

Yaël Dillies (Dec 10 2023 at 16:19):

That could also work.

Utensil Song (Dec 10 2023 at 16:24):

It's very feasible, and I've seen many ones like that, but they always feel less stable (re-layout when some nodes are gone), and sometimes dizzy to navigate. Personal preference is that the author could organize things into sections for the readers, so drilling down become natural.

Yaël Dillies (Dec 10 2023 at 16:27):

Wait, what blueprints have you seen that work like that?

Shreyas Srinivas (Dec 10 2023 at 16:40):

Is there a way to get the raw dependency data of the pfr project in json format?

Utensil Song (Dec 10 2023 at 16:40):

Many such dependency graphs. Once auto generated without human intervention, it's usually messy for more than dozens of nodes. It only gets better if some higher level hierarchy organize them. There is one (messy) example up there, the sigmajs one.

Utensil Song (Dec 10 2023 at 16:44):

Shreyas Srinivas said:

Is there a way to get the raw dependency data of the pfr project in json format?

It can be done within a blink of the eye from the command line: https://stackoverflow.com/a/51443514/200764 if you know where to look for the dot source code for the dep graph (hint: the bottom).

The blueprint can in principle provide this JSON by default, so people can render it to their liking.


Last updated: Dec 20 2023 at 11:08 UTC