Zulip Chat Archive

Stream: general

Topic: Discussion: NodeGraph

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Dec 13 2024 at 20:50):

Discussion thread for #announce > NodeGraph .

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Dec 13 2024 at 20:51):

This is cool! I am wondering about the extra docs in @[node doc] or node doc in ... Does this run the risk of having documentation which only some tools can display, so wouldn't show up in doc-gen for example? Why not just stick to ordinary docstrings on declarations?

Adam Topaz (Dec 13 2024 at 22:38):

I would be more than happy to get rid of the node doc functionality and rely just on docstrings if we had some way to evaluating strings at elaboration time within the docstring. For example:

def world := "world!"

/-- Hello {world} -/
def helloWorld := "Hello World!"

and have the docstring actually become "Hello world!" then that would be more than sufficient for what I am envisioning here.

Edward van de Meent (Dec 13 2024 at 23:08):

Could you give an example of a situation which benefits from this functionality?

Adam Topaz (Dec 13 2024 at 23:11):

Edward van de Meent said:

Could you give an example of a situation which benefits from this functionality?

@[node include_str "some_complicated_latex_file.tex"]
def foo := bar

Adam Topaz (Dec 13 2024 at 23:11):

If I could write

/-- {include_str "thing.tex"} -/
def foo := bar

that would be perfectly fine as far as I am concerned!

Adam Topaz (Dec 13 2024 at 23:14):

(well, the widget, infoview, docgen webpages, etc., all display markdown, so maybe replace .tex with .md in the pseudoexample above)

Kim Morrison (Dec 14 2024 at 10:31):

Further on Edward's question, could you give us a more concrete example where you'd need to evaluate in a doc-string?

Edward van de Meent (Dec 14 2024 at 11:19):

I guess that's a way to report the lean version in the docs...

Edward van de Meent (Dec 14 2024 at 11:20):

But that seems more like a gimmic than anything

Adam Topaz (Dec 14 2024 at 11:51):

I don’t have a concrete example I can share yet. The purpose of this approach of evaluating strings was because I wanted to make it possible for documentation to be written separately from the lean code itself.

One of the goals of this project is to enable people to write “formal blueprints” by liberally using sorries and axioms in the lean code itself, while also providing extensive markdown/latex documentation for each node (e.g. describing a proof or definition as one would do on paper). If the project that involves people who are not comfortable with lean syntax, then the idea is to enable such people to still make meaningful contributions by writing documentation separately.

Adam Topaz (Dec 14 2024 at 11:54):

This vision of “formal blueprints” is also why it may make sense to have some place for node documentation that’s separate from the declaration documentation. In other words, the usual docstring has its usual purpose while the additional doc can contain additional information, like a natural language proof, which doesn’t necessarily belong in a docstring

Damiano Testa (Dec 14 2024 at 21:40):

One situation where I would find it convenient to "evaluate doc-strings" is when you want to keep in sync text that is written in the module doc and text that goes as a doc-string: in the module doc you may want to describe something with the text that you would also then want to be the doc-string.

Damiano Testa (Dec 14 2024 at 21:41):

I am not sure if you could achieve this with NodeGraph as it currently is, but it may be in scope nonetheless.

Alex J. Best (Dec 15 2024 at 05:06):

I've also wanted such docstring interpolation in the past when working with lean-slides. Especially if it supports evaluating meta code it could be a very nice way to write docs that stay up to date automatically by including some type information and usage exampes in the docstring. Maybe this is all obviated by verso but it's a sort of attempt at a light weight version of literate doc writing.

Kim Morrison (Dec 15 2024 at 08:59):

Maybe it would make sense to have an RFC about interpolation in doc-comments. Or perhaps just have @David Thrane Christiansen say something how this would overlap with / relate to things happening in the verso direction.

Kim Morrison (Dec 15 2024 at 09:01):

Adam Topaz said:

like a natural language proof, which doesn’t necessarily belong in a docstring

Personally, I'd be happy if we were more enthusiastic about long doc-strings, e.g. including natural language proofs. Hard to maintain a consistent style across a huge library, but it seems like it's hard to go wrong by shifting the norms in favour of more and longer doc-strings!

Eric Wieser (Dec 15 2024 at 09:17):

I don't think docstrings are the place for proofs; if I'm autocompleting a lemma name or searching in the docs, I don't care at all how it was proved, only about what it says. If we want to store informal proofs in a machine-accesible way, then I claim they belong as some annotation inside the proof term along the lines of what NodeGraph is doing.

Julian Berman (Dec 15 2024 at 10:33):

In case it's helpful at all for comparison, Python does not allow interpolation in docstrings (f-strings cannot be a docstring) but does allow assigning docstrings dynamically (and has essentially forever, including way before it had interpolating directly into strings). I assume Lean has some equivalent of the latter already (presumably more in the form of evolving a declaration to get a new one rather than mutating the old one.)? Personally I've wanted the former a time or two in 15 years programming but overall I think it's better it doesn't exist as it's somewhat easy to get around via the latter functionality.

(I also find the long vs. not long bit amusing, people have argued for 10 years about "hey Sphinx is nice because it lets you move parts of your super long docstring to a separate file and render it into the docs later, that's great your source code is shorter!" vs. "please never use that put your long docstring right where it is, the whole point of a docstring is that it's local to the code it describes so that someone sees it and updates it right when they touch what it has to do with." I'm on team B these days, I have very long docstrings in code in many places.)

Eric Wieser (Dec 15 2024 at 18:22):

Is there anything stopping mathlib defining a /!-- .* -/ command syntax that just means "do interpolation in the docstring, then execute the underlying command"?

Eric Wieser (Dec 15 2024 at 18:22):

(obviously this isn't great for consistency of Lean, but for prototyping this seems like it would work)

Adam Topaz (Dec 15 2024 at 20:15):

I spent a few minutes trying to figure out how to make such a /!— .* -/ command while writing NodeGraph but I couldn’t figure out how to get the syntax parser to work.

Adam Topaz (Dec 15 2024 at 20:16):

For now, the doc in @[node doc] is completely optional so I’m inclined to leave things as they are in that regard

David Thrane Christiansen (Dec 16 2024 at 15:01):

The medium term plan is to transition to using Verso in docstrings, which makes them less and less like strings! This will be a slow transition, discussed in advance with the community to make sure you don't get stuck.

This would make docstring interpolation be a quick library. Here's the pseudocode I imagine:

@[docstring_role]
def interpolate : RoleExpander
  | args, inlines => do
    noArgs args -- Don't accept any arguments!
    if h : inlines.size != 1 then throwError ...
    else match inlines[0] with
      | `(inline| code{ $code }) =>
        let stx <- parse code
        let e <- elabTerm code none
        `(... $(result of evaluation of e) )
      | _ => throwErrorAt inlines[0] "Expected inline code"

Then you could write

/--
2 + 2 = {interpolate}`2 + 2`
-/
theorem two_plus_two_eq_four : 2 + 2 = 4 := rfl

Would that meet your needs, assuming the documentation for writing macros like this was good?

Adam Topaz (Dec 16 2024 at 17:11):

That sounds really great! It would certainly be more than sufficient for things like NodeGraph (although given the discussion above, it may make sense to keep the option for doc in node doc). Looking forward to seeing this in action :)

Eric Wieser (Dec 16 2024 at 17:21):

Does this mean in the long run that all the docstrings with $\LaTeX$. have to be replaced with docstrings with $`\LaTeX`.?

David Thrane Christiansen (Dec 16 2024 at 19:33):

I'd say the answer is "probably, but only gradually and with tool support, and if it's a big problem, let's talk about it". I don't have a complete design worked out in detail yet, but I certainly wouldn't want anything where the migration path for the syntax of docs wasn't "set the option, click the code action" at worst

David Thrane Christiansen (Dec 16 2024 at 19:36):

The advantage of piggybacking on code element syntax here is that we don't end up trying to parse LaTeX, which is a huge development time tarpit - we have a parser for code elements, and it works the same here, and we don't have to worry about precise escaping rules for dollar signs in math.

David Thrane Christiansen (Dec 16 2024 at 19:47):

Don't worry - I won't inflict anything on you without first finding out the real costs and benefits and doing my best to minimize the former

Yury G. Kudryashov (Dec 19 2024 at 22:23):

Docutils use :math:`x^2`

James Gallicchio (Jan 30 2025 at 19:06):

I'm necroposting, but this made me think of Rust doc strings, which can include example usage code that actually gets executed as a test during compilation :big_smile: if there's an RFC for interpolation in docstrings, maybe we can get inspiration from Rust's implementation.

Last updated: May 02 2025 at 03:31 UTC