Zulip Chat Archive

Stream: mathlib4

Topic: add `import Leanbin.Init.Default` in synported files


Scott Morrison (Oct 20 2021 at 03:15):

I think when running synport we need to add import Leanbin.Init.Default add the beginning of any files mathlib3 files that don't have other imports.

e.g. in Data/Options/Defs.lean, the synported file has errors on HasMem, Option.eq_none_of_is_none, and many others, which are resolved by adding this import.

Scott Morrison (Oct 20 2021 at 03:18):

@Mario Carneiro, @Daniel Selsam, I'm happy to take your advice on the best balance between me attempting things like this vs one of you. Obviously I will be inefficient at first and need some hand-holding, but I'm also happy to invest in hacking on binport/synport. (Daniel has already given me a quick walkthrough.)

Mario Carneiro (Oct 20 2021 at 04:34):

That was actually deliberate. The idea was to have the files look as similar to the original as possible, so ensuring the appropriate imports should be handled by the user in the case where there are no imports.

Mario Carneiro (Oct 20 2021 at 04:37):

That is, in a case like Data.Options.Defs we probably want to continue having no imports and instead rethink what goes in the file

Mario Carneiro (Oct 20 2021 at 04:38):

or import Mathlib.Init or something

Scott Morrison (Oct 20 2021 at 04:51):

Returning to this thread, I'm unclear how to move forward, however. Taking the synported Data/Options/Defs.lean, for example, what are we going to do about the missing HasMem, if not add import Leanbin.Init.Default?

Mario Carneiro (Oct 20 2021 at 04:52):

I think we want to import a file in mathlib4 that defines HasMem, or else move the code that relates option and HasMem to another file

Scott Morrison (Oct 20 2021 at 04:53):

Okay. Have you guys already decided the best way to add such imports? That is, suppose HasMem is in some file in mathlib4. What will be the actual mechanism to add an import line to the synported Data/Options/Defs.lean?

Scott Morrison (Oct 20 2021 at 05:06):

That is, will there be some instruction (where?) that tells synport what to add? Or will there be a "postproduction" step where we patch synported files with human-written changes?

Mario Carneiro (Oct 20 2021 at 05:07):

There will be a post production step, that's the human porting part

Mario Carneiro (Oct 20 2021 at 05:08):

The machine ported version exists already, that's the generated binport file

Scott Morrison (Oct 20 2021 at 05:10):

Okay, but the "human porting part" is going to have to exist for a long time concurrently with re-running synport, while we fix all the problems in synport. So will we e.g. generate a bunch of patches to apply after synport has run, so the pipeline becomes "run synport" then "apply all patches"?

Scott Morrison (Oct 20 2021 at 05:11):

e.g. if I wanted to add an import to Data/Options/Defs/leanpointing to some mathlib4 file, how would I do that "persistently", i.e. so after synport gets updated in some way that import is still in place?

Mario Carneiro (Oct 20 2021 at 05:11):

Any patching mechanism that is fully automatic and uniform can be done by synport itself. Anything that comes afterwards is presumably something that needs 1-1 interaction

Scott Morrison (Oct 20 2021 at 05:12):

Great.

Mario Carneiro (Oct 20 2021 at 05:13):

Scott Morrison said:

e.g. if I wanted to add an import to Data/Options/Defs/leanpointing to some mathlib4 file, how would I do that "persistently", i.e. so after synport gets updated in some way that import is still in place?

Once you have synported Data.Options.Defs and fixed it, the plan is not to synport it ever again; instead future synported files will depend on your new Data.Options.Defs, possibly including #align directives to help it stay relatively close to the binported version

Scott Morrison (Oct 20 2021 at 05:13):

So while my original question was mistaken, because we don't want to be adding import Leanbin.Init.Default to Data/Options/Defs.lean, how to we go about asking synport to import XXX? (I'm guess the answer here is "there isn't yet a way to ask synport to do this"?)

Mario Carneiro (Oct 20 2021 at 05:14):

Synport always assumes you want to import whatever you did in lean 3

Mario Carneiro (Oct 20 2021 at 05:15):

I don't think we are planning to change the import graph during the port, so that sounds right to me

Mario Carneiro (Oct 20 2021 at 05:16):

The issue in the case of data.options.defs is that there was a change to init itself, i.e. what is available from a blank starting point

Mario Carneiro (Oct 20 2021 at 05:16):

I think we want to handle these things on a case by case basis

Scott Morrison (Oct 20 2021 at 05:16):

This doesn't seem right. Are you saying that there is no way to address the fact that Data/Options/Defs.lean doesn't contain an import that defines HasMem, until we declare that particular file will never be synported again?

Scott Morrison (Oct 20 2021 at 05:18):

i.e. if synport is not going to introduce such an import statment, that implies that a human is going to do it. But then from what you said above, as soon as a human touches the file we will never run synport again on that file.

Mario Carneiro (Oct 20 2021 at 05:18):

We're making the assumption that Init is basically the same as lean 3 init, because making that strictly true would cause a lot of mess in the files that would just have to be deleted by the user

Mario Carneiro (Oct 20 2021 at 05:19):

Yes that's right. Once Data.Option.Defs works out however it will fix this, downstream files will just work

Scott Morrison (Oct 20 2021 at 05:20):

Hmm.

Mario Carneiro (Oct 20 2021 at 05:21):

One way to generically address this would be to always import Mathlib.Init in zero import files, and then put any missing stuff from lean 3 core in that file

Mario Carneiro (Oct 20 2021 at 05:22):

We kind of use logic.basic as a pseudo-Mathlib.Init in lean 3 but maybe it's better to be explicit about this

Scott Morrison (Oct 20 2021 at 05:23):

Okay, consider the following scenario. Pretend that Data/Options/Defs.lean only had two problems:

  1. there's a missing import which will provide HasMem, because of changes in Init
  2. whitespace formatting is horrible.

Now, I know 2. is going to get magically better at some point in the future, when e.g. Sebastian feels like working on the Lean4 pretty printer. So if I don't want to fix whitespace by hand (because later it would be fixed automatically), then I can't add an import, because that would "freeze" this file (i.e. mark it as not-to-be-regenerated by synport).

Mario Carneiro (Oct 20 2021 at 05:23):

Or we can just make sure that data.option.defs imports mathlib.init in lean 3 mathlib

Mario Carneiro (Oct 20 2021 at 05:24):

My heart was definitely breaking about (2) when I was working on it

Mario Carneiro (Oct 20 2021 at 05:24):

but the formatter is in core

Mario Carneiro (Oct 20 2021 at 05:26):

I think it is possible to get a re-formatter without synport intervention

Mario Carneiro (Oct 20 2021 at 05:26):

that is, read a lean 4 file and spit it back out again

Scott Morrison (Oct 20 2021 at 05:26):

okay, but that is escaping the point of my question :-)

Mario Carneiro (Oct 20 2021 at 05:27):

that would solve your problem, because when the future time comes around that the formatter is magically better, we can just reformat already synported files

Scott Morrison (Oct 20 2021 at 05:28):

Okay. But I think I understand that your intention is definitely to have no human editing of a file until after the last time synport has ever run on that file. That's fine and good --- indeed simpler than what I was fearing, although possibly less flexible.

Mario Carneiro (Oct 20 2021 at 05:29):

So the workflow looks like:

  1. Run synport
  2. Add the missing import
  3. Move on
  4. Sebastian fixes the formatter
  5. Reformat the file

Scott Morrison (Oct 20 2021 at 05:30):

So I guess the next "pipeline" question is how we will actually declare that a particular file is no longer "owned by synport". Just move it to the mathlib4 repo?

Mario Carneiro (Oct 20 2021 at 05:31):

I envision a workflow where a person who wants to port file X runs synport on X (using already ported dependency files), fixes it up and pushes it to mathlib4

Mario Carneiro (Oct 20 2021 at 05:31):

so we would just not pick values of X that are already ported

Mario Carneiro (Oct 20 2021 at 05:32):

although it's also possible to pick some X that is already ported, and then diff it with the stored version and cherry pick changes

Mario Carneiro (Oct 20 2021 at 05:33):

like I intend to do basically that for all the files in mathlib4 that have been human ported like Data.Nat.Basic

Scott Morrison (Oct 20 2021 at 05:34):

Great!

Mario Carneiro (Oct 20 2021 at 05:35):

for diffing it will be much better if the formatter is good though

Scott Morrison (Oct 20 2021 at 05:37):

Okay, so returning to the initial question of this thread, I think we decided that I was wrong to suggest adding import Leanbin.Init.Default to synported files with no imports, but instead we should add import Mathlib.Init.

Scott Morrison (Oct 20 2021 at 05:37):

Do we achieve this by modifying synport?

Scott Morrison (Oct 20 2021 at 05:38):

Or do we go back to mathlib3, and create an empty init file, and add it as an import to all the no import files, so that synport will reproduce this?

Mario Carneiro (Oct 20 2021 at 05:39):

A combination of both sounds good

Mario Carneiro (Oct 20 2021 at 05:39):

like, mathlib should probably have an init file

Mario Carneiro (Oct 20 2021 at 05:39):

and it's not empty

Scott Morrison (Oct 20 2021 at 05:39):

But then what would need to change in synport?

Mario Carneiro (Oct 20 2021 at 05:40):

but also synport can have a default import if we want to do that generically

Mario Carneiro (Oct 20 2021 at 05:40):

although it gets weird if you are porting a prelude file

Mario Carneiro (Oct 20 2021 at 05:49):

Here's how to modify mathport to add Mathlib.Init on zero include files:

diff --git a/Mathport/Syntax/Translate/Basic.lean b/Mathport/Syntax/Translate/Basic.lean
index 69a1dd2..1006c89 100644
--- a/Mathport/Syntax/Translate/Basic.lean
+++ b/Mathport/Syntax/Translate/Basic.lean
@@ -152,9 +152,10 @@ def AST3toData4 : AST3 → M Data4
   | ⟨prel, imp, commands, _, _⟩ => do
     let prel := prel.map fun _ => mkNode ``Parser.Module.prelude #[mkAtom "prelude"]
     let imp ← imp.foldlM (init := #[]) fun imp ns =>
-      ns.foldlM (init := imp) fun imp n => do
-        imp.push $ mkNode ``Parser.Module.import #[mkAtom "import",
-          mkNullNode, mkIdent (← renameModule n.kind)]
+      ns.foldlM (init := imp) fun imp n => imp.push <$> renameModule n.kind
+    let imp := if prel.isNone && imp.isEmpty then #[`Mathlib.Init] else imp
+    let imp := imp.map fun imp =>
+      mkNode ``Parser.Module.import #[mkAtom "import", mkNullNode, mkIdent imp]
     let fmt ← liftCoreM $ PrettyPrinter.format Parser.Module.header.formatter $
       mkNode ``Parser.Module.header #[mkOptionalNode prel, mkNullNode imp]
     modify fun s => { s with output := fmt }

Scott Morrison (Oct 20 2021 at 05:51):

I can test locally and PR sometime.

Mario Carneiro (Oct 20 2021 at 05:52):

(I'm holding off on PR'ing now because Mathlib.Init should probably exist first)

Scott Morrison (Oct 20 2021 at 05:56):

Now (slightly redundantly with this), you think it's reasonable for me to add an init to mathlib3, even if it is empty at first? (And make otherwise zero-import files import it?)

Mario Carneiro (Oct 20 2021 at 05:59):

It is basically a backport issue: should we take has_mem out of core? What are the other missing things in lean 4 core? I think a lot of zero import files can still be zero import, and those that can't can import init

Mario Carneiro (Oct 20 2021 at 05:59):

although the actual name init is a no-go because it would collide with the one in lean 3 core

Scott Morrison (Oct 20 2021 at 07:06):

Moving has_mem out of lean3 core looks to be a lot of work. The API for rbtree for example uses it extensively. I've personally never worried too much about the "what about non-mathlib users of lean3" argument, but it does seem somewhat unkind to them (or at least, I would anticipate this objection!).

Scott Morrison (Oct 20 2021 at 07:07):

But is seems like there is no big obstacle to add import our_init as a placeholder to mathlib3 files that need something that goes missing between Lean3 and Lean4, even if the mathlib3 version of our_init is largely empty, and it just becomes a handwritten file in mathlib4 containing HasMem and all the other things we discover.

Scott Morrison (Oct 20 2021 at 07:08):

There could be a comment explaining that this is to assist the mathport effort, without polluting mathlib3 itself too badly.

Gabriel Ebner (Oct 20 2021 at 07:28):

Scott Morrison said:

Moving has_mem out of lean3 core looks to be a lot of work. The API for rbtree for example uses it extensively. I've personally never worried too much about the "what about non-mathlib users of lean3" argument, but it does seem somewhat unkind to them (or at least, I would anticipate this objection!).

Another blocker for removing has_mem is the set-builder notation.

For the issue at hand, I would assume the easiest way is to just hardcode a list of extra imports in mathport?

def extraImports : HashMap Name (Array Name) := ....

Mario Carneiro (Oct 20 2021 at 07:30):

If the list is per-file, then it's not much different from fixing the import list once when you are in the post-synport fixup stage

Mario Carneiro (Oct 20 2021 at 07:32):

Scott Morrison said:

There could be a comment explaining that this is to assist the mathport effort, without polluting mathlib3 itself too badly.

The reason I am resistant to adding Mathport.Init on every synport file is because I am trying to avoid "pollution" of any kind. That is, it should not add anything that is always unwanted in the t -> infty version of mathlib 4

Scott Morrison (Oct 20 2021 at 07:32):

Yes, but I feel like in the early stages, working on a synported file is actually going to consist of many iterations of "lets improve synport in the following way". During these iterations, it's nice if the imports can be usable, and for this, we either need a catchall as in your patchfile above, or Gabriel's suggestion.

Scott Morrison (Oct 20 2021 at 07:33):

Mario Carneiro said:

The reason I am resistant to adding Mathport.Init on every synport file is because I am trying to avoid "pollution" of any kind. That is, it should not add anything that is always unwanted in the t -> infty version of mathlib 4

extraImports satisfies this (admirable! :-) constraint, doesn't it?

Mario Carneiro (Oct 20 2021 at 07:33):

It's a basic version of synport overrides, which is going to be hard to spec

Mario Carneiro (Oct 20 2021 at 07:34):

an alternative way to interact with synport is to just specify more and more precise overrides on the output until it is generating exactly the file you want. I am guessing that this will be a lot more difficult to do than just manipulating a lean 4 file

Mario Carneiro (Oct 20 2021 at 07:35):

but it has the advantage that a single change can be applied in many places, and you can rerun the tool as many times as you want

Mario Carneiro (Oct 20 2021 at 07:38):

I think that synport overrides make a lot of sense for upstream objects that want to affect downstream output; that's the idea behind #align. But for the downstream file itself, you are already working on that file and can presumably fix it on the spot

Scott Morrison (Oct 20 2021 at 07:40):

My example involving formatting was not so good above (because as you said formatting can be improved later).

A better example of a pair of problems is:

  1. we want to insert whatever the appropriate import statements are to every file that is currently missing HasMem
  2. there are still many capitalisation problems

If we decided for example that inserting these import will only be done in the "by hand" phase, then in some sense no one can start working on these until the capitalisation problems are solved. That's not a deal-breaker, but it makes tasks sequential.

Mario Carneiro (Oct 20 2021 at 07:40):

You can also imagine working at less-than-file granularity. You can do this fairly well by just taking segments of the synport file between different versions

Scott Morrison (Oct 20 2021 at 07:41):

(I mean, they can start working on solving such missing imports, but only "offline" in some sense. i.e. preparing some list of future patches to be made. But tracking these would be a nightmare compared to something like extraImports or the global import over_init patch.)

Mario Carneiro (Oct 20 2021 at 07:41):

Adding the imports is a fairly trivial thing to do though

Mario Carneiro (Oct 20 2021 at 07:42):

and as mentioned, it can also be backported

Mario Carneiro (Oct 20 2021 at 07:42):

if you add a dummy import in mathlib3 then you get it for free

Scott Morrison (Oct 20 2021 at 07:42):

Which is trivial? Adding them in synport? Or adding them by hand afterwards?

Mario Carneiro (Oct 20 2021 at 07:42):

adding them by hand afterwards

Mario Carneiro (Oct 20 2021 at 07:43):

I'm not opposed to per-file customization of synport but it doesn't seem necessary here. It might help some folks' workflows but doesn't look like it needs to be checked in

Gabriel Ebner (Oct 20 2021 at 07:44):

Hmm, another lowtech option might just be to have a shell script with lots of sed calls for post-processing.

Mario Carneiro (Oct 20 2021 at 07:45):

That may be necessary for some formatting related things, but I think we can make synport do anything in that category

Scott Morrison (Oct 20 2021 at 07:45):

The sed option, or a directory full of patchfiles, or whatever is what I'd imagined you guys had in mind already, so I've been surprised today to discover its quite the opposite. :-)

Gabriel Ebner (Oct 20 2021 at 07:45):

I kind of agree with Scott that we still want to iterate before moving files to mathlib4. Once we start doing that, we've effectively passed flag date. Further development on mathport is essentially pointless (because it only applies to future files), and mathlib changes on already ported files become forbidden.

Mario Carneiro (Oct 20 2021 at 07:46):

I'm not sure what the best interface for stored customization of synport is

Mario Carneiro (Oct 20 2021 at 07:47):

maybe we can read a lean 4 file representing the previous iteration and use info from it somehow

Mario Carneiro (Oct 20 2021 at 07:49):

Gabriel Ebner said:

Further development on mathport is essentially pointless (because it only applies to future files), and mathlib changes on already ported files become forbidden.

I'm not sure I would say it is pointless. It only applies to the 98% of mathlib that hasn't yet been ported

Mario Carneiro (Oct 20 2021 at 07:50):

My guess is that most of the iteration will happen early, once we have started actually attempting to use it on full files but before we have really scaled up

Scott Morrison (Oct 20 2021 at 07:51):

e.g. with my imports + capitalisation example, I would probably want to pick some particular concrete file to iterate improvements to the capitalisation scheme. But while doing that, I really don't want the mental burden of adding the import (or whatever the orthogonal problem is) during every iteration of working on capitalisation.

Scott Morrison (Oct 20 2021 at 07:52):

(Perhaps the problem here is that I am too much of a voodoo programmer, and need more iterations on a given input than most people before stuff works. :-)

Mario Carneiro (Oct 20 2021 at 07:55):

it sounds better if you call it "test driven development" :)

Mac (Oct 20 2021 at 08:11):

Mario Carneiro said:

but it has the advantage that a single change can be applied in many places, and you can rerun the tool as many times as you want

It would also have the advantage of potentially allowing synport to get to the point where it can intelligently port virtually any Lean 3 file thus making the transition more painless for users.

Mario Carneiro (Oct 20 2021 at 08:49):

We're talking about file-specific changes here, though, not general fixes. General fixes, i.e. changes which would be useful in any similar code, should just be done by modifying synport itself so that everyone gets the benefit

Mac (Oct 20 2021 at 08:53):

@Mario Carneiro but that then goes back to @Scott Morrison's original point. If synport keeps getting improved in that manner,' freezing' a file introduces a large commitment as it will no longer be easy to automatically apply the new improvements to the frozen file.

Mario Carneiro (Oct 20 2021 at 08:54):

I think that it will be very difficult to get a good synport-input that can produce a "live updating" lean 4 file

Mario Carneiro (Oct 20 2021 at 08:55):

like, another few person-months

Mario Carneiro (Oct 20 2021 at 09:01):

and presumably, once a file has been ported, any fixes to synport would at best fix something that had to be done by hand on the earlier file. The lost time spent doing that can't be recovered, and the ported file no longer benefits from the synport fix because it's already working

Mac (Oct 20 2021 at 09:15):

@Mario Carneiro this assumes frozen files will be quickly moved from 'ported' to 'fixed' with no long stretches of WIP. Is that the plan?

Mario Carneiro (Oct 20 2021 at 09:17):

Like I mentioned, you can also work at sub-file granularity by just translating half the file and leaving the other half in the original state; you can then re-run synport and replace the bottom half of the file without any loss of fixup work

Mac (Oct 20 2021 at 09:18):

From my vague understand of this whole affair. It sounded like the goal was to get files minimally working (with sorrys strewn here and there) and slowly improve synport to remove those as time passes. In such cases, there can be reasonable cases where there may be small necessary edits made to the synported file to get downstream stuff simi-functional while still have many gaps that one would like synport to fill later on.

Mario Carneiro (Oct 20 2021 at 09:18):

but yes, the fixes should not be significantly more complicated than fixing capitalization or broken theorem references and formatting (and I hope synport to get better at both things)

Mario Carneiro (Oct 20 2021 at 09:18):

Synport produces broken files, not files with sorry

Mario Carneiro (Oct 20 2021 at 09:18):

because you would just have to delete the sorry, put the broken proof in and fix it anyway

Mario Carneiro (Oct 20 2021 at 09:19):

I think binport prefers to use sorry since it has to keep on trucking even in the presence of mismatches

Mac (Oct 20 2021 at 09:20):

ah, then I guess my mental model was just wrong


Last updated: Dec 20 2023 at 11:08 UTC