Zulip Chat Archive
Stream: mathlib4
Topic: add `import Leanbin.Init.Default` in synported files
Scott Morrison (Oct 20 2021 at 03:15):
I think when running synport we need to add import Leanbin.Init.Default
add the beginning of any files mathlib3 files that don't have other imports.
e.g. in Data/Options/Defs.lean
, the synported file has errors on HasMem
, Option.eq_none_of_is_none
, and many others, which are resolved by adding this import.
Scott Morrison (Oct 20 2021 at 03:18):
@Mario Carneiro, @Daniel Selsam, I'm happy to take your advice on the best balance between me attempting things like this vs one of you. Obviously I will be inefficient at first and need some hand-holding, but I'm also happy to invest in hacking on binport/synport. (Daniel has already given me a quick walkthrough.)
Mario Carneiro (Oct 20 2021 at 04:34):
That was actually deliberate. The idea was to have the files look as similar to the original as possible, so ensuring the appropriate imports should be handled by the user in the case where there are no imports.
Mario Carneiro (Oct 20 2021 at 04:37):
That is, in a case like Data.Options.Defs
we probably want to continue having no imports and instead rethink what goes in the file
Mario Carneiro (Oct 20 2021 at 04:38):
or import Mathlib.Init
or something
Scott Morrison (Oct 20 2021 at 04:51):
Returning to this thread, I'm unclear how to move forward, however. Taking the synported Data/Options/Defs.lean
, for example, what are we going to do about the missing HasMem
, if not add import Leanbin.Init.Default
?
Mario Carneiro (Oct 20 2021 at 04:52):
I think we want to import a file in mathlib4 that defines HasMem
, or else move the code that relates option and HasMem
to another file
Scott Morrison (Oct 20 2021 at 04:53):
Okay. Have you guys already decided the best way to add such imports? That is, suppose HasMem
is in some file in mathlib4. What will be the actual mechanism to add an import
line to the synported Data/Options/Defs.lean
?
Scott Morrison (Oct 20 2021 at 05:06):
That is, will there be some instruction (where?) that tells synport what to add? Or will there be a "postproduction" step where we patch synported files with human-written changes?
Mario Carneiro (Oct 20 2021 at 05:07):
There will be a post production step, that's the human porting part
Mario Carneiro (Oct 20 2021 at 05:08):
The machine ported version exists already, that's the generated binport file
Scott Morrison (Oct 20 2021 at 05:10):
Okay, but the "human porting part" is going to have to exist for a long time concurrently with re-running synport, while we fix all the problems in synport. So will we e.g. generate a bunch of patches to apply after synport has run, so the pipeline becomes "run synport" then "apply all patches"?
Scott Morrison (Oct 20 2021 at 05:11):
e.g. if I wanted to add an import to Data/Options/Defs/lean
pointing to some mathlib4 file, how would I do that "persistently", i.e. so after synport gets updated in some way that import is still in place?
Mario Carneiro (Oct 20 2021 at 05:11):
Any patching mechanism that is fully automatic and uniform can be done by synport itself. Anything that comes afterwards is presumably something that needs 1-1 interaction
Scott Morrison (Oct 20 2021 at 05:12):
Great.
Mario Carneiro (Oct 20 2021 at 05:13):
Scott Morrison said:
e.g. if I wanted to add an import to
Data/Options/Defs/lean
pointing to some mathlib4 file, how would I do that "persistently", i.e. so after synport gets updated in some way that import is still in place?
Once you have synported Data.Options.Defs
and fixed it, the plan is not to synport it ever again; instead future synported files will depend on your new Data.Options.Defs
, possibly including #align
directives to help it stay relatively close to the binported version
Scott Morrison (Oct 20 2021 at 05:13):
So while my original question was mistaken, because we don't want to be adding import Leanbin.Init.Default
to Data/Options/Defs.lean
, how to we go about asking synport to import XXX
? (I'm guess the answer here is "there isn't yet a way to ask synport to do this"?)
Mario Carneiro (Oct 20 2021 at 05:14):
Synport always assumes you want to import whatever you did in lean 3
Mario Carneiro (Oct 20 2021 at 05:15):
I don't think we are planning to change the import graph during the port, so that sounds right to me
Mario Carneiro (Oct 20 2021 at 05:16):
The issue in the case of data.options.defs
is that there was a change to init
itself, i.e. what is available from a blank starting point
Mario Carneiro (Oct 20 2021 at 05:16):
I think we want to handle these things on a case by case basis
Scott Morrison (Oct 20 2021 at 05:16):
This doesn't seem right. Are you saying that there is no way to address the fact that Data/Options/Defs.lean
doesn't contain an import that defines HasMem
, until we declare that particular file will never be synported again?
Scott Morrison (Oct 20 2021 at 05:18):
i.e. if synport is not going to introduce such an import statment, that implies that a human is going to do it. But then from what you said above, as soon as a human touches the file we will never run synport again on that file.
Mario Carneiro (Oct 20 2021 at 05:18):
We're making the assumption that Init
is basically the same as lean 3 init
, because making that strictly true would cause a lot of mess in the files that would just have to be deleted by the user
Mario Carneiro (Oct 20 2021 at 05:19):
Yes that's right. Once Data.Option.Defs
works out however it will fix this, downstream files will just work
Scott Morrison (Oct 20 2021 at 05:20):
Hmm.
Mario Carneiro (Oct 20 2021 at 05:21):
One way to generically address this would be to always import Mathlib.Init
in zero import files, and then put any missing stuff from lean 3 core in that file
Mario Carneiro (Oct 20 2021 at 05:22):
We kind of use logic.basic
as a pseudo-Mathlib.Init
in lean 3 but maybe it's better to be explicit about this
Scott Morrison (Oct 20 2021 at 05:23):
Okay, consider the following scenario. Pretend that Data/Options/Defs.lean
only had two problems:
- there's a missing import which will provide
HasMem
, because of changes inInit
- whitespace formatting is horrible.
Now, I know 2. is going to get magically better at some point in the future, when e.g. Sebastian feels like working on the Lean4 pretty printer. So if I don't want to fix whitespace by hand (because later it would be fixed automatically), then I can't add an import, because that would "freeze" this file (i.e. mark it as not-to-be-regenerated by synport).
Mario Carneiro (Oct 20 2021 at 05:23):
Or we can just make sure that data.option.defs
imports mathlib.init
in lean 3 mathlib
Mario Carneiro (Oct 20 2021 at 05:24):
My heart was definitely breaking about (2) when I was working on it
Mario Carneiro (Oct 20 2021 at 05:24):
but the formatter is in core
Mario Carneiro (Oct 20 2021 at 05:26):
I think it is possible to get a re-formatter without synport intervention
Mario Carneiro (Oct 20 2021 at 05:26):
that is, read a lean 4 file and spit it back out again
Scott Morrison (Oct 20 2021 at 05:26):
okay, but that is escaping the point of my question :-)
Mario Carneiro (Oct 20 2021 at 05:27):
that would solve your problem, because when the future time comes around that the formatter is magically better, we can just reformat already synported files
Scott Morrison (Oct 20 2021 at 05:28):
Okay. But I think I understand that your intention is definitely to have no human editing of a file until after the last time synport has ever run on that file. That's fine and good --- indeed simpler than what I was fearing, although possibly less flexible.
Mario Carneiro (Oct 20 2021 at 05:29):
So the workflow looks like:
- Run synport
- Add the missing import
- Move on
- Sebastian fixes the formatter
- Reformat the file
Scott Morrison (Oct 20 2021 at 05:30):
So I guess the next "pipeline" question is how we will actually declare that a particular file is no longer "owned by synport". Just move it to the mathlib4 repo?
Mario Carneiro (Oct 20 2021 at 05:31):
I envision a workflow where a person who wants to port file X runs synport on X (using already ported dependency files), fixes it up and pushes it to mathlib4
Mario Carneiro (Oct 20 2021 at 05:31):
so we would just not pick values of X that are already ported
Mario Carneiro (Oct 20 2021 at 05:32):
although it's also possible to pick some X that is already ported, and then diff it with the stored version and cherry pick changes
Mario Carneiro (Oct 20 2021 at 05:33):
like I intend to do basically that for all the files in mathlib4 that have been human ported like Data.Nat.Basic
Scott Morrison (Oct 20 2021 at 05:34):
Great!
Mario Carneiro (Oct 20 2021 at 05:35):
for diffing it will be much better if the formatter is good though
Scott Morrison (Oct 20 2021 at 05:37):
Okay, so returning to the initial question of this thread, I think we decided that I was wrong to suggest adding import Leanbin.Init.Default
to synported files with no imports, but instead we should add import Mathlib.Init
.
Scott Morrison (Oct 20 2021 at 05:37):
Do we achieve this by modifying synport?
Scott Morrison (Oct 20 2021 at 05:38):
Or do we go back to mathlib3, and create an empty init
file, and add it as an import to all the no import files, so that synport will reproduce this?
Mario Carneiro (Oct 20 2021 at 05:39):
A combination of both sounds good
Mario Carneiro (Oct 20 2021 at 05:39):
like, mathlib should probably have an init file
Mario Carneiro (Oct 20 2021 at 05:39):
and it's not empty
Scott Morrison (Oct 20 2021 at 05:39):
But then what would need to change in synport?
Mario Carneiro (Oct 20 2021 at 05:40):
but also synport can have a default import if we want to do that generically
Mario Carneiro (Oct 20 2021 at 05:40):
although it gets weird if you are porting a prelude file
Mario Carneiro (Oct 20 2021 at 05:49):
Here's how to modify mathport to add Mathlib.Init
on zero include files:
diff --git a/Mathport/Syntax/Translate/Basic.lean b/Mathport/Syntax/Translate/Basic.lean
index 69a1dd2..1006c89 100644
--- a/Mathport/Syntax/Translate/Basic.lean
+++ b/Mathport/Syntax/Translate/Basic.lean
@@ -152,9 +152,10 @@ def AST3toData4 : AST3 → M Data4
| ⟨prel, imp, commands, _, _⟩ => do
let prel := prel.map fun _ => mkNode ``Parser.Module.prelude #[mkAtom "prelude"]
let imp ← imp.foldlM (init := #[]) fun imp ns =>
- ns.foldlM (init := imp) fun imp n => do
- imp.push $ mkNode ``Parser.Module.import #[mkAtom "import",
- mkNullNode, mkIdent (← renameModule n.kind)]
+ ns.foldlM (init := imp) fun imp n => imp.push <$> renameModule n.kind
+ let imp := if prel.isNone && imp.isEmpty then #[`Mathlib.Init] else imp
+ let imp := imp.map fun imp =>
+ mkNode ``Parser.Module.import #[mkAtom "import", mkNullNode, mkIdent imp]
let fmt ← liftCoreM $ PrettyPrinter.format Parser.Module.header.formatter $
mkNode ``Parser.Module.header #[mkOptionalNode prel, mkNullNode imp]
modify fun s => { s with output := fmt }
Scott Morrison (Oct 20 2021 at 05:51):
I can test locally and PR sometime.
Mario Carneiro (Oct 20 2021 at 05:52):
(I'm holding off on PR'ing now because Mathlib.Init
should probably exist first)
Scott Morrison (Oct 20 2021 at 05:56):
Now (slightly redundantly with this), you think it's reasonable for me to add an init
to mathlib3, even if it is empty at first? (And make otherwise zero-import files import it?)
Mario Carneiro (Oct 20 2021 at 05:59):
It is basically a backport issue: should we take has_mem out of core? What are the other missing things in lean 4 core? I think a lot of zero import files can still be zero import, and those that can't can import init
Mario Carneiro (Oct 20 2021 at 05:59):
although the actual name init
is a no-go because it would collide with the one in lean 3 core
Scott Morrison (Oct 20 2021 at 07:06):
Moving has_mem
out of lean3 core looks to be a lot of work. The API for rbtree
for example uses it extensively. I've personally never worried too much about the "what about non-mathlib users of lean3" argument, but it does seem somewhat unkind to them (or at least, I would anticipate this objection!).
Scott Morrison (Oct 20 2021 at 07:07):
But is seems like there is no big obstacle to add import our_init
as a placeholder to mathlib3 files that need something that goes missing between Lean3 and Lean4, even if the mathlib3 version of our_init
is largely empty, and it just becomes a handwritten file in mathlib4 containing HasMem
and all the other things we discover.
Scott Morrison (Oct 20 2021 at 07:08):
There could be a comment explaining that this is to assist the mathport effort, without polluting mathlib3 itself too badly.
Gabriel Ebner (Oct 20 2021 at 07:28):
Scott Morrison said:
Moving
has_mem
out of lean3 core looks to be a lot of work. The API forrbtree
for example uses it extensively. I've personally never worried too much about the "what about non-mathlib users of lean3" argument, but it does seem somewhat unkind to them (or at least, I would anticipate this objection!).
Another blocker for removing has_mem is the set-builder notation.
For the issue at hand, I would assume the easiest way is to just hardcode a list of extra imports in mathport?
def extraImports : HashMap Name (Array Name) := ....
Mario Carneiro (Oct 20 2021 at 07:30):
If the list is per-file, then it's not much different from fixing the import list once when you are in the post-synport fixup stage
Mario Carneiro (Oct 20 2021 at 07:32):
Scott Morrison said:
There could be a comment explaining that this is to assist the mathport effort, without polluting mathlib3 itself too badly.
The reason I am resistant to adding Mathport.Init
on every synport file is because I am trying to avoid "pollution" of any kind. That is, it should not add anything that is always unwanted in the t -> infty version of mathlib 4
Scott Morrison (Oct 20 2021 at 07:32):
Yes, but I feel like in the early stages, working on a synported file is actually going to consist of many iterations of "lets improve synport in the following way". During these iterations, it's nice if the imports can be usable, and for this, we either need a catchall as in your patchfile above, or Gabriel's suggestion.
Scott Morrison (Oct 20 2021 at 07:33):
Mario Carneiro said:
The reason I am resistant to adding
Mathport.Init
on every synport file is because I am trying to avoid "pollution" of any kind. That is, it should not add anything that is always unwanted in the t -> infty version of mathlib 4
extraImports
satisfies this (admirable! :-) constraint, doesn't it?
Mario Carneiro (Oct 20 2021 at 07:33):
It's a basic version of synport overrides, which is going to be hard to spec
Mario Carneiro (Oct 20 2021 at 07:34):
an alternative way to interact with synport is to just specify more and more precise overrides on the output until it is generating exactly the file you want. I am guessing that this will be a lot more difficult to do than just manipulating a lean 4 file
Mario Carneiro (Oct 20 2021 at 07:35):
but it has the advantage that a single change can be applied in many places, and you can rerun the tool as many times as you want
Mario Carneiro (Oct 20 2021 at 07:38):
I think that synport overrides make a lot of sense for upstream objects that want to affect downstream output; that's the idea behind #align
. But for the downstream file itself, you are already working on that file and can presumably fix it on the spot
Scott Morrison (Oct 20 2021 at 07:40):
My example involving formatting was not so good above (because as you said formatting can be improved later).
A better example of a pair of problems is:
- we want to insert whatever the appropriate
import
statements are to every file that is currently missingHasMem
- there are still many capitalisation problems
If we decided for example that inserting these import
will only be done in the "by hand" phase, then in some sense no one can start working on these until the capitalisation problems are solved. That's not a deal-breaker, but it makes tasks sequential.
Mario Carneiro (Oct 20 2021 at 07:40):
You can also imagine working at less-than-file granularity. You can do this fairly well by just taking segments of the synport file between different versions
Scott Morrison (Oct 20 2021 at 07:41):
(I mean, they can start working on solving such missing imports, but only "offline" in some sense. i.e. preparing some list of future patches to be made. But tracking these would be a nightmare compared to something like extraImports
or the global import over_init
patch.)
Mario Carneiro (Oct 20 2021 at 07:41):
Adding the imports is a fairly trivial thing to do though
Mario Carneiro (Oct 20 2021 at 07:42):
and as mentioned, it can also be backported
Mario Carneiro (Oct 20 2021 at 07:42):
if you add a dummy import in mathlib3 then you get it for free
Scott Morrison (Oct 20 2021 at 07:42):
Which is trivial? Adding them in synport
? Or adding them by hand afterwards?
Mario Carneiro (Oct 20 2021 at 07:42):
adding them by hand afterwards
Mario Carneiro (Oct 20 2021 at 07:43):
I'm not opposed to per-file customization of synport but it doesn't seem necessary here. It might help some folks' workflows but doesn't look like it needs to be checked in
Gabriel Ebner (Oct 20 2021 at 07:44):
Hmm, another lowtech option might just be to have a shell script with lots of sed
calls for post-processing.
Mario Carneiro (Oct 20 2021 at 07:45):
That may be necessary for some formatting related things, but I think we can make synport do anything in that category
Scott Morrison (Oct 20 2021 at 07:45):
The sed
option, or a directory full of patchfiles, or whatever is what I'd imagined you guys had in mind already, so I've been surprised today to discover its quite the opposite. :-)
Gabriel Ebner (Oct 20 2021 at 07:45):
I kind of agree with Scott that we still want to iterate before moving files to mathlib4. Once we start doing that, we've effectively passed flag date. Further development on mathport is essentially pointless (because it only applies to future files), and mathlib changes on already ported files become forbidden.
Mario Carneiro (Oct 20 2021 at 07:46):
I'm not sure what the best interface for stored customization of synport is
Mario Carneiro (Oct 20 2021 at 07:47):
maybe we can read a lean 4 file representing the previous iteration and use info from it somehow
Mario Carneiro (Oct 20 2021 at 07:49):
Gabriel Ebner said:
Further development on mathport is essentially pointless (because it only applies to future files), and mathlib changes on already ported files become forbidden.
I'm not sure I would say it is pointless. It only applies to the 98% of mathlib that hasn't yet been ported
Mario Carneiro (Oct 20 2021 at 07:50):
My guess is that most of the iteration will happen early, once we have started actually attempting to use it on full files but before we have really scaled up
Scott Morrison (Oct 20 2021 at 07:51):
e.g. with my imports + capitalisation example, I would probably want to pick some particular concrete file to iterate improvements to the capitalisation scheme. But while doing that, I really don't want the mental burden of adding the import (or whatever the orthogonal problem is) during every iteration of working on capitalisation.
Scott Morrison (Oct 20 2021 at 07:52):
(Perhaps the problem here is that I am too much of a voodoo programmer, and need more iterations on a given input than most people before stuff works. :-)
Mario Carneiro (Oct 20 2021 at 07:55):
it sounds better if you call it "test driven development" :)
Mac (Oct 20 2021 at 08:11):
Mario Carneiro said:
but it has the advantage that a single change can be applied in many places, and you can rerun the tool as many times as you want
It would also have the advantage of potentially allowing synport to get to the point where it can intelligently port virtually any Lean 3 file thus making the transition more painless for users.
Mario Carneiro (Oct 20 2021 at 08:49):
We're talking about file-specific changes here, though, not general fixes. General fixes, i.e. changes which would be useful in any similar code, should just be done by modifying synport itself so that everyone gets the benefit
Mac (Oct 20 2021 at 08:53):
@Mario Carneiro but that then goes back to @Scott Morrison's original point. If synport
keeps getting improved in that manner,' freezing' a file introduces a large commitment as it will no longer be easy to automatically apply the new improvements to the frozen file.
Mario Carneiro (Oct 20 2021 at 08:54):
I think that it will be very difficult to get a good synport-input that can produce a "live updating" lean 4 file
Mario Carneiro (Oct 20 2021 at 08:55):
like, another few person-months
Mario Carneiro (Oct 20 2021 at 09:01):
and presumably, once a file has been ported, any fixes to synport would at best fix something that had to be done by hand on the earlier file. The lost time spent doing that can't be recovered, and the ported file no longer benefits from the synport fix because it's already working
Mac (Oct 20 2021 at 09:15):
@Mario Carneiro this assumes frozen files will be quickly moved from 'ported' to 'fixed' with no long stretches of WIP. Is that the plan?
Mario Carneiro (Oct 20 2021 at 09:17):
Like I mentioned, you can also work at sub-file granularity by just translating half the file and leaving the other half in the original state; you can then re-run synport and replace the bottom half of the file without any loss of fixup work
Mac (Oct 20 2021 at 09:18):
From my vague understand of this whole affair. It sounded like the goal was to get files minimally working (with sorrys strewn here and there) and slowly improve synport to remove those as time passes. In such cases, there can be reasonable cases where there may be small necessary edits made to the synported file to get downstream stuff simi-functional while still have many gaps that one would like synport to fill later on.
Mario Carneiro (Oct 20 2021 at 09:18):
but yes, the fixes should not be significantly more complicated than fixing capitalization or broken theorem references and formatting (and I hope synport to get better at both things)
Mario Carneiro (Oct 20 2021 at 09:18):
Synport produces broken files, not files with sorry
Mario Carneiro (Oct 20 2021 at 09:18):
because you would just have to delete the sorry, put the broken proof in and fix it anyway
Mario Carneiro (Oct 20 2021 at 09:19):
I think binport prefers to use sorry since it has to keep on trucking even in the presence of mismatches
Mac (Oct 20 2021 at 09:20):
ah, then I guess my mental model was just wrong
Last updated: Dec 20 2023 at 11:08 UTC