Zulip Chat Archive

Stream: general

Topic: advice on making PRs from experimental branches


Scott Morrison (Mar 11 2020 at 03:51):

I often find myself creating a branch of mathlib that explores some experimental idea, that perhaps goes well, and perhaps goes badly. Inevitably along the way I find myself making changes to existing parts of mathlib, either to fix something yucky, add a missing lemma, or refactor something. Sometimes I make multiple changes like this to different parts of mathlib.

At the end of the day, probably I don't want to turn the whole branch into a PR (perhaps the overall idea went badly...), but I want to extract out the useful groups of changes into separate PRs.

What do people do to achieve this?

Scott Morrison (Mar 11 2020 at 03:53):

So far the only good trick I know is to make a new branch, and run git merge --squash experimental_branch. This gives me all the changes that branch made, as unstaged edits to the current branch. I then revert most of the files, essentially all the ones that I hope have nothing to do with the subset of changes I'm trying to capture.

Scott Morrison (Mar 11 2020 at 03:53):

This works pretty well, but often I get things wrong and revert too much or too little and have to try again.

Scott Morrison (Mar 11 2020 at 03:53):

If anyone has suggested workflows that work better, I'd love to hear!

Mario Carneiro (Mar 11 2020 at 04:36):

This is probably domain specific, but usually when I am working on a formalization inside mathlib, I will put for_mathlib stuff directly in the files where they should go, which is probably a completely different file or part of the library than what I'm actually working on. So it's not hard to isolate these edits and push only them. Plus, they are usually standalone things so they can just drop in to mathlib.

For more complicated refactorings and such, it generally has to be done in bulk, and some care has to be taken with the PR. I don't think there is an easy way to extract your "branch" changes from the "refactoring" once it gets to this point.

Scott Morrison (Mar 11 2020 at 05:15):

I guess I tend to "work on mathlib directly", so I never have a to_mathlib directory.

Mario Carneiro (Mar 11 2020 at 05:42):

Sure, I do too, but I use the term to refer to library stuff that inevitably appears when working on a project. If you are in a separate project this should go at the top of your file, or in a separate file, but if you are working in mathlib then it should go where it is supposed to be (which is probably not wherever you are working).

Johan Commelin (Mar 11 2020 at 06:00):

I still put it at the top of the file that I'm working on, because otherwise I trigger massive recompiles. How do you avoid this?

Mario Carneiro (Mar 11 2020 at 06:30):

I don't. Options include: PR-ing without testing and trusting CI; don't import so many dependencies so that the chain from your change to your current project is not so bad; and putting things in the current file with an eye for where they will go, then doing bulk insertions with minimal checks during the PR

Mario Carneiro (Mar 11 2020 at 06:31):

Another option is to become vaguely frustrated at how slow lean is

Johan Commelin (Mar 11 2020 at 06:34):

Mario Carneiro said:

Another option is to become vaguely frustrated at how slow lean is

I'll go for this one.

Johan Commelin (Mar 11 2020 at 06:35):

When is your type theoretic editor plugin for MM0 being released?

Johan Commelin (Mar 11 2020 at 06:35):

(I guess in 202X? :stuck_out_tongue_wink: )

Mario Carneiro (Mar 11 2020 at 06:36):

Lean 0

Mario Carneiro (Mar 11 2020 at 06:39):

You know, there is a feature in MM1 that is kind of overkill for it but would make a massive difference for lean: you can turn off proof checking in all files other than the one you are working on. This is kind of obviously the correct setting for interactive use, and CI can do the full check

Mario Carneiro (Mar 11 2020 at 06:40):

In lean it would even be useful to be able to lean --make --no-proofs and get usable (but incomplete) .olean files

Johan Commelin (Mar 11 2020 at 06:44):

But in Lean you can tell it to check only the current file, can't you?

Johan Commelin (Mar 11 2020 at 06:44):

"check visible files"

Johan Commelin (Mar 11 2020 at 06:44):

That sounds close enough

Mario Carneiro (Mar 11 2020 at 06:44):

you still need to have the other files be processed, but only for theorem statements

Mario Carneiro (Mar 11 2020 at 06:45):

I don't think that "check visible files" does anything to stop the catastrophic cascade when you jump into tactic.basic to add or modify a tactic

Mario Carneiro (Mar 11 2020 at 06:46):

plus you are still left with a whole bunch of invalidated olean files that you need to regenerate non-interactively

Mario Carneiro (Mar 11 2020 at 06:47):

I would be curious to see how expensive it is to lean --make all of mathlib without proofs. My guess is ~1 minute

Mario Carneiro (Mar 11 2020 at 06:47):

hopefully less

Mario Carneiro (Mar 11 2020 at 06:48):

but I think there are still some hard typeclass problems in some theorem statements (I don't think we are doing anything heavy in defs)

Markus Himmel (Mar 11 2020 at 06:55):

Regarding the question about only keeping the changes you want: If the changes you want to keep and those you want to get rid of are in separate commits, then you can run git rebase -i abcde where abcde is the last commit from master that your branch has. It opens your editor with a list of all the commits in your experimental branch. You can then simply delete the commits you do not want by deleting the corresponding lines.

Mario Carneiro (Mar 11 2020 at 06:58):

In my case it's almost never in a separate commit, because it invariably comes up in the middle of the project. I guess with better git hygiene I could make a mini commit every time I have a library change, resulting in some spotted history of intermingled library and project commits, and then the library commits could be interactive-rebased into one group or one commit

Scott Morrison (Mar 11 2020 at 07:21):

Yes, my git hygiene is terrible. If I'm working on something experimental, commits tend to line up with when I go to sleep. :-)

Johan Commelin (Mar 11 2020 at 07:31):

@Mario Carneiro Should we turn the --no-proofs thing into a feature request for lean-3.1X.Y?

Johan Commelin (Mar 11 2020 at 07:31):

I have no idea how hard it would be to implement something like that...

Mario Carneiro (Mar 11 2020 at 07:34):

I think not hard. Keep in mind that lean is already elaborating proofs asynchronously, so there is already a part where that part of the work is placed in a closure and sent elsewhere. If we simply ignore that closure than bam, no proofs

Mario Carneiro (Mar 11 2020 at 07:36):

Some changes may have to be made to allow .olean file creation to proceed without the proof result, but the content can just be replaced with a sorry macro

Mario Carneiro (Mar 11 2020 at 07:39):

On a related note it might be nice to turn off that "imported file uses sorry" warning, which is way overkill in non-interactive mode (since it gets spammed on every dependency importing another dependency), and would become much worse with this compilation strategy

Yury G. Kudryashov (Mar 11 2020 at 07:48):

I often work on an experimental feature without making commits, then add selected changes to a git stash, create a new branch from master in another worktree, and do git stash pop. Emacs magit makes it easy.

Gabriel Ebner (Mar 11 2020 at 09:05):

Mario Carneiro said:

I would be curious to see how expensive it is to lean --make all of mathlib without proofs. My guess is ~1 minute

Unfortunately it's much longer than that. Definitions take a surprising amount of time to elaborate (just think of type class resolution). Then there are definitions that contains stuff like cast (by simp) .... (You can't skip the simp because you need to know what local constants the resulting proof depends on.) And then there are instance definitions: instance ... := by refine { add := ..., good := ... 20 line proof ... }; tidy.


Last updated: Dec 20 2023 at 11:08 UTC