Zulip Chat Archive

Stream: general

Topic: olean caching


Simon Hudon (Oct 07 2018 at 21:27):

I just created a PR to get git to preserve your olean files and restore them when you switch branches. I'd like to know if it works for people. Please let me know if you try it and if it helps.

Kenny Lau (Oct 07 2018 at 21:28):

:D

Simon Hudon (Oct 07 2018 at 21:32):

In particular, I'd like to know if it needs to be adapter for windows because I use shell scripts

Patrick Massot (Oct 07 2018 at 21:34):

If you manage to allow us to switch mathlib branches back and forth without recompiling I phone the inquisition.

Simon Hudon (Oct 07 2018 at 21:34):

Let me just delete a branch, one sec

Patrick Massot (Oct 07 2018 at 21:34):

That would be one sorcery prowess too many for your own good.

Simon Hudon (Oct 07 2018 at 21:38):

Can the quota be negotiated?

Simon Hudon (Oct 07 2018 at 21:39):

If you want to try it before the branch is merged, here are the files

cache_olean.zip

Simon Hudon (Oct 07 2018 at 21:39):

You unzip them in your mathlib directory and call ./install_hooks.sh

Patrick Massot (Oct 07 2018 at 21:40):

I'll negociate for you if you manage to convince Travis to put mathlib olean for linux somewhere git will find them each we git pull mathlib

Simon Hudon (Oct 07 2018 at 21:41):

What it does is:

* when you commit, it makes a copy of your olean files
* when you checkout a branch, it checks if a cache exists for that branch and restores it if one exists
* when you rebase a branch A on top of branch B, it caches the files of A and restores the files of B

Kenny Lau (Oct 07 2018 at 21:41):

just the olean files?

Simon Hudon (Oct 07 2018 at 21:41):

Yes, just the olean files

Simon Hudon (Oct 07 2018 at 21:42):

I'll negociate for you if you manage to convince Travis to put mathlib olean for linux somewhere git will find them each we git pull mathlib

That's a tall order. You're asking for a complete new feature, all I did was a small hack :P

Simon Hudon (Oct 07 2018 at 21:43):

Actually ... that's feasible

Simon Hudon (Oct 07 2018 at 21:44):

Actually, I think doing it half-way is more than half-way useful

Simon Hudon (Oct 07 2018 at 21:45):

Do you guys feel like giving it a spin?

Patrick Massot (Oct 07 2018 at 21:47):

I'll definitely try tomorrow, but I really need to go to bed now

Simon Hudon (Oct 07 2018 at 21:47):

I thought you enjoyed working with Lean

Simon Hudon (Oct 07 2018 at 21:51):

Sleep well :)

Bryan Gin-ge Chen (Oct 07 2018 at 22:32):

@Simon Hudon Does this work the same way for packages that include mathlib as a dependency?

Bryan Gin-ge Chen (Oct 07 2018 at 22:33):

Wait, maybe that question doesn't make sense.

Simon Hudon (Oct 07 2018 at 22:36):

You can make it work in that situation I believe. Once you configured the package, go in _target/deps/mathlib and call ./install_hooks.sh. If different versions of the same package use different branches of mathlib, switching between the versions of your package might get cheaper (we'd have to test it to make sure)

Bryan Gin-ge Chen (Oct 07 2018 at 22:36):

hint: The '.git/hooks/post-checkout' hook was ignored because it's not set as executable.
hint: You can disable this warning with `git config advice.ignoredHook false`.

Should I change the permissions on these files?

Simon Hudon (Oct 07 2018 at 22:37):

Yes good idea. And I'll make fixes to the scripts

Simon Hudon (Oct 07 2018 at 23:00):

Any luck?

Bryan Gin-ge Chen (Oct 07 2018 at 23:08):

I've been switching back and forth between master and tutorials and running lean --make and each time it does a full rebuild. If I do lean --make does that force a rebuild no matter what?

Bryan Gin-ge Chen (Oct 07 2018 at 23:08):

I'm on my macbook.

Scott Morrison (Oct 07 2018 at 23:09):

No, it doesn't/

Bryan Gin-ge Chen (Oct 07 2018 at 23:10):

Hmm, my master was out of date. Let me try again.

Scott Morrison (Oct 07 2018 at 23:11):

The problem is likely that master has introduced changes to stuff way down the dependency tree.

Scott Morrison (Oct 07 2018 at 23:11):

If you rebase tutorial onto master it should make life better.

Bryan Gin-ge Chen (Oct 07 2018 at 23:13):

I think I see what you're saying, but shouldn't caching all the olean files associated to a given commit also preserve the build status?

Simon Hudon (Oct 07 2018 at 23:14):

It should. Can you ls .bin/branches and tell me what you get?

Bryan Gin-ge Chen (Oct 07 2018 at 23:14):

Where's .bin?

Bryan Gin-ge Chen (Oct 07 2018 at 23:17):

Oh I see, I guess .bin somehow hasn't been created.

Simon Hudon (Oct 07 2018 at 23:34):

It is created when you commit or rebase. You can also manually call .git/hooks/cache_olean.sh

Bryan Gin-ge Chen (Oct 08 2018 at 00:02):

I ran cache_olean.sh after building on tutorials and master, and the olean files seem to be in .bin/branches/ but running lean --make still starts a rebuild.

Simon Hudon (Oct 08 2018 at 00:08):

What if you run restore_olean.sh before rebuilding?

Scott Morrison (Oct 08 2018 at 00:08):

./install_hooks.sh
cp: .git/hooks: Not a directory

Scott Morrison (Oct 08 2018 at 00:09):

Just needs a mkdir -p .git/hooks in install_hooks.sh.

Bryan Gin-ge Chen (Oct 08 2018 at 00:09):

I tried running restore_olean.sh and it's rebuilding again.

Scott Morrison (Oct 08 2018 at 00:11):

I pushed a commit that fixes this.

Simon Hudon (Oct 08 2018 at 00:11):

@Bryan Gin-ge Chen Can you check what the time stamp is on the most recently modified file in .bin/branches/tutorial? I'm assuming that's the one you're trying to cache.

Simon Hudon (Oct 08 2018 at 00:11):

@Scott Morrison You're too fast! Thanks!

Scott Morrison (Oct 08 2018 at 00:12):

Ah, but I fixed the wrong problem.

Scott Morrison (Oct 08 2018 at 00:12):

The actual problem is that I don't have a .git directory.

Scott Morrison (Oct 08 2018 at 00:12):

I have a .git file, because I'm using a submodule.

Scott Morrison (Oct 08 2018 at 00:12):

Crap...

Scott Morrison (Oct 08 2018 at 00:12):

This is not going to affect many people at all, but I guess I want a fix myself. :-)

Simon Hudon (Oct 08 2018 at 00:12):

Darn! How come you work without git?

Scott Morrison (Oct 08 2018 at 00:13):

The problem is that I work with git for _everything_. My entire home directory is a giant git repository, and everything else is a git submodule, so things get synchronised between my computers.

Simon Hudon (Oct 08 2018 at 00:14):

Actually, if you go in _target/deps/mathlib and call install_hooks.sh there, I'm hoping that will do the job

Scott Morrison (Oct 08 2018 at 00:14):

Well, but I mostly want this for mathlib itself.

Scott Morrison (Oct 08 2018 at 00:14):

Switching between branches that touch tactic/basic.lean is unpleasant...

Bryan Gin-ge Chen (Oct 08 2018 at 00:16):

The timestamps all seem to be from when I ran cache_olean.sh.

Scott Morrison (Oct 08 2018 at 00:17):

@Simon Hudon, why do you put cache_olean.sh and restore_olean.sh in the .git/hooks directory, rather than leaving them in the root directory?

Simon Hudon (Oct 08 2018 at 00:18):

@Bryan Gin-ge Chen That is useful information. I'll look into preserving time stamps properly

Simon Hudon (Oct 08 2018 at 00:19):

@Scott Morrison I want to be able to use the hooks even on branches where the files aren't in the repo

Scott Morrison (Oct 08 2018 at 00:19):

Ah, good point.

Bryan Gin-ge Chen (Oct 08 2018 at 00:19):

The timestamps of the .lean files are bumped when I checkout a new branch. So do we need the timestamps of the olean files to be more recent than that?

Simon Hudon (Oct 08 2018 at 00:24):

@Bryan Gin-ge Chen do you mean that the .lean timestamps are set to the time at which you checkout the files?

Bryan Gin-ge Chen (Oct 08 2018 at 00:25):

Yes.

Simon Hudon (Oct 08 2018 at 00:25):

@Scott Morrison Maybe there's a way to include a search step at the beginning of the script to climb up the directory structure until a .git is found.

Simon Hudon (Oct 08 2018 at 00:26):

@Bryan Gin-ge Chen What platform are you on?

Scott Morrison (Oct 08 2018 at 00:26):

There's a nice fix, almost pushed.

Scott Morrison (Oct 08 2018 at 00:26):

git rev-parse --git-dir tells you where your .git directory really is.

Scott Morrison (Oct 08 2018 at 00:27):

e.g. for me: /Users/scott/.git/modules/projects/lean/mathlib

Scott Morrison (Oct 08 2018 at 00:29):

Okay, pushed that change.

Scott Morrison (Oct 08 2018 at 00:29):

I think we need more hooks. I want my olean files cached even if we don't make a commit!

Simon Hudon (Oct 08 2018 at 00:29):

Thanks! Did you put that .git dir in a variable?

Scott Morrison (Oct 08 2018 at 00:29):

e.g. I never commit to master, but I want a cache.

Simon Hudon (Oct 08 2018 at 00:29):

Yeah, me too! I haven't found a good hook for that

Scott Morrison (Oct 08 2018 at 00:30):

I just have GITDIR=`git rev-parse --git-dir in most of the scripts.

Scott Morrison (Oct 08 2018 at 00:30):

What if when we checkout, we _first_ restore, _then_ cache.

Simon Hudon (Oct 08 2018 at 00:30):

I think we might need to get leanpkg or, emacs / VS code to do that after you build

Bryan Gin-ge Chen (Oct 08 2018 at 00:30):

macOS. Not all of the .lean files have their timestamps bumped, but the ones that don't exist in master certainly do.

Scott Morrison (Oct 08 2018 at 00:30):

If there was something to restore, then it just harmlessly puts it back again.

Scott Morrison (Oct 08 2018 at 00:30):

But if there was nothing to restore, then we build a cache?

Scott Morrison (Oct 08 2018 at 00:30):

hmm...

Scott Morrison (Oct 08 2018 at 00:30):

no

Scott Morrison (Oct 08 2018 at 00:31):

because that cache will be wrong. :-)

Bryan Gin-ge Chen (Oct 08 2018 at 00:33):

All the files that are changed between branches have their timestamps set to the time I do the checkout.

Scott Morrison (Oct 08 2018 at 00:37):

What if on every checkout we:
1) run cache before checkout (saving the state of the branch we're leaving)

Scott Morrison (Oct 08 2018 at 00:37):

2) run restore after checkout

Scott Morrison (Oct 08 2018 at 00:37):

3) run cache after that

Simon Hudon (Oct 08 2018 at 00:37):

What a bummer. I found this information here. https://confluence.atlassian.com/bbkb/preserving-file-timestamps-with-git-and-mercurial-781386524.html

It recommends a certain build system as a solution

Simon Hudon (Oct 08 2018 at 00:38):

@Scott Morrison run cache after the restore or after build?

Scott Morrison (Oct 08 2018 at 00:38):

I was thinking just immediately after the restore.

Scott Morrison (Oct 08 2018 at 00:38):

This has the effect of actually creating the cache merely by checking out.

Scott Morrison (Oct 08 2018 at 00:39):

Otherwise I don't see how a cache for master will ever be created

Bryan Gin-ge Chen (Oct 08 2018 at 00:39):

Could we just touch all the .olean files right before restoring?

Scott Morrison (Oct 08 2018 at 00:40):

I think the problem is that risk making things worse.

Simon Hudon (Oct 08 2018 at 00:40):

Yeah

Scott Morrison (Oct 08 2018 at 00:40):

(deleted)

Scott Morrison (Oct 08 2018 at 00:41):

(deleted)

Simon Hudon (Oct 08 2018 at 00:41):

What I'm thinking of is, after a checkout, look at each lean file and set its time stamp to that of the latest commit that changed it

Scott Morrison (Oct 08 2018 at 00:41):

oof...

Scott Morrison (Oct 08 2018 at 00:42):

(Presumably you mean to the time stamp of the corresponding olean file.)

Reid Barton (Oct 08 2018 at 00:42):

What if you just don't use cp -a?

Simon Hudon (Oct 08 2018 at 00:44):

The olean files will all be interpreted as up-to-date but some might not be

Simon Hudon (Oct 08 2018 at 00:46):

(Presumably you mean to the time stamp of the corresponding olean file.)

No. For any source file we need to determine if we need to rebuild it. One information we need is when we last built it which is why we shouldn't overwrite the time stamp of the olean file. The other one is when we last change it which is why we wouldn't want git to overwrite the time stamp of a file but it does.

Simon Hudon (Oct 08 2018 at 00:47):

I think we'd like to know if the file was built since the last commit that changed it

Scott Morrison (Oct 08 2018 at 07:58):

This olean caching stuff is really important, and I will be super happy if we can get it working smoothly. I'm not sure I can help at the moment with the timestamp issues, but let me know if there's anything I can do.

Scott Morrison (Oct 08 2018 at 07:59):

I hope one day we could even build a distributed cache; I could certainly contribute CPU cycles building olean files for central distribution.

Simon Hudon (Oct 08 2018 at 14:37):

I'll keep at it

Chris Hughes (Oct 08 2018 at 18:34):

I just have 17 copies of mathlib saved, so I never have to switch branches.

Johan Commelin (Oct 08 2018 at 18:37):

Diskspace is cheap...

Simon Hudon (Oct 08 2018 at 18:42):

I used to work like that. As the number of branches changes, I find it hard to manage

Tobias Grosser (Oct 09 2018 at 08:14):

Interesting. I am not really up-to-speed here, but i might be interesting to look at existing tools which address these issues.

Tobias Grosser (Oct 09 2018 at 08:14):

Specifically, there is 'ninja' a very fast alternative for make.

Tobias Grosser (Oct 09 2018 at 08:15):

And 'ccache' which caches the compilation of C/C++ files.

Tobias Grosser (Oct 09 2018 at 08:15):

If lean would have a way to export the "include" files, it might be possible to adapt these tools (or use at least their design).

Tobias Grosser (Oct 09 2018 at 08:16):

My feeling is that connecting .olean files to git branches is pretty fragile "hacky".

Tobias Grosser (Oct 09 2018 at 08:17):

Hashing the source files and having a proxy for returning the right source file in case a hash is already known is a strategy that works well in the OS projects I have been working on.

Mario Carneiro (Oct 09 2018 at 08:17):

lean already uses ninja for compilation

Mario Carneiro (Oct 09 2018 at 08:18):

You would have to hash the file plus the hashes of dependent files, git-style

Tobias Grosser (Oct 09 2018 at 08:19):

Right. The ccache docu explains well what is needed to do this kind of caching.

Tobias Grosser (Oct 09 2018 at 08:24):

@Mario Carneiro , are you saying lean is using 'ninja' for compiling the lean binary, or is it called as part of 'lean --make'?

Gabriel Ebner (Oct 09 2018 at 08:29):

ninja can optionally be used to compile lean itself. lean --make doesn't use ninja (the equivalent linja tool in Lean 2 did though). Ninja doesn't do anything more fancy than lean --make, I believe. (Does ninja use content hashes or modification times?) Something like ccache might be interesting when switching between branches.

Tobias Grosser (Oct 09 2018 at 08:29):

Also, to get the dependent files lean already has a command '--deps'.

Tobias Grosser (Oct 09 2018 at 08:30):

Ninja does not do anything more fancy than lean --make, I assume.

Tobias Grosser (Oct 09 2018 at 08:30):

However, it is _very_ fast.

Tobias Grosser (Oct 09 2018 at 08:31):

in checking if files have been modified.

Gabriel Ebner (Oct 09 2018 at 08:31):

That is pretty much the least expensive part of running lean --make though.

Tobias Grosser (Oct 09 2018 at 08:31):

This does not matter for the perf problems we have today, but I feel it will matter in the future.

Mario Carneiro (Oct 09 2018 at 08:32):

the problem is that the usual dependency analysis will say that too many files are affected and compile them all

Tobias Grosser (Oct 09 2018 at 08:33):

Right, it's more a difference between 5 seconds and 0.01 second on clean builds. It really improves productivity on large C++ code bases and might be useful for lean as well (especially if we interactively want to update proofs).

Tobias Grosser (Oct 09 2018 at 08:33):

Right. Most of likely requires changes in lean proper.

Mario Carneiro (Oct 09 2018 at 08:34):

0.01 second vs 1 second to tell me that nothing needs doing doesn't seem so useful

Mario Carneiro (Oct 09 2018 at 08:34):

but if you could reason that most of the library is unaffected by a new theorem in logic.basic that would be a HUGE gain

Mario Carneiro (Oct 09 2018 at 08:34):

like an hour compilation

Tobias Grosser (Oct 09 2018 at 08:35):

I mean these are orthogonal.

Mario Carneiro (Oct 09 2018 at 08:35):

is it? I'm talking about proper caching and dependency analysis

Tobias Grosser (Oct 09 2018 at 08:36):

You need all a) proper dependency analysis, b) good caching, c) worry about how to check if files have changed.

Tobias Grosser (Oct 09 2018 at 08:36):

ninja is really good in minimizing file IO and stat calls to very quickly check for c)

Mario Carneiro (Oct 09 2018 at 08:37):

If there were thousands of files then I can see this being a problem

Mario Carneiro (Oct 09 2018 at 08:37):

but at that point total compile times will dwarf all of this

Gabriel Ebner (Oct 09 2018 at 08:38):

In server mode we don't even do any file IO? We actually have a precomputed dependency graph.

Tobias Grosser (Oct 09 2018 at 08:39):

But it does not help with a) and b).

Tobias Grosser (Oct 09 2018 at 08:39):

I see.

Tobias Grosser (Oct 09 2018 at 08:40):

So my feeling is at some point we would just need proper caching in server mode.

Tobias Grosser (Oct 09 2018 at 08:40):

Which would be a lean specific thing, that just needs to be implemented, right?

Sebastian Ullrich (Oct 09 2018 at 15:49):

"just"

Sebastian Ullrich (Oct 09 2018 at 15:49):

:)

Simon Hudon (Oct 09 2018 at 21:25):

I have created a branch of ccache to try and add support for Lean. I saw an issue discussion that suggests that it might be a simple business so I thought I'd try it out: https://github.com/leanprover-community/ccache

Sebastian Ullrich (Oct 09 2018 at 21:27):

Wow, I've been thinking about doing this for a long time, but didn't think anyone would ever actually go ahead with it...

Sebastian Ullrich (Oct 09 2018 at 21:28):

Though are you sure you don't want to modify https://github.com/mozilla/sccache instead :) ?

Sebastian Ullrich (Oct 09 2018 at 21:28):

(No idea if it would actually be easier, just newer code base, cloud support, and my preference for Rust)

Simon Hudon (Oct 09 2018 at 21:30):

That sounds like an even more useful tool. I don't know what it would take to adapt though

Simon Hudon (Oct 09 2018 at 21:31):

Do I understand correctly that it allows sharing binaries between team members or community members?

Sebastian Ullrich (Oct 09 2018 at 21:33):

I assume it's intended for sharing with trusted team members. Securing public r/w access to some cloud storage doesn't sound like a simple issue.

Simon Hudon (Oct 09 2018 at 21:35):

I think I'll put that issue aside to handle a simpler problem :)

Patrick Massot (Oct 09 2018 at 21:36):

Sounds like an important issues, I've seen people discussing assuming nat = int, they probably shouldn't be trusted

Kenny Lau (Oct 09 2018 at 21:36):

how about we remove *.olean from gitignore?

Simon Hudon (Oct 09 2018 at 21:40):

It is a big no-no. We don't want to keep the history of those files, we just want appropriate binaries. Keeping those binaries will make git slower and will make merging more complicated.

Chris Hughes (Oct 09 2018 at 21:41):

Also olean is OS dependent

Simon Hudon (Oct 09 2018 at 21:43):

@Sebastian Ullrich it looks like sccache works with Lean out of the box. You just call sccache leanpkg build. Does that make sense to you?

Sebastian Ullrich (Oct 09 2018 at 21:44):

I have no idea. Does it look like it actually does anything?

Sebastian Ullrich (Oct 09 2018 at 21:45):

(@Chris Hughes It's not, AFAIK. Though it will be architecture-dependent in Lean 4)

Simon Hudon (Oct 09 2018 at 21:46):

It does look like it actually calls leanpkg but I'll have to experiment and see if a cache is created.

Sebastian Ullrich (Oct 09 2018 at 21:48):

Yes, that would be the important part :smiley:

Simon Hudon (Oct 09 2018 at 21:50):

If it does work out of the box, the next step would be integration: how do we make it invisible to the Lean user

Simon Hudon (Oct 09 2018 at 22:00):

I think part of it working with leanpkg and elan is to create a symbolic link so that when lean is called, it calls sccache lean instead. @Sebastian Ullrich Does that sound like that could break the build system?

Sebastian Ullrich (Oct 09 2018 at 22:09):

That could work

Simon Hudon (Oct 09 2018 at 22:14):

Right now, I put a shell script called ~/.sccache/bin/lean that just prints "calling lean" and I put it in my path (I checked with which lean) but leanpkg does not seem to call it

Sebastian Ullrich (Oct 09 2018 at 22:20):

oh https://github.com/leanprover/lean/blob/b13ac127fd83f3724d2f096b1fb85dc6b15e3746/bin/leanpkg#L24

Simon Hudon (Oct 09 2018 at 22:21):

So how does elan redirect to the right version of lean?

Sebastian Ullrich (Oct 09 2018 at 22:22):

It doesn't in that case, it just calls the right version of leanpkg :)

Simon Hudon (Oct 09 2018 at 22:23):

argh!

Simon Hudon (Oct 09 2018 at 22:24):

Any chance we might hard code ccache or sccache into elan?

Sebastian Ullrich (Oct 09 2018 at 22:25):

How would that help?

Simon Hudon (Oct 09 2018 at 22:27):

Maybe it could call a leanpkg-2 bash script instead of leanpkg and also produce the leanpkg-2 script so that the paths are set properly (for sccache). I have a feeling that that bash script did not change a lot between versions. Am I right?

Sebastian Ullrich (Oct 09 2018 at 22:30):

I see. But if you want to support both sccache leanpkg and sccache lean anyway, are you sure you save any code by making the former call the latter?

Simon Hudon (Oct 09 2018 at 22:32):

Actually, what I'd like to do is just replace every call to lean to sccache lean. sccache seems to index its cache by command line arguments.

Simon Hudon (Oct 09 2018 at 22:35):

Actually, it might be enough to have elan rename lean to lean-2 and create a bash script lean in its place to call sccache when elan downloads a version of lean

Simon Hudon (Oct 10 2018 at 21:56):

@Sebastian Ullrich It's starting to look to me like we may have to alter the way lean --make works for either of sccache or ccache to be useable

Sebastian Ullrich (Oct 10 2018 at 21:56):

I wouldn't be surprised :)

Simon Hudon (Oct 10 2018 at 21:57):

Bummer, I was hoping it might be a small(-ish) thing to do

Sebastian Ullrich (Oct 10 2018 at 21:59):

I was fully prepared that someone might at some point write an alternative leanpkg that fixes many issues with it (and would probably not be written in Lean)

Simon Hudon (Oct 10 2018 at 22:01):

Is there a way to use lean to export all the dependencies in a project? lean --deps seems to interact poorly with leanpkg.toml files

Simon Hudon (Oct 10 2018 at 22:05):

If I could use it, maybe I could generate a Makefile (or ninja config?) and that might be a better starting point than lean --make

Sebastian Ullrich (Oct 10 2018 at 22:16):

I don't think so, you basically have to parse the file header...

Tobias Grosser (Oct 11 2018 at 15:06):

+1 for 'ninja + make', at best if it's auto-generated from the lean files.

Simon Hudon (Oct 11 2018 at 15:07):

The other tool I've heard of is Bazel. Have you heard of it and would you endorse it?

Tobias Grosser (Oct 11 2018 at 15:10):

Never used it.

Tobias Grosser (Oct 11 2018 at 15:10):

I feel it might be too high level.

Tobias Grosser (Oct 11 2018 at 15:10):

In the end we know all dependences.

Tobias Grosser (Oct 11 2018 at 15:10):

And just want to declare them and execute.

Tobias Grosser (Oct 11 2018 at 15:13):

I mean, you might also try to use cmake:
https://cmake.org/pipermail/cmake/2011-April/043761.html

Tobias Grosser (Oct 11 2018 at 15:13):

Not sure if this makes things easier.

Tobias Grosser (Oct 11 2018 at 15:14):

I fell, getting dependences out of lean files is certainly a useful component in any of these tools.

Simon Hudon (Oct 11 2018 at 15:17):

In the end, we'll probably need to parse the Lean files ourselves though, no? Unless we can get lean to get them out for us.

Simon Hudon (Oct 11 2018 at 15:20):

In the meantime, I started a branch in sccache and I talked to the developers on how to support Lean. It sounds doable. I just need to learn a bit more of Rust. And they seem willing to take our patch

Tobias Grosser (Oct 11 2018 at 15:21):

Cool. This sounds very exciting, indeed.

Tobias Grosser (Oct 11 2018 at 15:21):

What is the problem with --deps?

Reid Barton (Oct 11 2018 at 15:22):

As far as I can tell, it just doesn't work properly

Simon Hudon (Oct 11 2018 at 15:23):

I can't get it to work at the scale of a project. It seems to work off of LEAN_PATH which is no longer required (and indeed harmful) to work with leanpkg

Tobias Grosser (Oct 11 2018 at 15:31):

I see. I have no idea of this code, but fixing it seems the most direct thing to do. In fact, does it do a lot more than parsing the imports?

Tobias Grosser (Oct 11 2018 at 15:32):

AFAIU it should just parse the imports and then print this to stdout in some way.

Reid Barton (Oct 11 2018 at 15:32):

Simon, have you gotten it to work under any circumstances?

Reid Barton (Oct 11 2018 at 15:48):

Oh, I figured out why it is broken. It thinks that the presence of both foo.lean and foo.olean means that the import of foo is ambiguous.

Reid Barton (Oct 11 2018 at 15:48):

So, this is something we could work around (in principle, at least)

Simon Hudon (Oct 11 2018 at 16:00):

I was actually thinking that temporarily setting LEAN_PATH might fix it but I'm not done experimenting with it

Reid Barton (Oct 11 2018 at 16:01):

I looked at strace -e trace=file output and it's clearly not failing to find the files it's looking for

Reid Barton (Oct 11 2018 at 16:02):

And it is looking in the correct paths as reported by lean -p (and examining the source seems to indicate that lean -p is using the same path information as lean --deps)

Simon Hudon (Oct 11 2018 at 16:13):

@Sebastian Ullrich Any chance we may get a fix for lean --deps?

Sebastian Ullrich (Oct 11 2018 at 16:16):

@Simon Hudon Yes, I feel that could be a reasonable change. It would be great if you or someone else could investigate fixing it

Simon Hudon (Oct 11 2018 at 16:19):

Sure I'll look into it. Do you have a suggestion of where I should start?

Sebastian Ullrich (Oct 11 2018 at 16:23):

The relevant call should be this one: https://github.com/leanprover/lean/blob/b13ac127fd83f3724d2f096b1fb85dc6b15e3746/src/shell/lean.cpp#L698

Sebastian Ullrich (Oct 11 2018 at 16:23):

Wow, display_deps is still testing for .hlean and .lua

Reid Barton (Oct 11 2018 at 16:27):

What I have inferred from looking at the source and the strace output: Say the module imports data.rbmap. display_deps calls

std::string find_file(search_path const & paths, std::string const & base, optional<unsigned> const & rel, name const & fname,
                      std::initializer_list<char const *> const & extensions) {

in src/util/lean_path.cpp. That calls file_path just above, because the import is not relative. That one tries each extension in the given list and pushes all the successes into a list of results.

Reid Barton (Oct 11 2018 at 16:28):

Because both data/rbmap/default.lean and data/rbmap/default.olean are found, that function raises an "ambiguous import" exception. Then display_deps catches the exception and reports an uninformative "file not found" error message.

Reid Barton (Oct 11 2018 at 16:29):

You may want to verify this by inserting tracing printfs and/or having display_deps show the original exception

Simon Hudon (Oct 11 2018 at 16:29):

Wow! you're fast!

Sebastian Ullrich (Oct 11 2018 at 16:30):

Nice work! This is the code the regular path uses for importing, you may want to adapt that https://github.com/leanprover/lean/blob/b13ac127fd83f3724d2f096b1fb85dc6b15e3746/src/library/module_mgr.cpp

Reid Barton (Oct 11 2018 at 16:30):

I was just looking into this earlier (see my messages of 42 minutes ago)

Sebastian Ullrich (Oct 11 2018 at 16:32):

Note that this code already looks completely different in Lean 4. So fixing this would only make sense if it was a big win for Lean 3 users (which a working sccache would probably be)

Simon Hudon (Oct 11 2018 at 16:34):

I think it would be. It seems like a small fix on the Lean side and I'm hopeful sccache should be easily ported from Lean 3 to Lean 4 (once it works for Lean 3)

Simon Hudon (Oct 11 2018 at 16:34):

@Reid Barton Can I leave that fix to you?

Reid Barton (Oct 11 2018 at 16:42):

I can take a shot at it. We should make sure we have some working setup for sccache before upstreaming the change to Lean though. For testing purposes, you should be able to just remove ".olean" from the list of extensions in display_deps.

Simon Hudon (Oct 11 2018 at 16:44):

we have some working setup for sccache before upstreaming the change to Lean though

I think the fix is usable on its own. Why do you think we should wait?

Reid Barton (Oct 11 2018 at 16:47):

Only because Lean 3 is currently in this state where it doesn't change without a particularly good reason

Simon Hudon (Oct 11 2018 at 16:50):

It makes it harder to progress if we wait for everything to be ready before we push anything. If we do the heavy lifting, I don't think they will begrudge us fixing a problem even if the sccache idea doesn't pan out.

Simon Hudon (Oct 11 2018 at 17:47):

I just tried removing .olean from the list in display_deps and it's working. Is that a good enough fix?

Reid Barton (Oct 11 2018 at 18:43):

It seems like it should be good enough for us, though there could be other bugs lurking underneath. The other question is whether it would break any other users of Lean and specifically of lean --deps. The only way I could imagine the change breaking a working system is if that system uses out-of-tree builds (output .olean files not located in the same place as the corresponding source files) and also relies on .olean files as inputs without corresponding source .lean files. That seems pretty unlikely to me, considering that lean's library itself is not built out-of-tree, so at a minimum the implicit import of init will fail.

Simon Hudon (Oct 11 2018 at 18:45):

That's also my impression. I wonder if they had anything else in mind when they put .olean in that list.

Simon Hudon (Oct 11 2018 at 18:49):

@Sebastian Ullrich Here's the fix that did it for me: https://github.com/cipher1024/lean/commit/d2d81a3539bdf952eaf6126d2be69f2fda0b2f1f

Simon Hudon (Oct 11 2018 at 19:02):

Also, is --make the only way to produce .olean files? For me, even if I give a file name, it seems to build the whole project. Is there any way around that?

Reid Barton (Oct 11 2018 at 19:05):

It should only build the file you specify (and its dependencies if necessary)

Simon Hudon (Oct 11 2018 at 19:08):

You're right, I misinterpreted what I saw

Simon Hudon (Oct 11 2018 at 19:08):

Now with the fix, I managed to build a make file that discovers dependencies and builds what it needs.

Tobias Grosser (Oct 11 2018 at 19:15):

Nice!

Simon Hudon (Oct 11 2018 at 19:21):

I decided to go for bare minimum and only use make. We can make it more sophisticated once we see it is working.

Tobias Grosser (Oct 11 2018 at 19:22):

Seems like a good idea. Looking forward to checkout the make file.

Simon Hudon (Oct 11 2018 at 19:45):

I'll push it in a branch in a moment. If you have comments, I'd love to hear them

Simon Hudon (Oct 11 2018 at 19:45):

It won't be functional because I hard coded a path to my Lean version

Sebastian Ullrich (Oct 11 2018 at 20:03):

You can use elan to manage your local Lean fork :)

Simon Hudon (Oct 11 2018 at 20:03):

Niiice! How?

Sebastian Ullrich (Oct 11 2018 at 20:04):

elan help toolchain link

Sebastian Ullrich (Oct 11 2018 at 20:05):

"crate", oops

Simon Hudon (Oct 11 2018 at 20:06):

Haha! I didn't even notice! I was thinking that you've been pretty thorough in replacing Rust with Lean :)

Simon Hudon (Oct 11 2018 at 20:08):

Btw, have you looked at my commit?

Sebastian Ullrich (Oct 11 2018 at 20:12):

Yeah, I think it should be fine. You can remove ".lua" too.

Simon Hudon (Oct 11 2018 at 20:13):

Will do!

Simon Hudon (Oct 11 2018 at 20:36):

Here we go: https://github.com/leanprover/lean/pull/1978

Simon Hudon (Oct 11 2018 at 21:24):

Is it possible to use elan to refer to a specific git commit on github?

Simon Hudon (Oct 11 2018 at 22:44):

@Tobias Grosser here is what I did for the Makefile. It generates dependency files for each .lean file before it gets started building any files and the first time you get a lot of error messages because they are missing.

https://github.com/leanprover-community/mathlib/commit/d090d711c0b438223479635b663e720dab2d07e7

Simon Hudon (Oct 11 2018 at 23:10):

@Sebastian Ullrich You use emacs right? What do you use for Rust?

Tobias Grosser (Oct 12 2018 at 05:45):

Nice.

@Tobias Grosser here is what I did for the Makefile. It generates dependency files for each .lean file before it gets started building any files and the first time you get a lot of error messages because they are missing.

https://github.com/leanprover-community/mathlib/commit/d090d711c0b438223479635b663e720dab2d07e7

Now just the sccache needs to work. I feel this might indeed be a great improvement.

Best Tobias

Johan Commelin (Oct 16 2018 at 18:33):

My laptop just crashed while compiling mathlib. This is the first time it crashed in a couple of years. We really need to make mathlib easier on prehistoric hardware :dragon:

Simon Hudon (Oct 16 2018 at 19:36):

I'm sciencing as fast as I can ... it's coming

Kenny Lau (Oct 16 2018 at 19:40):

or we can make mathlib compile faster

Simon Hudon (Oct 16 2018 at 19:42):

I don't think we should pick just one

Scott Morrison (Nov 19 2018 at 21:20):

Hi @Simon Hudon, would you have time to give a status update on sccache integration, in particular pointing out if there are places someone else could help write code/investigate issues/do some research?

Simon Hudon (Nov 19 2018 at 21:22):

Sure. I can push what I currently have

Simon Hudon (Nov 19 2018 at 21:23):

What I did is provide a way for sscache to determine if an olean file is outdated or not through hashing.

Simon Hudon (Nov 19 2018 at 21:24):

I'm still struggling to get the tests to pass. I think once that work, we should have a version sscache that can locally cache any lean build we like.

Simon Hudon (Nov 19 2018 at 21:24):

I wrote a Makefile to take advantage of it and I pushed it on leanprover-community. I also made a patch to lean itself to fix the way it computes dependencies

Simon Hudon (Nov 19 2018 at 21:26):

After I'm done with this part, one possible extension (if you want to spend the time) is figure out how to get some cloud storage to store a Lean cache. That would involve configuring sccache itself and setting up instructions for people who want to access the cache.

Simon Hudon (Nov 19 2018 at 21:26):

At this point I see that more as goody than as a must have though

Scott Morrison (Nov 20 2018 at 01:40):

I pushed two minor changes to the build-system branch.

Scott Morrison (Nov 21 2018 at 05:39):

Where is that "is an olean file outdated, via hashing" code?

Scott Morrison (Nov 21 2018 at 05:43):

Using timestamps to decide whether to recompile makes life really difficult. Do we know an obstacle (besides patching Lean), to instead have each olean file contain a hash of its source file and of the olean files for each of its imports? To decide if an olean file is stale or not you would read those hash from the olean file, and compare them against the current hashes of those files.

Scott Morrison (Nov 21 2018 at 05:43):

Once one .lean file changes, this would propagate staleness of all dependent .olean files.

Scott Morrison (Nov 21 2018 at 05:43):

And no timestamps would ever be considered.

Scott Morrison (Nov 21 2018 at 05:44):

Or, @Simon Hudon, is this what you've already done?

Simon Hudon (Nov 21 2018 at 05:47):

That's what I'm working on yes. More exactly, when calling sccache lean --make foo.lean, we get the list of modules imported (directly or indirectly) by foo.lean, we hash all of them (including foo.lean) and recompile only if that hash changes.

Scott Morrison (Nov 21 2018 at 05:49):

I see -- you hash the source files of the imports, or the olean files?

Scott Morrison (Nov 21 2018 at 05:50):

And this is all being done by the sccache wrapper, with no modification of Lean?

Simon Hudon (Nov 21 2018 at 05:50):

The sources

Simon Hudon (Nov 21 2018 at 05:50):

That's correct

Scott Morrison (Nov 21 2018 at 05:50):

But won't we then get burnt still by Lean being wrong about what is stale?

Simon Hudon (Nov 21 2018 at 05:51):

I'd like to hash the olean but all the ways I have thought of are flawed

Simon Hudon (Nov 21 2018 at 05:51):

What do you mean?

Scott Morrison (Nov 21 2018 at 05:51):

Say we have files A < B < C, and A and B are already built correctly, according to their hashes, while C needs to be rebuilt.

Scott Morrison (Nov 21 2018 at 05:51):

But suppose the timestamps on A.olean and B.olean are messed up.

Scott Morrison (Nov 21 2018 at 05:52):

When sccache invokes Lean on C.lean, what is to stop Lean from going and recompiling A and B?

Simon Hudon (Nov 21 2018 at 05:54):

I don't think this will be happening but I have to double check. Maybe @Sebastian Ullrich can tell us. If we call lean --make foo.lean, will Lean attempt to rebuild any of the dependencies? If so, is there a way to tell it not to?

Simon Hudon (Nov 21 2018 at 05:57):

One thing I have considered is to manually set the modification time of the olean files when they don't need to be rebuilt. That should solve the issue.

Sebastian Ullrich (Nov 21 2018 at 08:27):

Yeah, that's what I would have proposed, too

Johan Commelin (Dec 15 2018 at 19:24):

@Scott Morrison @Simon Hudon Any updates on the caching front? (Yes, indeed, I thought of it again because I'm currently recompiling... :lol:)

Simon Hudon (Dec 15 2018 at 19:25):

Haha! I actually managed to make progress on it today and yesterday. Now I'm testing it and I'm hoping I can roll out a version soon

Johan Commelin (Dec 15 2018 at 19:26):

Awesome news!

Simon Hudon (Dec 17 2018 at 17:16):

I managed to build a version that passes all their tests:

https://github.com/leanprover-community/sccache/tree/lean-support-2

Simon Hudon (Dec 17 2018 at 17:17):

When I try it with my Makefile though, it fails. There seems to be issue with the timing when sccache produces the binaries

Simon Hudon (Dec 17 2018 at 17:18):

If someone wants to try it for themselves, I can share the Makefile


Last updated: Dec 20 2023 at 11:08 UTC