Zulip Chat Archive

Stream: general

Topic: trylean bundle for lean4


Miguel Marco (Aug 03 2023 at 09:50):

Now that mathlib has been finally ported, the only thing that pushes me to stay with lean3 for my general topology course is the difficulty that my students will find to install lean4+mathlib4, compared to just unpacking the trylean bundle with lean3.

So I am considering creating a version of the trylean files with lean4 (which might be hard since i don't even have access to a windows or mac box). Anyways, i would need a hand. What are the steps i should follow to create it?

My first idea would be to get the lean3 version, and start subsituting the components, but there might be a better approach.

MohanadAhmed (Aug 03 2023 at 11:34):

Hello Miguel
Are you targeting a specific platform?

Kevin Buzzard (Aug 03 2023 at 11:44):

If you're using bundles like this in a course then probably you'll have to target all the platforms because the students will have all the OSs (a maths course is unlikely to be running in a computer lab).

However is it really true that it's not possible for the students to just to the git clone and lake exe cache get and lake build themselves?

I for one would love to see this functionality being taken up by the VS code plugin. Mathematicians are in general very uneasy with the command line, for example they typically have no concept of the idea that it is running in a specific directory (or how to see or change that directory). But that's really the main problem I've had over the years. I don't think the change from lean 3 to lean 4 has made things any easier or harder.

Miguel Marco (Aug 03 2023 at 11:44):

Ideally all of them, but the priority would be windows, since it is by far the most popular among my students.

MohanadAhmed (Aug 03 2023 at 11:53):

Kevin Buzzard said:

If you're using bundles like this in a course then probably you'll have to target all the platforms because the students will have all the OSs (a maths course is unlikely to be running in a computer lab).

However is it really true that it's not possible for the students to just to the git clone and lake exe cache get and lake build themselves?

I for one would love to see this functionality being taken up by the VS code plugin. Mathematicians are in general very uneasy with the command line, for example they typically have no concept of the idea that it is running in a specific directory (or how to see or change that directory). But that's really the main problem I've had over the years. I don't think the change from lean 3 to lean 4 has made things any easier or harder.

You are right of course. But I think it is easier to start by handling the issues on one platform, and then handle the minority cases as they come.

MohanadAhmed (Aug 03 2023 at 11:59):

I guess the issue is with the mental load:
You need :

  1. Git (It also better to have a github account)
  2. VS Code (VS code gets you elan and elan gets you lean)
  3. Then you need lake new or 'lake init`
  4. Then you need lake exe cache get
  5. Then lean takes some time to start up and you are not sure if you made a mistake or it is working.

Also, if any step has a problem, a newcomer is not sure where they made the mistake.

MohanadAhmed (Aug 03 2023 at 12:00):

What I am trying to say is I see the value of a "turn key" type solution where you just unpack stuff and get to lean

Miguel Marco (Aug 03 2023 at 12:06):

Kevin Buzzard said:

However is it really true that it's not possible for the students to just to the git clone and lake exe cache get and lake build themselves?

It deppends on what you mean by "possible". Given enough time spent both by me and them, eventually they would get it, but since this is not the main goal of the course (with already limited time available) , but just a small side experiment, we cannot afford to spend that time.

Also, the problem is not only the use of command line (which is a problem in itself, but could be worth getting them used to it), but the many steps that they have to follow. If I tell them to follow the following multi-step process:

  • Install vscode (which involves both downloading and running the installer)
  • Install git for windows (again, download, install, configure..)
  • Install the lean4 extension for vscode
  • open the terminal and run
    • git clone
    • lake exe cache get

they would probably feel overwhelmed, and maybe start with a negative feeling towards lean (or at least, a first impression consisting in "it is a complicated thing").

On the other hand, if i just tell them: "download this, unzip it, and you are done", they wouldn't experience such a friction, and would probably start using lean with a first impression of "it is basically like every other software we are used to".

MohanadAhmed (Aug 03 2023 at 12:13):

Both Git and VSCode support portable Mode with extensions inside the VSCode folder. The part that is missing for me is how to tell vscode-lean extension: "Hey, Elan, lake and lean are in this location don't try to download"

Kevin Buzzard (Aug 03 2023 at 12:14):

I guess the two other possibilities are gitpod and codespaces. In my IISc repo https://github.com/kbuzzard/IISc-experiments I made some notes to self in the README explaining how I set both of these up. In both cases you get a one click experience (assuming you have a github account I guess) which takes you to Lean running (fast) in the browser.

MohanadAhmed (Aug 03 2023 at 12:20):

How was the students experience with gitpod?

Miguel Marco (Aug 03 2023 at 12:21):

Kevin Buzzard said:

I guess the two other possibilities are gitpod and codespaces. In my IISc repo https://github.com/kbuzzard/IISc-experiments I made some notes to self in the README explaining how I set both of these up. In both cases you get a one click experience (assuming you have a github account I guess) which takes you to Lean running (fast) in the browser.

I considered that too, but I don't like the idea of "forcing" my students to register in an external service.

In fact, I managed to integrate vscode-server with lean in the jupyterlab server that we use for other courses (now I remember that I should probably docuemnt it), so that could be a great solution... if we had enough hardware resources: right now, we have a limit of 1GB of RAM per active user, which is totally insufficient to run lean. We could rise it, but then we risk running out of memory in the server during activity peaks.

Kevin Buzzard (Aug 03 2023 at 16:09):

@MohanadAhmed I tried gitpod and it worked fine for me; I don't know if any students actually used it

Eric Wieser (Aug 03 2023 at 16:49):

I had some gitpod trouble on a course unrelated to lean where they ran out of free compute budget; but the end result was that we had to show a few students how to install locally near the end of the course, rather than dealing with a whole classroom of setup problems at all once

Patrick Massot (Aug 03 2023 at 16:56):

Miguel, you are totally right that you can't hope students will follow all these steps. This is exactly why I created the Lean 3 bundles. There is no obstruction to doing it for Lean4 but they will be much larger.

Patrick Massot (Aug 03 2023 at 16:59):

This has been discussed many times on this Zulip. Thr script creating the bundles are on github, you simply need to adapt them to Lean 4. The painful part is to test them. You need access to (virtual) machines with every OS with a very minimal installation.

Miguel Marco (Aug 03 2023 at 17:40):

About the size, i just tried compressing my lake-packages directory (which i guess, would be the biggest part of the bundle). It is about 1.2GB. So I guess we could get the full bundle under 2GB. It is big, but nothing extraordinary nowadays.

MohanadAhmed (Aug 03 2023 at 17:45):

Lean toolchain it self is about 900MB

MohanadAhmed (Aug 03 2023 at 17:47):

added to your original 1.2GB i think it will be above 2GB

Miguel Marco (Aug 03 2023 at 19:35):

MohanadAhmed said:

Lean toolchain it self is about 900MB

About 200MB when compressed.

Kevin Buzzard (Aug 03 2023 at 20:15):

I would just stick some stuff up and let the students test them

Kevin Buzzard (Aug 03 2023 at 20:16):

Tell them you're going to do some bits of topology using a crazy computer system and you're looking for beta testers

Mario Carneiro (Aug 04 2023 at 04:06):

If you bundle a script that does lake exe cache get, there should be no need to bundle mathlib itself, which means it wouldn't be part of the download

Mario Carneiro (Aug 04 2023 at 04:08):

Miguel Marco said:

About the size, i just tried compressing my lake-packages directory (which i guess, would be the biggest part of the bundle). It is about 1.2GB. So I guess we could get the full bundle under 2GB. It is big, but nothing extraordinary nowadays.

In particular, lake exe cache uses a much more efficient compression than just zipping olean files. I think it would be better to run the cache download separately unless you can't rely on a network connection

Eric Wieser (Aug 04 2023 at 05:33):

Does it only use the compressed format over the network, or store it locally too?

Miguel Marco (Aug 04 2023 at 22:21):

you would need git installed on windows to run lake exe cache getright?

Niels Voss (Aug 05 2023 at 00:24):

The installation of some tools can be automated through winget or another package manager (though winget is often not available on old computers or might not be in $PATH).
For example, running winget install Git.Git would be much faster than downloading it manually. The same goes for winget install Microsoft.VisualStudioCode. Then these commands could be put into a script or listed all in one spot.

I think the main disadvantage of this is that it installs a whole bunch of software that the user might not like. For example, I think it adds Open in Git Bash options to the right click menu, which are hard to get rid of. I also don't like the feeling of installing random programs for which I don't know what they do.

Scott Morrison (Aug 05 2023 at 09:20):

Miguel Marco said:

you would need git installed on windows to run lake exe cache getright?

You shouldn't need git for cache, no.

Miguel Marco (Aug 05 2023 at 13:05):

Scott Morrison said:

Miguel Marco said:

you would need git installed on windows to run lake exe cache getright?

You shouldn't need git for cache, no.

Well... it complains if i don't have git installed

Scott Morrison (Aug 05 2023 at 23:31):

Oh --- presumably your lakefile points to git repositories for dependencies. The error (could you show us?) is presumably coming from just lake, rather than from cache.

Scott Morrison (Aug 05 2023 at 23:32):

If your lakefile instead pointed to local directories rather than git repositories for dependencies (is that even possible? perhaps not), then I'd hope that the cache step itself doesn't care about the presence of git.

Jon Eugster (Aug 06 2023 at 11:44):

Regarding the last question, that's definitely possible. I think the syntax is something like require mathlib from "path"/"to"/"local"/"dependency" and I believe I once had such a setup and it worked fine with lake exe cache

Miguel Marco (Aug 06 2023 at 11:51):

does lake exe cache get work too without git?

Miguel Marco (Aug 06 2023 at 21:40):

I finally opted for including the installers of git, vscode and elan in a zip file, together with a couple of scripts to automate their installation and the setup of a project with lake.

It can be downloaded from https://cloud.unizar.es/index.php/s/jkybmaLS3goSLLr if you want to test it

To install it, it should be enough to run the install_step_1.batfile, and then the usual "accept, ok..." stuff to install programs. When it is finished, you run install_step_2.bat to setup the lean extension and create the new project, called trylean.

It is still a bit more convoluted than the "unzip and done" approach, but I think it is less scary than following the steps by hand.

Scott Morrison (Aug 07 2023 at 03:38):

I worry that the last little bit to get it to "double-click and click ok" or "just unzip" is the essential thing missing here. Anything short of that and its yet another collection of hard to maintain scripts that some users are still going to be confused by.

Miguel Marco (Aug 07 2023 at 08:51):

I agree the "just unzip" approach is the desirable one. But I tried to make it work, and wasn't able to get it. So this is my fallback option.

Scott Morrison (Aug 07 2023 at 09:24):

What was the obstacle to just unzipping? I guess it means you can't provide git, but I'd hope that just a portable VSCodium, and a repository with require mathlib from "path"/"to"/"local"/"dependency" could work.

Miguel Marco (Aug 07 2023 at 09:27):

I tried that, but then, the first time you import something from mathlib, it tried to compile it. For some reason, it doesn't use the downloaded cache.

Scott Morrison (Aug 07 2023 at 09:31):

Ah --- that would be because the contents of the lakefile is hashed to determinate the validity of the cached oleans.

Scott Morrison (Aug 07 2023 at 09:33):

So you'd have to recompile everything after changing the lakefile to have local imports, and before distributing the zip file.

Scott Morrison (Aug 07 2023 at 09:33):

That sounds painful, but if we can get it to work we can just have CI prepare such a zip file at some interval.

Miguel Marco (Aug 07 2023 at 09:36):

what is that hash compared against?

Scott Morrison (Aug 07 2023 at 10:12):

build/lib/*.trace

Miguel Marco (Aug 07 2023 at 10:24):

I see, and that should match the hash (what kind of hash? it looks too short to be a md5 and too long to be a cksum) of lakefile.lean ?

Scott Morrison (Aug 07 2023 at 12:07):

No, it is a complicated (and unspecified) hash also of the contents of the file, and its transitive imports.

Scott Morrison (Aug 07 2023 at 12:07):

I'm just saying you need to do a complete rebuild before zipping everything up.

Scott Morrison (Aug 07 2023 at 12:08):

(That said, you can save a huge amount of space by compressing the oleans using Mario's leangz tool (in practice you would do this using the cache tool, I think) relative to just zipping everything.)

Scott Morrison (Aug 07 2023 at 12:09):

(But that is something you could think about after getting some zip file to work at all. :-)

Miguel Marco (Aug 07 2023 at 12:17):

I see. Would it be possible to compile it in a linux system and then copy (just the lake-packagesdirectory) to windows? or the platform is also part of the hash?

Miguel Marco (Aug 07 2023 at 17:23):

It seems it must be compiled on the target platform: compiling it in linux and then copying to windows still triggers the compiling.

I wonder if the specific path in the filesystem would cause recompilations too.

Jz Pan (Aug 07 2023 at 17:58):

Is it possible to ship a modified version of lean which bypasses all the hash check of olean files?

Jz Pan (Aug 07 2023 at 17:59):

Or add a command line switch to lean which bypasses hash check

Scott Morrison (Aug 07 2023 at 23:12):

Changing OS should not invalidate the oleans in any way. We only distribute one version of the oleans through lake exe cache get. Are you sure nothing else is changing in the directory?

MohanadAhmed (Aug 07 2023 at 23:49):

Hello @Miguel Marco

Here is my attempt it seems to work. I tested it on Windows 11 in a Virtual machine. So basically copy the script into a .bat file in an empty folder and then execute the script. (After a while - get a cup of coffee you should see VSCodium with a new lean project open).

:: PrepareLean.bat
@ECHO OFF

::::::::::::::::::: Individual Components URLS:
:: (1) 7z to extract git, (2) Git (3) VC Redistributable for Lean Tar (4) VSCodium the editor
:: (5) Current Mathlibs Version of Lean (6) Elan Installer Script (7) Lean VSCode extension
set Z7Z_URL="https://www.7-zip.org/a/7zr.exe"
set GIT_URL="https://github.com/git-for-windows/git/releases/download/v2.41.0.windows.3/PortableGit-2.41.0.3-64-bit.7z.exe"
set VC_REDIST_URL="https://aka.ms/vs/17/release/vc_redist.x64.exe"
set VSCODIUM_URL="https://github.com/VSCodium/vscodium/releases/download/1.81.0.23216/VSCodium-win32-x64-1.81.0.23216.zip"
set MATHLIB_LEAN_TOOLCHAIN_URL="https://raw.githubusercontent.com/leanprover-community/mathlib4/master/lean-toolchain"
set ELAN_INSTALLER_URL="https://raw.githubusercontent.com/leanprover/elan/master/elan-init.sh"
set VSCODE_LEAN4_EXT_URL="https://github.com/leanprover/vscode-lean4/releases/download/v0.0.108/lean4-0.0.108.vsix"
:: set VSCODE_URL="https://az764295.vo.msecnd.net/stable/6445d93c81ebe42c4cbd7a60712e0b17d9463e97/VSCode-win32-x64-1.81.0.zip"


::::::::::::::::::: Download the Components
curl -L -C - --output "z7z.exe" %Z7Z_URL%
curl -L -C - --output "git-install.exe" %GIT_URL%
curl -L -C - --output "vc_redist.x64.exe" %VC_REDIST_URL%
curl -L -C - --output "lean-toolchain" %MATHLIB_LEAN_TOOLCHAIN_URL%
curl -L -C - --output "elan-init.sh" %ELAN_INSTALLER_URL%
curl -L -C - --output "vscodium.zip" %VSCODIUM_URL%
curl -L -C - --output "lean4ext.zip" %VSCODE_LEAN4_EXT_URL%

::::::::::::::::::: Extracting Components

:: Extract Git Portable using 7zip
z7z.exe x "git-install.exe" -o".\PortableGit"

:: Extract VSCodium and  Install vscode-lean4 extension
if not exist "VSCodium" (
    mkdir VSCodium
    tar -x -f vscodium.zip -C ".\VSCodium"
)

if not exist "VSCodium\leanext" (
    mkdir VSCodium\leanext
    tar -x -f lean4ext.zip -C ".\VSCodium\leanext"
    xcopy /E /I ".\VSCodium\leanext\extension" ".\VSCodium\data\extensions\leanprover"
    rmdir /S /Q ".\VSCodium\leanext"
)

:: Use only if leantar in mathlib is 0.1.4 or less
:: vc_redist.x64.exe /passive /quiet /norestart

:: Lets get to Work!!
:: Control Elan's location by ELAN_HOME and Cache Location by XDG_CACHE_HOME
::::::::::::::::::: Prepare Environment Variables and Clean Path
set Path=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;%USERPROFILE%\AppData\Local\Microsoft\WindowsApps;%CD%;%CD%\PortableGit\bin\;%CD%\Elan\bin\
set ELAN_HOME=%CD%\Elan
set XDG_CACHE_HOME=%CD%\Cache
set ELECTRON_EXTRA_LAUNCH_ARGS=--disable-gpu-sandbox
set DEMOPROJ=DemoProj
set /p LEAN_TOOLCHAIN_VERSION=<lean-toolchain

::::::::::::::::::: Installation of ELAN in Current Folder with Mathlibs Toolchain version
PortableGit\bin\bash.exe -c "./elan-init.sh -y --no-modify-path --default-toolchain %LEAN_TOOLCHAIN_VERSION%"

::::::::::::::::::: Create demo Project
lake new %DEMOPROJ% math
PortableGit\bin\bash.exe -c "cd %DEMOPROJ% && lake update && lake exe cache get"

::::::::::::::::::: Star New Project in VSCodium
VSCodium\VSCodium.exe %DEMOPROJ%

MohanadAhmed (Aug 07 2023 at 23:58):

Basically the script:

  1. Downloads:

    - 7z.exe: used to expand the Portable Git archive.
    - Portable Git
    - VSCodium
    - The current mathlib4 master/lean-toolchain file.
    - Elan installer script elan-init.sh
    - VSCode Lean4 Extension
    - VC Redistributable : to be used only if LeanTar version is less than 0.1.5 (manually uncomment to run) see PR #6430

All the downloads use -L -C options so that they can resume in case of interruption.

  1. It then extracts Portable Git, VSCodium and extracts the lean extension in to VSCodium.

  2. It then sets two variables ELAN_HOME and XDG_CACHE_HOME. These control the locations of Elan and Lean Toolchains (ELAN_HOME) and the location of the olean cache files (XDG_CACHE_HOME). We also add Portable git and Elan Home to the path so that we can later use lean and lake. We then run the elan installer and pass in the version of the toolchain we want (the one used by Mathlib4's master branch)

  3. We create a DemoProj using the typical lake sequence (lake new DemoProj math, lake update, lake exe cache get)

  4. Finally we start the VSCodium and pass in the newly created project.

MohanadAhmed (Aug 08 2023 at 00:00):

The directory structure at the end of a successful script invocation should look like this:
image.png

MohanadAhmed (Aug 08 2023 at 10:59):

You can also use this script to start VSCodium at later times:

:: RunLean.bat Script
@ECHO OFF
:: Lets get to Work!!
:: Control Elan's location by ELAN_HOME and Cache Location by XDG_CACHE_HOME
::::::::::::::::::: Prepare Environment Variables and Clean Path
set Path=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;%USERPROFILE%\AppData\Local\Microsoft\WindowsApps;%CD%;%CD%\PortableGit\bin\;%CD%\Elan\bin\
set ELAN_HOME=%CD%\Elan
set XDG_CACHE_HOME=%CD%\Cache
set ELECTRON_EXTRA_LAUNCH_ARGS=--disable-gpu-sandbox
set DEMOPROJ=DemoProj
set /p LEAN_TOOLCHAIN_VERSION=<lean-toolchain

::::::::::::::::::: Star New Project in VSCodium
VSCodium\VSCodium.exe %DEMOPROJ%

Scott Morrison (Aug 08 2023 at 11:02):

If you just zip everything up at the end, how big a file is it?

MohanadAhmed (Aug 08 2023 at 11:03):

Trying it now

Scott Morrison (Aug 08 2023 at 11:03):

If this script can be tested by others, it would be good to upload it to the community website, and include it as an alternative installation method.

MohanadAhmed (Aug 08 2023 at 11:03):

Zipping has been runnig for about 5 mins :sweat_smile:

Scott Morrison (Aug 08 2023 at 11:04):

If the zip file is manageable, we should arrange for a regular CI job to build the zip file and host it somewhere.

MohanadAhmed (Aug 08 2023 at 11:04):

For comparsion it took 3 mins form start of script to VScodium showing up with lean running

MohanadAhmed (Aug 08 2023 at 11:07):

Scott Morrison said:

If the zip file is manageable, we should arrange for a regular CI job to build the zip file and host it somewhere.

Yeah I think that is a great idea. !
How big is manageable?
Also the script will need a bit of polishing to make the hardcoded version URLs be less hardcoded

MohanadAhmed (Aug 08 2023 at 11:13):

(deleted)

MohanadAhmed (Aug 08 2023 at 11:13):

Scott Morrison said:

If you just zip everything up at the end, how big a file is it?

1.82GB

MohanadAhmed (Aug 08 2023 at 11:16):

This the built-in Windows Zip program

MohanadAhmed (Aug 08 2023 at 11:16):

I will try now the 7z compressor

Scott Morrison (Aug 08 2023 at 11:23):

That's not hopeless. :-) Some people at least would find it useful.

Scott Morrison (Aug 08 2023 at 11:25):

To make a smaller zip file, make sure that you don't have any .olean files, but only the .ltar files in the cache. These are much smaller. The script that launches VSCode will need to run lake exe cache get itself, but it will only be unpacking the .ltar files, which is quite fast.

MohanadAhmed (Aug 08 2023 at 11:27):

I packed everything including the .olean files. I can try that next.

7z Compressor produced 1.3 GB file (with oleans included)

MohanadAhmed (Aug 08 2023 at 11:28):

How do you tell lake exe cache get to only download but not unpack?

Scott Morrison (Aug 08 2023 at 11:29):

lake exe cache get-

MohanadAhmed (Aug 08 2023 at 11:34):

I think we are of to a good start the folder size is down from 5.6 GB to 2.2 GB before compression. Waiting for compression to finish now

MohanadAhmed (Aug 08 2023 at 11:36):

OK down to 798MB for the final 7z archive

Scott Morrison (Aug 08 2023 at 11:39):

Great. If you send it to me I can have a look over it for obvious redundancies.

MohanadAhmed (Aug 08 2023 at 11:41):

Oh I made a small mistake I packed the sources with it. I will try without the sources and see if it gets down further.

Scott Morrison said:

Great. If you send it to me I can have a look over it for obvious redundancies.

Once I get the last archive I will send it. Does a google drive link work for you?

MohanadAhmed (Aug 08 2023 at 11:41):

Down to 604MB

Scott Morrison (Aug 08 2023 at 11:43):

I think the right way to deploy this is:

  • a script that lives in the mathlib4 repository, say as portable_windows.bat or something, that anyone can download and run themselves
  • a CI job on the mathlib4 repository that runs e.g. every 24 hours, and puts the zip file ... somewhere? I think we can provide azure hosting still? Others know that aspect of things better.

Scott Morrison (Aug 08 2023 at 11:43):

I don't think it makes any sense to keep old copies around: you'll only be able to get the current nightly. If someone wants to use a consistent version for teaching then they can download and distribute a particular copy themselves.

Scott Morrison (Aug 08 2023 at 11:45):

We might consider that this zipfile should give you three folders: Mathlib, MIL, and MyProject, so people have the choice of working in any of those three places. I think this would be very minimal overhead?

Scott Morrison (Aug 08 2023 at 11:46):

Oh -- was your previous message saying you were leaving out the Mathlib sources? That's not a good idea. Jump to definition is essential.

MohanadAhmed (Aug 08 2023 at 11:47):

Not the mathlib sources. Just the Portable Git, VSCodium, 7z and elan installer ... setup executables etc

MohanadAhmed (Aug 08 2023 at 11:49):

Here is the 7z archive (604MB)
https://drive.google.com/file/d/1Ih3268qtXWZENti7IkJ7qSu2zshluMbB/view?usp=sharing

MohanadAhmed (Aug 08 2023 at 11:50):

Here is a zip archive (778 MB)
https://drive.google.com/file/d/1aLzHU3QWuOF6O2SUkdULlAoLOJ4CooJN/view?usp=sharing

MohanadAhmed (Aug 08 2023 at 11:54):

There are two scripts inside PrepareLean.bat and RunLean.bat.

The RunLean.bat starts VSCodium with the Empty DemoProj.

Scott Morrison (Aug 08 2023 at 11:59):

My inclination is that the additional savings for the 7z version are not worth the slight extra difficulty. The point of this bundle is to be the absolute lowest bar installation method on windows: let's not assume that anyone has 7z.

Scott Morrison (Aug 08 2023 at 12:05):

If you really wanted to shrink further, you could traverse the Elan/toolchains/... directories, and compress every .olean using leangz, and then have the first step of RunLean.bat undo this. This would probably result in a further big saving, but maybe this is premature optimization. :-)

Scott Morrison (Aug 08 2023 at 12:05):

It's clear that the 778mb zip file is easily good enough --- the main priority should be to get this reviewed and into the repository + a CI workflow.

MohanadAhmed (Aug 08 2023 at 12:30):

Scott Morrison said:

It's clear that the 778mb zip file is easily good enough --- the main priority should be to get this reviewed and into the repository + a CI workflow.

I think a good first step is to get some confirmation that it works on other machines. Then we can smooth out the rough edges for CI.

Scott Morrison (Aug 08 2023 at 12:43):

I'm about to be off for the day, but I can try this on a virtual machine tomorrow if no one has got there first.

Sebastian Ullrich (Aug 08 2023 at 13:36):

I may have missed this: if the Git issue is solved now, what's the reason for including the cache in the archive instead of doing a regular cache get?

MohanadAhmed (Aug 08 2023 at 14:08):

I guess one point is the archive now becomes usable without an internet connection.

Max Nowak 🐉 (Aug 08 2023 at 14:50):

Instead of typing lake exe cache get, could we annotate the dependency in the lakefile with something like preferCached, so it will download the cached version automatically? That would be one less command to run.

Max Nowak 🐉 (Aug 08 2023 at 14:59):

I find hand-crafted “bundles” or scripts to actually be more confusing than having a solid vscode extension which handles all the tool chain setup and downloading of dependencies automatically. Downloading vscode, installing the extension, opening a project, which then automatically sets up the tool chain and downloads the mathlib cache doesn’t seem too much to ask. But then again I’m not a mathematician :sweat_smile: .

I like having one root manifest (lakefiles in our case) which already has all the setup info, and then solid tooling (vscode extension) to make that manifest “just work”.

Miguel Marco (Aug 08 2023 at 15:08):

MohanadAhmed said:

There are two scripts inside PrepareLean.bat and RunLean.bat.

The RunLean.bat starts VSCodium with the Empty DemoProj.

What does the PrepareLean.bat file do exactly? I get it should be run just once after decompressing, right?

Kevin Buzzard (Aug 08 2023 at 15:19):

Max Nowak 🐺 said:

I find hand-crafted “bundles” or scripts to actually be more confusing than having a solid vscode extension which handles all the tool chain setup and downloading of dependencies automatically.

If it were possible to get the VS Code Lean 4 extension to download and correctly install projects (e.g. if it offered the user a sample list of projects ranging from a basic Hello World project to the full Mathematics In Lean download together with mathlib + oleans, which were just a one click install) then that would be absolutely great. But I am used to dealing with mathematicians with slow Windows machines who have no git, no python, and have literally never seen a command line and have no idea how to navigate to a given directory within it. Right now my solution for these people is just to direct them to gitpod (and to ensure that all my projects are gitpod-ready); the only real objection I've heard to that solution is that you have to make an account on github (and then make ssh keys work, and soon you'll also have to set up 2FA, if you want to actually do anything like saving your work easily).

MohanadAhmed (Aug 08 2023 at 15:56):

Miguel Marco said:

MohanadAhmed said:

There are two scripts inside PrepareLean.bat and RunLean.bat.

The RunLean.bat starts VSCodium with the Empty DemoProj.

What does the PrepareLean.bat file do exactly? I get it should be run just once after decompressing, right?

It is the script that creates/prepares the bundle folder.
So if you want to change the version of VSCodium or git and so on.

But if you already have the bundle and just want to run Lean and get to proving then RunLean.bat is the script you want

MohanadAhmed (Aug 08 2023 at 15:59):

If you think it is confusing to users I can remove it

Miguel Marco (Aug 08 2023 at 16:05):

So, the users that download the 778MB file should just unzip it and run RunLean.bat?

MohanadAhmed (Aug 08 2023 at 16:06):

Yes

MohanadAhmed (Aug 08 2023 at 16:07):

Did it work?

Miguel Marco (Aug 08 2023 at 17:15):

It shows this message in the command prompt :

error: no such file or directory (error code: 2)
  file: .\lakefile.lean

and then this error message in the VSdode output:

[{
    "resource": "/C:/Users/User/Desktop/LeanPackNoOlean/LeanPackNoOlean/DemoProj/DemoProj.lean",
    "owner": "_generated_diagnostic_collection_name_#2",
    "severity": 8,
    "message": "`c:\\Users\\User\\Desktop\\LeanPackNoOlean\\LeanPackNoOlean\\Elan\\toolchains\\leanprover--lean4---nightly-2023-08-05\\bin\\lake.exe print-paths Init` failed:\n\nstderr:\ninfo: mathlib: URL has changed; you might need to delete .\\lake-packages\\mathlib manually\nerror: > git fetch origin    # in directory .\\lake-packages\\mathlib\nerror: stderr:\nfatal: 'origin' does not appear to be a git repository\nfatal: Could not read from remote repository.\n\nPlease make sure you have the correct access rights\nand the repository exists.\nerror: external command `git` exited with code 128\nInvalid Lake configuration.  Please restart the server after fixing the Lake configuration file.\n",
    "source": "Lean 4",
    "startLineNumber": 1,
    "startColumn": 1,
    "endLineNumber": 1,
    "endColumn": 1
}]

MohanadAhmed (Aug 08 2023 at 19:21):

Sorry that was my mistake.
Can you please replace the contents of the RunLean.bat file with the following and rerun:

@ECHO OFF
:: Lets get to Work!!
:: Control Elan's location by ELAN_HOME and Cache Location by XDG_CACHE_HOME
::::::::::::::::::: Prepare Environment Variables and Clean Path
set Path=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;%USERPROFILE%\AppData\Local\Microsoft\WindowsApps;%CD%;%CD%\PortableGit\bin\;%CD%\Elan\bin\
set ELAN_HOME=%CD%\Elan
set XDG_CACHE_HOME=%CD%\Cache
set ELECTRON_EXTRA_LAUNCH_ARGS=--disable-gpu-sandbox
set DEMOPROJ=DemoProj
set /p LEAN_TOOLCHAIN_VERSION=<lean-toolchain

::::::::::::::::::: Star New Project in VSCodium
cd %DEMOPROJ%
lake exe cache unpack
cd ..
VSCodium\VSCodium.exe %DEMOPROJ%

MohanadAhmed (Aug 08 2023 at 19:23):

I updated the 7z version with Scott Morrison's suggestion to unpack the olean files later but did not properly update RunLean.bat in the zip file

Miguel Marco (Aug 08 2023 at 21:44):

Now I get this error:

`c:\Users\User\Desktop\LeanPackNoOlean\LeanPackNoOlean\Elan\toolchains\leanprover--lean4---nightly-2023-08-05\bin\lake.exe print-paths Init` failed:

stderr:
info: mathlib: URL has changed; you might need to delete .\lake-packages\mathlib manually
error: > git fetch origin    # in directory .\lake-packages\mathlib
error: stderr:
fatal: 'origin' does not appear to be a git repository
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
error: external command `git` exited with code 128
Invalid Lake configuration.  Please restart the server after fixing the Lake configuration file.

MohanadAhmed (Aug 09 2023 at 15:21):

I was able to reproduce it on windows 10. I am not sure what exactly the issue is but the Self Extracting archives work without issues on Windows 11, Windows 10 and even Windows 8.1 (if you install VC_redist).

image.png

MohanadAhmed (Aug 09 2023 at 15:21):

Here is a link to the Self extracting archive https://drive.google.com/file/d/1Ws7-t185QXL262-S7MPGu3wm_ZzAwwha/view?usp=sharing

MohanadAhmed (Aug 09 2023 at 15:22):

If anyone has WIndows and can help by trying it that would be great.

Miguel Marco (Aug 13 2023 at 14:34):

I tried it in a fresh VM, and it seems to work ok (it takes some time to unpack, but otherwise it works).

As a note, i would remove the PrepareLean script from the file that would be shipped to the end user, to prevent confussions. I would also maybe add a small message at the beginning of the RunLean scirpt stating that Lean is opening, and that it might take some time the first time it is run.

Miguel Marco (Aug 16 2023 at 22:45):

Does someone have access to a Mac to try to create something similar?

Jireh Loreaux (Aug 22 2023 at 15:50):

Yes, it would be great to get this for Macs, but the only one I have access to is stuck on High Sierra :sweat_smile:, so it probably shouldn't be me. I guess we'll probably need both Intel and M versions? @Scott Morrison, did we start hosting the above archive for Windows anywhere yet?

Miguel Marco (Aug 25 2023 at 16:00):

Shouldn't such a bundle for MacOS be pretty much independent of the version?

Different processor architecture is another issue.

MohanadAhmed (Aug 29 2023 at 11:02):

Scott Morrison said:

I think the right way to deploy this is:

  • a script that lives in the mathlib4 repository, say as portable_windows.bat or something, that anyone can download and run themselves
  • a CI job on the mathlib4 repository that runs e.g. every 24 hours, and puts the zip file ... somewhere? I think we can provide azure hosting still? Others know that aspect of things better.

If you still think this is a good idea I think I can make a PR for it.
I took a look at the CI files on mathlib. If I understand correctly the CI always runs on ubuntu-latest right?
Is it possible to run a particular job on windows?

Or is it better to adapt the scripts above to run on bash shell on ubuntu.

Scott Morrison (Aug 29 2023 at 11:04):

Yes. Just use runs-on: windows-2022.

Scott Morrison (Aug 29 2023 at 11:04):

(e.g. Lean4 CI runs jobs on Windows)

MohanadAhmed (Aug 29 2023 at 11:05):

Oh Ok I guess you answered my question before I even asked it :joy:

MohanadAhmed (Aug 29 2023 at 11:16):

Any place I can store the final output bundle so that it is available after CI finishes?

Scott Morrison (Aug 29 2023 at 11:20):

Two choices:

  1. store it as a release on github (probably not on mathlib4 itself, but maybe a mathlib4-bundles repo?)
  2. push it to azure, e.g. following the CI code that mathlib3 used:
      - name: push release to azure
        if: always() && github.repository == 'leanprover-community/mathlib' && steps.build.outputs.started == 'true'
        run: |
          archive_name="$(git rev-parse HEAD).tar.gz"
          find src/ -name "*.olean" -o -name ".noisy_files" | tar czf "$archive_name" -T -
          azcopy copy "$archive_name" "${{ secrets.AZURE_SAS_TOKEN }}" --block-size-mb 99 --overwrite false
          archive_name="$(git rev-parse HEAD).tar.xz"
          find src/ -name "*.olean" -o -name ".noisy_files" | tar cJf "$archive_name" -T -
          azcopy copy "$archive_name" "${{ secrets.AZURE_SAS_TOKEN }}" --block-size-mb 99 --overwrite false

Scott Morrison (Aug 29 2023 at 11:22):

Unfortunately the AZURE_SAS_TOKEN has not been migrated from mathlib3 to mathlib4, so we'd have to look into that.

Scott Morrison (Aug 29 2023 at 11:23):

I don't think this bundle is something that we need to keep old versions of.

Scott Morrison (Aug 29 2023 at 11:23):

So possibly just a single "release" on the mathlib4 repo is okay, that we keep overwriting.

Sebastien Gouezel (Aug 29 2023 at 11:47):

If you want all your students to use the same version of mathlib when giving some lectures, it would be good to have archived versions of the bundle.

Scott Morrison (Aug 29 2023 at 11:52):

Excellent point. It is going to be a big bundle, so we can't do this nightly. Let's initially set this up for weekly, and we can tweak as necessary. (Ideally synchronizing with Mathlib moving to each new stable release of Lean!)

Sebastien Gouezel (Aug 29 2023 at 11:53):

Yes, one version for each stable release of Lean would be perfect.

Scott Morrison (Aug 29 2023 at 11:54):

I think my preference is to avoid using Azure, just to minimize the necesary infrastructure, and instead just push a release to the mathlib4 repo itself.

Scott Morrison (Aug 29 2023 at 11:55):

@MohanadAhmed, do you think you could make a PR that adds a workflow that runs on a cron job weekly, and creates a tag (perhaps name the tag whatever is in the lean-toolchain?) and pushes a release?

Scott Morrison (Aug 29 2023 at 11:55):

I have been working on workflows a lot the last few days, so I can definitely point to examples, help, take over parts, etc.

MohanadAhmed (Aug 29 2023 at 12:07):

I started on it

MohanadAhmed (Aug 30 2023 at 00:02):

PR #6859

MohanadAhmed (Aug 30 2023 at 00:07):

So I worked on in an external repo and the script and the yml CI file in this repo (https://github.com/MohanadAhmed/TryLean4Bundle). The created bundle is on the releases page (https://github.com/MohanadAhmed/TryLean4Bundle/releases)

I copied these two into PR #6859. @Scott Morrison I think I will need your help in figuring out how to get the added yaml file to show up in mathlib4 Workflows. Does the yml file have to be in master or can we run it from a branch?

Scott Morrison (Aug 30 2023 at 00:16):

The cron job won't run from a branch.

Scott Morrison (Aug 30 2023 at 00:16):

For testing purposes we may want to run this on push, until we're ready to merge.

Scott Morrison (Aug 30 2023 at 00:18):

Further, I think the CI should not only assemble the bundle, but also run it and do some amount of verification -- we can't check everything but we're headless so can't actually launch VSCode, but verifying that git and lean (and possibly a command line interface to VSCode) work would be great.

Scott Morrison (Aug 30 2023 at 00:19):

We really don't want to be in the situation of only knowing that bundles are broken because users are reporting problems.

MohanadAhmed (Aug 30 2023 at 00:19):

Would the push work from a branch?
The yml file currently has two triggers: schedule for timed and workflow_dispatch for manual testing.

Scott Morrison (Aug 30 2023 at 00:19):

If the branch is on the main repository, thenpush should trigger.

Scott Morrison (Aug 30 2023 at 00:21):

I don't know the rules for workflow_dispatch, unfortunately. But that would be ideal!

MohanadAhmed (Aug 30 2023 at 00:22):

Scott Morrison said:

I don't know the rules for workflow_dispatch, unfortunately. But that would be ideal!

As far as I understand, it just means someone has to click "Run Workflow".

MohanadAhmed (Aug 30 2023 at 00:24):

Scott Morrison said:

Further, I think the CI should not only assemble the bundle, but also run it and do some amount of verification -- we can't check everything but we're headless so can't actually launch VSCode, but verifying that git and lean (and possibly a command line interface to VSCode) work would be great.

Would unpacking the bundle and then issuing lake --version lean --version be enough or you are thinking of something more involved?

Scott Morrison (Aug 30 2023 at 01:59):

That's a pretty good start. Ideally they would be one test per component we install. So maybe git --version and code --version (if that even exists)?

Scott Morrison (Aug 30 2023 at 01:59):

Can we ask code for the list of installed extensions from the command line?

Patrick Massot (Aug 30 2023 at 02:00):

code --list-extensions

MohanadAhmed (Aug 30 2023 at 12:27):

Patrick Massot said:

code --list-extensions

Thanks @Patrick Massot

I am currently using VSCodium (like in your trylean3 bundle). But I assume the command is probably very similar and I will find it.

By the way I am currently using a blank demo project. But it occurs to me that for a typical user interested in doing maths proofs, having the book "Mathematics in Lean" in the bundle might be a better starter? What do you think?

Patrick Massot (Aug 30 2023 at 13:05):

In the Lean 3 days we shipped the tutorials project with those bundles. Indeed we could use MIL now. One could argue this won't be interesting to people who want to use Lean for non-mathematics but:

  1. those people probably want to do computer science or programming so they are probably not afraid of the full install procedure
  2. MIL alone isn't large, the bundle size comes from Mathlib
  3. We don't have anything else to include.

Scott Morrison (Aug 31 2023 at 01:28):

Very much in favour of including MIL in the bundle.

Miguel Marco (Aug 31 2023 at 15:01):

Can we add also bundles for linux and Mac?

Scott Morrison (Aug 31 2023 at 22:17):

There's slightly less need, and it is slightly weirder to do so, for linux and mac, on account of there being standard package managers for both. Nevertheless, yes, I hope someone will do them!

Miguel Marco (Oct 13 2023 at 13:48):

Scott Morrison said:

There's slightly less need, and it is slightly weirder to do so, for linux and mac, on account of there being standard package managers for both. Nevertheless, yes, I hope someone will do them!

What exactly do you have in mind about the package managers? Do you mean that you can install git with the package manager and then use the usual vscode/extension install and lake new lake exe cache get ..?

Jon Eugster (Oct 13 2023 at 14:55):

I think what Scott hints at is that the installation instruction on Linux is a one-liner shell script (thanks to the existing package manager) while the 2-3 students in a class that still struggle with the installation after an hour of trying are always on Windows.

Kevin Buzzard (Oct 13 2023 at 14:59):

One advantage of Linux is that it tends to come with a command line ;-) Another is that a Linux user tends to know how to fire it up. All the time I am dealing with Windows users who have no idea what a command line is. If anything this problem is getting worse not better.

Miguel Marco (Oct 13 2023 at 20:29):

Yes, i am not worried about linux users (at the very least, they don't get intimidated by the idea of using the command line). Mac users is another story though.


Last updated: Dec 20 2023 at 11:08 UTC