Zulip Chat Archive

Stream: general

Topic: 4000 file(s)


Kevin Buzzard (Dec 08 2023 at 19:34):

info: [9/9] Linking cache
installing leantar 0.1.9
Attempting to download 4000 file(s)
Downloaded: 4000 file(s) [attempted 4000/4000 = 100%] (100% success)
Decompressing 4000 file(s)
unpacked in 9027 ms

Yongyi Chen (Dec 09 2023 at 01:23):

9 seconds to unpack, damn my hard drive (Samsung MZVLQ512HALU SSD) is slow. Or my CPU (Intel i7-10875H) is slow. Or WSL on Windows 11 is slow... not sure which.

mathlib: running post-update hooks
Attempting to download 4000 file(s)
Downloaded: 4000 file(s) [attempted 4000/4000 = 100%] (100% success)
Decompressing 4000 file(s)
unpacked in 60133 ms

Anyone want to chip in data points?

Scott Morrison (Dec 09 2023 at 01:48):

Decompressing 4000 file(s)
unpacked in 3264 ms

:-)

Adam Topaz (Dec 09 2023 at 01:52):

not quite as fast for me, but still reasonable :-)

Decompressing 4000 file(s)
unpacked in 5586 ms

Adam Topaz (Dec 09 2023 at 01:53):

I do have a fairly new NVME dive.

Adam Topaz (Dec 09 2023 at 01:56):

I can't imagine the CPU being the bottle neck (for Yongyi)

Yongyi Chen (Dec 09 2023 at 02:12):

Alright it's time to run CrystalDiskMark and potentially order a new SSD!

Adam Topaz (Dec 09 2023 at 02:19):

I have zero experience with windows/WSL with lean, but there are quite a few windows users around that could give some info from their end.

Yongyi Chen (Dec 09 2023 at 03:34):

Do you guys still get sub 10 seconds with a lake clean before doing lake exe cache get?

Kyle Miller (Dec 09 2023 at 04:32):

Here's mine:

% lake clean; lake exe cache get
[... omitting info lines ...]
No files to download
Decompressing 4004 file(s)
unpacked in 3259 ms

Kyle Miller (Dec 09 2023 at 04:33):

That's on a Mac Studio. I don't know if the hard drive is fast, if Mac OS is good about caching writes, or both.

Yongyi Chen (Dec 09 2023 at 04:40):

OK, nice. CrystalDiskMark returned 200-300 MB/s sequential write speed which is about 10x slower than what my SSD should be getting. That must be the culprit.

Mario Carneiro (Dec 09 2023 at 05:29):

I believe windows is to blame for slow unpacking, it likes to run hooks on filesystem activity like the antivirus

Yongyi Chen (Dec 09 2023 at 05:51):

Yeah, Windows does do that, because I've noticed LaTeX compilation on Windows is an order of magnitude slower than on Linux.

Yongyi Chen (Dec 09 2023 at 06:35):

Although I should note that the CrystalDiskMark benchmark I compared to was also a Windows one

Sebastien Gouezel (Dec 09 2023 at 09:00):

Yes, it's certainly a windows thing. I have a pretty fast SSD, and still I often get unpacking times of the order of a minute.

Sebastien Gouezel (Dec 09 2023 at 09:04):

Fresh test:

PS C:\Users\Sebastien\Desktop\mathlib4> lake clean
PS C:\Users\Sebastien\Desktop\mathlib4> lake exe cache get
[...]
Downloaded: 4004 file(s) [attempted 4004/4004 = 100%] (100% success)
Decompressing 4004 file(s)
unpacked in 29159 ms

No active antivirus except for the inbuilt microsoft one (windows defender)

Sebastien Gouezel (Dec 09 2023 at 09:05):

So the download step is considerably faster than the unpacking step...

Sebastien Gouezel (Dec 09 2023 at 09:09):

Goes down to 8s when windows defender is desactivated.

Sebastien Gouezel (Dec 09 2023 at 09:14):

It's possible to desactivate permanently windows defender on specific directories, so I've just done that for my Lean stuff.

Yongyi Chen (Dec 09 2023 at 14:53):

I can confirm Windows Defender does slow things down (I learned that as a middle or high school student) and I disabled it on my Lean folder before doing any of these tests. For reference, my SSD model is advertised at 1800 MB/s sequential write speed, and I used to get that speed 2 years ago, but achieves about 1/10 of that today.

Scott Morrison (Dec 09 2023 at 16:40):

I wonder if we should put a

if os = windows && unpackingTime > 20s then
  print "You may want to disable Windows Defender in this directory"

in cache.

Sebastien Gouezel (Dec 09 2023 at 17:11):

Maybe clarify the message with "For faster unpacking, you may want to disable Windows Defender in this directory". Sounds like a very good idea!

Eric Rodriguez (Dec 09 2023 at 17:12):

Is there any way to detect if it is on? Also, I know it's completely benign but there's always a red flag when software asks you to disable antivirus on a specific folder

Patrick Massot (Dec 09 2023 at 17:16):

Here it is not asking you anything, this is purely informative.

Yongyi Chen (Dec 09 2023 at 17:19):

I install Linux dual-boot this morning to my machine and I got an 11 second unpacking time from a fresh install of Lean and mathlib. This is despite the fact that, against my expectations, my SSD's sequential write speed was still measured at about 200 MB/s on Linux! This proves the rest of the difference is Windows shenanigans (or perhaps my particular Windows install)

Eric Rodriguez (Dec 09 2023 at 17:23):

Patrick Massot said:

Here it is not asking you anything, this is purely informative.

Totally agree, but it'd be nice to e.g. have a link to the reasoning if we choose to go down this path.

Yongyi Chen (Dec 09 2023 at 17:23):

Here's another performance statistic that might be useful. I measured warm loading times for a file that imports Mathlib.

Windows 11 (non-WSL, same specs as above): 6-8 seconds
Windows WSL (same spcs): 4-5 seconds
Linux (also same specs as above): 2 seconds (!!)
Macbook M1 2020 : 10 seconds ( :( )

For comparison, live.lean-lang.org takes 3 seconds.

Eric Wieser (Dec 09 2023 at 17:29):

Shouldn't it be "You may want to disable Windows Defender in the {cwd}/.lake directory"?

Scott Morrison (Dec 09 2023 at 17:32):

Eric Wieser said:

Shouldn't it be "You may want to disable Windows Defender in the {cwd}/.lake directory"?

Why is that? It just seems unnecessarily confusing --- a user shouldn't even need to be aware the .lake directory exists.

Jon Eugster (Dec 09 2023 at 17:47):

Yongyi Chen said:

For comparison, live.lean-lang.org takes 3 seconds.

I believe it runs on nixOS, i.e. linux as well

Eric Wieser (Dec 09 2023 at 18:26):

Mainly just on the principle "don't disable your antivirus for more than the things you need to"

Mario Carneiro (Dec 09 2023 at 20:26):

I agree, there are good reasons to want to keep an antivirus on for the git repo containing sources copied from the internet which may not apply (or apply differently) to the build directory


Last updated: Dec 20 2023 at 11:08 UTC