Zulip Chat Archive
Stream: general
Topic: olean cache
Yury G. Kudryashov (May 10 2020 at 17:17):
I have two questions about olean cache:
- Is there any reason to use
tar.gz
and not, e.g.,tar.xz
? - Is there a command to remove old files from
~/.mathlib
? I know aboutrm ~/.mathlib/*.tar*
but I'd like to preserve, e.g., caches corresponding to live branch heads.
Yury G. Kudryashov (May 15 2020 at 10:13):
Ping here. Can we switch to tar.xz
?
Patrick Massot (May 15 2020 at 10:14):
I think this is a question for @Rob Lewis
Yury G. Kudryashov (May 15 2020 at 10:14):
This would save bandwidth and upload/download times.
Patrick Massot (May 15 2020 at 10:14):
Well, I would need to adjust leanproject too, but not much (maybe not at all)
Rob Lewis (May 15 2020 at 10:16):
I think 2 is for @Patrick Massot , this seems like a leanproject command. For 1, I don't know if there was a reason. We could change it but it will break urls for leanproject, so there would need to be a change there. And we'd need to keep checking gz for backward compatibility.
Patrick Massot (May 15 2020 at 10:20):
I checked, the unpacking function is generic enough but of course the url contains .tar.gz
Patrick Massot (May 15 2020 at 10:20):
The lazy transition solution would be to first try tar.xz and fall back to tar.gz
Patrick Massot (May 15 2020 at 10:21):
I don't see any issue it could cause
Rob Lewis (May 15 2020 at 10:21):
@Gabriel Ebner was there a reason to use gz? I don't remember which of us set that up. If I did it, gz was probably just inertia.
Rob Lewis (May 15 2020 at 10:22):
The only transition issue is people with outdated leanproject will start getting errors.
Patrick Massot (May 15 2020 at 10:22):
About deleting old archives, I can implement it if you tell me precisely what you want. Or I can merge a PR, this is probably just as fast to do it rather than telling me
Patrick Massot (May 15 2020 at 10:22):
The only transition issue is people with outdated leanproject will start getting errors.
True, this is already pretty bad
Patrick Massot (May 15 2020 at 10:23):
Are .xz so much smaller than .gz?
Yury G. Kudryashov (May 15 2020 at 10:24):
Repacking
Patrick Massot (May 15 2020 at 10:25):
Also need to add download time and unpacking time
Yury G. Kudryashov (May 15 2020 at 10:26):
I wonder why some of today's files are tar.bz2
Yury G. Kudryashov (May 15 2020 at 10:26):
local cache?
Patrick Massot (May 15 2020 at 10:27):
yes
Rob Lewis (May 15 2020 at 10:27):
It saves ~7mb, not insignificant.
Yury G. Kudryashov (May 15 2020 at 10:31):
xz
: 19M, bz2
: 23M, gz
: 26M
Patrick Massot (May 15 2020 at 10:32):
What about unpacking times?
Yury G. Kudryashov (May 15 2020 at 10:32):
xz
: 2.2s
Yury G. Kudryashov (May 15 2020 at 10:33):
gz
: 0.8s
Yury G. Kudryashov (May 15 2020 at 10:33):
bz2
: 3.8s
Patrick Massot (May 15 2020 at 10:34):
Ok, this is a huge difference. Probably negligible compared to download time, but maybe not compared to local copy
Yury G. Kudryashov (May 15 2020 at 10:35):
So probably local cache should be stored in tar.gz
Yury G. Kudryashov (May 15 2020 at 10:37):
For me download time is 40-80s depending on I-don't-know-what
Yury G. Kudryashov (May 15 2020 at 10:37):
(yes, I probably should change the ISP)
Rob Lewis (May 15 2020 at 10:40):
That's your normal download speed for a 27mb file? We pay "a lot" for the fancy Azure CDN for fast download speeds, so I just want to make sure it's not a limitation there.
Yury G. Kudryashov (May 15 2020 at 10:40):
Unfortunately, "forced to use home Internet all the time" and "hard to switch the ISP" came together.
Yury G. Kudryashov (May 15 2020 at 10:41):
The bottleneck is definitely on my side.
Patrick Massot (May 15 2020 at 10:41):
Even Zulip must be slow at this rate
Yury G. Kudryashov (May 15 2020 at 10:42):
Even Amazon video / Netflix work at this rate (never tried HD).
Rob Lewis (May 15 2020 at 10:45):
The optimal compression method is gonna depend on both bandwidth and processing time. My guess is I'd see a small gain from xz but nothing that significant. Slow laptop on a university connection will prefer gz.
Rob Lewis (May 15 2020 at 10:45):
But there are clear gains from xz on the Azure side, it basically cuts 25% off the bill.
Sebastien Gouezel (May 15 2020 at 10:45):
Yury G. Kudryashov said:
For me download time is 40-80s depending on I-don't-know-what
For me, it's more like 90 seconds, and this is the normal speed due to my internet connection...
Rob Lewis (May 15 2020 at 10:46):
Not that saving $3/mo matters so much.
Sebastien Gouezel (May 15 2020 at 10:46):
And decompressing time will be completely negligible, so I vote strongly for .xz
.
Rob Lewis (May 15 2020 at 10:47):
I had no idea it was still possible to get 300kbps lines.
Sebastien Gouezel (May 15 2020 at 10:47):
All the houses in the neighborhood have optical fiber, except mine and my direct neighbor's because of bad luck.
Johan Commelin (May 15 2020 at 10:47):
I lived in Papua New Guinea for half a year... glad I didn't know about leanproject
back then (-;
We had a satellite connection to Asia, and could receive email / browse a 2 kbps, or something like that.
Patrick Massot (May 15 2020 at 10:49):
Now I picture Yury living on a small island in the middle of a huge Canadian lake, surrounded by 100km of forest with no other human being (but lots of animals).
Patrick Massot (May 15 2020 at 10:50):
although Toronto didn't look exactly like that last time I checked
Rob Lewis (May 15 2020 at 10:50):
And Sebastien the same, except it's a French lake and the animals all have optical fiber.
Patrick Massot (May 15 2020 at 10:50):
We don't have the right kind of lakes in France
Yury G. Kudryashov (May 15 2020 at 10:51):
I have a 6Mbps Internet via a phone line.
Sebastien Gouezel (May 15 2020 at 10:51):
Same here: when I am really in trouble, I use the phone :)
Patrick Massot (May 15 2020 at 10:53):
Something like https://www.privateislandsonline.com/uploads/resize/_1548_image_cbb0871e3b.jpg-2148-1600.jpg
Johan Commelin (May 15 2020 at 10:55):
Ooh well... it's only 2 months ago that I had 0 kbps for 3 days because they cut an internet cable at a construction site in town.
Reid Barton (May 15 2020 at 10:55):
I've been on a small island in the middle of a lake in Canada surrounded by forests that looks remarkably like that one. It took me a while to convince myself that that was actually not a photo of the island I've been to.
Gabriel Ebner (May 15 2020 at 11:06):
Rob Lewis said:
Gabriel Ebner was there a reason to use gz? I don't remember which of us set that up. If I did it, gz was probably just inertia.
Yes, it was inertia. I don't see any reason why we couldn't switch to xz
. We could also deploy both xz
and gz
for a while while leanproject support is being rolled out.
Gabriel Ebner (May 15 2020 at 11:07):
Another idea: we could drop the .lean
files and just distribute the .olean
files.
Yury G. Kudryashov (May 15 2020 at 11:12):
$ ls -l *.tar.xz
-rw-r--r-- 1 urkud users 17979212 May 15 07:04 without-leans.tar.xz
-rw-r--r-- 1 urkud users 19744024 May 15 07:10 with-leans.tar.xz
Sebastian Ullrich (May 15 2020 at 11:47):
You might also be interested in zstd
.
Arch Linux added support for zstd as a package compression method in October 2019 with the release of the pacman 5.2 package manager,[25] and in January 2020 switched from xz to zstd for the packages in the official repository. Arch uses zstd -c -T0 --ultra -20 -, the size of all compressed packages combined increased by 0.8% (compared to xz), the decompression speed is 1300% faster, decompression memory increased by 50 MiB when using multiple threads, compression memory increases but scales with the number of threads used.
Johan Commelin (May 15 2020 at 12:10):
Sounds good!
Last updated: Dec 20 2023 at 11:08 UTC