Zulip Chat Archive

Stream: general

Topic: olean cache


view this post on Zulip Yury G. Kudryashov (May 10 2020 at 17:17):

I have two questions about olean cache:

  1. Is there any reason to use tar.gz and not, e.g., tar.xz?
  2. Is there a command to remove old files from ~/.mathlib? I know about rm ~/.mathlib/*.tar* but I'd like to preserve, e.g., caches corresponding to live branch heads.

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:13):

Ping here. Can we switch to tar.xz?

view this post on Zulip Patrick Massot (May 15 2020 at 10:14):

I think this is a question for @Rob Lewis

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:14):

This would save bandwidth and upload/download times.

view this post on Zulip Patrick Massot (May 15 2020 at 10:14):

Well, I would need to adjust leanproject too, but not much (maybe not at all)

view this post on Zulip Rob Lewis (May 15 2020 at 10:16):

I think 2 is for @Patrick Massot , this seems like a leanproject command. For 1, I don't know if there was a reason. We could change it but it will break urls for leanproject, so there would need to be a change there. And we'd need to keep checking gz for backward compatibility.

view this post on Zulip Patrick Massot (May 15 2020 at 10:20):

I checked, the unpacking function is generic enough but of course the url contains .tar.gz

view this post on Zulip Patrick Massot (May 15 2020 at 10:20):

The lazy transition solution would be to first try tar.xz and fall back to tar.gz

view this post on Zulip Patrick Massot (May 15 2020 at 10:21):

I don't see any issue it could cause

view this post on Zulip Rob Lewis (May 15 2020 at 10:21):

@Gabriel Ebner was there a reason to use gz? I don't remember which of us set that up. If I did it, gz was probably just inertia.

view this post on Zulip Rob Lewis (May 15 2020 at 10:22):

The only transition issue is people with outdated leanproject will start getting errors.

view this post on Zulip Patrick Massot (May 15 2020 at 10:22):

About deleting old archives, I can implement it if you tell me precisely what you want. Or I can merge a PR, this is probably just as fast to do it rather than telling me

view this post on Zulip Patrick Massot (May 15 2020 at 10:22):

The only transition issue is people with outdated leanproject will start getting errors.

True, this is already pretty bad

view this post on Zulip Patrick Massot (May 15 2020 at 10:23):

Are .xz so much smaller than .gz?

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:24):

Repacking

view this post on Zulip Patrick Massot (May 15 2020 at 10:25):

Also need to add download time and unpacking time

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:26):

I wonder why some of today's files are tar.bz2

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:26):

local cache?

view this post on Zulip Patrick Massot (May 15 2020 at 10:27):

yes

view this post on Zulip Rob Lewis (May 15 2020 at 10:27):

It saves ~7mb, not insignificant.

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:31):

xz: 19M, bz2: 23M, gz: 26M

view this post on Zulip Patrick Massot (May 15 2020 at 10:32):

What about unpacking times?

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:32):

xz: 2.2s

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:33):

gz: 0.8s

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:33):

bz2: 3.8s

view this post on Zulip Patrick Massot (May 15 2020 at 10:34):

Ok, this is a huge difference. Probably negligible compared to download time, but maybe not compared to local copy

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:35):

So probably local cache should be stored in tar.gz

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:37):

For me download time is 40-80s depending on I-don't-know-what

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:37):

(yes, I probably should change the ISP)

view this post on Zulip Rob Lewis (May 15 2020 at 10:40):

That's your normal download speed for a 27mb file? We pay "a lot" for the fancy Azure CDN for fast download speeds, so I just want to make sure it's not a limitation there.

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:40):

Unfortunately, "forced to use home Internet all the time" and "hard to switch the ISP" came together.

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:41):

The bottleneck is definitely on my side.

view this post on Zulip Patrick Massot (May 15 2020 at 10:41):

Even Zulip must be slow at this rate

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:42):

Even Amazon video / Netflix work at this rate (never tried HD).

view this post on Zulip Rob Lewis (May 15 2020 at 10:45):

The optimal compression method is gonna depend on both bandwidth and processing time. My guess is I'd see a small gain from xz but nothing that significant. Slow laptop on a university connection will prefer gz.

view this post on Zulip Rob Lewis (May 15 2020 at 10:45):

But there are clear gains from xz on the Azure side, it basically cuts 25% off the bill.

view this post on Zulip Sebastien Gouezel (May 15 2020 at 10:45):

Yury G. Kudryashov said:

For me download time is 40-80s depending on I-don't-know-what

For me, it's more like 90 seconds, and this is the normal speed due to my internet connection...

view this post on Zulip Rob Lewis (May 15 2020 at 10:46):

Not that saving $3/mo matters so much.

view this post on Zulip Sebastien Gouezel (May 15 2020 at 10:46):

And decompressing time will be completely negligible, so I vote strongly for .xz.

view this post on Zulip Rob Lewis (May 15 2020 at 10:47):

I had no idea it was still possible to get 300kbps lines.

view this post on Zulip Sebastien Gouezel (May 15 2020 at 10:47):

All the houses in the neighborhood have optical fiber, except mine and my direct neighbor's because of bad luck.

view this post on Zulip Johan Commelin (May 15 2020 at 10:47):

I lived in Papua New Guinea for half a year... glad I didn't know about leanproject back then (-;
We had a satellite connection to Asia, and could receive email / browse a 2 kbps, or something like that.

view this post on Zulip Patrick Massot (May 15 2020 at 10:49):

Now I picture Yury living on a small island in the middle of a huge Canadian lake, surrounded by 100km of forest with no other human being (but lots of animals).

view this post on Zulip Patrick Massot (May 15 2020 at 10:50):

although Toronto didn't look exactly like that last time I checked

view this post on Zulip Rob Lewis (May 15 2020 at 10:50):

And Sebastien the same, except it's a French lake and the animals all have optical fiber.

view this post on Zulip Patrick Massot (May 15 2020 at 10:50):

We don't have the right kind of lakes in France

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 10:51):

I have a 6Mbps Internet via a phone line.

view this post on Zulip Sebastien Gouezel (May 15 2020 at 10:51):

Same here: when I am really in trouble, I use the phone :)

view this post on Zulip Patrick Massot (May 15 2020 at 10:53):

Something like https://www.privateislandsonline.com/uploads/resize/_1548_image_cbb0871e3b.jpg-2148-1600.jpg

view this post on Zulip Johan Commelin (May 15 2020 at 10:55):

Ooh well... it's only 2 months ago that I had 0 kbps for 3 days because they cut an internet cable at a construction site in town.

view this post on Zulip Reid Barton (May 15 2020 at 10:55):

I've been on a small island in the middle of a lake in Canada surrounded by forests that looks remarkably like that one. It took me a while to convince myself that that was actually not a photo of the island I've been to.

view this post on Zulip Gabriel Ebner (May 15 2020 at 11:06):

Rob Lewis said:

Gabriel Ebner was there a reason to use gz? I don't remember which of us set that up. If I did it, gz was probably just inertia.

Yes, it was inertia. I don't see any reason why we couldn't switch to xz. We could also deploy both xz and gz for a while while leanproject support is being rolled out.

view this post on Zulip Gabriel Ebner (May 15 2020 at 11:07):

Another idea: we could drop the .lean files and just distribute the .olean files.

view this post on Zulip Yury G. Kudryashov (May 15 2020 at 11:12):

$ ls -l *.tar.xz
-rw-r--r-- 1 urkud users 17979212 May 15 07:04 without-leans.tar.xz
-rw-r--r-- 1 urkud users 19744024 May 15 07:10 with-leans.tar.xz

view this post on Zulip Sebastian Ullrich (May 15 2020 at 11:47):

You might also be interested in zstd.

Arch Linux added support for zstd as a package compression method in October 2019 with the release of the pacman 5.2 package manager,[25] and in January 2020 switched from xz to zstd for the packages in the official repository. Arch uses zstd -c -T0 --ultra -20 -, the size of all compressed packages combined increased by 0.8% (compared to xz), the decompression speed is 1300% faster, decompression memory increased by 50 MiB when using multiple threads, compression memory increases but scales with the number of threads used.

view this post on Zulip Johan Commelin (May 15 2020 at 12:10):

Sounds good!


Last updated: May 11 2021 at 22:14 UTC