Zulip Chat Archive

Stream: general

Topic: Docs memory usage


Winston Yin (尹維晨) (Dec 08 2023 at 09:30):

Is it just me (Chrome 119.0.6045.199 on MacBook Pro 13" 2018 running macOS 13.6.1) or the Mathlib docs pages are using half a GB of memory each!?
Screenshot-2023-12-08-at-01.25.59.png

Henrik Böving (Dec 08 2023 at 10:07):

On my machine in chromium it stabilizes around ~85 MB. Which I have no idea if that's big for a modern webpage? But the majority of that size is used by the DeclarationDataCenter so it is most likely possible to improve on.

Mauricio Collares (Dec 08 2023 at 10:09):

Wasn't there an improvement a long time ago which reduced local storage requirements? Could this lead to high memory usage if someone hasn't cleared the local storage cache since then?

Henrik Böving (Dec 08 2023 at 10:10):

I have no idea but I guess its worth a try.

Winston Yin (尹維晨) (Dec 08 2023 at 10:30):

Cleared cache and still consistently >350 MB on several doc pages. To compare, a GiHub PR page uses <100 MB, and a Google Doc uses <200 MB.

Mauricio Collares (Dec 08 2023 at 10:57):

Can you reproduce this in a fresh profile? Do you have any extensions that could be causing the problem?

Mauricio Collares (Dec 08 2023 at 10:58):

(I can't reproduce it, it stays under 150MB)

Utensil Song (Dec 08 2023 at 11:00):

Mine is around 452MB, on MacBook Pro 2021 (M1)

Winston Yin (尹維晨) (Dec 08 2023 at 11:01):

In Chrome Incognito, all extensions disabled, went up to 750 MB lol

Mauricio Collares (Dec 08 2023 at 11:03):

Is this as soon as you load a page, or do you have to click around a lot?

Winston Yin (尹維晨) (Dec 08 2023 at 11:03):

Basically as soon as it finishes loading

Utensil Song (Dec 08 2023 at 11:03):

Winston Yin (尹維晨) said:

In Chrome Incognito, all extensions disabled, went up to 750 MB lol

191MB for a fresh Chrome Incognito tab

Mauricio Collares (Dec 08 2023 at 11:04):

Winston Yin (尹維晨) said:

Basically as soon as it finishes loading

Can you use https://developer.chrome.com/docs/devtools/memory-problems/heap-snapshots to see if something stands out?

Utensil Song (Dec 08 2023 at 11:05):

Goes up to 361MB after 2 searches

Henrik Böving (Dec 08 2023 at 11:06):

image.png

If I visit https://leanprover-community.github.io/mathlib4_docs/Mathlib/Geometry/Manifold/MFDeriv.html#HasMFDerivAt in a Chromium (that I never really use) my heap profiler sits at 90 MB

Henrik Böving (Dec 08 2023 at 11:08):

But it does indeed go up after visiting another page via searches, 160 after two. That's most likely a mem leak in the JS. I conjecture in the DeclarationDataCenter

Winston Yin (尹維晨) (Dec 08 2023 at 11:09):

Screenshot-2023-12-08-at-03.08.32.png

Henrik Böving (Dec 08 2023 at 11:10):

The large amount of strings are from the DeclarationDataCenter because it has to store all of the meta information for the in browser search.

Utensil Song (Dec 08 2023 at 11:10):

Everytime I search, the profile has a new number under 100MB, then they accumulated, now I'm 500MB after a few searches

Henrik Böving (Dec 08 2023 at 11:22):

I'll try to investigate this after my lectures, in the afternoon.

Henrik Böving (Dec 08 2023 at 14:58):

Okay interesting (accidental) observation: Just leaving my window alone for 15~ minutes drops the consumed memory drastically back down to 50 MB. Can anyone reproduce this?

Utensil Song (Dec 08 2023 at 15:07):

It only dropped to ~ 270MB (a few hours ago it was 500MB+, and I left it open...)

Shreyas Srinivas (Dec 08 2023 at 15:07):

Yeah that happens to me.

Shreyas Srinivas (Dec 08 2023 at 15:08):

Could this be an issue with the search feature?

Utensil Song (Dec 08 2023 at 15:10):

I used the following to record, and search for 3 times:

image.png

Here is what can be seen from the snapshot:

image.png

My guess is that they are trapped in the closure of a Promise.

Henrik Böving (Dec 08 2023 at 15:17):

I mean these strings are not supposed to be free'd really. The issue is that they appear multiple times...

Henrik Böving (Dec 08 2023 at 15:53):

No I can't figure out where the memory gets duplicated so much. From my understanding of whats going on in the debugger the singleton in the declaration data center works and the large object that it manages does not end up getting duplicated. If someone else has an idea feel free to take a look.

Henrik Böving (Dec 09 2023 at 22:29):

Okay I have some more news. So the majority of data that fills the RAM is strings from the declaration data centers as conjectured above. Now lets say we are on page A and go to B. Interestingly the browser does not instantly free the declaration data center from page A once arriving on B. However it will end up doing this after the mentioned 10 minutes (for whatever reason). And now if you visit a series of pages in quick succession lots of data centers remain in memory but only ever get freed approximately at the same time. At least this is how I am reading the traces of the memory profiler.

Namely when I visit 7 pages in quick succession, take a snapshot, wait for 10 min until it frees and take another snapshot then diff them I see the corresponding amount of datacenters being freed. If i do just one side swap its just one datacenter.

While this does explain the phenomena we observe I have no idea how one might convince JS VM to...well not behave like this?

Utensil Song (Dec 11 2023 at 03:24):

I had a few local attempts to fix around the Promise idea, but none seems to work.

One thing that puzzles me is JS memory seems to be significantly less than total memory. JS memory is usually lower, ~200MB, while the total memory footprint is ~500MB+, and burst more on each search.

image.png

In the time I type the above, total memory down to 300M, JS memory down to 60MB, something just doesn't add up.

Utensil Song (Dec 11 2023 at 03:32):

To make it more obvious, searching more times (I start with a, get to k) results in larger difference: 1G vs 370MB

image.png

Winston Yin (尹維晨) (Dec 13 2023 at 03:26):

I checked again throughout the day, and the memory usage is ~ 220MB, which is not low but still much more reasonable than before. Thanks for looking into this. I look forward to future optimisations!

Henrik Böving (Dec 13 2023 at 07:30):

(I didn't do anything to fix this because idk how, its still the same code and an ever so slightly changed JSON format)


Last updated: Dec 20 2023 at 11:08 UTC