Zulip Chat Archive

Stream: general

Topic: Adding logging for cases where oleans are not picked up


Ryan Lahfa (Nov 01 2021 at 12:24):

This is a question to Lean 3 maintainers mostly, but I am having a lot of difficulties to understand why Lean 3 makes some decisions in respect to olean reuse vs lean compilation from source.

From what I gather:

  • A module can be marked as: it has to be compiled from source, this is the case when I do: lean --make src and src must be compiled from source ;
  • Some flag can_use_oleans is passed through the machinery to allow or not oleans, sometimes, it gets set, sometimes it does not.
  • If an olean can be found and its hash is compatible, then it gets reused.

Initially, I was making the changes to support emscripten v2 and file_exists relied on stat syscall to find out if a file existed or not, though emscripten didn't export $ERRNO_CODES properly, so in JavaScript, whenever there was an error with an FS syscall: the table this.ERRNO_CODES[t.code] returned undefined, then an exception with undefined errno was thrown, and it bubbled incorrectly as the file existed (error code: 0, which means everything is fine :>), making believe Lean that the .lean existed when it didn't because next it tried to read the file, and it failed. (hinting on the stat issue, thanks the modern DevTools for being able to debug WASM!)

It helped a lot during the debugging to have debug traces of the module_mgr class, now I'm doing something similar except that I'm not in the browser anymore, but I have a family of symlinks: https://github.com/RaitoBezarius/nixexprs/blob/master/pkgs/science/lean/make-lean-game.nix#L57-L66

Where I layer the source of a Lean repo containing a src/ and all of its dependencies (a family of symlinks of the form _target/deps/${dep}, with a special behavior for mathlib where I go get the Azure cache in addition).

What happens specifically for mathlib is the following:

/nix/store/5ip4qjgmxxn7zfr9m2iy3vk6mi7lqbfh-Natural-Number-Game-library-src-bundle/_target/deps
└──mathlib -> /nix/store/di71hzh91zkv6jj7bvnfmg3c3fq1f02q-mathlib-with-cached-olean

This derivation mathlib-with-cached-olean shows for example:

> ls -lah /nix/store/5ip4qjgmxxn7zfr9m2iy3vk6mi7lqbfh-Natural-Number-Game-library-src-bundle/_target/deps/mathlib/src/logic
Permissions Size User Date Modified  Name
lrwxrwxrwx    67 root  1 janv.  1970 basic.lean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/basic.lean
lrwxrwxrwx    68 root  1 janv.  1970 basic.olean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/basic.olean
lrwxrwxrwx    71 root  1 janv.  1970 embedding.lean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/embedding.lean
lrwxrwxrwx    72 root  1 janv.  1970 embedding.olean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/embedding.olean
lrwxrwxrwx    70 root  1 janv.  1970 function.lean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/function.lean
lrwxrwxrwx    71 root  1 janv.  1970 function.olean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/function.olean
lrwxrwxrwx    70 root  1 janv.  1970 relation.lean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/relation.lean
lrwxrwxrwx    71 root  1 janv.  1970 relation.olean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/relation.olean
lrwxrwxrwx    69 root  1 janv.  1970 relator.lean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/relator.lean
lrwxrwxrwx    70 root  1 janv.  1970 relator.olean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/relator.olean
lrwxrwxrwx    68 root  1 janv.  1970 unique.lean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/unique.lean
lrwxrwxrwx    69 root  1 janv.  1970 unique.olean -> /nix/store/3wrkkns24gsbpir93nrw0jx9aflif1g9-source/logic/unique.olean

Which is sane, but for some reason, it looks like it do not get picked by lean --make src and mathlib gets recompiled. Plus, as the compilation gets done in a Nix sandbox, it seems like isatty(STDOUT_FILENO) is false (?), therefore, no progress can be shown (or I'm wrong on what progress means).

Leading me to this topic:

  • Could there be a way to design a debugging machinery for oleans non-reuse in Lean?
  • Could there be a way to show progress with: which file is getting compiled atm?

Thanks!

Ryan Lahfa (Nov 01 2021 at 12:26):

Bonus question: what to do of very old Lean versions? It seems like NNG was made for 3.4.2, I guess 3.5.0 could work, but it seems like the olean machinery relies on mtime, so I guess Nix timestamps are not an issue here.

Gabriel Ebner (Nov 01 2021 at 12:27):

NNG was made for 3.4.2

Can we upgrade it? That seems like the easiest solution here. 3.4.2 is ancient

Gabriel Ebner (Nov 01 2021 at 12:27):

olean machinery relies on mtime

No, it only relies on file hashes.

Ryan Lahfa (Nov 01 2021 at 12:28):

Gabriel Ebner said:

olean machinery relies on mtime

No, it only relies on file hashes.

Even for old versions like 3.5.0 ?

        auto olean_fn = olean_of_lean(lean_fn);
        shared_file_lock olean_lock(olean_fn);
        auto olean_mtime = get_mtime(olean_fn);
        if (olean_mtime != -1 && olean_mtime >= lean_mtime &&
            can_use_olean &&
            !m_modules_to_load_from_source.count(id) &&
            is_candidate_olean_file(olean_fn)) {
            return std::make_shared<module_info>(id, read_file(olean_fn, std::ios_base::binary), module_src::OLEAN, olean_mtime);
        }

seems to be hinting that?

Gabriel Ebner (Nov 01 2021 at 12:28):

Could there be a way to show progress with: which file is getting compiled atm?

This is indeed shown, but only if stdout is a tty as you've discovered.

Gabriel Ebner (Nov 01 2021 at 12:28):

No, it only relies on file hashes.

Even for old versions like 3.5.0 ?

No, 3.5.0 is ancient.

Ryan Lahfa (Nov 01 2021 at 12:29):

Gabriel Ebner said:

NNG was made for 3.4.2

Can we upgrade it? That seems like the easiest solution here. 3.4.2 is ancient

I think that's definitely the best solution, my NUR compiles and caches all versions starting 3.24.0 (https://github.com/RaitoBezarius/nixexprs/blob/master/pkgs/science/lean/default.nix#L16-L30) and 3.5.0 too (but disabling tests as some libleanshared stuff is not picked up correctly by Nix I think).

Ryan Lahfa (Nov 01 2021 at 12:29):

Gabriel Ebner said:

No, it only relies on file hashes.

Even for old versions like 3.5.0 ?

No, 3.5.0 is ancient.

What would you say is the "starting version" for modern Lean 3 :) — I used 3.24.0 as a boundary but unsure.

Gabriel Ebner (Nov 01 2021 at 12:30):

When it comes to oleans and recompilation I don't think much has changed since 3.7.2

Ryan Lahfa (Nov 01 2021 at 12:32):

Gabriel Ebner said:

Could there be a way to show progress with: which file is getting compiled atm?

This is indeed shown, but only if stdout is a tty as you've discovered.

I didn't get that far: is that a necessary condition for stdout to be a tty?

Gabriel Ebner (Nov 01 2021 at 12:32):

necessary and sufficient iirc

Ryan Lahfa (Nov 01 2021 at 12:34):

Gabriel Ebner said:

necessary and sufficient iirc

Then, do you see a way to have an escape hatch for Nix builds or that might be too complicated?

Gabriel Ebner (Nov 01 2021 at 12:36):

sed -i 's/isatty(STDOUT_FILENO)/true/' src/shell/lean.cpp

:smile:

Ryan Lahfa (Nov 01 2021 at 12:36):

:D

Ryan Lahfa (Nov 01 2021 at 12:50):

what about something like this @Gabriel Ebner :

From 0f07210d63cc668d9e130cc8163064dc932dcfd8 Mon Sep 17 00:00:00 2001
From: Raito Bezarius <masterancpp@gmail.com>
Date: Mon, 1 Nov 2021 13:47:56 +0100
Subject: [PATCH] shell(progress): add a flag to force progress even if stdout
 is not a tty

---
 src/shell/lean.cpp | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/src/shell/lean.cpp b/src/shell/lean.cpp
index 2a20bec60..37251e9b3 100644
--- a/src/shell/lean.cpp
+++ b/src/shell/lean.cpp
@@ -243,6 +243,7 @@ static struct option g_long_options[] = {
     {"deps",         no_argument,       0, 'd'},
     {"test-suite",   no_argument,       0, 'e'},
     {"timeout",      optional_argument, 0, 'T'},
+    {"force-show-progress", no_argument, 0, 'F'},
 #if defined(LEAN_JSON)
     {"json",         no_argument,       0, 'J'},
     {"path",         no_argument,       0, 'p'},
@@ -260,7 +261,7 @@ static struct option g_long_options[] = {
 };

 static char const * g_opt_str =
-    "PdD:qpgvhet:012E:A:B:j:012rM:012T:012"
+    "FPdD:qpgvhet:012E:A:B:j:012rM:012T:012"
 #if defined(LEAN_MULTI_THREAD)
     "s:012"
 #endif
@@ -431,15 +432,16 @@ int main(int argc, char ** argv) {
     LEAN_EMSCRIPTEN_FS
 #endif
     ::initializer init;
-    bool make_mode          = false;
-    bool export_tlean       = false;
-    bool export_ast         = false;
-    bool use_old_oleans     = false;
-    bool report_widgets     = true;
-    bool recursive          = false;
-    unsigned trust_lvl      = LEAN_BELIEVER_TRUST_LEVEL+1;
-    bool only_deps          = false;
-    bool test_suite         = false;
+    bool make_mode           = false;
+    bool export_tlean        = false;
+    bool export_ast          = false;
+    bool use_old_oleans      = false;
+    bool report_widgets      = true;
+    bool recursive           = false;
+    unsigned trust_lvl       = LEAN_BELIEVER_TRUST_LEVEL+1;
+    bool only_deps           = false;
+    bool test_suite          = false;
+    bool force_show_progress = false;
     unsigned num_threads    = 0;
 #if defined(LEAN_MULTI_THREAD)
     num_threads = hardware_concurrency();
@@ -511,6 +513,9 @@ int main(int argc, char ** argv) {
         case 'q':
             opts = opts.update(lean::get_verbose_opt_name(), false);
             break;
+   case 'F':
+      force_show_progress = true;
+      break;
         case 'd':
             only_deps = true;
             break;
@@ -613,7 +618,7 @@ int main(int argc, char ** argv) {

     log_tree lt;

-    bool show_progress = (make_mode || export_tlean) && isatty(STDOUT_FILENO);
+    bool show_progress = (make_mode || export_tlean) && (isatty(STDOUT_FILENO) || force_show_progress);
     progress_message_stream msg_stream(std::cout, json_output, show_progress, lt.get_root());
     if (json_output) ios.set_regular_channel(ios.get_diagnostic_channel_ptr());

--
2.33.0

(modulo formatting)?

Ryan Lahfa (Nov 01 2021 at 12:50):

(except it might not apply very well across all versions hm)

Gabriel Ebner (Nov 01 2021 at 12:51):

Looks reasonable as well. If it's just for the one-off emscripten build, then I'd just do the sed version.

Ryan Lahfa (Nov 01 2021 at 12:54):

It would get used when I make library.zip bundles, so it's for the Lean x86_64 builds and if I do the sed version, I'd need to compile twice Lean I believe.

Now, I have a way to know which file is getting compiled during library making, all that's left is to understand why an olean is not picked up, would you be interested in a patch towards this with debugging traces for emscripten and x86_64? @Gabriel Ebner

Gabriel Ebner (Nov 01 2021 at 12:56):

Maybe you're not using the same lean version to compile the olean as you use for the javascript version (there's a check that you can disable with -DCHECK_OLEAN_VERSION=OFF).

Ryan Lahfa (Nov 01 2021 at 12:59):

Hm, is there a way to find out which lean version was compiled for a given olean? I'm using the oleans from the Azure cache for mathlib, so it could be this.

Gabriel Ebner (Nov 01 2021 at 13:03):

You can look in the leanpkg.toml file inside mathlib to figure out the correct Lean version to use.

Gabriel Ebner (Nov 01 2021 at 13:05):

also see https://github.com/NixOS/nixpkgs/pull/124507

Ryan Lahfa (Nov 01 2021 at 13:11):

Ha, nice!

Ryan Lahfa (Nov 01 2021 at 13:16):

Many thanks for all the insights @Gabriel Ebner !

Kevin Buzzard (Nov 01 2021 at 13:17):

It's on my job list for this month to deal with natural number game issues, I can update lean as part of it. My guess is that it will be extremely easy to bump mathlib although perhaps I might need help if things like the tactic hack breaks

Ryan Lahfa (Nov 01 2021 at 13:18):

Kevin Buzzard said:

It's on my job list for this month to deal with natural number game issues, I can update lean as part of it. My guess is that it will be extremely easy to bump mathlib although perhaps I might need help if things like the tactic hack breaks

To be honest, I quickly tried:
image.png

For now, I'll start another Lean Game project from scratch to perform tests and come back to NNG to help porting it (and learn more about those tactic hacks!)

Ryan Lahfa (Nov 01 2021 at 14:17):

Getting back to disabling olean version check, I added some logging, here's an example of a trace:
olean /nix/store/r1cwxp6krlghnb30gn3g5nm68djhjqb7-mathlib-azure-olean-with-src/src/tactic/simp_result.olean was rejected (can use olean: no, does it exist: yes, is it a module to load from source: no).

Ryan Lahfa (Nov 01 2021 at 14:17):

So it does exist, it's not a module to load from source, but for some reason, it cannot use olean, what are the causes of this?

Ryan Lahfa (Nov 01 2021 at 14:18):

            // If anything in the reflexive-transitive closure of this module under the import relation
            // has changed, rebuild the module.
            if (mod->m_trans_hash != actual_trans_hash && !m_use_old_oleans)
                return build_module(id, false, orig_module_stack);

sounds like this is due to this

Ryan Lahfa (Nov 01 2021 at 14:18):

so if I pass --old-oleans maybe I have a chance :D

Ryan Lahfa (Nov 01 2021 at 14:25):

joy: Added 1887 olean files from r1cwxp6krlghnb30gn3g5nm68djhjqb7-mathlib-azure-olean-with-src

Ryan Lahfa (Nov 01 2021 at 14:54):

Okay, I quite got there through disabling olean version check and forcing old oleans, just unsure if the transitive hash thing might become a problem.

Ryan Lahfa (Nov 01 2021 at 17:44):

In order to verify how these transitive hash works, I'd like to verify them externally using Python, so I tried to parse oleans, according to the C++ code, there should be:

  • header (7s)
  • version (27s in the case of 3.35.0, commit a68d251bfc57)
  • source hash (unsignedunsigned int?)
  • transitive hash (unsigned)
  • claimed blob hash (unsigned)
  • uses sorry or not (bool)
  • number of imports (unsigned)
  • for each import, a module_name, that is: a name structure and a optional<unsigned>

Assuming those are unsigned short in fact.
Just parsing up to the sorry field seems strange

>>> struct.unpack_from('=9sx27sxHHH?H', data)
(b'oleanfile', b'3.35.0, commit a68d251bfc57', 39679, 23212, 65475, True, 46697)

And 46697 imports (dlist/basic.lean) seems too much. Is there any doc on the technical details of these formats (or existing external code of parsing) ?

Eric Wieser (Nov 01 2021 at 17:49):

Can you link to the relevant c++ code?

Eric Wieser (Nov 01 2021 at 17:50):

H is short not int though

Ryan Lahfa (Nov 01 2021 at 17:53):

Eric Wieser said:

H is short not int though

Indeed, I wrote I'm assuming that those are unsigned short in fact instead of unsigned int (with I, it shows way bigger numbers).

Ryan Lahfa (Nov 01 2021 at 17:53):

Eric Wieser said:

Can you link to the relevant c++ code?

https://github.com/leanprover-community/lean/blob/master/src/library/module.cpp#L632-L649

Eric Wieser (Nov 01 2021 at 17:54):

https://github.com/leanprover-community/lean/blob/65ad4ffdb3abac75be748554e3cbe990fb1c6500/src/util/serializer.h#L65-L72

https://github.com/leanprover-community/lean/blob/65ad4ffdb3abac75be748554e3cbe990fb1c6500/src/util/serializer.cpp#L26-L37

Eric Wieser (Nov 01 2021 at 17:55):

It's a variable width format

Ryan Lahfa (Nov 01 2021 at 17:56):

Indeed

Ryan Lahfa (Nov 01 2021 at 17:57):

that will be ugly to parse

Ryan Lahfa (Nov 01 2021 at 17:58):

hm isn't just number in another endianness?

Ryan Lahfa (Nov 01 2021 at 17:58):

(and in 64 bits no matter what)

Eric Wieser (Nov 01 2021 at 17:58):

The struct module is unsuitable for parsing this

Eric Wieser (Nov 01 2021 at 17:58):

No, it's 1 byte if the number is <255, and 5 bytes if its greater

Ryan Lahfa (Nov 01 2021 at 17:59):

I meant in the case where it's >=255

Eric Wieser (Nov 01 2021 at 17:59):

You don't care about efficiency I assume for checking the hashes, just write a simple tokenizer

Ryan Lahfa (Nov 01 2021 at 17:59):

excluding the -1

Ryan Lahfa (Nov 01 2021 at 17:59):

I indeed don't care

Ryan Lahfa (Nov 01 2021 at 17:59):

but rather than write something new, maybe I will just create a new Lean executable lean_check_olean which will reuse the existing deserializing machinery

Eric Wieser (Nov 01 2021 at 18:04):

We're talking around 15 lines, probably:

def read_string(s):
    sofar = io.BytesIO()
    while (c := next(s)) != 0:
        sofar.write(bytes([c]))
    return sofar.getvalue()

# repeat for `read_unsigned`

with open('some_olean', 'rb') as f:
    header = read_string(f)
    version = read_string(f)
    # etc

Ryan Lahfa (Nov 01 2021 at 18:05):

The name deserializer won't be simple at all though

Ryan Lahfa (Nov 01 2021 at 18:05):

And if I want to reproduce the transitive hash computations, I'll need it I believe

Ryan Lahfa (Nov 01 2021 at 18:06):

Excluding shenanigans with transitive hashes of oleans (that is, use_old_oleans=True on emscripten bundle and Lean x86_64):
image.png
it works fine :)

Ryan Lahfa (Nov 01 2021 at 18:07):

(@ name deserializer: I mean this: https://github.com/leanprover-community/lean/blob/master/src/util/name.cpp#L515-L538)

Ryan Lahfa (Nov 01 2021 at 18:57):

well, doing it quickly in C++ yields to:

reading /nix/store/r3iqrhwik0fx49b75vxhfkpfjdg50h4p-mathlib-with-cached-olean/src/algebra/abs.lean
module loaded, is olean: yes
olean parsed.
transitive hash computed.
mod trans hash: 1918900608
actual trans hash: 639958348

so far so good :)

Ryan Lahfa (Nov 01 2021 at 21:10):

For fun, here's a graph of valid/invalid transitive dependencies: graph.svg

I guess this is due to the fact that src_hash = hash(remove_cr(lean_file)) changes due to the missing githash, which induces invalid transitive hash at some parts.

Ryan Lahfa (Nov 01 2021 at 23:48):

Okay, implementing the same thing as upstream nixpkgs for githash made everything work just fine, :-).
I pushed most of the stuff there: https://github.com/RaitoBezarius/nixexprs/tree/master/pkgs/science/lean if people are actually interested. I will now be able to focus more on frontend. Some patches are there to help for olean diagnostics over emscripten and x86_64 (hidden behind debugOleans attribute which enable very verbose logging.)

Ryan Lahfa (Nov 01 2021 at 23:48):

Thanks everyone for your insights :)


Last updated: Dec 20 2023 at 11:08 UTC