Zulip Chat Archive

Stream: lean4

Topic: Dealing with large computation (memory)


Andrés Goens (Jul 05 2022 at 14:42):

What would be the idiomatic/ideal way of dealing with memory issues (where I'm running out of memory) in Lean4? I.e. if I know where I can serialize and store to disk some of my computation, is there a nice way to do this? Or perhaps phrase differently, how much does it involve me weaving IO into everything and defining a bunch of toByteArray functions manually?

James Gallicchio (Jul 07 2022 at 13:29):

It would definitely involve weaving IO around everywhere, at least without substantial hacks :thinking:

RE: serialization, we could definitely derive serializations for most types. that would actually be a quite useful library, if you don't want to build it I might

RE: weaving IO around everything, what I would personally do is set up your computation to be parameterized by the disk-saved values it needs. then you can do all the IO to save partial results to disk, open files, etc. in one top-level function, instead of having to pass IO around everywhere

James Gallicchio (Jul 07 2022 at 13:29):

(out of curiosity, what are you doing that's so memory intensive?)

Andrés Goens (Jul 07 2022 at 19:23):

I feared something like that :sweat_smile: that's a cool suggestion (parametrizing the computation). I've never looked into the code that does the derivation of instances, but that sounds like it would be a fun thing to try. I'd be happy to team up on building that library @James Gallicchio :)

Andrés Goens (Jul 07 2022 at 19:24):

James Gallicchio said:

(out of curiosity, what are you doing that's so memory intensive?)

It's basically an exhaustive search over a transition system (operational semantics) to check wether some behaviors are possible

James Gallicchio (Jul 07 2022 at 23:14):

Andrés Goens said:

I feared something like that :sweat_smile: that's a cool suggestion (parametrizing the computation). I've never looked into the code that does the derivation of instances, but that sounds like it would be a fun thing to try. I'd be happy to team up on building that library James Gallicchio :)

ooh! I would also be up for that :) I'll have time this weekend/next week. shall I make a repo?

Andrés Goens (Jul 08 2022 at 07:59):

sounds good! I could also give it a shot later today/this weekend

Sebastian Ullrich (Jul 08 2022 at 08:31):

Lean has internal serialization (and zero-cost mmap deserialization) routines that can handle any Lean value except for closures and external types. This is used for .olean files, but not currently exposed as a general Lean API.

Andrés Goens (Jul 08 2022 at 10:56):

@Sebastian Ullrich that sounds pretty useful! Do you think it would make sense for @James Gallicchio and me to try to derive Hashable instances using that, or does that mean we should wait until these routines are exposed as a general Lean API?

Sebastian Ullrich (Jul 08 2022 at 11:04):

Wait, where did Hashable come in from now?

James Gallicchio (Jul 08 2022 at 11:27):

(I presume Andres meant a Serializable typeclass)

James Gallicchio (Jul 08 2022 at 11:36):

@Sebastian Ullrich if I understand correctly, the internal serialization is just copying the Lean ABI representation of the value? (with some sort of special handling to ensure the reference counts are correct?)

The only issue with the ABI encoding is space efficiency. A serialization routine that optimizes better for space would be useful for network applications, but I am guessing the internal serialization would be fine for @Andrés Goens 's application

Sebastian Ullrich (Jul 08 2022 at 11:44):

Yeah, specialized encodings could be much more space-efficient. And the internal serialization might require non-trivial work before it could be exposed as a general API. For example, the serialized objects do not contain reference counts at all, so there is no safe way to free them.

James Gallicchio (Jul 08 2022 at 12:15):

Ah, I see... Okay, then it won't take us too long to put together something workable for now

Andrés Goens (Jul 08 2022 at 14:25):

Sebastian Ullrich said:

Wait, where did Hashable come in from now?

oh sorry, that was a lapsus, I meant something like Serializable or ToByteArray


Last updated: Dec 20 2023 at 11:08 UTC