Zulip Chat Archive

Stream: lean4

Topic: Lazy initialization

Eric Wieser (Feb 14 2025 at 12:08):

Does something like this already exist?

/-- Wrapper for a lazily-initializer reference. -/
structure LazilyInitialized (α : Type) : Type where
  private ref : IO.Ref (Option α)
  private mutex : Std.BaseMutex
  private init : IO α
deriving Nonempty

def LazilyInitialized.new {α} (init : IO α) : IO (LazilyInitialized α) := do
  return { ref := ← IO.mkRef none, mutex := ← Std.BaseMutex.new, init := init}

/-- Obtain the value, constructing it in a thread-safe way if necessary. -/
def LazilyInitialized.get {α} (l : LazilyInitialized α) : IO α := do
  if let .some a ← l.ref.get then
    return a
  try
    l.mutex.lock
    if let .some a ← l.ref.get then
      return a
    let a ← l.init
    l.ref.set (some a)
    return a
  finally
    l.mutex.unlock

initialize foo : LazilyInitialized Foo ← LazilyInitialized.new do
  someExpensiveIOCall

Eric Wieser (Feb 14 2025 at 12:31):

(context: nomeata/loogle#26)

Joachim Breitner (Feb 14 2025 at 13:11):

Looks like an IO version of docs#Thunk

Joachim Breitner (Feb 14 2025 at 13:13):

(I’ll watch this discussion in case something new comes up before merging your PR)

Eric Wieser (Feb 14 2025 at 14:25):

Ah indeed, though Thunk uses atomics behind the scenes rather than a mutex

Eric Wieser (Feb 14 2025 at 14:25):

Maybe this warrants a ThunkT monad transformer, with Thunk := ThunkT Id?

Eric Wieser (Feb 17 2025 at 12:11):

So something like

import Std
import Lean

/-- A monadic thunk, that memorizes its result after the first execution.

Can also be thought of as a monadic lazy initializer-/
structure ThunkM (m : Type → Type) [Monad m] [MonadLiftT BaseIO m] [MonadFinally m] (α : Type) : Type where
  private ctor ::
    private ref : IO.Ref (Option α)
    private mutex : Std.BaseMutex
    private init : m α

variable {m : Type → Type} {α} [Monad m] [MonadLiftT BaseIO m] [MonadFinally m]

instance [Nonempty (m α)] : Nonempty (ThunkM m α) :=
  let ⟨ref⟩ := inferInstanceAs <| Nonempty (IO.Ref (Option α))
  let ⟨mutex⟩ := inferInstanceAs <| Nonempty Std.BaseMutex
  let ⟨init⟩ := inferInstanceAs <| Nonempty (m α)
  ⟨{ref, mutex, init}⟩

/-- Create a monadic thunk from an initializer.

When `.get` is called and the initializer succeeds, its result is cached. -/
def ThunkM.mk (init : m α) : BaseIO (ThunkM m α) := do
  return { ref := ← IO.mkRef none, mutex := ← Std.BaseMutex.new, init := init}

/-- Obtain the value, constructing it in a thread-safe way if necessary. -/
def ThunkM.get (l : ThunkM m α) : m α := do
  if let .some a ← (l.ref.get : BaseIO (Option α)) then
    return a
  try
    l.mutex.lock
    if let .some a ← (l.ref.get : BaseIO (Option α)) then
      return a
    let a ← l.init
    (l.ref.set (some a) : BaseIO Unit)
    return a
  finally
    l.mutex.unlock

/-- info: ran initializer once -/
#guard_msgs in
run_meta do
  let lazy1 : ThunkM Lean.MetaM Nat ← ThunkM.mk <| do
    Lean.logInfo "ran initializer once"
    return 37
  lazy1.get
  lazy1.get

Joachim Breitner (Feb 17 2025 at 13:04):

Remind me why do you want to lift it out of IO? I guess now m can do “things”, but since you don't really know when the first get it will be invoked, it seems like it can be somewhat confusing and dangerous to have the effects of m appear there.

Eric Wieser (Feb 17 2025 at 13:08):

I don't have a good motivation for that besides trying to generalize. Indeed, it's hard to think of a monad where this lift wouldn't be confusing and dangerous

Joachim Breitner (Feb 17 2025 at 13:08):

Ok, that’s what I thought :-)

Eric Wieser (Feb 17 2025 at 13:09):

Maybe a version that cached the monad state and context at the time of initialization and used that in .get would be sane

Joachim Breitner (Feb 17 2025 at 13:10):

Yes, possibly, using docs#MonadControlT or so. But even then it may be confusing that any side-effects are dropped.

Eric Wieser (Feb 17 2025 at 13:10):

But it's more than I need, so maybe the original with name IO.Thunk is best?

Joachim Breitner (Feb 17 2025 at 13:11):

If in doubt, don’t generalize. (It’s code, not math)

Eric Wieser (Feb 17 2025 at 13:12):

I guess ExceptT BonusEx IO would be a place where the generality is useful

Joachim Breitner (Feb 17 2025 at 13:13):

Really? In that case the action would be thrown upon each .get and never cached (based on your code above)

Eric Wieser (Feb 17 2025 at 13:13):

That's already what it does with io errors

Joachim Breitner (Feb 17 2025 at 13:14):

Hmm, fair point. Maybe it should live in BaseIO then :-)

Joachim Breitner (Feb 17 2025 at 13:16):

Probably that’s why docs#BaseIO.asTask is also in BaseIO

Eric Wieser (Feb 17 2025 at 13:17):

Joachim Breitner said:

Hmm, fair point. Maybe it should live in BaseIO then :-)

Or there should be both BaseIO.Thunk and IO.Thunk?

Eric Wieser (Feb 17 2025 at 13:22):

For what it's worth, Python's @functools.cache has the behavior of not caching errors, like IO.Thunk would (https://github.com/python/cpython/issues/65736)

Joachim Breitner (Feb 17 2025 at 14:20):

Maybe BaseIO.Thunk is enough. IO.Thunk sounds dangerous, unless it does store the exception (i.e. is a thin wrapper around BaseIO.Thunk)

Eric Wieser (Feb 17 2025 at 15:36):

Would you like me to make such a change to the loogle PR, or is it blocked until this discussion is resolved anyway?

Joachim Breitner (Feb 17 2025 at 15:39):

I guess I can merge this change (with a BaseIO-typed-thunk) to unblock your experiments using Loogle locally.

Last updated: May 02 2025 at 03:31 UTC