Zulip Chat Archive

Stream: lean4

Topic: modifiable state inside a ReaderM struct

Chris Lovett (Sep 26 2022 at 21:44):

Wow, I just discovered you can do this:

structure Jobs where
   processed : IO.Ref (List String)

def process (name : String): ReaderT Jobs IO Unit := do
  let jobs ← read
  jobs.processed.modify (λ s => name :: s)

def test (jobs : List String) : ReaderT Jobs IO Unit := do
  for job in jobs do
    process job

def main (args : List String): IO Unit := do
  let processedRef ← IO.mkRef ([])
  let jobs : Jobs := {processed := processedRef}
  test args |>.run jobs
  let result ← jobs.processed.get
  IO.println result

#eval main ["apple", "banana", "orange"] --[orange, banana, apple]

Looking at the comments on lean_st_ref_get it appears this is also thread safe which is cool. I assume this should be documented in Monads somewhere? I tried to rewrite my sample above using the lower level ST.Ref primitives but got stuck. Does anyone know how to do that?

Mario Carneiro (Sep 26 2022 at 22:00):

A similar trick you may or may not be aware of is the StateRefT monad transformer, which has the appearance and semantics of StateT but is actually a ReaderT (ST.Ref _) under the hood

Mario Carneiro (Sep 26 2022 at 22:02):

It's generally preferred whenever you are doing mostly linear updates to a data structure, since the use of mutable references makes it easier to ensure that you are doing efficient destructive updates and you aren't persisting the old state (unless you want to)

Mario Carneiro (Sep 26 2022 at 22:05):

Here's a version of your program in ST:

structure Jobs (ω) where
   processed : ST.Ref ω (List String)

def process (name : String) : ReaderT (Jobs ω) (ST ω) Unit := do
  let jobs ← read
  jobs.processed.modify (λ s => name :: s)

def test (jobs : List String) : ReaderT (Jobs ω) (ST ω) Unit := do
  for job in jobs do
    process job

def foo (args : List String) : List String :=
  runST fun ω => do
    let processedRef ← ST.mkRef ([])
    let jobs : Jobs ω := {processed := processedRef}
    test args |>.run jobs
    let result ← jobs.processed.get
    pure result

#eval foo ["apple", "banana", "orange"] --[orange, banana, apple]

Chris Lovett (Sep 26 2022 at 22:11):

Cool, thanks, also unfortunately "goto definition" doesn't work on StateRefT, and find all returns a very long list, do you happen to know where it is defined?

Mario Carneiro (Sep 26 2022 at 22:18):

StateRefT is actually a macro made to look like a regular definition, in order to hide the infectious ω type argument. The actual definition is StateRefT'

Mario Carneiro (Sep 26 2022 at 22:24):

Here's a version using StateRefT:

def process (name : String) : StateRefT (List String) (ST ω) Unit := do
  modify fun s => name :: s

def test (jobs : List String) : StateRefT (List String) (ST ω) Unit := do
  for job in jobs do
    process job

def foo (args : List String) : List String :=
  runST fun _ => do
    let (_, result) ← test args |>.run ([])
    pure result

#eval foo ["apple", "banana", "orange"] --[orange, banana, apple]

Chris Lovett (Sep 26 2022 at 22:27):

ah, but in that StateRefT version you have a direct "modify" method so it's looking more like a StateM monad at that point, the previous one is more interesting to me because the functions are all ReaderT, but one of them reaches in and modifies an ST.Ref structure field... which almost seems to break the promise that ReaderT is readonly...

Mario Carneiro (Sep 26 2022 at 22:28):

The way you should think about it is that an IO.Ref is like a pointer, and the pointer doesn't change even though the data behind it does. So the Ref can be a regular immutable value stored in a function parameter or in the reader context, and all the mutation goes via the impure IO.Ref.get and IO.Ref.set functions. ST ω is just a way of doing the same thing as IO but with a "local heap" identified by ω, and with a safe runST function for running an ST computation in pure code.

Chris Lovett (Sep 26 2022 at 23:13):

last question, what is ST short for in namespace ST ? Can you write paragraph that motivates ST ? Specifically, how is it different from breaking your context into 2 structs so you can use ReaderT ReadOnlyContext (StateT ReadWriteContext) ... ? Is this ST.Ref in a readonly context achieving something you can't do that way? For example, the Lean LSP server uses one of these for the RBMap that stores FileWorker objects, in the ServerContext where the ServerM monad is ReaderT ServerContext IO. Could this have been done using ReaderT ServerContext (StateT FileWorkerMap) ... ? So is ST just a shortcut for lazy people? :-) Or does this somehow also make updating the RBMap more efficient somehow?

Mario Carneiro (Sep 26 2022 at 23:23):

By the way, in case you were missing the context, ST is an innovation from haskell. If you search for runST online you will probably find some discussions about it

Mario Carneiro (Sep 26 2022 at 23:24):

The whole design and most of the names are lifted straight from there

Mario Carneiro (Sep 26 2022 at 23:24):

ST is short for "state" I guess

Mario Carneiro (Sep 26 2022 at 23:30):

I'm not even sure it's all that great a fit for lean. The trick with using type variables for "branding" local heaps is foreign to lean (and not even relevant since the compiler doesn't take advantage of it), and the things ST does are pretty close to provably impossible within the lean logic

Chris Lovett (Sep 26 2022 at 23:37):

Oh, https://wiki.haskell.org/Monad/ST - it is "strict Sate Threads", so there is a real connection to multithreading... which leads to https://www.microsoft.com/en-us/research/publication/lazy-functional-state-threads/ yummy.

Mario Carneiro (Sep 26 2022 at 23:40):

I guess the idea is that ST values can be mutated thread-unsafely because the data is constrained to live on one thread?

Mario Carneiro (Sep 26 2022 at 23:40):

I can't imagine that really working out in lean

Mario Carneiro (Sep 26 2022 at 23:45):

oh, maybe it's short for Single Threaded

Chris Lovett (Sep 26 2022 at 23:45):

It looks like lean_st_ref_get is trying to do some thread safe stuff...

Mario Carneiro (Sep 26 2022 at 23:46):

yeah, in lean they are definitely just the same thing as IO refs

Mario Carneiro (Sep 26 2022 at 23:48):

in fact, IO.Ref is just a wrapper around ST.Ref, ST.Ref.get and IO.Ref.get are the same function

Chris Lovett (Sep 27 2022 at 00:44):

Trying to run each "process" job in a separate Task, but I get back an empty processed list, what am I doing wrong? (I wish we had tasks.waitAll)

structure Jobs (ω) where
   processed : ST.Ref ω (List String)

def process (name : String) : ReaderT (Jobs ω) (ST ω) Unit := do
  let jobs ← read
  jobs.processed.modify (λ s => name :: s)

def processTasks (args: List String) : ReaderT (Jobs ω) (ST ω) Unit := do
  let jobs ← read
  let tasks := args.map (λ s => Task.spawn fun _ => dbgSleep 1000 fun _ => process s |>.run jobs)
  let result := tasks.map (λ s => s.get)
  return

def foo (args : List String) : List String :=
  runST fun ω => do
    let processedRef ← ST.mkRef ([])
    let jobs : Jobs ω := {processed := processedRef}
    processTasks args |>.run jobs
    let result ← jobs.processed.get
    return result

#eval foo ["apple", "banana", "orange"] --[] ???

Chris Lovett (Sep 27 2022 at 00:46):

and is there an easier way? Does lean have a parallel.foreach ?

Mario Carneiro (Sep 27 2022 at 01:01):

You never ran the ST actions

Mario Carneiro (Sep 27 2022 at 01:01):

Are you getting an unused variable result on the let result := line? It should have been a red flag

Chris Lovett (Sep 27 2022 at 01:03):

So process s |>.run jobs doesn't run it?

Mario Carneiro (Sep 27 2022 at 01:04):

What you did in processTasks is to create a List (Task (ST ω Unit)) consisting of tasks that will resolve to an ST action after 1 second, then mapping Task.get on them to get a List (ST ω Unit) (note that at this point none of these ST actions have run yet). The list is then dropped, and lean being a functional language sees that all the work is pointless and (correctly) optimizes the whole function to pure ()

Mario Carneiro (Sep 27 2022 at 01:05):

It runs the ReaderT wrapper, which means that it accepts the Jobs reference and resolves to an action in the base monad, here ST

Mario Carneiro (Sep 27 2022 at 01:05):

you still have to run the ST action

Mario Carneiro (Sep 27 2022 at 01:06):

And you can't, this is the whole point of the "single threaded" business for the ST monad

Mario Carneiro (Sep 27 2022 at 01:06):

You would have to be in the original runST invocation (that is, the main thread) in order to run it

Mario Carneiro (Sep 27 2022 at 01:07):

inside your task you are computing a pure function so you don't have access to any state to mutate

Mario Carneiro (Sep 27 2022 at 01:08):

What you actually want is to create an impure task, using the confusingly-located BaseIO.asTask function

Mario Carneiro (Sep 27 2022 at 01:08):

and you need to use IO refs instead of ST refs and do everything in IO

Mario Carneiro (Sep 27 2022 at 01:10):

this is obviously not a pure computation since you are deliberately trying to race threads to get the values out in a random order, which is not allowed for pure functions

Mario Carneiro (Sep 27 2022 at 01:15):

Here's the corrected version:

structure Jobs where
   processed : IO.Ref (List String)

def process (name : String) : ReaderT Jobs IO Unit := do
  let jobs ← read
  jobs.processed.modify (λ s => name :: s)

def processTasks (args: List String) : ReaderT Jobs IO Unit := do
  let jobs ← read
  let tasks ← args.mapM fun s => IO.asTask do
    IO.sleep 1000
    process s |>.run jobs
  for task in tasks do
    ofExcept task.get

def foo (args : List String) : IO (List String) := do
    let processedRef ← IO.mkRef ([])
    let jobs : Jobs := {processed := processedRef}
    processTasks args |>.run jobs
    jobs.processed.get

#eval foo ["apple", "banana", "orange"] --["banana", "orange", "apple"]
                                        --["banana", "apple", "orange"]
                                        --["orange", "banana", "apple"]

Mario Carneiro (Sep 27 2022 at 01:18):

one thing to note about this example: it is important that we "use" the result of the tasks, here by matching on all of them with ofExcept, because if we don't the compiler is still permitted to drop the threads early or never start them

Chris Lovett (Sep 27 2022 at 01:19):

Thanks, I guess this means the processed.modify is actually thread safe, and ST is not about "thread local storage"...

Mario Carneiro (Sep 27 2022 at 01:19):

well note that it's not ST any more

Mario Carneiro (Sep 27 2022 at 01:20):

There isn't an ST version of this example

Mario Carneiro (Sep 27 2022 at 01:21):

because there is no ST.asTask

Mario Carneiro (Sep 27 2022 at 01:22):

but yes, IO refs are thread safe

Mario Carneiro (Sep 27 2022 at 01:22):

(and ST refs as well, since they are currently not distinguished from IO refs, although haskell makes a distinction and maybe lean will in the future)

Mario Carneiro (Sep 27 2022 at 01:24):

Just about everything in lean is thread safe. It's pretty hard to guarantee that some data is not concurrently accessible except the refcount = 1 trick lean uses for destructive update

Chris Lovett (Sep 27 2022 at 01:25):

Very instructional, thanks! we should also write a chapter in the reference manual about Tasks. This chapter is just a bit too cryptic :-). Is there a "parallel.foreach" in lean that will do task creation for each item in the enumeration and gathering of results for me?

Mario Carneiro (Sep 27 2022 at 01:26):

that's the .map Task.get trick from before

Mario Carneiro (Sep 27 2022 at 01:27):

It wouldn't be too hard to make a macro that does it, I think

Chris Lovett (Sep 27 2022 at 01:36):

Yea, I'd want processTasks to collapse down to a single line like Parallel.foreach args (λ _ => process s |>.run jobs) and this would return the gathered results (or throw). I realize the results in this case are not interesting, but assuming we were not playing with IO.ref state in this example and process returned a String, then parallel.foreach would return List String. See the F# version, they call it Async.Parallel...

Do we have a naming convention for "async" methods that return a Task? Like in this version?

structure Jobs where
   processed : IO.Ref (List String)

def asyncProcess (name : String) : ReaderT Jobs IO (Task (Except IO.Error Unit)) := do
  let jobs ← read
  IO.asTask do
    IO.sleep 500
    jobs.processed.modify (λ s => name :: s)

def processTasks (args: List String) : ReaderT Jobs IO Unit := do
  let jobs ← read
  let tasks ← args.mapM fun s => asyncProcess s |>.run jobs
  for task in tasks do
    ofExcept task.get

def foo (args : List String) : IO (List String) := do
    let processedRef ← IO.mkRef []
    let jobs : Jobs := {processed := processedRef}
    processTasks args |>.run jobs
    jobs.processed.get

Mario Carneiro (Sep 27 2022 at 01:41):

Actually, I started looking at this but I don't think it makes sense. What monad would it live in? What would you even be able to do in the body of the loop?

Mario Carneiro (Sep 27 2022 at 01:42):

For loops structurally don't return a value, but if you don't return a value from a task it's completely useless

Mario Carneiro (Sep 27 2022 at 01:43):

Maybe you should show something a little less like a toy example

Mario Carneiro (Sep 27 2022 at 01:57):

def parForIn [ForIn IO σ α] (xs : σ) (f : α → IO PUnit) : IO PUnit := do
  let mut tasks := #[]
  for x in xs do
    tasks := tasks.push (← IO.asTask (f x))
  tasks.forM (ofExcept ·.get)

syntax "parallel " "for " ident " in " termBeforeDo " do " doSeq : doElem
macro_rules
  | `(doElem| parallel for $x in $xs do $seq) => `(doElem| parForIn $xs fun $x => do $seq)

structure Jobs where
   processed : IO.Ref (List String)

def asyncProcess (name : String) : ReaderT Jobs IO (Task (Except IO.Error Unit)) := do
  let jobs ← read
  IO.asTask do
    IO.sleep 500
    jobs.processed.modify (λ s => name :: s)

def processTasks (args: List String) : ReaderT Jobs IO Unit := do
  let jobs ← read
  let tasks ← args.mapM fun s => asyncProcess s |>.run jobs
  for task in tasks do
    ofExcept task.get

def foo (args : List String) : IO (List String) := do
    let processedRef ← IO.mkRef []
    let jobs : Jobs := {processed := processedRef}
    parallel for name in args do
      IO.sleep 500
      jobs.processed.modify (λ s => name :: s)
    jobs.processed.get

#eval foo ["apple", "banana", "orange"] --["banana", "orange", "apple"]

Chris Lovett (Sep 27 2022 at 01:57):

Perhaps I don't understand your comment, but I found the parallel task library in .NET to be extremely useful, and powerful, since you can customize every aspect of task scheduling, prioritization and so on. And it works really well with LINQ so you each task can pull async data from SQL, or HTTP or files, and transform the data, which you can then gather up and send along to a client. Even the Lean LSP could use this and process all dependent files in a project when we add proper LSP project support. But if lean Monads make it hard to type the result of a parallel operation over monadic functions then I guess this is a big strike against Monads, so it would be good to figure out how to solve that...

Mario Carneiro (Sep 27 2022 at 01:59):

You need to learn to think more functionally about things. Most of these things are approached from a completely different angle. It's not that you can't do them, but rather putting them in imperative clothes isn't always a useful thing to do

Mario Carneiro (Sep 27 2022 at 01:59):

Lake is highly parallel and also highly functional, and the server is too

Chris Lovett (Sep 27 2022 at 02:00):

Ok, but I need to utilize all 24 cores on my CPU, how does Lean help me do that? Are you saying the right answer is IO.Process.spawn (like lake), and not Task.spawn?

Mario Carneiro (Sep 27 2022 at 02:01):

Does IO.asTask not work?

Mario Carneiro (Sep 27 2022 at 02:01):

Or Task.spawn for pure functions

Mario Carneiro (Sep 27 2022 at 02:02):

It's true that lean's support for threading primitives is pretty bare (see lean4#1280), but for this particular task Task seems to be the right tool for the job

Chris Lovett (Sep 27 2022 at 02:03):

It is the minimum viable product yes. Just doesn't have all the bells and whistles I'm used to having.

Mario Carneiro (Sep 27 2022 at 02:04):

sure, but you haven't even asked for anything that needs the bells and whistles

Chris Lovett (Sep 27 2022 at 02:04):

When I saw AsyncList in the Lean LSP Server I'm like why isn't this easier to do and generally available outside the Server implementation?

Mario Carneiro (Sep 27 2022 at 02:05):

I'm a believer in doing something 3 times before making a premature abstraction for it

Mario Carneiro (Sep 27 2022 at 02:05):

every application has slightly different requirements, and you shouldn't make the library version until you have a reasonable grasp on what those requirements are

Chris Lovett (Sep 27 2022 at 02:06):

I like where https://github.com/leanprover/lean4/issues/1280 is going, I am also a huge fan of channels as a way to coordinate across tasks - I think a channel would make the FileWorker management MUCH easier.

Mario Carneiro (Sep 27 2022 at 02:07):

For instance, AsyncList corresponds to a lazy list, in which the tail of the list can be delayed and in a thunk. That's different from your application above, where you want all the results to be run in parallel

Mario Carneiro (Sep 27 2022 at 02:08):

if AsyncList was the library abstraction, it wouldn't be appropriate for your use case

Mario Carneiro (Sep 27 2022 at 02:08):

just because it's got Async and List in the name doesn't mean that tells the whole story

Chris Lovett (Sep 27 2022 at 02:08):

I agree it is very hard to design a really great framework. Often times frameworks just add more layers of "inefficiency" that add up to big performance problems. But Lean has some new things to offer in that regard with metaprogramming, one could think about a self assembling framework that provides only what a given user needs, and optimized for that usage scenario. Could be very interesting.

Mario Carneiro (Sep 27 2022 at 02:09):

Okay, but again you have to start with the task at hand

Mario Carneiro (Sep 27 2022 at 02:09):

and work backward to the best way to support that

Mario Carneiro (Sep 27 2022 at 02:10):

I don't think we have a very clear idea at all of what a library abstraction in this space would actually be, besides what Task already is

Mario Carneiro (Sep 27 2022 at 02:10):

(Since you mentioned it: you can set thread priorities in Task primitives BTW.)

Chris Lovett (Sep 27 2022 at 02:11):

There's a many very well designed parallel libraries to learn from including C++, .NET, Java, Rust, so I don't think you need to start from first principles and ask each user "what are you trying to do". If you approach it that way you won't have something great for another 10 years.

Mario Carneiro (Sep 27 2022 at 02:11):

also LINQ is just functional style programming for C#

Mario Carneiro (Sep 27 2022 at 02:13):

No you really do have to design it afresh. Most imperative designs don't work at all in functional languages, or need to be rephrased such that they would hardly be recognizable by the end

Mario Carneiro (Sep 27 2022 at 02:13):

You just named a bunch of imperative languages

Chris Lovett (Sep 27 2022 at 02:14):

Yes which is why I love it so much :-) I was one the incubation team that developed X# which lead to c-omega which was the precursor to LINQ. We showed the C# team how to do it :-)

Chris Lovett (Sep 27 2022 at 02:14):

Well F# seems to have bolted onto .NET parallel library somehow... but I've never used it so I don't know how good it is.

Mario Carneiro (Sep 27 2022 at 02:14):

anyway this doesn't seem to be an actually actionable suggestion. Do you want to suggest something more concrete?

Mario Carneiro (Sep 27 2022 at 02:15):

Or better yet, #xy the problem since the tools to do what you want are already available, once you figure them out

Chris Lovett (Sep 27 2022 at 02:27):

Nope just a question for now, and I like your "parallel for" macro. Thanks for all your help. Getting back to the question about ST I'll assume it means single threaded, and that when we used it in the parallel test, the modify method worked reliably because it was protecting the update to the ST.Ref List String ensuring that each update was serialized...

Chris Lovett (Sep 27 2022 at 02:45):

Here's a better test and it passes as expected:

structure Jobs where
   processed : IO.Ref (List Nat)

def asyncProcess (name : Nat) : ReaderT Jobs IO (Task (Except IO.Error Unit)) := do
  let jobs ← read
  IO.asTask do
    IO.sleep 1
    jobs.processed.modify (λ s => name :: s)

def processTasks (args: List Nat) : ReaderT Jobs IO Unit := do
  let jobs ← read
  let tasks ← args.mapM fun s => asyncProcess s |>.run jobs
  for task in tasks do
    ofExcept task.get

def foo (args : List Nat) : IO (List Nat) := do
  let processedRef ← IO.mkRef []
  let jobs : Jobs := {processed := processedRef}
  processTasks args |>.run jobs
  jobs.processed.get

def test : IO Unit := do
  let max := 10000
  let x ← foo (List.range max)

  let a := x.toArray.qsort (·<·)

  for expected in List.range max do
    let actual := a[expected]!
    if actual ≠ expected then
      IO.println s!"unexpected {actual} instead of {expected}"

  IO.println "ok"

#eval test

Chris Lovett (Sep 27 2022 at 03:01):

But getting back to my original question, how is this IO.Ref any better than using StateT like this? I assume it is simply that we don't have to create a new structure Jobs container each time we update the state? So is that all it is really for then, this minor efficiency improvement? If the StateT was on List Nat then even that difference goes away? But perhaps if we have a sparse collection of randomly updatable objects then an ST.Ref for each one of those would be a win...? Am I missing something?

structure Jobs where
   processed : List Nat

def asyncProcess (i : Nat) : StateT Jobs IO (Task (Except IO.Error Unit)) := do
  modify (λ s => { processed := i :: s.processed })
  IO.asTask do
    IO.sleep 1

def processTasks (args: List Nat) : StateT Jobs IO Unit := do
  let tasks ← args.mapM fun s => asyncProcess s
  for task in tasks do
    ofExcept task.get

def foo (args : List Nat) : IO (List Nat) := do
  let jobs : Jobs := {processed := []}
  let (_, s) ← processTasks args |>.run jobs
  return s.processed

def test : IO Unit := do
  let max := 10000
  let x ← foo (List.range max)

  let a := x.toArray.qsort (·<·)

  if a.size ≠ max then
    IO.println s!"Returned wrong length {a.size} instead of {max}"
  else
    for expected in List.range max do
      let actual := a[expected]!
      if actual ≠ expected then
        IO.println s!"unexpected {actual} instead of {expected}"

    IO.println "ok"

#eval timeit "running: " test  -- ok, running:  503ms

StateT does this in 504ms, and the previous ST.Ref version did it in 509ms, so essentially the same... probably a huge thread start/join overhead in this case...

Chris Lovett (Sep 27 2022 at 03:15):

Ah, but I can't move the modify (λ s => { processed := i :: s.processed }) inside the IO.asTask so perhaps THAT is the reason for ST.Ref...

Mario Carneiro (Sep 27 2022 at 03:24):

Regarding your test, here's a simplification which does a lot of contentious writes to a shared variable, where the modification function is slow.

#eval show IO _ from do
  let var ← IO.mkRef 0
  let tasks ← (List.range 50).mapM fun _ => IO.asTask do
    for _ in [0:50] do
      let f i := dbgSleep 1 (fun _ => i+1)
      var.modify f
  tasks.forM (ofExcept ·.get)
  var.get

#eval show IO _ from do
  let var ← IO.mkRef 0
  let tasks ← (List.range 50).mapM fun _ => IO.asTask do
    for _ in [0:50] do
      let f i := dbgSleep 1 (fun _ => i+1)
      var.set (f (← var.get))
  tasks.forM (ofExcept ·.get)
  var.get

The first test is fairly slow and also gets the right answer (50*50 = 2500), while the second test is fast and also wrong (I get almost exactly 400 = 50*8 which again reveals the concurrency). This is demonstrating something that probably should have been in the docs: Ref.modify atomically performs the modification by taking a lock on the ref, so all the writes are serialized. In the second example we use get and set separately and lots of increments are lost due to concurrent modification.

Chris Lovett (Sep 27 2022 at 03:29):

yes separate get and set is rife with problems. One needs an atomic update, sounds like modify is exactly that. Note that in Simon's paper they say that "ST" stands for "State Transformer".

Mario Carneiro (Sep 27 2022 at 03:31):

Chris Lovett said:

But getting back to my original question, how is this IO.Ref any better than using StateT like this? I assume it is simply that we don't have to create a new structure Jobs container each time we update the state? So is that all it is really for then, this minor efficiency improvement? If the StateT was on List Nat then even that difference goes away? But perhaps if we have a sparse collection of randomly updatable objects then an ST.Ref for each one of those would be a win...? Am I missing something?

The differences are washed out even more than you might expect because lean performs destructive updates when you use a variable linearly (that is, you don't keep multiple references to the state, which could require persisting old values). So in many tests like this you will see almost zero difference between using mutation and using StateT which does functional-but-in-place mutation of a value.

However, the FBIP optimization is brittle and also comes with some overheads of its own, depending on how you write it. For example StateM s a is a wrapper for a function s -> a × s, which means that every call to a StateM function has to allocate a pair and then destruct it. The compiler can often optimize these away but it's not really zero cost. With "true mutation" you don't have to put the s in an extra box (although there is a very similar overhead to this in using IO because of the error checking, although there is work being done on the compiler to make IO zero cost in more cases).

It's not a clear win in either direction. Choose the form that fits your design the best.

Mario Carneiro (Sep 27 2022 at 03:32):

Chris Lovett said:

yes separate get and set is rife with problems. One needs an atomic update, sounds like modify is exactly that. Note that in Simon's paper they say that "ST" stands for "State Transformer".

yikes, that makes it a bit of a nightmare to disambiguate ST and StateT

Chris Lovett (Sep 27 2022 at 03:35):

Yeah, a bit of a collision in terminology with "StateT" being a "Monad Transfomer with updatatable State" and "ST" being an "State Transformer Monad"...

Chris Lovett (Sep 27 2022 at 04:13):

And yes, I did notice that ExceptM adds an Except wrapper even in the Except.ok code path, which seems not so great to me. Most other runtimes work very hard to ensure you only pay for exception handling except in the exceptional cases where someone actually calls throw. So yes, it would be very cool if we could optimize that way somehow.

Mario Carneiro (Sep 27 2022 at 04:49):

Lean's Except is more similar to Rust's Result type or C error return values than traditional exception handling in the style of C++ / C# / Java which use stack unwinding and runtime type information (RTTI). That strategy has very low cost on the happy path but is super expensive if you ever use it, and many C/C++ codebases avoid it entirely.

Mario Carneiro (Sep 27 2022 at 04:53):

That kind of side channel is also a really poor fit for functional languages generally. The nearest equivalent is the CPS monad, where you pass an error continuation to be called in the event of an exception. It works pretty well but it also incurs the cost of a closure allocation in the common case.

Chris Lovett (Sep 27 2022 at 05:33):

Totally agree stack unwinding is a horrible thing, even in languages that have implemented it really well. So yeah, no free lunch without it. I guess so long as Except is on the stack instead of on the heap that would help a lot.

Chris Lovett (Sep 27 2022 at 05:33):

So I put a doc together, https://github.com/leanprover/lean4/pull/1653, waddaya think?

Mario Carneiro (Sep 27 2022 at 17:11):

@Chris Lovett Added some review comments. There is another thing about ST that I did not mention, but is a pretty important difference from StateT: The ST monad allows you to create references of arbitrary types, you don't have to determine all the state you are going to carry in advance at the call to StateT.run. You can see this in the type of ST: although it has this mysterious ω parameter, what it doesn't have is the state type like StateT would.

Mario Carneiro (Sep 27 2022 at 17:11):

def foo (ref : ST.Ref ω Nat) (bar : ST ω Unit) : ST ω (Nat × Nat) := do
  let x ← ref.get
  bar
  return (x, ← ref.get)

#eval runST fun ω => do
  let x : ST.Ref ω Nat ← ST.mkRef 0
  let y : ST.Ref ω (List Nat) ← ST.mkRef []
  let z ← foo x do
    x.modify (·+1)
    y.modify (3::·)
  return (← x.get, ← y.get, z) -- (1, [3], 0, 1)

Mario Carneiro (Sep 27 2022 at 17:14):

Here's an example that highlights some of this. The foo function reads a Nat reference, then runs an arbitrary function bar, then reads the reference again. It has no idea how much state is actually in the monad, whether there are any other references and what types they are. In the main function, we call foo where there are two references, one of type Nat and one of type List Nat, and modify both of them in bar, so foo can see the state change to x but the change to y also happened. You can't do this sort of thing using StateT without a lot of contortion.

Chris Lovett (Sep 27 2022 at 20:55):

Interesting, looking at your doc feedback, where you want to simplify everything and remove all mention of ReaderM and so on seems like it no longer fits in the monads chapter. So I'm wondering if we need a separate page someplace that talks about "references" independently from monads (I think your simpler example that has nothing to do with monads really, except that it is running in an IO monad). Then perhaps my mentioning of ST.Ref in the Monads chapter could happen in ReaderM just as a side note, and we could put another mention in the documentation on let mut because an IO.Ref also gets around the limitations of let mut in a multithreading app. But the above information then makes me think we do still need an ST monad specific page in the monads chapter that contains this kind of info... more on "ST" properties so it is not just about "ST.Ref"?

Mario Carneiro (Sep 27 2022 at 20:58):

I would start by presenting the type signatures of the ST.Ref functions and runST, then explain runST a bit because it's weird, then give a motivating example and contrast with StateT. It's certainly most relevant to the monads chapter because it's a specific tool in monadic programming

Mario Carneiro (Sep 27 2022 at 21:00):

It should be put in a place where we can have some reasonable expectation that the reader can understand how monad type signatures work and what do notation does at least at a basic level, meaning it has to come in or after the monads chapter

Mario Carneiro (Sep 27 2022 at 21:00):

it's a more advanced topic to be sure though

Chris Lovett (Sep 27 2022 at 23:11):

Ok, I think I will move the "ST" page so it comes after "Transformers" then I can also show StateRefT, and call it more of an advanced topic and I will start by presenting ST, ST.Ref and runST in a simpler example and I will rename it "ST References" since ST and ST.Ref are inseparable.

But I have a quick question, when I look at https://github.com/leanprover/lean4/blob/master/src/Lean/Meta/Tactic/Cleanup.lean it uses StateRefT in a bunch of local functions and if I change all those to StateT it all compiles and works? So why is StateRefT used here instead of StateT ?

Mario Carneiro (Sep 27 2022 at 23:18):

StateRefT is basically a wrapper around ST made to give it the interface of StateT

Mario Carneiro (Sep 27 2022 at 23:19):

That is, it is deliberately designed to be a drop in replacement for StateT

Chris Lovett (Sep 27 2022 at 23:22):

Right, which explains why I could drop in StateT and everything worked, but my question is why did the author choose StateRefT in this case instead of StateT ? Is there some functional difference that is important in this case?

Mario Carneiro (Sep 27 2022 at 23:22):

Well you should check the benchmark numbers for that

Mario Carneiro (Sep 27 2022 at 23:23):

There is no functional difference, but there could be a performance difference

Chris Lovett (Sep 27 2022 at 23:23):

Got it, is there a benchmark test for Lean.Meta already? Where do I find it and learn how to run it?

Mario Carneiro (Sep 27 2022 at 23:27):

http://speedcenter.informatik.kit.edu/velcom/home

Chris Lovett (Sep 27 2022 at 23:41):

Thanks, I just found the connection in ~/tests/bench/...

Chris Lovett (Sep 27 2022 at 23:59):

Any ideas on this error related to Lean.Data.HashMap ?
image.png

Chris Lovett (Sep 27 2022 at 23:59):

my branch is up to date with lean4 master...

Mario Carneiro (Sep 28 2022 at 00:01):

it looks like you are using nightly lean in the error message

Mario Carneiro (Sep 28 2022 at 00:06):

try elan override set lean4 in the directory, assuming you previously set elan toolchain link lean4 <path/to/stage1>

Chris Lovett (Sep 28 2022 at 00:11):

I see, yes elan override set ... solved the problem. That's not mentioned in the readme.

Chris Lovett (Sep 28 2022 at 00:12):

looks like another missing dependency also not mentioned in the readme (scipy is also missing):

Benchmark 1 out of 10 to 10 [------------------------------------] 0%[17:11:44] Program block no. 0 failed: The perf tool needed for the perf stat runner isn't installed. You can install it via the linux-tools (or so) package of your distribution. If it's installed, you might by only allowed to use it with super user rights. Test a simple command like perf stat /bin/echo to see what you have to do if you want to use with your current rights.

Doyou know the magic install command line for this?

Chris Lovett (Sep 28 2022 at 00:14):

Ubuntu says to try sudo apt install linux-tools-common so I'll try it...

Mario Carneiro (Sep 28 2022 at 00:15):

I have never used the benchmarking tool

Chris Lovett (Sep 28 2022 at 00:36):

I see, well to conclude the question then I just discovered that the linux perf tool is not supported in WSL. So that's my problem.

Last updated: May 02 2025 at 03:31 UTC