Zulip Chat Archive

Stream: new members

Topic: external iteration and the State monad


JJ (Aug 23 2024 at 19:36):

i'm looking to implement a parser in a recursive-descent, external-iteration style: by calling .next? on a List Byte (ideally a ByteArray, but i need to learn how to implement Stream for it) until i hit an error or the stream ends. this is the style of iteration i'm used to in rust - but it is complicated a little bit by lean being purely functional, as .next? is no longer of type &mut self → Option<T>, but is instead of type Stream T → Option (T × Stream T) (or List T). as such, .next? does not mutate the iterator in-place, and i have to manually plumb about the state in the types.

i was told that the State monad provides for something nicer. how do i go about using this? specifically, how would i go about converting the following #mwe to use the State monad, and can i get rid of the bind-then-update-data pattern when using .next??

import Batteries

abbrev UInt1 := BitVec 1
abbrev UInt2 := BitVec 2
abbrev UInt4 := BitVec 4
abbrev UInt6 := BitVec 6
abbrev Byte := UInt8
abbrev Short := UInt16
abbrev Long := UInt32

inductive Chunk where
  | rgb (r g b : Byte)
  | rgba (r g b a : Byte)
  | index (index : UInt6)
  | diff (dr dg db : UInt2)
  | luma (dg : UInt6) (drdg dbdg : UInt4)
  | run (run : UInt6)

def decode_chunk (data : List Byte) (byte : Byte) : Option (Chunk × List Byte) := do
  match byte with
    | 0b11111110 =>
      let (r, data)  data.next?
      let (g, data)  data.next?
      let (b, data)  data.next?
      (Chunk.rgb r g b, data)
    | 0b11111111 =>
      let (r, data)  data.next?
      let (g, data)  data.next?
      let (b, data)  data.next?
      let (a, data)  data.next?
      (Chunk.rgba r g b a, data)
    | _ => match (byte >>> 6) with
      | 0b00 =>
        (Chunk.index <| BitVec.ofNat 6 <| UInt8.toNat (byte &&& 0b00111111), data)
      | 0b01 =>
        let dr := BitVec.ofNat 2 <| UInt8.toNat (byte &&& 0b00110000) >>> 4
        let dg := BitVec.ofNat 2 <| UInt8.toNat (byte &&& 0b00001100) >>> 2
        let db := BitVec.ofNat 2 <| UInt8.toNat (byte &&& 0b00000011)
        (Chunk.diff dr dg db, data)
      | 0b10 =>
        let dg := BitVec.ofNat 6 <| UInt8.toNat (byte &&& 0b00111111)
        let (byte, data)  data.next?
        let drdg := BitVec.ofNat 4 <| UInt8.toNat (byte &&& 0b11110000) >>> 4
        let dbdg := BitVec.ofNat 4 <| UInt8.toNat (byte &&& 0b00001111)
        (Chunk.luma dg drdg dbdg, data)
      | 0b11 =>
        (Chunk.run $ BitVec.ofNat 6 <| UInt8.toNat (byte &&& 0b00111111), data)
      | _ => failure

def decode (data : List Byte) : Option (List Chunk) := do
  let mut data := data
  let mut chunks := []
  repeat
    let (byte, data')  data.next?
    data  data'
    let (chunk, data')  decode_chunk data byte
    data  data'
    chunks  chunk :: chunks
  chunks

i'd greatly appreciate either direct advice or links to documentation... i have read through the monads examples in #fpil, but they did not click for me.

i'd also appreciate general advice about the code quality. some of the BitVec conversions i am doing are terribly ugly and i suspect there's a nicer way to go about them. not to mention the weird bind-then-update-data pattern

JJ (Aug 23 2024 at 19:36):

tangentially - i have the import Batteries line only because it provides for using List.next? as a dot method. this is probably overkill - all i want is to have .next? in my namespace. is there an easier way to do that? i have been encountering this kind of frequently and it would be cool if i could just do open Stream.next? (for example) or something.


Last updated: May 02 2025 at 03:31 UTC