Zulip Chat Archive

Stream: new members

Topic: how to parse Floats or even Sci notation with Parsec?


Christian Tzurcanu (Aug 12 2024 at 12:30):

I have tried several ways that Claude and ChatGPT recommend and none worked.

Probably the solution is to look for a '.' character and then go back and parse everything that starts from the last whitespace. But I was unable to attain that...

my current MVE:

import Lean
import Lean.Parser


inductive Types : Type u
  | strVal (v : String)
  | intVal (v : Int)
  | floatVal (v : Float)
  deriving Repr


open Lean Lean.Parsec

def whitespace : Parsec Unit :=
  discard <| many (satisfy Char.isWhitespace)

def digits1 : Parsec (Array Char) :=
  many1 digit

def parseInt : Parsec Int :=
  digits1 >>= fun ds => pure (String.toInt! (String.mk ds.toList))

def parseIntVal : Parsec Types :=
  whitespace *>
  let sign := (pchar '-' *> pure (-1)) <|> (pchar '+' *> pure 1) <|> pure 1
  -- whitespace *>
  sign >>= fun s =>
    parseInt >>= fun n =>
      pure (Types.intVal (s * n))




-- problem with Float starts here

-- Utility function to parse digits
def digits : Parsec String :=
  many1Chars digit

def stringToFloat (s : String) : Option Float :=
  let parts := s.split (· == '.')
  match parts with
  | [intPart, fracPart] =>
    let intValue := intPart.toInt!
    let fracValue := fracPart.toNat!
    let fracDigits := fracPart.length -- (Int.ofNat fracPart.length)
    some (Float.ofScientific intValue.toNat (fracValue > 0) fracDigits)
  | _ => none

def parseFloatVal : Parsec Types := do
  let intPart  many1Chars digit
  let _  pchar '.'
  let fracPart  many1Chars digit
  let floatStr := intPart ++ "." ++ fracPart
  match stringToFloat floatStr with
  | some f => pure (Types.floatVal f)
  | none => fail s!"Invalid float: {floatStr}"



-- problem with Float ends here




-- Parser for strVal
def parseStrVal : Parsec Types := do
  -- whitespace
  let _  pchar '"'
  let str ← many1Chars (satisfy (λ c => c ≠ '"'))
  let _  pchar '"'
  return Types.strVal str


partial def parseTypes : Parsec Types :=
    parseStrVal <|> parseFloatVal <|> parseIntVal

def parse (input : String) : Except String Types :=
  match parseTypes input.iter with
  | Lean.Parsec.ParseResult.success _ res => Except.ok res
  | Lean.Parsec.ParseResult.error _ err => Except.error err

def exceptToTypes (e : Except String Types) (default : Types) : Types :=
  match e with
  | Except.ok t => t
  | Except.error _ => default


def testParser (input : String) : IO Unit := do
  IO.println s!"Input: \"{input}\""
  match parse input with
  | Except.ok result =>
    IO.println s!"Parsed successfully. {repr result}"
  | Except.error err =>
    IO.println s!"Parsing failed: {err}"


#eval testParser "-123"

def res := testParser "-1.23"
#eval res

the second eval should have produced a Float not an Int.

Christian Tzurcanu (Aug 12 2024 at 12:37):

I also wonder why there is no Parsec combinator widely known for Floats. It seems something of wide interest.. Not that I would know how to find the available combinators, but...

Edward van de Meent (Aug 12 2024 at 13:09):

there is a parser for scientific notation called Lean.Parser.scientificLit...

Edward van de Meent (Aug 12 2024 at 13:13):

although i should point out that that doesn't parse the - part

Christian Tzurcanu (Aug 12 2024 at 13:14):

Edward van de Meent said:

although i should point out that that doesn't parse the - part

even so. how do you use it in this context?

Edward van de Meent (Aug 12 2024 at 13:19):

good question... i guess there's a distinguished difference between parsing lean syntax versus parsing String...

Christian Tzurcanu (Aug 12 2024 at 13:21):

Edward van de Meent said:

there is a parser for scientific notation called Lean.Parser.scientificLit...

I have tried several questions to ChatGPT around scientificLit and they have not yielded a solution for this context.. yet

Christian Tzurcanu (Aug 15 2024 at 05:51):

something like this was a solution for my case:

inductive Types : Type u
  | strVal (v : String)
  | intVal (v : Int)
  | floatVal (v : Float)
...

open Lean Lean.Parsec

def intToFloat (n : Int) : Float :=
  if n >= 0 then
    Float.ofNat n.toNat
  else
    -Float.ofNat (-n).toNat

def float_parsing (intPart fracPart expPart : String) (sign : Option Char) : Option Float :=
  let fullnum := (intPart ++ fracPart).toNat!
  let plc := fracPart.length
  let exponent := expPart.toInt!
  let floatVal := Float.ofScientific fullnum true plc
  let adjustedFloat := floatVal * Float.pow 10.0 (intToFloat exponent)
  match sign with
  | some '-' => some (-adjustedFloat)
  | _        => some adjustedFloat

def parseFloatOrInt : Parsec Types := do
  let sign  optional (pchar '+' <|> pchar '-')
  let intPart  many1Chars digit
  optional (pchar '.') >>= fun
  | some _ => do
    let fracPart  many1Chars digit
    optional (pchar 'e' <|> pchar 'E') >>= fun
    | some _ => do
      let expPart  manyChars (pchar '+' <|> pchar '-' <|> digit)
      let floatStr := intPart ++ "." ++ fracPart ++ "e" ++ expPart
      match float_parsing intPart fracPart expPart sign with
      | some f => pure (Types.floatVal f)
      | none   => fail s!"Invalid float: {floatStr}"
    | none => do
      let floatStr := intPart ++ "." ++ fracPart
      match float_parsing intPart fracPart "0" sign with
      | some f => pure (Types.floatVal f)
      | none   => fail s!"Invalid float: {floatStr}"
  | none => do
    let intVal := intPart.toInt!
    pure (Types.intVal (match sign with | some '-' => -intVal | _ => intVal))

def parseTypes : Parsec Types :=
    parseStrVal <|> parseFloatOrInt ...

Damiano Testa (Aug 15 2024 at 11:43):

Is this helpful?

import Lean.Elab.Command

open Lean in
run_cmd
  let float  `(Parser.scientificLit| 1.4e-24)
  logInfo float

Edward van de Meent (Aug 15 2024 at 11:44):

i don't think so; as mentioned earlier there is a difference between parsing a string and parsing embedded syntax

Edward van de Meent (Aug 15 2024 at 11:44):

i.e. between Parser and Parsec

Christian Tzurcanu (Aug 15 2024 at 11:45):

Damiano Testa said:

Is this helpful?

import Lean.Elab.Command

open Lean in
run_cmd
  let float  `(Parser.scientificLit| 1.4e-24)
  logInfo float

it would probably be helpful if I wanted to have the type explicitly stated. but I want both possibilities. and the harder problem is when it is not..

Damiano Testa (Aug 15 2024 at 11:48):

Is this better?

open Lean in
run_cmd
  let str : String := "1.4e-24"
  let stx := mkNode `scientific #[mkAtom str]
  let float  `(Parser.scientificLit| $stx)
  logInfo float

Christian Tzurcanu (Aug 15 2024 at 11:49):

Damiano Testa said:

Is this better?

open Lean in
run_cmd
  let str : String := "1.4e-24"
  let stx := mkNode `scientific #[mkAtom str]
  let float  `(Parser.scientificLit| $stx)
  logInfo float

as stated: I need to parse String. and for now the problem is fixed. i would rather dance carefully around it than try other solutions. but thank you for the effort: it may be useful for somebody else

Damiano Testa (Aug 15 2024 at 11:50):

I think that I am not understanding something: in the final code, the input is a string.

Christian Tzurcanu (Aug 15 2024 at 11:53):

Damiano Testa said:

I think that I am not understanding something: in the final code, the input is a string.

I meant also that now i depend on Parsec for all the other parts. And i am new with Lean: I would not know how to integrate what you provided

Damiano Testa (Aug 15 2024 at 11:54):

Ah, I see! The constrain that it was supposed to depend on other code was what I was missing!

Christian Tzurcanu (Aug 15 2024 at 11:54):

Damiano Testa said:

Ah, I see! The constrain that it was supposed to depend on other code was what I was missing!

a lot of code: implementing Lisp in Lean


Last updated: May 02 2025 at 03:31 UTC