Zulip Chat Archive
Stream: new members
Topic: how to parse Floats or even Sci notation with Parsec?
Christian Tzurcanu (Aug 12 2024 at 12:30):
I have tried several ways that Claude and ChatGPT recommend and none worked.
Probably the solution is to look for a '.' character and then go back and parse everything that starts from the last whitespace. But I was unable to attain that...
my current MVE:
import Lean
import Lean.Parser
inductive Types : Type u
| strVal (v : String)
| intVal (v : Int)
| floatVal (v : Float)
deriving Repr
open Lean Lean.Parsec
def whitespace : Parsec Unit :=
discard <| many (satisfy Char.isWhitespace)
def digits1 : Parsec (Array Char) :=
many1 digit
def parseInt : Parsec Int :=
digits1 >>= fun ds => pure (String.toInt! (String.mk ds.toList))
def parseIntVal : Parsec Types :=
whitespace *>
let sign := (pchar '-' *> pure (-1)) <|> (pchar '+' *> pure 1) <|> pure 1
-- whitespace *>
sign >>= fun s =>
parseInt >>= fun n =>
pure (Types.intVal (s * n))
-- problem with Float starts here
-- Utility function to parse digits
def digits : Parsec String :=
many1Chars digit
def stringToFloat (s : String) : Option Float :=
let parts := s.split (· == '.')
match parts with
| [intPart, fracPart] =>
let intValue := intPart.toInt!
let fracValue := fracPart.toNat!
let fracDigits := fracPart.length -- (Int.ofNat fracPart.length)
some (Float.ofScientific intValue.toNat (fracValue > 0) fracDigits)
| _ => none
def parseFloatVal : Parsec Types := do
let intPart ← many1Chars digit
let _ ← pchar '.'
let fracPart ← many1Chars digit
let floatStr := intPart ++ "." ++ fracPart
match stringToFloat floatStr with
| some f => pure (Types.floatVal f)
| none => fail s!"Invalid float: {floatStr}"
-- problem with Float ends here
-- Parser for strVal
def parseStrVal : Parsec Types := do
-- whitespace
let _ ← pchar '"'
let str ← many1Chars (satisfy (λ c => c ≠ '"'))
let _ ← pchar '"'
return Types.strVal str
partial def parseTypes : Parsec Types :=
parseStrVal <|> parseFloatVal <|> parseIntVal
def parse (input : String) : Except String Types :=
match parseTypes input.iter with
| Lean.Parsec.ParseResult.success _ res => Except.ok res
| Lean.Parsec.ParseResult.error _ err => Except.error err
def exceptToTypes (e : Except String Types) (default : Types) : Types :=
match e with
| Except.ok t => t
| Except.error _ => default
def testParser (input : String) : IO Unit := do
IO.println s!"Input: \"{input}\""
match parse input with
| Except.ok result =>
IO.println s!"Parsed successfully. {repr result}"
| Except.error err =>
IO.println s!"Parsing failed: {err}"
#eval testParser "-123"
def res := testParser "-1.23"
#eval res
the second eval should have produced a Float not an Int.
Christian Tzurcanu (Aug 12 2024 at 12:37):
I also wonder why there is no Parsec combinator widely known for Floats. It seems something of wide interest.. Not that I would know how to find the available combinators, but...
Edward van de Meent (Aug 12 2024 at 13:09):
there is a parser for scientific notation called Lean.Parser.scientificLit
...
Edward van de Meent (Aug 12 2024 at 13:13):
although i should point out that that doesn't parse the -
part
Christian Tzurcanu (Aug 12 2024 at 13:14):
Edward van de Meent said:
although i should point out that that doesn't parse the
-
part
even so. how do you use it in this context?
Edward van de Meent (Aug 12 2024 at 13:19):
good question... i guess there's a distinguished difference between parsing lean syntax
versus parsing String
...
Christian Tzurcanu (Aug 12 2024 at 13:21):
Edward van de Meent said:
there is a parser for scientific notation called
Lean.Parser.scientificLit
...
I have tried several questions to ChatGPT around scientificLit and they have not yielded a solution for this context.. yet
Christian Tzurcanu (Aug 15 2024 at 05:51):
something like this was a solution for my case:
inductive Types : Type u
| strVal (v : String)
| intVal (v : Int)
| floatVal (v : Float)
...
open Lean Lean.Parsec
def intToFloat (n : Int) : Float :=
if n >= 0 then
Float.ofNat n.toNat
else
-Float.ofNat (-n).toNat
def float_parsing (intPart fracPart expPart : String) (sign : Option Char) : Option Float :=
let fullnum := (intPart ++ fracPart).toNat!
let plc := fracPart.length
let exponent := expPart.toInt!
let floatVal := Float.ofScientific fullnum true plc
let adjustedFloat := floatVal * Float.pow 10.0 (intToFloat exponent)
match sign with
| some '-' => some (-adjustedFloat)
| _ => some adjustedFloat
def parseFloatOrInt : Parsec Types := do
let sign ← optional (pchar '+' <|> pchar '-')
let intPart ← many1Chars digit
optional (pchar '.') >>= fun
| some _ => do
let fracPart ← many1Chars digit
optional (pchar 'e' <|> pchar 'E') >>= fun
| some _ => do
let expPart ← manyChars (pchar '+' <|> pchar '-' <|> digit)
let floatStr := intPart ++ "." ++ fracPart ++ "e" ++ expPart
match float_parsing intPart fracPart expPart sign with
| some f => pure (Types.floatVal f)
| none => fail s!"Invalid float: {floatStr}"
| none => do
let floatStr := intPart ++ "." ++ fracPart
match float_parsing intPart fracPart "0" sign with
| some f => pure (Types.floatVal f)
| none => fail s!"Invalid float: {floatStr}"
| none => do
let intVal := intPart.toInt!
pure (Types.intVal (match sign with | some '-' => -intVal | _ => intVal))
def parseTypes : Parsec Types :=
parseStrVal <|> parseFloatOrInt ...
Damiano Testa (Aug 15 2024 at 11:43):
Is this helpful?
import Lean.Elab.Command
open Lean in
run_cmd
let float ← `(Parser.scientificLit| 1.4e-24)
logInfo float
Edward van de Meent (Aug 15 2024 at 11:44):
i don't think so; as mentioned earlier there is a difference between parsing a string and parsing embedded syntax
Edward van de Meent (Aug 15 2024 at 11:44):
i.e. between Parser
and Parsec
Christian Tzurcanu (Aug 15 2024 at 11:45):
Damiano Testa said:
Is this helpful?
import Lean.Elab.Command open Lean in run_cmd let float ← `(Parser.scientificLit| 1.4e-24) logInfo float
it would probably be helpful if I wanted to have the type explicitly stated. but I want both possibilities. and the harder problem is when it is not..
Damiano Testa (Aug 15 2024 at 11:48):
Is this better?
open Lean in
run_cmd
let str : String := "1.4e-24"
let stx := mkNode `scientific #[mkAtom str]
let float ← `(Parser.scientificLit| $stx)
logInfo float
Christian Tzurcanu (Aug 15 2024 at 11:49):
Damiano Testa said:
Is this better?
open Lean in run_cmd let str : String := "1.4e-24" let stx := mkNode `scientific #[mkAtom str] let float ← `(Parser.scientificLit| $stx) logInfo float
as stated: I need to parse String. and for now the problem is fixed. i would rather dance carefully around it than try other solutions. but thank you for the effort: it may be useful for somebody else
Damiano Testa (Aug 15 2024 at 11:50):
I think that I am not understanding something: in the final code, the input is a string.
Christian Tzurcanu (Aug 15 2024 at 11:53):
Damiano Testa said:
I think that I am not understanding something: in the final code, the input is a string.
I meant also that now i depend on Parsec for all the other parts. And i am new with Lean: I would not know how to integrate what you provided
Damiano Testa (Aug 15 2024 at 11:54):
Ah, I see! The constrain that it was supposed to depend on other code was what I was missing!
Christian Tzurcanu (Aug 15 2024 at 11:54):
Damiano Testa said:
Ah, I see! The constrain that it was supposed to depend on other code was what I was missing!
a lot of code: implementing Lisp in Lean
Last updated: May 02 2025 at 03:31 UTC