UTF-8 decoding #
This file contains the low-level proof that UTF-8 decoding and encoding are inverses.
An important corollary is that attempting to decode a byte array that is valid UTF-8 will
succeed, which is required for the definition of String.toList.
utf8EncodeChar #
utf8EncodeChar low-level API #
Returns the sequence of bytes in a character's UTF-8 encoding.
Equations
- One or more equations did not get rendered due to their size.
Instances For
utf8EncodeChar BitVec API #
Size one #
Size two #
Size three #
Size four #
parseFirstByte definition #
Equations
- One or more equations did not get rendered due to their size.
Instances For
parseFirstByte low-level API #
parseFirstByte BitVec API #
Size one #
Size two #
Size three #
Size four #
isInvalidContinuationByte definition & API #
Equations
- ByteArray.utf8DecodeChar?.isInvalidContinuationByte b = (b &&& 192 != 128)
Instances For
parseFirstByte, isInvalidContinuationByte and utf8EncodeChar #
Instances For
Equations
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Decodes and returns the Char whose UTF-8 encoding begins at i in bytes.
Returns none if i is not the start of a valid UTF-8 encoding of a character.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
utf8DecodeChar? low-level API #
Main theorems #
utf8DecodeChar?_utf8EncodeChar_append and toByteArray_of_utf8DecodeChar?_eq_some are the two main results that together
imply that UTF-8 encoding and decoding are inverse.
Corollaries #
Decodes and returns the Char whose UTF-8 encoding begins at i in bytes.
This function requires a proof that there is, in fact, a valid Char at i. utf8DecodeChar? is
an alternative function that returns Option Char instead of requiring a proof ahead of time.
Equations
- bytes.utf8DecodeChar i h = (bytes.utf8DecodeChar? i).get h