Zulip Chat Archive

Stream: general

Topic: IO changes in 3.4.2


Simon Winwood (Feb 26 2019 at 05:43):

In lean 3.4.2 the mode for subprocess pipes was changed --- in 3.4.1 pipes are binary, but in 3.4.2 they are utf-8. Is there some way to change the mode of these pipes (either before or after they are created?) I looked and couldn't see anything obvious.

Andrew Ashworth (Feb 26 2019 at 05:51):

are you talking about this commit: https://github.com/leanprover/lean/commit/4e16bc7192f9f32b03222142e659fa3dae4b8025#diff-6915aa5b8efa0a177a7155df746d3a21?

Andrew Ashworth (Feb 26 2019 at 05:52):

Probably want to ask @Keeley Hoek

Simon Winwood (Feb 26 2019 at 07:23):

Possibly? I noticed it when my lean program started behaving really oddly, and I tracked it down to a change in the release notes.

Keeley Hoek (Feb 26 2019 at 07:24):

(Looks like the zulip app tricked me I to posting twice, sorry)

Keeley Hoek (Feb 26 2019 at 07:25):

Hey!

By default pipes weren't binary, they were text mode. The commit referenced instead changes the behaviour of text mode pipes to properly decode utf-8 characters when writing them (and encode when reading), instead of just butchering the internal representation of each character by truncating it to the lowest 8 bits.

This behaviour is only enabled in text mode, and the function in the io namespace which opens a file descriptor handle takes an optional boolean true or false argument which specifies whether to open the handle in text (ff) or binary (tt) mode.

Simon Winwood (Feb 26 2019 at 07:26):

This isn't enabled for process pipes though --- I can't change the mode of a pipe as far as I can tell.
\

Simon Winwood (Feb 26 2019 at 07:27):

This might be more due to me abusing lean's char type, but there doesn't seem to be any byte buffer support

Keeley Hoek (Feb 26 2019 at 07:32):

Ooooh, yep that sound bad. A problem will arise when lean reads a byte from a process handle with the highest bit set (i.e. non ASCII). I'm less certain about what would happen if an invalid unicode character is written...

Keeley Hoek (Feb 26 2019 at 07:33):

Is that consistent with what you're seeing?

Simon Winwood (Feb 26 2019 at 07:55):

I am reading from a file and writing to a pipe, and getting unexpected results on the other end of the pipe. The text coming back is all ascii, so I am fine there

Simon Winwood (Feb 26 2019 at 08:00):

I need to head off. This isn't blocking anything, but it would be nice to figure out a workaround, esp. for future releases. Maybe something like byte-buffers (instead of just char buffers)? I don't know what the plan for lean 4 looks like either

Simon Winwood (Feb 26 2019 at 08:00):

thanks for replying so quickly btw

Joe Hendrix (Mar 05 2019 at 20:00):

I work with Simon. It sounds like we're definitely not able to upgrade to Lean 3.4.2 until we find a way to send binary data from another process into Lean. We could do tricks like base64 encode the data, but for now I think we'll just stay with Lean 3.4.1 until Lean 4 is out with a C FFI.

Mario Carneiro (Mar 05 2019 at 20:08):

What's the issue? I have written a lean program that reads binary data, so I think it's at least possible to get the behavior you want

Mario Carneiro (Mar 05 2019 at 20:26):

here's a small demo:

$ printf "%b" '\xca\xfe\xba\xbe' > test.bin
import system.io

#eval do
  h  io.mk_file_handle "test.bin" io.mode.read tt,
  4, c  io.fs.read h 4,
  trace (to_string $ char.val <$> array.to_list c) (return ())
  -- [202, 254, 186, 190]

Joe Hendrix (Mar 05 2019 at 21:02):

We want to read binary data from stdout on a process created by io.proc.spawn.


Last updated: Dec 20 2023 at 11:08 UTC