Zulip Chat Archive

Stream: mathlib4

Topic: error writing standard output: Broken pipe


Sam van G (May 04 2024 at 12:28):

Hello! This CI job on my branch just failed with the message:
tail: error writing 'standard output': Broken pipe
Does anyone know what went wrong / what I should do?

Ruben Van de Velde (May 04 2024 at 12:48):

That's a new one for me. I triggered another attempt

Sam van G (May 04 2024 at 14:30):

Interesting, this time it worked! Thanks Ruben.

Michael Stoll (May 05 2024 at 15:47):

I'm seeing the message

tail: error writing 'standard output': Broken pipe

also from time to time in CI (and also when bors is trying to merge something) now.

Mario Carneiro (May 05 2024 at 15:50):

I'm wondering if this has something to do with the fact that lake closes stdout now @Mac Malone never mind, I must have hallucinated that

Ruben Van de Velde (May 05 2024 at 16:47):

It seems like something that's either caused by some unprincipled pipe handling or by the operating system not being 100% reliable

Michael Stoll (May 05 2024 at 19:31):

Just happening again:

 Downloaded: 4454 file(s) [attempted 4454/4454 = 100%] (100% success)
Decompressing 4454 file(s)
unpacked in 5118 ms
[2/4455] Building Mathlib.Mathport.Rename
tail: error writing 'standard output': Broken pipe
Error: Process completed with exit code 1.

while checking the cache in CI.

Michael Stoll (May 05 2024 at 19:49):

... and again after re-running the build job.

Michael Stoll (May 05 2024 at 19:50):

One slightly annoying thing is that I have to wait until the other parts of the job have finished (in particular, linting) before I can restart it.

Michael Stoll (May 05 2024 at 20:01):

This time, it's

Downloaded: 4454 file(s) [attempted 4454/4454 = 100%] (100% success)
Decompressing 4454 file(s)
unpacked in 6167 ms
[2/4455] Building Mathlib.Mathport.Rename
/home/lean/actions-runner/_work/_temp/cf950ba2-6abb-4d03-bbf2-4d2ce2af80b8.sh: line 9: kill: (2272700) - No such process
tail: error writing 'standard output': Broken pipe

but it was apparently not registered as a failure.

Richard Copley (May 05 2024 at 20:26):

This comes from the "check the cache" step in one of bors.yml, build.yml, build.yml.in, build_fork.yml, which all have this test:

            # We pipe the output of `lake build` to a file,
            # and if we find " Building Mathlib" in that file we kill `lake build`, and error.
            lake build > tmp & tail --pid=$! -n +1 -F tmp | (! (grep -m 1 " Building Mathlib" && kill $! ))

Finding, for example, "Building Mathlib.Mathport.Rename" in the output of lake build used to mean that the file needed to be rebuilt, indicating that the cache didn't work. That's no longer a valid assumption, so this test should be rewritten.
The intention seems to be for tail to fail gracefully on detecting that lake was killed, but apparently that didn't work.

Richard Copley (May 05 2024 at 20:39):

This was always quite racy. grep sends the kill signal to whichever process has the pid that lake was given when it started (hopefully either lake or nobody), then grep exits and tail falls over because it can no longer talk to grep. Later, lake handles the signal and dies (which was supposed to be detected by tail).

Ruben Van de Velde (May 05 2024 at 20:40):

Eek. Thanks for figuring that out

Richard Copley (May 05 2024 at 20:45):

Originally written by @Kim Morrison (#7099) (but that doesn't mean they have to fix it!)

Kim Morrison (May 06 2024 at 04:04):

@Mac Malone, another instance of Hyrum's law in action here with Lake.

Can you tell us how we should detect whether lake build wants to do any work, without doing any work?

Kim Morrison (May 06 2024 at 04:05):

Ah, this is just lake build --no-build, I think.

Kim Morrison (May 06 2024 at 04:09):

Hopefully #12688.


Last updated: May 02 2025 at 03:31 UTC