Zulip Chat Archive

Stream: lean4

Topic: lake upload for Cloud Release


Utensil Song (Oct 18 2023 at 11:28):

I was preparing for a Cloud Release, which is lake upload <tag> in CI, but this simple command doesn't work for Ubuntu CI due to

  lake upload v0.0.1-alpha.2
  shell: /usr/bin/bash -e {0}
  env:
    pythonLocation: /opt/hostedtoolcache/Python/3.11.5/x64
    PKG_CONFIG_PATH: /opt/hostedtoolcache/Python/3.11.5/x64/lib/pkgconfig
    Python_ROOT_DIR: /opt/hostedtoolcache/Python/3.11.5/x64
    Python2_ROOT_DIR: /opt/hostedtoolcache/Python/3.11.5/x64
    Python3_ROOT_DIR: /opt/hostedtoolcache/Python/3.11.5/x64
    LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.11.5/x64/lib
    GH_TOKEN: ***
error: > tar -c -z --exclude=*.tar.gz --exclude=*.tar.gz.trace -f ./build/linux-64.tar.gz -C ./build .
error: stderr:
tar: .: file changed as we read it
error: external command `tar` exited with code 1

It works fine for my Mac CI.

I understand that lake will modified *.trace, and it's already excluded, tar should not observe any modification.

@Mac Malone Do you have some idea why this is happening? I could not find another projects running CI for Cloud Release for reference. Thanks!

Utensil Song (Oct 18 2023 at 11:56):

It's confirmed locally that a raw tar doesn't fail, but lake upload fails at tar.

Running variants of strace -ff -e file lake upload v0.0.1-alpha.2and strace -ff lake upload v0.0.1-alpha.2 2>&1 | grep -E 'open|write'|grep -v 'write resumed'|grep -v 'write(4'|grep -v 'write(1'|grep -v 'write(2' gives me no clue on what lake might be writing to.

Mac Malone (Oct 18 2023 at 17:22):

Utensil Song said:

It's confirmed locally that a raw tar doesn't fail, but lake upload fails at tar.

So, you are saying running the exact same tar commands works when run directly on the CI, but fails when Lake runs it?

Mac Malone (Oct 18 2023 at 17:25):

That is very weird. Did you try running lake with -v to get the full verbose output of Lake's tar to see where it is failing?

Utensil Song (Oct 19 2023 at 01:13):

Yes, that's what I mean, and I can reproduce it very time as CI fails consistently multiple times.

Utensil Song (Oct 19 2023 at 01:14):

lake upload v0.0.1-alpha.2 -v just gives files names, still no clue. This is the end of it:

./cln-1.3.6/src/base/digit/.dirstamp
./cln-1.3.6/src/base/cl_d0_exception.lo
./ginac-1.8.7.tar.bz2
./cln-1.3.6.tar.bz2
./share/
./share/man/
./share/man/man1/
./share/man/man1/pi.1
./share/man/man1/ginsh.1
./share/man/man1/viewgar.1
./share/info/
./share/info/cln.info
./share/info/ginac.info
./share/info/ginac-examples.info
error: stderr:
tar: .: file changed as we read it
error: external command `tar` exited with code 1

Utensil Song (Oct 19 2023 at 01:16):

.dirstamp looks suspicious but it belongs to make, not lake.

Utensil Song (Oct 19 2023 at 01:31):

Also, there's nothing suspicious in the lake code, all the call chain seems to be free of anything modifying other files or asynchronous. I'll bypass this by raw tar calls for now and revisit when there's more ideas or clues.

Utensil Song (Oct 19 2023 at 06:10):

OK, I have identified the source and will file an issue.

The issue lies in the way lake calls tar, the archive it generates is inside build which changes the modified time of directory build, I previously ruled out this because I thought the check also excludes the exclude patterns, but it turns out to be not the case for GNU tar after reading its source code, it fails at checking the mtime of the dir, without taking the exclude patterns into account.

Now I can bypass this 100% by generating the archive elsewhere, but the issue doesn't happen if tar takes too short to finish, and I have 100M+ to tar.

Mac Malone (Oct 19 2023 at 06:25):

Utensil Song said:

The issue lies in the way lake calls tar, the archive it generates is inside build which changes the modified time of directory build, I previously ruled out this because I thought the check also excludes the exclude patterns, but it turns out to be not the case for GNU tar after reading its source code

That is interesting. When I original wrote this code I stumble upon a StackOverflow answer like this one (I am not sure if it was the same question or a similar one) which suggested that --exclude should work. I am surprised the behavior you are seeing here never was raised there. Do you think creating an empty file in the directory with the proper name would be enough to avoid this error (since the directory should not be modified then since no files were added or removed)?

Utensil Song (Oct 19 2023 at 06:41):

Maybe solution 1 in the SO answer would suffice, but now the command changes into build so there is no extra level of directory. Solution 2 might work, as my experience is indeed that if the archive was created and errored, the second time tar would succeed. But both solution sounds too subtle to me, need to check against the source code.

Utensil Song (Oct 19 2023 at 07:21):

Both solution confirms to work well after repeated local tests. Nothing in the code explicitly excludes these 2 cases, but I assume they prevent the modified time of the dir to change, as creating a file is an event that changes the modified time of the direct parent directory of that file.

Utensil Song (Oct 19 2023 at 09:32):

lean4#2713

Wojciech Nawrocki (Oct 21 2023 at 00:20):

Thanks for researching this! I have also observed it.


Last updated: Dec 20 2023 at 11:08 UTC