[Originally reported by Seth Vidal]

Description of problem:

guestfish <<EOF
> add f12-minimal.img
> run
> upload /var/tmp/guestfish-1.0.85-1.el5.7.x86_64.rpm /home/vmbuild/guestfish-1.0.85-1.el5.7.x86_64.rpm
> EOF
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee
libguestfs: error: message length (536933877) > maximum possible size (4194304)
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x1 from daemon, expected 0xffffeeee

With another version of libguestfs:

$ cat test
#!/bin/sh -
guestfish -x <<EOF
add f12.img
run
upload a_big_file /
EOF
$ sh test
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee
libguestfs: error: message length (536933877) > maximum possible size (4194304)
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x1 from daemon, expected 0xffffeeee

Version-Release number of selected component (if applicable):
libguestfs 1.0.87

How reproducible:
Always.

The problem here is we're uploading without first mounting any disk. Upload is failing and probably reporting an error, but then the protocol loses synchronization and it's game over.
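A note on the numbers in these errors: the rejected "message length" 536933877 is 0x2000f5f5 in hexadecimal, which is the same word that later comments in this bug show being read from the daemon where the cancel flag was expected. That is consistent with the library reading the daemon's stream at the wrong offset and treating some other field as a length. A minimal Python check of the arithmetic follows; the variable names are illustrative, not libguestfs identifiers, and the interpretation is an assumption drawn only from the messages quoted here.

```python
# Values copied from the libguestfs error messages in this report.
reported_length = 536933877   # "message length (536933877)"
max_message_size = 4194304    # "maximum possible size (4194304)"

# The bogus length, viewed as a 32-bit word in hex.
print(hex(reported_length))                  # 0x2000f5f5

# The limit is exactly 4 MiB, so a genuine length can never look like this.
print(max_message_size == 4 * 1024 * 1024)   # True
print(reported_length > max_message_size)    # True
```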
Fix posted: http://git.annexia.org/?p=libguestfs.git;a=commitdiff;h=5922d7084d6b43f0a1a15b664c7082dfeaf584d0
Setting back to ASSIGNED, since we're still not quite there with this patch.

><fs> sparse /tmp/test.img 10M
><fs> run
><fs> tar-in /tmp/foobar /blah
libguestfs: error: open: /tmp/foobar: No such file or directory
><fs> list-devices
libguestfs: error: unexpected procedure number (69/7)
><fs> list-devices
/dev/vda

The first error from list-devices shouldn't happen.
Another example:

><fs> tar-in /tmp/foobar /blah
libguestfs: error: open: /tmp/foobar: No such file or directory
><fs> ping-daemon
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee
><fs> ping-daemon
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x2000f5f5 from daemon, expected 0xffffeeee
Here's a one line reproducer for the latest libguestfs:

$ ./fish/guestfish -N disk -- -tar-in /dev/nofile /blah : ping-daemon
libguestfs: error: open: /dev/nofile: No such file or directory
libguestfs: error: unexpected procedure number (69/92)
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x18 from daemon, expected 0xffffeeee
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x2000f5f5 from daemon, expected 0xffffeeee
OK I understand what's going on here. Both ends simultaneously send cancel messages:

     library                        daemon
        |
        V
  sends RPC message -------+
        |                  |
        |           receives RPC message
        |                  |
        V                  V
  opens file,       filesystem not mounted!
  error: not found!        |
        |                  |
        V                  V
  sends cancel        sends cancel
        +------->  <--------+
                 !!!!
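To illustrate why a collision like this keeps hurting later commands, here is a toy model in Python. It is not libguestfs code and makes a loud assumption: that after the library bails out on its local open error, the daemon's reply to tar-in is left unread in the stream. The cancel words themselves are not modelled. The numbers 69 and 7 are simply those printed in "unexpected procedure number (69/7)" earlier in this bug; the function names are hypothetical.

```python
from collections import deque

# Numbers as printed in "(69/7)"; treating them as the tar-in and
# list-devices procedure numbers is an assumption for this sketch.
TAR_IN, LIST_DEVICES = 69, 7


def daemon_reply(stream, proc):
    """Toy daemon: queue one reply tagged with the procedure it answers."""
    stream.append(proc)


def library_call(stream, proc, local_open_fails=False):
    """Toy library: return the result text for one call."""
    if local_open_fails:
        # Assumed behaviour: give up without reading the daemon's reply.
        return "error: open: No such file or directory"
    got = stream.popleft()
    if got != proc:
        return f"error: unexpected procedure number ({got}/{proc})"
    return "ok"


stream = deque()  # daemon -> library direction only

# tar-in: the daemon replies (with its own error), the library never reads it.
daemon_reply(stream, TAR_IN)
results = [library_call(stream, TAR_IN, local_open_fails=True)]

# Two list-devices calls afterwards.
for _ in range(2):
    daemon_reply(stream, LIST_DEVICES)
    results.append(library_call(stream, LIST_DEVICES))

for line in results:
    print(line)
print("unread replies left in stream:", len(stream))
```

In this model the first list-devices picks up the stale tar-in reply and the second one appears to succeed, although it is really consuming the reply to the previous call, so the stream stays one message behind.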
Patch posted upstream to fix the issue described in comment 5: https://www.redhat.com/archives/libguestfs/2010-May/msg00061.html
Setting back to ASSIGNED, since the regression test for this bug has started to hang.

See also:
http://git.annexia.org/?p=libguestfs.git;a=commitdiff;h=fb998000e60b32219c2bf839044cff59f499dff1
[Copy of a note sent to the mailing list]

I just pushed a commit which reenables two tests for [this bug]:

http://git.annexia.org/?p=libguestfs.git;a=commitdiff;h=dc8e4b057ecd3984d7c27c8ece54048b6a06d662

This is a really long-standing bug which we thought we'd fixed, but which then turned up again. It currently is *not* failing on my machine. I added some clearer debug messages to the code paths involved.

It seems to be highly timing-related and I doubt that it is fully squashed, so it is quite probable that these tests will fail for somebody somewhere. If you can get it to fail with LIBGUESTFS_DEBUG=1 then please post the full log into the bug report.
I think I've nailed this one finally. Posted a patch here, still testing it: https://www.redhat.com/archives/libguestfs/2011-March/msg00090.html
Also the two follow up messages:

https://www.redhat.com/archives/libguestfs/2011-March/msg00092.html
https://www.redhat.com/archives/libguestfs/2011-March/msg00093.html
Fixes included in 1.9.12.
*** Bug 624035 has been marked as a duplicate of this bug. ***
Haven't seen this for quite a while. FIXED!