Bug 576879
Summary: | libguestfs protocol loses synchronization if you 'upload' before mounting disks | |
---|---|---|---
Product: | [Community] Virtualization Tools | Reporter: | Richard W.M. Jones <rjones>
Component: | libguestfs | Assignee: | Richard W.M. Jones <rjones>
Status: | CLOSED UPSTREAM | QA Contact: |
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | unspecified | CC: | mbooth, virt-maint
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | |
: | 627835 | Environment: |
Last Closed: | 2011-07-14 19:04:46 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 584228, 591155, 591250, 627835 | |
Fix posted: http://git.annexia.org/?p=libguestfs.git;a=commitdiff;h=5922d7084d6b43f0a1a15b664c7082dfeaf584d0

Setting back to ASSIGNED, since we're still not quite there with this patch.

    ><fs> sparse /tmp/test.img 10M
    ><fs> run
    ><fs> tar-in /tmp/foobar /blah
    libguestfs: error: open: /tmp/foobar: No such file or directory
    ><fs> list-devices
    libguestfs: error: unexpected procedure number (69/7)
    ><fs> list-devices
    /dev/vda

The first error from list-devices shouldn't happen.

Another example:

    ><fs> tar-in /tmp/foobar /blah
    libguestfs: error: open: /tmp/foobar: No such file or directory
    ><fs> ping-daemon
    libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee
    ><fs> ping-daemon
    libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x2000f5f5 from daemon, expected 0xffffeeee

Here's a one-line reproducer for the latest libguestfs:

    $ ./fish/guestfish -N disk -- -tar-in /dev/nofile /blah : ping-daemon
    libguestfs: error: open: /dev/nofile: No such file or directory
    libguestfs: error: unexpected procedure number (69/92)
    libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x18 from daemon, expected 0xffffeeee
    libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x2000f5f5 from daemon, expected 0xffffeeee

OK I understand what's going on here. Both ends simultaneously send cancel messages:

    library                     daemon
       |
       V
    sends RPC message -------+
       |                     |
       |               receives RPC message
       |                     |
       V                     V
    opens file,        filesystem not mounted!
    error: not found!        |
       |                     |
       V                     V
    sends cancel        sends cancel
       +------->     <--------+
               !!!!

Patch posted upstream to fix the issue described in comment 5: https://www.redhat.com/archives/libguestfs/2010-May/msg00061.html

Setting back to ASSIGNED, since the regression test for this bug has started to hang.
See also: http://git.annexia.org/?p=libguestfs.git;a=commitdiff;h=fb998000e60b32219c2bf839044cff59f499dff1

[Copy of a note sent to the mailing list]

I just pushed a commit which re-enables two tests for [this bug]: http://git.annexia.org/?p=libguestfs.git;a=commitdiff;h=dc8e4b057ecd3984d7c27c8ece54048b6a06d662

This is a really long-standing bug which we thought we'd fixed, but which then turned up again. It currently is *not* failing on my machine. I added some clearer debug messages to the code paths involved. It seems to be highly timing related and I doubt that it is fully squashed, so it is quite probable that these tests will fail for somebody somewhere. If you can get it to fail with LIBGUESTFS_DEBUG=1 then please post the full log into the bug report.

I think I've nailed this one finally. Posted a patch here, still testing it: https://www.redhat.com/archives/libguestfs/2011-March/msg00090.html

Also the two follow-up messages:
https://www.redhat.com/archives/libguestfs/2011-March/msg00092.html
https://www.redhat.com/archives/libguestfs/2011-March/msg00093.html

Fixes included in 1.9.12.

*** Bug 624035 has been marked as a duplicate of this bug. ***

Haven't seen this for quite a while. FIXED!
[Originally reported by Seth Vidal]

Description of problem:

    guestfish <<EOF
    > add f12-minimal.img
    > run
    > upload /var/tmp/guestfish-1.0.85-1.el5.7.x86_64.rpm /home/vmbuild/guestfish-1.0.85-1.el5.7.x86_64.rpm
    > EOF
    libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee
    libguestfs: error: message length (536933877) > maximum possible size (4194304)
    libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x1 from daemon, expected 0xffffeeee

With another version of libguestfs:

    $ cat test
    #!/bin/sh -
    guestfish -x <<EOF
    add f12.img
    run
    upload a_big_file /
    EOF
    $ sh test
    libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee
    libguestfs: error: message length (536933877) > maximum possible size (4194304)
    libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x1 from daemon, expected 0xffffeeee

Version-Release number of selected component (if applicable):

libguestfs 1.0.87

How reproducible:

Always.

The problem here is we're uploading without first mounting any disk. Upload is failing and probably reporting an error, but then the protocol loses synchronization and it's game over.
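To make the "loses synchronization" symptom concrete, here is a toy model in Python. It is not the libguestfs wire protocol or its code: the framing (a 4-byte big-endian length followed by the payload), the function names, and the assumption that a reply begins with the word 0x2000f5f5 are all illustrative guesses inferred from the error messages in this report. Only the constants 0xffffeeee and 4194304 are taken from those messages. The sketch shows how a reader that is one word out of step produces both kinds of error, and that 536933877 is simply 0x2000f5f5 read as a length.

```python
import struct

CANCEL_FLAG = 0xFFFFEEEE       # word the library expects, per the errors in this report
MAX_MESSAGE = 4 * 1024 * 1024  # 4194304, the limit quoted in the errors in this report
FIRST_WORD = 0x2000F5F5        # ASSUMED first word of a reply, inferred from the errors


def frame(payload):
    """Toy framing (an assumption): 4-byte big-endian length, then the payload."""
    return struct.pack(">I", len(payload)) + payload


def read_word(stream):
    """Consume one 4-byte big-endian word from the front of the stream."""
    (word,) = struct.unpack(">I", bytes(stream[:4]))
    del stream[:4]
    return word


def check_for_cancel(stream):
    """Hypothetical stand-in for the library's cancellation check."""
    word = read_word(stream)
    if word != CANCEL_FLAG:
        return f"read {word:#x} from daemon, expected {CANCEL_FLAG:#x}"
    return None


def read_message(stream):
    """Read one framed message, or return an error string if the length is absurd."""
    length = read_word(stream)
    if length > MAX_MESSAGE:
        return f"message length ({length}) > maximum possible size ({MAX_MESSAGE})"
    payload = bytes(stream[:length])
    del stream[:length]
    return payload


def simulate():
    # The daemon sends a 100-byte (0x64) error reply. The library, already
    # out of step, looks for a cancel flag instead of a framed message.
    reply = struct.pack(">I", FIRST_WORD) + bytes(96)
    stream = bytearray(frame(reply))
    first = check_for_cancel(stream)  # consumes the length word by mistake
    second = read_message(stream)     # now treats FIRST_WORD as a length
    return first, second


if __name__ == "__main__":
    for line in simulate():
        print("toy error:", line)
```

Once a single word has been consumed by the wrong reader, every later read starts at the wrong offset, which is consistent with the errors continuing on subsequent commands until the stream happens to realign.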