Bug 627835 - libguestfs protocol loses synchronization if you 'upload' before mounting disks
Summary: libguestfs protocol loses synchronization if you 'upload' before mounting disks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libguestfs
Version: 6.1
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Richard W.M. Jones
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 576879 613593 libguestfs_rebase6.3
Blocks: 584228 591155 591250
Reported: 2010-08-27 05:18 UTC by Jinxin Zheng
Modified: 2011-12-06 10:42 UTC (History)

Fixed In Version: libguestfs-1.7.17-24.el6
Doc Type: Bug Fix
Doc Text:
Clone Of: 576879
Environment:
Last Closed: 2011-12-06 10:42:02 UTC


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:1512 normal SHIPPED_LIVE libguestfs bug fix update 2011-12-06 00:39:11 UTC

Description Jinxin Zheng 2010-08-27 05:18:34 UTC
Clone to RHEL 6 to ensure it will get fixed in 6.1.

+++ This bug was initially created as a clone of Bug #576879 +++

[Originally reported by Seth Vidal]

Description of problem:

guestfish <<EOF
> add f12-minimal.img
> run
> upload /var/tmp/guestfish-1.0.85-1.el5.7.x86_64.rpm /home/vmbuild/guestfish-1.0.85-1.el5.7.x86_64.rpm
> EOF
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee
 
libguestfs: error: message length (536933877) > maximum possible size (4194304)
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x1 from daemon, expected 0xffffeeee

With another version of libguestfs:

$ cat test
#!/bin/sh -

guestfish -x <<EOF
add f12.img
run
upload a_big_file /
EOF
$ sh test
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee

libguestfs: error: message length (536933877) > maximum possible size (4194304)
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x1 from daemon, expected 0xffffeeee

Version-Release number of selected component (if applicable):

libguestfs 1.0.87

How reproducible:

Always.

The problem here is we're uploading without first mounting
any disk.  Upload is failing and probably reporting an error,
but then the protocol loses synchronization and it's game over.

--- Additional comment from rjones@redhat.com on 2010-04-17 17:32:59 EDT ---

Fix posted:
http://git.annexia.org/?p=libguestfs.git;a=commitdiff;h=5922d7084d6b43f0a1a15b664c7082dfeaf584d0

--- Additional comment from rjones@redhat.com on 2010-04-18 05:29:19 EDT ---

Setting back to ASSIGNED, since we're still not quite there
with this patch.

><fs> sparse /tmp/test.img 10M
><fs> run
><fs> tar-in /tmp/foobar /blah
libguestfs: error: open: /tmp/foobar: No such file or directory
><fs> list-devices 
libguestfs: error: unexpected procedure number (69/7)
><fs> list-devices 
/dev/vda

The first error from list-devices shouldn't happen.

--- Additional comment from rjones@redhat.com on 2010-04-18 05:30:09 EDT ---

Another example:

><fs> tar-in /tmp/foobar /blah
libguestfs: error: open: /tmp/foobar: No such file or directory
><fs> ping-daemon
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee
><fs> ping-daemon
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x2000f5f5 from daemon, expected 0xffffeeee

--- Additional comment from rjones@redhat.com on 2010-05-12 14:06:12 EDT ---

Here's a one line reproducer for the latest libguestfs:

$ ./fish/guestfish -N disk -- -tar-in /dev/nofile /blah : ping-daemon
libguestfs: error: open: /dev/nofile: No such file or directory
libguestfs: error: unexpected procedure number (69/92)
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x18 from daemon, expected 0xffffeeee

libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x2000f5f5 from daemon, expected 0xffffeeee

--- Additional comment from rjones@redhat.com on 2010-05-12 14:22:26 EDT ---

OK I understand what's going on here.  Both ends simultaneously send
cancel messages:

library                 daemon
  |
  V
 sends RPC message -------+
  |                       |
  |                 receives RPC message
  |                       |
  V                       V
 opens file,        filesystem not mounted!
 error: not found!        |
  |                       |
  V                       V
 sends cancel       sends cancel
  +------->      <--------+
            !!!!
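The error text "read 0x64 from daemon, expected 0xffffeeee" fits this picture: the library expects a 32-bit cancel word and instead finds the start of an unrelated message. A minimal sketch of that check follows. It assumes the word is sent as four big-endian bytes (XDR style); the function names are hypothetical and this is not the libguestfs implementation.

```shell
#!/bin/sh
# Illustration only: check whether the next 4 bytes on stdin are the
# cancel word. The value 0xffffeeee is taken from the error messages
# in this report; big-endian byte order is an assumption.
CANCEL_FLAG=ffffeeee

# Read 4 bytes from stdin and print them as 8 hex digits.
read_word_hex() {
    od -An -tx1 -N4 | tr -d ' \n'
}

# Report either a clean cancel or a loss of synchronization.
check_cancel_word() {
    word=$(read_word_hex)
    if [ "$word" = "$CANCEL_FLAG" ]; then
        echo "cancel"
    else
        echo "desync: read 0x$word, expected 0x$CANCEL_FLAG"
    fi
}

# Usage:
#   printf '\377\377\356\356' | check_cancel_word   # the cancel word
#   printf '\000\000\000\144' | check_cancel_word   # 0x64, as in the log
```

If both sides send a cancel at the same moment, neither is reading what the other expects next, which would explain the follow-on "unexpected procedure number" and "message length" errors.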

--- Additional comment from rjones@redhat.com on 2010-05-12 15:00:36 EDT ---

Patch posted upstream to fix the issue described in comment 5:
https://www.redhat.com/archives/libguestfs/2010-May/msg00061.html

Comment 1 Richard W.M. Jones 2010-11-24 09:09:09 UTC
Will probably be fixed by the rebase.  Needs QA to
verify that.

Comment 2 Richard W.M. Jones 2011-01-04 14:15:25 UTC
Going to claim that this is fixed by the
rebase.  QA please check this one carefully
since there are lots of corner cases in the
code, and we're not really sure that we have
fixed all of them properly.

Comment 4 Jinxin Zheng 2011-01-31 07:18:00 UTC
This turns out not to be completely fixed. In fact, it looks even worse:

guestfish <<EOF
add test.img
run
upload test.txt /test.txt
EOF

guestfish hangs when running the above,

so I am changing this back to ASSIGNED.

Comment 5 Richard W.M. Jones 2011-01-31 09:48:32 UTC
Fair enough, this isn't fixed.  In fact we suspected this
when the regression test started failing:
https://bugzilla.redhat.com/show_bug.cgi?id=576879#c7

I think I'm going to leave this one and not fix it for 6.1.
There's an easy workaround for users, and we can fix it
for 6.2 instead.
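
The workaround is not spelled out in this comment. Going by the bug summary, it is presumably to mount a filesystem before uploading. A minimal sketch under that assumption follows; the helper name and the device name /dev/sda1 are hypothetical, and the real device depends on the image (list-devices or list-partitions shows what is available).

```shell
#!/bin/sh
# Print a guestfish script that mounts a filesystem before uploading.
# All four arguments are placeholders supplied by the caller.
upload_after_mount() {
    img=$1 dev=$2 src=$3 dst=$4
    cat <<EOF
add $img
run
mount $dev /
upload $src $dst
EOF
}

# Usage (device name is an assumption, adjust for your image):
#   upload_after_mount test.img /dev/sda1 test.txt /test.txt | guestfish
```

Generating the script as text keeps the ordering (mount, then upload) easy to inspect before anything is run against a real image.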

Comment 6 RHEL Product and Program Management 2011-01-31 10:05:14 UTC
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.

Comment 8 Richard W.M. Jones 2011-07-12 10:48:31 UTC
Safest to fix this by rebasing (bug 719879).

Comment 9 Richard W.M. Jones 2011-08-10 17:24:39 UTC
Actually we have included all the relevant commits
so this should be fixed in 6.2.

Comment 10 Richard W.M. Jones 2011-08-10 17:26:43 UTC
This is what the correct output should be (verified
for me on RHEL 6.2 with libguestfs 1.7.17-24.el6):

$ guestfish -N disk -- -tar-in /dev/nofile /blah : ping-daemon : echo OK
libguestfs: error: open: /dev/nofile: No such file or directory
OK
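
For repeated verification, the expected output above can be checked mechanically. The sketch below is one possible way to do it, not part of the libguestfs test suite; the function name is hypothetical. It treats any of the desynchronization errors quoted earlier in this report as a failure, and a missing "OK" line as a hang or early exit.

```shell
#!/bin/sh
# Read the reproducer's combined output on stdin and print PASS or FAIL.
check_reproducer_output() {
    out=$(cat)
    # Any of these messages means the protocol lost synchronization.
    if printf '%s\n' "$out" |
        grep -Eq 'check_for_daemon_cancellation_or_eof|unexpected procedure number'; then
        echo "FAIL: protocol lost synchronization"
        return 1
    fi
    # The final "echo OK" must have run, so a line reading exactly OK is required.
    if ! printf '%s\n' "$out" | grep -qx 'OK'; then
        echo "FAIL: no OK line (hang or early exit)"
        return 1
    fi
    echo "PASS"
}

# Usage (timeout guards against a hang; 120 seconds is an arbitrary choice):
#   timeout 120 guestfish -N disk -- -tar-in /dev/nofile /blah : ping-daemon : echo OK 2>&1 |
#       check_reproducer_output
```

The "open: /dev/nofile: No such file or directory" error is expected and is deliberately not treated as a failure.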

Comment 13 Jinxin Zheng 2011-08-26 09:10:17 UTC
Verified this using the reproducer in comment 10.

$ guestfish -N disk -- -tar-in /dev/nofile /blah : ping-daemon : echo OK

libguestfs-1.7.17-19:
The script hangs after printing
libguestfs: error: open: /dev/nofile: No such file or directory

libguestfs-1.7.17-26:
The script prints the following, then exits.
libguestfs: error: open: /dev/nofile: No such file or directory
OK

Comment 14 errata-xmlrpc 2011-12-06 10:42:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1512.html

