Bug 627835 - libguestfs protocol loses synchronization if you 'upload' before mounting disks
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libguestfs
Version: 6.1
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Richard W.M. Jones
QA Contact: Virtualization Bugs
Keywords: Reopened
Depends On: 576879 613593 libguestfs_rebase6.3
Blocks: 584228 591155 591250
Reported: 2010-08-27 01:18 EDT by Jinxin Zheng
Modified: 2011-12-06 05:42 EST (History)
Fixed In Version: libguestfs-1.7.17-24.el6
Doc Type: Bug Fix
Clone Of: 576879
Last Closed: 2011-12-06 05:42:02 EST
Description Jinxin Zheng 2010-08-27 01:18:34 EDT
Clone to RHEL 6 to ensure it will get fixed in 6.1.

+++ This bug was initially created as a clone of Bug #576879 +++

[Originally reported by Seth Vidal]

Description of problem:

guestfish <<EOF
> add f12-minimal.img
> run
> upload /var/tmp/guestfish-1.0.85-1.el5.7.x86_64.rpm /home/vmbuild/guestfish-1.0.85-1.el5.7.x86_64.rpm
> EOF
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee
 
libguestfs: error: message length (536933877) > maximum possible size (4194304)
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x1 from daemon, expected 0xffffeeee

With another version of libguestfs:

$ cat test
#!/bin/sh -

guestfish -x <<EOF
add f12.img
run
upload a_big_file /
EOF
$ sh test
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee

libguestfs: error: message length (536933877) > maximum possible size (4194304)
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x1 from daemon, expected 0xffffeeee

Version-Release number of selected component (if applicable):

libguestfs 1.0.87

How reproducible:

Always.

The problem here is we're uploading without first mounting
any disk.  Upload is failing and probably reporting an error,
but then the protocol loses synchronization and it's game over.

--- Additional comment from rjones@redhat.com on 2010-04-17 17:32:59 EDT ---

Fix posted:
http://git.annexia.org/?p=libguestfs.git;a=commitdiff;h=5922d7084d6b43f0a1a15b664c7082dfeaf584d0

--- Additional comment from rjones@redhat.com on 2010-04-18 05:29:19 EDT ---

Setting back to ASSIGNED, since we're still not quite there
with this patch.

><fs> sparse /tmp/test.img 10M
><fs> run
><fs> tar-in /tmp/foobar /blah
libguestfs: error: open: /tmp/foobar: No such file or directory
><fs> list-devices 
libguestfs: error: unexpected procedure number (69/7)
><fs> list-devices 
/dev/vda

The first error from list-devices shouldn't happen.

--- Additional comment from rjones@redhat.com on 2010-04-18 05:30:09 EDT ---

Another example:

><fs> tar-in /tmp/foobar /blah
libguestfs: error: open: /tmp/foobar: No such file or directory
><fs> ping-daemon
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x64 from daemon, expected 0xffffeeee
><fs> ping-daemon
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x2000f5f5 from daemon, expected 0xffffeeee
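
The stray word 0x2000f5f5 read here appears to be the same 32-bit value as the impossible "message length (536933877)" in the original report. A minimal check of that arithmetic follows. The interpretation is an assumption not stated in this bug: if 0x2000f5f5 is a fixed constant from a message header rather than a real length, then in both cases the library was reading a header field at a point where it expected something else.

```python
# Values copied from the error messages in this report.
reported_length = 536933877   # "message length (536933877)"
max_message_size = 4194304    # "maximum possible size (4194304)"
stray_word = 0x2000F5F5       # "read 0x2000f5f5 from daemon"

# Render the decimal "length" in hex so it can be compared by eye.
length_as_hex = hex(reported_length)
print(length_as_hex)

# The bogus length and the stray word are the same 32-bit value.
print(reported_length == stray_word)

# The size limit is exactly 4 MiB.
print(max_message_size == 4 * 1024 * 1024)
```

Both symptoms are therefore consistent with the reader being offset within the message stream, not with corrupted data.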

--- Additional comment from rjones@redhat.com on 2010-05-12 14:06:12 EDT ---

Here's a one line reproducer for the latest libguestfs:

$ ./fish/guestfish -N disk -- -tar-in /dev/nofile /blah : ping-daemon
libguestfs: error: open: /dev/nofile: No such file or directory
libguestfs: error: unexpected procedure number (69/92)
libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x18 from daemon, expected 0xffffeeee

libguestfs: error: check_for_daemon_cancellation_or_eof: read 0x2000f5f5 from daemon, expected 0xffffeeee

--- Additional comment from rjones@redhat.com on 2010-05-12 14:22:26 EDT ---

OK I understand what's going on here.  Both ends simultaneously send
cancel messages:

library                 daemon
  |
  V
 sends RPC message -------+
  |                       |
  |                 receives RPC message
  |                       |
  V                       V
 opens file,        filesystem not mounted!
 error: not found!        |
  |                       |
  V                       V
 sends cancel       sends cancel
  +------->      <--------+
            !!!!
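
The effect of the crossing messages can be sketched with a toy model. This is not libguestfs code or its wire format. It only illustrates how one unread message left in the daemon-to-library channel makes the next call read the wrong reply, as in the "(69/7)" error from the earlier list-devices session. The procedure numbers are inferred from that error message, and both the daemon's message contents and the "drain" step are hypothetical.

```python
from collections import deque

# Procedure numbers inferred from the "(69/7)" error in this report; an assumption.
TAR_IN, LIST_DEVICES = 69, 7


def read_reply(inbox, expected_proc):
    """Library side: take the next queued reply and check its procedure number."""
    proc, payload = inbox.popleft()
    if proc != expected_proc:
        return f"error: unexpected procedure number ({proc}/{expected_proc})"
    return payload


def session(drain_after_cancel):
    inbox = deque()  # messages travelling daemon -> library
    log = []

    # tar-in: the library cannot open the local file and cancels, while the
    # daemon independently rejects the call and queues its own message.
    inbox.append((TAR_IN, "error: daemon rejected tar-in"))  # hypothetical text
    log.append("error: open: /tmp/foobar: No such file or directory")
    if drain_after_cancel:
        inbox.clear()  # hypothetical remedy: discard what the daemon sent

    # Two list-devices calls; the daemon answers each one correctly.
    for _ in range(2):
        inbox.append((LIST_DEVICES, "/dev/vda"))
        log.append(read_reply(inbox, LIST_DEVICES))
    return log


broken = session(drain_after_cancel=False)
fixed = session(drain_after_cancel=True)
print("\n".join(broken))
print("---")
print("\n".join(fixed))
```

In the undrained run the first list-devices call consumes the stale tar-in message and fails, and the second succeeds only because it reads the reply meant for the first. Within this toy model the channel stays one reply behind from then on.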

--- Additional comment from rjones@redhat.com on 2010-05-12 15:00:36 EDT ---

Patch posted upstream to fix the issue described in comment 5:
https://www.redhat.com/archives/libguestfs/2010-May/msg00061.html
Comment 1 Richard W.M. Jones 2010-11-24 04:09:09 EST
Will probably be fixed by the rebase.  Needs QA to
verify that.
Comment 2 Richard W.M. Jones 2011-01-04 09:15:25 EST
Going to claim that this is fixed by the
rebase.  QA please check this one carefully
since there are lots of corner cases in the
code, and we're not really sure that we have
fixed all of them properly.
Comment 4 Jinxin Zheng 2011-01-31 02:18:00 EST
This turns out not to be completely fixed. In fact, it looks even worse:

guestfish <<EOF
add test.img
run
upload test.txt /test.txt
EOF

guestfish hangs when running the above.

So I am changing this back to ASSIGNED.
Comment 5 Richard W.M. Jones 2011-01-31 04:48:32 EST
Fair enough, this isn't fixed.  In fact we suspected this
when the regression test started failing:
https://bugzilla.redhat.com/show_bug.cgi?id=576879#c7

I think I'm going to leave this one and not fix it for 6.1.
There's an easy workaround for users, and we can fix it
for 6.2 instead.
Comment 6 RHEL Product and Program Management 2011-01-31 05:05:14 EST
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.
Comment 8 Richard W.M. Jones 2011-07-12 06:48:31 EDT
Safest to fix this by rebasing (bug 719879).
Comment 9 Richard W.M. Jones 2011-08-10 13:24:39 EDT
Actually we have included all the relevant commits
so this should be fixed in 6.2.
Comment 10 Richard W.M. Jones 2011-08-10 13:26:43 EDT
This is what the correct output should be (verified
for me on RHEL 6.2 with libguestfs 1.7.17-24.el6):

$ guestfish -N disk -- -tar-in /dev/nofile /blah : ping-daemon : echo OK
libguestfs: error: open: /dev/nofile: No such file or directory
OK
Comment 13 Jinxin Zheng 2011-08-26 05:10:17 EDT
Verified this using the reproducer in comment 10.

$ guestfish -N disk -- -tar-in /dev/nofile /blah : ping-daemon : echo OK

libguestfs-1.7.17-19:
The script hangs after printing
libguestfs: error: open: /dev/nofile: No such file or directory

libguestfs-1.7.17-26:
The script prints the following, then exits.
libguestfs: error: open: /dev/nofile: No such file or directory
OK
Comment 14 errata-xmlrpc 2011-12-06 05:42:02 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1512.html
