Bug 1269874 - HMP/QMP blocked on failed NBD device (during migration and quit)
Status: NEW
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Markus Armbruster
QA Contact: xianwang
Reported: 2015-10-08 07:57 EDT by Dr. David Alan Gilbert
Modified: 2017-06-08 07:59 EDT
CC: 12 users

Type: Bug

Description Dr. David Alan Gilbert 2015-10-08 07:57:06 EDT
Description of problem:
It's possible to hang the HMP in various cases if there's a non-responding block device; in this case an NBD server that's fallen off the net.


At a high level we have:
  a) Start a drive_mirror to a remote NBD server
  b) Fail the network connection to the NBD server
  c) Cancel the block-job (apparently works)
  d) Start a migration
  e) Hangs at the end of migration

or
  d2) Try and quit (i.e. q) - hangs

The problem is that the TCP connection to the NBD device blocks, so things hang in bdrv_flush_all() - although I'm not sure why that causes an HMP hang for the migrate case.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.3.0-29.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. You need two hosts, a 'source' and a 'destination', with two network connections between them, so that you can take one down and still reach the machines over the other.  Let's assume the 'source' can talk to the 'destination' using the name 'destinationname', and that on the destination that name corresponds to ethernet device em2
2. You need a VM image of your favourite OS
3. On both hosts run:
   /usr/libexec/qemu-kvm -nographic -machine pc,accel=kvm -m 2048 -drive file=/home/localvms/local-f20b.qcow2,if=none,id=foo -device virtio-blk-pci,drive=foo,id=food -S
4. On the destination run:
  nbd_server_start :8889
  nbd_server_add -w foo
5. On the source run:
  drive_add 0 id=remote,file=nbd:destinationname:8889:exportname=foo,if=none
  drive_mirror foo remote

6. On the destination take down the interface that has destinationname
  ifdown em2

 You now see on the source that info block-jobs shows the job not advancing
7. On the source run (a QMP sketch of the equivalent of steps 3-7 follows these steps):
 migrate_set_speed 100G
 migrate -d "exec:cat > /dev/null"
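
Since the summary covers QMP as well, the same flow can be driven over QMP.  Below is a minimal, untested Python sketch; it assumes both QEMU instances were additionally started with a QMP socket (e.g. -qmp unix:/tmp/qmp-src.sock,server,nowait on the source and -qmp unix:/tmp/qmp-dst.sock,server,nowait on the destination).  The socket paths, the 0.0.0.0 bind address and the qmp_open()/qmp_cmd() helpers are illustrative, not part of the HMP steps above.

  #!/usr/bin/env python3
  # Untested sketch of steps 3-7 over QMP.  Port and export name mirror the
  # HMP steps above; the helpers below are ad hoc, not a QEMU-provided API.
  import json
  import socket

  def qmp_open(path):
      """Connect to a QMP unix socket and consume the greeting banner."""
      sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
      sock.connect(path)
      f = sock.makefile("rw")
      json.loads(f.readline())                  # greeting
      return f

  def qmp_cmd(f, name, **arguments):
      """Send one QMP command and return the first non-event reply."""
      msg = {"execute": name}
      if arguments:
          msg["arguments"] = arguments
      f.write(json.dumps(msg) + "\n")
      f.flush()
      while True:
          reply = json.loads(f.readline())
          if "event" not in reply:              # skip asynchronous events
              return reply

  # Destination: export drive "foo" over NBD (equivalent of step 4).
  dst = qmp_open("/tmp/qmp-dst.sock")
  qmp_cmd(dst, "qmp_capabilities")
  qmp_cmd(dst, "nbd-server-start",
          addr={"type": "inet", "data": {"host": "0.0.0.0", "port": "8889"}})
  qmp_cmd(dst, "nbd-server-add", device="foo", writable=True)

  # Source: mirror drive "foo" straight to the NBD export (equivalent of step 5).
  src = qmp_open("/tmp/qmp-src.sock")
  qmp_cmd(src, "qmp_capabilities")
  qmp_cmd(src, "drive-mirror", device="foo", sync="full", mode="existing",
          target="nbd:destinationname:8889:exportname=foo")

  # Step 6 is still manual: take down the destination interface (ifdown em2).
  input("Interface down on the destination?  Press Enter to start migration... ")

  # Source: start the migration (equivalent of step 7).
  qmp_cmd(src, "migrate_set_speed", value=100 * 1024 ** 3)
  qmp_cmd(src, "migrate", uri="exec:cat > /dev/null")

  # The migrate command itself returns, but once migration completion tries to
  # flush the dead NBD target, further commands such as query-status are
  # expected to never get a reply - that is the hang this bug is about.
  print(qmp_cmd(src, "query-status"))

With the interface down on the destination, the final query-status above is expected to never return, matching the HMP behaviour described under "Actual results".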

Actual results:
After a second or two the HMP blocks.
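
One way to see the monitor wedge from the outside is a probe with a client-side timeout.  The sketch below is untested; the QMP socket path is the illustrative one from the sketch after the reproduction steps, and QEMU itself has no such timeout - which is exactly the problem.

  #!/usr/bin/env python3
  # Probe whether the QMP monitor is still responding, using a client timeout.
  import json
  import socket

  PROBE_TIMEOUT = 5.0                     # seconds

  sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
  sock.settimeout(PROBE_TIMEOUT)
  sock.connect("/tmp/qmp-src.sock")
  f = sock.makefile("rw")

  try:
      json.loads(f.readline())            # greeting
      f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
      f.flush()
      json.loads(f.readline())
      f.write(json.dumps({"execute": "query-status"}) + "\n")
      f.flush()
      print("monitor replied:", f.readline().strip())
  except socket.timeout:
      # Once the flush hangs with the iothread lock held, these reads are
      # expected never to complete.
      print("no reply within %.0fs - monitor appears blocked" % PROBE_TIMEOUT)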

Expected results:
We should never block the HMP.

Additional info:
Cases where we could hit this include:
   a) Just a dead NBD server (probably also NFS, etc.)
   b) A migration to a host that dies during the migration, where you want to cancel it and then migrate somewhere else.
Comment 2 Dr. David Alan Gilbert 2015-10-09 13:10:58 EDT
The path that causes the HMP lock is that in migration_completion we do a lock_iothread, call vm_stop_force_state(RUN_STATE_FINISH_MIGRATE), do the device state save, and then unlock the iothread.

The vm_stop_force_state does a bdrv_flush_all(), which I think is where it hangs.
One possibility is for us to do an earlier flush (I think there's a patch from Intel somewhere that does that for performance reasons); that would at least cause the migration thread to block before it locks the iothread, unless you got really unlucky and the destination failed after that point.
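
To make the ordering argument concrete, here is a toy sketch (plain Python, not QEMU code; the lock and the *_stub functions are stand-ins invented for the sketch):

  import threading

  iothread_lock = threading.Lock()         # stand-in for the big QEMU lock

  def bdrv_flush_all_stub():
      """Stand-in for bdrv_flush_all(); with a dead NBD peer this is what hangs."""

  def save_device_state_stub():
      """Stand-in for stopping the VM and writing out the device state."""

  def completion_today():
      # Current ordering: the flush runs inside vm_stop_force_state() with the
      # iothread lock held, so a hung flush also locks out the monitor.
      with iothread_lock:
          bdrv_flush_all_stub()
          save_device_state_stub()

  def completion_with_early_flush():
      # "Earlier flush" idea: flush before taking the lock; a dead target then
      # only blocks the migration thread, unless it dies after this point.
      bdrv_flush_all_stub()
      with iothread_lock:
          save_device_state_stub()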

However, even then we need a way of forcibly stopping the nbd client to be able to free the migration thread.
Comment 3 Stefan Hajnoczi 2015-10-14 10:50:14 EDT
This is a design limitation in QEMU today.  There are synchronous "wait for all I/Os" points in the code.

We need to move to a model with timeouts so these operations can fail.  That may mean that migration fails and requires the user to issue some kind of "force remove" command to take down the broken NBD device.

None of this exists yet and I've seen other BZs related to the same issue.  I suggest we keep this around and keep moving this BZ to the next release until we have time to tackle this issue.
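
Purely to illustrate the kind of "operations can fail" model meant here (not an actual QEMU design; flush_device() and flush_all_with_timeout() are invented for the sketch): flush each device against a deadline and report the stragglers, instead of waiting forever.

  from concurrent.futures import ThreadPoolExecutor, TimeoutError

  def flush_device(dev):
      """Stand-in for a per-device flush; on a dead NBD peer this never returns."""

  def flush_all_with_timeout(devices, timeout=30.0):
      """Flush every device, returning the ones that did not finish in time."""
      pool = ThreadPoolExecutor(max_workers=max(len(devices), 1))
      futures = {pool.submit(flush_device, dev): dev for dev in devices}
      failed = []
      for fut, dev in futures.items():
          try:
              fut.result(timeout=timeout)
          except TimeoutError:
              # The caller can now fail the migration cleanly and offer a
              # "force remove" for the broken device instead of hanging.
              failed.append(dev)
      pool.shutdown(wait=False)            # don't block on hung worker threads
      return failed

  if __name__ == "__main__":
      print(flush_all_with_timeout(["foo", "remote"]))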
Comment 4 Dr. David Alan Gilbert 2015-10-14 13:10:59 EDT
(In reply to Stefan Hajnoczi from comment #3)
> This is a design limitation in QEMU today.  There are synchronous "wait for
> all I/Os" points in the code.
> 
> We need to move to a model with timeouts so these operations can fail.  That
> may mean that migration fails and requires the user to issue some kind of
> "force remove" command to take down the broken NBD device.

Thinking about the 'force remove' was what initially got me thinking about this, and it would solve the worst migration case, i.e. a migration to a machine that dies, after which you try a new migration: libvirt could do that 'force remove' when it kills the block job doing the disk copy as part of the failed migration.

Dave

> None of this exists yet and I've seen other BZs related to the same issue. 
> I suggest we keep this around and keep moving this BZ to the next release
> until we have time to tackle this issue.
Comment 5 Ademar Reis 2015-12-28 09:46:27 EST
I'm assuming this affects QMP as well, otherwise we would close this BZ as WONTFIX, because HMP is not supported.

I'm adding QMP to the summary.
Comment 6 Ademar Reis 2015-12-28 09:47:27 EST
See also: Bug 1285453
Comment 8 juzhang 2017-06-08 07:59:03 EDT
Hi Qunfang,

Feel free to update the QE contact.
