Red Hat Bugzilla – Bug 1269874
HMP/QMP blocked on failed NBD device (during migration and quit)
Last modified: 2017-06-08 07:59:03 EDT
Description of problem:
It's possible to hang the HMP in various cases if there's a non-responding block device; in this case an NBD server that's fallen off the net.
At a high level we have:
a) Start a drive_mirror to a remote NBD server
b) Fail the network connection to the NBD server
c) Cancel the block-job (apparently works)
d) Start a migration
e) Hangs at the end of migration
d2) Try to quit (i.e. 'q') - also hangs
The problem is that the TCP connection to the NBD device blocks, so things hang in bdrv_flush_all() - although I'm not sure why that causes the HMP to hang for the migrate.
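The blocking behaviour can be sketched as a toy model in Python (all names here are illustrative, not actual QEMU APIs): a synchronous flush that waits for in-flight I/O to complete never returns once the backend is dead, because nothing will ever signal completion.

```python
import threading

# Toy model (names are illustrative, not QEMU APIs): each "device"
# exposes a flush() that blocks until its in-flight I/O completes.
class ToyDevice:
    def __init__(self, name):
        self.name = name
        self.io_done = threading.Event()

    def complete_io(self):
        self.io_done.set()

    def flush(self, timeout=None):
        # A synchronous "wait for all I/Os" point: with no timeout,
        # this never returns if the backend is dead.
        return self.io_done.wait(timeout)

healthy = ToyDevice("local")
dead_nbd = ToyDevice("remote-nbd")
healthy.complete_io()       # the local disk finishes its I/O
# the NBD server fell off the net: complete_io() never runs for it

assert healthy.flush(timeout=1) is True
# The flush of the dead device blocks; we cap the wait only so the
# demo terminates -- the real bdrv_flush_all() has no such cap.
assert dead_nbd.flush(timeout=0.2) is False
```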
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. You need two hosts, source and destination, with two network connections between them, so that you can take one down and still work with the other. Let's assume the source can reach the destination using the name 'destinationname', and that on the destination this corresponds to ethernet device em2.
2. You need a VM image of your favourite OS
3. On both hosts run:
/usr/libexec/qemu-kvm -nographic -machine pc,accel=kvm -m 2048 -drive file=/home/localvms/local-f20b.qcow2,if=none,id=foo -device virtio-blk-pci,drive=foo,id=food -S
4. On the destination, start an NBD server listening on port 8889 (to match the URL used in step 5) and export the drive:
nbd_server_start 0.0.0.0:8889
nbd_server_add -w foo
5. On the source run:
drive_add 0 id=remote,file=nbd:destinationname:8889:exportname=foo,if=none
drive_mirror foo remote
6. On the destination, take down the interface associated with 'destinationname' (em2 in our example).
On the source, 'info block-jobs' now shows the job not advancing.
7. On the source:
migrate -d "exec:cat > /dev/null"
After a second or two the HMP blocks.
We should never block the HMP.
Cases where we could hit this include:
a) Just a dead NBD server (probably also NFS etc)
b) A migrate to a host that dies during the migrate, so you want to cancel that and then migrate somewhere else.
The path that causes the HMP lock-up is that in migration_completion() we:
lock the iothread,
do the device state save,
unlock the iothread.
vm_stop_force_state() does a bdrv_flush_all(), which I think is where it hangs.
One possibility is for us to do an earlier flush (I think there's a patch from Intel somewhere that does this for performance reasons); that would at least cause the migration thread to block before it takes the iothread lock, unless you got really unlucky and the destination failed only after that point.
However, even then we need a way of forcibly stopping the nbd client to be able to free the migration thread.
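The lock ordering above is why the monitor ends up stuck. A toy Python model (the names are illustrative, not QEMU's actual functions): the migration thread takes the iothread lock and then blocks inside the flush, so the monitor thread can never acquire the lock to dispatch the next command.

```python
import threading
import time

# Toy model of the migration_completion() ordering (names are
# illustrative, not QEMU APIs): the migration thread takes the big
# iothread lock, then does a flush that can block indefinitely.
iothread_lock = threading.Lock()
flush_unblocked = threading.Event()   # never set: the NBD server is dead

def migration_thread():
    with iothread_lock:               # "lock the iothread"
        # vm_stop_force_state() -> bdrv_flush_all(): blocks on the dead
        # connection (capped here only so the demo terminates).
        flush_unblocked.wait(timeout=1.0)

t = threading.Thread(target=migration_thread)
t.start()
time.sleep(0.1)                       # let the migration thread grab the lock

# The monitor (HMP/QMP) also needs the iothread lock to dispatch
# commands, so it blocks behind the stuck flush:
monitor_got_lock = iothread_lock.acquire(timeout=0.2)
assert monitor_got_lock is False      # -> the monitor appears hung
t.join()
```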
This is a design limitation in QEMU today. There are synchronous "wait for all I/Os" points in the code.
We need to move to a model with timeouts so these operations can fail. That may mean that migration fails and requires the user to issue some kind of "force remove" command to take down the broken NBD device.
None of this exists yet and I've seen other BZs related to the same issue. I suggest we keep this around and keep moving this BZ to the next release until we have time to tackle this issue.
(In reply to Stefan Hajnoczi from comment #3)
> This is a design limitation in QEMU today. There are synchronous "wait for
> all I/Os" points in the code.
> We need to move to a model with timeouts so these operations can fail. That
> may mean that migration fails and requires the user to issue some kind of
> "force remove" command to take down the broken NBD device.
Thinking about the 'force remove' is what initially got me thinking about this. It would solve the worst migration case - a migration to a machine that dies, after which you try a new migration - because libvirt could issue that 'force remove' when it kills the block job doing the disk copy as part of the failed migration.
> None of this exists yet and I've seen other BZs related to the same issue.
> I suggest we keep this around and keep moving this BZ to the next release
> until we have time to tackle this issue.
I'm assuming this affects QMP as well; otherwise we would close this BZ as WONTFIX, because HMP is not supported.
I'm adding QMP to the summary.
See also: Bug 1285453
Feel free to update the QE contact.