Description of problem:
On a setup with 2 rhevh hosts, migrate a VM from host1 to host2.
While the migration is in progress (it had reached 98% according to the vdsm logs),
block storage on the dst host with an iptables REJECT rule.
In the UI, the VM is stuck in state "migrating from". On both src and dst hosts, "virsh -r list" shows this VM as paused.
On the src host, vdsm.log contains a message that the migration is stuck.
On the src host, "vdsClient -s 0 list" shows these VM details: Status = Migration Source, pauseCode = NOERR
On both hosts, the qemu process of this VM still exists.
It seems that the blocked storage is keeping qemu/libvirt stuck.
Version-Release number of selected component (if applicable):
libvirt-0.10.2-29.el6_5.2.x86_64
Tested on is31 rhevm
Expected results:
Migration should fail when the abortOnError flag is used.
Additional info:
This bug might be related to bug 972675 / bug 1045833.
According to the libvirtd logs from the destination, the qemu-kvm process is stuck and not responding to the "cont" monitor command. Thus libvirtd is waiting for it to resume guest CPUs so that it can let the source know the migration succeeded. There's not much we can do about it, and if qemu-kvm is in D state, I doubt they can solve it easily. To me the best solution seems to be for management (vdsm/rhev) to just abort the migration by killing the destination domain after detecting that the migration is stuck... Anyway, I'm passing this bug to qemu-kvm in case they have more ideas on how to avoid qemu being stuck.
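As an illustration of the "detect that the migration is stuck" part, here is a minimal Python sketch of a progress watchdog. It is not vdsm's actual logic: the class name StallDetector, its timeout parameter, and the idea of feeding it periodic progress samples are all assumptions made for this example. What to do once a stall is reported (for instance destroying the destination domain) is left to the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StallDetector:
    """Hypothetical helper: flags a migration whose progress stops advancing."""

    timeout: float                  # seconds without progress before a stall is reported
    _best: Optional[float] = None   # highest progress value seen so far
    _since: Optional[float] = None  # time at which that value was first seen

    def update(self, now: float, progress: float) -> bool:
        """Record one progress sample; return True once the stall timeout has elapsed."""
        if self._best is None or progress > self._best:
            # Progress advanced, so restart the stall clock.
            self._best = progress
            self._since = now
            return False
        return (now - self._since) >= self.timeout
```

A caller would sample the migration progress periodically, pass it to update() together with a monotonic timestamp, and abort the migration on the first True.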
More info:
=========
- Command used to block storage on dst host:
iptables -A OUTPUT -d <storage server fqdn> -j REJECT
- After unblocking storage on the dst host (with "iptables -D $CHAIN_NAME $LINE_NUMBER"),
the migration completed.
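
For anyone scripting this reproduction, the block/unblock steps above could be wrapped roughly as follows. This is only a sketch: the function names are made up for this example, and the unblock step deletes the rule by its specification ("iptables -D OUTPUT -d ... -j REJECT") rather than by chain name and line number as in the original steps. Running it requires root on the dst host.

```python
import subprocess
from typing import List


def reject_rule(action: str, storage_host: str) -> List[str]:
    """Build the iptables argv that adds ("-A") or deletes ("-D") the REJECT rule."""
    if action not in ("-A", "-D"):
        raise ValueError("action must be '-A' or '-D'")
    return ["iptables", action, "OUTPUT", "-d", storage_host, "-j", "REJECT"]


def block_storage(storage_host: str) -> None:
    """Reject all outgoing traffic to the storage server (needs root)."""
    subprocess.run(reject_rule("-A", storage_host), check=True)


def unblock_storage(storage_host: str) -> None:
    """Remove the rule added by block_storage (needs root)."""
    subprocess.run(reject_rule("-D", storage_host), check=True)
```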
This is the classic problem of QEMU (QMP) getting stuck in D state when the underlying storage is blocked. For reference, see:
https://bugzilla.redhat.com/show_bug.cgi?id=665820#c10
https://bugzilla.redhat.com/show_bug.cgi?id=665820#c18
I'm marking this bug as a dupe of Bug 665820, which requests a QMP info command that is always responsive and/or an event that is delivered in such a situation, but keep in mind that we're not working on such a new feature anytime soon.
So this should be dealt with in the management layer.
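
One way a management layer could check for the D-state condition on Linux is to read the process state from /proc. The sketch below is an assumption about how that might look, not something vdsm is known to do, and the function names are invented for this example. It relies only on the documented layout of /proc/<pid>/stat, where the state letter follows the parenthesised command name.

```python
def parse_proc_state(stat_line: str) -> str:
    """Return the one-letter state field from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ")", so split on the last ")" rather than on whitespace.
    _, _, rest = stat_line.rpartition(")")
    return rest.split()[0]


def is_uninterruptible(pid: int) -> bool:
    """True if the process is currently in uninterruptible sleep (state "D")."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_state(f.read()) == "D"
```

A single sample proves little, since processes enter D state briefly during normal I/O; a caller would want to see the state persist across several checks. Individual threads have their own entries under /proc/<pid>/task/, so a blocked I/O thread may need to be checked there.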
*** This bug has been marked as a duplicate of bug 665820 ***