Bug 1902631
| Summary: | [RHOSP 13 to 16.1 Upgrades][OvS-DPDK] DPDK vms fail to live-migrate between 13->16.1 upgrade | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Yadnesh Kulkarni <ykulkarn> |
| Component: | documentation | Assignee: | Maxime Coquelin <maxime.coquelin> |
| Status: | CLOSED DUPLICATE | QA Contact: | nlevinki <nlevinki> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 16.1 (Train) | CC: | cfields, dgilbert, dvd, fbaudin, fhallal, fleitner, hakhande, i.maximets, kchamart, kmehta, kthakre, maxime.coquelin, mburns, morazi, msufiyan, smooney, yrachman |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1917817 2244628 (view as bug list) | Environment: | |
| Last Closed: | 2021-01-27 13:51:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1916832, 1917817 | | |
| Bug Blocks: | 2244628 | | |
Description
Yadnesh Kulkarni
2020-11-30 09:03:51 UTC
Hello, I've encountered a similar problem during an attempt to live-migrate a DPDK instance from an older compute node (on RHOSP 13) to an upgraded compute node (RHOSP 16.1).

Here is the error snippet:

~~~
2020-12-04 10:00:21.134 8 ERROR nova.virt.libvirt.driver [-] [instance: b129442b-e162-490f-b30c-7e1d99dce35b] Live Migration failure: internal error: qemu unexpectedly closed the monitor:
2020-12-04T10:00:09.889529Z qemu-kvm: -chardev socket,id=charnet0,path=/var/lib/vhost_sockets/vhu25fdafef-e9,server: info: QEMU waiting for connection on: disconnected:unix:/var/lib/vhost_sockets/vhu25fdafef-e9,server
2020-12-04T10:00:12.080859Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2020-12-04T10:00:20.705320Z qemu-kvm: Features 0x130afe7a2 unsupported. Allowed features: 0x178bfa7e6
2020-12-04T10:00:20.705355Z qemu-kvm: Failed to load virtio-net:virtio
2020-12-04T10:00:20.705364Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-net'
2020-12-04T10:00:20.705722Z qemu-kvm: load of migration failed: Operation not permitted: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor:
2020-12-04T10:00:09.889529Z qemu-kvm: -chardev socket,id=charnet0,path=/var/lib/vhost_sockets/vhu25fdafef-e9,server: info: QEMU waiting for connection on: disconnected:unix:/var/lib/vhost_sockets/vhu25fdafef-e9,server
~~~

The instance was rolled back to the source compute node. I've halted the upgrade process for now, to complete the migration process first. If there are any specific logs needed, I can help with that information.
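The decisive line in the trace above is "Features 0x130afe7a2 unsupported. Allowed features: 0x178bfa7e6": the destination QEMU rejects the incoming virtio-net device state because the source negotiated virtio feature bits the destination does not allow. A minimal sketch (not part of the bug or any Red Hat tooling) of how to find the offending bits from the two masks in the log:

```python
# Feature masks copied verbatim from the migration failure log.
source_features = 0x130AFE7A2   # negotiated on the RHOSP 13 source node
allowed_features = 0x178BFA7E6  # accepted by the RHOSP 16.1 destination node

# The destination refuses the device state when the source's feature
# set is not a subset of the destination's allowed set.
offending = source_features & ~allowed_features

print(f"offending mask: {offending:#x}")                    # -> 0x4000
print([bit for bit in range(64) if offending >> bit & 1])   # -> [14]
```

In this trace only one bit differs, bit 14, which is why a single offload is enough to make the whole migration fail.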
Qemu-kvm versions:

On destination node (RHEL 8.2, within the nova_libvirt container):

~~~
qemu-kvm-common-4.2.0-29.module+el8.2.1+7990+27f1e480.4.x86_64
qemu-kvm-block-curl-4.2.0-29.module+el8.2.1+7990+27f1e480.4.x86_64
qemu-kvm-core-4.2.0-29.module+el8.2.1+7990+27f1e480.4.x86_64
~~~

On source node:

~~~
qemu-kvm-common-rhev-2.12.0-48.el7_9.1.x86_64
qemu-kvm-rhev-2.12.0-48.el7_9.1.x86_64
~~~

(In reply to Ketan Mehta from comment #6)

Please refer to this note: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/director_installation_and_usage/index#overcloud-storage

You have two options: use Ceph or external Ceph. For LVM, the upgrade use case will pass without workloads only; it is tested but not a supported use case.

For OvS+DPDK we have a BZ with a workaround: https://bugzilla.redhat.com/show_bug.cgi?id=1895887

Sorry, I meant to update this on Friday.
To avoid the proliferation of bugs, I'm going to close this as a duplicate of the existing DDF bug: https://bugzilla.redhat.com/show_bug.cgi?id=1916869

In our internal call we confirmed the assertion that the work required to make this work is excessive, and would be an unreasonable amount of technical debt to maintain given the VM would eventually have to be hard-rebooted anyway. For that reason this will be addressed by a documentation update, to note that live migration with OvS-DPDK is not supported during FFU due to the changes required to ensure VMs only negotiate valid offloads when using OvS-DPDK. All approaches we explored either require a VM reboot, or a regression of the offload negotiation fix followed by an eventual VM reboot, so it is our view that it is better to take the reboot upfront in the form of a cold migration.

*** This bug has been marked as a duplicate of bug 1916869 ***

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days
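The "valid offloads" mentioned in the closing comment can be read directly off the feature masks in the migration failure log (0x130afe7a2 vs. 0x178bfa7e6). The sketch below maps the disallowed bit to its name using the virtio-net feature-bit numbering from the virtio specification; the bit names are from that spec, not from this bug, and the dictionary is only the subset relevant here:

```python
# virtio-net feature bit names (offload-related subset), per the
# virtio specification's virtio-net feature bit numbering.
VIRTIO_NET_FEATURES = {
    7:  "VIRTIO_NET_F_GUEST_TSO4",
    8:  "VIRTIO_NET_F_GUEST_TSO6",
    10: "VIRTIO_NET_F_GUEST_UFO",
    11: "VIRTIO_NET_F_HOST_TSO4",
    12: "VIRTIO_NET_F_HOST_TSO6",
    14: "VIRTIO_NET_F_HOST_UFO",
    15: "VIRTIO_NET_F_MRG_RXBUF",
}

# Bits negotiated on the source but not allowed on the destination.
offending = 0x130AFE7A2 & ~0x178BFA7E6

names = [VIRTIO_NET_FEATURES.get(bit, f"bit {bit}")
         for bit in range(64) if offending >> bit & 1]
print(names)  # -> ['VIRTIO_NET_F_HOST_UFO']
```

If this mapping is right, the VM had negotiated a UFO offload on the qemu-kvm 2.12 source that the vhost-user backend on the upgraded destination no longer advertises, which matches the comment's point that the guest must renegotiate its offloads, and renegotiation requires a reboot (hence the cold migration).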