Bug 1527532
| Summary: | Unable to live migrate vm in DPDK environment | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Eyal Dannon <edannon> |
| Component: | openvswitch | Assignee: | Sahid Ferdjaoui <sferdjao> |
| Status: | CLOSED NOTABUG | QA Contact: | Ofer Blaut <oblaut> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 10.0 (Newton) | CC: | apevec, atelang, berrange, chrisw, dasmith, edannon, eglynn, fbaudin, kchamart, ksundara, lyarwood, mbooth, rhos-maint, samccann, sbandyop, sbauza, sferdjao, sgordon, skramaja, srevivo, stephenfin, vchundur, vromanso, yrachman |
| Target Milestone: | async | Keywords: | Reopened, Triaged, ZStream |
| Target Release: | 10.0 (Newton) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-04-05 07:16:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1542107, 1543165, 1543166 | | |
| Bug Blocks: | | | |
| Attachments: | | | |
Description
Eyal Dannon
2017-12-19 12:37:40 UTC
Created attachment 1370008
Compute-1 sosreport
I can't find the bugzilla, but I think there is a known issue with the command 'openstack server migrate' when using the '--block-migration' option. Can you try with the command "nova live-migration --block-migrate"? Also, can you provide a sosreport of the source host, since that is where we will have the error message returned by libvirt/QEMU?

Oh, I see compute-1 is the source host, but there is not a lot of information regarding an error related to a DPDK context. Please try with the nova command.

Hi,
[stack@undercloud-0 ~]$ nova live-migration --block-migrate test
Still getting into error state:
| OS-EXT-STS:vm_state | error
2017-12-25 07:58:33.407 2698 ERROR oslo_messaging.rpc.server InstanceNotFound: Instance instance-0000000b could not be found.
2017-12-25 07:58:33.407 2698 ERROR oslo_messaging.rpc.server
2017-12-25 07:58:45.182 2698 INFO nova.compute.manager [-] [instance: 6e269ae4-bfc7-4dd9-903a-8cd94e480271] VM Stopped (Lifecycle Event)
[root@compute-0 ~]# ll /var/lib/nova/instances/6e269ae4-bfc7-4dd9-903a-8cd94e480271/
total 56644
-rw-------. 1 root root 0 Dec 25 07:58 console.log
-rw-r--r--. 1 root root 57999360 Dec 25 07:58 disk
-rw-r--r--. 1 nova nova 78 Dec 25 07:58 disk.info
Any additional info I could provide? Would you like to take a look at the setup?
Thanks,

It seems that something went wrong during the post-live-migration step, but the logs you have reported do not include DEBUG output, so we can't investigate the root cause any further. Can you enable debug in nova.conf and reproduce the case?

Hum, so the instance is crashing on the destination host.
...
2018-01-10T13:36:07.894624Z qemu-kvm: -chardev pty,id=charserial1: char device redirected to /dev/pts/1 (label charserial1)
2018-01-10T13:36:11.366484Z qemu-kvm: Not a migration stream
2018-01-10T13:36:11.366722Z qemu-kvm: load of migration failed: Invalid argument
2018-01-10 13:36:11.596+0000: shutting down, reason=crashed
I'm discussing this with dgilbert and continuing the investigation...
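For anyone triaging a similar report, the destination-side log lines quoted above can be pulled out of a libvirt domain log mechanically. The following is only a minimal sketch: the function name `summarize_qemu_log` and both regular expressions are assumptions made for illustration, based solely on the line shapes seen in this ticket, and are not part of nova, libvirt, or QEMU.

```python
import re

# Matches QEMU lines such as
# "2018-01-10T13:36:11.366484Z qemu-kvm: Not a migration stream"
QEMU_LINE = re.compile(r"^(?P<ts>\S+Z) qemu-kvm: (?P<msg>.*)$")

# Matches libvirt lines such as
# "2018-01-10 13:36:11.596+0000: shutting down, reason=crashed"
SHUTDOWN_LINE = re.compile(r"shutting down, reason=(?P<reason>\w+)")


def summarize_qemu_log(text):
    """Return (qemu_messages, shutdown_reasons) found in a libvirt domain log."""
    messages, reasons = [], []
    for line in text.splitlines():
        line = line.strip()
        qemu = QEMU_LINE.match(line)
        if qemu:
            # The pty redirection notice is informational, so it is skipped.
            if "char device redirected" not in qemu.group("msg"):
                messages.append(qemu.group("msg"))
            continue
        shutdown = SHUTDOWN_LINE.search(line)
        if shutdown:
            reasons.append(shutdown.group("reason"))
    return messages, reasons


# Log excerpt copied from the comment above.
SAMPLE = """\
2018-01-10T13:36:07.894624Z qemu-kvm: -chardev pty,id=charserial1: char device redirected to /dev/pts/1 (label charserial1)
2018-01-10T13:36:11.366484Z qemu-kvm: Not a migration stream
2018-01-10T13:36:11.366722Z qemu-kvm: load of migration failed: Invalid argument
2018-01-10 13:36:11.596+0000: shutting down, reason=crashed
"""

if __name__ == "__main__":
    msgs, why = summarize_qemu_log(SAMPLE)
    for msg in msgs:
        print("qemu-kvm:", msg)
    print("shutdown reasons:", why)
```

Running the same function against the log on both the source and the destination host makes it easier to see on which side the first error appears.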
[root@compute-1 ~]# ovs-vsctl show
e333b920-a3df-4a7f-9256-0fb90824e9c8
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port int-br-link
            Interface int-br-link
                type: patch
                options: {peer=phy-br-link}
        Port "vhue13713ea-58"
            tag: 8
            Interface "vhue13713ea-58"
                type: dpdkvhostuser
    Bridge br-link
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
        Port br-link
            Interface br-link
                type: internal
        Port phy-br-link
            Interface phy-br-link
                type: patch
                options: {peer=int-br-link}
    ovs_version: "2.6.1"

[root@compute-1 ~]# ovs-vsctl list interface vhue13713ea-58
_uuid : 58955a74-51c3-4c7a-b92c-ce4e81aba761
admin_state : up
bfd : {}
bfd_status : {}
cfm_fault : []
cfm_fault_status : []
cfm_flap_count : []
cfm_health : []
cfm_mpid : []
cfm_remote_mpids : []
cfm_remote_opstate : []
duplex : []
error : []
external_ids : {attached-mac="fa:16:3e:09:79:34", iface-id="e13713ea-58e2-4f2e-9e21-8923accdd0c4", iface-status=active, vm-uuid="1719f566-7903-4cff-8f75-3d2801f78f66"}
ifindex : 0
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current : []
link_resets : 0
link_speed : []
link_state : up
lldp : {}
mac : []
mac_in_use : "00:00:00:00:00:00"
mtu : 1496
mtu_request : 1496
name : "vhue13713ea-58"
ofport : 9
ofport_request : []
options : {}
other_config : {}
statistics : {"rx_1024_to_1518_packets"=1, "rx_128_to_255_packets"=26, "rx_1523_to_max_packets"=0, "rx_1_to_64_packets"=16, "rx_256_to_511_packets"=4, "rx_512_to_1023_packets"=0, "rx_65_to_127_packets"=363, rx_bytes=38586, rx_dropped=0, rx_errors=0, rx_packets=409, tx_bytes=45541, tx_packets=468}
status : {}
type : dpdkvhostuser

[root@compute-1 ~]# cat /var/log/libvirt/qemu/instance-00000008.log
2018-01-10 15:38:19.809+0000: starting up libvirt version: 3.2.0, package: 14.el7_4.7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2017-12-19-04:58:04, x86-041.build.eng.bos.redhat.com), qemu version: 2.9.0(qemu-kvm-rhev-2.9.0-16.el7_4.13), hostname: compute-1.localdomain
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000008,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-8-instance-00000008/master-key.aes -machine pc-i440fx-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Client-IBRS,ss=on,hypervisor=on,tsc_adjust=on,pdpe1gb=on,mpx=off,xsavec=off,xgetbv1=off -m 4096 -realtime mlock=off -smp 6,sockets=3,cores=1,threads=2 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/8-instance-00000008,share=yes,size=4294967296,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-5,memdev=ram-node0 -uuid 1719f566-7903-4cff-8f75-3d2801f78f66 -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=14.0.8-5.el7ost,serial=e7b3bfa8-30e2-42ce-95c6-58637aa201a5,uuid=1719f566-7903-4cff-8f75-3d2801f78f66,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-8-instance-00000008/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/1719f566-7903-4cff-8f75-3d2801f78f66/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev socket,id=charnet0,path=/var/run/openvswitch/vhue13713ea-58 -netdev vhost-user,chardev=charnet0,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:09:79:34,bus=pci.0,addr=0x3 -add-fd set=0,fd=27 -chardev file,id=charserial0,path=/dev/fdset/0,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 10.100.120.112:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
2018-01-10T15:38:19.954121Z qemu-kvm: -chardev pty,id=charserial1: char device redirected to /dev/pts/1 (label charserial1)
2018-01-10 16:01:17.447+0000: initiating migration
2018-01-10T16:01:17.453777Z qemu-kvm: Failed to read msg header. Read -1 instead of 12. Original request 6.
2018-01-10T16:01:17.453998Z qemu-kvm: vhost_set_log_base failed: Input/output error (5)
2018-01-10T16:01:17.454060Z qemu-kvm: Failed to set msg fds.
2018-01-10T16:01:17.454076Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2018-01-10T16:01:17.454090Z qemu-kvm: Failed to set msg fds.
2018-01-10T16:01:17.454111Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2018-01-10T16:01:17.454125Z qemu-kvm: Failed to set msg fds.
2018-01-10T16:01:17.454138Z qemu-kvm: vhost_set_features failed: Invalid argument (22)
2018-01-10 16:01:17.697+0000: shutting down, reason=crashed

Based on the errors, it seems that we are hitting "Issue2" of bug 1450680. I'm marking this bug as a duplicate, even though we have not configured the interface to use 2 queues and do not have traffic in the guest.

*** This bug has been marked as a duplicate of bug 1450680 ***

I was unable to reproduce the issue with the latest puddle. I suspect an issue in the OVS/DPDK configuration. Please re-open if necessary.
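Since the closing comment points at OVS/DPDK configuration, one way to check for a mismatch is to compare the `ovs-vsctl list interface` record of the vhost-user port on the source and destination hosts. The sketch below assumes the plain `column : value` text layout shown earlier in this ticket; the helper names `parse_ovs_record` and `diff_records` are hypothetical, and the destination record is invented purely for illustration (no destination-side output was posted in this bug).

```python
def parse_ovs_record(text):
    """Parse 'column : value' lines from `ovs-vsctl list` into a dict of raw strings."""
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, so values that contain colons
        # (MAC addresses, for example) stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            record[key.strip()] = value.strip()
    return record


def diff_records(source, dest,
                 columns=("type", "admin_state", "link_state", "mtu", "mtu_request")):
    """Return {column: (source_value, dest_value)} for columns that differ."""
    return {
        col: (source.get(col), dest.get(col))
        for col in columns
        if source.get(col) != dest.get(col)
    }


# Subset of the compute-1 record quoted in this ticket.
SOURCE_PORT = """\
admin_state : up
link_state : up
mtu : 1496
mtu_request : 1496
name : "vhue13713ea-58"
type : dpdkvhostuser
"""

# HYPOTHETICAL destination record, made up for this example only.
DEST_PORT = """\
admin_state : up
link_state : up
mtu : 1500
mtu_request : 1500
name : "vhue13713ea-58"
type : dpdkvhostuser
"""

if __name__ == "__main__":
    differences = diff_records(parse_ovs_record(SOURCE_PORT),
                               parse_ovs_record(DEST_PORT))
    for column, (src, dst) in differences.items():
        print(f"{column}: source={src} destination={dst}")
```

Values are kept as raw strings on purpose, so the comparison reports exactly what `ovs-vsctl` printed rather than a reinterpreted form.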