Bug 1527532
| Summary: | Unable to live migrate vm in DPDK environment | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Eyal Dannon <edannon> |
| Component: | openvswitch | Assignee: | Sahid Ferdjaoui <sferdjao> |
| Status: | CLOSED NOTABUG | QA Contact: | Ofer Blaut <oblaut> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 10.0 (Newton) | CC: | apevec, atelang, berrange, chrisw, dasmith, edannon, eglynn, fbaudin, kchamart, ksundara, lyarwood, mbooth, rhos-maint, samccann, sbandyop, sbauza, sferdjao, sgordon, skramaja, srevivo, stephenfin, vchundur, vromanso, yrachman |
| Target Milestone: | async | Keywords: | Reopened, Triaged, ZStream |
| Target Release: | 10.0 (Newton) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-04-05 07:16:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1542107, 1543165, 1543166 | | |
| Bug Blocks: | | | |
| Attachments: | | | |

Description
Eyal Dannon
2017-12-19 12:37:40 UTC
Created attachment 1370008
Compute-1 sosreport

I can't find the bugzilla, but I think there is a known issue with the 'openstack server migrate' command when using the '--block-migration' option. Can you try the command "nova live-migration --block-migrate"? Also, can you provide the sosreport of the source host, since that is where the error message returned by libvirt/QEMU will be?

Oh, I see compute-1 is the source host, but there is not much information about an error related to the DPDK context. Please try with the nova command.

Hi,

[stack@undercloud-0 ~]$ nova live-migration --block-migrate test

Still getting into error state:

| OS-EXT-STS:vm_state | error

2017-12-25 07:58:33.407 2698 ERROR oslo_messaging.rpc.server InstanceNotFound: Instance instance-0000000b could not be found.
2017-12-25 07:58:33.407 2698 ERROR oslo_messaging.rpc.server
2017-12-25 07:58:45.182 2698 INFO nova.compute.manager [-] [instance: 6e269ae4-bfc7-4dd9-903a-8cd94e480271] VM Stopped (Lifecycle Event)

[root@compute-0 ~]# ll /var/lib/nova/instances/6e269ae4-bfc7-4dd9-903a-8cd94e480271/
total 56644
-rw-------. 1 root root 0 Dec 25 07:58 console.log
-rw-r--r--. 1 root root 57999360 Dec 25 07:58 disk
-rw-r--r--. 1 nova nova 78 Dec 25 07:58 disk.info

Is there any additional info I could provide? Would you like to take a look at the setup?

Thanks,

It seems that something went wrong during the post-live-migration step, but the logs you reported do not include DEBUG output, so we can't investigate the root cause further. Can you enable debug in nova.conf and reproduce the case?

Hmm, so the instance is crashing on the destination host.

...
2018-01-10T13:36:07.894624Z qemu-kvm: -chardev pty,id=charserial1: char device redirected to /dev/pts/1 (label charserial1)
2018-01-10T13:36:11.366484Z qemu-kvm: Not a migration stream
2018-01-10T13:36:11.366722Z qemu-kvm: load of migration failed: Invalid argument
2018-01-10 13:36:11.596+0000: shutting down, reason=crashed

I'm discussing this with dgilbert and continuing the investigation...

[root@compute-1 ~]# ovs-vsctl show
e333b920-a3df-4a7f-9256-0fb90824e9c8
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port int-br-link
            Interface int-br-link
                type: patch
                options: {peer=phy-br-link}
        Port "vhue13713ea-58"
            tag: 8
            Interface "vhue13713ea-58"
                type: dpdkvhostuser
    Bridge br-link
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
        Port br-link
            Interface br-link
                type: internal
        Port phy-br-link
            Interface phy-br-link
                type: patch
                options: {peer=int-br-link}
    ovs_version: "2.6.1"
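
When comparing the OVS/DPDK setup of the source and destination hosts, it can help to reduce the `ovs-vsctl show` output above to a simple interface-to-type mapping. The following is a minimal sketch and not part of the original report: the function name `interfaces_by_type` is hypothetical, the parser only relies on the `Interface` and `type:` lines visible in the output above, and the sample text is an excerpt of that output.

```python
def interfaces_by_type(show_output):
    """Map interface name -> type from `ovs-vsctl show` text.

    Assumption: each `type:` line belongs to the most recent `Interface`
    line, as in the output quoted in this report. Indentation is ignored.
    """
    result = {}
    current = None
    for raw in show_output.splitlines():
        line = raw.strip()
        if line.startswith("Interface "):
            current = line[len("Interface "):].strip('"')
            result[current] = None  # no type line seen yet
        elif line.startswith("type:") and current is not None:
            result[current] = line.split(":", 1)[1].strip()
        elif line.startswith(("Port ", "Bridge ")):
            current = None  # a new port or bridge ends the interface block
    return result


# Excerpt of the compute-1 output from this report.
SAMPLE_SHOW = """\
Bridge br-int
    Port int-br-link
        Interface int-br-link
            type: patch
            options: {peer=phy-br-link}
    Port "vhue13713ea-58"
        tag: 8
        Interface "vhue13713ea-58"
            type: dpdkvhostuser
Bridge br-link
    Port "dpdk0"
        Interface "dpdk0"
            type: dpdk
"""

if __name__ == "__main__":
    for name, iface_type in interfaces_by_type(SAMPLE_SHOW).items():
        print(f"{name}: {iface_type}")
```

Running the same function against the output from both compute nodes and diffing the two dictionaries is one way to spot a vhost-user port that is configured differently on the two hosts.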
[root@compute-1 ~]# ovs-vsctl list interface vhue13713ea-58
_uuid : 58955a74-51c3-4c7a-b92c-ce4e81aba761
admin_state : up
bfd : {}
bfd_status : {}
cfm_fault : []
cfm_fault_status : []
cfm_flap_count : []
cfm_health : []
cfm_mpid : []
cfm_remote_mpids : []
cfm_remote_opstate : []
duplex : []
error : []
external_ids : {attached-mac="fa:16:3e:09:79:34", iface-id="e13713ea-58e2-4f2e-9e21-8923accdd0c4", iface-status=active, vm-uuid="1719f566-7903-4cff-8f75-3d2801f78f66"}
ifindex : 0
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current : []
link_resets : 0
link_speed : []
link_state : up
lldp : {}
mac : []
mac_in_use : "00:00:00:00:00:00"
mtu : 1496
mtu_request : 1496
name : "vhue13713ea-58"
ofport : 9
ofport_request : []
options : {}
other_config : {}
statistics : {"rx_1024_to_1518_packets"=1, "rx_128_to_255_packets"=26, "rx_1523_to_max_packets"=0, "rx_1_to_64_packets"=16, "rx_256_to_511_packets"=4, "rx_512_to_1023_packets"=0, "rx_65_to_127_packets"=363, rx_bytes=38586, rx_dropped=0, rx_errors=0, rx_packets=409, tx_bytes=45541, tx_packets=468}
status : {}
type : dpdkvhostuser
[root@compute-1 ~]# cat /var/log/libvirt/qemu/instance-00000008.log
2018-01-10 15:38:19.809+0000: starting up libvirt version: 3.2.0, package: 14.el7_4.7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2017-12-19-04:58:04, x86-041.build.eng.bos.redhat.com), qemu version: 2.9.0(qemu-kvm-rhev-2.9.0-16.el7_4.13), hostname: compute-1.localdomain
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000008,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-8-instance-00000008/master-key.aes -machine pc-i440fx-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Client-IBRS,ss=on,hypervisor=on,tsc_adjust=on,pdpe1gb=on,mpx=off,xsavec=off,xgetbv1=off -m 4096 -realtime mlock=off -smp 6,sockets=3,cores=1,threads=2 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/8-instance-00000008,share=yes,size=4294967296,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-5,memdev=ram-node0 -uuid 1719f566-7903-4cff-8f75-3d2801f78f66 -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=14.0.8-5.el7ost,serial=e7b3bfa8-30e2-42ce-95c6-58637aa201a5,uuid=1719f566-7903-4cff-8f75-3d2801f78f66,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-8-instance-00000008/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/1719f566-7903-4cff-8f75-3d2801f78f66/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev socket,id=charnet0,path=/var/run/openvswitch/vhue13713ea-58 -netdev vhost-user,chardev=charnet0,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:09:79:34,bus=pci.0,addr=0x3 -add-fd set=0,fd=27 -chardev file,id=charserial0,path=/dev/fdset/0,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 10.100.120.112:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
2018-01-10T15:38:19.954121Z qemu-kvm: -chardev pty,id=charserial1: char device redirected to /dev/pts/1 (label charserial1)
2018-01-10 16:01:17.447+0000: initiating migration
2018-01-10T16:01:17.453777Z qemu-kvm: Failed to read msg header. Read -1 instead of 12. Original request 6.
2018-01-10T16:01:17.453998Z qemu-kvm: vhost_set_log_base failed: Input/output error (5)
2018-01-10T16:01:17.454060Z qemu-kvm: Failed to set msg fds.
2018-01-10T16:01:17.454076Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2018-01-10T16:01:17.454090Z qemu-kvm: Failed to set msg fds.
2018-01-10T16:01:17.454111Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2018-01-10T16:01:17.454125Z qemu-kvm: Failed to set msg fds.
2018-01-10T16:01:17.454138Z qemu-kvm: vhost_set_features failed: Invalid argument (22)
2018-01-10 16:01:17.697+0000: shutting down, reason=crashed
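
The failure signature in the log above (a run of `vhost_*` calls failing right after "initiating migration") can be picked out mechanically when triaging several instance logs. The following is a minimal sketch and not part of the original report: the function name `summarize_vhost_errors` and the regular expression are illustrative assumptions derived only from the lines quoted above, and the sample input is copied from that log.

```python
import re
from collections import Counter

# Matches QEMU lines such as:
#   "... qemu-kvm: vhost_set_log_base failed: Input/output error (5)"
# The pattern is an assumption based only on the lines quoted in this report.
VHOST_FAIL = re.compile(r"qemu-kvm: (vhost_\w+) failed: (.+) \((\d+)\)\s*$")


def summarize_vhost_errors(lines):
    """Count vhost_* failures logged after 'initiating migration'."""
    counts = Counter()
    migrating = False
    for line in lines:
        if "initiating migration" in line:
            migrating = True
            continue
        if not migrating:
            continue  # ignore anything logged before the migration started
        match = VHOST_FAIL.search(line)
        if match:
            call, _message, errno_value = match.groups()
            counts[(call, int(errno_value))] += 1
    return counts


SAMPLE_LOG = """\
2018-01-10 16:01:17.447+0000: initiating migration
2018-01-10T16:01:17.453777Z qemu-kvm: Failed to read msg header. Read -1 instead of 12. Original request 6.
2018-01-10T16:01:17.453998Z qemu-kvm: vhost_set_log_base failed: Input/output error (5)
2018-01-10T16:01:17.454060Z qemu-kvm: Failed to set msg fds.
2018-01-10T16:01:17.454076Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2018-01-10T16:01:17.454090Z qemu-kvm: Failed to set msg fds.
2018-01-10T16:01:17.454111Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2018-01-10T16:01:17.454125Z qemu-kvm: Failed to set msg fds.
2018-01-10T16:01:17.454138Z qemu-kvm: vhost_set_features failed: Invalid argument (22)
2018-01-10 16:01:17.697+0000: shutting down, reason=crashed
""".splitlines()

if __name__ == "__main__":
    for (call, err), count in summarize_vhost_errors(SAMPLE_LOG).items():
        print(f"{call} errno={err} count={count}")
```

The sketch only counts failures; it does not attempt to decide whether the vhost-user backend closed the socket or rejected the request, which still needs the OVS-side logs.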
Based on the errors, it seems that we are hitting "Issue2" of bug 1450680. I'm marking this as a duplicate even though we have not configured the interface to use 2 queues and there is no traffic in the guest.
*** This bug has been marked as a duplicate of bug 1450680 ***
I was unable to reproduce the issue with the latest puddle. I suspect an issue in the OVS/DPDK configuration. Please re-open if necessary.