Bug 1748253
Summary: | QEMU crashes (core dump) when using the integrated NBD server with data-plane | |
---|---|---|---
Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | aihua liang <aliang>
Component: | qemu-kvm | Assignee: | Sergio Lopez <slopezpa>
Status: | CLOSED ERRATA | QA Contact: | aihua liang <aliang>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 8.1 | CC: | coli, eblake, jferlan, jinzhao, jsnow, juzhang, mtessun, qzhang, virt-maint
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2019-11-06 07:19:21 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1741186 | |
Description aihua liang 2019-09-03 08:34:28 UTC
Hi John,

I hit this bug while handling the needinfo of https://bugzilla.redhat.com/show_bug.cgi?id=1652424, can you help to check it?

Thanks,
aliang

Only virtio_scsi + data-plane hits this issue.

*** Bug 1717329 has been marked as a duplicate of this bug. ***

Refer to bug 1717329; cancelling the needinfo to jsnow.

Patch posted upstream: https://lists.gnu.org/archive/html/qemu-block/2019-09/msg00481.html

The actual issue is that the integrated NBD server, if configured with a BlockBackend that uses an iothread (data-plane), crashes after negotiating with the client. The root cause is that we weren't setting the QIOChannel AioContext to the same one as the export's (a sketch of this kind of change is included at the end of this report).

Tested on qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61.x86_64, the issue has been resolved, so setting the bug's status to "Verified".

Test Steps:

1. Start the src guest with qemu cmds:

/usr/libexec/qemu-kvm \
  -name 'avocado-vt-vm1' \
  -machine pc \
  -nodefaults \
  -device VGA,bus=pci.0,addr=0x2 \
  -chardev socket,id=qmp_id_qmp1,path=/var/tmp/monitor-qmp1-20190522-203214-pO8ikKhP,server,nowait \
  -mon chardev=qmp_id_qmp1,mode=control \
  -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20190522-203214-pO8ikKhP,server,nowait \
  -mon chardev=qmp_id_catch_monitor,mode=control \
  -device pvpanic,ioport=0x505,id=idWcUIuL \
  -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20190522-203214-pO8ikKhP,server,nowait \
  -device isa-serial,chardev=serial_id_serial0 \
  -chardev socket,id=seabioslog_id_20190522-203214-pO8ikKhP,path=/var/tmp/seabios-20190522-203214-pO8ikKhP,server,nowait \
  -device isa-debugcon,chardev=seabioslog_id_20190522-203214-pO8ikKhP,iobase=0x402 \
  -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
  -object iothread,id=iothread0 \
  -object iothread,id=iothread1 \
  -blockdev node-name=file_node,driver=file,filename=/home/kvm_autotest_root/images/rhel810-64-virtio-scsi.qcow2 \
  -blockdev node-name=drive_image1,driver=qcow2,file=file_node \
  -device virtio-scsi-pci,iothread=iothread0,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \
  -device scsi-hd,id=image1,drive=drive_image1 \
  -device virtio-scsi-pci,id=virtio_scsi_pci1,bus=pci.0,addr=0x6,iothread=iothread1 \
  -blockdev node-name=file_data,driver=file,filename=/home/data1.qcow2 \
  -blockdev node-name=drive_data1,driver=qcow2,file=file_data \
  -device scsi-hd,drive=drive_data1,id=data1,bus=virtio_scsi_pci1.0,scsi-id=0,lun=1,channel=0,werror=stop,rerror=stop \
  -device virtio-net-pci,mac=9a:f4:f5:f6:f7:f8,id=idJqoo3m,vectors=4,netdev=idiujahB,bus=pci.0,addr=0x5 \
  -netdev tap,id=idiujahB,vhost=on \
  -m 7168 \
  -smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
  -cpu 'Skylake-Client',+kvm_pv_unhalt \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -vnc :0 \
  -rtc base=utc,clock=host,driftfix=slew \
  -boot menu=off,strict=off,order=cdn,once=c \
  -enable-kvm \
  -monitor stdio
2. Create an empty data disk and start the dst guest with cmds:

# qemu-img create -f qcow2 data2.qcow2 2G

/usr/libexec/qemu-kvm \
  -name 'avocado-vt-vm1' \
  -machine pc \
  -nodefaults \
  -device VGA,bus=pci.0,addr=0x2 \
  -chardev socket,id=qmp_id_qmp1,path=/var/tmp/monitor-qmp1-20190522-203214-pO8ikKhQ,server,nowait \
  -mon chardev=qmp_id_qmp1,mode=control \
  -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20190522-203214-pO8ikKhP,server,nowait \
  -mon chardev=qmp_id_catch_monitor,mode=control \
  -device pvpanic,ioport=0x505,id=idWcUIuL \
  -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20190522-203214-pO8ikKhP,server,nowait \
  -device isa-serial,chardev=serial_id_serial0 \
  -chardev socket,id=seabioslog_id_20190522-203214-pO8ikKhP,path=/var/tmp/seabios-20190522-203214-pO8ikKhP,server,nowait \
  -device isa-debugcon,chardev=seabioslog_id_20190522-203214-pO8ikKhP,iobase=0x402 \
  -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
  -object iothread,id=iothread0 \
  -object iothread,id=iothread1 \
  -blockdev node-name=file_node,driver=file,filename=/home/kvm_autotest_root/images/rhel810-64-virtio-scsi.qcow2 \
  -blockdev node-name=drive_image1,driver=qcow2,file=file_node \
  -device virtio-scsi-pci,iothread=iothread0,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \
  -device scsi-hd,id=image1,drive=drive_image1 \
  -device virtio-scsi-pci,id=virtio_scsi_pci1,bus=pci.0,addr=0x6,iothread=iothread1 \
  -blockdev node-name=file_data,driver=file,filename=/home/data2.qcow2 \
  -blockdev node-name=drive_data1,driver=qcow2,file=file_data \
  -device scsi-hd,drive=drive_data1,id=data1,bus=virtio_scsi_pci1.0,scsi-id=0,lun=1,channel=0,werror=stop,rerror=stop \
  -device virtio-net-pci,mac=9a:f4:f5:f6:f7:f8,id=idJqoo3m,vectors=4,netdev=idiujahB,bus=pci.0,addr=0x5 \
  -netdev tap,id=idiujahB,vhost=on \
  -m 7168 \
  -smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
  -cpu 'Skylake-Client',+kvm_pv_unhalt \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -vnc :1 \
  -rtc base=utc,clock=host,driftfix=slew \
  -boot menu=off,strict=off,order=cdn,once=c \
  -enable-kvm \
  -monitor stdio \
  -incoming tcp:0:5000

3. Set migration capabilities in both src and dst.

{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"events","state":true},{"capability":"dirty-bitmaps","state":true},{"capability":"pause-before-switchover","state":true}]}}

4. In dst, start the NBD server and expose the data disk.

{ "execute": "nbd-server-start", "arguments": { "addr": { "type": "inet","data": { "host": "10.73.73.83", "port": "3333" } } } }
{"return": {}}
{ "execute": "nbd-server-add", "arguments":{ "device": "drive_data1", "writable": true } }
{"return": {}}

5. In src, add a bitmap to the data disk and create a new file in the guest.

{ "execute": "block-dirty-bitmap-add", "arguments": {"node": "drive_data1", "name": "bitmap0"}}
(guest)# dd if=/dev/urandom of=test bs=1M count=1000

6. In src, add the target device and do block mirror.
{"execute":"blockdev-add","arguments":{"driver":"nbd","node-name":"mirror","server":{"type":"inet","host":"10.73.73.83","port":"3333"},"export":"drive_data1"}} {"return": {}} {"execute": "blockdev-mirror", "arguments": { "device": "drive_data1","target": "mirror", "sync": "full", "job-id":"j1"}} {"timestamp": {"seconds": 1569727762, "microseconds": 131671}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "j1"}} {"timestamp": {"seconds": 1569727762, "microseconds": 131708}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "j1"}} {"return": {}} {"timestamp": {"seconds": 1569727780, "microseconds": 13812}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "j1"}} {"timestamp": {"seconds": 1569727780, "microseconds": 13867}, "event": "BLOCK_JOB_READY", "data": {"device": "j1", "len": 2147614720, "offset": 2147614720, "speed": 0, "type": "mirror"}} 7. Migrate from src to dst { "execute": "migrate", "arguments": { "uri": "tcp:10.73.224.68:5000"}} {"timestamp": {"seconds": 1569727797, "microseconds": 875204}, "event": "MIGRATION", "data": {"status": "setup"}} {"return": {}} {"timestamp": {"seconds": 1569727797, "microseconds": 895189}, "event": "MIGRATION_PASS", "data": {"pass": 1}} {"timestamp": {"seconds": 1569727797, "microseconds": 895245}, "event": "MIGRATION", "data": {"status": "active"}} {"timestamp": {"seconds": 1569727859, "microseconds": 370179}, "event": "MIGRATION_PASS", "data": {"pass": 2}} {"timestamp": {"seconds": 1569727862, "microseconds": 279190}, "event": "MIGRATION_PASS", "data": {"pass": 3}} {"timestamp": {"seconds": 1569727862, "microseconds": 579658}, "event": "MIGRATION_PASS", "data": {"pass": 4}} {"timestamp": {"seconds": 1569727862, "microseconds": 593908}, "event": "STOP"} {"timestamp": {"seconds": 1569727862, "microseconds": 595872}, "event": "MIGRATION", "data": {"status": "pre-switchover"}} 8. Cancel block job. {"execute":"block-job-cancel","arguments":{"device":"j1"}} {"return": {}} {"timestamp": {"seconds": 1569727876, "microseconds": 671310}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "j1"}} {"timestamp": {"seconds": 1569727876, "microseconds": 671348}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "j1"}} {"timestamp": {"seconds": 1569727876, "microseconds": 671421}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "j1", "len": 2147942400, "offset": 2147942400, "speed": 0, "type": "mirror"}} {"timestamp": {"seconds": 1569727876, "microseconds": 671446}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "j1"}} {"timestamp": {"seconds": 1569727876, "microseconds": 671477}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "j1"}} 9. Continue migrate {"execute":"migrate-continue","arguments":{"state":"pre-switchover"}} {"return": {}} {"timestamp": {"seconds": 1569727888, "microseconds": 547919}, "event": "MIGRATION", "data": {"status": "device"}} {"timestamp": {"seconds": 1569727888, "microseconds": 548552}, "event": "MIGRATION_PASS", "data": {"pass": 5}} {"timestamp": {"seconds": 1569727888, "microseconds": 553978}, "event": "MIGRATION", "data": {"status": "completed"}} 10. Info vm status in src and guest. (src qemu)info status VM status: paused (postmigrate) (dst qemu)info status VM status: running Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. 
If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723
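For reference, below is a minimal sketch of the kind of change described in the root-cause comment above: attaching each NBD client's QIOChannel to the export's AioContext when the export's BlockBackend is moved to an iothread, and detaching it again when that context goes away. The sketch loosely follows the structure of QEMU's nbd/server.c (NBDExport/NBDClient, blk_aio_attached/blk_aio_detach notifiers, client->ioc); those names are assumptions about that file's layout, and this is not the literal upstream patch from the mailing-list link.

/* Sketch only, not the literal upstream fix: keep NBD client channels in
 * the same AioContext as the export. Meant to be read in the context of
 * QEMU's nbd/server.c, where NBDExport and NBDClient are defined; the
 * field and function names here are assumptions. */

/* AioContext notifier, called when the export's BlockBackend is attached
 * to a new AioContext (e.g. the iothread used for data-plane). */
static void blk_aio_attached(AioContext *ctx, void *opaque)
{
    NBDExport *exp = opaque;
    NBDClient *client;

    exp->ctx = ctx;

    /* Without this, each client's QIOChannel keeps running its handlers in
     * the main loop while the export's requests run in the iothread, which
     * is what crashed QEMU after NBD negotiation. */
    QTAILQ_FOREACH(client, &exp->clients, next) {
        qio_channel_attach_aio_context(client->ioc, ctx);
    }
}

/* Counterpart notifier, called before the BlockBackend leaves its current
 * AioContext. */
static void blk_aio_detach(void *opaque)
{
    NBDExport *exp = opaque;
    NBDClient *client;

    QTAILQ_FOREACH(client, &exp->clients, next) {
        qio_channel_detach_aio_context(client->ioc);
    }

    exp->ctx = NULL;
}

In QEMU these notifiers would be registered against the export's BlockBackend with blk_add_aio_context_notifier(), so the client channels follow the export whenever data-plane moves it between the main loop and an iothread.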