Bug 1720126 - Qemu coredump when resume vm during block mirror process(data_plane enable)
Summary: Qemu coredump when resume vm during block mirror process(data_plane enable)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
medium
unspecified
Target Milestone: rc
: ---
Assignee: Sergio Lopez
QA Contact: aihua liang
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-06-13 08:14 UTC by aihua liang
Modified: 2019-11-06 07:17 UTC (History)
7 users (show)

Fixed In Version: qemu-kvm-4.1.0-10.module+el8.1.0+4234+33aa4f57
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-06 07:16:08 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:3723 0 None None None 2019-11-06 07:17:06 UTC

Description aihua liang 2019-06-13 08:14:22 UTC
Description of problem:
  Qemu coredump when resume vm during block mirror process(data_plane enable)

Version-Release number of selected component (if applicable):
  kernel version:4.18.0-94.el8.x86_64
  qemu-kvm version:qemu-kvm-4.0.0-3.module+el8.1.0+3265+26c4ed71.x86_64


How reproducible:
100%

Steps to Reproduce:
1.Create an image on NBD Server, then expose it
  #qemu-img create -f qcow2 system.qcow2 20G
  #qemu-nbd -t -p 9000 system.qcow2

2.Start guest with qemu cmds:
   /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -machine q35  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x1  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20190602-221944-MrlxVzia,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20190602-221944-MrlxVzia,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idn20piu  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20190602-221944-MrlxVzia,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20190602-221944-MrlxVzia,path=/var/tmp/seabios-20190602-221944-MrlxVzia,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20190602-221944-MrlxVzia,iobase=0x402 \
    -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
    -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
    -object iothread,id=iothread0 \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -blockdev driver=file,filename=/home/kvm_autotest_root/images/rhel77-64-virtio.qcow2,node-name=file_node \
    -blockdev driver=qcow2,node-name=drive_image1,file=file_node \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pcie.0-root-port-3,addr=0x0,iothread=iothread0 \
    -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci1,bus=pcie.0-root-port-5,addr=0x0 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:33:34:35:36:37,id=idj01pFr,vectors=4,netdev=idMgbx8B,bus=pcie.0-root-port-4,addr=0x0  \
    -netdev tap,id=idMgbx8B,vhost=on \
    -m 7168  \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2  \
    -cpu 'Skylake-Client',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio \
    -qmp tcp:0:3000,server,nowait \

3.Add mirror target by -blockdev-add
   {"execute":"blockdev-add","arguments":{"driver":"nbd","node-name":"mirror","server":{"type":"inet","host":"10.73.73.83","port":"9000"}}}

4.Stop vm
   {"execute":"stop"}

5.Do block mirror to nbd target
   {"execute": "blockdev-mirror", "arguments": { "device": "drive_image1","target": "mirror", "sync": "full", "job-id":"j1"}}

6.Resume vm
   {"execute":"resume"}


Actual results:
After step6, qemu coredump with info:
  (qemu) qemu-kvm: io/channel.c:421: qio_channel_restart_write: Assertion `qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)' failed.
block_mirror.txt: line 37: 19353 Aborted                 (core dumped) /usr/libexec/qemu-kvm -name 'avocado-vt-vm1' -machine q35 -nodefaults -device VGA,bus=pcie.0,addr=0x1 -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20190602-221944-MrlxVzia,server,nowait -mon chardev=qmp_id_qmpmonitor1,mode=control ...

gdb info of coredump:
 #gdb -c core.qemu-kvm.0.70e60df7fb19498ca5e0105698bbd663.19353.1560412984000000
 Core was generated by `/usr/libexec/qemu-kvm -name avocado-vt-vm1 -machine q35 -nodefaults -device VGA'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	  return ret;
[Current thread is 1 (Thread 0x7f4925dd7700 (LWP 19355))]
Missing separate debuginfos, use: yum debuginfo-install spice-server-0.14.2-1.el8.x86_64
(gdb) bt
#0  0x00007f492cf8b93f in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f492cf75c95 in __GI_abort () at abort.c:79
#2  0x00007f492cf75b69 in __assert_fail_base
    (fmt=0x7f492d0dc940 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x56073d48a360 "qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)", file=0x56073d48a326 "io/channel.c", line=421, function=<optimized out>) at assert.c:92
#3  0x00007f492cf83df6 in __GI___assert_fail
    (assertion=assertion@entry=0x56073d48a360 "qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)", file=file@entry=0x56073d48a326 "io/channel.c", line=line@entry=421, function=function@entry=0x56073d48a490 <__PRETTY_FUNCTION__.18126> "qio_channel_restart_write") at assert.c:101
#4  0x000056073d29ad80 in qio_channel_restart_write (opaque=<optimized out>) at io/channel.c:420
#5  0x000056073d29ad80 in qio_channel_restart_write (opaque=<optimized out>) at io/channel.c:414
#6  0x000056073d2ee6a2 in aio_dispatch_handlers (ctx=ctx@entry=0x56073e17f810) at util/aio-posix.c:441
#7  0x000056073d2ef267 in aio_poll (ctx=0x56073e17f810, blocking=blocking@entry=true) at util/aio-posix.c:728
#8  0x000056073d0ddba4 in iothread_run (opaque=0x56073e167d60) at iothread.c:75
#9  0x000056073d2f1314 in qemu_thread_start (args=0x56073e180550) at util/qemu-thread-posix.c:502
#10 0x00007f492d31f2de in start_thread (arg=<optimized out>) at pthread_create.c:486
#11 0x00007f492d050653 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95


Expected results:
 VM can be resumed during block mirror process.

Additional info:
 When disable dataplane, don't hit this issue.

Comment 2 Sergio Lopez 2019-09-11 11:34:09 UTC
This issue is fixed by Kevin's AioContext management patch series included in qemu-4.1.0.

Comment 4 aihua liang 2019-09-18 03:59:44 UTC
Verified on qemu-kvm-4.1.0-10.module+el8.1.0+4234+33aa4f57.x86_64, the bug has been fixed. Set bug's status to "Verified".

Test steps:
 1.Create an image on NBD Server, then expose it
  #qemu-img create -f qcow2 system.qcow2 20G
  #qemu-nbd -t -p 9000 system.qcow2

2.Start guest with qemu cmds:
   /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -machine q35  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x1  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20190602-221944-MrlxVzia,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20190602-221944-MrlxVzia,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idn20piu  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20190602-221944-MrlxVzia,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20190602-221944-MrlxVzia,path=/var/tmp/seabios-20190602-221944-MrlxVzia,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20190602-221944-MrlxVzia,iobase=0x402 \
    -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
    -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
    -object iothread,id=iothread0 \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -blockdev driver=file,filename=/home/kvm_autotest_root/images/rhel77-64-virtio.qcow2,node-name=file_node \
    -blockdev driver=qcow2,node-name=drive_image1,file=file_node \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pcie.0-root-port-3,addr=0x0,iothread=iothread0 \
    -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci1,bus=pcie.0-root-port-5,addr=0x0 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:33:34:35:36:37,id=idj01pFr,vectors=4,netdev=idMgbx8B,bus=pcie.0-root-port-4,addr=0x0  \
    -netdev tap,id=idMgbx8B,vhost=on \
    -m 7168  \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2  \
    -cpu 'Skylake-Client',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio \
    -qmp tcp:0:3000,server,nowait \

3.Add mirror target by -blockdev-add
   {"execute":"blockdev-add","arguments":{"driver":"nbd","node-name":"mirror","server":{"type":"inet","host":"10.66.144.83","port":"9000"}}}

4.Stop vm
   {"execute":"stop"}
   {"timestamp": {"seconds": 1568778986, "microseconds": 479039}, "event": "STOP"}
{"return": {}}

5.Do block mirror to nbd target
   {"execute": "blockdev-mirror", "arguments": { "device": "drive_image1","target": "mirror", "sync": "full", "job-id":"j1"}}
   {"timestamp": {"seconds": 1568778121, "microseconds": 645288}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "j1"}}
{"timestamp": {"seconds": 1568778121, "microseconds": 645331}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "j1"}}
{"return": {}}


6.Resume vm
   {"execute":"cont"}
   {"timestamp": {"seconds": 1568778993, "microseconds": 617882}, "event": "RESUME"}
   
   {"timestamp": {"seconds": 1568778169, "microseconds": 615809}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "j1"}}
{"timestamp": {"seconds": 1568778169, "microseconds": 616038}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "j1"}}
{"timestamp": {"seconds": 1568778454, "microseconds": 290246}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "j1"}}
{"timestamp": {"seconds": 1568778454, "microseconds": 290289}, "event": "BLOCK_JOB_READY", "data": {"device": "j1", "len": 21475557376, "offset": 21475557376, "speed": 0, "type": "mirror"}}

7.Complete block job.
   {"execute":"block-job-complete","arguments":{"device":"j1"}}
{"return": {}}
{"timestamp": {"seconds": 1568778841, "microseconds": 698299}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "j1"}}
{"timestamp": {"seconds": 1568778841, "microseconds": 698342}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "j1"}}
{"timestamp": {"seconds": 1568778841, "microseconds": 698414}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "j1", "len": 21480931328, "offset": 21480931328, "speed": 0, "type": "mirror"}}
{"timestamp": {"seconds": 1568778841, "microseconds": 698471}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "j1"}}
{"timestamp": {"seconds": 1568778841, "microseconds": 698505}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "j1"}}

8.Check block info:
  (qemu) info block
mirror: nbd://10.66.144.83:9000 (nbd)
    Attached to:      /machine/peripheral/image1/virtio-backend
    Cache mode:       writeback

9.Shutdown vm.
  {"execute":"system_powerdown"}
{"return": {}}
{"timestamp": {"seconds": 1568779161, "microseconds": 875201}, "event": "POWERDOWN"}
{"timestamp": {"seconds": 1568779164, "microseconds": 168019}, "event": "SHUTDOWN", "data": {"guest": true, "reason": "guest-shutdown"}}

Comment 6 errata-xmlrpc 2019-11-06 07:16:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723


Note You need to log in before you can comment on or make changes to this bug.