Bug 1703916

Summary: QEMU core dump when quitting the VM after an incremental backup with a read-only bitmap is rejected
Product: Red Hat Enterprise Linux 7
Reporter: aihua liang <aliang>
Component: qemu-kvm-rhev
Assignee: John Snow <jsnow>
Status: CLOSED ERRATA
QA Contact: aihua liang <aliang>
Severity: high
Docs Contact:
Priority: high
Version: 7.7
CC: coli, juzhang, mtessun, ngu, qzhang, virt-maint
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-32.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-08-22 09:20:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description aihua liang 2019-04-29 06:03:57 UTC
Description of problem:
 QEMU dumps core when the VM is quit after an incremental backup with a read-only bitmap has been rejected.

Version-Release number of selected component (if applicable):
 kernel version:3.10.0-1037.el7.x86_64
 qemu-kvm-rhev version:qemu-kvm-rhev-2.12.0-27.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start the guest with the following QEMU command line:
   /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20190423-215834-BzwOjODj,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20190423-215834-BzwOjODj,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idapUGH0  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20190423-215834-BzwOjODj,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20190423-215834-BzwOjODj,path=/var/tmp/seabios-20190423-215834-BzwOjODj,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20190423-215834-BzwOjODj,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -object iothread,id=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel77-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=0x4,iothread=iothread0 \
    -drive id=drive_data1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/test.img \
    -device virtio-blk-pci,id=data1,drive=drive_data1,bus=pci.0,addr=0x6,iothread=iothread0 \
    -device virtio-net-pci,mac=9a:84:85:86:87:88,id=idc38p8G,vectors=4,netdev=idFM5N3v,bus=pci.0,addr=0x5  \
    -netdev tap,id=idFM5N3v,vhost=on \
    -m 2048  \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2  \
    -cpu 'Westmere',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,strict=off,order=cdn,once=c \
    -enable-kvm \
    -monitor stdio \
    -qmp tcp:0:3000,server,nowait

2. Add a persistent bitmap:
  { "execute": "block-dirty-bitmap-add", "arguments": {"node": "drive_image1", "name": "bitmap0","persistent":true}}

3. Shut down the VM, then start it again with read-only=on set on the system drive:
  ...
  -object iothread,id=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel77-64-virtio.qcow2,read-only=on \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=0x4,iothread=iothread0 \
  ...

  During the VM boot process, BLOCK_IO_ERROR events are reported:
   {"timestamp": {"seconds": 1556515930, "microseconds": 945546}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": false, "__com.redhat_reason": "eperm", "node-name": "#block143", "reason": "Operation not permitted", "operation": "write", "action": "report"}}
{"timestamp": {"seconds": 1556515930, "microseconds": 945674}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": false, "__com.redhat_reason": "eperm", "node-name": "#block143", "reason": "Operation not permitted", "operation": "write", "action": "report"}}
{"timestamp": {"seconds": 1556515930, "microseconds": 945728}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": false, "__com.redhat_reason": "eperm", "node-name": "#block143", "reason": "Operation not permitted", "operation": "write", "action": "report"}}
{"timestamp": {"seconds": 1556515930, "microseconds": 945783}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": false, "__com.redhat_reason": "eperm", "node-name": "#block143", "reason": "Operation not permitted", "operation": "write", "action": "report"}}
{"timestamp": {"seconds": 1556515930, "microseconds": 945838}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": false, "__com.redhat_reason": "eperm", "node-name": "#block143", "reason": "Operation not permitted", "operation": "write", "action": "report"}}

4. Query the bitmap info:
  {"execute":"query-block"}
  ...
  "dirty-bitmaps": [{"name": "bitmap0", "recording": true, "persistent": true, "busy": false, "status": "active", "granularity": 65536, "count": 23199744}]
  ...

5. Attempt an incremental live backup with the read-only bitmap; the command is rejected:
  { "execute": "drive-backup", "arguments": { "device": "drive_image1", "target": "/home/inc.img", "sync":"incremental","bitmap":"bitmap0","format":"qcow2","speed":1000}}
  {"error": {"class": "GenericError", "desc": "Bitmap 'bitmap0' is readonly and cannot be modified"}}

6. Query block jobs:
  {"execute":"query-block-jobs"}
{"return": []}

7. Quit the VM:
 (qemu) quit
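
Note for anyone scripting steps 4 to 6: below is a minimal Python sketch, not a tested reproducer. It assumes the QMP socket from step 1 (-qmp tcp:0:3000,server,nowait) is reachable on 127.0.0.1:3000 and that QEMU was started with read-only=on as in step 3. The helper names (build_reproducer_commands, is_error, run_commands) are hypothetical and not part of QEMU or any library; the command payloads are copied from the steps above.

```python
import json
import socket


def build_reproducer_commands():
    """Return the QMP commands from steps 4, 5 and 6, in order."""
    return [
        {"execute": "query-block"},
        {"execute": "drive-backup",
         "arguments": {"device": "drive_image1",
                       "target": "/home/inc.img",
                       "sync": "incremental",
                       "bitmap": "bitmap0",
                       "format": "qcow2",
                       "speed": 1000}},
        {"execute": "query-block-jobs"},
    ]


def is_error(reply):
    """A QMP reply carries either a "return" or an "error" member."""
    return "error" in reply


def run_commands(commands, host="127.0.0.1", port=3000):
    """Send each command over QMP and return one reply per command.

    Assumption: host/port match the -qmp option from step 1.
    Asynchronous events (messages with an "event" key) are skipped.
    """
    replies = []
    with socket.create_connection((host, port)) as sock:
        reader = sock.makefile("r", encoding="utf-8")
        json.loads(reader.readline())  # consume the QMP greeting
        # Capabilities negotiation must happen before any other command.
        for cmd in [{"execute": "qmp_capabilities"}] + list(commands):
            sock.sendall(json.dumps(cmd).encode("utf-8") + b"\n")
            while True:
                line = reader.readline()
                if not line:
                    raise ConnectionError("QMP connection closed")
                msg = json.loads(line)
                if "event" not in msg:
                    break
            replies.append(msg)
    return replies[1:]  # drop the qmp_capabilities reply
```

With the VM from step 3 running, run_commands(build_reproducer_commands()) should give three replies, and is_error() on the second one should be true (the "readonly" GenericError from step 5). Step 7 still has to be done by hand from the HMP monitor.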
   

Actual results:
 After step 7, QEMU dumps core with the following message:
  qemu-kvm: block.c:3471: bdrv_close_all: Assertion `((&all_bdrv_states)->tqh_first == ((void *)0))' failed.
aliang_le.txt: line 33: 14959 Aborted               (core dumped) /usr/libexec/qemu-kvm -name 'avocado-vt-vm1' -machine pc -nodefaults -device VGA,bus=pci.0,addr=0x2 -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20190423-215834-BzwOjODj,server,nowait -mon chardev=qmp_id_qmpmonitor1,mode=control ...
 
 Core dump backtrace:
(gdb) bt
#0  0x00007f1c3915d337 in raise () at /lib64/libc.so.6
#1  0x00007f1c3915ea28 in abort () at /lib64/libc.so.6
#2  0x00007f1c39156156 in __assert_fail_base () at /lib64/libc.so.6
#3  0x00007f1c39156202 in  () at /lib64/libc.so.6
#4  0x000055d6ca0e1762 in bdrv_close_all () at block.c:3471
#5  0x000055d6c9e5c18b in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4790
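
Note on the assertion: `(&all_bdrv_states)->tqh_first == NULL` means the global list of block driver states must be empty at that point in bdrv_close_all(). This report does not state the root cause. Purely as an illustration, the Python toy model below shows one way such a check can fail: a command's error path returns without dropping a reference it took earlier. This is an assumption for illustration, not a diagnosis, and every name in it (Node, try_backup, close_all, all_nodes) is hypothetical and does not mirror QEMU's code.

```python
# Toy model only: stands in for the global list the assertion inspects.
all_nodes = []


class Node:
    """A reference-counted object that registers itself globally."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1
        all_nodes.append(self)

    def unref(self):
        self.refcnt -= 1
        if self.refcnt == 0:
            all_nodes.remove(self)


def try_backup(bitmap_readonly, drop_ref_on_error):
    """Open a target node, then fail if the bitmap is read-only.

    drop_ref_on_error=False models an error path that forgets
    to release the target it just opened (assumed scenario).
    """
    target = Node("backup-target")
    if bitmap_readonly:
        if drop_ref_on_error:
            target.unref()
        return {"error": "Bitmap 'bitmap0' is readonly and cannot be modified"}
    target.unref()  # toy model: no job is started, so release immediately
    return {"return": {}}


def close_all(owned_nodes):
    """Release the nodes the caller owns, then require an empty list."""
    for node in owned_nodes:
        node.unref()
    assert not all_nodes, "nodes still referenced at shutdown"
```

In this model, calling try_backup(True, drop_ref_on_error=False) and then close_all() on the remaining disk node trips the assertion, while drop_ref_on_error=True lets shutdown complete.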

Expected results:
 QEMU quits cleanly without dumping core.

Additional info:
 The core dump file will be attached later.

Comment 4 aihua liang 2019-05-10 02:08:16 UTC
Tested without iothread; the issue also reproduces.

Comment 5 John Snow 2019-05-10 21:39:59 UTC
Hum. I didn't realize you could create read-only virtio-pci-blk devices like that. That the VM fails to boot (or has a lot of errors in attempting to boot) does not surprise me in this case -- your VM image almost certainly expects to be able to write to its own /boot and / partitions.

...That said, I have reproduced this crash upstream, so I'll get to it. Good find, thank you.

Comment 6 John Snow 2019-05-10 22:00:57 UTC
Proposal upstream @ https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg02651.html

Comment 9 Miroslav Rezanina 2019-06-11 16:34:25 UTC
Fix included in qemu-kvm-rhev-2.12.0-32.el7

Comment 11 aihua liang 2019-06-12 05:45:24 UTC
Verified with qemu-kvm-rhev-2.12.0-32.el7: the problem has been fixed, so setting the bug's status to "Verified".

Comment 13 errata-xmlrpc 2019-08-22 09:20:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2553