Bug 1937146 - [QSD] 'quit' the daemon with block jobs in ready status could cause core dump
Summary: [QSD] 'quit' the daemon with block jobs in ready status could cause core dump
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.4
Hardware: All
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 8.4
Assignee: Kevin Wolf
QA Contact: aihua liang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-03-10 01:27 UTC by Gu Nini
Modified: 2021-11-16 08:16 UTC (History)
9 users (show)

Fixed In Version: qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-16 07:51:47 UTC
Type: ---
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2021:4684 0 None None None 2021-11-16 07:52:21 UTC

Description Gu Nini 2021-03-10 01:27:01 UTC
Description of problem:
Start a QSD, run some block jobs such as block mirror and block commit; after a job reaches ready status, quit the daemon directly without completing the job, and the QSD core dumps:

# sh qsd.sh 


qemu-storage-daemon: ../block.c:4474: bdrv_close_all: Assertion `job_next(NULL) == NULL' failed.
qsd.sh: line 8: 221051 Aborted                 (core dumped) qemu-storage-daemon --chardev socket,id=qmp_id_qmpmonitor1,server,nowait,path=/var/tmp/avocado_1 --mon chardev=qmp_id_qmpmonitor1,mode=control --blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/ngu/rhel840-ppc64le-virtio-scsi.qcow2,auto-read-only=on,cache.direct=on,cache.no-flush=off --blockdev node-name=drive_img1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 --nbd-server addr.type=inet,addr.host=10.16.214.94,addr.port=9000,max-connections=10 --export type=nbd,id=export1,node-name=drive_img1,writable=on,name=export1
 

Version-Release number of selected component (if applicable):
Host kernel: 4.18.0-295.el8.ppc64le
Qemu: qemu-kvm-5.2.0-11.module+el8.4.0+10268+62bcbbed.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Start a QSD daemon:

qemu-storage-daemon \
    --chardev socket,id=qmp_id_qmpmonitor1,server,nowait,path=/var/tmp/avocado_1  \
    --mon chardev=qmp_id_qmpmonitor1,mode=control \
    --blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/ngu/rhel840-ppc64le-virtio-scsi.qcow2,auto-read-only=on,cache.direct=on,cache.no-flush=off \
    --blockdev node-name=drive_img1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
    --nbd-server addr.type=inet,addr.host=10.16.214.94,addr.port=9000,max-connections=10 \
    --export type=nbd,id=export1,node-name=drive_img1,writable=on,name=export1 \

2. 'blockdev-create/add' an image:

{'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/home/ngu/sn1','size':21474836480},'job-id':'job1'}}
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','auto-read-only':true,'filename':'/home/ngu/sn1','discard':'unmap'}}
{'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480, 'backing-file':'/home/ngu/rhel840-ppc64le-virtio-scsi.qcow2'},'job-id':'job2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','read-only':false,'backing':null}}
{"return": {}}
{'execute':'job-dismiss','arguments':{'id':'job1'}}
{'execute':'job-dismiss','arguments':{'id':'job2'}}

3. Create a snapshot on the image:

{"execute":"transaction","arguments":{"actions":[{"type":"blockdev-snapshot","data":{"node":"drive_img1","overlay":"sn1"}}]}}

4. Do 'block-commit':

{'execute': 'block-commit', 'arguments': { 'device':'sn1','job-id':'j1'}}

5. When the block job reaches ready status, 'quit' the QSD without issuing a 'job-complete':

{'execute': 'quit'}
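To avoid hitting the assertion, a client can complete the ready-state job before asking the daemon to quit. A minimal sketch of that sequence follows; 'job-complete' and 'quit' are real QMP commands, but the helper itself is hypothetical and not part of QSD or this report:

```python
import json

def safe_shutdown_commands(job_id):
    """Build the QMP commands a client could send so a ready-state job
    finishes before the daemon exits. Between the two commands the
    client should wait for the job's JOB_STATUS_CHANGE events to reach
    "null" (block jobs auto-dismiss by default)."""
    return [
        {"execute": "job-complete", "arguments": {"id": job_id}},
        {"execute": "quit"},
    ]

# Print the sequence as QMP wire-format JSON lines.
for cmd in safe_shutdown_commands("j1"):
    print(json.dumps(cmd))
```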


Actual results:
The QSD core dumped as shown in the Description section.

Expected results:
The QSD quits cleanly.

Additional info:

Comment 3 aihua liang 2021-05-12 03:24:40 UTC
This bug can also be hit by only taking a live snapshot and then quitting.

 #qemu-storage-daemon --version
qemu-storage-daemon version 5.2.0 (qemu-kvm-5.2.0-15.module+el8.4.0+10650+50781ca0)

 Reproduce step:
  1.Expose image
    qemu-storage-daemon \
    --chardev socket,id=qmp_id_qmpmonitor1,server,nowait,path=/var/tmp/avocado_1  \
    --mon chardev=qmp_id_qmpmonitor1,mode=control \
    --blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2,auto-read-only=on,cache.direct=on,cache.no-flush=off \
    --blockdev node-name=drive_img1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \

 2. Create snapshot node and do snapshot
    #nc -U /var/tmp/avocado_1
    {"execute":"qmp_capabilities"}
    {'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
    {'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1','auto-read-only':true,'discard':'unmap'}}
    {'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
    {'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null,'read-only':false}}
    {"execute":"transaction","arguments":{"actions":[{"type":"blockdev-snapshot","data":{"node":"drive_img1","overlay":"sn1"}}]}}

  3. Check node info
    {"execute":"query-named-block-nodes"}

  4. Quit
    {"execute":"quit"}

After step 4, the qemu-storage-daemon core dumps.
  qemu-storage-daemon: ../block.c:4474: bdrv_close_all: Assertion `job_next(NULL) == NULL' failed.
atest.txt: line 6: 1034137 Aborted                 (core dumped) qemu-storage-daemon --chardev socket,id=qmp_id_qmpmonitor1,server,nowait,path=/var/tmp/avocado_1 --mon chardev=qmp_id_qmpmonitor1,mode=control --blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2,auto-read-only=on,cache.direct=on,cache.no-flush=off --blockdev node-name=drive_img1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1

As noted in comment 1, a patch has been sent upstream; will confirm again after the patch is merged.

Comment 7 aihua liang 2021-05-13 08:53:44 UTC
Tested on qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d; the bug has been resolved.

Test Env:
 kernel version:4.18.0-305.1.el8.x86_64
 qemu-kvm version:qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d

Test Steps:
 1.Expose image via qemu-storage-daemon:
    qemu-storage-daemon \
    --chardev socket,id=qmp_id_qmpmonitor1,server=on,wait=off,path=/var/tmp/avocado_1  \
    --mon chardev=qmp_id_qmpmonitor1,mode=control \
    --blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2,auto-read-only=on,cache.direct=on,cache.no-flush=off \
    --blockdev node-name=drive_img1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \

 2. Create snapshot node and do snapshot
    {'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
    {'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1','auto-read-only':true,'discard':'unmap'}}
    {'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
    {'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null,'read-only':false}}
    {"execute":"transaction","arguments":{"actions":[{"type":"blockdev-snapshot","data":{"node":"drive_img1","overlay":"sn1"}}]}}

 3. Commit from sn1 to base
    {'execute': 'block-commit', 'arguments': { 'device':'sn1','job-id':'j1'}}
{"timestamp": {"seconds": 1620895249, "microseconds": 570223}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "j1"}}
{"timestamp": {"seconds": 1620895249, "microseconds": 570379}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "j1"}}
{"return": {}}
{"timestamp": {"seconds": 1620895249, "microseconds": 574510}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "j1"}}
{"timestamp": {"seconds": 1620895249, "microseconds": 574531}, "event": "BLOCK_JOB_READY", "data": {"device": "j1", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}

 4. Stop the image export via "Ctrl+C"
   {"timestamp": {"seconds": 1620895263, "microseconds": 240204}, "event": "JOB_STATUS_CHANGE", "data": {"status": "standby", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240241}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240312}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240338}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240386}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "j1", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240404}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240423}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240467}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240483}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job1"}}

 Note: block mirror + {"execute":"quit"} also works ok.

Comment 8 aihua liang 2021-05-13 08:55:06 UTC
Per comment 7, setting the bug's status to "Verified".

Comment 10 errata-xmlrpc 2021-11-16 07:51:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684
