Bug 1937146
| Summary: | [QSD] 'quit' the daemon with block jobs in ready status could cause core dump | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Gu Nini <ngu> |
| Component: | qemu-kvm | Assignee: | Kevin Wolf <kwolf> |
| qemu-kvm sub component: | Block Jobs | QA Contact: | aihua liang <aliang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | aliang, coli, ddepaula, jinzhao, kwolf, leidwang, qinwang, qzhang, virt-maint |
| Version: | 8.4 | Keywords: | Triaged |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | 8.4 | ||
| Hardware: | All | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-11-16 07:51:47 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
This bug can also be hit by only taking a live snapshot and then quitting.
#qemu-storage-daemon --version
qemu-storage-daemon version 5.2.0 (qemu-kvm-5.2.0-15.module+el8.4.0+10650+50781ca0)
Steps to reproduce:
1. Expose the image
qemu-storage-daemon \
--chardev socket,id=qmp_id_qmpmonitor1,server,nowait,path=/var/tmp/avocado_1 \
--mon chardev=qmp_id_qmpmonitor1,mode=control \
--blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2,auto-read-only=on,cache.direct=on,cache.no-flush=off \
--blockdev node-name=drive_img1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1
2. Create the snapshot node and take the snapshot
#nc -U /var/tmp/avocado_1
{"execute":"qmp_capabilities"}
{'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1','auto-read-only':true,'discard':'unmap'}}
{'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null,'read-only':false}}
{"execute":"transaction","arguments":{"actions":[{"type":"blockdev-snapshot","data":{"node":"drive_img1","overlay":"sn1"}}]}}
3. Check node info
{"execute":"query-named-block-nodes"}
4. Quit
{"execute":"quit"}
After step 4, qemu-storage-daemon dumps core:
qemu-storage-daemon: ../block.c:4474: bdrv_close_all: Assertion `job_next(NULL) == NULL' failed.
atest.txt: line 6: 1034137 Aborted (core dumped) qemu-storage-daemon --chardev socket,id=qmp_id_qmpmonitor1,server,nowait,path=/var/tmp/avocado_1 --mon chardev=qmp_id_qmpmonitor1,mode=control --blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2,auto-read-only=on,cache.direct=on,cache.no-flush=off --blockdev node-name=drive_img1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1
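The QMP exchange in steps 2 to 4 can also be scripted instead of typed into `nc -U`. The following is a minimal Python sketch and not part of the original report: the helper names `encode_qmp`, `execute`, `snapshot_then_quit_commands` and `main` are hypothetical, while the socket path, file names and node names are copied from the steps above. As an assumption about sensible sequencing, it waits for each `blockdev-create` job to reach the `concluded` status before continuing. The jobs are deliberately not dismissed, matching the steps above.

```python
import json
import socket

QMP_SOCKET = "/var/tmp/avocado_1"  # path from the --chardev option above
BASE = "/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2"
OVERLAY = "/root/sn1"
SIZE = 21474836480


def encode_qmp(command):
    """Serialize one QMP command as a newline-terminated JSON line."""
    return (json.dumps(command) + "\n").encode("utf-8")


def execute(stream, command, wait_job=None):
    """Send a command and return its reply.

    If wait_job is given, also wait for that job's JOB_STATUS_CHANGE event
    with status "concluded". Other events are read and ignored.
    """
    stream.write(encode_qmp(command))
    stream.flush()
    reply = None
    concluded = wait_job is None
    while reply is None or not concluded:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP socket closed before the reply arrived")
        message = json.loads(line)
        if "return" in message or "error" in message:
            reply = message
            if "error" in message:
                break  # the command failed, so no job will conclude
        elif (message.get("event") == "JOB_STATUS_CHANGE"
              and message["data"]["id"] == wait_job
              and message["data"]["status"] == "concluded"):
            concluded = True
    return reply


def snapshot_then_quit_commands():
    """Return (command, job_to_wait_for) pairs for steps 2 to 4."""
    return [
        ({"execute": "qmp_capabilities"}, None),
        ({"execute": "blockdev-create", "arguments": {
            "job-id": "job1",
            "options": {"driver": "file", "filename": OVERLAY,
                        "size": SIZE}}}, "job1"),
        ({"execute": "blockdev-add", "arguments": {
            "driver": "file", "node-name": "drive_sn1", "filename": OVERLAY,
            "auto-read-only": True, "discard": "unmap"}}, None),
        ({"execute": "blockdev-create", "arguments": {
            "job-id": "job2",
            "options": {"driver": "qcow2", "file": "drive_sn1", "size": SIZE,
                        "backing-file": BASE, "backing-fmt": "qcow2"}}}, "job2"),
        ({"execute": "blockdev-add", "arguments": {
            "driver": "qcow2", "node-name": "sn1", "file": "drive_sn1",
            "backing": None, "read-only": False}}, None),
        ({"execute": "transaction", "arguments": {"actions": [
            {"type": "blockdev-snapshot",
             "data": {"node": "drive_img1", "overlay": "sn1"}}]}}, None),
        ({"execute": "query-named-block-nodes"}, None),
        ({"execute": "quit"}, None),
    ]


def main(path=QMP_SOCKET):
    """Run the whole sequence against a live qemu-storage-daemon."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        stream = sock.makefile("rwb")
        stream.readline()  # discard the QMP greeting
        for command, job in snapshot_then_quit_commands():
            print(execute(stream, command, wait_job=job))
```

Calling `main()` while the daemon from step 1 is running should replay the sequence. On an affected build the daemon is expected to abort after the final `quit`.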
As noted in comment 1, a patch has been sent upstream; this will be confirmed again after the patch is merged.
Tested on qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d; the bug has been resolved.
Test Env:
kernel version: 4.18.0-305.1.el8.x86_64
qemu-kvm version: qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d
Test Steps:
1. Expose the image via qemu-storage-daemon:
qemu-storage-daemon \
--chardev socket,id=qmp_id_qmpmonitor1,server=on,wait=off,path=/var/tmp/avocado_1 \
--mon chardev=qmp_id_qmpmonitor1,mode=control \
--blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2,auto-read-only=on,cache.direct=on,cache.no-flush=off \
--blockdev node-name=drive_img1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1
2. Create the snapshot node and take the snapshot
{'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/root/sn1','size':21474836480},'job-id':'job1'}}
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','filename':'/root/sn1','auto-read-only':true,'discard':'unmap'}}
{'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480,'backing-file':'/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','backing':null,'read-only':false}}
{"execute":"transaction","arguments":{"actions":[{"type":"blockdev-snapshot","data":{"node":"drive_img1","overlay":"sn1"}}]}}
3. Commit from sn1 to base
{'execute': 'block-commit', 'arguments': { 'device':'sn1','job-id':'j1'}}
{"timestamp": {"seconds": 1620895249, "microseconds": 570223}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "j1"}}
{"timestamp": {"seconds": 1620895249, "microseconds": 570379}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "j1"}}
{"return": {}}
{"timestamp": {"seconds": 1620895249, "microseconds": 574510}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "j1"}}
{"timestamp": {"seconds": 1620895249, "microseconds": 574531}, "event": "BLOCK_JOB_READY", "data": {"device": "j1", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
4. Stop the image export via "Ctrl+C"
{"timestamp": {"seconds": 1620895263, "microseconds": 240204}, "event": "JOB_STATUS_CHANGE", "data": {"status": "standby", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240241}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240312}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240338}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240386}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "j1", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240404}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240423}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240467}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240483}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job1"}}
Note: block mirror followed by {"execute":"quit"} also works correctly.
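The check performed by eye in step 4, that job j1 walks through its final statuses and ends in "null" rather than the daemon aborting, can be expressed as a small log filter. This is a hedged Python sketch: `job_status_sequence` and `ended_in_null` are hypothetical helper names, and `SAMPLE_LOG` holds event lines copied from the output above.

```python
import json

# Event lines copied from the step 4 output above.
SAMPLE_LOG = """\
{"timestamp": {"seconds": 1620895263, "microseconds": 240204}, "event": "JOB_STATUS_CHANGE", "data": {"status": "standby", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240241}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240312}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240338}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240386}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "j1", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240404}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240423}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "j1"}}
{"timestamp": {"seconds": 1620895263, "microseconds": 240467}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
"""


def job_status_sequence(lines, job_id):
    """Return the JOB_STATUS_CHANGE statuses for job_id, in order."""
    statuses = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message.get("event") != "JOB_STATUS_CHANGE":
            continue  # replies and other events are not status changes
        if message["data"]["id"] == job_id:
            statuses.append(message["data"]["status"])
    return statuses


def ended_in_null(lines, job_id):
    """True if the last status reported for job_id is "null"."""
    statuses = job_status_sequence(lines, job_id)
    return bool(statuses) and statuses[-1] == "null"
```

For example, `job_status_sequence(SAMPLE_LOG.splitlines(), "j1")` collects the statuses of j1 from the sample, and `ended_in_null` can serve as a pass/fail check in an automated run.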
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:4684
Description of problem:
Start a QSD and run some block jobs, such as block mirror and block commit. After the job reaches ready status, quit the daemon directly without completing the job; the QSD then dumps core:
# sh qsd.sh
qemu-storage-daemon: ../block.c:4474: bdrv_close_all: Assertion `job_next(NULL) == NULL' failed.
qsd.sh: line 8: 221051 Aborted (core dumped) qemu-storage-daemon --chardev socket,id=qmp_id_qmpmonitor1,server,nowait,path=/var/tmp/avocado_1 --mon chardev=qmp_id_qmpmonitor1,mode=control --blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/ngu/rhel840-ppc64le-virtio-scsi.qcow2,auto-read-only=on,cache.direct=on,cache.no-flush=off --blockdev node-name=drive_img1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 --nbd-server addr.type=inet,addr.host=10.16.214.94,addr.port=9000,max-connections=10 --export type=nbd,id=export1,node-name=drive_img1,writable=on,name=export1
Version-Release number of selected component (if applicable):
Host kernel: 4.18.0-295.el8.ppc64le
Qemu: qemu-kvm-5.2.0-11.module+el8.4.0+10268+62bcbbed.ppc64le
How reproducible:
100%
Steps to Reproduce:
1. Start a QSD daemon:
qemu-storage-daemon \
--chardev socket,id=qmp_id_qmpmonitor1,server,nowait,path=/var/tmp/avocado_1 \
--mon chardev=qmp_id_qmpmonitor1,mode=control \
--blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/ngu/rhel840-ppc64le-virtio-scsi.qcow2,auto-read-only=on,cache.direct=on,cache.no-flush=off \
--blockdev node-name=drive_img1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
--nbd-server addr.type=inet,addr.host=10.16.214.94,addr.port=9000,max-connections=10 \
--export type=nbd,id=export1,node-name=drive_img1,writable=on,name=export1
2. 'blockdev-create/add' an image:
{'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/home/ngu/sn1','size':21474836480},'job-id':'job1'}}
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn1','auto-read-only':true,'filename':'/home/ngu/sn1','discard':'unmap'}}
{'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn1','size':21474836480, 'backing-file':'/home/ngu/rhel840-ppc64le-virtio-scsi.qcow2'},'job-id':'job2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn1','file':'drive_sn1','read-only':false,'backing':null}}
{"return": {}}
{'execute':'job-dismiss','arguments':{'id':'job1'}}
{'execute':'job-dismiss','arguments':{'id':'job2'}}
3. Create a snapshot on the image:
{"execute":"transaction","arguments":{"actions":[{"type":"blockdev-snapshot","data":{"node":"drive_img1","overlay":"sn1"}}]}}
4. Do 'block-commit':
{'execute': 'block-commit', 'arguments': { 'device':'sn1','job-id':'j1'}}
5. When the block job reaches ready status, 'quit' the QSD without a 'job-complete':
{'execute': 'quit'}
Actual results:
The QSD dumps core, as shown in the description of the problem above.
Expected results:
The QSD quits without problems.
Additional info:
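When scripting step 5, the 'quit' has to be sent only after the commit job has reported ready. The following Python sketch shows one way to wait for that point; the function name `wait_for_block_job_ready` is hypothetical, and it assumes a text-mode, line-oriented stream of QMP messages (for example a socket wrapped with `makefile("r")`). It relies on the BLOCK_JOB_READY event carrying the job ID in its "device" field, as seen in the verification output for job j1.

```python
import json


def wait_for_block_job_ready(stream, job_id):
    """Read QMP messages until BLOCK_JOB_READY arrives for job_id.

    Returns the decoded event, or None if the stream ends first.
    Replies and unrelated events are skipped.
    """
    for line in stream:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if (message.get("event") == "BLOCK_JOB_READY"
                and message["data"]["device"] == job_id):
            return message
    return None
```

Once this returns an event for "j1", the script can send {'execute': 'quit'} without a preceding 'job-complete' to exercise the scenario described above.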