Bug 1605026 - Quitting VM causes qemu core dump once the block mirror job paused for no enough target space
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: rc
Assignee: Virtualization Maintenance
QA Contact: Gu Nini
URL:
Whiteboard:
Depends On:
Blocks: 1635583
 
Reported: 2018-07-20 03:26 UTC by Gu Nini
Modified: 2018-11-01 11:15 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.12.0-13.el7
Doc Text:
Clone Of:
: 1635583 (view as bug list)
Environment:
Last Closed: 2018-11-01 11:13:00 UTC
Target Upstream Version:


Attachments
gdb_debug_info_all_threads-07202018 (17.49 KB, text/plain)
2018-07-20 03:32 UTC, Gu Nini


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:3443 None None None 2018-11-01 11:15:19 UTC

Description Gu Nini 2018-07-20 03:26:19 UTC
Description of problem:
Start a block mirror with '"on-source-error": "stop"' and '"on-target-error": "stop"' set in the QMP command. When the mirror target runs out of space, the job is paused. Quitting the guest at that point makes qemu-kvm abort (core dumped).

Version-Release number of selected component (if applicable):
Host kernel: 3.10.0-918.el7.x86_64
Qemu-kvm-rhev: qemu-kvm-rhev-2.12.0-7.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a small LV to use as the mirror target:
# qemu-img create -f raw /home/disk.img 10G
# losetup /dev/loop0 /home/disk.img
# pvcreate /dev/loop0
# vgcreate vg1 /dev/loop0
# lvcreate -L 256M -n lvlv1 vg1

2. Start a guest with a data disk 'drive_image2'
3. Mirror the data disk to the LV created in step 1, with '"on-source-error": "stop"' and '"on-target-error": "stop"' set:
{ "execute": "drive-mirror", "arguments": { "device": "drive_image2", "target": "/dev/vg1/lvlv1", "mode": "absolute-paths", "format": "qcow2", "sync": "full", "on-source-error": "stop", "on-target-error": "stop"}}
{"timestamp": {"seconds": 1532055583, "microseconds": 991329}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "drive_image2"}}
{"timestamp": {"seconds": 1532055583, "microseconds": 991447}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "drive_image2"}}
{"return": {}}
{"timestamp": {"seconds": 1532055586, "microseconds": 253020}, "event": "BLOCK_JOB_ERROR", "data": {"device": "drive_image2", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1532055586, "microseconds": 253229}, "event": "BLOCK_JOB_ERROR", "data": {"device": "drive_image2", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1532055586, "microseconds": 253371}, "event": "BLOCK_JOB_ERROR", "data": {"device": "drive_image2", "operation": "write", "action": "stop"}}
......
{"timestamp": {"seconds": 1532055586, "microseconds": 363679}, "event": "BLOCK_JOB_ERROR", "data": {"device": "drive_image2", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1532055586, "microseconds": 363712}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "drive_image2"}}
{ "execute": "quit"}
{"return": {}}
{"timestamp": {"seconds": 1532055603, "microseconds": 934305}, "event": "SHUTDOWN", "data": {"guest": false}}

4. Once the block job is paused, quit the guest.
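
Steps 3 and 4 could be scripted roughly as below. This is only a sketch under assumptions that are not part of this report: the guest is assumed to have been started with a QMP UNIX socket (for example `-qmp unix:/tmp/qmp.sock,server,nowait`), and the function names `drive_mirror_cmd`, `is_job_paused` and `reproduce` are made up for illustration. The job ID is taken to be the device name, as seen in the JOB_STATUS_CHANGE events above.

```python
import json
import socket


def drive_mirror_cmd(device, target):
    """Build the drive-mirror command from step 3 as a Python dict."""
    return {
        "execute": "drive-mirror",
        "arguments": {
            "device": device,
            "target": target,
            "mode": "absolute-paths",
            "format": "qcow2",
            "sync": "full",
            "on-source-error": "stop",
            "on-target-error": "stop",
        },
    }


def is_job_paused(msg, job_id):
    """Return True if msg is the JOB_STATUS_CHANGE event marking job_id as paused."""
    data = msg.get("data", {})
    return (
        msg.get("event") == "JOB_STATUS_CHANGE"
        and data.get("id") == job_id
        and data.get("status") == "paused"
    )


def reproduce(qmp_path, device, target):
    """Send drive-mirror, wait until the job is paused, then send quit.

    qmp_path is the (assumed) UNIX socket path given to -qmp.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(qmp_path)
        stream = sock.makefile("rw", encoding="utf-8")

        def send(cmd):
            stream.write(json.dumps(cmd) + "\n")
            stream.flush()

        json.loads(stream.readline())          # QMP greeting banner
        send({"execute": "qmp_capabilities"})  # leave capabilities negotiation mode
        send(drive_mirror_cmd(device, target))
        for line in stream:                    # command replies and async events
            if is_job_paused(json.loads(line), device):
                break
        send({"execute": "quit"})
```

With the values from this report the call would be `reproduce("/tmp/qmp.sock", "drive_image2", "/dev/vg1/lvlv1")`, where the socket path is again an assumption. On an affected build, qemu-kvm should abort after the final `quit`.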


Actual results:
qemu-kvm aborts and dumps core:
./vm2.sh rhel76.qcow2 
QEMU 2.12.0 monitor - type 'help' for more information
(qemu) 
(qemu)
(qemu) Formatting '/dev/vg1/lvlv1', fmt=qcow2 size=1073741824 cluster_size=65536 lazy_refcounts=off refcount_bits=16
qemu-kvm: blockjob.c:437: block_job_iostatus_reset: Assertion `job->job.user_paused && job->job.pause_count > 0' failed.
./vm2.sh: line 29:   931 Aborted                 (core dumped) /usr/libexec/qemu-kvm -name 'avocado-vt-vm 


Expected results:
The guest quits cleanly, without a core dump.

Additional info:

Comment 2 Gu Nini 2018-07-20 03:32:02 UTC
Created attachment 1464849
gdb_debug_info_all_threads-07202018

Comment 3 Jeff Cody 2018-08-15 16:00:57 UTC
We are clearing the 'user_paused' flag too soon; the user_resume handler for the job should run first.

Patch submitted to qemu-devel.
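
The ordering problem can be illustrated with a toy model. This is not QEMU code: the class `ToyJob` and its methods are hypothetical, and it assumes that the job's user_resume handler is what reaches the assertion in block_job_iostatus_reset() quoted in the description.

```python
class ToyJob:
    """Toy stand-in for a block job paused by the "stop" error action."""

    def __init__(self):
        # State assumed for a job paused after a target write error.
        self.user_paused = True
        self.pause_count = 1

    def user_resume_handler(self):
        # Models the condition asserted at blockjob.c:437.
        if not (self.user_paused and self.pause_count > 0):
            raise AssertionError("job->job.user_paused && job->job.pause_count > 0")

    def resume(self):
        self.pause_count -= 1

    def cancel_buggy(self):
        # Flag cleared too soon: the handler then sees user_paused == False.
        self.user_paused = False
        self.user_resume_handler()
        self.resume()

    def cancel_fixed(self):
        # Handler runs first, while the job is still marked as user-paused.
        self.user_resume_handler()
        self.user_paused = False
        self.resume()
```

Calling `ToyJob().cancel_buggy()` raises AssertionError, while `ToyJob().cancel_fixed()` completes and leaves the job with `user_paused` cleared and `pause_count` at 0.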

Comment 6 Miroslav Rezanina 2018-09-04 14:35:18 UTC
Fix included in qemu-kvm-rhev-2.12.0-13.el7

Comment 8 Gu Nini 2018-09-05 09:27:02 UTC
Verified the bug on qemu-kvm-rhev-2.12.0-13.el7.x86_64 using the same steps as in the bug description.

Comment 10 errata-xmlrpc 2018-11-01 11:13:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443

