Bug 1362084

Summary: qemu core dump when do blockdev-add with option detect-zeroes on
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.3
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Qianqian Zhu <qizhu>
Assignee: Markus Armbruster <armbru>
QA Contact: Qianqian Zhu <qizhu>
CC: areis, armbru, chayang, famz, jinzhao, juzhang, knoel, meyang, mrezanin, virt-maint
Target Milestone: rc
Type: Bug
Last Closed: 2017-08-01 23:32:13 UTC
Bug Blocks: 963588

Description Qianqian Zhu 2016-08-01 09:35:34 UTC
Description of problem:
qemu dumps core when blockdev-add is invoked with the detect-zeroes option;
without detect-zeroes, blockdev-add works fine.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.6.0-17.el7.x86_64
qemu-img-rhev-2.6.0-17.el7.x86_64
kernel-3.10.0-475.el7.x86_64

How reproducible:
2/2

Steps to Reproduce:
1. Launch guest:
/usr/libexec/qemu-kvm -name linux -cpu SandyBridge -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 7bef3814-631a-48bb-bae8-2b1de75f7a13 -nodefaults -monitor stdio -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot order=c,menu=on -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/nfs/RHEL-Server-7.3-64-virtio.qcow2,if=none,cache=writeback,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on -spice port=5901,disable-ticketing -vga qxl -global qxl-vga.revision=3 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=3C:D9:2B:09:AB:44,bus=pci.0,addr=0x3 -qmp tcp:0:5555,server,nowait

2. Create image:
qemu-img create -f qcow2 /nfs/mirror.qcow2 20G
3. { "execute": "blockdev-add", "arguments": {"options":{ "driver":"qcow2", "id":"my_disk", "detect-zeroes": true, "file":{"filename":"/nfs/mirror.qcow2","driver":"file"}}}}

Actual results:
qemu core dump

Formatting 'mirror1.qcow2', fmt=qcow2 size=21474836480 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
[Thread 0x7fff571ff700 (LWP 13477) exited]
[Thread 0x7fffe29cf700 (LWP 13427) exited]
[Thread 0x7fff551fb700 (LWP 13481) exited]
[Thread 0x7fff569fe700 (LWP 13478) exited]
[Thread 0x7fff531f7700 (LWP 13485) exited]
[Thread 0x7fff561fd700 (LWP 13479) exited]

Program received signal SIGSEGV, Segmentation fault.
0x0000555555964a64 in visit_type_BlockdevRef (v=v@entry=0x555556b506d0, name=name@entry=0x5555559dbbdf "file", obj=obj@entry=0x555556bc66e8, errp=errp@entry=0x7fffffffc4c0) at qapi-visit.c:2201
2201	    switch ((*obj)->type)

Expected results:
blockdev-add should either succeed or fail with a proper error message; qemu should not crash.

Additional info:
Backend: nfs

Comment 1 Ademar Reis 2016-08-01 13:50:57 UTC
blockdev-add API is still experimental upstream and not supported by libvirt. But instead of closing this BZ, I'm reassigning to Kevin for further evaluation, while deferring it to 7.4.

Comment 2 Qianqian Zhu 2016-08-02 02:22:47 UTC
(In reply to Ademar Reis from comment #1)
> blockdev-add API is still experimental upstream and not supported by
> libvirt. But instead of closing this BZ, I'm reassigning to Kevin for
> further evaluation, while deferring it to 7.4.

Since this bz blocks bz 1232914, which is ON_QA now, if this bz is deferred to 7.4, shall we reopen bz 1232914 as FAILED QA?

Comment 3 Fam Zheng 2016-08-02 02:25:58 UTC
(In reply to qianqianzhu from comment #2)
> Since this bz blocks bz 1232914, which is ON_QA now, if this bz is deferred
> to 7.4, shall we reopen bz 1232914 as FAILED QA?

You can use drive_add instead to verify that one.

Comment 4 Qianqian Zhu 2016-08-02 09:54:20 UTC
(In reply to Fam Zheng from comment #3)
> You can use drive_add instead to verify that one.

blockdev-mirror failed after drive_add:

(qemu)  drive_add 0 file=/nfs/mirror.qcow2,format=qcow2,id=drive-virtio-disk1,if=none,detect-zeroes=on
OK

{"execute":"blockdev-mirror", "arguments": { "device": "drive-virtio-blk0", "target": "drive-virtio-disk1", "sync":"full"}}
{"error": {"class": "GenericError", "desc": "Cannot mirror to an attached block device"}}

Fam, would you please help check whether anything is wrong here? Thanks.

Comment 5 Fam Zheng 2016-08-03 03:15:11 UTC
The drive_add command has a special syntax for this use case. It looks like this:

 (qemu) hmp drive_add -n 0 file.filename=/tmp/test,node-name=d3,detect-zeroes=on
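
For completeness, the same HMP command can also be sent through QMP by wrapping it in human-monitor-command. A minimal, self-contained sketch, assuming a QMP socket on localhost:5555; the qmp() helper is hypothetical:

    import json
    import socket

    sock = socket.create_connection(("localhost", 5555))
    chan = sock.makefile("rw")

    def qmp(cmd, **arguments):
        # Send one QMP command and return the next reply line.
        msg = {"execute": cmd}
        if arguments:
            msg["arguments"] = arguments
        chan.write(json.dumps(msg) + "\n")
        chan.flush()
        return json.loads(chan.readline())

    json.loads(chan.readline())     # QMP greeting
    qmp("qmp_capabilities")
    # Wrap the HMP drive_add from above in QMP's human-monitor-command.
    print(qmp("human-monitor-command", **{
        "command-line":
            "drive_add -n 0 file.filename=/tmp/test,node-name=d3,detect-zeroes=on"}))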

Comment 6 Markus Armbruster 2016-09-08 09:16:39 UTC
Note that the reproducer's '"detect-zeroes": true' is incorrect:
"detect-zeroes" is the enum BlockdevDetectZeroesOptions, with values
off, on, and unmap.

Simplified reproducer:

    $ qemu-kvm -nodefaults -S -display none -qmp stdio
    {"QMP": {"version": {"qemu": {"micro": 0, "minor": 6, "major": 2}, "package": ""}, "capabilities": []}}
    warning: host doesn't support requested feature: CPUID.80000001H:ECX.sse4a [bit 6]
    { "execute": "qmp_capabilities" }
    {"return": {}}
    { "execute": "blockdev-add", "arguments": {"options" : {"driver": "raw", "id":"drive-disk1", "discard":"unmap", "rerror":"stop", "werror":"stop", "file": {"driver": "host_device", "filename": "/dev/sdb"}, "detect-zeroes": true }} }
    Segmentation fault (core dumped)

Upstream v2.7.0 handles this fine:

    {"error": {"class": "GenericError", "desc": "Invalid parameter type for 'detect-zeroes', expected: string"}}

Upstream v2.6.0 crashes exactly like downstream.

The crash is in qmp_marshal_blockdev_add()'s cleanup after the error.
Might be a dealloc visitor bug, or a visitor core bug.

May well affect more than just blockdev-add.

Comment 7 Markus Armbruster 2016-09-08 11:28:18 UTC
Fixed upstream in commit 9b4e38f.

Comment 8 Markus Armbruster 2016-09-09 09:13:33 UTC
*** Bug 1327377 has been marked as a duplicate of this bug. ***

Comment 9 Markus Armbruster 2016-09-29 11:42:15 UTC
*** Bug 1314591 has been marked as a duplicate of this bug. ***

Comment 11 Qianqian Zhu 2017-03-06 09:50:24 UTC
Verified with:
qemu-kvm-rhev-2.8.0-5.el7.x86_64
kernel-3.10.0-566.el7.x86_64


Steps:
1. Launch guest with cmd[1]

2. create image:
# qemu-img create -f qcow2 /nfs/mirror.qcow2 20G

3. Add block dev:
{ "execute": "blockdev-add", "arguments": {"driver":"qcow2", "node-name": "d3", "file":{"filename":"/nfs/mirror.qcow2","driver":"file"}, "detect-zeroes": "on"}}

4. Block mirror to new added dev:
{"execute":"blockdev-mirror", "arguments": { "device": "drive-virtio-disk0", "target": "d3", "sync":"full"}}


Result:
Both blockdev-add and blockdev-mirror succeed.

{"timestamp": {"seconds": 1488793124, "microseconds": 701751}, "event": "BLOCK_JOB_READY", "data": {"device": "drive-virtio-disk0", "len": 3662413824, "offset": 3662413824, "speed": 0, "type": "mirror"}}

cmd[1]:
/usr/libexec/qemu-kvm -name linux -cpu SandyBridge -m 2048 -drive file=/nfs/rhel74-64-virtio.qcow2,if=none,cache=writeback,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0  -qmp tcp:0:5555,server,nowait
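
Verification steps 3-4 can likewise be scripted over the QMP socket from cmd[1]. An illustrative sketch only (the qmp() helper is hypothetical; asynchronous events other than BLOCK_JOB_READY are skipped):

    import json
    import socket

    sock = socket.create_connection(("localhost", 5555))
    chan = sock.makefile("rw")

    def qmp(cmd, **arguments):
        # Send one QMP command and return the next reply line.
        msg = {"execute": cmd}
        if arguments:
            msg["arguments"] = arguments
        chan.write(json.dumps(msg) + "\n")
        chan.flush()
        return json.loads(chan.readline())

    json.loads(chan.readline())     # QMP greeting
    qmp("qmp_capabilities")

    # Step 3: "detect-zeroes" is now the string "on", as the enum requires.
    print(qmp("blockdev-add", **{
        "driver": "qcow2", "node-name": "d3", "detect-zeroes": "on",
        "file": {"filename": "/nfs/mirror.qcow2", "driver": "file"}}))

    # Step 4: mirror the running disk onto the new node.
    print(qmp("blockdev-mirror", device="drive-virtio-disk0",
              target="d3", sync="full"))

    # Wait for the BLOCK_JOB_READY event reported in the result above.
    while True:
        reply = json.loads(chan.readline())
        if reply.get("event") == "BLOCK_JOB_READY":
            print(reply)
            break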

Comment 13 errata-xmlrpc 2017-08-01 23:32:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392
