Bug 1631615

Summary: Wrong werror default for -device drive=<node-name>
Product: Red Hat Enterprise Linux 7 Reporter: aihua liang <aliang>
Component: qemu-kvm-rhev Assignee: Kevin Wolf <kwolf>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.6 CC: chayang, coli, juzhang, lijin, ngu, phou, pkrempa, qzhang, timao, virt-maint, xuwei, yhong
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-24.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1657637 (view as bug list) Environment:
Last Closed: 2019-08-22 09:18:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1643351, 1657637    

Description aihua liang 2018-09-21 04:30:13 UTC
Description of problem:
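 The default werror policy (werror=auto) does not work correctly when the guest is started with -blockdev: on "No space left on device" the write error is only reported and the VM is not stopped.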

Version-Release number of selected component (if applicable):
  kernel version: 3.10.0-945.el7.x86_64
  qemu-kvm-rhev version: qemu-kvm-rhev-2.12.0-17.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install a guest with gluster backend.

2. Fill the gluster backend completely so that writes fail with "No space left on device".

3. Start the VM with -blockdev:
    /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20180910-021412-u4bPHcZI,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20180910-021412-u4bPHcZI,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idDQoT2q  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20180910-021412-u4bPHcZI,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20180910-021412-u4bPHcZI,path=/var/tmp/seabios-20180910-021412-u4bPHcZI,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20180910-021412-u4bPHcZI,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -object secret,id=sec0,data=backing \
    -blockdev driver=qcow2,file.driver=gluster,node-name=drive_image1,cache.no-flush=on,cache.direct=off,encrypt.key-secret=sec0,encrypt.format=luks,file.server.0.type=inet,file.server.0.host=ibm-x3650m5-07.lab.eng.pek2.redhat.com,file.server.0.port=24007,file.volume=aliang,file.path=rhel76-64-virtio.qcow2.25 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pci.0,addr=0x4,scsi=off,serial=0xdafafafafafaf12121212121212121212121212121212121212121212aaafffffffff,physical_block_size=4096,logical_block_size=512,disable-modern=on,disable-legacy=off \
    -device virtio-net-pci,mac=9a:5b:5c:5d:5e:5f,id=idqHhiSX,vectors=4,netdev=idLUiEfL,bus=pci.0,addr=0x5  \
    -netdev tap,id=idLUiEfL,vhost=on \
    -m 11264  \
    -blockdev driver=raw,file.driver=file,node-name=drive_cd1,cache.no-flush=on,cache.direct=off,file.filename=/home/kvm_autotest_root/iso/linux/RHEL7.6-Server-x86_64.iso,read-only=on \
    -device ide-cd,id=cd1,drive=drive_cd1 \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
    -cpu Penryn \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=on,strict=off,order=cdn  \
    -enable-kvm \
    -monitor stdio \
    -blockdev driver=raw,file.driver=iscsi,node-name=drive_data2,cache.no-flush=on,cache.direct=off,file.transport=tcp,file.portal=10.73.224.153,file.target=iqn.2018-09.com.example:t1,file.lun=1 \
    -device virtio-blk-pci,id=data2,drive=drive_data2,bus=pci.0,addr=0x07 \
    -blockdev driver=qcow2,file.driver=file,node-name=drive_data1,cache.no-flush=on,cache.direct=off,file.filename=/home/data.qcow2 \
    -device virtio-blk-pci,id=data1,drive=drive_data1,bus=pci.0 \
    -blockdev driver=raw,node-name=drive_data0,cache.no-flush=on,cache.direct=off,file.filename=/dev/disk/by-path/ip-10.73.224.153:3260-iscsi-iqn.2018-09.com.example:t1-lun-2,file.driver=host_device \
    -device virtio-blk-pci,bus=pci.0,id=data0,drive=drive_data0,scsi=on,disable-modern=on \
    -qmp tcp:0:3000,server,nowait \

4. Run some applications in the guest and check the VM status through the QMP monitor:
   (qmp)#nc -U /var/tmp/monitor-qmpmonitor1-20180910-021412-u4bPHcZI
{"execute":"qmp_capabilities"}
{"timestamp": {"seconds": 1537499256, "microseconds": 309834}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "__com.redhat_reason": "enospc", "node-name": "drive_image1", "reason": "No space left on device", "operation": "write", "action": "report"}}

5. Shut down the VM, then restart it with -drive:
    ...
    -drive if=none,format=qcow2,id=drive_image1,cache=none,encrypt.key-secret=sec0,encrypt.format=luks,file=gluster://ibm-x3650m5-07.lab.eng.pek2.redhat.com/aliang/rhel76-64-virtio.qcow2.25 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pci.0,addr=0x4,scsi=off,serial=0xdafafafafafaf12121212121212121212121212121212121212121212aaafffffffff,physical_block_size=4096,logical_block_size=512,disable-modern=on,disable-legacy=off \
    ...

6. Run some applications in the guest and check the VM status through the QMP monitor:
   (qmp)#nc -U /var/tmp/monitor-qmpmonitor1-20180910-021412-u4bPHcZI
{"execute":"qmp_capabilities"}
{"timestamp": {"seconds": 1537498212, "microseconds": 218441}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block537", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1537498212, "microseconds": 219006}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": false, "__com.redhat_reason": "eother", "node-name": "#block537", "reason": "No medium found", "operation": "write", "action": "report"}}
{"timestamp": {"seconds": 1537498212, "microseconds": 219960}, "event": "STOP"}

7. Shut down the VM, then start it with -blockdev and werror=enospc set on the device:
   ....
   -blockdev driver=qcow2,file.driver=gluster,node-name=drive_image1,cache.no-flush=on,cache.direct=off,encrypt.key-secret=sec0,encrypt.format=luks,file.server.0.type=inet,file.server.0.host=ibm-x3650m5-07.lab.eng.pek2.redhat.com,file.server.0.port=24007,file.volume=aliang,file.path=rhel76-64-virtio.qcow2.25 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,werror=enospc,bus=pci.0,addr=0x4,scsi=off,serial=0xdafafafafafaf12121212121212121212121212121212121212121212aaafffffffff,physical_block_size=4096,logical_block_size=512,disable-modern=on,disable-legacy=off \
   ....

8. Run some applications in the guest and check the VM status through the QMP monitor:
    (qmp)#nc -U /var/tmp/monitor-qmpmonitor1-20180910-021412-u4bPHcZI
{"execute":"qmp_capabilities"}
{"timestamp": {"seconds": 1537498212, "microseconds": 218441}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block537", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1537498212, "microseconds": 219960}, "event": "STOP"}
    
Actual results:
 Both -drive and -blockdev with werror=enospc behave correctly on "No space left on device": the VM is stopped, with this QMP event:
    {"timestamp": {"seconds": 1537498212, "microseconds": 218441}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block537", "reason": "No space left on device", "operation": "write", "action": "stop"}}

 -blockdev with the default werror=auto does not behave correctly on "No space left on device": the error is only reported, with this QMP event:
     {"timestamp": {"seconds": 1537501432, "microseconds": 641823}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "__com.redhat_reason": "enospc", "node-name": "drive_image1", "reason": "No space left on device", "operation": "write", "action": "report"}}

Expected results:

 -blockdev with werror=auto stops the VM on "No space left on device", the same as -drive.
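
The difference between the two cases comes down to the "action" field of the BLOCK_IO_ERROR event. As a minimal sketch (Python, standard library only; the helper name error_actions is hypothetical and not part of any QEMU tooling), the events quoted under "Actual results" can be compared like this:

```python
import json


def error_actions(qmp_lines):
    """Return (node-name, action) for every BLOCK_IO_ERROR event in qmp_lines."""
    actions = []
    for line in qmp_lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        # Skip command replies and other events such as STOP.
        if msg.get("event") != "BLOCK_IO_ERROR":
            continue
        data = msg["data"]
        actions.append((data.get("node-name"), data["action"]))
    return actions


# Events copied verbatim from this report.
blockdev_event = '{"timestamp": {"seconds": 1537501432, "microseconds": 641823}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "__com.redhat_reason": "enospc", "node-name": "drive_image1", "reason": "No space left on device", "operation": "write", "action": "report"}}'
drive_event = '{"timestamp": {"seconds": 1537498212, "microseconds": 218441}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block537", "reason": "No space left on device", "operation": "write", "action": "stop"}}'
stop_event = '{"timestamp": {"seconds": 1537498212, "microseconds": 219960}, "event": "STOP"}'

print(error_actions([blockdev_event]))           # [('drive_image1', 'report')]
print(error_actions([drive_event, stop_event]))  # [('#block537', 'stop')]
```

An "action" of "report" on an ENOSPC write error is the symptom of this bug; "stop" is the expected behaviour.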

Additional info:
  This is not a regression. Tested on qemu-kvm-rhev-2.12.0-1.el7.x86_64 and qemu-kvm-rhev-2.10.0-21.el7_5.7.x86_64; both hit this issue.

Comment 3 Kevin Wolf 2018-09-28 08:58:26 UTC
To be precise, the bug is with -device ...,drive=<node-name>. This can be reproduced with -drive as well if the node name is used to create the device.

The fix is simple: blk_new() should set the right defaults itself instead of relying on other code paths to do that.
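
To illustrate why the place where the defaults are set matters, here is a toy model in Python. It is not QEMU source: the class and function names (Backend, effective_werror, write_error_action) are hypothetical, and the pre-fix backend default of "report" is an assumption inferred from the events in this report, where the node-name case produced "action": "report".

```python
import errno
from enum import Enum


class OnError(Enum):
    REPORT = "report"
    IGNORE = "ignore"
    ENOSPC = "enospc"
    STOP = "stop"
    AUTO = "auto"


class Backend:
    """Toy stand-in for a block backend; not QEMU's BlockBackend."""

    def __init__(self, fixed):
        self.on_read_error = OnError.REPORT
        # Assumption: before the fix, a backend created for drive=<node-name>
        # never received the -drive defaults and behaved as "report".
        # With the fix, the constructor itself sets the write default.
        self.on_write_error = OnError.ENOSPC if fixed else OnError.REPORT


def effective_werror(device_werror, backend):
    """werror=auto on the device means: use whatever the backend has."""
    if device_werror is OnError.AUTO:
        return backend.on_write_error
    return device_werror


def write_error_action(policy, err):
    """Map a resolved policy and an errno value to the event's action."""
    if policy is OnError.AUTO:
        raise ValueError("policy must be resolved before use")
    if policy is OnError.ENOSPC:
        return "stop" if err == errno.ENOSPC else "report"
    return policy.value


old = Backend(fixed=False)
new = Backend(fixed=True)
print(write_error_action(effective_werror(OnError.AUTO, old), errno.ENOSPC))    # report
print(write_error_action(effective_werror(OnError.AUTO, new), errno.ENOSPC))    # stop
print(write_error_action(effective_werror(OnError.ENOSPC, old), errno.ENOSPC))  # stop
```

The third call corresponds to setting werror=enospc explicitly on the device, which sidesteps the backend default altogether.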

Comment 6 Kevin Wolf 2018-10-01 09:26:45 UTC
Peter, is this a problem for libvirt?

Comment 7 Peter Krempa 2018-10-01 10:52:51 UTC
We specify the policies explicitly only if the user set a specific one, so without the user specifying anything the behaviour will change. I think that the libvirt API contract allows this as we specify that the "hypervisor default" will be used.

On the other hand it would be better if the behaviour did not change when switching to -blockdev.

Comment 8 Kevin Wolf 2018-12-12 16:07:03 UTC
This is fixed with upstream commit cb53460b70 ('block-backend: Set werror/rerror defaults in blk_new()').

Comment 9 Kevin Wolf 2018-12-12 16:08:22 UTC
*** Bug 1631213 has been marked as a duplicate of this bug. ***

Comment 11 Miroslav Rezanina 2019-02-26 11:24:41 UTC
Fix included in qemu-kvm-rhev-2.12.0-24.el7

Comment 13 lchai 2019-03-05 08:19:10 UTC
1. Reproduced with "kernel-3.10.0-957.el7.x86_64" + "qemu-kvm-rhev-2.12.0-19.el7.x86_64"

1) Fill the host storage completely so that writes fail with "No space left on device";

2) Boot the guest with blockdev/drive + werror in default mode:

blockdev:
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/rhel-7.6.z.qcow2,node-name=win_disk \
-blockdev driver=qcow2,node-name=drive_win,file=win_disk \
-device virtio-blk-pci,drive=drive_win,id=win1,bus=pci.0,write-cache=on \

drive:
-drive node-name=image_drive1,if=none,aio=threads,cache=none,format=qcow2,file=/home/rhel-7.6.z.qcow2 \
-device virtio-blk-pci,drive=image_drive1,bus=pci.0 \

3) Run some write operations on the guest;
# dd if=/dev/zero of=test.bin bs=1M oflag=direct

4) Check the VM status in qmp & hmp;
qmp:
{"timestamp": {"seconds": 1551772768, "microseconds": 891733}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "__com.redhat_reason": "enospc", "node-name": "drive_win", "reason": "No space left on device", "operation": "write", "action": "report"}}
{"timestamp": {"seconds": 1551772768, "microseconds": 891877}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "__com.redhat_reason": "enospc", "node-name": "drive_win", "reason": "No space left on device", "operation": "write", "action": "report"}}
{"timestamp": {"seconds": 1551772768, "microseconds": 891956}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "__com.redhat_reason": "enospc", "node-name": "drive_win", "reason": "No space left on device", "operation": "write", "action": "report"}}

hmp:
(qemu) info status
VM status: running


2. Verified this issue on "kernel-3.10.0-957.el7.x86_64" + "qemu-kvm-rhev-2.12.0-24.el7.x86_64"

1) Fill the host storage completely so that writes fail with "No space left on device";

2) Boot the guest with blockdev/drive + werror in default mode;

3) Run some write operations on the guest;
# dd if=/dev/zero of=test.bin bs=1M oflag=direct

4) Check the VM status in qmp & hmp;
qmp:
{"timestamp": {"seconds": 1551773448, "microseconds": 343951}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "image_drive1", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1551773448, "microseconds": 344095}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "image_drive1", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1551773448, "microseconds": 344163}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "image_drive1", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1551773448, "microseconds": 347497}, "event": "STOP"}

hmp:
(qemu) info status
VM status: paused (io-error)
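
The HMP check in step 4 can also be scripted over QMP. The following is a sketch only, assuming the query-status command and its "status" and "running" reply fields; the helper names are hypothetical and the socket path is whatever was passed to -chardev or -qmp on the command line.

```python
import json
import socket


def paused_on_io_error(reply):
    """True if a query-status reply describes a VM paused after an I/O error."""
    status = reply.get("return", {})
    return status.get("status") == "io-error" and not status.get("running", True)


def query_status(sock_path):
    """Connect to a QMP UNIX socket, negotiate capabilities, run query-status."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        stream = sock.makefile("rw")
        json.loads(stream.readline())  # QMP greeting
        msg = {}
        for command in ("qmp_capabilities", "query-status"):
            stream.write(json.dumps({"execute": command}) + "\n")
            stream.flush()
            while True:
                msg = json.loads(stream.readline())
                # Asynchronous events (BLOCK_IO_ERROR, STOP) may arrive
                # before the reply; skip them.
                if "return" in msg or "error" in msg:
                    break
        return msg


# Offline check of the parsing logic with hand-written replies.
print(paused_on_io_error({"return": {"status": "io-error", "running": False}}))  # True
print(paused_on_io_error({"return": {"status": "running", "running": True}}))    # False
```

With a running guest, paused_on_io_error(query_status(path)) would be expected to return True on the fixed build and False on the affected one once the write errors occur.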

Comment 16 errata-xmlrpc 2019-08-22 09:18:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2553