Bug 1657637

Summary: Wrong werror default for -device drive=<node-name>
Product: Red Hat Enterprise Linux 8 Reporter: Kevin Wolf <kwolf>
Component: qemu-kvm Assignee: Virtualization Maintenance <virt-maint>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 8.0 CC: chayang, coli, ddepaula, juzhang, mrezanin, rbalakri, virt-maint, xuwei
Target Milestone: rc   
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-2.12.0-46.module+el8+2351+e14a4632 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1631615 Environment:
Last Closed: 2019-06-14 00:55:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1631615    
Bug Blocks:    

Description Kevin Wolf 2018-12-10 08:14:01 UTC
+++ This bug was initially created as a clone of Bug #1631615 +++

Description of problem:
 werror's auto mode does not work correctly when the guest is started with -blockdev

Version-Release number of selected component (if applicable):
  kernel version: 3.10.0-945.el7.x86_64
  qemu-kvm-rhev version: qemu-kvm-rhev-2.12.0-17.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install a guest with gluster backend.

2. Fill the gluster backend completely so that writes fail with "No space left on device"

3. Start vm with -blockdev:
    /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20180910-021412-u4bPHcZI,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20180910-021412-u4bPHcZI,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idDQoT2q  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20180910-021412-u4bPHcZI,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20180910-021412-u4bPHcZI,path=/var/tmp/seabios-20180910-021412-u4bPHcZI,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20180910-021412-u4bPHcZI,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -object secret,id=sec0,data=backing \
    -blockdev driver=qcow2,file.driver=gluster,node-name=drive_image1,cache.no-flush=on,cache.direct=off,encrypt.key-secret=sec0,encrypt.format=luks,file.server.0.type=inet,file.server.0.host=ibm-x3650m5-07.lab.eng.pek2.redhat.com,file.server.0.port=24007,file.volume=aliang,file.path=rhel76-64-virtio.qcow2.25 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pci.0,addr=0x4,scsi=off,serial=0xdafafafafafaf12121212121212121212121212121212121212121212aaafffffffff,physical_block_size=4096,logical_block_size=512,disable-modern=on,disable-legacy=off \
    -device virtio-net-pci,mac=9a:5b:5c:5d:5e:5f,id=idqHhiSX,vectors=4,netdev=idLUiEfL,bus=pci.0,addr=0x5  \
    -netdev tap,id=idLUiEfL,vhost=on \
    -m 11264  \
    -blockdev driver=raw,file.driver=file,node-name=drive_cd1,cache.no-flush=on,cache.direct=off,file.filename=/home/kvm_autotest_root/iso/linux/RHEL7.6-Server-x86_64.iso,read-only=on \
    -device ide-cd,id=cd1,drive=drive_cd1 \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
    -cpu Penryn \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=on,strict=off,order=cdn  \
    -enable-kvm \
    -monitor stdio \
    -blockdev driver=raw,file.driver=iscsi,node-name=drive_data2,cache.no-flush=on,cache.direct=off,file.transport=tcp,file.portal=10.73.224.153,file.target=iqn.2018-09.com.example:t1,file.lun=1 \
    -device virtio-blk-pci,id=data2,drive=drive_data2,bus=pci.0,addr=0x07 \
    -blockdev driver=qcow2,file.driver=file,node-name=drive_data1,cache.no-flush=on,cache.direct=off,file.filename=/home/data.qcow2 \
    -device virtio-blk-pci,id=data1,drive=drive_data1,bus=pci.0 \
    -blockdev driver=raw,node-name=drive_data0,cache.no-flush=on,cache.direct=off,file.filename=/dev/disk/by-path/ip-10.73.224.153:3260-iscsi-iqn.2018-09.com.example:t1-lun-2,file.driver=host_device \
    -device virtio-blk-pci,bus=pci.0,id=data0,drive=drive_data0,scsi=on,disable-modern=on \
    -qmp tcp:0:3000,server,nowait

4. Start some applications in the guest and check the VM status via the QMP monitor:
   (qmp)#nc -U /var/tmp/monitor-qmpmonitor1-20180910-021412-u4bPHcZI
{"execute":"qmp_capabilities"}
{"timestamp": {"seconds": 1537499256, "microseconds": 309834}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "__com.redhat_reason": "enospc", "node-name": "drive_image1", "reason": "No space left on device", "operation": "write", "action": "report"}}

5. Shut down the VM, then restart it with -drive:
    ...
    -drive if=none,format=qcow2,id=drive_image1,cache=none,encrypt.key-secret=sec0,encrypt.format=luks,file=gluster://ibm-x3650m5-07.lab.eng.pek2.redhat.com/aliang/rhel76-64-virtio.qcow2.25 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pci.0,addr=0x4,scsi=off,serial=0xdafafafafafaf12121212121212121212121212121212121212121212aaafffffffff,physical_block_size=4096,logical_block_size=512,disable-modern=on,disable-legacy=off \
    ...

6. Start some applications in the guest and check the VM status via the QMP monitor:
   (qmp)#nc -U /var/tmp/monitor-qmpmonitor1-20180910-021412-u4bPHcZI
{"execute":"qmp_capabilities"}
{"timestamp": {"seconds": 1537498212, "microseconds": 218441}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block537", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1537498212, "microseconds": 219006}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": false, "__com.redhat_reason": "eother", "node-name": "#block537", "reason": "No medium found", "operation": "write", "action": "report"}}
{"timestamp": {"seconds": 1537498212, "microseconds": 219960}, "event": "STOP"}

7. Shut down the VM, then start it with -blockdev and werror=enospc on the device:
   ....
   -blockdev driver=qcow2,file.driver=gluster,node-name=drive_image1,cache.no-flush=on,cache.direct=off,encrypt.key-secret=sec0,encrypt.format=luks,file.server.0.type=inet,file.server.0.host=ibm-x3650m5-07.lab.eng.pek2.redhat.com,file.server.0.port=24007,file.volume=aliang,file.path=rhel76-64-virtio.qcow2.25 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,werror=enospc,bus=pci.0,addr=0x4,scsi=off,serial=0xdafafafafafaf12121212121212121212121212121212121212121212aaafffffffff,physical_block_size=4096,logical_block_size=512,disable-modern=on,disable-legacy=off \
   ....

8. Start some applications in the guest and check the VM status via the QMP monitor:
    (qmp)#nc -U /var/tmp/monitor-qmpmonitor1-20180910-021412-u4bPHcZI
{"execute":"qmp_capabilities"}
{"timestamp": {"seconds": 1537498212, "microseconds": 218441}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block537", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1537498212, "microseconds": 219960}, "event": "STOP"}
    
Actual results:
 Both -drive and -blockdev with werror=enospc behave correctly on "No space left on device" and emit this QMP event:
    {"timestamp": {"seconds": 1537498212, "microseconds": 218441}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block537", "reason": "No space left on device", "operation": "write", "action": "stop"}}

 -blockdev with werror=auto does not behave correctly on "No space left on device"; it emits this QMP event and the guest is not stopped:
     {"timestamp": {"seconds": 1537501432, "microseconds": 641823}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "__com.redhat_reason": "enospc", "node-name": "drive_image1", "reason": "No space left on device", "operation": "write", "action": "report"}}

Expected results:

 -blockdev with werror=auto works correctly, that is, the guest is stopped on "No space left on device" as it is with -drive.
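
The difference between the actual and expected results comes down to the "action" field of the BLOCK_IO_ERROR event. The following is a minimal Python sketch, not part of the original report, that extracts that field from one line of QMP output. The helper name block_io_error_action is hypothetical; the two sample events are copied from the "Actual results" section above.

```python
import json


def block_io_error_action(line):
    """Return the "action" field of a BLOCK_IO_ERROR event, or None for other input."""
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(msg, dict) or msg.get("event") != "BLOCK_IO_ERROR":
        return None
    return msg.get("data", {}).get("action")


# Events copied from the "Actual results" section of this report.
drive_event = '{"timestamp": {"seconds": 1537498212, "microseconds": 218441}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block537", "reason": "No space left on device", "operation": "write", "action": "stop"}}'
blockdev_event = '{"timestamp": {"seconds": 1537501432, "microseconds": 641823}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "__com.redhat_reason": "enospc", "node-name": "drive_image1", "reason": "No space left on device", "operation": "write", "action": "report"}}'

print(block_io_error_action(drive_event))     # -drive case
print(block_io_error_action(blockdev_event))  # -blockdev case with default werror
```

A result of "stop" means the guest is paused on the error; "report" means the error is passed on to the guest, which keeps running.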

Additional info:
  This is not a regression. Tested on qemu-kvm-rhev-2.12.0-1.el7.x86_64 and qemu-kvm-rhev-2.10.0-21.el7_5.7.x86_64; both hit this issue.

--- Additional comment from Kevin Wolf on 2018-09-28 10:58:26 CEST ---

To be precise, the bug is with -device ...,drive=<node-name>. This can be reproduced with -drive as well if the node name is used to create the device.

The fix is simple: blk_new() should set the right defaults instead of relying on other code paths to do that.
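
To illustrate why the default matters, here is a toy Python model of how a werror policy maps a failed write to an action. It is a sketch only and does not mirror QEMU's C code: the function and constant names are hypothetical, and it assumes the intended default write policy is "enospc", which matches the -drive behaviour shown in this report.

```python
import errno


def write_error_action(policy, err):
    """Map a werror policy and an errno value to the action for a failed write."""
    if policy == "enospc":
        # Stop the guest only when the host ran out of space; report anything else.
        return "stop" if err == errno.ENOSPC else "report"
    if policy in ("stop", "report", "ignore"):
        return policy
    raise ValueError(f"unknown werror policy: {policy!r}")


# Illustrative constants (assumptions, not QEMU identifiers):
OBSERVED_DEFAULT = "report"  # behaviour seen with -device drive=<node-name> before the fix
INTENDED_DEFAULT = "enospc"  # behaviour seen with -drive, assumed to be the intended default

print(write_error_action(OBSERVED_DEFAULT, errno.ENOSPC))
print(write_error_action(INTENDED_DEFAULT, errno.ENOSPC))
```

With the observed default, an ENOSPC write error yields "report", as in the BLOCK_IO_ERROR event from the -blockdev run; with the intended default it yields "stop".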

--- Additional comment from Tingting Mao on 2018-09-28 13:23:23 CEST ---

With '-drive' plus '-device', the issue can be reproduced with a command like the one below (thanks to Kevin and Miya).


/usr/libexec/qemu-kvm \
      ...
      -drive node-name=image_drive1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=base.qcow2 \
      -device virtio-blk-pci,drive=image_drive1,bus=pci.0,addr=05 \
      ...

--- Additional comment from aihua liang on 2018-09-30 08:39:11 CEST ---

Yes, I tried with this command:
  ...
  -drive if=none,format=qcow2,node-name=drive_image1,cache=none,encrypt.key-secret=sec0,encrypt.format=luks,file=gluster://ibm-x3650m5-07.lab.eng.pek2.redhat.com/aliang/rhel76-64-virtio.qcow2.25 \
  -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pci.0,addr=0x4,scsi=off,serial=0xdafafafafafaf12121212121212121212121212121212121212121212aaafffffffff,physical_block_size=4096,logical_block_size=512,disable-modern=on,disable-legacy=off
  ...

This also hits the issue.

--- Additional comment from Kevin Wolf on 2018-10-01 11:26:45 CEST ---

Peter, is this a problem for libvirt?

--- Additional comment from Peter Krempa on 2018-10-01 12:52:51 CEST ---

We specify the policies explicitly only if the user set a specific one, so without the user specifying anything the behaviour will change. I think that the libvirt API contract allows this as we specify that the "hypervisor default" will be used.

On the other hand it would be better if the behaviour did not change when switching to -blockdev.
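
The policy described above (pass werror only when the user asked for a specific one, otherwise leave it to the hypervisor default) can be sketched in Python as follows. This is an illustration and not libvirt code; the function name virtio_blk_device_arg is hypothetical, and the only option names used are the ones that appear in the command lines in this report.

```python
def virtio_blk_device_arg(node_name, werror=None):
    """Build a -device argument; werror is added only when explicitly requested."""
    parts = ["virtio-blk-pci", f"drive={node_name}"]
    if werror is not None:
        parts.append(f"werror={werror}")
    return ",".join(parts)


# No explicit policy: the hypervisor default applies, which is what this bug affects.
print(virtio_blk_device_arg("drive_image1"))
# Explicit policy: behaviour does not depend on the default.
print(virtio_blk_device_arg("drive_image1", werror="enospc"))
```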

Comment 1 Danilo de Paula 2018-12-11 11:08:26 UTC
QA_ACK+, please?

Comment 2 Danilo de Paula 2018-12-11 16:11:25 UTC
Fix included in qemu-kvm-2.12.0-46.module+el8+2351+e14a4632

Comment 4 lchai 2018-12-13 02:40:25 UTC
1. Reproduced this issue on "kernel-4.18.0-40.el8" + "qemu-kvm-2.12.0-45.module+el8+2313+d65431a0.x86_64"

1) Boot the guest with -blockdev or -drive, leaving werror at its default:
blockdev:
        -blockdev driver=raw,file.driver=file,node-name=driver_system_disk,cache.no-flush=on,cache.direct=off,file.filename=/root/win1.raw \
        -device virtio-blk-pci,drive=driver_system_disk,id=system-disk,bus=pci.0,addr=0x3 \

drive:
        -drive node-name=image_drive1,if=none,aio=threads,cache=none,format=raw,file=/root/win1.raw \
        -device virtio-blk-pci,drive=image_drive1,bus=pci.0 \


2) Run some write operations on the guest:
# dd if=/dev/zero of=test.bin bs=1M oflag=direct

3) Check the VM status in qmp/hmp:
qmp:
{"timestamp": {"seconds": 1544607796, "microseconds": 522300}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "node-name": "driver_system_disk", "reason": "No space left on device", "operation": "write", "action": "report"}}
{"timestamp": {"seconds": 1544607796, "microseconds": 522494}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "node-name": "driver_system_disk", "reason": "No space left on device", "operation": "write", "action": "report"}}
hmp:
(qemu) info status
VM status: running

2. This issue was fixed on "kernel-4.18.0-40.el8" + "qemu-kvm-2.12.0-47.module+el8+2367+d2ba437c.x86_64"

1) Boot the guest with -blockdev or -drive, leaving werror at its default (same command lines as in the reproduction above):

2) Run some write operations on the guest:
# dd if=/dev/zero of=test.bin bs=1M oflag=direct

3) Check the VM status in qmp/hmp:
qmp:
{"timestamp": {"seconds": 1544666508, "microseconds": 843206}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": true, "node-name": "driver_system_disk", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1544666508, "microseconds": 843318}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": true, "node-name": "driver_system_disk", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1544666508, "microseconds": 848617}, "event": "STOP"}

hmp:
(qemu) info status
VM status: paused (io-error)
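
The HMP check used in this verification can be automated with a small parser. The following Python sketch is not part of the original comment; the function name parse_info_status is hypothetical, and it assumes the "info status" output has the two shapes shown in this report ("VM status: running" and "VM status: paused (io-error)").

```python
import re

# Matches "VM status: <state>" with an optional "(<reason>)" suffix.
_STATUS_RE = re.compile(r"^VM status: (\w+)(?: \(([\w-]+)\))?$")


def parse_info_status(line):
    """Split an HMP 'info status' line into (state, reason); reason may be None."""
    match = _STATUS_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised status line: {line!r}")
    return match.group(1), match.group(2)


# Outputs copied from this comment: before the fix and after the fix.
print(parse_info_status("VM status: running"))
print(parse_info_status("VM status: paused (io-error)"))
```

A fixed build is expected to yield the state "paused" with the reason "io-error" once the backing storage runs out of space.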

Comment 5 Xueqiang Wei 2018-12-13 05:36:38 UTC
*** Bug 1643351 has been marked as a duplicate of this bug. ***