Bug 1251943

Summary: "ENOSPC" caused iscsi error message flood during guest install
Product: Red Hat Enterprise Linux 7 Reporter: Yanan Fu <yfu>
Component: qemu-kvm-rhev Assignee: Fam Zheng <famz>
Status: CLOSED ERRATA QA Contact: FuXiangChun <xfu>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.2 CC: chayang, famz, hhuang, juzhang, knoel, virt-maint, yfu
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Windows   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.5.0-1.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-07 20:36:04 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Yanan Fu 2015-08-10 11:33:38 UTC
Description of problem:
The Win 8.1 guest image is accessed through libiscsi and does not have enough space.
Installing the guest floods the qemu monitor with error messages:

2015-08-10T02:04:48.533151Z qemu-kvm: iSCSI Failure: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:LBA_OUT_OF_RANGE(0x2100)

Version-Release number of selected component (if applicable):
qemu-kvm:2.3.0-13.el7.x86_64
kernel:3.10.0-300.el7.x86_64
virtio-win:virtio-win-1.7.4-1.el7.noarch


How reproducible:
100%

Steps to Reproduce:
1. Configure the libiscsi server.
2. Create a 1G LVM logical volume and format it as a qcow2 image:
# vgcreate vgtest /dev/sdb
# lvcreate -n lvtest -L 1G vgtest
# qemu-img create -f qcow2 /dev/vgtest/lvtest 20G
3. Add the image "/dev/vgtest/lvtest" to the libiscsi server.
4. Start a VM and install a Win 8.1 guest from cdrom, with "/dev/vgtest/lvtest" as the guest image.


Actual results:
During install, the iscsi error log floods: 2015-08-10T02:04:48.533151Z qemu-kvm: iSCSI Failure: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:LBA_OUT_OF_RANGE(0x2100)

Expected results:
The qemu monitor shows "paused (io-error)". After the LV is extended, the install can continue and finish successfully.

Additional info:

Comment 2 Fam Zheng 2015-09-06 09:20:48 UTC
What's your QEMU command line? You need to set werror=stop on -drive; that will pause the VM.

Comment 3 Yanan Fu 2015-09-16 12:10:49 UTC
(In reply to Fam Zheng from comment #2)
> What's your QEMU command line? You need to set werror=stop to -drive, that
> will make vm paused.

When "werror=stop" is added to -drive, the guest is stopped with "io-error", but before it stops there is still a flood of iscsi error messages as before.

command line:
/usr/libexec/qemu-kvm -name test -machine pc,accel=kvm,usb=off,dump-guest-core=on -m 6G -cpu SandyBridge -smp 4,sockets=4,cores=1,threads=1 -no-user-config -nodefaults -monitor stdio -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot menu=on,strict=on -device pci-bridge,bus=pci.0,id=bridge1,chassis_nr=1,addr=0x5 -device ich9-usb-ehci1,id=usb,bus=bridge1,addr=0x2.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=bridge1,multifunction=on,addr=0x2.0x0 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=bridge1,addr=0x2.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=bridge1,addr=0x2.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=bridge1,addr=0x4 -chardev socket,path=/tmp/yfu,server,nowait,id=yfu0 -netdev tap,id=hostnet,vhost=on -device virtio-net-pci,netdev=hostnet,id=net,mac=78:1a:4a:d6:b8:97,bus=bridge1,addr=0x6,bootindex=2 -vnc :1 -vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=33554432 -msg timestamp=on -monitor unix:/home/qmp,server,nowait -serial unix:/tmp/ttyS0,server,nowait -qmp tcp:0:4444,server,nowait -device virtio-scsi-pci,id=scsi -iscsi initiator-name=iqn.1994-05.com.redhat:c4836823271d -drive file=iscsi://10.66.4.122/fuyanan/2,if=none,id=drive-scsi,media=disk,format=qcow2,werror=stop,cache=none,aio=native,serial=sluo-disk1-iscsi,id=iqn -device scsi-disk,drive=drive-scsi,id=scsi-0,bus=scsi.0,bootindex=1 -drive file=/root/en_windows_8_1_enterprise_x64_dvd_2971902.iso,if=none,media=cdrom,readonly=on,format=raw,id=cdrom1 -device scsi-disk,drive=cdrom1,id=scsi-1,bus=scsi.0,bootindex=0 -fda /usr/share/virtio-win/virtio-win-1.7.4_amd64.vfd

Comment 4 Fam Zheng 2015-09-17 05:01:12 UTC
What is the flood like?

Comment 5 Yanan Fu 2015-09-17 08:23:34 UTC
(In reply to Fam Zheng from comment #4)
> What is the flood like?

Sorry, I made a mistake before.
1. Without "werror=stop" (I think the default value is "werror=enospc"), qemu keeps printing:

2015-09-17T07:55:34.229860Z qemu-kvm: iSCSI Failure: SENSE KEY:(null)(3) ASCQ:(null)(0x1100)
2015-09-17T07:55:34.229986Z qemu-kvm: iSCSI Failure: SENSE KEY:(null)(3) ASCQ:(null)(0x1100)
2015-09-17T07:55:34.297813Z qemu-kvm: iSCSI Failure: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:LBA_OUT_OF_RANGE(0x2100)
2015-09-17T07:55:34.322869Z qemu-kvm: iSCSI Failure: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:LBA_OUT_OF_RANGE(0x2100)
2015-09-17T07:55:34.347555Z qemu-kvm: iSCSI Failure: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:LBA_OUT_OF_RANGE(0x2100)
......

and "info status" in the qemu monitor shows that the VM status is "running".

2. With "werror=stop", qemu prints a few messages and then stops (the error messages stop after a short time):
2015-09-17T08:16:11.252753Z qemu-kvm: iSCSI Failure: SENSE KEY:(null)(3) ASCQ:(null)(0x1100)
2015-09-17T08:16:15.728537Z qemu-kvm: iSCSI Failure: SENSE KEY:(null)(3) ASCQ:(null)(0x1100)

Then there is no more output, and "info status" reports "VM status: paused (io-error)".
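
The two behaviours observed above can be illustrated with a minimal Python sketch of how a write-error policy might pick an action. This is not QEMU code: the function name `on_write_error` and the returned action strings are hypothetical, and the sketch only assumes the documented meaning of the werror values (ignore, report, stop, enospc).

```python
import errno


def on_write_error(werror: str, err: int) -> str:
    """Return the action for a failed write: 'stop', 'report' or 'ignore'.

    `werror` is the policy given on -drive; `err` is a positive errno value.
    Hypothetical helper for illustration only.
    """
    if werror == "stop":
        # Pause the VM on any write error.
        return "stop"
    if werror == "enospc":
        # Pause only when the error is "no space left"; otherwise
        # the error is reported to the guest and the VM keeps running.
        return "stop" if err == errno.ENOSPC else "report"
    if werror in ("report", "ignore"):
        return werror
    raise ValueError(f"unknown werror policy: {werror!r}")


if __name__ == "__main__":
    # With the default policy, an EIO does not pause the VM.
    print(on_write_error("enospc", errno.EIO))
    print(on_write_error("stop", errno.EIO))
```

Under these assumptions, a driver that reports every failure as EIO never triggers the pause under "werror=enospc", which matches the "running" status seen in case 1.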

Comment 6 Fam Zheng 2015-10-21 12:45:38 UTC
Issue confirmed.

When a write triggers cluster allocation, qcow2 uses the next unused cluster offset. If the offset falls beyond the end of the iscsi target (right, this is not checked), we get the "LBA_OUT_OF_RANGE" error from the iscsi target.
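
As a rough illustration of the paragraph above, the following Python sketch checks whether a newly allocated cluster still fits on the backing device. It is a simplification under stated assumptions: clusters are allocated sequentially from offset 0, the cluster size is the qcow2 default of 64 KiB, the device is the 1G logical volume from the reproduction steps (taken as 1 GiB), and the helper names are hypothetical.

```python
CLUSTER_SIZE = 64 * 1024   # qcow2 default cluster size (assumed here)
DEVICE_SIZE = 1024 ** 3    # the 1G LV from the reproduction steps, taken as 1 GiB


def allocation_fits(cluster_index: int,
                    device_size: int = DEVICE_SIZE,
                    cluster_size: int = CLUSTER_SIZE) -> bool:
    """True if the cluster at `cluster_index` lies entirely inside the device."""
    offset = cluster_index * cluster_size
    return offset + cluster_size <= device_size


def first_out_of_range_cluster(device_size: int = DEVICE_SIZE,
                               cluster_size: int = CLUSTER_SIZE) -> int:
    """Index of the first cluster whose offset falls beyond the device end."""
    return device_size // cluster_size


if __name__ == "__main__":
    idx = first_out_of_range_cluster()
    print(idx, allocation_fits(idx - 1), allocation_fits(idx))
```

With a 20G virtual size on a 1 GiB device, the guest can legitimately write far more data than the device holds, so the next unused cluster offset eventually passes the end of the target.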

From the iscsi driver's point of view, it is impossible to distinguish this particular case from a guest "out of range" access. As a result, it's not right to return -ENOSPC here, so -EIO is returned instead. On the other hand, due to the limitation of the interface, qcow2 doesn't know whether the -EIO is caused by an "out of range" access or by some other failure.

I think the iscsi driver should return -ERANGE for the LBA_OUT_OF_RANGE error, and qcow2 should translate -ERANGE to -ENOSPC.
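
The proposal above could be sketched as follows, in Python for brevity. This only illustrates the two-step idea from this comment and is not the code that went upstream. The function names are hypothetical; the numeric values for ILLEGAL_REQUEST (5) and LBA_OUT_OF_RANGE (0x2100) are the ones shown in the log lines in this report.

```python
import errno

SENSE_KEY_ILLEGAL_REQUEST = 0x5   # as printed in the log: ILLEGAL_REQUEST(5)
ASCQ_LBA_OUT_OF_RANGE = 0x2100    # as printed in the log: LBA_OUT_OF_RANGE(0x2100)


def iscsi_errno_from_sense(key: int, ascq: int) -> int:
    """Driver side (hypothetical): map a SCSI sense to a negative errno."""
    if key == SENSE_KEY_ILLEGAL_REQUEST and ascq == ASCQ_LBA_OUT_OF_RANGE:
        # The driver cannot tell why the LBA was out of range,
        # so it only reports the range error.
        return -errno.ERANGE
    # Everything else stays a generic I/O error in this sketch.
    return -errno.EIO


def qcow2_allocation_errno(ret: int) -> int:
    """Format side (hypothetical): a range error during cluster allocation
    means the backing device is full."""
    return -errno.ENOSPC if ret == -errno.ERANGE else ret


if __name__ == "__main__":
    ret = iscsi_errno_from_sense(0x5, 0x2100)
    print(ret == -errno.ERANGE, qcow2_allocation_errno(ret) == -errno.ENOSPC)
```

Keeping the translation in qcow2 means a guest-originated out-of-range access on a raw iscsi disk would not be reported as "no space" in this scheme.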

This should go to upstream first.

Fam

Comment 8 Fam Zheng 2015-11-16 06:21:55 UTC
This was fixed in upstream by commit:


commit e01dd3da5cf9aa90ae844d3b86c2c2762066edac
Author: Fam Zheng <famz>
Date:   Thu Nov 5 13:00:09 2015 +0800

    iscsi: Translate scsi sense into error code
    
    Previously we return -EIO blindly when anything goes wrong. Add a helper
    function to parse sense fields and try to make the return code more
    meaningful.
    
    This also fixes the default werror configuration (enospc) when we're
    using qcow2 on an iscsi lun. The old -EIO not being treated as out of
    space error failed to trigger vm stop.
    
    Signed-off-by: Fam Zheng <famz>
    Message-Id: <1446699609-11376-1-git-send-email-famz>
    [libiscsi 1.9 compatibility - Paolo]
    Signed-off-by: Paolo Bonzini <pbonzini>

Comment 13 errata-xmlrpc 2016-11-07 20:36:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html