565836 – sym53c8xx scsi fails if qcow file used, works if lv used

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kvm
Sub Component:
Version:	13
Hardware:	x86_64
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Glauber Costa
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	494832

Reported:	2010-02-16 13:50 UTC by Ilkka Tengvall
Modified:	2011-06-27 14:59 UTC
CC List:	8 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2011-06-27 14:59:30 UTC
Type:	---
Embargoed:
Dependent Products:

Description Ilkka Tengvall 2010-02-16 13:50:48 UTC

Description of problem:
I am trying to boot our Linux variant under KVM. If I use a qcow file as the guest's sda, the install fails due to a SCSI error. If I use a logical volume as sda, it works. So there may be a bug in KVM's SCSI handling that lets the LV case work but makes the qcow case fail.


Version-Release number of selected component (if applicable):

qemu-kvm-0.12.2-6.fc12.x86_64
kernel-2.6.31.12-174.2.3.fc12.x86_64


$ modinfo kvm kvm-intel
filename:       /lib/modules/2.6.31.12-174.2.3.fc12.x86_64/kernel/arch/x86/kvm/kvm.ko
license:        GPL
author:         Qumranet
srcversion:     90771304CC8C82FE80BFC84
depends:        
vermagic:       2.6.31.12-174.2.3.fc12.x86_64 SMP mod_unload 
parm:           oos_shadow:bool
filename:       /lib/modules/2.6.31.12-174.2.3.fc12.x86_64/kernel/arch/x86/kvm/kvm-intel.ko
license:        GPL
author:         Qumranet
srcversion:     8FEA479DFCD7F174DA7864E
depends:        kvm
vermagic:       2.6.31.12-174.2.3.fc12.x86_64 SMP mod_unload 
parm:           bypass_guest_pf:bool
parm:           vpid:bool
parm:           flexpriority:bool
parm:           ept:bool
parm:           emulate_invalid_guest_state:bool


How reproducible:
Happens every time I try it.

Steps to Reproduce:
1. Have a guest with kernel 2.6.21 that uses the sym53c8xx SCSI driver.
2. Set up the disk: "qemu-img create -f qcow /tmp/fp.img 40G"
3. Start the guest: qemu-kvm -enable-kvm -m 512 -name fp-cla0 -boot n -drive file=/tmp/cla0.img,if=scsi,boot=on -net nic,model=e1000 -net tap,script=no,ifname=tap0
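
The two commands in the steps above name different image files (/tmp/fp.img and /tmp/cla0.img). A minimal sketch of the same steps driven from one variable, so that the image that is created is also the one that is booted, could look like this. IMG, DRY_RUN and the run helper are hypothetical names introduced only for this illustration; every qemu-img and qemu-kvm option is copied from the report.

```shell
#!/bin/sh
# Hypothetical helper variables, not part of the original report.
IMG=${IMG:-/tmp/fp.img}     # disk image used for both steps
DRY_RUN=${DRY_RUN:-1}       # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it unchanged.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 2: create the qcow image (options as given in the report).
run qemu-img create -f qcow "$IMG" 40G

# Step 3: start the guest with that same image as the SCSI disk.
run qemu-kvm -enable-kvm -m 512 -name fp-cla0 -boot n \
    -drive "file=$IMG,if=scsi,boot=on" \
    -net nic,model=e1000 -net tap,script=no,ifname=tap0
```

With the default DRY_RUN=1 the script only prints both command lines for review. Setting DRY_RUN=0 runs them, and pointing IMG at a block device path instead would exercise the logical-volume case with otherwise identical options (skipping the create step in that case is left to the user).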
  
Actual results:

The device gets stuck in an infinite loop, with output like this on the serial console:

--------------------------------------
drbd14: role( Secondary -> Primary ) disk( Inconsistent -> UpToDate ) pdsk( DUnknown -> Outdated )
drbd14: Forced to consider local data as UpToDate!
drbd14: Creating new current UUID
sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation failed.
sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation complete.
sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation failed.
sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation failed.
sd 0:0:0:0: DEVICE RESET operation started.
 target0:0:0: control msgout: c.
sym0: TARGET 0 has been reset.
sd 0:0:0:0: DEVICE RESET operation complete.
sd 0:0:0:0: M_REJECT received (0:0).
sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation complete.
sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation complete.
--------------------------------------

and this in the output of the qemu-kvm command line (stderr?):
--------------------------------------
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: Unimplemented message 0x0c
--------------------------------------
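
For reading the two logs together: in the SCSI-2 parallel message set, code 0x0c is BUS DEVICE RESET and 0x07 is MESSAGE REJECT. That seems consistent with the guest printing "control msgout: c." and then "M_REJECT received" while qemu reports the message as unimplemented, although this pairing is an interpretation of the logs and not something the report confirms. A small sketch of a lookup for a few common codes follows; scsi_msg_name is a hypothetical helper name and the table is deliberately incomplete.

```shell
#!/bin/sh
# Map a one-byte SCSI message code to its SCSI-2 name.
# Only a handful of codes are listed; the rest are reported as unknown.
scsi_msg_name() {
    # Accept "0x0c", "0C" or "c": strip an optional 0x prefix, then
    # convert the hex digits to decimal.
    code=$(printf '%d' "0x${1#0[xX]}") || return 1
    case "$code" in
        0)  echo "COMMAND COMPLETE" ;;
        6)  echo "ABORT" ;;
        7)  echo "MESSAGE REJECT" ;;
        8)  echo "NO OPERATION" ;;
        12) echo "BUS DEVICE RESET" ;;
        13) echo "ABORT TAG" ;;
        *)  echo "unknown (not in this table)" ;;
    esac
}

# The code from the lsi_scsi error lines above.
scsi_msg_name 0x0c
```

Inputs are always read as hexadecimal, so "c" and "0x0c" are treated the same way.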




Expected results:
No ABORT, DEVICE RESET or M_REJECT messages appear. That is the case with an LV as the disk:
qemu-kvm -enable-kvm -m 512 -name fp-lynx -boot n \
  -drive file=/dev/VB_Whipper/LV_KVM_fp-lynx,if=scsi,boot=on \
  -net nic,model=e1000 -net tap,script=no,ifname=tap0


Additional info:

The host is an up-to-date F12 with the rawvirt repo enabled and the KVM packages updated from there. I'm willing to collect more logs or any other debug output you might need.

Comment 1 Bug Zapper 2010-03-15 15:05:23 UTC

This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 2 Ilkka Tengvall 2010-04-21 11:52:27 UTC

This seems to be a weird problem. We hit it even on RHEL6A3, but without the SCSI error printouts: the systems just get stuck. The time to failure varies; sometimes it takes longer to get stuck, sometimes less.

I tried to build a reproducer on RHEL6A3, but on that setup it doesn't happen. On our production systems under RHEL6A3 it does happen, and the only way past it was to switch to the IDE device driver for disk access. Our guest kernel 2.6.21 doesn't have virtio; that only became available in later kernels.

Something is fishy when the guest uses SCSI and the host backs the disk with a file. Switching the host to a block device for the guest's disk makes the problem disappear. It likely has something to do with timing, and on some systems sym53c8xx runs into the problem under KVM more easily than on others. The hosts and guests are the same; the hardware underneath makes the difference.


So I'm going to stop trying to build a reproducer. You may close the ticket, since I can't reproduce it systematically.

Comment 3 Patrick C. F. Ernzer 2010-09-29 15:49:52 UTC

Ikke,

F14 Beta is now available on the internal fileserver, can you please retest this bug against F14 Beta? The LS21 in slot 3 is still assigned to you, so you have a testmachine that can be left running for a couple days.

PCFE

Comment 4 Bug Zapper 2011-06-02 16:29:21 UTC

This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 5 Bug Zapper 2011-06-27 14:59:30 UTC

Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
