Bug 480843

Summary: SCSI problems on fullvirt guests with > 4Gb mem
Product: Red Hat Enterprise Linux 5 Reporter: Issue Tracker <tao>
Component: xen Assignee: Rik van Riel <riel>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: urgent    
Version: 5.2 CC: cward, jplans, jruemker, mmilgram, mshao, riel, tao, yuzhang
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-09-02 10:05:28 UTC Type: ---
Bug Depends On:    
Bug Blocks: 484336    
Attachments:
Description Flags
backport of upstream patch none

Description Issue Tracker 2009-01-20 20:43:40 UTC
Escalated to Bugzilla from IssueTracker

Comment 1 Issue Tracker 2009-01-20 20:43:43 UTC
GENERAL ESCALATION INFO
##########################
PLATFORM: x86_64

PROBLEM: On fully virtualized Xen guests with approximately 4 GB of memory or more, there appears to be general corruption on the SCSI bus, which causes a number of problems.  I have been able to reproduce this issue on two systems, and as best I can tell the problems start to occur when the guest is allocated about 3930 MB of memory or more.  With anything less than that I do not see any problems (and the customer's observations are the same).  The problems observed are some combination of the following:

-The partition table is not read at boot time, so the proper device nodes aren't created.  Sometimes running partprobe after boot reads it and creates everything properly, but this means the filesystems cannot be listed in fstab, and therefore init scripts that depend on those filesystems cannot start properly.

-/proc/scsi/scsi shows strange attributes for those SCSI devices.  Two examples of what I have seen:

   Host: scsi0 Channel: 00 Id: 00 Lun: 00
    Vendor: W   W    Model:                  Rev:    
    Type:   Direct-Access                    ANSI SCSI revision: ffffffff

   Host: scsi0 Channel: 00 Id: 00 Lun: 00
     Vendor:  o       Model:          :       Rev: inpu
     Type:   Direct-Access                    ANSI SCSI revision: ffffffff

compared to what it should look like:

   Host: scsi0 Channel: 00 Id: 00 Lun: 00
     Vendor: QEMU     Model: QEMU HARDDISK    Rev: 0.9.
     Type:   Direct-Access                    ANSI SCSI revision: 03

-e2fsck/dumpe2fs fails.  It may report something like:

   # e2fsck /dev/sda1
   e2fsck 1.39 (29-May-2006)
   Couldn't find ext2 superblock, trying backup blocks...
   e2fsck: Bad magic number in super-block while trying to open /dev/sda1

   The superblock could not be read or does not describe a correct ext2
   filesystem.  If the device is valid and it really contains an ext2
   filesystem (and not swap or ufs or something else), then the superblock
   is corrupt, and you might try running e2fsck with an alternate superblock:
      e2fsck -b 8193 <device>

or it may find errors in the filesystem and ask to fix them.  I have seen e2fsck succeed several times in a row (reporting that the filesystem is clean) and then, with nothing changed, suddenly fail or report filesystem errors.

-Filesystem errors causing journal aborts.  Example:

   Jan 15 10:57:08 localhost kernel: journal_bmap: journal block not found at offset 12 on sda1
   Jan 15 10:57:08 localhost kernel: Aborting journal on device sda1.
   Jan 15 10:57:08 localhost kernel: __journal_remove_journal_head: freeing b_committed_data
   Jan 15 10:57:25 localhost kernel: ext3_abort called.
   Jan 15 10:57:25 localhost kernel: EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal
   Jan 15 10:57:25 localhost kernel: Remounting filesystem read-only

This happened a few seconds after copying some files to that filesystem

-On one occasion I had a kernel panic while writing the partition table with fdisk.  Unfortunately I wasn't set up to capture any info and it hasn't happened again. 
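
For anyone checking a guest for the symptoms above, here is a minimal sketch that flags the two that are easiest to detect mechanically: a bogus "ANSI SCSI revision" value in /proc/scsi/scsi and "Aborting journal" messages in the kernel log. The function names and the "wider than two hex digits" rule are assumptions made for this example, not part of any existing tool; the sample text is taken from the output quoted above.

```python
import re

REVISION_RE = re.compile(r"ANSI\s+SCSI revision:\s*([0-9a-fA-F]+)")
ABORT_RE = re.compile(r"Aborting journal on device ([\w/-]+)")


def corrupt_scsi_hosts(proc_scsi_text):
    """Return the 'Host:' lines whose ANSI SCSI revision looks bogus.

    Assumption: a healthy entry shows a small value such as 03, so any
    value above 0xFF (for example ffffffff) is treated as corrupt.
    """
    bad = []
    current_host = None
    for line in proc_scsi_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Host:"):
            current_host = stripped
        match = REVISION_RE.search(stripped)
        if match and current_host is not None:
            if int(match.group(1), 16) > 0xFF:
                bad.append(current_host)
    return bad


def aborted_journal_devices(log_text):
    """Return device names from 'Aborting journal on device X' lines."""
    seen = []
    for match in ABORT_RE.finditer(log_text):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Sample input copied from the corrupted output shown in this report.
SAMPLE_PROC = """\
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: W   W    Model:                  Rev:
  Type:   Direct-Access                    ANSI SCSI revision: ffffffff
"""
SAMPLE_LOG = "Jan 15 10:57:08 localhost kernel: Aborting journal on device sda1."

print(corrupt_scsi_hosts(SAMPLE_PROC))
print(aborted_journal_devices(SAMPLE_LOG))
```

On a live guest the same functions could be fed the contents of /proc/scsi/scsi and /var/log/messages instead of the sample strings.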

I tested several different memory configurations.  In about 30 boots with 3930 MB or more, problems were seen every time, while in about 10 or 15 boots with less than 3930 MB I never saw any problems.  It's difficult to pinpoint the exact amount of memory since xen appears to round the allocated memory / max. allocation numbers after starting the guest.  

These problems occur only on fullvirt, not paravirt, and occur with both image-backed and physical devices.  They do not occur on IDE devices or when the paravirt drivers are in use with tap:aio.  It appears only 4 IDE devices can be allocated to a guest, so the customer is using the paravirt drivers as a workaround since some of their guests have multiple devices.  

I've also tested this on 5.3rc2 and the problems persisted. I have a test system set up in the lab (dell-r900-1.gsslab.rdu.redhat.com) on 5.3 with a guest (jrummy-fv5u2) set up to reproduce this if you need it

ACTION REQUESTED OF SEG:  Provide analysis and fix for issue

DEFECT SUSPECTED: Yes but could not find any related cases or BZs 

CUSTOMER IMPACT: Potential for data loss or corruption on guests using SCSI devices.  

SUPPORTING INFO
#######################
ACTIONS TAKEN: Reproduced on 5.2 and 5.3

Attaching sosreport from dom0 and guest as well as example guest config

STEPS TO REPRODUCE:

1) Add a SCSI device to a fullvirt guest using a disk line like one of the following

   disk = [ "file:/vm/rhtest.img,hda,w", ",hdc:cdrom,r", "file:/vm/rhtest/rhtest_disk1.img,sda,w" ]
   disk = [ "file:/vm/rhtest.img,hda,w", ",hdc:cdrom,r", "phy:/dev/vg1/lv1,sda,w" ]
 
2) Set the guest's memory to 3930 MB or higher

   maxmem = 4096
   memory = 4096

3) Boot guest

4) Ways to observe problems

   # cat /proc/scsi/scsi
   # ls /dev/sda*        <- May not show /dev/sda1 even though there is a partition
   # e2fsck /dev/sda1    <- (if /dev/sda1 exists, or after partprobing).  Run it several times and you may see problems
   -Mount the device and write data to it.  If the journal doesn't abort, try unmounting and running e2fsck again.  Usually this turns up errors
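
Putting steps 1 and 2 together, a guest config along the following lines should be enough to reproduce the problem (xm config files use Python syntax). This is only a sketch: the disk and memory lines come from the steps above, while the guest name, vcpus value, and the hvmloader and qemu-dm paths are assumptions based on a typical RHEL 5 x86_64 HVM guest and may differ on your system.

```python
# Hypothetical guest config, e.g. /etc/xen/rhtest (name and path assumed).
name = "rhtest"                                # assumed guest name
builder = "hvm"                                # fully virtualized guest
kernel = "/usr/lib/xen/boot/hvmloader"         # assumed default HVM loader path
device_model = "/usr/lib64/xen/bin/qemu-dm"    # assumed x86_64 qemu-dm path
vcpus = 1                                      # assumed; not stated in the report

# From step 2: 3930 MB or more triggers the problem.
maxmem = 4096
memory = 4096

# From step 1: IDE boot disk, CD-ROM, and the emulated SCSI disk (sda).
disk = [
    "file:/vm/rhtest.img,hda,w",
    ",hdc:cdrom,r",
    "file:/vm/rhtest/rhtest_disk1.img,sda,w",
]
```

Lowering memory and maxmem below 3930 in the same config gives the known-good comparison case.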
   
ACTUAL RESULTS: See problem description above

EXPECTED RESULTS: device functions as expected.  No journal aborts, device nodes created at boot time, /proc/scsi/scsi shows correct info, e2fsck works



This event sent from IssueTracker by mmilgram  [Support Engineering Group]
 issue 256914

Comment 2 Issue Tracker 2009-01-20 20:43:47 UTC
There is a git patch which is not in our tree, but appears to fix this
issue:

http://git.kernel.org/?p=linux/kernel/git/avi/kvm-userspace.git;a=commit;h=6ff744c816c9a9452b38eeb559fe47ac5732f79b

 Add 40-bit DMA support to LSI scsi emulation (Ryan Harper)

 This patch fixes Linux machines configured with > 4G of ram and using a
SCSI device.
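
To illustrate why missing 40-bit DMA support would matter once guest RAM extends past 4 GB, here is a simplified model of address truncation. It is not the patch itself and does not reflect how the LSI emulation is actually structured; the function and constant names are made up for the example, and it assumes the unpatched emulation effectively handled only the low 32 bits of a DMA address.

```python
MASK_32 = (1 << 32) - 1   # address bits kept by a 32-bit-only DMA path (assumed)
MASK_40 = (1 << 40) - 1   # address bits kept once 40-bit DMA is supported


def dma_target(guest_phys_addr, mask):
    """Return the guest-physical address the emulated controller would touch."""
    return guest_phys_addr & mask


# A guest buffer located just above the 4 GiB boundary.
buffer_addr = (4 << 30) + 0x1000

# With only 32 bits the upper address bits are lost and the transfer lands in
# unrelated low memory; with 40 bits the address is preserved.
print(hex(dma_target(buffer_addr, MASK_32)))   # 0x1000
print(hex(dma_target(buffer_addr, MASK_40)))   # 0x100001000
```

A transfer aimed at the wrong address in this way would be consistent with the garbage seen in /proc/scsi/scsi and the intermittent filesystem errors, and with the problem appearing only when the guest has memory above the 4 GB boundary.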




Comment 5 Rik van Riel 2009-01-28 04:26:26 UTC
Created attachment 330190 [details]
backport of upstream patch

I can confirm that this patch fixes the issue.

Comment 7 Jiri Denemark 2009-03-02 10:22:46 UTC
Fix built into xen-3.0.3-81.el5

Comment 11 Chris Ward 2009-07-03 18:21:31 UTC
~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value.

Questions can be posted to this bug or your customer or partner representative.

Comment 12 Yufang Zhang 2009-07-24 06:17:29 UTC
Sorry, I can't reproduce this bug in xen-3.0.3-80.el5:
I started an HVM guest (i386) with 4.5G memory on an x86_64 host and added a SCSI disk to the guest.
In the guest, all information (from /proc/scsi/scsi) seems quite OK. And when I run:
#e2fsck /dev/sda1
There is no error output.
I also mounted the SCSI device and copied some files to it. There was no error there either. After unmounting the device and running e2fsck again, there is still no error.

Has the bug already been fixed in xen-3.0.3-80.el5, or did I make a mistake?
Thanks.

Comment 13 Yufang Zhang 2009-07-24 07:00:50 UTC
(In reply to comment #12)
> Sorry, I can't reproduce this bug in xen-3.0.3-80.el5:
> I started an HVM guest (i386) with 4.5G memory on an x86_64 host and added a
> SCSI disk to the guest.
> In the guest, all information (from /proc/scsi/scsi) seems quite OK. And when
> I run:
> #e2fsck /dev/sda1
> There is no error output.
> I also mounted the SCSI device and copied some files to it. There was no error
> there either. After unmounting the device and running e2fsck again, there is
> still no error.
> 
> Has the bug already been fixed in xen-3.0.3-80.el5, or did I make a mistake?
> Thanks.  
In the 64-on-64 case, I can't reproduce this bug either.

Comment 14 Yufang Zhang 2009-07-28 08:33:45 UTC
verified in xen-3.0.3-91.el5

Comment 16 errata-xmlrpc 2009-09-02 10:05:28 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1328.html