Bug 181747 - sbp2 oopses on kernel-xen-hypervisor
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: kernel-xen
Version: rawhide
Hardware: x86_64 Linux
Priority: low
Severity: medium
Assigned To: Juan Quintela
QA Contact: Virtualization Bugs
Depends On:
Blocks: 179269
Reported: 2006-02-16 01:22 EST by Alexandre Oliva
Modified: 2009-12-14 15:42 EST

Doc Type: Bug Fix
Last Closed: 2008-02-26 17:58:08 EST


Attachments
Picture of the oops (54.55 KB, image/jpeg)
2006-02-16 01:22 EST, Alexandre Oliva
Description Alexandre Oliva 2006-02-16 01:22:52 EST
Created attachment 124741
Picture of the oops
Comment 1 Alexandre Oliva 2006-02-16 01:22:52 EST
Description of problem:
I mirror (RAID 1) my notebook's internal HD to an external Firewire/USB
enclosure.  If I boot with Xen, it won't bring up the RAID members from the sbp2
disk early enough.  Shortly after I add them back to the RAID set, I get
errors like these in /var/log/messages:

ieee1394: sbp2: aborting sbp2 command
sd 0:0:1:0:
        command: Write (10): 2a 00 05 1f eb a5 00 00 80 00
ieee1394: sbp2: aborting sbp2 command
sd 0:0:1:0:
        command: Test Unit Ready: 00 00 00 00 00 00
ieee1394: sbp2: reset requested
ieee1394: sbp2: Generating sbp2 fetch agent reset
ieee1394: sbp2: aborting sbp2 command
sd 0:0:1:0:
        command: Test Unit Ready: 00 00 00 00 00 00
sd 0:0:1:0: scsi: Device offlined - not ready after error recovery
sd 0:0:1:0: SCSI error: return code = 0x50000
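
For readers unfamiliar with these log lines, here is a minimal sketch (not part of the original report) that decodes the Write (10) command bytes and the return code shown above. It assumes the standard 10-byte SCSI CDB layout and the Linux convention of packing driver, host, message and status bytes into the 32-bit result. The function names and the partial host-byte table are illustrative, not taken from the kernel sources.

```python
import struct

# Partial table of Linux SCSI host-byte codes (assumed from the kernel's
# scsi.h; only a few entries are listed for illustration).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x03: "DID_TIME_OUT",
    0x05: "DID_ABORT",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_rw10(cdb_hex):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB, got %d bytes" % len(cdb))
    # Big-endian fields: opcode, flags, 32-bit LBA, group, 16-bit length, control
    opcode, flags, lba, group, length, control = struct.unpack(">BBIBHB", cdb)
    return {"opcode": opcode, "lba": lba, "blocks": length}


def split_scsi_result(result):
    """Split a Linux SCSI result word into its four component bytes."""
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


if __name__ == "__main__":
    cmd = decode_rw10("2a 00 05 1f eb a5 00 00 80 00")
    print("opcode=0x%02x lba=0x%08x blocks=%d"
          % (cmd["opcode"], cmd["lba"], cmd["blocks"]))
    res = split_scsi_result(0x50000)
    print("host byte:", HOST_BYTE_NAMES.get(res["host"], hex(res["host"])))
```

Under these assumptions, the failed command is a 128-block write, and the host byte 0x05 of the return code indicates that the command was aborted, which matches the "aborting sbp2 command" messages.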

Needless to say, the RAID resyncing didn't go very far.  In another session,
*right* after I re-added the external-disk partition to the RAID set, I got an
oops, as in the attached picture.

Version-Release number of selected component (if applicable):
kernel-xen-hypervisor-2.6.15-1.1948_FC5
Comment 2 Stephen Tweedie 2006-02-17 13:01:01 EST
This oops doesn't look Xen-specific, and I've seen sbp2 errors like this (though
not with an oops) on older non-xen kernels.  Are you sure it's only happening
with Xen, or does the non-xen 1948 kernel show the same problem?

Also, why does the RAID set not get constructed early enough?  What errors do
you get when it attempts to do so?
Comment 3 Alexandre Oliva 2006-02-17 13:39:47 EST
The non-xen kernels work perfectly fine.  No such errors are produced (although
I have seen `command time out' errors with earlier kernels, especially when
doing RAID over two Firewire disks, which is no longer the case).

I don't know why the sbp2 raid members were not started when booting with Xen;
the sbp2 module was loaded, and so was the usb-storage module, which introduces
an 8-second delay in the boot, enough for USB and Firewire devices to be
recognized (even when I don't have any USB disks plugged in :-).  That works with
non-Xen kernels.  I didn't see any errors fly by during Xen boot up, but I
didn't notice whether it recognized the device early enough.  I was actually
very surprised it did boot into the Xen kernel by default.  Maybe it didn't, and
that would explain why the raid members in it didn't come up.  I'll try the Xen
kernel again momentarily.
Comment 4 Alexandre Oliva 2006-02-17 14:19:52 EST
I have a pretty strong theory on why raid didn't come up the first time: my
kickstart file would rebuild initrd.img with sbp2 for `uname -r`, not for the
Xen hypervisor kernel.  mkinitrd should still have set it up by default, but I
don't think I've ever tested its recent magic within the installer.
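
As an illustration of the theory above (not the reporter's actual kickstart script), here is a minimal sketch that builds one mkinitrd command line per installed kernel instead of only for `uname -r`. It assumes the usual Fedora layout of /lib/modules/<version> and /boot/initrd-<version>.img, and that mkinitrd accepts `-f` and `--with=<module>`. The function name is hypothetical, and the script only prints the commands rather than running them.

```python
import os


def mkinitrd_commands(kernel_versions, module="sbp2", boot_dir="/boot"):
    """Build one mkinitrd argument list per kernel version.

    Assumes the conventional image name initrd-<version>.img under boot_dir.
    """
    commands = []
    for version in kernel_versions:
        image = "%s/initrd-%s.img" % (boot_dir, version)
        # -f overwrites an existing image; --with forces the module in
        commands.append(["mkinitrd", "-f", "--with=%s" % module, image, version])
    return commands


if __name__ == "__main__":
    # Every directory under /lib/modules is treated as an installed kernel,
    # so a Xen kernel is covered even while a non-Xen kernel is running.
    modules_dir = "/lib/modules"
    versions = sorted(os.listdir(modules_dir)) if os.path.isdir(modules_dir) else []
    for argv in mkinitrd_commands(versions):
        print(" ".join(argv))
```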

Anyhow, I tried with 1.1955_FC5hypervisor and a `du -ks /' was enough to trigger
the sbp2 errors after a minute or so.  With the non-Xen kernel, it works just fine.
Comment 5 Stephen Tweedie 2006-02-17 15:35:01 EST
Do you get the same errors with the normal SMP kernel?  We've seen sbp2 errors
like this before on SMP (upstream SMP seems to have problems with sbp2), and
dom0 is implicitly an SMP kernel.
Comment 6 Alexandre Oliva 2006-02-17 18:55:11 EST
No, I tried that earlier today, and sbp2 appears to be rock solid on my other box,
with an Athlon64X2 processor booted with both CPUs enabled (I've used maxcpus=1
to work around other random problems, but sbp2 does not appear to suffer from
the same sort of problems that usb-storage, for example, does).

Not sure whether you meant SMP as in kernel built for SMP or as in running on an
actual SMP box.  On x86_64, the default kernel is SMP, so that's what I've been
claiming to work fine on the notebook on which I tried the xen kernel all along.
Comment 8 Red Hat Bugzilla 2007-07-24 21:30:43 EDT
change QA contact
Comment 9 Chris Lalancette 2008-02-26 17:58:08 EST
This report targets FC5, which is now end-of-life.

Please re-test against Fedora 7 or later, and if the issue persists, open a new bug.

Thanks
