Red Hat Bugzilla – Bug 200777
Mylex extremeRaid 3000 PCI not working
Last modified: 2008-05-06 12:10:51 EDT
Description of problem:
The Mylex extremeRaid 3000 PCI based card does not work with FC5. I don't think
it is working with any 2.6 based kernel.
Version-Release number of selected component (if applicable):
Any 2.6 based kernel
Steps to Reproduce:
1. Install FC5
2. Try to access a created system drive (ex fdisk /dev/rd/c0d0)
3. Boom. It gives an illegal seek.
Actual results:
Illegal Seek on device
Expected results:
fdisk should open up the device for partitioning.
This same card worked under 2.4 based Red Hat OS's. In fact, this exact box and
card were running on RH 7.3. I upgraded the box, and it no longer works. Other
symptoms include a line speed of 125MB/s when checking
/proc/rd/c0/current_status instead of the familiar 1000MB/s. I know that this
card works because I installed FreeBSD on this exact box, and I am able to
access the Mylex drives with no problems. I have tried this exact combination
of cards with several different boxes (Dell 1650, Dell 2650, white box with dual
P4 Xeons and SuperMicro motherboard, Sun Opteron Workstation), and with both FC5
and RHEL 4.3, and have had no luck. I know that other people are using Mylex
controllers, probably SCSI version, and they are working, but since this is the
last Fibre RAID controller available, it would be nice to have it working like
it used to.
I have also tried generic (vanilla) 2.6 kernels from kernel.org, same results.
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.
Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.
This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.
Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.
In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed. See bug 207474 for further details.
If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.
If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.
Created attachment 141078
lspci -vv and cat of /proc/rd/c0/current_status for a RH7.3 box for comparison
This is a log file from a working RH 7.3 box, kernel 2.4.21 that has its Mylex
3000 PCI Raid card functioning properly.
I have sent an lspci -vv and cat /proc/rd/c0/current_status of a working Mylex
ExtremeRaid 3000 PCI card from a RH 7.3 based box. Here are some things that I
noticed are different between the 2.4 kernel and the 2.6 kernels.
NOTE: The Mylex card I am currently using has 64 MB of cache, where the one in
the attachment (id=141078) only has 32 MB cache.
1st) When the box boots up and goes through POST, this Mylex card says that it
is trying to attach itself to IRQ 11 (Bus=2 DEV/Slot=8 IRQ=11).
2nd) When a 2.6 based kernel boots up, the Mylex card is found, and all of the
drives are scanned. The enclosure is enumerated, and even the temperature and
fans of the enclosure are read and checked. It even sees the current logical
drive (/dev/rd/c0d0) and says it is online.
Here is where I start to see differences:
lspci of 2.4 kernel shows I/O port of bc80 [size=128]. Granted, this is for the
32 MB card.
lspci of 2.6 kernel (64MB card) shows I/O port of dc80 [size=128] on both the
2.6.15 and the 2.6.18 kernels that I have tested. Is this significant?
Also, the PCI address mapping space has changed.
In 2.4 kernel, the driver reports the following:
PCI Address: 0xF8000000 mapped at 0xF880D000, IRQ Channel: 20
2.6.15 FC4smp kernel reports the following:
PCI Address: 0xF8000000 mapped at 0xF881A000, IRQ Channel: 11
2.6.18 FC6 install kernel reports the following:
PCI Address: 0xF8000000 mapped at 0xF8826000, IRQ Channel: 201
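For anyone comparing these banner lines side by side, here is a minimal Python sketch that parses the three lines quoted above and lists which fields differ. The function names (parse_banner, differing_fields) are made up for illustration and are not part of the DAC960 driver. Keep in mind that the "mapped at" value is a kernel virtual address, which normally differs between kernels and boots, and that the IRQ number can change with the interrupt routing mode, so neither difference is necessarily a fault by itself.

```python
import re

# Matches the driver banner line as quoted in this report.
BANNER_RE = re.compile(
    r"PCI Address:\s*(0x[0-9A-Fa-f]+)\s+mapped at\s+(0x[0-9A-Fa-f]+),"
    r"\s*IRQ Channel:\s*(\d+)"
)

# The three lines exactly as reported above, keyed by kernel.
REPORTS = {
    "2.4": "PCI Address: 0xF8000000 mapped at 0xF880D000, IRQ Channel: 20",
    "2.6.15 FC4smp": "PCI Address: 0xF8000000 mapped at 0xF881A000, IRQ Channel: 11",
    "2.6.18 FC6": "PCI Address: 0xF8000000 mapped at 0xF8826000, IRQ Channel: 201",
}


def parse_banner(line):
    """Return the PCI address, mapped address and IRQ from one banner line."""
    match = BANNER_RE.search(line)
    if match is None:
        return None
    return {
        "pci": int(match.group(1), 16),
        "mapped": int(match.group(2), 16),
        "irq": int(match.group(3)),
    }


def differing_fields(reports):
    """List the fields whose values are not identical across all reports."""
    parsed = [parse_banner(text) for text in reports.values()]
    if any(p is None for p in parsed):
        raise ValueError("unparseable banner line")
    return [
        field
        for field in ("pci", "mapped", "irq")
        if len({p[field] for p in parsed}) > 1
    ]


if __name__ == "__main__":
    print("fields that differ:", differing_fields(REPORTS))
```

The physical PCI address is the same in all three cases; only the virtual mapping and the IRQ number change.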
When you try and fdisk /dev/rd/c0d0, you get an illegal seek.
Also, I double checked. This card was working before the box was moved to FC4.
When it didn't work in FC4 in this particular box (A Dell 1650), I placed this
card in a Sun Opteron Workstation (Opteron) running RHEL 4.2 (64 bit), and it
still failed the same way. I then put the card back into the Dell 1650,
installed NetBSD on it, and the card worked (slowly, but it worked). I then
tried to install FC6 just last week, thinking that the new kernel may have the
fix, but alas, it still is broken in Linux.
This could be related to:
It looks like a change was made in this area recently. I'm not sure exactly
which kernel version.
Is there any kernel error reported when you try fdisk and get an illegal seek?
No, I don't remember any particular errors coming from the kernel... It is
weird. The card is recognized, but it is not like it is 100% there. And, it
does not matter what system (node) that I move this card into and test with. If
I have it in a Sun 64 bit Opteron workstation using a 2.6 kernel (RHEL 4), no
go. If I have it in a white box system using an Asus motherboard and dual Xeons
running FC2, 3, or 5, no go. I put it into a Dell 1650 or 2650 using FC4 , 5,
or 6.... No go.
The differences that I see are the I/O range using lspci, and the IRQ reported
once the card and node have booted, plus the fact that the drives now say they
are 125 MB/s instead of the 1000 MB/s bus that they should be on. This is
getting real frustrating...
After reading the link provided in comment 4, I could see that this could be a
problem if it was misidentifying the card. I am not above getting my hands
dirty by looking into the code, but I am an "extreme" neophyte when it comes to
this level of coding... Where should I start? Any suggestions would be a great
help (I have a system completely dedicated for this work at this time, but I
don't know how much longer they are going to let me have it).
Thanks again for your help.
It does seem as though interrupts are being received. Otherwise you would not
see the storage devices being configured.
1. /var/log/messages showing the boot messages, and
from a working and a non-working kernel, on the same hardware if possible.
It appears as though some people have this working with 2.6...
Created attachment 144414
Messages snippet from a working system (2.4.21 RH7.3)
Created attachment 144415
/proc/interrupts on a working system. (RH7.3 with custom 2.4.21)
There are no errors on a non-working kernel during an fdisk except that it says
that it cannot seek on the device. Further, all of the drives are showing up as
125 MB/s instead of 1000 MB/s in the /proc/rd/c0/current_status. I am trying to
get my system back up now (I just tried to reboot with a very old kernel, FC2
based, to see if it would at least recognize it, but it appears that it is too
old of a kernel).
Also, I tried FC6 with a custom 2.6.19 kernel... Still no joy. The devices are
showing up the same, as 125MB/s drives, and still a seek error when trying to
partition the devices.
I will try to get a /var/log/messages and /proc/interrupts from the non working
I have the non-working dmesg during boot and the /proc/interrupts. It is
interesting... The node thinks that it should be on interrupt 18, but during
boot up, the card, during POST, tells me that it is at interrupt 11. The
function and slot information are correct in both cases, but the interrupt has
Created attachment 144467
/proc/interrupts on a non working system (Dell 2650, FC6, 2.6.19 custom kernel)
This is a 2.6.19-series kernel with the cks2 patch set. It exhibits the same type
of errors as any "recent" 2.6 kernel.
Created attachment 144468
dmesg bootup for a non working system (Dell 2650, FC6, 2.6.19 custom kernel)
This is a custom 2.6.19-series kernel with the cks2 patch set. It exhibits the same
problems as all "recent" 2.6 kernels, i.e., the Mylex card is not functioning.
Created attachment 144469
Try using parted to partition the base mylex disk
This is the output from an attempt to run parted on the base Mylex system disk.
It shows the "Invalid argument during seek" that I get during install or any
other time I try to run a command on the Mylex disk.
I patched together a DAC960 driver from the 2.6.10 base kernel into the
2.6.19-series kernel with the cks2 patch set. It still has the same issues,
which really was no surprise since neither FC2 nor FC3 worked with the Mylex
extremeRaid 3000
PLUS card. I am now trying to go clear back to the 2.6.0 drivers and see if I
can somehow squeeze them into the current kernel source and try to get that
driver to work...
I tried the 2.6.0 kernel level driver. It hung up during boot up (after the
DAC960 driver banner, right either before or after the line containing the IRQ).
I realize that the driver in the 2.6.0 kernel is 2.5.47, and the driver version
in the later kernels is 2.5.48. I was able to get the driver compiled, but
there was a warning, and it was about the irq (I remember having chased that one
down quite a ways in the 2.6.10 kernel version of the driver, which runs as
"well" as the 2.6.19 version, ie, the driver sees the array, but the system
drive is not right).
I am about at the level of what I can do here.
I also changed the geometry setting on the card itself, from 2GB to 8GB disk
geometry, no help (although, now the disk geometry shows up as 255/63 instead of
180/32). I tried passing various combinations of pci and acpi command lines,
trying to see if that was it, still no joy.
Has this bug gone anywhere? It is still not working as of my 2.6.19-series kernel (I haven't
tried the 2.6.20 vanilla kernel).
(In reply to comment #16)
> Has this bug gone anywhere?
I have looked at the logs, but I don't see anything that points to the problem.
It seems odd to have no kernel I/O error messages when you try to do I/O, yet
the I/Os appear to be failing.
If you are still willing and able to try something, let's see if a simple
command like badblocks fails. If it does, get an strace and post it.
Start with a really simple read test:
badblocks -v -b512 /dev/rd/c0d0 1
increase number of blocks, and remove -b, until you get a failure. If none, add
a write to the test:
badblocks -vw -b512 /dev/rd/c0d0 1
When you get a failure, then remove -v and get an strace:
strace badblocks -b512 /dev/rd/c0d0 1
Hopefully this will indicate where the problem is.
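If badblocks itself is not handy, the "increase the number of blocks until you get a failure" idea can be roughly approximated in Python. This is only a sketch under my own assumptions: the helper names (read_blocks, escalate) are invented here, it only reads, and it is not a replacement for badblocks. It reads 512-byte blocks from the start of the device, doubles the block count on each pass, and reports the first pass that hits an error or a short read.

```python
import errno
import os


def read_blocks(path, count, block_size=512):
    """Read `count` blocks from the start of `path`.

    Returns (blocks_read, error), where error is None on success, an errno
    name such as "EIO" on failure, or "SHORT_READ" if fewer bytes came back.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        for index in range(count):
            try:
                data = os.pread(fd, block_size, index * block_size)
            except OSError as exc:
                return index, errno.errorcode.get(exc.errno, str(exc.errno))
            if len(data) < block_size:
                return index, "SHORT_READ"
        return count, None
    finally:
        os.close(fd)


def escalate(path, start=1, limit=1024, block_size=512):
    """Double the block count until a pass fails or `limit` is exceeded.

    Returns (count_attempted, blocks_read, error) for the first failing pass,
    or None if every pass up to `limit` succeeded.
    """
    count = start
    while count <= limit:
        done, err = read_blocks(path, count, block_size)
        if err is not None:
            return count, done, err
        count *= 2
    return None

# Example call on the device from this report (needs root):
#   escalate("/dev/rd/c0d0")
```

A result of None means every read up to the limit succeeded, in which case the write test and the strace are the next things to try.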
Created attachment 148633
strace of a failed fdisk /dev/rd/c0d0 for Mylex ExtremeRAID 3000 PCI
Here is an strace of the fdisk /dev/rd/c0d0. Notice the EINVAL during the
_llseek. The output I get from doing the fdisk is one of "Unable to seek".
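To narrow down the EINVAL outside of fdisk, a small Python sketch like the one below can repeat the seek directly and also ask what size the kernel reports for the device. The helper names (probe_seek, reported_size) are mine, not from any tool. My understanding, which I have not verified against this kernel, is that a seek on a Linux block device is rejected with EINVAL when the target offset falls outside the size the kernel has recorded for it, so a reported size of zero or something far too small would be a useful clue.

```python
import errno
import os


def probe_seek(path, offset):
    """Try an absolute seek; return None on success or the errno name."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)


def reported_size(path):
    """Return the size in bytes found by seeking to the end of `path`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)

# Example calls on the device from this report (needs root):
#   reported_size("/dev/rd/c0d0")
#   probe_seek("/dev/rd/c0d0", 512)
```

If reported_size comes back much smaller than the logical drive should be, comparing it against the offset fdisk tried in the strace would show whether the seek target is simply beyond what the kernel thinks the device holds.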
For comment #18, uname -a is:
Linux hoepld25 2.6.18-1.2869.fc6 #1 SMP Wed Dec 20 14:51:19 EST 2006 i686 i686
And I get the same error during fdisk on any recent kernels (2.6.19 custom,
2.6.19 FC6 kernel).
Created attachment 148634
Here is a /proc/rd/c0/current_status from the "broken" box
Please compare this /proc/rd/c0/current_status to attachment 141078.
This is a broken current_status. Notice how in this current_status the drives
are saying that they are 125 MB/s, and on the RH73 boxes they are saying that
they are 1000 MB/s drives. The RH73 boxes are the ones that work, while the FC
(any kernel 2.6 based) builds do not work.
Removing NeedsRetesting from whiteboard so we can repurpose it.
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.
If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
Thanks for your help, and we apologize again that we haven't handled
these issues to this point.
The process we are following is outlined here:
We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.
And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers
This bug is open for a Fedora version that is no longer maintained and
will not be fixed by Fedora. Therefore we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.