Bug 116200 - (SCSI MEGARAID) Large disk arrays don't work on 4Gb RAM
Summary: (SCSI MEGARAID) Large disk arrays don't work on 4Gb RAM
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 2
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dave Jones
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2004-02-18 23:49 UTC by scott campbell
Modified: 2015-01-04 22:04 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2005-04-16 05:50:40 UTC
Type: ---
Embargoed:



Description scott campbell 2004-02-18 23:49:10 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040115

Description of problem:
When I try to boot Fedora Core 2 test release 1 with the SMP kernel, I
get the following:
INIT: cannot execute "/bin/sh"
The message repeats several times before init says it will hibernate for
5 minutes, and the system never gets past this point during bootup. If I
boot a non-SMP kernel, it boots without issue. I believe the SMP kernel is
perhaps not properly loading the driver for the LSI MegaRAID 150-6
controller, so the partitions are not getting mounted properly. I could be
way off, but it does work fine on a non-SMP kernel. It does this with
the most recent kernel and with the original bundled kernel as well.

This is on an Intel 7501-based SMP board from Supermicro running dual
Xeon 2.8 GHz processors with 1 MB cache, an LSI MegaRAID 150-6 controller,
4 GB of RAM, and five 250 GB hard drives in a RAID 5 stripe of 1 terabyte.
The boot partition is sized at about 700 GB.

Version-Release number of selected component (if applicable):
kernel-2.6.2-1.87 smp

How reproducible:
Always

Steps to Reproduce:
1. Boot the machine and select an SMP kernel.
2. The machine then boots to the error mentioned above.
3. If a non-SMP kernel is selected at bootup, it works fine.
    

Additional info:

Comment 1 Alan Cox 2004-05-03 19:11:13 UTC
Does this still occur in test3?

Comment 2 Ronald Hello 2004-12-03 13:28:04 UTC
L.S.

I see that this bug is pending ... too bad.
We have the same set-up (Supermicro, 7501 chipset, LSI 150-6
card, 5*250-R5, 1-hotspare) and I see the same problems.
Under kernel-2.6.5-1.358 and kernel-2.6.9-1.6_FC2 the machine
works just fine, but under kernel-smp-2.6.5-1.358 and
kernel-smp-2.6.9-1.6_FC2 it gives `cannot execute binary file'
on almost every program you try to start. I did not check whether
it is just the start-up scripts or the programs themselves.

I can perform specific tests ...

Best regards,

Ronald Hello
rhello

Comment 3 scott campbell 2004-12-03 17:08:21 UTC
I found that the problem appears to be related to memory somehow. I
decreased our RAM from 4 GB to 2 GB, and the machine now boots the SMP
kernel fine. I have been running that way for about a month and
have had no issues. I did not actually need 4 GB of RAM for the
server, as it is just running Samba services for a file server and mail
filtering. Until I got the idea to try less RAM, I ran the single-processor
kernel for about 6 months. I did not have the luxury
of time when installing the server originally and was able to get by
with a single processor. I could run tests if needed to see whether this
problem is still around. It actually had the same problem on Core 1
and Core 2. I have not had time to try Core 3 on the system, but the
megaraid driver is causing problems during its install, so I am not
keen on trying it. The new megaraid drivers in 2.6.9 do seem to
handle a high load much better than before.

Comment 4 Ronald Hello 2004-12-06 12:21:47 UTC
Hi,

Well ... I need 4 GB ;-)
I just tested Fedora Core 3 (fresh install) and it shows
the same problem with the SMP kernel. I just updated it,
kernel-2.6.9-1.681_FC3 works fine and kernel-smp-2.6.9-1.681_FC3
shows the same problem.


Comment 5 Ronald Hello 2004-12-13 11:30:12 UTC
I can confirm Scott's assessment: we took 2GB out and now
it's running just fine with the smp kernel. But this is
for us not a solution, at best it's a short-term workaround
so that we can continue with our work. At some point in time
I'll have to upgrade a rather large set of servers which are
all on 4 GB (and need it).

Ronald.

Comment 6 Ronald Hello 2004-12-13 14:11:04 UTC
Hi,

Hmm. New info.
I just installed an identical SuperMicro with 4 GB and
another LSI card, an experimental one (not a 150-6). It
also has a tape drive on an Adaptec controller, but I
think that one doesn't matter.
Anyway, this machine is running the smp-kernels just
fine with 4 GB memory and 2 GB swap.
So, even though it seems memory related, it looks like
it's only occurring with the LSI 150-6D RAID card.

Hope this helps ...

Comment 7 Ronald Hello 2004-12-20 11:05:13 UTC
Hi,

We played around a little, swapped a lot of stuff etc.
Two identical configurations:
SuperMicro X5DPIG2 boards, dual Xeon, 4GB, LSI 150-6 card
One with 6 250 GB WD disks, 5 in RAID5 one on Hotspare
One with 6 120 GB Maxtor disks, 4 in RAID 0/1, 2 on Hotspare

The first one experiences the problem with the SMP kernels,
the second one not. So it looks like it has to do with the size
of the `virtual disk'. I'm going to play around with setting up
the 6*120GB system in the same RAID5/hotspare config. I expect
no problems there, I'll report here.
(my experimental system with the experimental card had 7*120GB
in RAID5)

Comment 8 Ronald Hello 2004-12-27 12:25:52 UTC
Final testing,

Two identical systems, SuperMicro, 4 GB memory, LSI 150-6, all
disk configurations: 5 in RAID5, 1 on hot-spare.
System one:
250 GB WD disks (2500JD), problem with SMP kernels
System two:
300 GB Maxtor disks (6B300SO), problem with SMP kernels
120 GB Maxtor disks (6Y120MO), NO problem with SMP kernels

So I definitely think the SMP kernels have a problem with
large `disks' (1 TB?) and 4 GB memory.
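
For reference, a rough sketch of the nominal logical-drive size in each
of the configurations above. It assumes the usual RAID 5 rule that one
disk's worth of capacity goes to parity, and that the hot spare is not
counted. The function name is made up for illustration, and the sizes are
the nominal figures from the disk labels, not what the controller reports.

```python
# Nominal usable capacity of a RAID 5 array: one disk's worth of space
# is consumed by parity. Hot spares are outside the array and not counted.

def raid5_usable_gb(disks_in_array: int, disk_size_gb: int) -> int:
    """Return the nominal usable size in GB (hypothetical helper)."""
    if disks_in_array < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disks_in_array - 1) * disk_size_gb

# (disk size in GB, problem seen with SMP kernels?) for the tests above
configs = [(250, True), (300, True), (120, False)]

for size_gb, problem in configs:
    usable = raid5_usable_gb(5, size_gb)
    print(f"5 x {size_gb} GB in RAID5: {usable} GB usable, problem: {problem}")
```

Under those assumptions the two failing arrays come out at about 1000 GB
and 1200 GB and the working one at 480 GB, which fits the guess that the
trouble starts somewhere around 1 TB.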


Comment 9 Alan Cox 2004-12-27 14:14:12 UTC
And LSI megaraid as well it appears. At least all the reports seem to
be this controller.


Comment 10 Ronald Hello 2005-02-25 12:37:02 UTC
Hi,

We just installed and tested Suse 9.1 on this config, and it does not
have the problems we experience with Fedora Core 3. Since Suse uses
the same megaraid driver in its 2.6 kernels (as far as I can see),
I think the problem really is in the Fedora kernel.

Ronald.

Comment 11 Ronald Hello 2005-03-04 11:36:40 UTC
Hi,

Question: is anyone able to test RHEL4 on this config? Or can
anyone get me access to the ISOs so that I can test RHEL4 on
this config?

Comment 12 Dave Jones 2005-04-16 05:50:40 UTC
Fedora Core 2 has now reached end of life, and no further updates will be
provided by Red Hat.  The Fedora legacy project will be producing further kernel
updates for security problems only.

If this bug has not been fixed in the latest Fedora Core 2 update kernel, please
try to reproduce it under Fedora Core 3, and reopen if necessary, changing the
product version accordingly.

Thank you.


