Bug 206188 - [rhel4 u6] reboot fails on IBM System x336
Summary: [rhel4 u6] reboot fails on IBM System x336
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Konrad Rzeszutek
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-09-12 19:40 UTC by James Lamb
Modified: 2007-11-17 01:14 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-04-23 17:50:57 UTC
Target Upstream Version:
Embargoed:



Description James Lamb 2006-09-12 19:40:23 UTC
Description of problem:

We have recently started to use Red Hat Enterprise Linux 4 Update 4 on i386 and
x86-64 systems. Since the change from EL3 we have noticed that most of the
systems fail to reboot with a simple 'reboot' command. In each instance we get
the following output:

[root@xxxxxxxxxx ~]# init 6
INIT: Switching to runlevel: 6
INIT: Sending processes the TERM signal
Stopping HAL daemon: [  OK  ]
Stopping system message bus: [  OK  ]
Stopping Red Hat Network Daemon: [  OK  ]
Stopping atd: [  OK  ]
Stopping httpd: [  OK  ]
Stopping sshd:[  OK  ]
Shutting down sendmail: [  OK  ]
Shutting down sm-client: [  OK  ]
Shutting down smartd: [FAILED]
Stopping snmpd: [  OK  ]
Stopping xinetd: [  OK  ]
Stopping acpi daemon: [  OK  ]
Stopping crond: [  OK  ]
Shutting down ntpd: [  OK  ]
Shutting down kernel logger: [  OK  ]
Shutting down system logger: [  OK  ]
Shutting down interface eth0:  [  OK  ]
Shutting down interface eth1:  [  OK  ]
Shutting down interface eth2:  [  OK  ]
Shutting down loopback interface:  [  OK  ]
Starting killall:  [  OK  ]
Sending all processes the TERM signal...
Sending all processes the KILL signal...
Saving random seed:  
Syncing hardware clock to system time
Turning off swap:  
Turning off quotas:  
Unmounting pipe file systems:  
Unmounting file systems:  
Please stand by while rebooting the system...
md: stopping all md devices.
md: md0 switched to read-only mode. 


The system gets to that point and then fails to reboot, every time. I have also
attempted to install the operating system from CD (our default is to kickstart
the servers) and it fails in the same way. The way I have worked around the
issue is to append the following to the kernel boot parameters: acpi=noirq
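
For readers applying the same workaround, here is a minimal sketch. It assumes the GRUB legacy configuration at /boot/grub/grub.conf, which the reporter does not name, and the function name append_kernel_param is hypothetical. It prints a modified copy of the configuration to standard output rather than editing the file in place, so the result can be reviewed before it is installed.

```shell
#!/bin/sh
# Sketch only: append a boot parameter to every "kernel" line of a
# GRUB-legacy-style config file, skipping lines that already carry it.
# Arguments: $1 = config file (assumed /boot/grub/grub.conf on the
#            affected systems), $2 = parameter to add.
append_kernel_param() {
    awk -v p="$2" '
        $1 == "kernel" {
            found = 0
            # Look for an exact match among the existing arguments.
            for (i = 2; i <= NF; i++) if ($i == p) found = 1
            if (!found) $0 = $0 " " p
        }
        { print }
    ' "$1"
}

# Example (not executed here):
#   append_kernel_param /boot/grub/grub.conf acpi=noirq > /tmp/grub.conf.new
```

After rebooting with the reviewed configuration in place, `cat /proc/cmdline` shows whether the running kernel actually received the parameter.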

Once that has been added, the system restarts every time. The problem is very
similar to Red Hat Bugzilla #182961.

In this instance we are using IBM System x (xSeries) 336 servers with the
latest firmware, BMC, etc. from IBM.

Version-Release number of selected component (if applicable):

Red Hat Enterprise Linux ES release 4 (Nahant Update 4)
Kernel 2.6.9-42.0.2.ELsmp on an i686


How reproducible:

Install Red Hat Enterprise Linux 4 Update 4 on an IBM x336, then attempt to
restart the system using 'init 6' or 'reboot'. The system fails to reboot,
giving the above output.



Expected results:

System reboots.

Additional info:

Comment 1 Konrad Rzeszutek 2006-10-12 14:14:44 UTC
What is the model: 8837, 1879, or 7978?

Thank you.

Comment 2 James Lamb 2006-10-12 15:10:40 UTC
Hello,

The model that we are having problems with is the 8837. If need be, we could
probably put a demo system on a public IP for you to experiment with.

Regards,

James

Comment 3 Konrad Rzeszutek 2006-10-12 17:00:01 UTC
James,

My records indicate that the latest BIOS is v33A (APJT33A) and the latest BMC
is v24A (APBT24A). Can you verify that you have the same versions?

Comment 4 James Lamb 2006-10-16 11:59:08 UTC
Hello,

We have the following software revisions installed on the servers that have been
having issues.

BIOS : v1.13
BMC : v1.15

These are the revisions that IBM has been shipping servers with in recent
months. When I checked the software updates on the IBM.com web page, they were
reported as being current.

Regards,

James

Comment 5 Konrad Rzeszutek 2006-10-16 15:15:59 UTC
James,

Those are the latest and greatest numbers. I do not have access to this machine
locally but I have let an IBM person know about this and asked him to reproduce
your problem. 

Comment 6 Konrad Rzeszutek 2006-10-17 14:26:04 UTC
James,

The IBM person was able to successfully reboot the machine. I have asked him to
provide me with a 'sysreport' report so that I can compare his machine to yours.
Can you also run 'sysreport' and attach the tarball to this BZ? Thank you.

Comment 7 Konrad Rzeszutek 2006-10-17 18:07:32 UTC
James,

Can you check whether the 'aarich' module is loaded? If it is, can you rmmod it
and then try to reboot, please? Thanks.
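
The check requested above could be scripted roughly as follows. This is a sketch, not a command from the thread: the helper name module_loaded is hypothetical, and it assumes the usual /proc/modules layout in which the first field of each line is the module name.

```shell
#!/bin/sh
# Sketch only: succeed if the named kernel module appears in the
# modules list.
# Arguments: $1 = module name, $2 = list to read (defaults to
#            /proc/modules; the override exists so the helper can be
#            tried against a saved copy).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Example (run as root, not executed here):
#   if module_loaded aarich; then
#       rmmod aarich && reboot
#   else
#       echo "aarich is not loaded"
#   fi
```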

Comment 8 Konrad Rzeszutek 2006-10-27 17:09:50 UTC
James, any update?

Comment 9 James Lamb 2007-01-22 12:37:17 UTC
Hello,

There are no aarich modules loaded on this server. The RAID card in question
uses aacraid.

(Sorry about the long delay; I have been on leave.)

Regards,

James


Comment 10 Konrad Rzeszutek 2007-01-22 18:24:27 UTC
James,

Thanks for your response. Can you attach the sysreport to this BZ, please? Just
run 'sysreport' and upload the tarball it creates. Thanks.

Comment 11 Konrad Rzeszutek 2007-02-02 14:39:56 UTC
James,

Any update on running the command?

Comment 12 Konrad Rzeszutek 2007-03-23 22:09:48 UTC
ping?

Comment 13 Konrad Rzeszutek 2007-04-23 17:50:57 UTC
I am closing this BZ as DEFERRED. James, pls re-open this and upload the tarball
from running sysreport.

