Bug 510284 - Dell T7400 dual Xeon quad core (E5405) system crash/freeze (X86_64)
Summary: Dell T7400 dual Xeon quad core (E5405) system crash/freeze (X86_64)
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-07-08 15:12 UTC by Ian Dickens
Modified: 2023-09-14 01:17 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-02 13:03:12 UTC
Target Upstream Version:
Embargoed:


Description Ian Dickens 2009-07-08 15:12:35 UTC
Description of problem:

Background - Dell T7400 with 2 quad-core Xeon processors (E5405) and 32G of RAM running RHEL 5.3. We also have the auditing system configured, running the NISPOM ruleset.

We are attempting to use condor on these systems (we have 2 of them), allowing condor to use 6 of the 8 cores. When condor is told to use more than 2 cores, the system freezes beyond recovery. The machines do not crash when using only 2 cores. I read up on the apci threads, tried RH kernel 2.6.18-156, and was able to reproduce the freeze there as well. Once frozen, the system does not respond to any network/console probing (as you would expect). It's dead...

The condor tools run as a non-privileged user...

Version-Release number of selected component (if applicable):

RHEL 5.3 using either kernel 2.6.18-128 or 2.6.18-156 for x86_64 platforms.


How reproducible:

So far... Configure condor to assign slots to more than 2 of the 8 cores. The crash is almost immediate once the test jobs are submitted. However, if you configure condor to use only 2 cores, the jobs complete as expected. I am guessing this might be a kernel bug since the whole machine dies...

To be fair: I am not 100% sure condor is not to blame, but since it runs as a non-root user the kernel should not crash; condor should...
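
Editorial sketch (not part of the original report): one way to check whether the freeze depends on condor itself or only on the number of busy cores might be to generate plain CPU load from an unprivileged account with condor stopped. The helper below is a hypothetical illustration; the name `burn_cores` and its parameters are assumptions, not anything the reporter ran. It forks one busy-looping child per requested core and collects the exit codes, using only `os` and `time` from the standard library.

```python
import os
import time


def burn_cores(num_workers, seconds):
    """Fork num_workers CPU-bound children and return their exit codes.

    Hypothetical helper for illustration only; it does not involve condor.
    """
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child: spin on a counter until the deadline, then exit
            # immediately without running the parent's cleanup handlers.
            deadline = time.time() + seconds
            counter = 0
            while time.time() < deadline:
                counter += 1
            os._exit(0)
        pids.append(pid)

    # Parent: wait for every child and record its exit code.
    exit_codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))
    return exit_codes
```

Calling `burn_cores(6, 60)` would keep six processes busy for about a minute, roughly mirroring the 6-of-8-core setup described above. If the machine also froze under this load, that would point away from condor; if it did not, the result would be inconclusive, since condor jobs may exercise memory, I/O, or the audit subsystem in ways a busy loop does not.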

Comment 1 RHEL Program Management 2014-03-07 12:40:09 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 2 RHEL Program Management 2014-06-02 13:03:12 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).

Comment 3 Red Hat Bugzilla 2023-09-14 01:17:10 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

