Bug 167257 - NMI watchdog lockup while attempting to rejoin cluster
Summary: NMI watchdog lockup while attempting to rejoin cluster
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 166701
Blocks:
Reported: 2005-08-31 23:22 UTC by Henry Harris
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-09-19 20:51:04 UTC
Target Upstream Version:
Embargoed:


Attachments
KDB info from failed node including dmesg, ps,sr t, etc. (280.00 KB, text/plain)
2005-08-31 23:22 UTC, Henry Harris

Description Henry Harris 2005-08-31 23:22:44 UTC
Description of problem: On a two-node cluster, one of the nodes was removed
from the cluster.  When it attempted to rejoin the cluster, the kernel panicked
due to an NMI watchdog lockup.


Version-Release number of selected component (if applicable):


How reproducible:
First time this problem has been seen.

Steps to Reproduce:
1. Remove one node from a two-node cluster
2. Rejoin the node to the cluster
  
Actual results: Kernel panic


Expected results: Normal operation


Additional info:  This problem may be related to bug #166701 as it occurred in 
low memory and a spinlock was involved.

Comment 1 Henry Harris 2005-08-31 23:22:45 UTC
Created attachment 118323
KDB info from failed node including dmesg, ps,sr t, etc.

Comment 2 Ben Marzinski 2005-09-14 18:04:00 UTC
Let me know if this issue shows up again while running the U2 GFS code.

Comment 3 Benjamin Kahn 2006-05-16 15:17:32 UTC
Doesn't actually block RHEL4NFSFailover

