Bug 222875

Summary: Kernel panic - not syncing: SM: Record message above and reboot.
Product: [Retired] Red Hat Cluster Suite
Reporter: Tomasz Jaszowski <tjaszowski>
Component: cman
Assignee: Christine Caulfield <ccaulfie>
Status: CLOSED CANTFIX
QA Contact: Cluster QE <mspqa-list>
Severity: high
Docs Contact:
Priority: medium
Version: 4
CC: cluster-maint, teigland
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-01-08 14:04:29 UTC Type: ---
Attachments:
Description Flags
kernel panic on iLO none

Description Tomasz Jaszowski 2007-01-16 17:58:39 UTC
Description of problem:
We have a two-node cluster with shared storage (GFS).
While testing GFS (creating, mounting, unmounting, and recreating file systems),
we got a kernel panic during or after a reboot. Node fencing was effectively
disabled because an invalid password had been set.

Version-Release number of selected component (if applicable):

Red Hat Enterprise Linux ES release 4 (Nahant Update 4)


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:
On the iLO console we noticed:

CMAN: removing node tefse-pro1 from the cluster : Missed too many heartbeats


SM: Assertion failed on line 52 of file /usr/src/redhat/BUILD/cman-kernel-2.6.9-45/smp/src/sm_misc.c
SM: assertion: "!error"
SM time = 230783
error = -1, nodeid = 4294967295

Kernel panic - not syncing: SM: Record message above and reboot.

Expected results:
Why did this node get a kernel panic?

Additional info:
Unfortunately I don't have a timestamp, and I can't find any log entries for
this kernel panic in /var/log/messages.
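
For anyone checking saved console output or syslog files for related entries, a small filter along these lines may help. This is only a sketch: the marker strings are taken from the messages quoted above, the function name `find_cluster_messages` is hypothetical, and it is not part of any cluster tool.

```python
# Marker strings copied from the console messages quoted in this report.
MARKERS = ("CMAN:", "SM:", "Kernel panic")

def find_cluster_messages(lines):
    """Return (line_number, text) pairs for lines containing any marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits

# Sample input built from the messages in this report plus one unrelated line.
sample = [
    "some unrelated line\n",
    "CMAN: removing node tefse-pro1 from the cluster : Missed too many heartbeats\n",
    "SM: assertion: \"!error\"\n",
    "Kernel panic - not syncing: SM: Record message above and reboot.\n",
]

for number, text in find_cluster_messages(sample):
    print(number, text)
```

The same function accepts an open file object, for example one opened on /var/log/messages, since it only iterates over lines.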

Comment 1 Tomasz Jaszowski 2007-01-16 17:58:39 UTC
Created attachment 145711 [details]
kernel panic on iLO

Comment 2 Christine Caulfield 2007-01-17 11:19:58 UTC
It seems to be requesting node id -1 from cman (which is invalid).

Dave: did cman give SM this node number?
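
For readers wondering how the nodeid of 4294967295 in the panic output relates to -1: that number is what -1 looks like when a 32-bit value is printed as unsigned. A minimal Python sketch of the conversion follows (the helper names are hypothetical, and this is an illustration only, not cman or SM code):

```python
import struct

def as_unsigned32(value):
    """Reinterpret a signed 32-bit integer as unsigned (two's complement)."""
    return struct.unpack("<I", struct.pack("<i", value))[0]

def as_signed32(value):
    """Reinterpret an unsigned 32-bit integer as signed (two's complement)."""
    return struct.unpack("<i", struct.pack("<I", value))[0]

print(as_unsigned32(-1))        # 4294967295
print(as_signed32(4294967295))  # -1
```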

Comment 3 Tomasz Jaszowski 2007-01-22 12:16:06 UTC
Hi,

Any ideas on how to avoid this kernel panic? (We would like to put this system
into production, so an answer to this bug is becoming critical.)

Thanks



Comment 4 David Teigland 2007-01-22 15:01:37 UTC
Could you describe exactly what you did to get this?  And does it happen
every time you do that?


Comment 5 Tomasz Jaszowski 2007-01-24 08:08:02 UTC
It happened during configuration of the GFS partitions. We created them, added
them to fstab, mounted and unmounted them a few times, and rebooted a few times.
After one of those reboots we saw that message on one of the nodes.
Unfortunately I can't provide exact steps to reproduce it.

We didn't try to reproduce it, and it happened only once.

Comment 6 Tomasz Jaszowski 2007-01-25 21:20:44 UTC
(In reply to comment #5)

Nothing more to add.

Comment 7 Tomasz Jaszowski 2007-02-02 15:38:04 UTC
Hi

Any ideas?

Comment 8 Christine Caulfield 2007-02-02 17:19:39 UTC
Not without any more information, no. Sorry.

Comment 9 Tomasz Jaszowski 2008-01-08 14:04:29 UTC
As I'm not able to provide more detailed information, I'm setting this to CANTFIX.

Thanks for the help.