Bug 222875

Summary:

Kernel panic - not syncing: SM: Record message above and reboot.

Product:

[Retired] Red Hat Cluster Suite

Reporter:

Tomasz Jaszowski <tjaszowski>

Component:

cman

Assignee:

Christine Caulfield <ccaulfie>

Status:

CLOSED CANTFIX

QA Contact:

Cluster QE <mspqa-list>

Severity:

high

Priority:

medium

CC:

cluster-maint, teigland

Hardware:

i686

OS:

Linux

Doc Type:

Bug Fix

Last Closed:

2008-01-08 14:04:29 UTC

Attachments:

Description	Flags
kernel panic on iLO	none

Description Tomasz Jaszowski 2007-01-16 17:58:39 UTC

Description of problem:
We have a two-node cluster with shared storage (GFS).
While testing GFS (creating, mounting, unmounting, and recreating file systems),
one node hit a kernel panic during or after a reboot. Node fencing was
effectively disabled because an invalid password had been set.

Version-Release number of selected component (if applicable):

Red Hat Enterprise Linux ES release 4 (Nahant Update 4)


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:
On iLO we noticed:

CMAN: removing node tefse-pro1 from the cluster : Missed too many heartbeats


SM: Assertion failed on line 52 of file
/usr/src/redhat/BUILD/cman-kernel-2.6.9-45/smp/src/sm_misc.c
SM: assertion: "!error"
SM time = 230783
error = -1, nodeid = 4294967295

Kernel panic - not syncing: SM: Record message above and reboot.

Expected results:
Why did this node get a kernel panic?

Additional info:
Unfortunately I don't have a timestamp, and I can't find any logs for this
kernel panic in /var/log/messages.
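
Because the console capture is the only surviving record, it can help to pull the individual fields out of it before comparing against other reports. The following is a minimal sketch, not part of the original report: it assumes the iLO console text has been saved as a string, and the helper name `parse_sm_assertion` and its regular expressions are hypothetical.

```python
import re

# Console text exactly as quoted in this report (captured from iLO).
CONSOLE = """\
CMAN: removing node tefse-pro1 from the cluster : Missed too many heartbeats

SM: Assertion failed on line 52 of file
/usr/src/redhat/BUILD/cman-kernel-2.6.9-45/smp/src/sm_misc.c
SM: assertion: "!error"
SM time = 230783
error = -1, nodeid = 4294967295

Kernel panic - not syncing: SM: Record message above and reboot.
"""


def parse_sm_assertion(text):
    """Hypothetical helper: extract key fields from an SM assertion dump.

    Returns a dict of strings; fields that are not found are omitted.
    """
    patterns = {
        "line": r"Assertion failed on line (\d+) of file",
        # The file path is wrapped onto the next line, so allow whitespace.
        "file": r"of file\s+(\S+)",
        "assertion": r'SM: assertion: "([^"]*)"',
        "sm_time": r"SM time = (\d+)",
        "error": r"error = (-?\d+)",
        "nodeid": r"nodeid = (\d+)",
    }
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            fields[name] = match.group(1)
    return fields


fields = parse_sm_assertion(CONSOLE)
for name, value in fields.items():
    print(f"{name}: {value}")
```

The values are kept as strings so that nothing is reinterpreted on the way in; any conversion (for example of `nodeid`) is left to the caller.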

Comment 1 Tomasz Jaszowski 2007-01-16 17:58:39 UTC

Created attachment 145711 [details]
kernel panic on iLO

Comment 2 Christine Caulfield 2007-01-17 11:19:58 UTC

It seems to be requesting node id -1 from cman (which is invalid).

Dave: did cman give SM this node number?
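
For readers wondering how the panic's `nodeid = 4294967295` relates to "node id -1": 4294967295 is the bit pattern of -1 when read as an unsigned 32-bit integer. The sketch below only illustrates that arithmetic. It assumes the node ID is held in a 32-bit field and printed as unsigned, which the report does not confirm, and the function names are hypothetical (not taken from the cman source).

```python
NODEID_FROM_PANIC = 4294967295  # value printed in the panic message


def as_signed_32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def as_unsigned_32(value):
    """Reinterpret a signed 32-bit value as an unsigned 32-bit integer."""
    return value & 0xFFFFFFFF


print(as_signed_32(NODEID_FROM_PANIC))  # -1
print(as_unsigned_32(-1))               # 4294967295
```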

Comment 3 Tomasz Jaszowski 2007-01-22 12:16:06 UTC

Hi,

 Any ideas on how to avoid this kernel panic? (We would like to put this system
into production, so an answer to this bug is becoming critical...)

Thanks

Comment 4 David Teigland 2007-01-22 15:01:37 UTC

Could you describe exactly what you did to get this?  And does it happen
every time you do that?

Comment 5 Tomasz Jaszowski 2007-01-24 08:08:02 UTC

It happened while we were configuring GFS partitions. We created them, added
them to fstab, mounted and unmounted them a few times, and rebooted a few times.
After one of those reboots we saw that message on one of the nodes.
Unfortunately I can't provide exact steps to reproduce it.

We didn't try to reproduce it, and it happened only once.

Comment 6 Tomasz Jaszowski 2007-01-25 21:20:44 UTC

(In reply to comment #5)

Nothing more to add.

Comment 7 Tomasz Jaszowski 2007-02-02 15:38:04 UTC

Hi

Any ideas?

Comment 8 Christine Caulfield 2007-02-02 17:19:39 UTC

Not without any more information, no. Sorry.

Comment 9 Tomasz Jaszowski 2008-01-08 14:04:29 UTC

As I'm not able to provide more detailed information, I'm setting this to CANTFIX.

Thanks for the help.