Bug 203829 - node fenced after einval messages
Summary: node fenced after einval messages
Keywords:
Status: CLOSED DUPLICATE of bug 199673
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: magma
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-08-23 21:41 UTC by Lenny Maiorani
Modified: 2009-04-16 20:20 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-10-05 20:05:25 UTC
Embargoed:


Description Lenny Maiorani 2006-08-23 21:41:52 UTC
Description of problem:
Node was fenced just after these errors appeared in /var/log/messages. Not sure
if node panicked or not.

Aug 17 09:51:22 igrid02 ntpd[11533]: synchronized to 128.118.1.137, stratum 2
Aug 17 09:51:48 igrid02 clurgmgrd: [4166]: <info> Executing /opt/crosswalk/email_notifier/run_email_notifier status
Aug 17 09:52:48 igrid02 last message repeated 2 times
Aug 17 09:54:18 igrid02 last message repeated 3 times
Aug 17 09:54:41 igrid02 kernel: ly einval 17b0219 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (20221) req reply einval 17602cd fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (27826) req reply einval 179009a fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (29059) req reply einval 1550004 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (14073) req reply einval 18e00ed fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (25612) req reply einval 19402d7 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (19889) req reply einval 14c036a fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (20681) req reply einval 175029c fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (813) req reply einval 16e00a0 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (29247) req reply einval 1750201 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (3929) req reply einval 18200b6 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (26478) req reply einval 164014e fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (8886) req reply einval 183035e fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (18805) req reply einval 178003f fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (23383) req reply einval 17d0123 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma send einval to 1


Version-Release number of selected component (if applicable):
magma-1.0.4-U4pre1 received from Lon

How reproducible:
Not sure how to reproduce


Actual results:
node gets fenced

Expected results:
node recovers from einval messages

Additional info:
U4pre1 was taken from bz #193128

Comment 1 Lon Hohberger 2006-08-24 15:14:10 UTC
The DLM is producing those messages about the Magma lockspace.  I'm not sure off
the top of my head what they mean.



Comment 2 David Teigland 2006-08-24 15:22:52 UTC
The messages are warnings that usually indicate two (or more) nodes
are repeatedly and quickly locking/unlocking the same lock.  The
dlm can generally deal with this, but it's something that we want
to avoid when possible.  (Some recent rgmanager changes were made
to avoid situations like this by using lock conversions.)

There's no direct link between these messages and the node failing.
What did cman report on the other nodes as the reason for this node
being removed from the cluster and fenced?


Comment 3 Lon Hohberger 2006-08-24 15:55:53 UTC
Correct, in the U4 magma plugin(s) + magma + rgmanager, we take a NL lock and
attempt to promote it to EX in a loop (rather than the old behavior, which was
to take a noqueue+EX in a loop until we got the lock).
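
To make the difference between the two loops concrete, here is a minimal toy
model. It is only a sketch: ToyResource, old_style, new_style and run are
hypothetical names invented for illustration, not the libdlm or magma API, and
the model assumes the conversion attempt also fails immediately rather than
queueing. It only counts how many brand-new lock requests versus in-place
conversions each strategy issues while another node holds EX; it does not
attempt to reproduce the einval messages.

```python
class ToyResource:
    """Hypothetical stand-in for a single lock resource (not the DLM API)."""

    def __init__(self):
        self.holders = {}      # owner -> granted mode, "NL" or "EX"
        self.new_requests = 0  # brand-new lock requests seen
        self.conversions = 0   # in-place mode conversions seen

    def _ex_held_by_other(self, owner):
        return any(m == "EX" and o != owner for o, m in self.holders.items())

    def request(self, owner, mode):
        """New request that fails at once instead of waiting (noqueue)."""
        self.new_requests += 1
        if mode == "EX" and self._ex_held_by_other(owner):
            return False
        self.holders[owner] = mode
        return True

    def convert(self, owner, mode):
        """Change the mode of a lock the owner already holds."""
        self.conversions += 1
        if mode == "EX" and self._ex_held_by_other(owner):
            return False
        self.holders[owner] = mode
        return True

    def release(self, owner):
        self.holders.pop(owner, None)


def old_style(res, owner, tick, max_tries=100):
    """Old behavior: issue a fresh noqueue EX request on every iteration."""
    for _ in range(max_tries):
        if res.request(owner, "EX"):
            return True
        tick()
    return False


def new_style(res, owner, tick, max_tries=100):
    """Newer behavior: take NL once, then try to promote it to EX."""
    res.request(owner, "NL")  # NL conflicts with nothing, so it is granted
    for _ in range(max_tries):
        if res.convert(owner, "EX"):
            return True
        tick()
    return False


def run(strategy, busy_rounds):
    """node1 holds EX and drops it after busy_rounds failed tries by node2."""
    res = ToyResource()
    res.request("node1", "EX")
    res.new_requests = 0  # count only node2's traffic
    failures = 0

    def tick():
        nonlocal failures
        failures += 1
        if failures == busy_rounds:
            res.release("node1")

    strategy(res, "node2", tick)
    return res.new_requests, res.conversions


if __name__ == "__main__":
    print("old:", run(old_style, 3))
    print("new:", run(new_style, 3))
```

In this model the old loop issues one new lock request per retry, while the
NL-then-promote loop issues a single new request and retries only the
conversion on a lock that already exists.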


Comment 5 Lon Hohberger 2006-10-05 20:05:25 UTC

*** This bug has been marked as a duplicate of 199673 ***

