Bug 203829 - node fenced after einval messages
Status: CLOSED DUPLICATE of bug 199673
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: magma
Version: 4
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Reported: 2006-08-23 17:41 EDT by Lenny Maiorani
Modified: 2009-04-16 16:20 EDT
Doc Type: Bug Fix
Last Closed: 2006-10-05 16:05:25 EDT

Description Lenny Maiorani 2006-08-23 17:41:52 EDT
Description of problem:
Node was fenced just after these errors appeared in /var/log/messages. Not sure
if node panicked or not.

Aug 17 09:51:22 igrid02 ntpd[11533]: synchronized to 128.118.1.137, stratum 2
Aug 17 09:51:48 igrid02 clurgmgrd: [4166]: <info> Executing /opt/crosswalk/email_notifier/run_email_notifier status
Aug 17 09:52:48 igrid02 last message repeated 2 times
Aug 17 09:54:18 igrid02 last message repeated 3 times
Aug 17 09:54:41 igrid02 kernel: ly einval 17b0219 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (20221) req reply einval 17602cd fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (27826) req reply einval 179009a fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (29059) req reply einval 1550004 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (14073) req reply einval 18e00ed fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (25612) req reply einval 19402d7 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (19889) req reply einval 14c036a fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (20681) req reply einval 175029c fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (813) req reply einval 16e00a0 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (29247) req reply einval 1750201 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (3929) req reply einval 18200b6 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (26478) req reply einval 164014e fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (8886) req reply einval 183035e fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (18805) req reply einval 178003f fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma (23383) req reply einval 17d0123 fr 1 r 1 usrm::vf
Aug 17 09:54:41 igrid02 kernel: Magma send einval to 1


Version-Release number of selected component (if applicable):
magma-1.0.4-U4pre1 received from Lon

How reproducible:
Not sure how to reproduce


Actual results:
node gets fenced

Expected results:
node recovers from einval messages

Additional info:
U4pre1 was taken from bz #193128
Comment 1 Lon Hohberger 2006-08-24 11:14:10 EDT
The DLM is producing those messages about the Magma lockspace.  I'm not sure off
the top of my head what they mean.

Comment 2 David Teigland 2006-08-24 11:22:52 EDT
The messages are warnings that usually indicate two (or more) nodes
are repeatedly and quickly locking/unlocking the same lock.  The
dlm can generally deal with this, but it's something that we want
to avoid when possible.  (Some recent rgmanager changes were made
to avoid situations like this by using lock conversions.)

There's no direct link between these messages and the node failing.
What did cman report on the other nodes as the reason for this node
being removed from the cluster and fenced?
Comment 3 Lon Hohberger 2006-08-24 11:55:53 EDT
Correct. In the U4 magma plugin(s) + magma + rgmanager, we take an NL lock and
attempt to promote it to EX in a loop (rather than the old behavior, which was
to request a noqueue+EX lock in a loop until we got the lock).
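
The difference between the two loops can be sketched with a toy, single-process model. This is only an illustration under stated assumptions: ToyLock, simulate_old and simulate_new are hypothetical names, not the DLM, magma or rgmanager API, and the model says nothing about what actually triggers the einval warnings. It only counts how many fresh lock requests versus conversions each strategy issues while another node ("A") holds the lock.

```python
# Toy, single-process model of the two acquisition strategies.
# Not the DLM or magma API: ToyLock, simulate_old and simulate_new are
# hypothetical names used purely for illustration.

NL, EX = "NL", "EX"  # null and exclusive lock modes


class ToyLock:
    """One named lock resource, tracking the mode each node holds."""

    def __init__(self):
        self.holders = {}       # node name -> mode currently held
        self.new_requests = 0   # fresh lock requests issued
        self.conversions = 0    # mode changes on an already-held lock

    def _ex_held_by_other(self, node):
        return any(mode == EX and holder != node
                   for holder, mode in self.holders.items())

    def request(self, node, mode):
        """Fresh noqueue request: granted at once or refused at once."""
        self.new_requests += 1
        if mode == EX and self._ex_held_by_other(node):
            return False
        self.holders[node] = mode
        return True

    def convert(self, node, mode):
        """Noqueue mode change on a lock the node already holds."""
        if node not in self.holders:
            raise KeyError(f"{node} holds no lock to convert")
        self.conversions += 1
        if mode == EX and self._ex_held_by_other(node):
            return False
        self.holders[node] = mode
        return True

    def unlock(self, node):
        self.holders.pop(node, None)


def simulate_old(busy_attempts):
    """Old behavior: B retries a fresh noqueue EX request until it wins."""
    lock = ToyLock()
    lock.request("A", EX)
    failures = 0
    while not lock.request("B", EX):
        failures += 1
        if failures >= busy_attempts:
            lock.unlock("A")        # A finally drops the lock entirely
    return lock


def simulate_new(busy_attempts):
    """U4 behavior: B takes NL once, then retries the NL -> EX conversion."""
    lock = ToyLock()
    lock.request("A", NL)
    lock.convert("A", EX)
    lock.request("B", NL)           # NL is compatible with A's EX
    failures = 0
    while not lock.convert("B", EX):
        failures += 1
        if failures >= busy_attempts:
            lock.convert("A", NL)   # A demotes instead of unlocking
    return lock


if __name__ == "__main__":
    for name, sim in (("old", simulate_old), ("new", simulate_new)):
        result = sim(3)
        print(name, "fresh requests:", result.new_requests,
              "conversions:", result.conversions)
```

In this toy model the old loop issues a fresh request on every retry, while the new loop issues one fresh request per node and retries only the conversion on a lock it already holds.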
Comment 5 Lon Hohberger 2006-10-05 16:05:25 EDT

*** This bug has been marked as a duplicate of 199673 ***
