Bug 207690 - bootup of node hung after process_reply duplicateid message
Summary: bootup of node hung after process_reply duplicateid message
Keywords:
Status: CLOSED DUPLICATE of bug 217626
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: cman
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-09-22 15:49 UTC by Lenny Maiorani
Modified: 2009-04-16 20:01 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-01-24 15:56:24 UTC
Embargoed:


Description Lenny Maiorani 2006-09-22 15:49:17 UTC
Description of problem:
from /var/log/messages

Sep 21 06:57:11 igrid01 kernel: CMAN <CVS> (built Aug 15 2006 02:14:50) installed
Sep 21 06:57:11 igrid01 kernel: NET: Registered protocol family 30
Sep 21 06:57:11 igrid01 kernel: CMAN: Waiting to join or form a Linux-cluster
Sep 21 06:57:13 igrid01 kernel: CMAN: sending membership request
Sep 21 06:57:13 igrid01 kernel: CMAN: sending membership request
Sep 21 06:57:13 igrid01 kernel: CMAN: got node igrid03
Sep 21 06:57:13 igrid01 kernel: CMAN: got node igrid02
Sep 21 06:57:13 igrid01 kernel: CMAN: quorum regained, resuming activity
Sep 21 06:57:13 igrid01 kernel: DLM <CVS> (built Aug 15 2006 02:15:00) installed
Sep 21 06:57:13 igrid01 cman: startup succeeded
Sep 21 06:57:13 igrid01 kernel: DLM Opaque Thread started
Sep 21 06:57:14 igrid01 hald[19694]: Timed out waiting for hotplug event 2114. Rebasing to 2114
Sep 21 06:57:55 igrid01 kernel: SM: 00000000 process_reply duplicateid=1 nodeid=3 3/3


Version-Release number of selected component (if applicable):
cman-1.0.4-0.x86_64.rpm
kernel 2.6.9-34

How reproducible: occurred once


Steps to Reproduce:
1. unknown
2.
3.

Comment 1 David Teigland 2006-09-25 21:50:34 UTC
This may not be a problem.  What were you doing when it happened?
Were there any apparent problems with gfs/dlm/clvm/fencing?
If it happens again, collect the output from the following
commands on all cluster nodes (in addition to /var/log/messages):

mount
cman_tool status
cman_tool nodes
cman_tool services
ps ax -o pid,stat,cmd,wchan
cat /proc/cluster/sm_debug
cat /proc/cluster/dlm_debug
cat /proc/cluster/lock_dlm/debug
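
The commands above could be gathered in one pass on each node with a small helper. The following is only a sketch and not part of the original comment: it assumes a Python 3 interpreter is available on the node and that it runs as root, and the names `COMMANDS`, `run_command`, and `collect` are hypothetical. A command that is missing or that hangs is recorded in the report rather than aborting the run.

```python
import shlex
import subprocess

# The diagnostic commands requested in the comment, in the same order.
COMMANDS = [
    "mount",
    "cman_tool status",
    "cman_tool nodes",
    "cman_tool services",
    "ps ax -o pid,stat,cmd,wchan",
    "cat /proc/cluster/sm_debug",
    "cat /proc/cluster/dlm_debug",
    "cat /proc/cluster/lock_dlm/debug",
]


def run_command(command, timeout=30):
    """Run one command and return its combined stdout/stderr as text.

    The timeout (an assumed default of 30 seconds) keeps a hung node
    from blocking the rest of the collection.
    """
    try:
        result = subprocess.run(
            shlex.split(command),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
        return result.stdout
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Missing binary, permission problem, or timeout: note it and move on.
        return "[could not run: {}]\n".format(exc)


def collect(commands=COMMANDS, runner=run_command):
    """Return one report string with a header line before each command's output."""
    sections = []
    for command in commands:
        sections.append("===== {} =====\n{}".format(command, runner(command)))
    return "\n".join(sections)
```

Calling `print(collect())` on every cluster node and saving the output next to that node's /var/log/messages would give one file per node to attach to the bug.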


Comment 2 Lenny Maiorani 2006-09-27 02:39:44 UTC
Only seen this once, but I will keep my eyes peeled and collect this info the
next time!

Comment 3 Christine Caulfield 2006-11-29 11:00:17 UTC
I wonder if this is the same as (or related to) bug 217626?

Comment 4 Christine Caulfield 2007-01-24 15:56:24 UTC
I'll set this to be a duplicate of 217626, as it looks identical to me. If
you see it again with that patch or release then feel free to reopen this one.

*** This bug has been marked as a duplicate of 217626 ***

