Bug 208968

Summary: Unable to obtain cluster lock (err #48 and #50)
Product: [Retired] Red Hat Cluster Suite Reporter: Lenny Maiorani <lenny>
Component: dlm Assignee: Lon Hohberger <lhh>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: medium    
Version: 4 CC: cluster-maint, hklein, nobody+wcheng, scott.cannata
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007:0149 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-12-13 18:42:21 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---

Description Lenny Maiorani 2006-10-02 19:41:31 UTC
Description of problem:


Version-Release number of selected component (if applicable):
1.9.53

How reproducible:
Seen several times, but no known way to reproduce it.

Steps to Reproduce: Unknown
  
Actual results: 
Access to data through VIP services is disabled

Expected results: 
Access to data through VIP services remains available.

Additional info:
Excerpts from /var/log/messages on both nodes.

Node 1:
Sep 30 19:44:46 frii01 kernel: dlm: capture: process_lockqueue_reply id 87c3007b
state 0
Sep 30 20:24:54 frii01 kernel: dlm: capture: process_lockqueue_reply id 8b4903b5
state 0
Sep 30 20:25:03 frii01 kernel: dlm: midcomms: bad header version 34000045
Sep 30 20:25:03 frii01 kernel: dlm: midcomms: cmd=0, flags=41, length=64,
lkid=2226062912, lockspace=17435146
Sep 30 20:25:03 frii01 kernel: dlm: midcomms: base=000001005220d000,
offset=1720, len=1736, ret=1720, limit=00001000 newbuf=1
Sep 30 20:25:03 frii01 kernel: 45 00 00 34 00 29 40 00-40 06 af 84 0a 0a 0a 01
Sep 30 20:25:03 frii01 kernel: 0a 0a 0a 02 52 48 95 54-e0 d7 14 c2 50 84 4d c4
Sep 30 20:25:03 frii01 kernel: 80 10 11 33 43 33 00 00-01 01 08 0a 0e 00 96 18
Sep 30 20:25:03 frii01 kernel: 0f 03 cb a5
Sep 30 20:25:03 frii01 kernel: ff ff ff ff
Sep 30 20:25:03 frii01 kernel: 46 02 00 00
Sep 30 20:25:03 frii01 kernel: 00
Sep 30 20:25:03 frii01 last message repeated 3 times
Sep 30 20:25:03 frii01 kernel: dlm: lowcomms: addr=000001005220d000, base=0,
len=3456, iov_len=4096, iov_base[0]=000001005220dd80, read=3456
Sep 30 20:25:03 frii01 kernel: dlm: capture: process_lockqueue_reply id 8d390207
state 0
Sep 30 20:25:03 frii01 kernel: dlm: capture: process_lockqueue_reply state 0
Sep 30 20:25:03 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:03 frii01 kernel: dlm: reply
Sep 30 20:25:03 frii01 kernel: rh_cmd 5
Sep 30 20:25:03 frii01 kernel: rh_lkid 8c6402fe
Sep 30 20:25:03 frii01 kernel: lockstate 0
Sep 30 20:25:03 frii01 kernel: nodeid 2
Sep 30 20:25:03 frii01 kernel: status 4294901758
Sep 30 20:25:03 frii01 kernel: lkid 0
Sep 30 20:25:03 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:03 frii01 kernel: dlm: reply
Sep 30 20:25:04 frii01 kernel: rh_cmd 5
Sep 30 20:25:04 frii01 kernel: rh_lkid 8b580314
Sep 30 20:25:04 frii01 kernel: lockstate 0
Sep 30 20:25:04 frii01 kernel: nodeid 2
Sep 30 20:25:04 frii01 kernel: status 4294901758
Sep 30 20:25:04 frii01 kernel: lkid 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: process_lockqueue_reply id 8b6103d0
state 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: process_lockqueue_reply id 8b31021b
state 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:04 frii01 kernel: dlm: reply
Sep 30 20:25:04 frii01 kernel: rh_cmd 5
Sep 30 20:25:04 frii01 kernel: rh_lkid 8ac1010c
Sep 30 20:25:04 frii01 kernel: lockstate 0
Sep 30 20:25:04 frii01 kernel: nodeid 1
Sep 30 20:25:04 frii01 kernel: status 4294901758
Sep 30 20:25:04 frii01 kernel: lkid 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: process_lockqueue_reply id 88d901b9
state 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:04 frii01 kernel: dlm: reply
Sep 30 20:25:04 frii01 kernel: rh_cmd 5
Sep 30 20:25:04 frii01 kernel: rh_lkid 8c340274
Sep 30 20:25:04 frii01 kernel: lockstate 0
Sep 30 20:25:04 frii01 kernel: nodeid 1
Sep 30 20:25:04 frii01 kernel: status 4294901758
Sep 30 20:25:04 frii01 kernel: lkid 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:04 frii01 kernel: dlm: reply
Sep 30 20:25:04 frii01 kernel: rh_cmd 5
Sep 30 20:25:04 frii01 kernel: rh_lkid 8afb024a
Sep 30 20:25:04 frii01 kernel: lockstate 0
Sep 30 20:25:04 frii01 kernel: nodeid 1
Sep 30 20:25:04 frii01 kernel: status 4294901758
Sep 30 20:25:04 frii01 kernel: lkid 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: process_lockqueue_reply id 89150231
state 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: process_lockqueue_reply id 8a0b0264
state 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: process_lockqueue_reply id 89b50211
state 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:04 frii01 kernel: dlm: reply
Sep 30 20:25:04 frii01 kernel: rh_cmd 5
Sep 30 20:25:04 frii01 kernel: rh_lkid 8b4e00b3
Sep 30 20:25:04 frii01 kernel: lockstate 0
Sep 30 20:25:04 frii01 kernel: nodeid 1
Sep 30 20:25:04 frii01 kernel: status 4294901758
Sep 30 20:25:04 frii01 kernel: lkid 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: process_lockqueue_reply id 8bbd02de
state 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:04 frii01 kernel: dlm: reply
Sep 30 20:25:04 frii01 kernel: rh_cmd 5
Sep 30 20:25:04 frii01 kernel: rh_lkid 8b0000ff
Sep 30 20:25:04 frii01 kernel: lockstate 0
Sep 30 20:25:04 frii01 kernel: nodeid 1
Sep 30 20:25:04 frii01 kernel: status 4294901758
Sep 30 20:25:04 frii01 kernel: lkid 0
Sep 30 20:25:04 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:05 frii01 kernel: dlm: reply
Sep 30 20:25:05 frii01 kernel: rh_cmd 5
Sep 30 20:25:05 frii01 kernel: rh_lkid 8c6303df
Sep 30 20:25:05 frii01 kernel: lockstate 0
Sep 30 20:25:05 frii01 kernel: nodeid 1
Sep 30 20:25:05 frii01 kernel: status 4294901758
Sep 30 20:25:05 frii01 kernel: lkid 0
Sep 30 20:25:05 frii01 kernel: dlm: capture: process_lockqueue_reply id 89c902a2
state 0
Sep 30 20:25:05 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:05 frii01 kernel: dlm: reply
Sep 30 20:25:05 frii01 kernel: rh_cmd 5
Sep 30 20:25:05 frii01 kernel: rh_lkid 8a6901d9
Sep 30 20:25:05 frii01 kernel: lockstate 0
Sep 30 20:25:05 frii01 kernel: nodeid 1
Sep 30 20:25:05 frii01 kernel: status 4294901758
Sep 30 20:25:05 frii01 kernel: lkid 0
Sep 30 20:25:05 frii01 kernel: dlm: capture: process_lockqueue_reply id 8c4c0063
state 0
Sep 30 20:25:05 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:05 frii01 kernel: dlm: reply
Sep 30 20:25:05 frii01 kernel: rh_cmd 5
Sep 30 20:25:05 frii01 kernel: rh_lkid 8b9b0199
Sep 30 20:25:05 frii01 kernel: lockstate 0
Sep 30 20:25:05 frii01 kernel: nodeid 1
Sep 30 20:25:05 frii01 kernel: status 4294901758
Sep 30 20:25:05 frii01 kernel: lkid 0
Sep 30 20:25:05 frii01 kernel: dlm: capture: process_lockqueue_reply id 8b600314
state 0
Sep 30 20:25:05 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:05 frii01 kernel: dlm: reply
Sep 30 20:25:05 frii01 kernel: rh_cmd 5
Sep 30 20:25:05 frii01 kernel: rh_lkid 8ab30074
Sep 30 20:25:05 frii01 kernel: lockstate 0
Sep 30 20:25:05 frii01 kernel: nodeid 2
Sep 30 20:25:05 frii01 kernel: status 4294901758
Sep 30 20:25:05 frii01 kernel: lkid 0
Sep 30 20:25:05 frii01 kernel: dlm: midcomms: bad header version 34000045
Sep 30 20:25:05 frii01 kernel: dlm: midcomms: cmd=0, flags=41, length=64,
lkid=2226062912, lockspace=17435146
Sep 30 20:25:05 frii01 kernel: dlm: midcomms: base=000001005220d000,
offset=1720, len=2168, ret=1720, limit=00001000 newbuf=1
Sep 30 20:25:05 frii01 kernel: 45 00 00 34 00 29 40 00-40 06 af 84 0a 0a 0a 01
Sep 30 20:25:05 frii01 kernel: 0a 0a 0a 02 52 48 95 54-e0 d7 14 c2 50 84 4d c4
Sep 30 20:25:05 frii01 kernel: 80 10 11 33 43 33 00 00-01 01 08 0a 0e 00 96 18
Sep 30 20:25:05 frii01 kernel: 0f 03 cb a5
Sep 30 20:25:05 frii01 kernel: ff ff ff ff
Sep 30 20:25:05 frii01 kernel: 46 02 00 00
Sep 30 20:25:05 frii01 kernel: 00
Sep 30 20:25:05 frii01 last message repeated 3 times
Sep 30 20:25:05 frii01 kernel: dlm: lowcomms: addr=000001005220d000, base=0,
len=3888, iov_len=640, iov_base[0]=000001005220df30, read=432
Sep 30 20:25:05 frii01 kernel: dlm: capture: process_lockqueue_reply id 8d390207
state 0
Sep 30 20:25:05 frii01 kernel: dlm: capture: process_lockqueue_reply id 8b3c00d7
state 0
Sep 30 20:25:05 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:05 frii01 kernel: dlm: reply
Sep 30 20:25:05 frii01 kernel: rh_cmd 5
Sep 30 20:25:05 frii01 kernel: rh_lkid 8c6402fe
Sep 30 20:25:05 frii01 kernel: lockstate 0
Sep 30 20:25:05 frii01 kernel: nodeid 2
Sep 30 20:25:05 frii01 kernel: status 4294901758
Sep 30 20:25:05 frii01 kernel: lkid 0
Sep 30 20:25:05 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:05 frii01 kernel: dlm: reply
Sep 30 20:25:06 frii01 kernel: rh_cmd 5
Sep 30 20:25:06 frii01 kernel: rh_lkid 8b580314
Sep 30 20:25:06 frii01 kernel: lockstate 0
Sep 30 20:25:06 frii01 kernel: nodeid 2
Sep 30 20:25:06 frii01 kernel: status 4294901758
Sep 30 20:25:06 frii01 kernel: lkid 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: process_lockqueue_reply id 8b6103d0
state 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: process_lockqueue_reply id 8b31021b
state 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:06 frii01 kernel: dlm: reply
Sep 30 20:25:06 frii01 kernel: rh_cmd 5
Sep 30 20:25:06 frii01 kernel: rh_lkid 8ac1010c
Sep 30 20:25:06 frii01 kernel: lockstate 0
Sep 30 20:25:06 frii01 kernel: nodeid 1
Sep 30 20:25:06 frii01 kernel: status 4294901758
Sep 30 20:25:06 frii01 kernel: lkid 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: process_lockqueue_reply id 88d901b9
state 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:06 frii01 kernel: dlm: reply
Sep 30 20:25:06 frii01 kernel: rh_cmd 5
Sep 30 20:25:06 frii01 kernel: rh_lkid 8c340274
Sep 30 20:25:06 frii01 kernel: lockstate 0
Sep 30 20:25:06 frii01 kernel: nodeid 1
Sep 30 20:25:06 frii01 kernel: status 4294901758
Sep 30 20:25:06 frii01 kernel: lkid 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:06 frii01 kernel: dlm: reply
Sep 30 20:25:06 frii01 kernel: rh_cmd 5
Sep 30 20:25:06 frii01 kernel: rh_lkid 8afb024a
Sep 30 20:25:06 frii01 kernel: lockstate 0
Sep 30 20:25:06 frii01 kernel: nodeid 1
Sep 30 20:25:06 frii01 kernel: status 4294901758
Sep 30 20:25:06 frii01 kernel: lkid 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: process_lockqueue_reply id 89150231
state 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: process_lockqueue_reply id 8a0b0264
state 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: process_lockqueue_reply id 89b50211
state 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:06 frii01 kernel: dlm: reply
Sep 30 20:25:06 frii01 kernel: rh_cmd 5
Sep 30 20:25:06 frii01 kernel: rh_lkid 8b4e00b3
Sep 30 20:25:06 frii01 kernel: lockstate 0
Sep 30 20:25:06 frii01 kernel: nodeid 1
Sep 30 20:25:06 frii01 kernel: status 4294901758
Sep 30 20:25:06 frii01 kernel: lkid 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: process_lockqueue_reply id 8bbd02de
state 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:06 frii01 kernel: dlm: reply
Sep 30 20:25:06 frii01 kernel: rh_cmd 5
Sep 30 20:25:06 frii01 kernel: rh_lkid 8b0000ff
Sep 30 20:25:06 frii01 kernel: lockstate 0
Sep 30 20:25:06 frii01 kernel: nodeid 1
Sep 30 20:25:06 frii01 kernel: status 4294901758
Sep 30 20:25:06 frii01 kernel: lkid 0
Sep 30 20:25:06 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:06 frii01 kernel: dlm: reply
Sep 30 20:25:07 frii01 kernel: rh_cmd 5
Sep 30 20:25:07 frii01 kernel: rh_lkid 8c6303df
Sep 30 20:25:07 frii01 kernel: lockstate 0
Sep 30 20:25:07 frii01 kernel: nodeid 1
Sep 30 20:25:07 frii01 kernel: status 4294901758
Sep 30 20:25:07 frii01 kernel: lkid 0
Sep 30 20:25:07 frii01 kernel: dlm: capture: process_lockqueue_reply id 89c902a2
state 0
Sep 30 20:25:07 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:07 frii01 kernel: dlm: reply
Sep 30 20:25:07 frii01 kernel: rh_cmd 5
Sep 30 20:25:07 frii01 kernel: rh_lkid 8a6901d9
Sep 30 20:25:07 frii01 kernel: lockstate 0
Sep 30 20:25:07 frii01 kernel: nodeid 1
Sep 30 20:25:07 frii01 kernel: status 4294901758
Sep 30 20:25:07 frii01 kernel: lkid 0
Sep 30 20:25:07 frii01 kernel: dlm: capture: process_lockqueue_reply id 8c4c0063
state 0
Sep 30 20:25:07 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:07 frii01 kernel: dlm: reply
Sep 30 20:25:07 frii01 kernel: rh_cmd 5
Sep 30 20:25:07 frii01 kernel: rh_lkid 8b9b0199
Sep 30 20:25:07 frii01 kernel: lockstate 0
Sep 30 20:25:07 frii01 kernel: nodeid 1
Sep 30 20:25:07 frii01 kernel: status 4294901758
Sep 30 20:25:07 frii01 kernel: lkid 0
Sep 30 20:25:07 frii01 kernel: dlm: capture: process_lockqueue_reply id 8b600314
state 0
Sep 30 20:25:07 frii01 kernel: dlm: capture: reply from 2 no lock
Sep 30 20:25:07 frii01 kernel: dlm: reply
Sep 30 20:25:07 frii01 kernel: rh_cmd 5
Sep 30 20:25:07 frii01 kernel: rh_lkid 8ab30074
Sep 30 20:25:07 frii01 kernel: lockstate 0
Sep 30 20:25:07 frii01 kernel: nodeid 2
Sep 30 20:25:07 frii01 kernel: status 4294901758
Sep 30 20:25:07 frii01 kernel: lkid 0
Sep 30 20:25:07 frii01 kernel: dlm: midcomms: bad header version 34000045
Sep 30 20:25:07 frii01 kernel: dlm: midcomms: cmd=0, flags=41, length=64,
lkid=2226062912, lockspace=17435146
Sep 30 20:25:07 frii01 kernel: dlm: midcomms: base=000001005220d000,
offset=1720, len=2376, ret=1720, limit=00001000 newbuf=1
Sep 30 20:25:07 frii01 kernel: 45 00 00 34 00 29 40 00-40 06 af 84 0a 0a 0a 01
Sep 30 20:25:07 frii01 kernel: 0a 0a 0a 02 52 48 95 54-e0 d7 14 c2 50 84 4d c4
Sep 30 20:25:07 frii01 kernel: 80 10 11 33 43 33 00 00-01 01 08 0a 0e 00 96 18
Sep 30 20:25:07 frii01 kernel: 0f 03 cb a5
Sep 30 20:25:07 frii01 kernel: ff ff ff ff
Sep 30 20:25:07 frii01 kernel: 46 02 00 00
Sep 30 20:25:07 frii01 kernel: 00
Sep 30 20:25:07 frii01 last message repeated 3 times
Sep 30 20:25:07 frii01 kernel: dlm: lowcomms: addr=000001005220d000, base=0,
len=4096, iov_len=208, iov_base[0]=000001005220e000, read=208
Sep 30 20:25:50 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:26:35 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:27:05 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:27:50 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:27:51 frii01 clvmd: Cluster LVM daemon started - connected to CMAN
Sep 30 20:28:35 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:29:05 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:29:50 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:30:35 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:31:20 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:31:50 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:32:20 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:33:05 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:33:35 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:34:05 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:34:50 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:35:35 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:36:20 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:37:05 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:37:50 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:38:20 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:39:05 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:39:50 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:40:20 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:41:05 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:41:50 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:42:20 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:43:05 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:43:50 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:44:35 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:45:20 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
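
The hex dump that follows each "bad header version 34000045" message in the
Node 1 log can be re-read directly. The following is a minimal sketch, assuming
the dump lines are the start of the buffer that midcomms rejected. Reading the
first 16 bytes as little-endian fields reproduces every value the kernel logged
(version, cmd, flags, length, lkid, lockspace); reading the same bytes in
network byte order yields a well-formed IPv4/TCP header. The `<IBBHII` layout is
inferred from the logged values, not taken from the DLM source, and all function
names are illustrative. The last helper reinterprets the repeated
`status 4294901758` as a signed 32-bit value.

```python
import socket
import struct

# First three hex-dump lines from the Node 1 log (48 bytes), copied verbatim.
DUMP = (
    "45 00 00 34 00 29 40 00-40 06 af 84 0a 0a 0a 01 "
    "0a 0a 0a 02 52 48 95 54-e0 d7 14 c2 50 84 4d c4 "
    "80 10 11 33 43 33 00 00-01 01 08 0a 0e 00 96 18"
)


def parse_dump(text):
    """Turn 'aa bb-cc ...' into a bytes object."""
    return bytes.fromhex(text.replace("-", " "))


def as_logged_fields(raw):
    """Read the first 16 bytes as little-endian fields.

    ASSUMPTION: this layout is inferred from the values in the log line
    (cmd=0, flags=41, length=64, ...), not from the DLM source code.
    """
    version, cmd, flags, length, lkid, lockspace = struct.unpack(
        "<IBBHII", raw[:16]
    )
    return {
        "version": version, "cmd": cmd, "flags": flags,
        "length": length, "lkid": lkid, "lockspace": lockspace,
    }


def as_ipv4_tcp(raw):
    """Read the same bytes as an IPv4 header followed by a TCP header."""
    (ver_ihl, _tos, total_len, _ident, _frag,
     ttl, proto, _csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    ihl = (ver_ihl & 0x0F) * 4  # IPv4 header length in bytes
    sport, dport, _seq, _ack, off_flags, _window = struct.unpack(
        "!HHIIHH", raw[ihl:ihl + 16]
    )
    return {
        "ip_version": ver_ihl >> 4,
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,  # 6 is TCP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "sport": sport,
        "dport": dport,
        "tcp_header_len": (off_flags >> 12) * 4,
        "tcp_flags": off_flags & 0x01FF,  # 0x10 is ACK
    }


def as_signed32(value):
    """Reinterpret an unsigned 32-bit number as two's-complement signed."""
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    raw = parse_dump(DUMP)
    print(as_logged_fields(raw))
    print(as_ipv4_tcp(raw))
    print(as_signed32(4294901758))
```

Under these assumptions the rejected data decodes as a 52-byte TCP ACK with no
payload (20-byte IP header plus 32-byte TCP header) from 10.10.10.1 port 21064,
the port DLM conventionally uses, to 10.10.10.2, and the status field reads as
-65538 (-0x10002). This describes the bytes only; the report does not establish
why they reached the DLM message parser.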



Node 2:
Sep 30 20:26:26 frii02 clurgmgrd[17122]: <err> #49: Failed getting status for RG
172.16.107.225 
Sep 30 20:27:56 frii02 clurgmgrd[17122]: <err> #49: Failed getting status for RG
172.16.106.225 
Sep 30 20:29:26 frii02 clurgmgrd[17122]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:30:11 frii02 clurgmgrd[17122]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:30:56 frii02 clurgmgrd[17122]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:31:41 frii02 clurgmgrd[17122]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:32:26 frii02 clurgmgrd[17122]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:33:11 frii02 clurgmgrd[17122]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:33:56 frii02 clurgmgrd[17122]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:34:41 frii02 clurgmgrd[17122]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:35:26 frii02 clurgmgrd[17122]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:35:56 frii02 clurgmgrd[17122]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:36:41 frii02 clurgmgrd[17122]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:37:26 frii02 clurgmgrd[17122]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:38:11 frii02 clurgmgrd[17122]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:38:56 frii02 clurgmgrd[17122]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:40:26 frii02 clurgmgrd[17122]: <err> #51: Failed getting status for RG
172.16.107.225 
Sep 30 20:41:56 frii02 clurgmgrd[17122]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:42:41 frii02 clurgmgrd[17122]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:43:26 frii02 clurgmgrd[17122]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:44:11 frii02 clurgmgrd[17122]: <err> #50: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:44:56 frii02 clurgmgrd[17122]: <err> #48: Unable to obtain cluster
lock: Connection timed out 
Sep 30 20:45:41 frii02 clurgmgrd[17122]: <err> #50: Unable to obtain cluster
lock: Connection timed out
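
To compare the two nodes, it can help to count the clurgmgrd errors per host and
error number. The following is a rough sketch, not part of the original report:
it assumes the lines are pasted as above, where long syslog lines are wrapped
and the continuation does not start with a timestamp. The function names and the
regular expressions are illustrative.

```python
import re
from collections import Counter

# A syslog line starts with a timestamp such as "Sep 30 20:25:50 ".
START_RE = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} ")

# Captures the host name and the numeric code of a clurgmgrd <err> message.
ERR_RE = re.compile(
    r"^\w{3}\s+\d+ [\d:]{8} (\S+) clurgmgrd\[\d+\]: <err> #(\d+): "
)


def unwrap(lines):
    """Rejoin wrapped lines: anything without a leading timestamp is
    treated as a continuation of the previous line."""
    out = []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue
        if START_RE.match(line) or not out:
            out.append(line)
        else:
            out[-1] += " " + line
    return out


def tally_errors(lines):
    """Count clurgmgrd errors keyed by (host, error number)."""
    counts = Counter()
    for line in unwrap(lines):
        match = ERR_RE.match(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


SAMPLE = [
    "Sep 30 20:24:54 frii01 kernel: dlm: capture: process_lockqueue_reply id 8b4903b5",
    "state 0",
    "Sep 30 20:25:50 frii01 clurgmgrd[15603]: <err> #48: Unable to obtain cluster",
    "lock: Connection timed out ",
    "Sep 30 20:26:35 frii01 clurgmgrd[15603]: <err> #50: Unable to obtain cluster",
    "lock: Connection timed out ",
    "Sep 30 20:26:26 frii02 clurgmgrd[17122]: <err> #49: Failed getting status for RG",
    "172.16.107.225 ",
]

if __name__ == "__main__":
    for (host, code), count in sorted(tally_errors(SAMPLE).items()):
        print(host, code, count)
```

Kernel lines are ignored by the tally; only lines that match the clurgmgrd
`<err> #NN:` pattern are counted.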

Comment 2 Lon Hohberger 2006-10-05 20:07:06 UTC
*** Bug 200841 has been marked as a duplicate of this bug. ***

Comment 3 Lon Hohberger 2006-10-05 20:13:58 UTC
There's a possibility that this is fixed in U4.  If it isn't, we'll need to work
to get a reliable test-case which causes it.

Comment 4 Lon Hohberger 2006-10-05 20:34:11 UTC
Setting to component 'dlm', but waiting for more information.

Comment 5 Lon Hohberger 2007-05-22 18:22:57 UTC
After comparing notes: the symptoms here come from a memory leak in the DLM,
which is in turn triggered by rgmanager.

This leak has been fixed in Red Hat Cluster Suite 4 Update 5.  Please upgrade
magma, magma-plugins, dlm, dlm-kernel, and rgmanager if you continue to see this
problem on a prior release.

Comment 6 Lon Hohberger 2007-07-17 17:08:41 UTC
Note: per bug #247766, there is a chance that this was caused not specifically
by the rgmanager lock leak in RHCS 4.4, but more generally by a large number of
outstanding locks.