Bug 171211

Summary: dlm: cannot start lowcomms -98
Product: [Retired] Red Hat Cluster Suite
Reporter: Corey Marthaler <cmarthal>
Component: dlm
Assignee: Christine Caulfield <ccaulfie>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: medium
Version: 4
CC: ccaulfie, cluster-maint
Hardware: All
OS: Linux
Fixed In Version: RHBA-2006-0167
Doc Type: Bug Fix
Last Closed: 2006-01-06 20:19:03 UTC

Description Corey Marthaler 2005-10-19 14:48:15 UTC
Description of problem:
[root@link-02 ~]# clvmd
clvmd could not connect to cluster manager
Consult syslog for more information

dlm: Can't bind to port 21064
dlm: cannot start lowcomms -98
Oct 18 21:17:00 link-08 kernel: dlm: Can't bind to port 21064
Oct 18 21:17:00 link-08 kernel: dlm: cannot start lowcomms -98
Oct 18 21:17:00 link-08 clvmd: Unable to create lockspace for CLVM: Address already in use
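
The -98 in the kernel message is a negated errno value. On x86 Linux, errno 98 is EADDRINUSE, which matches the "Address already in use" text that clvmd logs. The following is a minimal sketch for decoding such a code; `decode_kernel_rc` is a hypothetical helper name used only for illustration and is not part of the DLM or clvmd.

```python
import errno
import os


def decode_kernel_rc(rc):
    """Map a kernel-style return code to (symbolic name, message).

    Hypothetical helper for illustration. The lookup uses the errno table
    of the platform this script runs on, so run it on the affected Linux
    host; errno numbering differs between operating systems.
    """
    # Kernel messages print errors as negative numbers; errno tables use
    # the positive value.
    code = -rc if rc < 0 else rc
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # -98 is the value from "dlm: cannot start lowcomms -98".
    print(decode_kernel_rc(-98))
```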


From Patrick's email:
Two interesting facts:

- netcat (nc) was quite capable of binding to port 21064 even when DLM couldn't.
- unloading the DLM module and reloading fixed the problem.

Which leads me to the conclusion that it's some saved state in the DLM that's
getting in the way, possibly a cached nodeid or IP address.

Had the IP address of the node changed between closing down the previous cluster
instance and restarting it? (looking at the code that would certainly cause an
error, though I'm not sure it would be that one!).

I think this problem should be filed as a bug, and maybe we can try to figure
out just what caused it. Certainly there is some state in the DLM that should be
flushed when it gets deactivated, and that might be a contributory cause.
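
The netcat observation in the email can be reproduced from user space without nc. The sketch below assumes a TCP/IPv4 socket and simply tries to bind the port named in the kernel message; `try_bind` and `DLM_PORT` are hypothetical names chosen for this example. A successful bind here, while the DLM still reports -98, would point at state inside the kernel module rather than at another process holding the port.

```python
import errno
import socket

DLM_PORT = 21064  # port from the "Can't bind to port 21064" message


def try_bind(port, host=""):
    """Try to bind a TCP socket to (host, port).

    Returns None on success, or the errno value if bind() fails.
    The socket is always closed again before returning.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


if __name__ == "__main__":
    result = try_bind(DLM_PORT)
    if result is None:
        print(f"port {DLM_PORT}: bind succeeded from user space")
    elif result == errno.EADDRINUSE:
        print(f"port {DLM_PORT}: address already in use")
    else:
        print(f"port {DLM_PORT}: bind failed with errno {result}")
```

The probe deliberately does not set SO_REUSEADDR, so a port that is held by another listening socket is reported as in use.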


Version-Release number of selected component (if applicable):
[root@link-02 ~]# clvmd -V
Cluster LVM daemon version: 2.01.14 (2005-08-04)
Protocol version:           0.2.1

Comment 1 Christine Caulfield 2005-10-24 15:29:54 UTC
*** Bug 171622 has been marked as a duplicate of this bug. ***

Comment 2 Christine Caulfield 2005-10-24 15:33:12 UTC
For a bit more info see:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=171622

A fix is in the STABLE branch of CVS.

Comment 3 Christine Caulfield 2005-11-01 17:07:39 UTC
Fixed on RHEL4 branch for U3

Checking in lowcomms.c;
/cvs/cluster/cluster/dlm-kernel/src/lowcomms.c,v  <--  lowcomms.c
new revision: 1.22.2.11; previous revision: 1.22.2.10
done


Comment 5 Red Hat Bugzilla 2006-01-06 20:19:05 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0167.html