Bug 171211 - dlm: cannot start lowcomms -98
Summary: dlm: cannot start lowcomms -98
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: dlm
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
URL:
Whiteboard:
Duplicates: 171622
Depends On:
Blocks:
Reported: 2005-10-19 14:48 UTC by Corey Marthaler
Modified: 2009-04-16 20:00 UTC

Fixed In Version: RHBA-2006-0167
Clone Of:
Environment:
Last Closed: 2006-01-06 20:19:03 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2006:0167 | 0 | normal | SHIPPED_LIVE | dlm-kernel bug fix update | 2006-01-06 05:00:00 UTC

Description Corey Marthaler 2005-10-19 14:48:15 UTC
Description of problem:
[root@link-02 ~]# clvmd
clvmd could not connect to cluster manager
Consult syslog for more information

dlm: Can't bind to port 21064
dlm: cannot start lowcomms -98
Oct 18 21:17:00 link-08 kernel: dlm: Can't bind to port 21064
Oct 18 21:17:00 link-08 kernel: dlm: cannot start lowcomms -98
Oct 18 21:17:00 link-08 clvmd: Unable to create lockspace for CLVM: Address already in use
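
For reference, the -98 in the kernel message is a negated errno value. On Linux, errno 98 is EADDRINUSE, which matches the "Address already in use" text that clvmd logs. A minimal Python sketch for decoding such return codes follows; the helper name decode_kernel_rc is hypothetical and is not part of any DLM or clvmd tooling.

```python
import errno
import os


def decode_kernel_rc(rc):
    """Map a kernel-style return code (usually negative) to (name, message).

    Hypothetical helper for illustration only. The numeric-to-name mapping
    comes from the platform running this script, so run it on Linux to
    decode Linux kernel messages.
    """
    code = -rc if rc < 0 else rc
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # -98 is the value from "dlm: cannot start lowcomms -98".
    print(*decode_kernel_rc(-98))
```

Run on a Linux host, this should report the same error that appears in the clvmd log line above.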


From Patrick's email:
Two interesting facts:

- netcat (nc) was quite capable of binding to port 21064 even when DLM couldn't.
- unloading the DLM module and reloading fixed the problem.

Which leads me to the conclusion that it's some saved state in the DLM that's
getting in the way, possibly a cached nodeid or IP address.

Had the IP address of the node changed between closing down the previous cluster
instance and restarting it? (Looking at the code, that would certainly cause an
error, though I'm not sure it would be that one!)

I think this problem should be bugged and maybe we try to figure out just what
caused it. Certainly there is some state in the DLM that should be flushed when
it gets deactivated that might be a contributory cause.
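
The netcat check mentioned in the email can be repeated from userspace with a short script. This is only a sketch: it assumes the port is a TCP port, the function name try_bind is hypothetical, and a userspace bind does not exercise the DLM's own socket setup in lowcomms.c.

```python
import errno
import socket

DLM_PORT = 21064  # port number taken from the kernel log messages


def try_bind(port, host="0.0.0.0"):
    """Try to bind a TCP socket to (host, port).

    Returns (True, None) on success, or (False, errno_name) on failure.
    The socket is closed again before returning, so the port is released.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            return False, errno.errorcode.get(exc.errno, str(exc.errno))
        return True, None


if __name__ == "__main__":
    ok, err = try_bind(DLM_PORT)
    print("bind ok" if ok else f"bind failed: {err}")
```

If this bind succeeds while the kernel still logs -98, no other process is holding the port, which would be consistent with the theory that the conflict comes from state kept inside the DLM module.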


Version-Release number of selected component (if applicable):
[root@link-02 ~]# clvmd -V
Cluster LVM daemon version: 2.01.14 (2005-08-04)
Protocol version:           0.2.1

Comment 1 Christine Caulfield 2005-10-24 15:29:54 UTC
*** Bug 171622 has been marked as a duplicate of this bug. ***

Comment 2 Christine Caulfield 2005-10-24 15:33:12 UTC
For a bit more info see:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=171622

A fix is in the STABLE branch of CVS.

Comment 3 Christine Caulfield 2005-11-01 17:07:39 UTC
Fixed on RHEL4 branch for U3

Checking in lowcomms.c;
/cvs/cluster/cluster/dlm-kernel/src/lowcomms.c,v  <--  lowcomms.c
new revision: 1.22.2.11; previous revision: 1.22.2.10
done


Comment 5 Red Hat Bugzilla 2006-01-06 20:19:05 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0167.html


