Bug 144488 - Disconnection of user-link not detected
Summary: Disconnection of user-link not detected
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: redhat-config-cluster
Version: 3
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2005-01-07 17:10 UTC by Lon Hohberger
Modified: 2009-04-16 20:15 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-05-25 16:41:02 UTC
Embargoed:


Attachments
Patch fixing behavior. (3.44 KB, patch)
2005-01-07 17:12 UTC, Lon Hohberger


Links
System: Red Hat Product Errata
ID: RHBA-2005:047
Private: 0
Priority: high
Status: SHIPPED_LIVE
Summary: clumanager and redhat-config-cluster bug fix update
Last Updated: 2005-05-25 04:00:00 UTC

Description Lon Hohberger 2005-01-07 17:10:18 UTC
Description of problem:

If Cluster Manager is configured to use a private LAN for heartbeating
and a public LAN for user services, a disconnected link on the
public LAN is not detected and no failover is performed.

Comment 1 Lon Hohberger 2005-01-07 17:12:04 UTC
Created attachment 109478
Patch fixing behavior.

This patch includes a backport from the LCP (http://sources.redhat.com/cluster)
resource group manager to detect the ethernet link.

Comment 2 Lon Hohberger 2005-01-07 17:13:24 UTC
This patch *does not work* if the interface is a bonded interface;
more work is necessary in order to do this properly.

The simplest solution is to check the link of each slave bonded to a
master interface and return a failure if all links are down.  This
should not be difficult.
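
The per-slave check described above can be sketched in Python. This is a minimal illustration and not the patch itself: it assumes a modern Linux kernel that exposes /sys/class/net/<iface>/carrier and /sys/class/net/<master>/bonding/slaves, and the function names (link_is_up, bond_slaves, interface_has_link) are hypothetical.

```python
from pathlib import Path

SYSFS_NET = "/sys/class/net"  # assumption: modern Linux sysfs layout


def link_is_up(iface, sysfs_root=SYSFS_NET):
    """Return True if the kernel reports carrier (link) on iface."""
    try:
        carrier = (Path(sysfs_root) / iface / "carrier").read_text().strip()
    except OSError:
        # A missing interface, or one whose carrier file cannot be read,
        # is treated as having no link.
        return False
    return carrier == "1"


def bond_slaves(master, sysfs_root=SYSFS_NET):
    """Return the slave interface names of a bonding master, or []."""
    try:
        slaves_file = Path(sysfs_root) / master / "bonding" / "slaves"
        return slaves_file.read_text().split()
    except OSError:
        return []  # no bonding directory: not a bonding master


def interface_has_link(iface, sysfs_root=SYSFS_NET):
    """Link check that fails for a bond only when all slave links are down."""
    slaves = bond_slaves(iface, sysfs_root)
    if not slaves:
        # Plain interface (or bond without slaves): check it directly.
        return link_is_up(iface, sysfs_root)
    return any(link_is_up(slave, sysfs_root) for slave in slaves)
```

The sysfs_root parameter exists only so the check can be pointed at a test directory; on a live system, interface_has_link("bond0") would read the real sysfs tree.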

Comment 3 Lon Hohberger 2005-01-07 21:02:01 UTC
Patch which fixes this and other bugzillas:

http://people.redhat.com/lhh/clumanager-1.2.23-0.4lhh.patch

Packages which fix this and several other current outstanding bugzillas:

http://people.redhat.com/lhh/clumanager-1.2.23-0.4lhh.i386.rpm
http://people.redhat.com/lhh/clumanager-1.2.23-0.4lhh.src.rpm


Comment 4 Lon Hohberger 2005-01-07 21:14:03 UTC
Note: The above are not Red Hat errata; they are test patches/packages.

Comment 5 Lon Hohberger 2005-01-11 18:35:24 UTC
The patch calls 'ip' by its bare name, which is also defined in svclib_ip,
rather than calling /sbin/ip.

This conflict causes the log messages below:

Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <info> service info: Starting IP address 10.1.1.1
Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <err> service error: Usage: ip [start, stop, status] serviceID
Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <err> service error: Error determining status of bond0
Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <err> service error: Error finding slaves of bond0
Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <err> service error: Network link not detected on bond0
Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <err> service error: Cannot start IP address 10.1.1.1; retrying ...
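
The kind of name conflict behind the "Usage: ip ..." line in the log above can be reproduced with a small Python driver around bash. This is only an illustration under stated assumptions: the shell function below is a hypothetical stand-in for whatever svclib_ip defines (only the usage string is taken from the log), and bash must be available on the PATH.

```python
import subprocess

# Hypothetical stand-in for the conflict: a shell function named "ip"
# that prints the usage string seen in the log.
SHADOW_DEMO = r'''
ip() { echo "Usage: ip [start, stop, status] serviceID"; }
# With the function defined, a bare "ip" no longer resolves to the binary.
type -t ip
# This call runs the function above, not /sbin/ip.
ip link show bond0
'''


def run_shadow_demo():
    """Run the demo script in bash and return its output lines."""
    result = subprocess.run(
        ["bash", "-c", SHADOW_DEMO],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_shadow_demo():
        print(line)
```

In bash, invoking the binary by its full path (/sbin/ip) or through the `command` builtin bypasses a function of the same name.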



Comment 6 Lon Hohberger 2005-01-13 15:23:57 UTC
The 1.2.24-0.1 test build fixes the above conflict, as well as a variable
name conflict that caused IP addresses to be assigned to slave interfaces
instead of bond0:0, bond0:1, etc.  This was an artifact of the backport.

Patch:

http://people.redhat.com/lhh/clumanager-1.2.24.patch

Packages:

http://people.redhat.com/lhh/clumanager-1.2.24-0.1.i386.rpm
http://people.redhat.com/lhh/clumanager-1.2.24-0.1.src.rpm

Comment 7 Lon Hohberger 2005-02-11 04:47:01 UTC
Note -- this will have to be a configuration option so as not to break
people depending on the old behavior!

Comment 10 Lon Hohberger 2005-02-23 15:07:52 UTC
GUI support in.

Comment 14 Jay Turner 2005-05-25 16:41:02 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-047.html


