Bug 144488 - Disconnection of user-link not detected
Summary: Disconnection of user-link not detected
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: redhat-config-cluster
Version: 3
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
Depends On:
Reported: 2005-01-07 17:10 UTC by Lon Hohberger
Modified: 2009-04-16 20:15 UTC (History)

Clone Of:
Last Closed: 2005-05-25 16:41:02 UTC

Attachments
Patch fixing behavior. (3.44 KB, patch)
2005-01-07 17:12 UTC, Lon Hohberger

External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2005:047
Priority: high
Status: SHIPPED_LIVE
Summary: clumanager and redhat-config-cluster bug fix update
Last Updated: 2005-05-25 04:00:00 UTC

Description Lon Hohberger 2005-01-07 17:10:18 UTC
Description of problem:

If Cluster Manager is configured to use a private LAN for heartbeating
and a public LAN for user services, a disconnected link on the public
LAN is not detected and no failover is performed.

Comment 1 Lon Hohberger 2005-01-07 17:12:04 UTC
Created attachment 109478
Patch fixing behavior.

This patch includes a backport of the Ethernet link detection from the
LCP (http://sources.redhat.com/cluster) resource group manager.

Comment 2 Lon Hohberger 2005-01-07 17:13:24 UTC
This patch *does not work* if the interface is a bonded interface;
more work is necessary in order to do this properly.

The simplest solution is to check the link of each slave bonded to a
master interface and return a failure if all links are down.  This
should not be difficult.
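
A minimal sketch of the check described above, written in Python for
illustration only (the actual patch is not shown in this report). It
assumes the kernel exposes the slave list in
/sys/class/net/<bond>/bonding/slaves and per-interface link state in
/sys/class/net/<iface>/carrier; older kernels may not provide these
files. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: a sysfs layout as found on recent Linux kernels.
SYSFS_NET = "/sys/class/net"


def slave_has_link(iface, sysfs_root=SYSFS_NET):
    """Return True if the kernel reports carrier ("1") on iface."""
    try:
        state = (Path(sysfs_root) / iface / "carrier").read_text().strip()
    except OSError:
        # Missing interface or unreadable state: treat as no link.
        return False
    return state == "1"


def bond_slaves(bond, sysfs_root=SYSFS_NET):
    """Return the names of the slaves bonded to the master interface."""
    try:
        text = (Path(sysfs_root) / bond / "bonding" / "slaves").read_text()
    except OSError:
        return []
    return text.split()


def bond_has_link(bond, sysfs_root=SYSFS_NET):
    """Fail only if every slave is down (or there are no slaves)."""
    return any(slave_has_link(s, sysfs_root) for s in bond_slaves(bond, sysfs_root))
```

Calling bond_has_link("bond0") returns False only when no slave of
bond0 reports a link, which is the failure condition proposed above.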

Comment 3 Lon Hohberger 2005-01-07 21:02:01 UTC
Patch which fixes this and other bugzillas:


Packages which fix this and several other current outstanding bugzillas:


Comment 4 Lon Hohberger 2005-01-07 21:14:03 UTC
Note: The above are not Red Hat errata; they are test patches/packages.

Comment 5 Lon Hohberger 2005-01-11 18:35:24 UTC
The patch calls 'ip' without a path rather than calling /sbin/ip.
Because 'ip' is also defined in svclib_ip, that definition is invoked
instead of the system command.

This conflict produces the following log messages:

Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <info> service info: Starting IP address
Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <err> service error: Usage: ip [start, stop, status] serviceID
Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <err> service error: Error determining status of bond0
Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <err> service error: Error finding slaves of bond0
Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <err> service error: Network link not detected on bond0
Jan 11 13:58:34 node1 clusvcmgrd: [10366]: <err> service error: Cannot start IP address; retrying ...

Comment 6 Lon Hohberger 2005-01-13 15:23:57 UTC
1.2.24-0.1 test fixes the above conflict and a variable name conflict
which caused IP addresses to be assigned to slave interfaces instead
of bond0:0, bond0:1, etc.  The variable name conflict was an artifact
of the backport.

Comment 7 Lon Hohberger 2005-02-11 04:47:01 UTC
Note -- this will have to be a configuration option so as not to break
people depending on the old behavior!

Comment 10 Lon Hohberger 2005-02-23 15:07:52 UTC
GUI support in.

Comment 14 Jay Turner 2005-05-25 16:41:02 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

