Bug 131006 - RFE: Make a clumembd option to restrict cluster broadcast heartbeating to private NICs/loopback
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: clumanager   
Version: 3
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
Depends On:
Blocks: 131576
Reported: 2004-08-26 16:21 UTC by Lon Hohberger
Modified: 2009-04-16 20:15 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2004-11-09 16:13:53 UTC

Attachments
Patch which implements behavior (4.95 KB, patch)
2004-08-27 16:47 UTC, Lon Hohberger

External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2004:491
Priority: high
Status: SHIPPED_LIVE
Summary: Updated clumanager and redhat-config-cluster packages
Last Updated: 2004-12-20 05:00:00 UTC

Description Lon Hohberger 2004-08-26 16:21:59 UTC
Description of problem:

When using a private crossover cable for heartbeating with clumanager
1.2, you have a choice between broadcast and multicast modes.  In
multicast mode (the recommended mode), the cluster software sends
heartbeats over the primary NIC (defined as the NIC which has an IP
address matching the cluster member's name).  The primary NIC can be
a bonded Ethernet device, thus providing higher availability.

The downside to multicast is that it requires a switch or router which
understands multicast to be in the mix.

So, the alternative method of heartbeating is broadcast - which works
by bombarding every physical interface with heartbeat packets.  This
includes loopback devices, and all other network devices which have a
non-virtual IP configured (e.g. eth0, but not eth0:0).

This is partly necessary because a node does not necessarily "see" its
own broadcast traffic, so 'lo' is used so that a node can declare
itself alive.
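
The selection rule described above can be sketched roughly as follows. This is only an illustration of the rule as stated in this report (loopback plus every interface with a non-virtual IP), not clumembd's actual code; the `Interface` type and the function name are hypothetical, and treating a ':' in the device name as the sign of a virtual IP is an assumption based on the eth0 / eth0:0 example.

```python
from typing import List, NamedTuple, Optional


class Interface(NamedTuple):
    """Hypothetical description of one network device."""
    name: str                # e.g. "lo", "eth0", "eth0:0"
    address: Optional[str]   # configured IPv4 address, or None if unset


def broadcast_heartbeat_interfaces(interfaces: List[Interface]) -> List[str]:
    """Return the device names that would carry heartbeats in plain
    broadcast mode: every device with an IP, except alias devices."""
    selected = []
    for iface in interfaces:
        if iface.address is None:
            continue  # no IP configured, nothing to bind to
        if ":" in iface.name:
            continue  # alias such as eth0:0 (assumed to hold a virtual IP)
        selected.append(iface.name)
    return selected


if __name__ == "__main__":
    devices = [
        Interface("lo", "127.0.0.1"),
        Interface("eth0", "10.0.0.1"),
        Interface("eth0:0", "10.0.0.50"),
        Interface("eth1", "192.168.1.1"),
    ]
    print(broadcast_heartbeat_interfaces(devices))  # ['lo', 'eth0', 'eth1']
```

Note that eth1 is selected even if it is a public-facing interface, which is the behavior this request aims to make optional.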

This works well, but is really brute-force: ALL NICs send the
broadcast packets.  This means that if a cluster has a public network
and a private network (the latter used only for cluster
communications), and the private network doesn't understand
multicast, the public network gets unnecessary broadcast traffic.

The request is this:

Provide an option to only use 'lo' and the primary NIC (defined
previously) for heartbeating.
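
A minimal sketch of the requested rule, assuming the primary NIC is found by comparing each device's IP with the address the member's name resolves to. The function name and the dictionary layout are hypothetical and are not taken from the clumanager sources; in practice `member_ip` might come from something like `socket.gethostbyname(member_name)`, but it is passed in here so the example stays self-contained.

```python
from typing import Dict, List


def primary_only_interfaces(addresses: Dict[str, str], member_ip: str) -> List[str]:
    """Return only 'lo' and the primary NIC.

    addresses: hypothetical map of device name -> configured IPv4 address.
    member_ip: the address the cluster member's name resolves to.
    """
    selected = []
    for name, ip in addresses.items():
        if ":" in name:
            continue  # skip alias devices such as eth0:0 (assumption)
        if name == "lo" or ip == member_ip:
            selected.append(name)
    return selected


if __name__ == "__main__":
    devices = {
        "lo": "127.0.0.1",
        "eth0": "10.0.0.1",      # private NIC; member name resolves here
        "eth1": "192.168.1.1",   # public NIC
    }
    print(primary_only_interfaces(devices, "10.0.0.1"))  # ['lo', 'eth0']
```

With this rule the public NIC (eth1 in the example) no longer receives heartbeat broadcasts.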

Comment 1 Lon Hohberger 2004-08-27 16:47:38 UTC
Created attachment 103170
Patch which implements behavior

To turn on "primary-nic + lo only":

cludb -p clumembd%broadcast_primary_only yes

To turn it off:

cludb -r clumembd%broadcast_primary_only

Comment 2 Lon Hohberger 2004-09-02 15:57:27 UTC
1.2.18pre1 patch (unsupported; test only, etc.)


This includes the fix for this bug and a few others.

Comment 3 Lon Hohberger 2004-09-14 14:22:06 UTC
New behavior with feature:

Member binds to only the loopback address and the 'primary NIC',
regardless of how many physical NICs exist.

Comment 4 Derek Anderson 2004-11-09 16:13:53 UTC
Tested this by setting up two Ethernet cards, eth0 (primary) and eth1
(secondary).  With the default setting (broadcast_primary_only off),
verified that we were sending broadcast packets on eth1:

[root@link-01 root]# tcpdump -i eth1
tcpdump: listening on eth1
10:13:18.927517 > udp 16 (DF)
10:13:19.677554 > udp 16 (DF)

Then shut down the cluster and issued `cludb -p
clumembd%broadcast_primary_only yes`, wrote to shared config storage,
and restarted the cluster.  Ran tcpdump on eth1 again and confirmed
that this interface was not issuing broadcast packets.

Version clumanager-1.2.22-2

This is ready for RHEL3-U4.
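
A check like the one above could be scripted by counting UDP lines in captured tcpdump output. The sketch below is illustrative only: the function name is hypothetical, the regular expression assumes the old-style line format shown in this comment (timestamp first, then "udp" somewhere on the line), and it counts any UDP packet, not only heartbeats.

```python
import re
from typing import Iterable

# Assumed line shape, taken from the capture above: "10:13:18.927517 > udp 16 (DF)"
_UDP_LINE = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d+ .*\budp\b")


def count_udp_packets(lines: Iterable[str]) -> int:
    """Count captured lines that look like UDP packets."""
    return sum(1 for line in lines if _UDP_LINE.search(line))


if __name__ == "__main__":
    capture = [
        "tcpdump: listening on eth1",
        "10:13:18.927517 > udp 16 (DF)",
        "10:13:19.677554 > udp 16 (DF)",
    ]
    print(count_udp_packets(capture))  # 2
```

A count of zero on eth1 after enabling broadcast_primary_only would match the result reported here, provided no other UDP traffic crosses that interface during the capture.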

Comment 6 John Flanagan 2004-12-21 03:40:14 UTC
An advisory has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

