Bug 159740

Summary: cluster heartbeat interface not used
Product: [Retired] Red Hat Cluster Suite
Reporter: Luis Alexandre Fontes <luis_alexandre_fontes>
Component: clumanager
Assignee: Lon Hohberger <lhh>
Status: CLOSED NOTABUG
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: medium
Version: 3
CC: cluster-maint
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2005-06-08 19:25:12 UTC

Description Luis Alexandre Fontes 2005-06-07 17:39:15 UTC
Scenario:
=========
We have 2 machines, each one with the following configuration:
- RHEL 3.0 AS u3;
- clumanager 1.2.22-2;
- 2 NICS (Intel Gigabit): eth0 for public and eth1 for private network;
- 1 HBA (QLogic 2200) attached to EMC storage;

this is the physical model:

                            +-------------+
                            |             |
                            |   router    |
                            | 172.27.2.1  |
                            |             |
                            +------+------+
                                   |
        eth0 +---------------------+--------------------+ eth0
 172.27.2.35 |                                          | 172.27.2.36
             |        (virtual ip: 172.27.2.100)        |
             |                                          |
     +-------+------+ eth1                      +-------+------+
     |              | 10.0.0.1                  |              |
     | linpro035ias +---------heartbeat---------+ linpro036ias |
     |              |                  10.0.0.2 |              |
     +-------+------+                      eth1 +-------+------+
         hba |                                          | hba
             +---------------------+--------------------+
                                   |
                            +------+------+
                            |             |
                            | EMC storage |
                            |  /dev/sdc3  |
                            |             |
                            +-------------+

- the member names are linpro035ias (which points to 172.27.2.35) and 
linpro036ias (which points to 172.27.2.36);
- the virtual ip (172.27.2.100) is associated with the service in the cluster;
- the heartbeat is broadcast;
- the tiebreaker is network (ip 172.27.2.1, the router/gateway);
- there's a crossover cable for private networking (eth1);

The problem:
============

If clumembd%broadcast_primary_only is not defined (or set to no, the default 
setting), heartbeat packets are sent over eth0 (public) and eth1 (private),
but if I unplug the crossover cable, the cluster continues as if nothing had 
happened.

If clumembd%broadcast_primary_only is set to yes, heartbeat packets are sent 
just over eth0.

So, I conclude that the private network between the two nodes is not being used.
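
The two cases above can be sketched as a toy Python model. It is only an illustration: the function names are hypothetical, and the rule that a peer counts as alive while a heartbeat arrives on at least one interface is an assumption that fits the observations above. It is not taken from clumembd's source.

```python
# Toy model only: this is NOT clumembd's code. It assumes a peer counts as
# alive as long as a heartbeat arrives on at least one usable interface.

def heartbeat_interfaces(broadcast_primary_only):
    """Interfaces carrying heartbeats, as observed in this report."""
    return ["eth0"] if broadcast_primary_only else ["eth0", "eth1"]


def peer_seen_alive(broadcast_primary_only, links_up):
    """True if at least one heartbeat-carrying interface still has link."""
    return any(links_up.get(nic, False)
               for nic in heartbeat_interfaces(broadcast_primary_only))


if __name__ == "__main__":
    # Crossover cable pulled: eth1 has no link, eth0 is untouched.
    crossover_unplugged = {"eth0": True, "eth1": False}
    for primary_only in (False, True):
        print(primary_only, peer_seen_alive(primary_only, crossover_unplugged))
```

Under this assumption the peer is still seen as alive with either setting once the crossover cable is pulled, because eth0 keeps carrying heartbeats. That would explain why nothing shows up in the log.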


Steps to reproduce:
===================

01. set eth0 on first machine to 172.27.2.35; on the second machine to 
172.27.2.36 (both connected to a router/gateway);
02. set eth1 on first machine to 10.0.0.1; on the second machine to 10.0.0.2 
(connected using a crossover cable);
03. set a cluster service to use httpd (/etc/init.d/httpd);
04. set a service ip address to 172.27.2.100;
05. set a device (/dev/sdc3, attached to the storage), to mount /u02 as ext3
06. /u02 must contain the www directory (mv /var/www /u02);
07. /etc/httpd/conf/httpd.conf must be edited to replace /var/www with /u02/www;
08. in the Cluster Daemon Properties, enable Broadcast Heartbeating and Network 
Tiebreaker (172.27.2.1 -> the router/gateway);
09. edit /etc/syslog.conf and append the following line:
    local4.*  /var/log/cluster
    and restart the syslog service;
10. start the rawdevices service;
11. start the clumanager service;


What happened after you performed the steps above? 
==================================================
1. if the crossover cable (which should be used for heartbeating) is unplugged, 
nothing happens; this can be monitored using 'tail -f /var/log/cluster';
2. if the crossover cable is plugged in again and clumembd%broadcast_primary_only 
is set to yes
   ( cludb -put clumembd%broadcast_primary_only yes ), the heartbeat packets go 
through eth0 only (enhancement requested by Lon Hohberger), so eth1 is not used.


What should have happened instead?
==================================

Lon Hohberger said in
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=144838 :
"Cluster Manager requires that all members coexist on the same fully connected 
subnet and that the link(s) used for cluster communication are the same link(s) 
used to monitor the tiebreaker IP address."

So, if the link used for cluster communication must be the same link used to 
monitor the tiebreaker IP address:

a) packets serving the httpd service go through eth0;
b) packets used to monitor the tiebreaker IP address go through eth0;
c) heartbeat packets go through eth0;
d) eth1 (which should be used for heartbeating) is useless (and never used with 
clumembd%broadcast_primary_only set to yes);

So, setting "clumembd%broadcast_primary_only" to yes makes the service 
packets (httpd), heartbeating packets and tiebreaker monitoring packets all go 
through the same physical interface (i.e. eth0).
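
One possible reading of the quoted requirement can be sketched with another toy Python model. Everything in it is an assumption made for illustration: the decision rule (a node keeps running if it hears its peer or, failing that, can reach the tiebreaker IP), the function names, and the link tables. It is not clumanager's actual algorithm.

```python
# Toy model, NOT clumanager's algorithm. Assumed rule: a node keeps running
# services if it still hears its peer, or failing that, if it can reach the
# tiebreaker IP through the given interface.

NODES = ("linpro035ias", "linpro036ias")


def decisions(heartbeat_nic, tiebreaker_nic, links):
    """Return (per-node decision, heartbeat_ok).

    links maps node name -> {nic name: True if that NIC has connectivity}.
    A heartbeat only works if the heartbeat NIC is up on both nodes.
    """
    heartbeat_ok = all(links[node][heartbeat_nic] for node in NODES)
    result = {}
    for node in NODES:
        keeps_running = heartbeat_ok or links[node][tiebreaker_nic]
        result[node] = "run" if keeps_running else "stop"
    return result, heartbeat_ok


def split_brain(heartbeat_nic, tiebreaker_nic, links):
    """True if both nodes keep running while unable to hear each other."""
    result, heartbeat_ok = decisions(heartbeat_nic, tiebreaker_nic, links)
    return (not heartbeat_ok) and all(d == "run" for d in result.values())


if __name__ == "__main__":
    crossover_unplugged = {node: {"eth0": True, "eth1": False} for node in NODES}
    one_public_link_down = {
        "linpro035ias": {"eth0": False, "eth1": True},
        "linpro036ias": {"eth0": True, "eth1": True},
    }
    # Heartbeat on eth1 only, tiebreaker via eth0, crossover cable pulled.
    print(split_brain("eth1", "eth0", crossover_unplugged))
    # Heartbeat and tiebreaker both on eth0, one public link lost.
    print(split_brain("eth0", "eth0", one_public_link_down))
```

In this model, a heartbeat carried only on eth1 with the tiebreaker reached over eth0 lets both nodes pass the tiebreaker check after the crossover cable fails, even though neither can hear the other. With both on eth0, the node that loses its public link also loses the tiebreaker and stops.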

I think that we should have a parameter, or an option in the cluster 
configuration GUI, to specify that broadcast heartbeating will use network 
interface X (ie eth1), like this:

# cludb -put clumembd%broadcast_interface eth1

Comment 1 Lon Hohberger 2005-06-08 19:25:12 UTC
Expected behavior for all cases.  I've attempted to explain this in a general
manner here:

http://people.redhat.com/lhh/network-stuff.html

For additional configuration assistance, please contact Red Hat Support.

http://www.redhat.com/apps/support