Bug 114653 - clumanager should not run if joining the multicast group fails
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: clumanager
Version: 3
Hardware: i386 Linux
Priority: low
Severity: high
Assigned To: Lon Hohberger
Duplicates: 136553
 
Reported: 2004-01-30 13:26 EST by Lon Hohberger
Modified: 2009-04-16 16:35 EDT

Doc Type: Bug Fix
Last Closed: 2004-03-19 14:27:54 EST

Attachments
Patch to fix behavior and make code more clear (8.96 KB, patch)
2004-02-06 12:28 EST, Lon Hohberger

Description Lon Hohberger 2004-01-30 13:26:26 EST
Red Hat Cluster Manager uses either broadcast or multicast for
heartbeat transmission/reception (it can use both, but only in an
unsupported fashion).

Currently, if a member cannot join a multicast group, the heartbeat
transmission thread tries to send anyway, even though no heartbeat
file descriptors are active.

After the configured failover time, the member reboots, complaining
that it could not send a heartbeat within the failover interval.  This
problem is common on clusters connected via an inexpensive hub or
switch that doesn't handle multicast traffic.
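
The failure mode described above can be illustrated with a toy model. This is only a sketch in Python, not clumanager's actual code: the function name, the tick-based clock, and the "last sent" bookkeeping are all assumptions made for illustration. It shows why a sender with no active descriptors always trips the failover check.

```python
def ticks_until_reboot(active_fds, failover_ticks, max_ticks=100):
    """Toy model: return the tick at which the failover check fires,
    or None if it never fires within max_ticks.

    active_fds     -- heartbeat descriptors the sender can actually use
    failover_ticks -- hypothetical failover interval, in loop ticks
    """
    last_sent = 0
    for tick in range(1, max_ticks + 1):
        # The sender runs on every tick, whether or not any descriptor
        # is usable.  With an empty list this loop body never executes.
        for _fd in active_fds:
            last_sent = tick  # a successful send refreshes the timestamp
        # No heartbeat sent within the failover interval: the member
        # would reboot at this point.
        if tick - last_sent >= failover_ticks:
            return tick
    return None
```

With an empty descriptor list the check fires as soon as the interval elapses; with at least one usable descriptor it never fires in this model.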

The membership daemon should not run at all if it cannot join the
multicast group.
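
A minimal sketch of the requested fail-fast behavior follows, written in Python rather than in clumanager's own code base. It is not the attached patch. The names MulticastJoinError, open_heartbeat_socket, and start_membership are hypothetical, as are the group address 239.255.0.1 and port 50000. The only real API relied on is the standard IP_ADD_MEMBERSHIP socket option.

```python
import socket
import struct
import sys


class MulticastJoinError(RuntimeError):
    """Raised when the heartbeat socket cannot be set up or joined."""


def open_heartbeat_socket(group, port, iface_ip="0.0.0.0"):
    """Open a UDP socket and join the multicast group, or raise."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        # struct ip_mreq: group address followed by local interface address.
        mreq = struct.pack("4s4s", socket.inet_aton(group),
                           socket.inet_aton(iface_ip))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    except OSError as exc:
        sock.close()
        raise MulticastJoinError(
            "cannot join multicast group %s: %s" % (group, exc)) from exc
    return sock


def start_membership(open_socket=open_heartbeat_socket,
                     group="239.255.0.1", port=50000):
    """Return 0 if the daemon could start, 1 if it must refuse to run.

    group and port are placeholder values, not clumanager defaults.
    """
    try:
        sock = open_socket(group, port)
    except MulticastJoinError as exc:
        # Refuse to run rather than heartbeat with no usable descriptor.
        print("membership daemon not started: %s" % exc, file=sys.stderr)
        return 1
    # A real daemon would enter its heartbeat loop here; the sketch
    # simply releases the socket again.
    sock.close()
    return 0
```

A caller could pass the result to sys.exit(), so that a failed join produces an immediate, visible startup error instead of a reboot after the failover interval.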
Comment 1 Lon Hohberger 2004-02-06 12:28:15 EST
Created attachment 97517
Patch to fix behavior and make code more clear
Comment 3 Lon Hohberger 2004-02-06 12:29:42 EST
Patch is against 1.2.9
Comment 4 Lon Hohberger 2004-03-19 13:50:09 EST
Testing:

(1) Stop cluster software on one member
(2) Run ifconfig eth0 and record the IP address, netmask, and broadcast address
(3) ifdown eth0 on that member.  Unplug eth0 on the same member.
(4) rmmod <ethernet module> (e100?)
(5) ifconfig eth0 <ip> netmask <netmask> broadcast <broadcast> up
(6) Start cluster software on member

On the old version, the node should reboot after the failover interval.

Comment 5 Suzanne Hillman 2004-03-19 14:27:54 EST
Verified.
Comment 6 Lon Hohberger 2005-01-10 13:00:57 EST
*** Bug 136553 has been marked as a duplicate of this bug. ***
Comment 7 Lon Hohberger 2007-12-21 10:10:20 EST
Fixing product name.  Clumanager on RHEL3 was part of RHCS3, not RHEL3.
