Bug 182233 - Last node in a cluster doesn't send "down" notification to userspace
Status: CLOSED CURRENTRELEASE
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: cman
Version: 4
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Christine Caulfield
QA Contact: Cluster QE
Blocks: 180185

Reported: 2006-02-21 04:34 EST by Christine Caulfield
Modified: 2009-04-16 16:00 EDT
Fixed In Version: U4
Doc Type: Bug Fix
Last Closed: 2006-08-03 08:03:36 EDT

Attachments
Lon's test program to demonstrate the problem (2.65 KB, text/x-csrc)
2006-02-21 04:34 EST, Christine Caulfield

Description Christine Caulfield 2006-02-21 04:34:22 EST
Description of problem:

If you take all the nodes but one out of a cluster, the last node does not send
the final state change notification to userspace.
By implication, if a node dies in a two-node cluster, the remaining node
gets no notification.

This /only/ applies to userspace applications using cman directly, NOT to kernel
applications (e.g. DLM) or those using the service manager API.


Version-Release number of selected component (if applicable):


How reproducible:
terrifyingly.

Steps to Reproduce:
1. Run the attached program on one node of a two-node cluster.
2. Take the other node down.
3. Notice that there is no state change notification.
  
Actual results:
Nothing. The program does not notice that the node has left.

Expected results:
A state change notification.
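
To make the symptom concrete, here is a toy model of the reproduction steps. It is only a sketch: it is neither the cman-kernel code nor the attached test program, and it says nothing about the actual cause in membership.c. The names ToyCluster, skip_last_notification and observe are hypothetical and exist purely for illustration.

```python
# Toy model only: not the cman-kernel code and not the attached test program.
# ToyCluster and its skip_last_notification flag are hypothetical names.

class ToyCluster:
    def __init__(self, nodes, skip_last_notification):
        self.members = set(nodes)
        self.listeners = []  # userspace callbacks
        self.skip_last_notification = skip_last_notification

    def subscribe(self, callback):
        self.listeners.append(callback)

    def node_down(self, node):
        self.members.discard(node)
        # Symptom from this report: the transition that leaves a single
        # member produces no state change notification.
        if self.skip_last_notification and len(self.members) == 1:
            return
        for callback in self.listeners:
            callback(sorted(self.members))


def observe(skip_last_notification):
    """Follow the reproduction steps and return the notifications seen."""
    seen = []
    cluster = ToyCluster(["node1", "node2"], skip_last_notification)
    cluster.subscribe(seen.append)  # step 1: run the listener on node1
    cluster.node_down("node2")      # step 2: take the other node down
    return seen                     # step 3: inspect what arrived


actual = observe(skip_last_notification=True)     # behaviour reported here
expected = observe(skip_last_notification=False)  # behaviour asked for
print("actual:  ", actual)
print("expected:", expected)
```

With the flag set, the listener sees nothing, which matches the actual results above; with it cleared, the listener receives one notification carrying the remaining membership.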

Additional info:
I'm not yet sure how badly clvmd is affected by this. Lon should be able to
judge if it affects any of his bailiwick.

The fix is trivial.
Comment 1 Christine Caulfield 2006-02-21 04:34:23 EST
Created attachment 124945
Lon's test program to demonstrate the problem
Comment 2 Christine Caulfield 2006-02-21 05:45:27 EST
Fixed in STABLE:
Checking in membership.c;
/cvs/cluster/cluster/cman-kernel/src/membership.c,v  <--  membership.c
new revision: 1.44.2.18.6.3; previous revision: 1.44.2.18.6.2
done

Fixed in RHEL4:
Checking in membership.c;
/cvs/cluster/cluster/cman-kernel/src/membership.c,v  <--  membership.c
new revision: 1.44.2.21; previous revision: 1.44.2.20
done

The effect on clvmd is simply to make the first command after a transition wait
for ages until it times out. After that everything is fine because the timeout
will cause clvmd to re-read the nodes list.
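
The clvmd behaviour described above can be pictured with a small model. This is a sketch under assumptions, not clvmd's real code: ToyClvmd, run_command, on_state_change and the live_nodes callable are hypothetical names, and the wait is represented by a return value instead of a real timer.

```python
# Toy model only: not clvmd's real code. ToyClvmd, run_command and the
# live_nodes callable are hypothetical names.

class ToyClvmd:
    def __init__(self, live_nodes):
        self.live_nodes = live_nodes            # callable: current membership
        self.cached_nodes = set(live_nodes())   # refreshed on notification or timeout

    def on_state_change(self):
        # What a delivered notification would trigger.
        self.cached_nodes = set(self.live_nodes())

    def run_command(self):
        """Return 'ok' or 'timeout' for one cluster-wide command."""
        replies = self.cached_nodes & set(self.live_nodes())
        if replies != self.cached_nodes:
            # A cached node never answers, so the wait runs to the timeout,
            # and the timeout path re-reads the node list.
            self.cached_nodes = set(self.live_nodes())
            return "timeout"
        return "ok"


members = {"node1", "node2"}
daemon = ToyClvmd(lambda: members)

members.discard("node2")       # node2 dies; no notification is delivered
first = daemon.run_command()   # waits on node2 until the timeout
second = daemon.run_command()  # node list was re-read, so this succeeds
print(first, second)
```

In this model only the first command after the missed transition times out; had on_state_change been called when node2 left, that command would have returned "ok" as well.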
Comment 3 Lon Hohberger 2006-02-21 09:50:21 EST
Patch confirmed.
