Bug 201396 - clusvcadm hangs if node processing request dies
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: rgmanager
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Depends On:
Blocks:
 
Reported: 2006-08-04 15:50 EDT by Corey Marthaler
Modified: 2009-04-16 16:20 EDT
CC List: 1 user

Fixed In Version: RHBA-2007-0148
Doc Type: Bug Fix
Last Closed: 2007-05-10 17:03:07 EDT


Attachments
Fixes a segfault in clusvcadm exposed by the fix to 201396 (1.33 KB, patch)
2007-01-03 16:01 EST, Lon Hohberger
Makes clusvcadm / rgmanager produce an error if the node dies while handling a req. (5.17 KB, patch)
2007-01-03 16:04 EST, Lon Hohberger

Description Corey Marthaler 2006-08-04 15:50:38 EDT
Description of problem:
Got into the scenario where a forced unmount was needed in order to relocate the HA
filesystem service. When the unmount failed, a self_fence was needed. After that
happened, the original relocate command hung.

[root@taft-02 ~]# clustat
Member Status: Quorate

  Member Name                              Status
  ------ ----                              ------
  taft-01                                  Online, rgmanager
  taft-02                                  Online, Local, rgmanager
  taft-03                                  Online, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  nfs1                 taft-01                        started

[root@taft-02 ~]# clusvcadm -r nfs1 -m taft-02
Trying to relocate nfs1 to taft-02...                                    

Version-Release number of selected component (if applicable):
[root@taft-02 ~]# rpm -q rgmanager
rgmanager-1.9.51-0
[root@taft-02 ~]# uname -ar
Linux taft-02 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64 x86_64
x86_64 GNU/Linux


How reproducible:
Every time you get into this scenario.
Comment 3 Lon Hohberger 2006-11-17 10:33:08 EST
So, what we need to do is have the forwarding thread periodically check the
membership list to see whether the node it was talking to has died. If it has died,
and there is a response file descriptor, send some sort of error back through the
response fd to indicate that the status is unknown.
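
As a rough illustration of that scheme, here is a minimal, self-contained C sketch;
member_is_online(), check_forward_target(), and RG_EUNKNOWN are hypothetical names
used only for this example, not rgmanager's actual API:

/*
 * Illustrative sketch only -- not the actual rgmanager patch.  The names
 * member_is_online() and RG_EUNKNOWN are hypothetical stand-ins for the
 * real membership-list and request-forwarding code.
 */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <errno.h>

#define RG_EUNKNOWN (-100)              /* hypothetical "status unknown" code */

/* Stub membership check; rgmanager would consult the current member list. */
static int member_is_online(uint64_t node_id)
{
        (void)node_id;
        return 0;                       /* pretend the node has died */
}

/*
 * One pass of the forwarding thread's wait loop: if the node that accepted
 * the request is no longer a member, push an error back through the caller's
 * response file descriptor so clusvcadm stops waiting.
 */
static int check_forward_target(uint64_t target_node, int response_fd)
{
        if (member_is_online(target_node))
                return 0;               /* still alive; keep waiting */

        if (response_fd >= 0) {
                int32_t err = RG_EUNKNOWN;
                if (write(response_fd, &err, sizeof(err)) < 0)
                        return -errno;
        }
        return RG_EUNKNOWN;
}

int main(void)
{
        int rc = check_forward_target(2, -1);   /* no response fd in this demo */
        fprintf(stderr, "forwarding check returned %d\n", rc);
        return 0;
}
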
Comment 4 Lon Hohberger 2007-01-03 16:01:23 EST
Created attachment 144739 [details]
Fixes a segfault in clusvcadm exposed by the fix to 201396
Comment 5 Lon Hohberger 2007-01-03 16:04:57 EST
Created attachment 144741 [details]
Makes clusvcadm / rgmanager produce an error if the node dies while handling a req.

Output looks like:

[root@red rgmanager]# clusvcadm -e test04 -n green.lab.boston.redhat.com
Member green.lab.boston.redhat.com trying to enable test04...node processing
request died
(Status unknown)

This works in both the enable-on-remote-node and relocate-remote-service cases,
and was tested by performing a 'reboot -fn' while the requested operation was
taking place.
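
For reference, a rough sketch of the clusvcadm-side handling suggested by the output
above, reusing the same hypothetical RG_EUNKNOWN code; the real tool uses rgmanager's
own message structures, so treat every name here as a placeholder:

/* Sketch only -- placeholder names, not rgmanager's real message API. */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>

#define RG_EUNKNOWN (-100)              /* hypothetical "status unknown" code */

/* Read the final status from rgmanager and print a user-facing message. */
static int report_result(int fd)
{
        int32_t status;

        if (read(fd, &status, sizeof(status)) != (ssize_t)sizeof(status))
                status = RG_EUNKNOWN;   /* connection dropped: status unknown */

        if (status == RG_EUNKNOWN) {
                printf("node processing request died\n(Status unknown)\n");
                return 1;
        }
        printf("operation completed, status %d\n", status);
        return status != 0;
}

int main(void)
{
        int pfd[2];

        if (pipe(pfd) != 0)
                return 1;
        close(pfd[1]);                  /* simulate the handling node dying */
        return report_result(pfd[0]);
}
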
Comment 8 Red Hat Bugzilla 2007-05-10 17:03:07 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0148.html
