Bug 181817 - fence init needs the -t option for the stop case as well
Status: CLOSED CURRENTRELEASE
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: fence
Version: 4
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Ryan O'Hara
QA Contact: Cluster QE
Reported: 2006-02-16 15:38 EST by Corey Marthaler
Modified: 2009-04-16 16:27 EDT
Doc Type: Bug Fix
Last Closed: 2008-08-05 17:30:16 EDT


Description Corey Marthaler 2006-02-16 15:38:44 EST
Description of problem:
This is related to bz 168698. 
There is still a case where the fence init script hangs when attempting to stop
the fence service.
Comment 2 Ryan O'Hara 2006-12-21 12:17:42 EST
I could not recreate this. In a 3 node cluster, I stopped cman on 2 of the nodes
to get the cluster non-quorate. I was still able to stop fenced on the remaining
node without any problem. Also note 'service cman stop' will fail if fenced is
running.

Marking this NEEDINFO until we figure out how to recreate this problem.
Comment 3 Corey Marthaler 2007-01-04 15:44:33 EST
I have also been unable to recreate this, though I know it was quite
reproducible at one time. You can either just close this as WORKSFORME and
if/when I see it again I'll reopen it, or you can just add the -t option and
mark it closed that way.
Comment 4 Corey Marthaler 2007-01-04 17:10:59 EST
Finally reproduced it. 

1. On a four node cluster, start ccsd on all four nodes. 
2. Start cman on only 3 nodes (enough to get quorum)
3. Start fenced on only 1 node
4. Stop cman on the two nodes that don't have fenced started
5. Attempt to stop fenced... HANG
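
The quorum arithmetic behind these steps can be sketched in shell. This is only an illustration: it assumes one vote per node, a simple-majority rule, and that the expected vote count stays at 4 after cman is stopped on two nodes. The function names quorum_threshold and is_quorate are hypothetical and are not part of cman or fence.

```shell
#!/bin/sh
# Sketch only: simple-majority quorum with one vote per node (an assumption;
# real vote counts and expected votes come from the cluster configuration).

# quorum_threshold EXPECTED_VOTES: smallest vote count that is a majority.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# is_quorate VOTES_PRESENT EXPECTED_VOTES: succeeds if a majority is present.
is_quorate() {
    [ "$1" -ge "$(quorum_threshold "$2")" ]
}

# Step 2 (3 nodes running cman) and step 4 (1 node left) on a four node cluster.
for present in 3 1; do
    if is_quorate "$present" 4; then
        echo "$present of 4 nodes: quorate"
    else
        echo "$present of 4 nodes: not quorate"
    fi
done
```

Under these assumptions the single remaining node is below the threshold of 3, which matches the "waiting for cluster quorum" messages in the transcript that follows.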

[root@link-08 ~]# fence_tool join
[root@link-08 ~]# cman_tool services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           2   4 join      S-6,20,1
[1]

[root@link-08 ~]# fence_tool leave
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
[this will be stuck until quorum returns]

If there was a working "-t" flag, then the fence_tool leave would eventually
time out.
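
To illustrate the behaviour being asked for, here is a minimal shell sketch of a bounded wait. It is not how fence_tool is implemented; wait_or_give_up is a hypothetical helper, and the one second polling interval is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: poll a condition once per second and give up after a limit,
# instead of waiting indefinitely.

# wait_or_give_up SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 once SECONDS have elapsed.
wait_or_give_up() {
    limit=$1
    shift
    elapsed=0
    while ! "$@"; do
        if [ "$elapsed" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example with a condition that never becomes true (stands in for "quorum
# never returns"); the call gives up after about 2 seconds.
if wait_or_give_up 2 false; then
    echo "condition met"
else
    echo "timed out"
fi
```

With a limit in place the caller gets a failure status it can report, rather than a stop command that never returns.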
Comment 5 Ryan O'Hara 2007-01-04 17:34:37 EST
I have also reproduced it. Note that using 'service cman stop' will do a
leave/remove and adjust quorum accordingly, thus the bug was not exposed. My bad.

Comment 6 Ryan O'Hara 2007-01-05 11:56:37 EST
Fixed.

Added a timeout option to the fence_tool leave command (previously it was only
used by fence_tool join) and made the appropriate changes to the init script.
The default timeout is 120 seconds (in the init script), the same as the default
timeout for startup. Note that the fenced init script contains two separate,
configurable variables: FENCED_START_TIMEOUT and FENCED_STOP_TIMEOUT.
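
A rough sketch of how the two variables might be wired up follows. It is not the actual fenced init script: the function names start_fence_domain and stop_fence_domain are hypothetical, and passing the timeout as "-t <seconds>" is an assumption based on the option named in this report.

```shell
# Sketch only, not the shipped init script.

# Defaults taken from the comment above (120 seconds each); either can be
# overridden from the environment before this fragment runs.
FENCED_START_TIMEOUT=${FENCED_START_TIMEOUT:-120}
FENCED_STOP_TIMEOUT=${FENCED_STOP_TIMEOUT:-120}

# Hypothetical helper: join the fence domain, bounded by the start timeout.
start_fence_domain() {
    fence_tool join -t "$FENCED_START_TIMEOUT"
}

# Hypothetical helper: leave the fence domain, bounded by the stop timeout,
# so a non-quorate cluster cannot hang the stop case indefinitely.
stop_fence_domain() {
    fence_tool leave -t "$FENCED_STOP_TIMEOUT"
}
```

Keeping the two timeouts separate lets an administrator allow a long wait at startup while keeping shutdown short, or the reverse.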

Comment 8 Corey Marthaler 2007-04-23 15:18:20 EDT
Marking verified.
Comment 9 Chris Feist 2008-08-05 17:30:16 EDT
This has been fixed in the current (4.7) release.
