Bug 181817 - fence init needs the -t option for the stop case as well
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: fence
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Ryan O'Hara
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-02-16 20:38 UTC by Corey Marthaler
Modified: 2009-04-16 20:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-08-05 21:30:16 UTC
Embargoed:



Description Corey Marthaler 2006-02-16 20:38:44 UTC
Description of problem:
This is related to bz 168698.
There is still a case where the fence init script hangs while attempting to
stop the fence service.

Comment 2 Ryan O'Hara 2006-12-21 17:17:42 UTC
I could not recreate this. In a 3 node cluster, I stopped cman on 2 of the nodes
to get the cluster non-quorate. I was still able to stop fenced on the remaining
node without any problem. Also note 'service cman stop' will fail if fenced is
running.

Marking this NEEDINFO until we figure out how to recreate this problem.

Comment 3 Corey Marthaler 2007-01-04 20:44:33 UTC
I have also been unable to recreate this, though I know it was quite
reproducible at one time. You can either just close this as WORKSFORME and
if/when I see it again I'll reopen it, or you can just add the -t option and
mark it closed that way.

Comment 4 Corey Marthaler 2007-01-04 22:10:59 UTC
Finally reproduced it. 

1. On a four node cluster, start ccsd on all four nodes. 
2. Start cman on only 3 nodes (enough to get quorum)
3. Start fenced on only 1 node
4. Stop cman on the two nodes that don't have fenced started
5. Attempt to stop fenced... HANG

[root@link-08 ~]# fence_tool join
[root@link-08 ~]# cman_tool services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           2   4 join      S-6,20,1
[1]

[root@link-08 ~]# fence_tool leave
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
fence_tool: waiting for cluster quorum
[this will be stuck until quorum returns]

If there was a working "-t" flag, then the fence_tool leave would eventually
time out.
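
For illustration only, the behavior asked for here (keep retrying, but give up
after a deadline) can be sketched as a small shell helper. This is not how
fence_tool implements its wait; wait_until is a hypothetical name, and the
one-second retry interval is an assumption.

```shell
# wait_until TIMEOUT CMD...
# Hypothetical helper: retry CMD once per second until it succeeds,
# and give up with a non-zero status after TIMEOUT seconds.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

With a helper like this, a check that never succeeds returns control to the
caller after the deadline instead of looping forever.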

Comment 5 Ryan O'Hara 2007-01-04 22:34:37 UTC
I have also reproduced it. Note that using 'service cman stop' will do a
leave/remove and adjust quorum accordingly, thus the bug was not exposed. My bad.



Comment 6 Ryan O'Hara 2007-01-05 16:56:37 UTC
Fixed.

Added a timeout option to the fence_tool leave command (previously it was only
available for fence_tool join) and made the appropriate changes to the init
script. The default timeout is 120 seconds (set in the init script), the same
as the default timeout for startup. Note that the fenced init script contains
two separate, configurable variables: FENCED_START_TIMEOUT and
FENCED_STOP_TIMEOUT.
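
A rough sketch of how the two variables might be used in the init script is
shown below. It is not the actual fenced init script: the function names, the
FENCE_TOOL override, and the exact argument order are assumptions made for
illustration. Only the -t option, the two variable names, and the 120 second
default come from this bug.

```shell
# Sketch only, not the shipped init script.
# Both timeouts default to 120 seconds, as described in the fix.
FENCED_START_TIMEOUT=${FENCED_START_TIMEOUT:-120}
FENCED_STOP_TIMEOUT=${FENCED_STOP_TIMEOUT:-120}

# Hypothetical override so the sketch can be exercised without a cluster.
FENCE_TOOL=${FENCE_TOOL:-fence_tool}

start_fenced() {
    # Join the fence domain, giving up after the start timeout.
    # Argument order is assumed.
    "$FENCE_TOOL" -t "$FENCED_START_TIMEOUT" join
}

stop_fenced() {
    # Leave the fence domain, giving up after the stop timeout
    # instead of waiting indefinitely for quorum.
    "$FENCE_TOOL" -t "$FENCED_STOP_TIMEOUT" leave
}
```

Keeping the start and stop timeouts in separate variables lets an
administrator tune shutdown behavior without changing how long startup waits.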



Comment 8 Corey Marthaler 2007-04-23 19:18:20 UTC
Marking verified.

Comment 9 Chris Feist 2008-08-05 21:30:16 UTC
This has been fixed in the current (4.7) release.

