Bug 168722 - Cluster Manager hangs on stopping in 2 node cluster.
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: clumanager
Version: 3
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Reported: 2005-09-19 15:24 EDT by Khalid Mahmud
Modified: 2009-04-16 16:18 EDT (History)
Last Closed: 2006-05-04 10:40:46 EDT
Description Khalid Mahmud 2005-09-19 15:24:32 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)

Description of problem:
"service clumanager stop" hangs at the last line of the following output:

Shutting down Red Hat Cluster Manager...
Initiating shutdown of Quorum Services:                    [  OK  ]
Waiting for User Services to stop:                         [  OK  ]
Waiting for Quorum Services to stop:                       [  OK  ]
Waiting for Membership Services to stop:

While it is hung, "ps -ef | grep clumanager" shows the following:

root      8072 29795  0 16:26 pts/3    00:00:00 /bin/sh /sbin/service clumanager stop
root      8079  8072  1 16:26 pts/3    00:00:00 /bin/sh /etc/init.d/clumanager stop


Version-Release number of selected component (if applicable):
clumanager-1.2.3-1

How reproducible:
Always

Steps to Reproduce:
1. service clumanager stop

Actual Results:  Nothing. The command hangs.


Expected Results:  The cluster service should have stopped.

Additional info:

clumanager-1.2.3-1
Comment 1 Lon Hohberger 2005-09-19 15:29:52 EDT
Please retry with the U5 version, 1.2.26.1, or the U6 beta version, 1.2.28. 
This works for me.
Comment 2 Khalid Mahmud 2005-09-22 15:51:10 EDT
I upgraded to clumanager-1.2.26.1-1.
The problem still happens, but not every time now.
Comment 3 Lon Hohberger 2005-09-22 16:57:20 EDT
First of all, it works for me, which is why I'm puzzled.

Can I look at your configuration?  You can block out IPs/hostnames if you need
to, but the rest needs to be intact.
Comment 4 Khalid Mahmud 2005-09-22 17:27:15 EDT
Hi Lon
  I can't give you access to the machines, as they are behind a firewall and it is
against company policy.
  Let me know how else I can send you the information you need.
Thanks
Khalid
Comment 5 Lon Hohberger 2005-09-22 17:56:36 EDT
I meant /etc/cluster.xml ...
Comment 7 Lon Hohberger 2005-09-26 17:28:23 EDT
I hid the configuration from public view.

The configuration looks good; it looks like it's waiting for membership to stop
(which is odd, actually).

So that this is prioritized correctly, can you please file a support request here?

https://www.redhat.com/apps/support/
Comment 8 Lon Hohberger 2005-09-29 15:49:55 EDT
I still can't reproduce this.  Is there anything special I need to know to make
this happen?

Is the cluster formed (are both nodes up)?  Or is only one node up, and that
node is quorate at the point we try to stop?  Do I need to run "service clumanager
start ; service clumanager stop" in a tight loop?
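
A tight start/stop loop of the kind asked about above could be sketched as follows. This is only an illustration, not something taken from the report: the function name repro_stop_hang, the iteration count, and the stop timeout are all hypothetical, and it assumes timeout(1) from GNU coreutils is installed (it may not be on a system of this vintage).

```shell
#!/bin/sh
# Hypothetical reproduction helper (not part of clumanager).
# Usage: repro_stop_hang ITERATIONS STOP_TIMEOUT START_CMD STOP_CMD
# Returns 0 if no stop hung, 1 if a stop exceeded STOP_TIMEOUT seconds,
# 2 if a start command failed.
repro_stop_hang() {
    iterations=$1
    stop_timeout=$2
    start_cmd=$3
    stop_cmd=$4
    i=1
    while [ "$i" -le "$iterations" ]; do
        sh -c "$start_cmd" || {
            echo "start failed on iteration $i" >&2
            return 2
        }
        # timeout(1) exits with status 124 when the command is still
        # running at the deadline; that is treated as a hang here.
        timeout "$stop_timeout" sh -c "$stop_cmd"
        rc=$?
        if [ "$rc" -eq 124 ]; then
            echo "stop hung on iteration $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no hang in $iterations iterations"
    return 0
}
```

For example, repro_stop_hang 50 120 "service clumanager start" "service clumanager stop" would try 50 cycles and give each stop two minutes. Note that a stop killed by the timeout leaves the node in whatever partial state it had reached, so the loop ends at the first hang rather than continuing.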
Comment 9 Khalid Mahmud 2005-10-04 14:51:53 EDT
Cluster is formed. Both nodes are up. I just run service clumanager stop.
The command sometimes works and sometimes hangs. I do have a couple of 
services defined, and they fail over between the nodes correctly.
Comment 10 Lon Hohberger 2005-10-04 17:22:18 EDT
Well, it's hanging in this loop (obviously):

		if [ -n "`pidof $MEMBD`" ]; then
			echo -n $"Waiting for Membership Services to stop: "
			while [ -n "`pidof $MEMBD`" ]; do
				sleep 1
			done
			echo_success
			echo
		else
			echo $"Membership Services are stopped."
		fi

so the membership daemon didn't exit for some reason.  Can you run "service
clumanager stop", and while it's 'hung', run "service clumanager status" from
another terminal?  That should correctly indicate which daemon(s) are stuck.
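
The loop quoted above has no upper bound, which is why the stop command never returns when the daemon fails to exit. As a diagnostic sketch only (this is not the clumanager init script, and wait_for_exit and its timeout argument are hypothetical), the same pidof-based wait can be bounded so that it reports the surviving PIDs instead of spinning forever:

```shell
#!/bin/sh
# Hypothetical helper (not part of clumanager): wait up to $2 seconds for
# every process named $1 to exit. Returns 0 once pidof reports no matching
# process, or 1 if the timeout expires first.
wait_for_exit() {
    name=$1
    timeout=$2
    elapsed=0
    while [ -n "$(pidof "$name")" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            # Report the PIDs that are still alive so they can be inspected.
            echo "$name still running after ${timeout}s: $(pidof "$name")" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

Called as wait_for_exit "$MEMBD" 30, where $MEMBD is the variable used in the init script excerpt, this would give up after 30 seconds and print the PIDs that remain, which could then be cross-checked against the output of "service clumanager status".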
