Bug 160646 - GFS cluster node does not shutdown: CMANsendmsg failed: -101
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: cman
Version: 4
Hardware: All Linux
Priority: medium Severity: high
Assigned To: Jonathan Earl Brassow
Reported: 2005-06-16 06:54 EDT by Axel Thimm
Modified: 2007-11-30 17:11 EST
Fixed In Version: RHEL3 U5
Doc Type: Bug Fix
Last Closed: 2005-10-04 13:17:32 EDT
Attachments: None

Description Axel Thimm 2005-06-16 06:54:43 EDT
Description of problem:
Trying to shut down a GFS node in a 3-node cluster hangs with repeated

CMANsendmsg failed: -101

lines. -101 appears to be ENETUNREACH (network is unreachable), so perhaps this is a
race between network and cluster shutdown?
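
For reference, a minimal sketch of decoding the negative return code. The helper name decode_kernel_errno is hypothetical, and the two values are assumed from the generic Linux errno table (asm-generic/errno.h); a few architectures number them differently. Under that assumption 101 is ENETUNREACH, while "network is down" (ENETDOWN) is 100. Either way the code points at the network being gone.

```shell
# Hypothetical helper: map a negative kernel return code to its errno name.
# Values assumed from Linux asm-generic/errno.h; not valid on every architecture.
decode_kernel_errno() {
    code=${1#-}    # strip the leading minus sign, if any
    case "$code" in
        100) echo "ENETDOWN: Network is down" ;;
        101) echo "ENETUNREACH: Network is unreachable" ;;
        *)   echo "unknown: $code" ;;
    esac
}

decode_kernel_errno -101
```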

Version-Release number of selected component (if applicable):
cman-1.0-0.pre33.15

How reproducible:
always

Steps to Reproduce:
1.create a GFS cluster
2.try rebooting one node
3.
  
Actual results:
reboot fails on shutdown as described above

Expected results:
shutdown should complete

Additional info:
Comment 1 Jonathan Earl Brassow 2005-06-16 10:33:32 EDT
Are the init scripts active so the system shuts down in the right order?

Shutting down the network before shutting down cman would cause a problem like this.
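
One way to check the ordering, as a rough sketch: on SysV-style init the K?? links in the runlevel directory are run in sorted order at shutdown, so cman should sort before network. The helper name stops_before is hypothetical, and the directory in the usage comment is an assumption about the layout on the affected system.

```shell
# Hypothetical helper: succeed if service $2 is stopped before service $3
# according to the K?? links in runlevel directory $1.
stops_before() {
    rcdir=$1
    # Position of each service's kill link in the sorted directory listing.
    first=$(ls "$rcdir" | grep -n "^K[0-9][0-9]$2\$" | head -n 1 | cut -d: -f1)
    second=$(ls "$rcdir" | grep -n "^K[0-9][0-9]$3\$" | head -n 1 | cut -d: -f1)
    [ -n "$first" ] && [ -n "$second" ] && [ "$first" -lt "$second" ]
}

# Intended use (path is an assumption, not run here):
# stops_before /etc/rc.d/rc6.d cman network || echo "network stops before cman"
```

If either link is missing the check fails, which is also worth knowing, since a missing K link means the service is never stopped by init at all.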
Comment 2 Axel Thimm 2005-06-17 04:04:41 EDT
Yes, all GFS related init scripts have been chkconfig-enabled. The shutdown is
performed by the normal init scripts in the script-given ordering.

Perhaps cman shutdown fails for some reason and later on cman holds up the final
reboot? Then there would be two bugs: one for cman not being shut down properly
(and I can imagine fencing playing a part in this), and another for the
not-stopped cman not allowing the system to shut down or reboot.
Comment 3 Jonathan Earl Brassow 2005-07-19 20:10:07 EDT
I believe this was solved by alewis by altering the clvm init script.
Comment 4 AJ Lewis 2005-08-03 10:49:42 EDT
hrm...not sure - are there any initscript errors before this happens?
Comment 5 Jonathan Earl Brassow 2005-08-03 11:48:29 EDT
I don't think there were any errors reported by the clvmd init script. Previously, it would
shut down volumes, but not kill the clvmd daemon. Since the daemon was still logged into cman, cman
would refuse to shut down and start spitting out errors like those described above.

The way to get to the bottom of this hypothesis is to have the user attach their clvmd init script and 
check to make sure that it is killing off the daemon during all shutdown cases.
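
Short of reading the script, a runtime check along these lines might also help: stop clvmd through its init script and see whether the daemon is still there. This is only a sketch. The helper name is_running is hypothetical, and it relies on the Linux /proc/<pid>/stat layout, where the second field is the process name in parentheses.

```shell
# Hypothetical check: succeed if any process named $1 is still running.
# Reads the "(name)" field of /proc/<pid>/stat, so it is Linux-specific and
# does not handle process names containing spaces.
is_running() {
    for dir in /proc/[0-9]*; do
        # A process may exit between the glob and the read; skip it quietly.
        { read -r pid comm rest < "$dir/stat"; } 2>/dev/null || continue
        [ "$comm" = "($1)" ] && return 0
    done
    return 1
}

# Intended use (not run here):
# service clvmd stop; is_running clvmd && echo "clvmd survived the stop"
```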
Comment 6 Jonathan Earl Brassow 2005-10-04 13:17:32 EDT
The clvmd init script now kills off the daemon when shutting down.
Comment 7 Axel Thimm 2005-10-04 17:18:16 EDT
The resolution of this bug is RHEL3 (aka RHCS 3), while the bug was opened
against FC4, which is more like RHEL4 with respect to RHCS/RHGFS.

Also there seems to still be some racing in RHEL4 in shutting down the cluster with

for service in rgmanager gfs clvmd fenced cman ccsd; do
   service $service stop
done

Sometimes cman fails to stop, and service cman stop needs to be reissued.
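
As a stopgap, the reissue can be scripted. The following is a sketch only: stop_with_retry is a hypothetical wrapper, and it assumes that "service cman stop" returns a nonzero status when the stop fails, which has not been verified here.

```shell
# Hypothetical wrapper: run a stop command up to $1 times, pausing between tries.
stop_with_retry() {
    tries=$1; shift
    n=1
    while [ "$n" -le "$tries" ]; do
        "$@" && return 0                   # stop succeeded, nothing more to do
        [ "$n" -lt "$tries" ] && sleep 1   # brief pause before the next attempt
        n=$((n + 1))
    done
    return 1                               # every attempt failed
}

# Intended use (not run here), mirroring the loop above:
# for service in rgmanager gfs clvmd fenced cman ccsd; do
#     stop_with_retry 3 service "$service" stop || break
# done
```

Breaking out of the loop on a persistent failure is a design choice: it avoids stopping ccsd underneath a cman that is still running, at the cost of leaving the node half shut down.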
