Red Hat Bugzilla – Bug 428108
service left untouched if restricted failover domain goes away
Last modified: 2010-10-22 17:35:27 EDT
Description of problem:
If a node is killed while a service is in the state "stopping", it remains in
this state even after the node comes back online. This only happens in the case
that the service is part of a restricted failover domain that prevents the
service being relocated elsewhere in the cluster; without this restriction the
service will be correctly re-started on another node.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Configure a cluster having a service that takes some time to start/stop (e.g.
the attached service script takes 30s for start/stop).
2. Configure the service as a member of a restricted failover domain that will
only allow the service to run on a single node.
3. While the service is running on that node, disable it and then kill power to
the node while the service is still shutting down.
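For step 1, a slow service script could look like the sketch below. It is not the script attached to this bug; the names DELAY, STATE_FILE and the slow_* functions are made up for illustration, and it assumes the cluster calls the script with start, stop or status in the usual init-script style.

```shell
#!/bin/sh
# Hypothetical slow service script: a sketch, not the script attached to this bug.
# DELAY and STATE_FILE are illustrative names; 30 seconds matches step 1.
DELAY="${DELAY:-30}"
STATE_FILE="${STATE_FILE:-/tmp/slow-service.running}"

slow_start() {
    sleep "$DELAY"          # simulate a long startup
    touch "$STATE_FILE"     # mark the service as running
}

slow_stop() {
    sleep "$DELAY"          # simulate a long shutdown; this is the window for killing the node
    rm -f "$STATE_FILE"     # mark the service as stopped
}

slow_status() {
    [ -f "$STATE_FILE" ]    # exit 0 if running, non-zero otherwise
}

# Dispatch only when an action is given, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    case "$1" in
        start)  slow_start ;;
        stop)   slow_stop ;;
        status) slow_status ;;
        *)      echo "Usage: $0 {start|stop|status}" >&2; exit 2 ;;
    esac
fi
```

With the default delay, the stop action leaves about 30 seconds in which to cut power to the node for step 3.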
Actual results:
Service stays in the state "stopping" forever even after the node is powered up
Expected results:
Service restarts automatically once the killed node is back online.
Created attachment 291124
cluster.conf for a two-node cluster exhibiting this problem
Created attachment 291125
test service script that is slow to start/stop
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
Workaround exists. Disable the service.
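A minimal sketch of the workaround, assuming rgmanager's clusvcadm tool is available on a cluster member. The service name slowsvc and the CLUSVCADM override (useful for a dry run with echo) are illustrative assumptions, not taken from this report.

```shell
#!/bin/sh
# Sketch of the workaround: disable the stuck service by name.
# CLUSVCADM is a hypothetical override; it defaults to the real clusvcadm command.
disable_service() {
    # clusvcadm -d <service> asks rgmanager to disable the named service
    "${CLUSVCADM:-clusvcadm}" -d "$1"
}

# Example (hypothetical service name):
#   disable_service slowsvc
```

Afterwards, clustat should show the service as disabled rather than stopping, if the workaround took effect.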
This only occurs if there is no node capable of running the service, and is
effectively the same issue as bug 435466.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.