Bug 428108 - service left untouched if restricted failover domain goes away
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: rgmanager
Version: 4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-01-09 09:49 UTC by Bryn M. Reeves
Modified: 2018-10-20 00:24 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-05-18 21:13:28 UTC
Embargoed:


Attachments
cluster.conf for a two-node cluster exhibiting this problem (1.16 KB, text/plain)
2008-01-09 09:49 UTC, Bryn M. Reeves
test service script that is slow to start/stop (121 bytes, text/plain)
2008-01-09 09:50 UTC, Bryn M. Reeves


Links
System: Red Hat Product Errata
ID: RHBA-2009:1048
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: rgmanager bug-fix and enhancement update
Last Updated: 2009-05-18 21:12:29 UTC

Description Bryn M. Reeves 2008-01-09 09:49:37 UTC
Description of problem:
If a node is killed while a service running on it is in the "stopping" state,
the service remains in that state even after the node comes back online. This
happens only when the service belongs to a restricted failover domain that
prevents it from being relocated elsewhere in the cluster; without this
restriction, the service is correctly restarted on another node.


Version-Release number of selected component (if applicable):
rgmanager-1.9.68-1.0.1

How reproducible:
100%

Steps to Reproduce:
1. Configure a cluster having a service that takes some time to start/stop (e.g.
the attached service script takes 30s for start/stop).
2. Configure the service as a member of a restricted failover domain that will
only allow the service to run on a single node.
3. While the service is running on that node, disable it and then kill power to
the node while the service is still shutting down.
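
For reference, a slow service script along the lines of the one described in
step 1 might look like the sketch below. This is not the attached script: the
function name slow_service and the SLOW_SERVICE_DELAY variable are made up for
illustration, and the sketch assumes that rgmanager calls a script resource
with a start, stop or status argument.

```shell
#!/bin/sh
# Hypothetical slow test service (illustrative only, not the attached script).
# start and stop sleep for SLOW_SERVICE_DELAY seconds (default 30) so that
# there is a window in which the service sits in "starting" or "stopping".

slow_service() {
    case "$1" in
        start|stop)
            # Simulate a service that is slow to come up or shut down.
            sleep "${SLOW_SERVICE_DELAY:-30}"
            return 0
            ;;
        status)
            # Always report healthy; this script manages no real process.
            return 0
            ;;
        *)
            echo "usage: $0 {start|stop|status}" >&2
            return 1
            ;;
    esac
}

# Dispatch only when the script is run with an argument.
if [ "$#" -gt 0 ]; then
    slow_service "$1"
    exit $?
fi
```

With the default delay, powering off the node within 30 seconds of disabling
the service should catch it in the "stopping" state.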
  
Actual results:
Service stays in the state "stopping" forever even after the node is powered up
again.

Expected results:
Service restarts automatically once the killed node is back online.

Additional info:

Comment 1 Bryn M. Reeves 2008-01-09 09:49:37 UTC
Created attachment 291124
cluster.conf for a two-node cluster exhibiting this problem

Comment 2 Bryn M. Reeves 2008-01-09 09:50:11 UTC
Created attachment 291125
test service script that is slow to start/stop

Comment 3 RHEL Program Management 2008-01-09 09:55:59 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 5 Lon Hohberger 2008-02-05 15:57:05 UTC
Workaround exists.  Disable the service.
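
The comment above does not give a command. A minimal sketch of the workaround,
assuming the standard clusvcadm utility and its -d (disable) option; the
disable_stuck_service helper, the DRY_RUN variable and the service name
"slowtest" are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: disable a service that is stuck in "stopping".
# Set DRY_RUN to any non-empty value to print the command instead of running it.

disable_stuck_service() {
    svc="$1"
    if [ -z "$svc" ]; then
        echo "usage: disable_stuck_service <service-name>" >&2
        return 2
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        # Show what would be run without touching the cluster.
        echo "clusvcadm -d $svc"
        return 0
    fi
    # Ask rgmanager to disable (stop) the named service.
    clusvcadm -d "$svc"
}
```

For example, "disable_stuck_service slowtest" on a cluster member; the service
state can then be checked with clustat.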

Comment 6 Lon Hohberger 2008-03-04 20:21:28 UTC
This occurs only if no node is capable of running the service, and it is
effectively the same bug as 435466.

Comment 7 Lon Hohberger 2008-03-04 20:22:27 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=435466

(RHEL5 bug)

Comment 13 errata-xmlrpc 2009-05-18 21:13:28 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1048.html

