Bug 634277 - RFE: Critical/Non-Critical services & resources
Summary: RFE: Critical/Non-Critical services & resources
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: rgmanager
Version: 6.0
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Keywords: FutureFeature, Triaged
Duplicates: 493660
Depends On: 634298
Blocks: 655920 674710
Reported: 2010-09-15 17:43 UTC by Lon Hohberger
Modified: 2018-11-14 20:20 UTC (History)
Clone Of: 605733
Clones: 674710
Last Closed: 2011-05-19 14:18:14 UTC


External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2011:0750
Priority: normal
Status: SHIPPED_LIVE
Summary: rgmanager bug fix and enhancement update
Last Updated: 2011-05-18 18:09:01 UTC

Description Lon Hohberger 2010-09-15 17:43:03 UTC
+++ This bug was initially created as a clone of Bug #605733 +++

Description of problem:

RHCS defines only three recovery options for a failed process:
- Disable
- Restart and relocate if restart fails
- Relocate

There is no "Restart but do not relocate" option.

The use case is a configuration running multiple custom/flaky applications that share the same storage and IP address. If an individual application fails, the customer wants to attempt one or more restarts. If restarting that application fails, there is no point in relocating: relocation is unlikely to fix the problem and would only disrupt the other applications running on the same box.
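
To make the gap concrete, here is a minimal Python sketch of the three existing recovery options plus the requested one. It is an illustration only, not rgmanager code: the function name, the policy strings, and the returned action names are assumptions chosen for this example ("restart-disable" borrows the name used for the proposed policy later in this report).

```python
def action_after_failure(policy, restart_failed=False):
    """Return the next recovery action for a failed service.

    Hypothetical model for illustration; names are not taken from rgmanager.
    `restart_failed` is True once an attempted restart has itself failed.
    """
    if policy == "disable":
        return "disable"
    if policy == "relocate":
        return "relocate"
    if policy == "restart":
        # Existing behaviour: restart, and relocate if the restart fails.
        return "relocate" if restart_failed else "restart"
    if policy == "restart-disable":
        # Requested behaviour: restart, but never relocate.
        return "disable" if restart_failed else "restart"
    raise ValueError("unknown recovery policy: %r" % (policy,))
```

With the existing "restart" policy a failed restart moves the whole service, and with it every other application sharing the storage and IP address; the requested policy stops at disabling instead.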

--- Additional comment from lhh@redhat.com on 2010-09-15 13:29:19 EDT ---

There are three main components:

1) a restart-disable policy on the whole service which interacts
   with the existing max-restarts / restart-expire-time
2) non-critical independent subtrees: 
   - the ability to let designated resources fail
   - the ability to recover these resources
3) restart threshold policies on independent subtrees
   - the ability to define max-restarts / restart-expire-time
     on a per subtree basis
   - operation with normal independent subtrees:
     service goes into recovery when threshold is exceeded
   - operation with non-critical independent subtrees: 
     disable subtree when threshold is exceeded
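
The threshold behaviour in component 3 can be sketched roughly as follows. This is a hedged Python illustration, not the rgmanager implementation: the class, function, and action names are hypothetical, and the reading of max-restarts / restart-expire-time (at most max_restarts restarts tolerated, each one forgotten after restart_expire_time seconds) is an assumption.

```python
from collections import deque


class RestartThreshold:
    """Tracks recent restarts for one independent subtree (hypothetical)."""

    def __init__(self, max_restarts, restart_expire_time):
        self.max_restarts = max_restarts
        self.restart_expire_time = restart_expire_time  # seconds
        self._restarts = deque()  # timestamps of restarts not yet expired

    def allow_restart(self, now):
        """Record a restart at time `now` if the threshold permits it."""
        # Forget restarts older than the expire time.
        while self._restarts and now - self._restarts[0] >= self.restart_expire_time:
            self._restarts.popleft()
        if len(self._restarts) >= self.max_restarts:
            return False  # threshold exceeded
        self._restarts.append(now)
        return True


def on_subtree_failure(threshold, non_critical, now):
    """Decide what to do when an independent subtree fails at time `now`."""
    if threshold.allow_restart(now):
        return "restart-subtree"
    # Threshold exceeded: a non-critical subtree is disabled on its own,
    # while a normal independent subtree sends the service into recovery.
    return "disable-subtree" if non_critical else "recover-service"
```

For example, with max_restarts=2 and restart_expire_time=60, failures at t=0 and t=10 are restarted in place, while a third failure at t=20 disables a non-critical subtree or puts the whole service into recovery for a normal one.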

Comment 1 Lon Hohberger 2010-09-15 17:51:14 UTC
*** Bug 493660 has been marked as a duplicate of this bug. ***

Comment 2 Lon Hohberger 2010-12-01 19:30:41 UTC
There are 17 patches in STABLE31 addressing this issue.

Comment 5 Lon Hohberger 2011-02-01 18:13:48 UTC
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=16ab187d7733c653dddc3e1b9cd90524ccdf8947
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=ca924c428bbf149531f896b52c9ba6f1597c634b
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=5203d9eefe530a13525dc32d9f48568fbabfd495
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=0e5e14cd1471464edf14776bd7ac84d14623a03d
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=5907340776e360b327642f24f7ace0ae812b7a81
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=b631ffdb818f7cf3512840dd99b8844aa230b03d
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=abe50ac2721ec8124aa2a614c2a0a05e4cfa3ad7
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=57232d8ad1dde6927a7d8cd267d1f3813e2bf0ca
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=c5db095bea06e76e021577bd56d2658f90ebbecc
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=109a4f729592e2f9039ec369df440cbb21a078c7
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=c2fa8fe7c8f2a3cbf1023a170f3f78a8de559b7a
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=3b03b46fe7d3c7d747db9a2b7721cc56aef458f2
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=4e2261f72411aae2604d9d3b771f221b11ef4b6b
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=d11004237d32ef094cf515e9215be2430723dacf
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=0ffd512aea6ed74ed0127284d0112bcbffa33061
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=c7187470032d2ab7c32a6a3ae43a358e1a99656b
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=d100504de6eff5f83dba79319ad0bd560f7a57df

Comment 8 Lon Hohberger 2011-02-11 19:03:36 UTC
How it works, and what to expect:

https://bugzilla.redhat.com/show_bug.cgi?id=605733#c14

Comment 10 errata-xmlrpc 2011-05-19 14:18:14 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0750.html

