Bug 435466 - Rgmanager shows wrong service status in restricted failover domain
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: rgmanager
Version: 5.1
Hardware: i686 Linux
Priority: medium
Severity: medium
Target Milestone: rc
Assigned To: Lon Hohberger
Reported: 2008-02-29 10:16 EST by Agnieszka Kukalowicz
Modified: 2010-10-22 18:56 EDT

Doc Type: Bug Fix
Last Closed: 2009-01-20 15:56:18 EST


Attachments
Pass 1 patch (10.16 KB, patch)
2008-07-25 13:40 EDT, Lon Hohberger
Description Agnieszka Kukalowicz 2008-02-29 10:16:32 EST
Description of problem:
Rgmanager shows that a service configured in a one-node restricted failover domain
is "started" even after the node the service was running on has been fenced.

Version-Release number of selected component (if applicable):
rgmanager-2.0.31-1.el5

How reproducible:
Always

Steps to Reproduce:
1. Configure service in one node restricted failover domain
2. Power off the node service is running on
3. On the other cluster node do "clustat"
4. Clustat shows the service is "started"
  
Cluster before the node is down:

Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w2.local                              1 Online, Local, rgmanager
  w1.local                              2 Online, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  vm:VM_Work11_RHEL51  w1.local                       started
  vm:VM_Work21_RHEL51  w2.local                       started


Actual results:
clustat reports the service as "started", with the fenced (offline) node still shown as its owner.

Expected results:
clustat reports the service as stopped, since the only node in its restricted failover domain is offline.

Additional info:
Comment 1 Agnieszka Kukalowicz 2008-02-29 10:20:51 EST
Additional info:

Cluster after node is down:
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w2.local.polska.pl                    1 Offline
  w1.local.polska.pl                    2 Online, Local, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  vm:VM_Work11_RHEL51  w1.local.polska.pl             started
  vm:VM_Work21_RHEL51  w2.local.polska.pl             started

Cluster configuration in /etc/cluster/cluster.conf:
<rm>
  <failoverdomains>
    <failoverdomain name="VM_w1_failover" ordered="0" restricted="1">
        <failoverdomainnode name="w1.local" priority="1"/>
    </failoverdomain>
    <failoverdomain name="VM_w2_failover" ordered="0" restricted="1">
        <failoverdomainnode name="w2.local" priority="1"/>

    </failoverdomain>
  </failoverdomains>
  <resources/>
  <vm autostart="1" domain="VM_w1_failover" exclusive="0" 
name="VM_Work11_RHEL51" path="/virts/w11" recovery="restart"/>
  <vm autostart="1" domain="VM_w2_failover" exclusive="0" 
name="VM_Work21_RHEL51" path="/virts/w21" recovery="restart"/>
</rm>
Comment 2 Lon Hohberger 2008-02-29 13:22:25 EST
The code says it's known - but the behavior's certainly not right.  It's not
difficult to fix.

Notes: 
 * One fix needs to be in groups.c in eval_services()
 * One fix needs to go into default_event_script.sl
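
For illustration, here is a minimal C sketch of the kind of membership check the fix needs. It is a sketch under assumed names (node, domain, service, domain_online_members, eval_service), not rgmanager's real data structures or API: the idea is that when every member of a restricted failover domain is offline, the service must be reported stopped rather than left "started" on the fenced owner.

/* Hypothetical sketch; the types and helpers below are illustrative,
 * not rgmanager's real internals. */
#include <stddef.h>

struct node {
    const char *name;
    int online;              /* nonzero if the node is a cluster member */
};

struct domain {
    int restricted;          /* restricted="1" in cluster.conf */
    struct node **members;   /* nodes listed as <failoverdomainnode> */
    size_t member_count;
};

enum svc_state { SVC_STARTED, SVC_STOPPED };

struct service {
    struct domain *domain;
    enum svc_state state;
    struct node *owner;
};

/* Count the online members of a failover domain. */
static size_t domain_online_members(const struct domain *d)
{
    size_t n = 0;
    for (size_t i = 0; i < d->member_count; i++)
        if (d->members[i]->online)
            n++;
    return n;
}

/* Re-evaluate one service after a membership change: a service whose
 * restricted domain has no online members cannot be running anywhere,
 * so report it stopped instead of "started" on the fenced owner. */
static void eval_service(struct service *svc)
{
    if (svc->state == SVC_STARTED &&
        svc->domain && svc->domain->restricted &&
        domain_online_members(svc->domain) == 0) {
        svc->state = SVC_STOPPED;
        svc->owner = NULL;
    }
}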
Comment 4 RHEL Product and Program Management 2008-06-04 18:46:06 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 5 Lon Hohberger 2008-07-25 13:40:12 EDT
Created attachment 312662 [details]
Pass 1 patch

Tested, but it needs more testing when central_processing is enabled. This also
fixes bug 428108.
Comment 6 Lon Hohberger 2008-07-25 13:58:50 EDT
Note: it doesn't apply against the RHEL4 branch, so my comment about it fixing
bug 428108 is misleading.
Comment 12 errata-xmlrpc 2009-01-20 15:56:18 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0101.html
