Bug 714671

Summary: rgmanager with central processing behaves differently depending on the order in which services appear in cluster.conf

Product: Red Hat Enterprise Linux 5
Component: rgmanager
Version: 5.6
Reporter: Julio Entrena Perez <jentrena>
Assignee: Lon Hohberger <lhh>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint, edamato, jentrena
Status: CLOSED DUPLICATE
Severity: medium
Priority: unspecified
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2011-06-21 13:43:05 UTC
Attachments:
- cluster.conf file that shows the deferred RIND event taking place
- cluster.conf file that shows the RIND event taking place when expected

Description Julio Entrena Perez 2011-06-20 12:10:28 UTC
Created attachment 505599 [details]
cluster.conf file that shows the deferred RIND event taking place

> Description of problem:
Depending on the order in which services appear in cluster.conf, rgmanager with central processing defers execution of the RIND script follow-service.sl during a failover.

> Version-Release number of selected component (if applicable):
rgmanager-2.0.52-9.el5_6.1.

> How reproducible:
Always.

> Steps to Reproduce:
1. Create a cluster.conf file with:
- two nodes (1 and 2) and a quorum disk.
- two failover domains:
  - FD1 includes both nodes and prefers the first.
  - FD2 includes the second node only.
- five services (defined in this order):
  - service A uses FD1.
  - service D uses FD2.
  - service E uses FD2.
  - service B uses FD1 and has a hard dependency on service A.
  - service C uses FD1 and has a hard dependency on service A.

- central processing enabled for rgmanager.

- four RIND events:
  - event class service, follow-service.sl A,D,A.
  - event class node, follow-service.sl A,D,A.
  - event class service, follow-service.sl A,E,A.
  - event class node, follow-service.sl A,E,A.

2. Start all services. Services A, B and C will start on node 1; D and E will start on node 2.

3. Stop the rgmanager service on node 1.
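
The layout described in step 1 corresponds roughly to an <rm> section like the sketch below. This is illustrative only, not taken from the attached files: node and service names are hypothetical, and the depend/depend_mode attributes express the "hard dependency on service A" in the rgmanager service-dependency syntax.

  <rm central_processing="1">
          <failoverdomains>
                  <!-- FD1: both nodes, node 1 preferred -->
                  <failoverdomain name="FD1" ordered="1" restricted="1">
                          <failoverdomainnode name="node1" priority="1"/>
                          <failoverdomainnode name="node2" priority="2"/>
                  </failoverdomain>
                  <!-- FD2: node 2 only -->
                  <failoverdomain name="FD2" restricted="1">
                          <failoverdomainnode name="node2"/>
                  </failoverdomain>
          </failoverdomains>

          <!-- Definition order matters: this is the ordering that
               triggers the misbehaviour -->
          <service name="A" domain="FD1"/>
          <service name="D" domain="FD2"/>
          <service name="E" domain="FD2"/>
          <service name="B" domain="FD1" depend="service:A" depend_mode="hard"/>
          <service name="C" domain="FD1" depend="service:A" depend_mode="hard"/>

          <events>
                  <!-- four RIND events invoking follow-service.sl for
                       (A,D) and (A,E), in both the "service" and "node"
                       event classes; see the attached cluster.conf files
                       for the exact <event> syntax -->
          </events>
  </rm>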
  
> Actual results:
- Services A, B and C are stopped on node 1.
- Services A, B and C are started on node 2.
- Services D and E are stopped on node 2.

> Expected results:
- Services A, B and C are stopped on node 1.
- Service A is started on node 2.
- Services D and E are stopped on node 2.
- Services B and C are started on node 2.

> Additional info:
If services are defined in cluster.conf in the following order, the expected behaviour is achieved:
  - service B uses FD1 and has a hard dependency on service A.
  - service C uses FD1 and has a hard dependency on service A.
  - service A uses FD1.
  - service D uses FD2.
  - service E uses FD2.

Please note that the only difference between the two files is the order in which the services appear in cluster.conf; no property of any service is modified at all.

See attached 'cluster.conf.works' and 'cluster.conf.doesntwork' files.

Comment 1 Julio Entrena Perez 2011-06-20 12:11:10 UTC
Created attachment 505600 [details]
cluster.conf file that shows the RIND event taking place when expected

Comment 2 Julio Entrena Perez 2011-06-20 12:12:14 UTC
# diff cluster.conf.works cluster.conf.doesntwork 
2c2
< <cluster config_version="123" name="cl55ase">
---
> <cluster config_version="124" name="cl55ase">
54a55,62
>                 <service autostart="0" domain="fd_n1" exclusive="0" name="IWR_db_p1" nfslock="1" recovery="relocate">
>                         <fs ref="fs_share">
>                                 <nfsexport ref="nfs_share">
>                                         <nfsclient ref="nfsc_share"/>
>                                 </nfsexport>
>                         </fs>
>                         <ip ref="10.33.1.250"/>
>                 </service>
63,70d70
< 		<service autostart="0" domain="fd_n1" exclusive="0" name="IWR_db_p1" nfslock="1" recovery="relocate">
< 			<fs ref="fs_share">
< 				<nfsexport ref="nfs_share">
< 					<nfsclient ref="nfsc_share"/>
< 				</nfsexport>
< 			</fs>
< 			<ip ref="10.33.1.250"/>
< 		</service>

Comment 4 Julio Entrena Perez 2011-06-20 12:14:35 UTC
I forgot to mention that the behaviour is the same regardless of which node is the RG-master and which one is the RG-worker.

Comment 5 Lon Hohberger 2011-06-21 13:43:05 UTC
The way event processing works is highly dependent on configuration order. For example, given:

  <service name="a" depend="b"/>
  <service name="b" />

When a node comes online, you will see the following:

  [node 1 online]
    start service a (can't; dependency not met)
    start service b
  [service b started]
    dependency met, start service a
  [service a started]

If you change the order:

  <service name="b" />
  <service name="a" depend="b"/>

You will see:

  [node 1 online]
    start service b
    start service a
  [service b started]
  [service a started]

You can create specific priority ordering for node events by adding the 'priority' attribute to services:

  <service name="a" depend="b" priority="2" />
  <service name="b" priority="1" />

... or by reordering them in cluster.conf.
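
Putting the two suggestions together: with central processing enabled on the <rm> element, a dependency-safe ordering can be stated explicitly with priorities regardless of file order. A minimal sketch (names and the depend shorthand follow the snippets above; they are illustrative, not the reporter's configuration):

  <rm central_processing="1">
          <!-- priority overrides file order: b (priority 1) is
               started before a (priority 2), so a's dependency is
               already met when a is started -->
          <service name="b" priority="1"/>
          <service name="a" depend="b" priority="2"/>
  </rm>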

*** This bug has been marked as a duplicate of bug 492828 ***