Created attachment 505599
cluster.conf file that shows the deferred RIND event taking place

Description of problem:

Depending on the order in which services appear in cluster.conf, rgmanager with central processing defers the RIND script follow-service.sl during a failover.

Version-Release number of selected component (if applicable):

rgmanager-2.0.52-9.el5_6.1

How reproducible:

Always.

Steps to Reproduce:

1. Create a cluster.conf file with:
   - two nodes, 1 and 2, and a quorum disk.
   - two failover domains:
     - FD1 includes both nodes and prefers the first.
     - FD2 includes the second node only.
   - five services (defined in this order):
     - service A uses FD1.
     - service D uses FD2.
     - service E uses FD2.
     - service B uses FD1 and has a hard dependency on service A.
     - service C uses FD1 and has a hard dependency on service A.
   - central processing enabled for rgmanager.
   - four RIND events:
     - event class service follow-service-sl A,D,A.
     - event class node follow-service-sl A,D,A.
     - event class service follow-service-sl A,E,A.
     - event class node follow-service-sl A,E,A.
2. Start all services. A, B and C will start on node 1; D and E will start on node 2.
3. Stop the rgmanager service on node 1.

Actual results:

- Services A, B and C are stopped on node 1.
- Services A, B and C are started on node 2.
- Services D and E are stopped on node 2.

Expected results:

- Services A, B and C are stopped on node 1.
- Service A is started on node 2.
- Services D and E are stopped on node 2.
- Services B and C are started on node 2.

Additional info:

If the services are defined in cluster.conf in the following order, the expected behaviour is achieved:

- service B uses FD1 and has a hard dependency on service A.
- service C uses FD1 and has a hard dependency on service A.
- service A uses FD1.
- service D uses FD2.
- service E uses FD2.

Please note that the only difference is the order in which the services appear in cluster.conf. No property of any service is modified.
See attached 'cluster.conf.works' and 'cluster.conf.doesntwork' files.
Created attachment 505600
cluster.conf file that shows the RIND event taking place when expected
# diff cluster.conf.works cluster.conf.doesntwork
2c2
< <cluster config_version="123" name="cl55ase">
---
> <cluster config_version="124" name="cl55ase">
54a55,62
> <service autostart="0" domain="fd_n1" exclusive="0" name="IWR_db_p1" nfslock="1" recovery="relocate">
> <fs ref="fs_share">
> <nfsexport ref="nfs_share">
> <nfsclient ref="nfsc_share"/>
> </nfsexport>
> </fs>
> <ip ref="10.33.1.250"/>
> </service>
63,70d70
< <service autostart="0" domain="fd_n1" exclusive="0" name="IWR_db_p1" nfslock="1" recovery="relocate">
< <fs ref="fs_share">
< <nfsexport ref="nfs_share">
< <nfsclient ref="nfsc_share"/>
< </nfsexport>
< </fs>
< <ip ref="10.33.1.250"/>
< </service>
I forgot to mention that the behaviour is the same regardless of which node is the RG-master and which one is the RG-worker.
The way the event processing works is very dependent on configuration order. For example, given:

    <service name="a" depend="b"/>
    <service name="b" />

When a node comes online, you will see the following:

    [node 1 online]
    start service a (can't; dependency not met)
    start service b
    [service b started]
    dependency met, start service a
    [service a started]

If you change the order:

    <service name="b" />
    <service name="a" depend="b"/>

You will see:

    [node 1 online]
    start service b
    start service a
    [service b started]
    [service a started]

You can create a specific priority ordering for node events by adding the 'priority' attribute to services:

    <service name="a" depend="b" priority="2" />
    <service name="b" priority="1" />

... or by reordering them in cluster.conf.

*** This bug has been marked as a duplicate of bug 492828 ***
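The ordering effect described in the comment above can be illustrated with a small toy model. This is only a sketch and not rgmanager's actual event code: the `Service` tuple, the `node_online_trace` function, the rule that a dependency counts as met once its start has been issued, and the treatment of a missing 'priority' as lowest priority are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a <service> element in cluster.conf.
Service = namedtuple("Service", ["name", "depend", "priority"])

def node_online_trace(services):
    """Return trace lines for a node-online event (toy model only)."""
    # Stable sort: lower priority value first, ties keep cluster.conf order.
    # Assumption: services without a priority sort last.
    ordered = sorted(
        services,
        key=lambda s: float("inf") if s.priority is None else s.priority,
    )
    trace = ["[node 1 online]"]
    issued = []    # names of services whose start has been issued
    deferred = []  # services waiting for a dependency
    for svc in ordered:
        if svc.depend is None or svc.depend in issued:
            trace.append(f"start service {svc.name}")
            issued.append(svc.name)
        else:
            trace.append(f"start service {svc.name} (can't; dependency not met)")
            deferred.append(svc)
    # Completion events arrive in the order the starts were issued.
    pending = list(issued)
    while pending:
        name = pending.pop(0)
        trace.append(f"[service {name} started]")
        for svc in [d for d in deferred if d.depend == name]:
            deferred.remove(svc)
            trace.append(f"dependency met, start service {svc.name}")
            pending.append(svc.name)
    return trace

if __name__ == "__main__":
    dependent_first = [Service("a", "b", None), Service("b", None, None)]
    with_priority = [Service("a", "b", 2), Service("b", None, 1)]
    print("\n".join(node_online_trace(dependent_first)))
    print()
    print("\n".join(node_online_trace(with_priority)))
```

In this model, listing the dependent service first produces the deferred start, while either reordering the list or assigning priorities lets the dependency be started first, which mirrors the two traces quoted in the comment.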