The cluster.conf is:

...
<resources>
  ...
  <fs device="/dev/data/mt-daten" force_fsck="0" force_unmount="1"
      fstype="ext3" mountpoint="/exports.smb/mt-daten" name="mt-daten"
      options="acl" self_fence="1"/>
  <fs device="/dev/data/zMuell" force_fsck="0" force_unmount="1"
      fsid="17217" fstype="ext3" mountpoint="/exports.smb/mt-daten/zMuell"
      name="zMuell" options="acl" self_fence="1"/>
  ...
</resources>
<service autostart="1" domain="storage" exclusive="1" name="storage"
         recovery="restart">
  ...
  <fs ref="mt-daten"/>
  <fs ref="zMuell"/>
  ...
</service>
...

If I stop rgmanager, it tries to unmount <fs ref="mt-daten"/> before it unmounts <fs ref="zMuell"/>. That is not possible, so it reboots the host. The correct behavior is to unmount <fs ref="zMuell"/> before <fs ref="mt-daten"/>. When rgmanager starts, it does the right thing: it mounts <fs ref="mt-daten"/> before it mounts <fs ref="zMuell"/>.
The ordering is currently not guaranteed for a list of like-typed resources at this point. If you have an ordering dependency between two <fs> resources, the way to guarantee it (right now) is:

<service>
  <fs name="foo">
    <fs name="bar"/>
  </fs>
</service>

If you structure your service this way, bar will always be started after foo but stopped before foo.

Now, the historical reason for this non-guarantee was the idea that it might be possible in the future to branch during the starting/stopping of complex services - i.e. to perform operations on multiple non-codependent resources in parallel. For example, consider a service which needs two non-codependent scripts that, although neither I/O nor CPU intensive, each take five minutes to complete:

<service>
  <script name="foo"/>
  <script name="bar"/>
</service>

We could start foo and bar simultaneously, saving about five minutes. However, the actual, *practical* use of this is very limited. More importantly, implementing this functionality would very likely be destabilizing, and it would very probably break existing start-ordering behaviors upon which, no doubt, people have already developed an expectancy. Furthermore, the practical uses of implicit ordering guarantees vastly exceed the theoretical "performance gain" which might (at some point) have been attained by starting resources in parallel.

Therefore, I think we should implement implicit ordering guarantees as described.
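A minimal sketch (not rgmanager code) of the nesting workaround's semantics: a child resource is started after its parent and stopped before it, so the stop order is always the exact reverse of the start order. The "foo"/"bar" names come from the example above; the dict-based tree is illustrative only.

```python
def start_order(node, order=None):
    """Depth-first: parent first, then children in document order."""
    if order is None:
        order = []
    order.append(node["name"])
    for child in node.get("children", []):
        start_order(child, order)
    return order

def stop_order(node):
    """Stopping is the exact reverse of starting."""
    return list(reversed(start_order(node)))

service = {"name": "foo", "children": [{"name": "bar"}]}
print(start_order(service))  # ['foo', 'bar']
print(stop_order(service))   # ['bar', 'foo']
```

This is why nesting zMuell inside mt-daten in the original report would avoid the reboot: the child mount is always taken down before its parent.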
Falk, Did you intentionally file this against RHCS 5, or was it supposed to be against RHCS4?
Wrong version - you are right... corrected now.
Ok, I know how to fix this, but it requires a surprising amount of code change to make it work correctly.
Created attachment 149651 [details] Patch.
*** Bug 231411 has been marked as a duplicate of this bug. ***
Devel ACK for 4.5.
The attached patch only ensures ordering within a given type (i.e. among file systems or among scripts). It does not fix ordering in the case where a user has mixed resource types, for example:

<fs name="a"/>
<script name="1"/>
<fs name="b"/>
<script name="2"/>

The patch only ensures that a starts before b (and the reverse on stopping), and that 1 starts before 2, but it does not ensure that a starts before 1. Addressing this requires fixing bug #232139, which is a bug in ccsd.
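A sketch of what per-type grouping guarantees and what it loses, using the example above: order within each type survives, but the interleaving between types in cluster.conf does not ("a" before "1" is no longer guaranteed). The grouping logic here is illustrative, not the actual patch.

```python
def group_by_type(children):
    """Group (type, name) pairs by type, keeping document order inside each type."""
    by_type = {}
    for typ, name in children:
        by_type.setdefault(typ, []).append(name)
    out = []
    for typ in by_type:  # dicts preserve first-seen type order (Python 3.7+)
        out += by_type[typ]
    return out

doc_order = [("fs", "a"), ("script", "1"), ("fs", "b"), ("script", "2")]
print(group_by_type(doc_order))  # ['a', 'b', '1', '2'] - interleaving lost
```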
rgmanager currently searches for children by known resource types. In order to blindly search and discover children based on the content of cluster.conf, it is required that ccsd return information even if the tag has no child nodes - which is addressed with this patch: https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=149999
Created attachment 150405 [details]
Patch

* Includes the functionality of the previous patch (attachment 149651), i.e. it preserves ordering by defined resource types. That is, it allows <fs/> children of <service> to be started in the order in which they appear in cluster.conf (and stopped in reverse order). Example:

<service>
  <ip address="10.1.1.2"/>
  <fs name="a"/>
  <script name="1"/>
  <fs name="b"/>
  <script name="2"/>
</service>

Because scripts are ordered after file systems in the service.sh meta-data (and IPs are started after fs but before script), the start order of this block becomes a, b, 10.1.1.2, 1, 2; the stop order is the reverse (2, 1, 10.1.1.2, b, a).

* Preserves the ordering of all undefined child resource types in the order they appear in cluster.conf. For example:

<service>
  <ip address="10.1.1.2">
    <fs name="a"/>
    <script name="1"/>
    <fs name="b"/>
    <script name="2"/>
  </ip>
</service>

Because "fs" and "script" are not defined children in the ip.sh meta-data, their ordering is preserved verbatim. I.e. on start: ip 10.1.1.2, fs a, script 1, fs b, script 2; exactly reversed on stop (2, b, 1, a, 10.1.1.2).
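A sketch of the two rules described above, under the assumption that the parent agent's meta-data supplies a type order (hard-coded here as fs, ip, script, per the service.sh description above): defined child types are grouped in meta-data order, undefined child types keep their verbatim cluster.conf order, and stop is the exact reverse of start.

```python
def ordered_start(children, defined_type_order):
    """Order (type, name) pairs: defined types grouped per meta-data,
    undefined types appended in verbatim document order."""
    out = []
    for typ in defined_type_order:          # defined types, meta-data order
        out += [c for c in children if c[0] == typ]
    out += [c for c in children             # undefined types, verbatim order
            if c[0] not in defined_type_order]
    return out

children = [("ip", "10.1.1.2"), ("fs", "a"), ("script", "1"),
            ("fs", "b"), ("script", "2")]

# Children of <service>: fs, ip and script are all defined in service.sh.
print([n for _, n in ordered_start(children, ["fs", "ip", "script"])])
# ['a', 'b', '10.1.1.2', '1', '2']

# Children of <ip>: fs and script are undefined, so order is verbatim.
print([n for _, n in ordered_start(children[1:], [])])
# ['a', '1', 'b', '2']
```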
fixes in CVS
Fails QA. Children of other services are started (incorrectly).
Created attachment 150702 [details] Incremental patch against 150405 which fixes incorrect start problem
Incremental patch in CVS (along with automated test cases).
Created attachment 153997 [details]
Incremental patch which fixes the following case:

<service ref="test1">
  <script ref="initscript">
    <clusterfs ref="argle"/>
  </script>
  <fs ref="mount1">
    <nfsexport ref="Dummy Export">
      <nfsclient ref="Admin group"/>
      <nfsclient ref="User group"/>
      <nfsclient ref="red"/>
    </nfsexport>
  </fs>
</service>

<service ref="test2">
  <script ref="initscript">
    <clusterfs ref="argle"/>
    <ip ref="192.168.1.3"/>
    <fs ref="mount2">
      <nfsexport ref="Dummy Export">
        <nfsclient ref="Admin group"/>
        <nfsclient ref="User group"/>
        <nfsclient ref="red"/>
      </nfsexport>
    </fs>
    <script ref="script2"/>
    <ip ref="192.168.1.4"/>
  </script>
  <script ref="script3"/>
</service>

With the current code, the clusterfs ref in the test2 service is duplicated because the old code added child types as they were found. Since the new code looks for untyped children explicitly, adding untyped children as well would cause the clusterfs resource to be duplicated.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0149.html