Red Hat Bugzilla – Bug 462609
In a HALVM (multi-LV) RHCS 5.2 configuration, only one service relying on a given LVM/FS resource can be used
Last modified: 2009-04-16 18:38:40 EDT
Description of problem:
In a HALVM (multi-LV) configuration on a 2-node cluster, it is not possible to have two (or more) services using the same LVM/FS resources.
The first defined service will start/relocate/stop correctly.
The second and subsequent defined services will appear to start/relocate/stop properly, but the FS never gets mounted.
There is no error in /var/log/messages, and the status shows green.
Version-Release number of selected component (if applicable):
5.2 (updated as of 09/17/08)
Reproduced on-site on real cluster config (2 x rhel5.2 nodes)
Reproduced here in a lab in a xen-VM virtual cluster (2 x rhel 5.2 guests)
Steps to Reproduce:
1. Configure a HALVM multi-LV cluster
a) lvm.conf correctly configured, with the initrd rebuilt more recently than lvm.conf
b) LVM resource defined as follows (no lv_name set):
<lvm name="vgiscsi2" vg_name="vgiscsi2"/>
c) FS resource defined as follows:
<fs device="/dev/vgiscsi2/lviscsi2" force_fsck="0" force_unmount="0" fsid="26297" fstype="ext3" mountpoint="/data_halvm" name="data_halvm" self_fence="0"/>
d) Two services, identical except for their names, defined as follows:
<service autostart="0" domain="domain1" exclusive="0" name="HALVM2_fs2" recovery="restart">
<service autostart="0" domain="domain1" exclusive="0" name="HALVM_fs" recovery="restart">
Suppose HALVM2_fs2 has been defined BEFORE HALVM_fs
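Putting steps b) through d) together, the relevant <rm> section of cluster.conf would look roughly like this. This is a sketch: the exact nesting of the resource references under each service is an assumption based on a typical HALVM configuration, not a copy of the reporter's file.

```xml
<rm>
  <resources>
    <lvm name="vgiscsi2" vg_name="vgiscsi2"/>
    <fs device="/dev/vgiscsi2/lviscsi2" force_fsck="0" force_unmount="0"
        fsid="26297" fstype="ext3" mountpoint="/data_halvm"
        name="data_halvm" self_fence="0"/>
  </resources>
  <!-- Both services reference the SAME lvm and fs resources. -->
  <service autostart="0" domain="domain1" exclusive="0"
           name="HALVM2_fs2" recovery="restart">
    <lvm ref="vgiscsi2">
      <fs ref="data_halvm"/>
    </lvm>
  </service>
  <service autostart="0" domain="domain1" exclusive="0"
           name="HALVM_fs" recovery="restart">
    <lvm ref="vgiscsi2">
      <fs ref="data_halvm"/>
    </lvm>
  </service>
</rm>
```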
2. Start HALVM2_fs2 service
- The service starts and the filesystem (/data_halvm) gets mounted
- /var/log/messages displays
Sep 17 16:13:06 guest127 clurgmgrd: <notice> Starting disabled service service:HALVM2_fs2
Sep 17 16:13:07 guest127 kernel: kjournald starting. Commit interval 5 seconds
Sep 17 16:13:07 guest127 kernel: EXT3 FS on dm-0, internal journal
Sep 17 16:13:07 guest127 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Sep 17 16:13:07 guest127 clurgmgrd: <notice> Service service:HALVM2_fs2 started
Sep 17 16:13:10 guest127 clurgmgrd: : <notice> Getting status
3. Stop HALVM2_fs2 service
Sep 17 16:13:24 guest127 clurgmgrd: <notice> Stopping service service:HALVM2_fs2
Sep 17 16:13:25 guest127 clurgmgrd: <notice> Service service:HALVM2_fs2 is disabled
4. Start HALVM_fs service
- The command succeeds; the status of the HALVM_fs service in Luci turns green.
- The service starts but the filesystem does NOT get mounted:
Sep 17 16:13:42 guest127 clurgmgrd: <notice> Starting disabled service service:HALVM_fs
Sep 17 16:13:43 guest127 clurgmgrd: <notice> Service service:HALVM_fs started
==> note that no "kernel: EXT3-fs" mount messages appear this time
5. Stop HALVM_fs Service and Start HALVM2_fs2
HALVM_fs stops correctly
HALVM2_fs2 starts correctly and filesystem is mounted.
HALVM2_fs2 will always be the functional service (unless it is deleted).
6. Stop the HALVM2_fs2 service, then delete it.
Start the HALVM_fs service: it now starts, and the filesystem gets mounted.
Only one service can use a given LVM/FS combination and be functional.
If multiple services are defined, only the first one defined functions as intended.
If that first service is deleted, the remaining one becomes functional.
It should be possible to define N services using the same LVM/FS combination and use any ONE of them at a time, as needed.
This is by design, actually, but we'll call it a bug. The second reference to the fs resource is dropped at configuration time. If you look at the output of "rg_test test /etc/cluster/cluster.conf", it will say something like:
Warning: Max references exceeded for resource data_halvm (type fs)
This is because the states of individual resources are not stored in shared state, only the status of the service as a whole is.
This is something we can fix with Pacemaker in a future release of RHEL (Pacemaker stores the state of everything), but fixing it within rgmanager would take a fair bit of work. We would need to either:
* distribute states of all resource instances cluster-wide (and add reference counts to some unused portion of the state structure), or
* check for conflicts at run (e.g. start) time - i.e. if a running service holds a reference to a resource that we also reference, we fail to start.
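The second option amounts to a simple reference check at start time. A minimal Python sketch of the idea follows; none of these names exist in rgmanager, and the resource-reference strings are invented for illustration:

```python
# Hypothetical sketch of the run-time conflict check described above.
# This only illustrates the idea; it is not rgmanager code.

def can_start(service, running_services, refs):
    """Refuse to start `service` if any already-running service
    holds a reference to a resource that `service` also references."""
    wanted = refs[service]               # resources this service references
    for other in running_services:
        if refs[other] & wanted:         # shared resource -> conflict
            return False
    return True

# Both services reference the same lvm and fs resources,
# mirroring the configuration in this bug report.
refs = {
    "HALVM2_fs2": {"lvm:vgiscsi2", "fs:data_halvm"},
    "HALVM_fs":   {"lvm:vgiscsi2", "fs:data_halvm"},
}

# With nothing running, HALVM_fs may start:
print(can_start("HALVM_fs", set(), refs))            # True
# With HALVM2_fs2 already running, HALVM_fs must fail to start:
print(can_start("HALVM_fs", {"HALVM2_fs2"}, refs))   # False
```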
If you have a really good use case where this behavior is specifically required, please add it; otherwise this will be closed NOTABUG.
You can always use central_processing and define three services:
* one LVM
* one service-A
* one service-B
... make service-A and service-B depend on LVM, then add special policies (e.g. write some of your own) to ensure service-A and service-B run only on the same node as LVM, if running, but can never run together. Be creative :>
Corner cases like this are why we added central_processing.
We can work on adding an example to the event scripting interface that achieves the desired behavior using 3 services.
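A rough sketch of what the three-service layout might look like in cluster.conf follows. This is an assumption-laden illustration, not a tested configuration: the depend/depend_mode attributes are standard rgmanager service attributes, but the anti-collocation policy keeping service-A and service-B apart still has to be written as a central_processing event script, which is not shown here.

```xml
<rm central_processing="1">
  <!-- The shared storage lives in its own service. -->
  <service autostart="0" domain="domain1" name="LVM" recovery="restart">
    <lvm name="vgiscsi2" vg_name="vgiscsi2">
      <fs device="/dev/vgiscsi2/lviscsi2" fstype="ext3"
          mountpoint="/data_halvm" name="data_halvm"/>
    </lvm>
  </service>
  <!-- Both depend on the LVM service (hard: they stop if LVM stops).
       A custom event-script policy must additionally ensure they run
       on the same node as LVM and never at the same time. -->
  <service autostart="0" name="service-A"
           depend="service:LVM" depend_mode="hard"/>
  <service autostart="0" name="service-B"
           depend="service:LVM" depend_mode="hard"/>
</rm>
```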