Bug 224462
Summary: | clurgmgrd claim "service started" but it is not | ||
---|---|---|---|
Product: | [Retired] Red Hat Cluster Suite | Reporter: | Roger Pena-Escobio <orkcu> |
Component: | rgmanager | Assignee: | Lon Hohberger <lhh> |
Status: | CLOSED NOTABUG | QA Contact: | Cluster QE <mspqa-list> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4 | CC: | cluster-maint, tmarshal |
Hardware: | i386 | ||
OS: | Linux | ||
Fixed In Version: | Doc Type: | Bug Fix | |
Last Closed: | 2007-01-26 17:58:54 UTC | Type: | --- |
Description
Roger Pena-Escobio
2007-01-25 20:04:01 UTC
Quite the interesting configuration there :)

Ok, to start, have a look at:

    # rg_test test /etc/cluster/cluster.conf
    Unique/primary not unique type clusterfs, name=WWWData
    Error storing clusterfs resource
    Unique/primary not unique type clusterfs, name=WWWSoft
    Error storing clusterfs resource
    ...

When rgmanager detects collisions between attributes of a resource type that are required to be unique across that resource type, it stops parsing that branch of the tree (so your references to scripts in the apache26 service are not even present in the service trees that rgmanager constructs internally; see the bottom of the output of rg_test).

The apache26 service has two resource collisions with the apache25 service:

(1) WWWData is defined twice, with basically identical components (except fsid, which does not affect your configuration). You should put this one in your <resources> block and pass it by reference (like you did with scripts).

(2) WWWSoft is defined twice with a different device, but the same mount point, causing a naming & mount point collision. You need to rename one to something else to resolve the naming collision.

The mount point is also the same, and that must be unique. However, you can make it not required to be unique by tweaking the metadata in /usr/share/cluster/clusterfs.sh:

* set "unique" to "0" for the "mountpoint" parameter.
* restart rgmanager on both nodes

Most users should *not* do this, but in your case, it looks safe to do (since the two services will never coexist on the same node due to restricted failover domains).

Warning: do not change the primary attribute ("name", in most cases), or you will probably break stuff.
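To make the collision rule concrete, here is a rough Python sketch of the kind of check described above. It is not rgmanager's actual code: the `UNIQUE_ATTRS` table, the `find_collisions` helper and the sample `<rm>` fragment are assumptions made up for illustration (the fragment only mimics the duplicate WWWData/WWWSoft definitions discussed here, not the attached configuration).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumption: attributes treated as unique per resource type. This thread only
# confirms "name" (primary) and "mountpoint" for clusterfs.
UNIQUE_ATTRS = {"clusterfs": ("name", "mountpoint")}


def find_collisions(rm_xml, unique_attrs=UNIQUE_ATTRS):
    """Return sorted (type, attribute, value) tuples that are defined more than once."""
    root = ET.fromstring(rm_xml)
    seen = defaultdict(int)
    for elem in root.iter():
        # References such as <clusterfs ref="WWWData"/> reuse an existing
        # definition, so they are not counted as new definitions.
        if elem.tag not in unique_attrs or "ref" in elem.attrib:
            continue
        for attr in unique_attrs[elem.tag]:
            value = elem.get(attr)
            if value is not None:
                seen[(elem.tag, attr, value)] += 1
    return sorted(key for key, count in seen.items() if count > 1)


# Hypothetical fragment: both services define the same two clusterfs resources inline.
SAMPLE = """
<rm>
  <service name="apache25">
    <clusterfs device="/dev/emcpowerd1" mountpoint="/opt/www" name="WWWData"/>
    <clusterfs device="/dev/emcpowera1" mountpoint="/opt/soft" name="WWWSoft"/>
  </service>
  <service name="apache26">
    <clusterfs device="/dev/emcpowerd1" mountpoint="/opt/www" name="WWWData"/>
    <clusterfs device="/dev/emcpowerb1" mountpoint="/opt/soft" name="WWWSoft"/>
  </service>
</rm>
"""

for rtype, attr, value in find_collisions(SAMPLE):
    print(f"collision: type={rtype} {attr}={value}")
```

For the sample fragment this reports both the name collisions and the mount point collisions. Moving a shared definition into <resources> and using ref= in the services makes the corresponding entries disappear.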
Anyway, if you change the 'unique' flag of the 'mountpoint' parameter to 0 in /usr/share/cluster/clusterfs.sh, and restart rgmanager, the following configuration should work:

    <rm>
      <failoverdomains>
        <failoverdomain name="mysql" ordered="0" restricted="1">
          <failoverdomainnode name="blade21" priority="1"/>
          <failoverdomainnode name="blade22" priority="1"/>
        </failoverdomain>
        <failoverdomain name="apache25" ordered="0" restricted="1">
          <failoverdomainnode name="blade25" priority="1"/>
        </failoverdomain>
        <failoverdomain name="apache26" ordered="0" restricted="1">
          <failoverdomainnode name="blade26" priority="1"/>
        </failoverdomain>
        <failoverdomain name="ftp" ordered="0" restricted="1">
          <failoverdomainnode name="blade25" priority="1"/>
          <failoverdomainnode name="blade26" priority="1"/>
        </failoverdomain>
      </failoverdomains>
      <resources>
        <script file="/etc/init.d/httpd" name="apache start-stop"/>
        <script file="/etc/init.d/vsftpd" name="vsftpd"/>
        <clusterfs device="/dev/emcpowerd1" force_unmount="0" fsid="41107" fstype="gfs" mountpoint="/opt/www" name="WWWData" options=""/>
        </clusterfs>
      </resources>
      <service autostart="1" domain="mysql" name="mysqld" recovery="restart">
        <fs device="/dev/mapper/MysqlData-VarLibMysql" force_fsck="0" force_unmount="1" fsid="30618" fstype="ext3" mountpoint="/var/lib/mysql" name="MysqlData" options="" self_fence="1"/>
        <ip address="172.17.0.123" monitor_link="1"/>
        <script file="/etc/init.d/mysqld" name="mysql start-stop"/>
      </service>
      <service autostart="1" domain="apache25" name="apache25">
        <clusterfs ref="WWWData"/>
        <clusterfs device="/dev/emcpowera1" force_unmount="0" fsid="30342" fstype="gfs" mountpoint="/opt/soft" name="WWWSoft1" options=""/>
        <script ref="vsftpd"/>
        <script ref="apache start-stop"/>
      </service>
      <service autostart="1" domain="apache26" name="apache26">
        <clusterfs ref="WWWData"/>
        <clusterfs device="/dev/emcpowerb1" force_unmount="0" fsid="30343" fstype="gfs" mountpoint="/opt/soft" name="WWWSoft2" options=""/>
        <script ref="vsftpd"/>
        <script ref="apache start-stop"/>
      </service>
    </rm>

Now, if you don't change /usr/share/cluster/clusterfs.sh, you'll have to change the mount point and make the scripts for apache context-sensitive. With the above configuration, you can do this by checking "OCF_RESKEY_service_name" in the script and starting apache with a different config based on it. For example (untested; the idea is that it starts httpd based on the service it's part of, and uses /etc/httpd/conf/httpd-<service_name>.conf):

    --- /etc/init.d/httpd.old 2007-01-26 12:08:59.000000000 -0500
    +++ /etc/init.d/httpd 2007-01-26 12:10:33.000000000 -0500
    @@ -57,6 +57,9 @@
     # when not running is also a failure. So we just do it the way init scripts
     # are expected to behave here.
     start() {
    +        if [ "$OCF_RESKEY_service_name" ]; then
    +                OPTIONS="$OPTIONS -f /etc/httpd/conf/httpd-${OCF_RESKEY_service_name}.conf"
    +        fi
             echo -n $"Starting $prog: "
             check13 || exit 1
             LANG=$HTTPD_LANG daemon $httpd $OPTIONS

If you choose to do it this way, WWWSoft1 and WWWSoft2 in the above example configuration will need different mount points (/opt/soft1 and /opt/soft2, for example), and /etc/httpd/conf/httpd-apache25.conf and httpd-apache26.conf will need whatever is pointing at /opt/soft set accordingly.

While you get things up and running, I will investigate the possibility of allowing non-primary (but unique) namespace collisions across disjoint restricted failover domains. This will not be solved overnight, mind you (and may fall into the realm of the dependency code we're working on).

Generally, you should always design your services as though they can coexist, unless there is a device disconnect between the nodes (e.g., /dev/emcpowera1 is not connected to blade25 and /dev/emcpowerb1 is not connected to blade26).

Oh, the above configuration has an extraneous "</clusterfs>" thing in the <resources> section. Remove it before use ;)

Created attachment 146689 [details]
Original configuration
Created attachment 146690 [details]
rg_test output of original configuration
Created attachment 146691 [details]
altered configuration
Created attachment 146692 [details]
rg_test output of new configuration, clusterfs.sh not modified yet
Created attachment 146693 [details]
rg_test output of new configuration, clusterfs.sh modified to set mountpoint unique="0"
Created attachment 146694 [details]
clusterfs.sh with unique for mountpoint set to 0
[Note: from RHEL5 branch, but should work on RHEL4]
I've filed a separate bugzilla feature request to allow reuse of "unique" attributes if the services will never collide, and to add syslog logging (rather than just "printf") for when resource collisions occur:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=224608

The current behavior concerning resource collisions is not a bug, but it may be possible to extend that behavior as described previously (and in the bugzilla noted above).

Additionally, the collisions might be something we can check for in the GUIs (system-config-cluster and Conga), so that this does not quietly hit other users.
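As a rough illustration of what "will never collide" could mean for such a check, the Python sketch below treats two services as unable to coexist only when both sit in restricted failover domains that share no node. This is an assumption about how the test might work, not a description of rgmanager or of the feature request; `domain_nodes`, `can_coexist` and the "ftp" service in the sample are hypothetical (this thread defines an "ftp" domain but no service that uses it).

```python
import xml.etree.ElementTree as ET


def domain_nodes(rm, domain_name):
    """Return (restricted, set of node names) for a failover domain, or None if absent."""
    if domain_name is None:
        return None
    for dom in rm.iter("failoverdomain"):
        if dom.get("name") == domain_name:
            nodes = {n.get("name") for n in dom.iter("failoverdomainnode")}
            return dom.get("restricted") == "1", nodes
    return None


def can_coexist(rm_xml, service_a, service_b):
    """False only if both services are in restricted domains with no common node."""
    rm = ET.fromstring(rm_xml)
    domains = {s.get("name"): s.get("domain") for s in rm.iter("service")}
    info_a = domain_nodes(rm, domains.get(service_a))
    info_b = domain_nodes(rm, domains.get(service_b))
    if info_a is None or info_b is None:
        return True  # no usable domain: conservatively assume it can run anywhere
    (restricted_a, nodes_a), (restricted_b, nodes_b) = info_a, info_b
    if not (restricted_a and restricted_b):
        return True  # assumption: an unrestricted domain does not pin the service
    return bool(nodes_a & nodes_b)


# Failover domains copied from the configuration in this thread; services trimmed.
RM = """
<rm>
  <failoverdomains>
    <failoverdomain name="apache25" ordered="0" restricted="1">
      <failoverdomainnode name="blade25" priority="1"/>
    </failoverdomain>
    <failoverdomain name="apache26" ordered="0" restricted="1">
      <failoverdomainnode name="blade26" priority="1"/>
    </failoverdomain>
    <failoverdomain name="ftp" ordered="0" restricted="1">
      <failoverdomainnode name="blade25" priority="1"/>
      <failoverdomainnode name="blade26" priority="1"/>
    </failoverdomain>
  </failoverdomains>
  <service autostart="1" domain="apache25" name="apache25"/>
  <service autostart="1" domain="apache26" name="apache26"/>
  <service autostart="1" domain="ftp" name="ftp"/>
</rm>
"""

print(can_coexist(RM, "apache25", "apache26"))  # False: disjoint restricted domains
print(can_coexist(RM, "apache25", "ftp"))       # True: both may run on blade25
```

Only when such a check returns False would sharing a non-primary "unique" attribute like mountpoint between the two services be safe under this reasoning.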