The oracledb resource agent seems to be doing its listener status check at strange intervals, which are too long for the customer. By adding logging statements to the script, they can see the get_lsnr_status function called in a repeating pattern of 30s, 5m, 4.5m. The problem is that if they kill the oracle listener process, it may be up to 5 minutes before the cluster notices and attempts to restart it.

Reproducible: Always

Steps to Reproduce:
1) Configure the cluster with an oracledb resource
2) Add logger statements in oracledb.sh surrounding the get_lsnr_status call on line 808, like so:

    logger "**** LISTENER STATUS START ****"
    get_lsnr_status $subsys_lock $depth
    update_status $? $last
    last=$?
    logger "#### LISTENER STATUS END ####"

3) Start rgmanager

Actual Results: A repeating pattern of 30s, 5m, 4.5m between logged messages for status checks.

Expected Results: A smaller window between status checks.

The solution that worked for me is to change the interval; in this case I changed it to 23:

    <actions>
        <action name="start" timeout="900"/>
        <action name="stop" timeout="90"/>
        <action name="recover" timeout="990"/>

        <!-- Checks to see if it's mounted in the right place -->
        <action name="status" timeout="10"/>
        <action name="monitor" timeout="10"/>

        <!-- Checks to see if we can read from the mountpoint -->
        <action name="status" depth="10" timeout="30" interval="23"/>
        <action name="monitor" depth="10" timeout="30" interval="23"/>

        <!-- Checks to see if we can write to the mountpoint (if !ROFS) -->
        <action name="status" depth="20" timeout="90" interval="10m"/>
        <action name="monitor" depth="20" timeout="90" interval="10m"/>

        <action name="meta-data" timeout="5"/>
        <action name="verify-all" timeout="5"/>
    </actions>

    <special tag="rgmanager">
        <attributes maxinstances="1"/>
    </special>
    </resource-agent>
    EOT
    }
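To measure the gaps between status checks without a stopwatch, the timestamps of the logged START markers can be diffed directly from syslog. The log lines below are fabricated to match the reported 30s/5m/4.5m pattern; in practice you would feed the awk pipeline your real /var/log/messages:

```shell
#!/bin/sh
# Fabricated sample of the logger output added in step 2 above;
# timestamps are chosen to reproduce the reported pattern.
cat > /tmp/sample.log <<'EOF'
Jan  1 10:00:00 node1 logger: **** LISTENER STATUS START ****
Jan  1 10:00:30 node1 logger: **** LISTENER STATUS START ****
Jan  1 10:05:30 node1 logger: **** LISTENER STATUS START ****
Jan  1 10:10:00 node1 logger: **** LISTENER STATUS START ****
EOF

# Print the number of seconds between successive status checks.
# $3 is the syslog HH:MM:SS field; the first line has no predecessor.
grep 'LISTENER STATUS START' /tmp/sample.log |
awk '{
    split($3, t, ":")
    now = t[1]*3600 + t[2]*60 + t[3]
    if (prev != "") printf "%ds\n", now - prev
    prev = now
}'
# Prints: 30s, 300s, 270s (i.e. 30s, 5m, 4.5m)
```

On a real cluster, replace /tmp/sample.log with /var/log/messages to confirm how far apart the checks actually run.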
The depth="20" checks should be removed.
In cluster.conf, you can work around this without changing the resource-agent metadata by manually adding a special "action" child to the oracledb resource in the service:

    <oracledb ... >
        <action name="status" depth="*" timeout="30" interval="23"/>
    </oracledb>

Because there is no 'depth="20"' in the oracledb resource agent (or at least it is not used), having multiple depths is useless. The above sets all depths to a 23-second check interval.
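For context, the override above goes inside the service definition in /etc/cluster/cluster.conf. A minimal sketch follows; the service name, resource name, and oracledb attributes shown are placeholders, so keep your existing resource attributes and add only the action child:

```xml
<rm>
  <service autostart="1" name="oradb_svc">
    <!-- "oradb1" and the attribute values are hypothetical examples;
         only the nested <action> element is the actual workaround. -->
    <oracledb name="oradb1" user="oracle" home="/opt/oracle">
      <action name="status" depth="*" timeout="30" interval="23"/>
    </oracledb>
  </service>
</rm>
```

After editing cluster.conf, propagate the new configuration version to the cluster so rgmanager picks up the overridden check interval.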
Created attachment 326410 [details] Fix, pass 1
Let me know if this helps; it does what I noted in the previous comment about deleting the extraneous "depth".
http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=222230d643cd293187eebf85a5d5f79368eb63e4
~~ Attention Partners RHEL 5.4 Partner Alpha Released! ~~ RHEL 5.4 Partner Alpha has been released on partners.redhat.com. There should be a fix present that addresses this particular request. Please test and report back your results here, at your earliest convenience. Our Public Beta release is just around the corner! If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have verified the request functions as expected, please set your Partner ID in the Partner field above to indicate successful test results. Do not flip the bug status to VERIFIED. Further questions can be directed to your Red Hat Partner Manager. Thanks!
~~ Attention - RHEL 5.4 Beta Released! ~~ RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner! If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value. Questions can be posted to this bug or your customer or partner representative.
~~ Attention Partners - RHEL 5.4 Snapshot 1 Released! ~~ RHEL 5.4 Snapshot 1 has been released on partners.redhat.com. If you have already reported your test results, you can safely ignore this request. Otherwise, please notice that there should be a fix available now that addresses this particular request. Please test and report back your results here, at your earliest convenience. The RHEL 5.4 exception freeze is quickly approaching. If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. Do not flip the bug status to VERIFIED. Instead, please set your Partner ID in the Verified field above if you have successfully verified the resolution of this issue. Further questions can be directed to your Red Hat Partner Manager or other appropriate customer representative.
~~ Attention Partners - RHEL 5.4 Snapshot 5 Released! ~~ RHEL 5.4 Snapshot 5 is the FINAL snapshot to be released before RC. It has been released on partners.redhat.com. If you have already reported your test results, you can safely ignore this request. Otherwise, please note that there should be a fix available now that addresses this particular issue. Please test and report back your results here, at your earliest convenience. If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. If it is urgent, escalate the issue to your partner manager as soon as possible. There is /very/ little time left to get additional code into 5.4 before GA. Partners, after you have verified, do not flip the bug status to VERIFIED. Instead, please set your Partner ID in the Verified field above if you have successfully verified the resolution of this issue. Further questions can be directed to your Red Hat Partner Manager or other appropriate customer representative.
Shane, please update us with the latest test results for confirming the resolution of this request. Thank you.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1339.html