Bug 470917

Summary: The oracledb.sh script checks at strange intervals (10s, 5m, 4.5m)
Product: Red Hat Enterprise Linux 5
Component: rgmanager
Version: 5.4
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: rc
Keywords: OtherQA
Reporter: Shane Bradley <sbradley>
Assignee: Lon Hohberger <lhh>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint, cward, djansa, edamato, tao
Doc Type: Bug Fix
Last Closed: 2009-09-02 11:03:58 UTC

Attachments: Fix, pass 1

Description Shane Bradley 2008-11-10 20:52:02 UTC
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.2) Gecko/2008092318 Fedora/3.0.2-1.fc9 Firefox/3.0.2

The oracledb resource agent seems to be doing its listener status checks at strange intervals that are too long for the customer.  By adding logging statements to the script, they can see the get_lsnr_status function being called in a repeating pattern of 30s, 5m, 4.5m.  The problem is that if they kill the Oracle listener process, it may be up to 5 minutes before the cluster notices and attempts to restart it.
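
One way to see which check intervals the agent actually advertises to rgmanager is to dump its metadata. A minimal sketch, assuming the agent lives at the usual rgmanager path and supports the standard meta-data action:

    # Print the status/monitor actions (with their depth and interval
    # attributes) that oracledb.sh advertises; adjust the path if needed.
    /usr/share/cluster/oracledb.sh meta-data | grep -E '<action name="(status|monitor)"'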

Reproducible: Always

Steps to Reproduce:
1) Configure cluster with an oracledb resource
2) Add logger statements in oracledb.sh surrounding the get_lsnr_status call on line 808 like so:

       logger "**** LISTENER STATUS START ****"
       get_lsnr_status $subsys_lock $depth
       update_status $? $last
       last=$?
       logger "### LISTENER STATUS END ####"
3) Start rgmanager 
Actual Results:  
Repeating pattern of 30s, 5m, 4.5m between logged messages for status checks.

Expected Results:  
A smaller window between status checks.

The solution that worked for me is to change the interval; in this case I changed it to 23 seconds:
    <actions>
        <action name="start" timeout="900"/>
        <action name="stop" timeout="90"/>
        <action name="recover" timeout="990"/>

        <!-- Checks to see if it's mounted in the right place -->
        <action name="status" timeout="10"/>
        <action name="monitor" timeout="10"/>

        <!-- Checks to see if we can read from the mountpoint -->
        <action name="status" depth="10" timeout="30" interval="23"/>
        <action name="monitor" depth="10" timeout="30" interval="23"/>

        <!-- Checks to see if we can write to the mountpoint (if !ROFS) -->
        <action name="status" depth="20" timeout="90" interval="10m"/>
        <action name="monitor" depth="20" timeout="90" interval="10m"/>

        <action name="meta-data" timeout="5"/>
        <action name="verify-all" timeout="5"/>
    </actions>

    <special tag="rgmanager">
        <attributes maxinstances="1"/>
    </special>
</resource-agent>
EOT
}
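
To confirm a changed interval actually takes effect, the logger markers from step 2 above can be watched in syslog. A minimal sketch, assuming messages go to the default RHEL 5 location:

    # Watch the marker lines added around get_lsnr_status; the gap between
    # successive START lines should now be roughly the configured interval.
    tail -f /var/log/messages | grep 'LISTENER STATUS'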

Comment 1 Lon Hohberger 2008-11-10 22:14:25 UTC
The depth="20" checks should be removed.

Comment 2 Lon Hohberger 2008-12-08 18:21:03 UTC
In cluster.conf, you can avoid changing the resource-agent metadata by manually adding a special "action" child to the oracledb resource in the service:

  <oracledb ... >
    <action name="status" depth="*" timeout="30" interval="23" />
  </oracledb>

Because there is no 'depth="20"' check in the oracledb resource agent (or at least it's not used), having multiple depths is useless.  The above replaces all depths with a 23-second check interval.
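
For illustration, the override goes inside the service definition in /etc/cluster/cluster.conf. A minimal sketch; the service name and oracledb attributes below are hypothetical placeholders, not taken from this bug:

    <rm>
        <service name="oraclesvc" autostart="1">
            <!-- service and resource attributes here are example values -->
            <oracledb name="oradb" user="oracle" home="/opt/oracle">
                <action name="status" depth="*" timeout="30" interval="23"/>
            </oracledb>
        </service>
    </rm>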

Comment 3 Lon Hohberger 2008-12-09 22:07:06 UTC
Created attachment 326410 [details]
Fix, pass 1

Comment 4 Lon Hohberger 2008-12-09 22:09:07 UTC
Let me know if this helps; it does what I noted in the previous comment about deleting the extraneous "depth".

Comment 10 Chris Ward 2009-06-14 23:16:14 UTC
~~ Attention Partners - RHEL 5.4 Partner Alpha Released! ~~

RHEL 5.4 Partner Alpha has been released on partners.redhat.com. There should be a fix present that addresses this particular request. Please test and report back your results here, at your earliest convenience. Our Public Beta release is just around the corner!

If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have verified the request functions as expected, please set your Partner ID in the Partner field above to indicate successful test results. Do not flip the bug status to VERIFIED. Further questions can be directed to your Red Hat Partner Manager. Thanks!

Comment 11 Chris Ward 2009-07-03 18:13:01 UTC
~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results and, if available, update the Verified field with the appropriate value.

Questions can be posted to this bug or your customer or partner representative.

Comment 12 Chris Ward 2009-07-10 19:06:17 UTC
~~ Attention Partners - RHEL 5.4 Snapshot 1 Released! ~~

RHEL 5.4 Snapshot 1 has been released on partners.redhat.com. If you have already reported your test results, you can safely ignore this request. Otherwise, please note that there should be a fix available now that addresses this particular request. Please test and report back your results here, at your earliest convenience. The RHEL 5.4 exception freeze is quickly approaching.

If you encounter any issues while testing, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Do not flip the bug status to VERIFIED. Instead, please set your Partner ID in the Verified field above if you have successfully verified the resolution of this issue. 

Further questions can be directed to your Red Hat Partner Manager or other appropriate customer representative.

Comment 13 Chris Ward 2009-08-03 15:44:34 UTC
~~ Attention Partners - RHEL 5.4 Snapshot 5 Released! ~~

RHEL 5.4 Snapshot 5 is the FINAL snapshot to be released before RC. It has been released on partners.redhat.com. If you have already reported your test results, you can safely ignore this request. Otherwise, please note that there should be a fix available now that addresses this particular issue. Please test and report back your results here, at your earliest convenience.

If you encounter any issues while testing, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. If it is urgent, escalate the issue to your partner manager as soon as possible. There is /very/ little time left to get additional code into 5.4 before GA.

Partners, after you have verified, do not flip the bug status to VERIFIED. Instead, please set your Partner ID in the Verified field above if you have successfully verified the resolution of this issue.

Further questions can be directed to your Red Hat Partner Manager or other appropriate customer representative.

Comment 15 Chris Ward 2009-08-25 13:23:08 UTC
Shane, please update us with the latest test results for confirming the
resolution of this request. Thank you.

Comment 17 errata-xmlrpc 2009-09-02 11:03:58 UTC
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1339.html