Bug 817550
| Summary: | Change oracledb.sh script so that it properly checks the status of an Oracle database | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | cphillip |
| Component: | resource-agents | Assignee: | Ryan McCabe <rmccabe> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 6.4 | CC: | agk, bugzilla, cfeist, cluster-maint, djansa, lhh, mjuricek, sbradley |
| Target Milestone: | rc | | |
| Target Release: | 6.4 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | resource-agents-3.9.2-14.el6 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-02-21 07:52:06 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 782183, 840699 | | |
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-0288.html
Description of problem:

When called with a status argument, the oracledb.sh script should only check the status of Oracle and report it back to rgmanager; it should not restart the database as part of the status check. The following function is called by status_oracle when oracledb.sh is invoked with a status argument. As you can see, instead of simply reporting the status as down, it restarts the database without any notification to rgmanager:

```
get_db_status()
{
	declare -i subsys_lock=$1
	declare -i i=0
	declare -i rv=0
	declare ora_procname

	for procname in $DB_PROCNAMES ; do
		ora_procname="ora_${procname}_${ORACLE_SID}"

		status $ora_procname
		if [ $? -eq 0 ] ; then
			# This one's okay; go to the next one.
			continue
		fi

		#
		# We're not supposed to be running, and we are,
		# in fact, not running...
		# XXX only works when monitoring one db process; consider
		# extending in future.
		#
		if [ $subsys_lock -ne 0 ]; then
			return 3
		fi

		for (( i=$RESTART_RETRIES ; i; i-- )) ; do
			# this db process is down - stop and
			# (re)start all ora_XXXX_$ORACLE_SID processes
			initlog -q -n $SCRIPT -s "Restarting Oracle Database..."
			stop_db immediate
			if [ $? != 0 ] ; then
				# stop failed - return 1
				return 1
			fi

			start_db
			if [ $? == 0 ] ; then
				# ora_XXXX_$ORACLE_SID processes started
				# successfully, so break out of the
				# stop/start 'for' loop
				break
			fi
		done

		if [ $i -eq 0 ]; then
			# stop/start's failed - return 1 (failure)
			return 1
		fi
	done

	return 0
}
```

This behaviour of the oracledb.sh script when called with a status argument is the bug that needs to be fixed. As the code is written, a failing Oracle instance can be restarted by the status checks repeatedly (an unlimited number of times) with no notification to rgmanager. This makes it impossible to accurately control the behaviour of the Oracle resource, or of the service, based on the status of the database, and it affects options such as coalesce. The issue is solely with the oracledb.sh script; if a script resource is used instead, the problem does not occur because the code that runs does not contain it.

Rewriting the code that starts, stops, and monitors Oracle seems like overkill for what is a fairly simple issue with the existing resource script. The following change to the oracledb.sh resource script should fix it: change `declare -i RESTART_RETRIES=3` to `declare -i RESTART_RETRIES=0`. With the retry count set to 0, the restart loop in get_db_status() never executes, and the subsequent `[ $i -eq 0 ]` test returns 1, so a down database is reported back to rgmanager as a failure instead of being restarted silently.
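For illustration, here is a minimal sketch of what a status-only check could look like. It is not the shipped fix, and the get_db_status_only name is hypothetical; the sketch assumes the DB_PROCNAMES and ORACLE_SID variables and the status() helper (from /etc/init.d/functions) that oracledb.sh already uses. It only reports whether the ora_* background processes are running and leaves all recovery decisions to rgmanager:

```
# Minimal sketch (not the shipped fix): check the Oracle background
# processes and report their state without ever attempting a restart.
# Assumes $DB_PROCNAMES, $ORACLE_SID and the status() helper from
# /etc/init.d/functions, as used elsewhere in oracledb.sh.
get_db_status_only() {
	declare ora_procname

	for procname in $DB_PROCNAMES ; do
		ora_procname="ora_${procname}_${ORACLE_SID}"

		status $ora_procname
		if [ $? -ne 0 ]; then
			# Process is down: report failure to the caller so
			# rgmanager can decide how to recover the service.
			return 1
		fi
	done

	# All monitored ora_* processes are running.
	return 0
}
```

A status path along these lines keeps the monitor operation free of side effects, which is the same outcome the RESTART_RETRIES=0 change achieves without restructuring the existing function.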