Bug 817550 - Change oracledb.sh script so that it properly checks the status of an Oracle database
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: resource-agents
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 6.4
Assigned To: Ryan McCabe
QA Contact: Cluster QE
Docs Contact:
Depends On:
Blocks: 782183 840699
Reported: 2012-04-30 09:27 EDT by cphillip
Modified: 2013-07-11 15:26 EDT

See Also:
Fixed In Version: resource-agents-3.9.2-14.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-02-21 02:52:06 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

External Trackers
Tracker ID | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) 422563 | None | None | None | Never

Description cphillip 2012-04-30 09:27:25 EDT
Description of problem:

When the oracledb.sh script is called with the status argument, it should only check the status of Oracle and report it back to rgmanager.  It should not restart the database as part of the status function.

status_oracle calls the following function when the oracledb.sh script is invoked with the status argument.

As you can see, instead of just reporting the status as down, it restarts the database without any notification to rgmanager:

get_db_status()
{
        declare -i subsys_lock=$1
        declare -i i=0
        declare -i rv=0
        declare ora_procname

        for procname in $DB_PROCNAMES ; do

                ora_procname="ora_${procname}_${ORACLE_SID}"

                status $ora_procname
                if [ $? -eq 0 ] ; then
                        # This one's okay; go to the next one.
                        continue
                fi

                #
                # We're not supposed to be running, and we are,
                # in fact, not running...
                # XXX only works when monitoring one db process; consider
                # extending in future.
                #
                if [ $subsys_lock -ne 0 ]; then
                        return 3
                fi

                for (( i=$RESTART_RETRIES ; i; i-- )) ; do
                        # this db process is down - stop and
                        # (re)start all ora_XXXX_$ORACLE_SID processes
                        initlog -q -n $SCRIPT -s "Restarting Oracle Database..."
                        stop_db immediate
                        if [ $? != 0 ] ; then
                                # stop failed - return 1
                                return 1
                        fi

                        start_db
                        if [ $? == 0 ] ; then
                                # ora_XXXX_$ORACLE_SID processes started
                                # successfully, so break out of the
                                # stop/start # 'for' loop
                                break
                        fi
                done

                if [ $i -eq 0 ]; then
                        # stop/start's failed - return 1 (failure)
                        return 1
                fi
        done
        return 0
}

This behaviour of the oracledb.sh script when called with a status argument is the bug that needs to be fixed. 

As the code is written, Oracle could be failing and be restarted by the status check repeatedly (an unlimited number of times) with no notification to rgmanager. This means that it is not possible to accurately control the behaviour of the Oracle resource or the service based on the status of the database, and this affects options like coalesce.
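For comparison, here is a rough sketch of what a status check with no side effects could look like. It is not the code from oracledb.sh or from any shipped fix. DB_PROCNAMES and ORACLE_SID mirror the variables in the function quoted above, while proc_running, RUNNING_PROCS and get_db_status_readonly are hypothetical names made up for this illustration. The process lookup is simulated with a plain variable so the snippet can be run anywhere.

```bash
#!/bin/bash
# Sketch only: report status, never stop or start anything.

DB_PROCNAMES="pmon"
ORACLE_SID="orcl"

# Simulated process table (assumption for this demo; a real agent would
# query the system, e.g. via the init-script `status` helper).
RUNNING_PROCS="ora_pmon_orcl"

# Hypothetical helper: succeeds if the named process is in the simulated table.
proc_running()
{
        case " $RUNNING_PROCS " in
                *" $1 "*) return 0 ;;
                *)        return 1 ;;
        esac
}

# Returns 0 if every monitored process is running, 1 otherwise.
get_db_status_readonly()
{
        local procname ora_procname

        for procname in $DB_PROCNAMES ; do
                ora_procname="ora_${procname}_${ORACLE_SID}"
                if ! proc_running "$ora_procname" ; then
                        # Report the failure and leave recovery to the caller.
                        return 1
                fi
        done
        return 0
}
```

With a check shaped like this, the decision to restart, relocate or fail the service stays with rgmanager rather than being hidden inside the status call.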

This issue is solely with the oracledb.sh script. If a script resource is used instead, it will not occur, because the code that runs will not contain the restart logic.  However, it seems like overkill to rewrite the code to start, stop and monitor Oracle for what is a fairly simple issue with the existing resource script.

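The following change to the oracledb.sh resource script should fix this. Change:

declare -i      RESTART_RETRIES=3

to:

declare -i      RESTART_RETRIES=0

With RESTART_RETRIES set to 0, the stop/start loop in get_db_status is never entered, i stays at 0, and the function returns 1 so the failure is reported to rgmanager.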
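To check the effect of this setting without touching a database, the retry loop can be isolated as in the sketch below. The loop header and the final test on i are copied from get_db_status. The calls to stop_db and start_db are replaced by a counter, and retry_loop and restart_attempts are hypothetical names added for the demonstration. Whether the fix that shipped in the errata took this approach is not stated in this report.

```bash
#!/bin/bash
# Sketch only: the retry loop from get_db_status with the restart stubbed out.

declare -i RESTART_RETRIES=0
declare -i restart_attempts=0     # hypothetical counter for the demo

retry_loop()
{
        declare -i i=0

        for (( i=$RESTART_RETRIES ; i; i-- )) ; do
                # In oracledb.sh this is where stop_db and start_db run.
                # The stub never "succeeds", so there is no break here.
                restart_attempts=$((restart_attempts + 1))
        done

        if [ $i -eq 0 ]; then
                # No retries left (or none allowed): report failure.
                return 1
        fi
        return 0
}

retry_loop
echo "return code: $?, restart attempts: $restart_attempts"
```

With RESTART_RETRIES=0 the counter stays at zero and the return code is 1. Setting RESTART_RETRIES=3 and calling retry_loop again makes the stubbed restart run three times before the same failure is returned.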
Comment 7 errata-xmlrpc 2013-02-21 02:52:06 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0288.html
