Bug 1318985 - Oracle resource agent unable to start because of ORA-01081
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: resource-agents
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Oyvind Albrigtsen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On: 1318240
Blocks:
Reported: 2016-03-18 10:25 UTC by Oyvind Albrigtsen
Modified: 2020-12-11 12:07 UTC

Fixed In Version: resource-agents-3.9.5-69.el7
Doc Type: Bug Fix
Doc Text:
Clone Of: 1318240
Environment:
Last Closed: 2016-11-04 00:02:08 UTC
Target Upstream Version:



Links
System: Red Hat Product Errata
ID: RHBA-2016:2174
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: resource-agents bug fix and enhancement update
Last Updated: 2016-11-03 13:16:36 UTC

Description Oyvind Albrigtsen 2016-03-18 10:25:35 UTC
+++ This bug was initially created as a clone of Bug #1318240 +++

Description of problem:
In some situations the resource agent cannot start Oracle and fails with the following error:
INFO: ORA-01081 error found, trying to cleanup oracle (dbstart_mount output:
ORA-01081: cannot start already-running ORACLE - shut it down first)

Version-Release number of selected component (if applicable):
resource-agents-3.9.5-34.el6.x86_64

How reproducible: Always, once the instance is in that state

Steps to Reproduce:
1. pcs resource debug-start oracle

Actual results: Oracle will never start.

Expected results: Oracle starts happily.

Additional info:

I believe this happens during recovery tests, after all ora_* processes have
been killed and the resource agent tries to start Oracle on another node.

To let the resource agent start Oracle again, it is necessary to issue
'shutdown immediate;' in sqlplus.
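
The manual workaround can be scripted roughly as follows. This is only a sketch and not what the resource agent itself does: the function name `oracle_shutdown_immediate` and the `SQLPLUS` override variable are hypothetical, and it assumes the script runs as the Oracle OS user with `ORACLE_HOME` and `PATH` already set and OS authentication (`/ as sysdba`) permitted.

```shell
#!/bin/sh
# Hypothetical helper: issue 'shutdown immediate;' for one instance via sqlplus.
# Assumptions: run as the Oracle OS user, ORACLE_HOME and PATH already set,
# and OS authentication ("/ as sysdba") permitted.
oracle_shutdown_immediate() {
    sid="$1"
    if [ -z "$sid" ]; then
        echo "usage: oracle_shutdown_immediate <SID>" >&2
        return 2
    fi
    # SQLPLUS is a hypothetical override so the binary can be swapped out,
    # for example with a stub when trying the function without a database.
    ORACLE_SID="$sid" "${SQLPLUS:-sqlplus}" -S "/ as sysdba" <<'EOF'
shutdown immediate;
exit;
EOF
}
```

With the SID from this report the call would be `oracle_shutdown_immediate oradb`, followed by another `pcs resource debug-start oracle`.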

This is how the problem manifests:

# pcs resource debug-start oracle
Error performing operation: Operation not permitted
Operation start for oracle (ocf:heartbeat:oracle) returned 1
 >  stderr: INFO: ORA-01081 error found, trying to cleanup oracle (dbstart_mount output: ORA-01081: cannot start already-running ORACLE - shut it down first)
 >  stderr: ls: cannot access /u01/app/oracle/product/12.1.0/dbhome_1/dbs/lk*: No such file or directory
 >  stderr: ERROR: oracle oradb can not be mounted (status: OPEN)
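
When triaging logs, the stderr above can be checked for this specific error with a small filter. This is a minimal sketch for log inspection, not the agent's own detection logic; `has_ora_error` is a hypothetical name, and the sample string is copied from the output in this report.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the text on stdin mentions the
# given ORA- error code, e.g. 01081.
has_ora_error() {
    code="$1"
    grep -q -- "ORA-${code}"
}

# Sample taken from the stderr shown in this report.
sample='INFO: ORA-01081 error found, trying to cleanup oracle (dbstart_mount output: ORA-01081: cannot start already-running ORACLE - shut it down first)'

if printf '%s\n' "$sample" | has_ora_error 01081; then
    echo "instance needs cleanup before start"
fi
```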

--- Additional comment from Oyvind Albrigtsen on 2016-03-18 11:18:16 CET ---

A tested and verified patch is available upstream: https://github.com/ClusterLabs/resource-agents/pull/783

Comment 2 Mike McCune 2016-03-28 23:14:23 UTC
This bug was accidentally moved from POST to MODIFIED by an error in automation. Please contact mmccune with any questions.

Comment 4 michal novacek 2016-08-01 11:48:43 UTC
I have verified that, with resource-agents-3.9.5-79.el7.x86_64, the oracle
resource agent starts correctly after all of its processes have been killed.

-----

setup:

# pcs resource show oracle 
 Resource: oracle (class=ocf provider=heartbeat type=oracle)
  Attributes: sid=oradb monprofile=mprofile
  Operations: start interval=0s timeout=120 (oracle-start-interval-0s)
              stop interval=0s timeout=120 (oracle-stop-interval-0s)
              monitor interval=30s (oracle-monitor-interval-30s)

# pcs resource show ora-group
 Group: ora-group
  Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=10.34.70.97 cidr_netmask=23
   Operations: start interval=0s timeout=20s (vip-start-interval-0s)
               stop interval=0s timeout=20s (vip-stop-interval-0s)
               monitor interval=30s (vip-monitor-interval-30s)
  Resource: halvm (class=ocf provider=heartbeat type=LVM)
   Attributes: exclusive=true partial_activation=false volgrpname=shared
   Operations: start interval=0s timeout=30 (halvm-start-interval-0s)
               stop interval=0s timeout=30 (halvm-stop-interval-0s)
               monitor interval=10 timeout=30 (halvm-monitor-interval-10)
  Resource: fs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/shared/shared0 directory=/u01 fstype=ext4 options=
   Operations: start interval=0s timeout=60 (fs-start-interval-0s)
               stop interval=0s timeout=60 (fs-stop-interval-0s)
               monitor interval=30s (fs-monitor-interval-30s)
  Resource: oracle (class=ocf provider=heartbeat type=oracle)
   Attributes: sid=oradb monprofile=mprofile
   Operations: start interval=0s timeout=120 (oracle-start-interval-0s)
               stop interval=0s timeout=120 (oracle-stop-interval-0s)
               monitor interval=30s (oracle-monitor-interval-30s)

# pcs resource | grep oracle
     oracle     (ocf::heartbeat:oracle):        Started light-02.cluster-qe.lab.eng.brq.redhat.com

before the fix (resource-agents-3.9.5-52.el7.x86_64)
====================================================

# pkill -9 ora_ && sleep 30

# pcs resource debug-monitor oracle
Error performing operation: Argument list too long
Operation monitor for oracle (ocf:heartbeat:oracle) returned 7
 >  stderr: INFO: oracle process not running
#
# sleep 60
# pcs resource | grep oracle
     oracle     (ocf::heartbeat:oracle):        Stopped

# pcs resource debug-start oracle
Error performing operation: Operation not permitted
Operation start for oracle (ocf:heartbeat:oracle) returned 1
 >  stderr: INFO: ORA-01081 error found, trying to cleanup oracle \
 (dbstart_mount output: ORA-01081: cannot start already-running ORACLE - shut \
 it down first)
 >  stderr: ls: cannot access /u01/app/oracle/product/12.1.0/dbhome_1/dbs/lk*: \
 No such file or directory
 >  stderr: ERROR: oracle oradb can not be mounted (status: OPEN)
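
A note on reading the transcript: the "returned N" values are OCF exit codes from the agent, while the "Error performing operation" text appears to be the errno message for the same number (on Linux, 1 is EPERM and 7 is E2BIG), so it can be misleading. The sketch below maps the standard OCF codes to their symbolic names; `ocf_rc_name` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: map an OCF resource agent exit code to its name.
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        8) echo OCF_RUNNING_MASTER ;;
        9) echo OCF_FAILED_MASTER ;;
        *) echo "UNKNOWN($1)" ;;
    esac
}
```

Read this way, the monitor result of 7 means the agent found Oracle not running, and the start result of 1 is a generic failure.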


fixed version (resource-agents-3.9.5-79.el7.x86_64)
===================================================

# pkill -9 ora_ && sleep 30s #monitor interval

# pcs resource debug-monitor oracle
Error performing operation: Argument list too long
Operation monitor for oracle (ocf:heartbeat:oracle) returned 7
 >  stderr: INFO: oracle process not running

# sleep 60
# pcs resource | grep oracle
     oracle     (ocf::heartbeat:oracle):        Started light-02.cluster-qe.lab.eng.brq.redhat.com
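
The before/after comparison comes down to the state column that `pcs resource` prints for the oracle resource. A rough way to extract it in a script is sketched below; `resource_state` is a hypothetical helper, and it assumes the one-line-per-resource layout shown in this report (name, agent, state, node).

```shell
#!/bin/sh
# Hypothetical helper: read `pcs resource` output on stdin and print the
# state column (e.g. Started or Stopped) for the named resource.
# Assumes the layout shown in this report: name, (agent):, state, node.
# Exits non-zero if the resource does not appear in the input.
resource_state() {
    name="$1"
    awk -v n="$name" '
        $1 == n { print $3; found = 1 }
        END     { if (!found) exit 1 }
    '
}
```

Usage would be `pcs resource | resource_state oracle`, which prints Stopped for the unfixed package and Started for the fixed one in the runs above.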

Comment 6 errata-xmlrpc 2016-11-04 00:02:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2174.html

