+++ This bug was initially created as a clone of Bug #1318240 +++

Description of problem:
It might happen that oracle cannot be started by the resource agent, with the following error:

INFO: ORA-01081 error found, trying to cleanup oracle (dbstart_mount output: ORA-01081: cannot start already-running ORACLE - shut it down first)

Version-Release number of selected component (if applicable):
resource-agents-3.9.5-34.el6.x86_64

How reproducible:
always, once in that state

Steps to Reproduce:
1. pcs resource debug-start oracle

Actual results:
Oracle will never start.

Expected results:
Oracle starts happily.

Additional info:
I believe that this happens with recovery tests after all ora_* processes have been killed and the resource agent has tried to start Oracle on another node. To enable the resource agent to start it again, it is necessary to issue 'shutdown immediate;' in sqlplus.

This is how the problem manifests itself:

# pcs resource debug-start oracle
Error performing operation: Operation not permitted
Operation start for oracle (ocf:heartbeat:oracle) returned 1
 >  stderr: INFO: ORA-01081 error found, trying to cleanup oracle (dbstart_mount output: ORA-01081: cannot start already-running ORACLE - shut it down first)
 >  stderr: ls: cannot access /u01/app/oracle/product/12.1.0/dbhome_1/dbs/lk*: No such file or directory
 >  stderr: ERROR: oracle oradb can not be mounted (status: OPEN)

--- Additional comment from Oyvind Albrigtsen on 2016-03-18 11:18:16 CET ---

Tested and verified patch available upstream:
https://github.com/ClusterLabs/resource-agents/pull/783
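
The manual workaround mentioned under Additional info (issuing 'shutdown immediate;' in sqlplus once ORA-01081 shows up) can be sketched as a small shell helper. This is only an illustration, not what the upstream patch does: the function names has_ora_01081 and manual_shutdown are hypothetical, and the sketch assumes it runs as the Oracle OS user with ORACLE_HOME and ORACLE_SID already set so that sqlplus can connect as sysdba.

```shell
#!/bin/sh

# Hypothetical helper: succeeds (returns 0) if the text passed as $1
# contains the ORA-01081 error reported by the resource agent.
has_ora_01081() {
    printf '%s\n' "$1" | grep -q 'ORA-01081'
}

# Hypothetical helper: issue 'shutdown immediate;' through sqlplus.
# Assumption: ORACLE_HOME and ORACLE_SID are set in the environment and
# the calling user is allowed to connect "/ as sysdba".
manual_shutdown() {
    sqlplus -S "/ as sysdba" <<'EOF'
shutdown immediate;
exit;
EOF
}

# Intended use (not executed here):
#   out=$(pcs resource debug-start oracle 2>&1)
#   if has_ora_01081 "$out"; then manual_shutdown; fi
```

The detection is kept separate from the shutdown so the check can be run on captured output without touching the database.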
This bug was accidentally moved from POST to MODIFIED via an error in automation. Please see mmccune with any questions.
I have verified that oracle resource agent starts correctly after having all the processes killed with resource-agents-3.9.5-79.el7.x86_64.

-----

setup:

# pcs resource show oracle
 Resource: oracle (class=ocf provider=heartbeat type=oracle)
  Attributes: sid=oradb monprofile=mprofile
  Operations: start interval=0s timeout=120 (oracle-start-interval-0s)
              stop interval=0s timeout=120 (oracle-stop-interval-0s)
              monitor interval=30s (oracle-monitor-interval-30s)

# pcs resource show ora-group
 Group: ora-group
  Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=10.34.70.97 cidr_netmask=23
   Operations: start interval=0s timeout=20s (vip-start-interval-0s)
               stop interval=0s timeout=20s (vip-stop-interval-0s)
               monitor interval=30s (vip-monitor-interval-30s)
  Resource: halvm (class=ocf provider=heartbeat type=LVM)
   Attributes: exclusive=true partial_activation=false volgrpname=shared
   Operations: start interval=0s timeout=30 (halvm-start-interval-0s)
               stop interval=0s timeout=30 (halvm-stop-interval-0s)
               monitor interval=10 timeout=30 (halvm-monitor-interval-10)
  Resource: fs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/shared/shared0 directory=/u01 fstype=ext4 options=
   Operations: start interval=0s timeout=60 (fs-start-interval-0s)
               stop interval=0s timeout=60 (fs-stop-interval-0s)
               monitor interval=30s (fs-monitor-interval-30s)
  Resource: oracle (class=ocf provider=heartbeat type=oracle)
   Attributes: sid=oradb monprofile=mprofile
   Operations: start interval=0s timeout=120 (oracle-start-interval-0s)
               stop interval=0s timeout=120 (oracle-stop-interval-0s)
               monitor interval=30s (oracle-monitor-interval-30s)

# pcs resource | grep oracle
     oracle (ocf::heartbeat:oracle): Started light-02.cluster-qe.lab.eng.brq.redhat.com

before the fix (resource-agents-3.9.5-52.el7.x86_64)
====================================================

# pkill -9 ora_ && sleep 30
# pcs resource debug-monitor oracle
Error performing operation: Argument list too long
Operation monitor for oracle (ocf:heartbeat:oracle) returned 7
 >  stderr: INFO: oracle process not running
#
# sleep 60
# pcs resource | grep oracle
     oracle (ocf::heartbeat:oracle): Stopped
# pcs resource debug-start oracle
Error performing operation: Operation not permitted
Operation start for oracle (ocf:heartbeat:oracle) returned 1
 >  stderr: INFO: ORA-01081 error found, trying to cleanup oracle \
    (dbstart_mount output: ORA-01081: cannot start already-running ORACLE - shut \
    it down first)
 >  stderr: ls: cannot access /u01/app/oracle/product/12.1.0/dbhome_1/dbs/lk*: \
    No such file or directory
 >  stderr: ERROR: oracle oradb can not be mounted (status: OPEN)

fixed version (resource-agents-3.9.5-79.el7.x86_64)
===================================================

# pkill -9 ora_ && sleep 30s #monitor interval
# pcs resource debug-monitor oracle
Error performing operation: Argument list too long
Operation monitor for oracle (ocf:heartbeat:oracle) returned 7
 >  stderr: INFO: oracle process not running
# sleep 60
# pcs resource | grep oracle
     oracle (ocf::heartbeat:oracle): Started light-02.cluster-qe.lab.eng.brq.redhat.com
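
Both runs above end with the same check: whether the oracle line in the 'pcs resource' output says Started or Stopped. A minimal sketch of that check as a reusable shell function follows. The name resource_started is hypothetical, and the function only inspects text passed to it, so it makes no assumption about the cluster beyond the output format shown in this comment.

```shell
#!/bin/sh

# Hypothetical helper: succeeds (returns 0) if the status text in $2 has a
# line for the resource named $1 that reports it as Started.
#   $1 = resource name, e.g. oracle
#   $2 = captured output of 'pcs resource'
resource_started() {
    printf '%s\n' "$2" \
        | grep -E "^[[:space:]]*$1[[:space:]]" \
        | grep -q 'Started'
}

# Intended use (not executed here):
#   pkill -9 ora_ && sleep 30
#   sleep 60
#   resource_started oracle "$(pcs resource)" && echo "oracle recovered"
```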
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2174.html