Description of problem:
The Oracle resource agent for Oracle 12c in a RHEL 7 cluster contains a status check that greps the `ps` output for the Oracle user, taken from the environment variable $ORACLE_OWNER. However, if the $ORACLE_OWNER value is longer than 8 characters, `ps` truncates the username in its output, so the grep does not match and the status check fails.

From sources of resource-agents (RHEL 7.1), ClusterLabs-resource-agents-5434e96/heartbeat/oralsnr:

269 show_procs() {
270         ps -e -o pid,user,args |
271                 grep '[t]nslsnr' | grep -w "$listener" | grep -w "$ORACLE_OWNER"

EXAMPLE:
========
If $ORACLE_OWNER is "oracle123", it is displayed in the `ps -e -o pid,user,args` output as "oracle1+", and the check (line #271 above) does not match, causing the status check to fail.
========

`ps` offers the -U <username> parameter to list only the processes belonging to <username>, so we could use it instead of grepping for the user. `ps -U $ORACLE_OWNER` should also work if the username is longer than 8 characters.

Version: RHEL 7.1, resource-agents-3.9.5

How reproducible: Always
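To make the proposal concrete, here is a minimal sketch of the two approaches side by side. It is only an illustration of the idea described above, not necessarily the change that was merged upstream. The helper name `grep_based_check` and the `sample` variable are hypothetical and exist only for this example; the sample line reuses the listener process shown in this report, in `pid,user,args` column order.

```shell
#!/bin/sh
# Hypothetical helper (not part of the agent): applies the same kind of
# filtering as the original check to ps-style text read from stdin.
# $1 is the word that must appear as a whole word (grep -w).
grep_based_check() {
    grep '[t]nslsnr' | grep -w "$1"
}

# Sketch of the proposed alternative: let ps select the processes by
# user with -U, so the (possibly truncated) user column is never grepped.
# ORACLE_OWNER and listener are expected to be set by the caller.
show_procs() {
    ps -U "$ORACLE_OWNER" -o pid,user,args |
        grep '[t]nslsnr' | grep -w "$listener"
}

# A line as ps prints it for a user named "oracle1234":
# the user column is cut to "oracle1+".
sample='17837 oracle1+ /u01/app/oracle/product/12.1.0/dbhome_1/bin/tnslsnr LISTENER -inherit'

# The grep-based check cannot find the full user name in that line.
if printf '%s\n' "$sample" | grep_based_check oracle1234 >/dev/null; then
    echo "grep-based check matched"
else
    echo "grep-based check did not match the truncated user column"
fi
```

With `-U`, the user name is resolved by `ps` itself, so the width of the user column in the output no longer matters for the check.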
Tested and working as expected: https://github.com/ClusterLabs/resource-agents/pull/781
This bug was accidentally moved from POST to MODIFIED by an error in automation. Please see mmccune with any questions.
I have verified that with resource-agents-3.9.5-79.el7.x86_64 the oralsnr agent correctly handles an Oracle listener process running under a user whose name is longer than eight characters.

Common setup:
=============
-- create new user with name longer than eight characters
* configure oracle to run in the cluster (2), (3)
* disable the oracle listener that you verified will run in the cluster (1)
* check that there are no processes owned by user oracle
* create a new user with a name longer than eight characters, having the same groups as user oracle
* exchange its uid with the oracle user

[root@kiff-03 ~]# useradd oracle1234 -G oinstall,dba,asmdba,asmoper,oper,backupdba,dgdba,kmdba
[root@kiff-03 ~]# id oracle1234
uid=54323(oracle1234) gid=54330(oracle1234) groups=54330(oracle1234),54321(oinstall),54322(dba),54323(asmdba),54324(asmoper),54326(oper),54327(backupdba),54328(dgdba),54329(kmdba)
[root@kiff-03 ~]# id oracle
uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),54322(dba),54323(asmdba),54324(asmoper),54326(oper),54327(backupdba),54328(dgdba),54329(kmdba)
[root@kiff-03 ~]# usermod --uid 54321 oracle1234 --non-unique
[root@kiff-03 ~]# usermod --uid 54323 oracle
[root@kiff-03 ~]# id oracle
uid=54323(oracle) gid=54321(oinstall) groups=54321(oinstall),54322(dba),54323(asmdba),54324(asmoper),54326(oper),54327(backupdba),54328(dgdba),54329(kmdba)
[root@kiff-03 ~]# id oracle1234
uid=54321(oracle1234) gid=54330(oracle1234) groups=54330(oracle1234),54321(oinstall),54322(dba),54323(asmdba),54324(asmoper),54326(oper),54327(backupdba),54328(dgdba),54329(kmdba)

patched version: resource-agents-3.9.5-79.el7.x86_64
====================================================
oracle listener starts, monitor action succeeds

[root@kiff-03 ~]# pcs resource enable oralsnr
[root@kiff-03 ~]# pcs resource | grep oracle
     oracle     (ocf::heartbeat:oracle):        Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
[root@kiff-03 ~]# pcs resource debug-monitor oralsnr
Operation monitor for oralsnr (ocf:heartbeat:oralsnr) returned 0
[root@kiff-03 ~]# ps axfu | grep LISTENER
root     14735  0.0  0.0 112648   968 pts/0    S+   12:25   0:00  |       \_ grep --color=auto LISTENER
oracle1+ 17837  0.1  0.0 171964 12992 ?        Ssl  12:21   0:00 /u01/app/oracle/product/12.1.0/dbhome_1/bin/tnslsnr LISTENER -inherit

BEFORE the patch: resource-agents-3.9.5-52.el7.x86_64
=====================================================
[root@kiff-03 ~]# pcs resource enable oralsnr
[root@kiff-03 ~]# pcs resource | grep oracle
     oracle     (ocf::heartbeat:oracle):        (FAILED) Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
[root@kiff-03 ~]# pcs resource debug-monitor oralsnr
Error performing operation: Argument list too long
Operation monitor for oralsnr (ocf:heartbeat:oralsnr) returned 7
[root@kiff-03 ~]# ps axfu | grep LISTENER
root      5354  0.0  0.0 112648   968 pts/0    S+   12:24   0:00  |       \_ grep --color=auto LISTENER
oracle1+ 17837  0.1  0.0 171964 12992 ?        Ssl  12:21   0:00 /u01/app/oracle/product/12.1.0/dbhome_1/bin/tnslsnr LISTENER -inherit

-----

(1)
[root@kiff-03 ~]# pcs status
Cluster name: STSRHTS2268
Stack: corosync
Current DC: kiff-03.cluster-qe.lab.eng.brq.redhat.com (version 1.1.15-10.el7-e174ec8) - partition with quorum
Last updated: Fri Aug 19 11:58:08 2016          Last change: Fri Aug 19 11:54:14 2016 by root via crm_resource on kiff-03.cluster-qe.lab.eng.brq.redhat.com

2 nodes and 7 resources configured: 3 resources DISABLED and 0 BLOCKED from being started due to failures

Online: [ kiff-01.cluster-qe.lab.eng.brq.redhat.com kiff-03.cluster-qe.lab.eng.brq.redhat.com ]

Full list of resources:

 fence-kiff-01  (stonith:fence_ipmilan):        Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
 fence-kiff-03  (stonith:fence_ipmilan):        Started kiff-01.cluster-qe.lab.eng.brq.redhat.com
 Resource Group: ora-group
     vip        (ocf::heartbeat:IPaddr2):       Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
     halvm      (ocf::heartbeat:LVM):           Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
     fs         (ocf::heartbeat:Filesystem):    Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
     oracle     (ocf::heartbeat:oracle):        Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
     oralsnr    (ocf::heartbeat:oralsnr):       Stopped (disabled)

Failed Actions:
* oralsnr_start_0 on kiff-01.cluster-qe.lab.eng.brq.redhat.com 'not installed' (5): call=90, status=complete, exitreason='none',
    last-rc-change='Fri Aug 19 11:38:11 2016', queued=0ms, exec=49ms

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

(2)
$ cat /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
# listener.ora Network Configuration File: /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
# Generated by Oracle configuration tools.

SID_LIST_LISTENER =
  (SID_LIST=
    (SID_DESC =
      (GLOBAL_DBNAME = oradb)
      (ORACLE_HOME = /u01/app/oracle/product/12.1.0/dbhome_1)
      (SID_NAME = oradb)
    )
  )

SID_LIST_LISTENER_CDB2 =
  (SID_LIST=
    (SID_DESC =
      (GLOBAL_DBNAME = cdb2)
      (ORACLE_HOME = /u01/app/oracle/product/12.1.0/dbhome_1)
      (SID_NAME = cdb2)
    )
  )

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 10.34.71.249)(PORT = 1521))
    )
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
    )
  )

LISTENER_CDB2 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 10.34.71.249)(PORT = 1522))
    )
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1522))
    )
  )

(3)
[root@kiff-03 ~]# pcs resource --full
 Group: ora-group
  Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=10.34.71.249 cidr_netmask=23
   Operations: start interval=0s timeout=20s (vip-start-interval-0s)
               stop interval=0s timeout=20s (vip-stop-interval-0s)
               monitor interval=30s (vip-monitor-interval-30s)
  Resource: halvm (class=ocf provider=heartbeat type=LVM)
   Attributes: exclusive=true partial_activation=false volgrpname=shared
   Operations: start interval=0s timeout=30 (halvm-start-interval-0s)
               stop interval=0s timeout=30 (halvm-stop-interval-0s)
               monitor interval=10 timeout=30 (halvm-monitor-interval-10)
  Resource: fs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/shared/shared0 directory=/u01 fstype=ext4 options=
   Operations: start interval=0s timeout=60 (fs-start-interval-0s)
               stop interval=0s timeout=60 (fs-stop-interval-0s)
               monitor interval=30s (fs-monitor-interval-30s)
  Resource: oracle (class=ocf provider=heartbeat type=oracle)
   Attributes: sid=oradb
   Operations: start interval=0s timeout=120 (oracle-start-interval-0s)
               stop interval=0s timeout=120 (oracle-stop-interval-0s)
               monitor interval=30s (oracle-monitor-interval-30s)
  Resource: oralsnr (class=ocf provider=heartbeat type=oralsnr)
   Attributes: listener=LISTENER sid=oradb
   Meta Attrs: target-role=Stopped
   Operations: start interval=0s timeout=120 (oralsnr-start-interval-0s)
               stop interval=0s timeout=120 (oralsnr-stop-interval-0s)
               monitor interval=10 timeout=30 (oralsnr-monitor-interval-10)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2174.html