| Summary: | Oracle12c resource status check fails if username is longer than 8 characters in pacemaker cluster | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Josef Zimek <pzimek> | |
| Component: | resource-agents | Assignee: | Oyvind Albrigtsen <oalbrigt> | |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> | |
| Severity: | medium | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 7.1 | CC: | agk, cluster-maint, fdinitto, mnovacek | |
| Target Milestone: | rc | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | resource-agents-3.9.5-69.el7 | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1337671 | Environment: | |
| Last Closed: | 2016-11-04 00:02:00 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1337671 | |||
Tested and working as expected: https://github.com/ClusterLabs/resource-agents/pull/781

This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.
I have verified that with resource-agents-3.9.5-79.el7.x86_64 the oralsnr
agent correctly handles an oracle listener process running under a user
whose name is longer than eight characters.
Common setup:
=============
* configure oracle to run in the cluster, see (2) and (3)
* verify that the oracle listener runs in the cluster, then disable it, see (1)
* check that there are no processes owned by user oracle
* create a new user with a name longer than eight characters and the same
groups as user oracle
* swap its uid with that of user oracle
[root@kiff-03 ~]# useradd oracle1234 -G oinstall,dba,asmdba,asmoper,oper,backupdba,dgdba,kmdba
[root@kiff-03 ~]# id oracle1234
uid=54323(oracle1234) gid=54330(oracle1234) groups=54330(oracle1234),54321(oinstall),54322(dba),54323(asmdba),54324(asmoper),54326(oper),54327(backupdba),54328(dgdba),54329(kmdba)
[root@kiff-03 ~]# id oracle
uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),54322(dba),54323(asmdba),54324(asmoper),54326(oper),54327(backupdba),54328(dgdba),54329(kmdba)
[root@kiff-03 ~]# usermod --uid 54321 oracle1234 --non-unique
[root@kiff-03 ~]# usermod --uid 54323 oracle
[root@kiff-03 ~]# id oracle
uid=54323(oracle) gid=54321(oinstall) groups=54321(oinstall),54322(dba),54323(asmdba),54324(asmoper),54326(oper),54327(backupdba),54328(dgdba),54329(kmdba)
[root@kiff-03 ~]# id oracle1234
uid=54321(oracle1234) gid=54330(oracle1234) groups=54330(oracle1234),54321(oinstall),54322(dba),54323(asmdba),54324(asmoper),54326(oper),54327(backupdba),54328(dgdba),54329(kmdba)
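Before running the test it can help to confirm that the chosen name really exceeds the eight-character user column that `ps` prints in this report. A minimal sketch: the helper name `name_exceeds_ps_width` is made up for illustration, and the limit of 8 is taken from the behaviour described in this bug, not queried from `ps`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a user name is longer than the
# 8-character width that ps uses for its user column in this report.
name_exceeds_ps_width() {
    [ "${#1}" -gt 8 ]
}

for name in oracle oracle1234; do
    if name_exceeds_ps_width "$name"; then
        echo "$name: longer than 8 characters, ps would shorten it"
    else
        echo "$name: fits, ps shows it in full"
    fi
done
```

With the two names used in this setup, only `oracle1234` is reported as too long.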
patched version: resource-agents-3.9.5-79.el7.x86_64
====================================================
oracle listener starts, monitor action succeeds
[root@kiff-03 ~]# pcs resource enable oralsnr
[root@kiff-03 ~]# pcs resource | grep oracle
oracle (ocf::heartbeat:oracle): Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
[root@kiff-03 ~]# pcs resource debug-monitor oralsnr
Operation monitor for oralsnr (ocf:heartbeat:oralsnr) returned 0
[root@kiff-03 ~]# ps axfu | grep LISTENER
root 14735 0.0 0.0 112648 968 pts/0 S+ 12:25 0:00 | \_ grep --color=auto LISTENER
oracle1+ 17837 0.1 0.0 171964 12992 ? Ssl 12:21 0:00 /u01/app/oracle/product/12.1.0/dbhome_1/bin/tnslsnr LISTENER -inherit
BEFORE the patch: resource-agents-3.9.5-52.el7.x86_64
=====================================================
[root@kiff-03 ~]# pcs resource enable oralsnr
[root@kiff-03 ~]# pcs resource | grep oracle
oracle (ocf::heartbeat:oracle): (FAILED) Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
[root@kiff-03 ~]# pcs resource debug-monitor oralsnr
Error performing operation: Argument list too long
Operation monitor for oralsnr (ocf:heartbeat:oralsnr) returned 7
[root@kiff-03 ~]# ps axfu | grep LISTENER
root 5354 0.0 0.0 112648 968 pts/0 S+ 12:24 0:00 | \_ grep --color=auto LISTENER
oracle1+ 17837 0.1 0.0 171964 12992 ? Ssl 12:21 0:00 /u01/app/oracle/product/12.1.0/dbhome_1/bin/tnslsnr LISTENER -inherit
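The two `debug-monitor` runs differ only in the return code, although the listener process is running in both cases. As a reading aid, the sketch below maps the codes that appear in this report to the standard OCF return-code names. `ocf_rc_name` is a made-up helper, not part of pcs or resource-agents. The "Argument list too long" text printed next to return code 7 matches the system error string for errno 7 (E2BIG), so it is presumably a side effect of how the code is printed rather than a separate failure.

```shell
#!/bin/sh
# Map the numeric result of an OCF resource agent action to its standard
# name. Only the codes seen in this report plus the generic error are
# listed; anything else is reported as unlisted.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        *) echo "unlisted ($1)" ;;
    esac
}

# Codes from the patched run (0), the unpatched run (7) and the
# failed start on kiff-01 listed under "Failed Actions" (5).
for rc in 0 7 5; do
    echo "rc=$rc -> $(ocf_rc_name "$rc")"
done
```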
-----
(1)
[root@kiff-03 ~]# pcs status
Cluster name: STSRHTS2268
Stack: corosync
Current DC: kiff-03.cluster-qe.lab.eng.brq.redhat.com (version 1.1.15-10.el7-e174ec8) - partition with quorum
Last updated: Fri Aug 19 11:58:08 2016 Last change: Fri Aug 19 11:54:14 2016 by root via crm_resource on kiff-03.cluster-qe.lab.eng.brq.redhat.com
2 nodes and 7 resources configured: 3 resources DISABLED and 0 BLOCKED from being started due to failures
Online: [ kiff-01.cluster-qe.lab.eng.brq.redhat.com kiff-03.cluster-qe.lab.eng.brq.redhat.com ]
Full list of resources:
fence-kiff-01 (stonith:fence_ipmilan): Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
fence-kiff-03 (stonith:fence_ipmilan): Started kiff-01.cluster-qe.lab.eng.brq.redhat.com
Resource Group: ora-group
vip (ocf::heartbeat:IPaddr2): Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
halvm (ocf::heartbeat:LVM): Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
fs (ocf::heartbeat:Filesystem): Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
oracle (ocf::heartbeat:oracle): Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
oralsnr (ocf::heartbeat:oralsnr): Stopped (disabled)
Failed Actions:
* oralsnr_start_0 on kiff-01.cluster-qe.lab.eng.brq.redhat.com 'not installed' (5): call=90, status=complete, exitreason='none',
last-rc-change='Fri Aug 19 11:38:11 2016', queued=0ms, exec=49ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
(2)
$ cat /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
# listener.ora Network Configuration File: /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
# Generated by Oracle configuration tools.
SID_LIST_LISTENER =
(SID_LIST=
(SID_DESC =
(GLOBAL_DBNAME = oradb)
(ORACLE_HOME = /u01/app/oracle/product/12.1.0/dbhome_1)
(SID_NAME = oradb)
)
)
SID_LIST_LISTENER_CDB2 =
(SID_LIST=
(SID_DESC =
(GLOBAL_DBNAME = cdb2)
(ORACLE_HOME = /u01/app/oracle/product/12.1.0/dbhome_1)
(SID_NAME = cdb2)
)
)
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 10.34.71.249)(PORT = 1521))
)
(DESCRIPTION =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
)
)
LISTENER_CDB2 =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 10.34.71.249)(PORT = 1522))
)
(DESCRIPTION =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1522))
)
)
(3)
[root@kiff-03 ~]# pcs resource --full
Group: ora-group
Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=10.34.71.249 cidr_netmask=23
Operations: start interval=0s timeout=20s (vip-start-interval-0s)
stop interval=0s timeout=20s (vip-stop-interval-0s)
monitor interval=30s (vip-monitor-interval-30s)
Resource: halvm (class=ocf provider=heartbeat type=LVM)
Attributes: exclusive=true partial_activation=false volgrpname=shared
Operations: start interval=0s timeout=30 (halvm-start-interval-0s)
stop interval=0s timeout=30 (halvm-stop-interval-0s)
monitor interval=10 timeout=30 (halvm-monitor-interval-10)
Resource: fs (class=ocf provider=heartbeat type=Filesystem)
Attributes: device=/dev/shared/shared0 directory=/u01 fstype=ext4 options=
Operations: start interval=0s timeout=60 (fs-start-interval-0s)
stop interval=0s timeout=60 (fs-stop-interval-0s)
monitor interval=30s (fs-monitor-interval-30s)
Resource: oracle (class=ocf provider=heartbeat type=oracle)
Attributes: sid=oradb
Operations: start interval=0s timeout=120 (oracle-start-interval-0s)
stop interval=0s timeout=120 (oracle-stop-interval-0s)
monitor interval=30s (oracle-monitor-interval-30s)
Resource: oralsnr (class=ocf provider=heartbeat type=oralsnr)
Attributes: listener=LISTENER sid=oradb
Meta Attrs: target-role=Stopped
Operations: start interval=0s timeout=120 (oralsnr-start-interval-0s)
stop interval=0s timeout=120 (oralsnr-stop-interval-0s)
monitor interval=10 timeout=30 (oralsnr-monitor-interval-10)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2174.html
Description of problem:
The Oracle resource agent for Oracle12c in a RHEL 7 cluster contains a status check that greps the `ps` output for the Oracle user named by the environment variable $ORACLE_OWNER. However, if the $ORACLE_OWNER value is longer than 8 characters, `ps` truncates the user name in its output, so the grep does not match and the status check fails.

From the sources of resource-agents (RHEL 7.1), ClusterLabs-resource-agents-5434e96/heartbeat/oralsnr:

269 show_procs() {
270     ps -e -o pid,user,args |
271         grep '[t]nslsnr' | grep -w "$listener" | grep -w "$ORACLE_OWNER"

EXAMPLE:
========
If $ORACLE_OWNER is "oracle123", it is displayed in `ps -e -o pid,user,args` output as "oracle1+", and the check (line #271 above) does not match, causing the status check to fail.
========

`ps` offers the -U <username> parameter to list only the processes of <username>, so it could be used instead of grepping for the user name. `ps -U $ORACLE_OWNER` should also work if the user name is longer than 8 characters.

Version: RHEL 7.1, resource-agents-3.9.5

How reproducible: Always
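The failure can be reproduced without an Oracle installation. The sketch below feeds one hand-written line in `pid,user,args` layout (modeled on the `ps` output quoted in this report, not captured from a real run) through the old filter chain and through a variant without the user-name grep. It also defines a `show_procs_by_user` function along the lines proposed above. The names `old_filter`, `new_filter` and `show_procs_by_user` are made up for illustration, and the function is not necessarily what the upstream fix looks like.

```shell
#!/bin/sh
# Hand-written sample in `ps -e -o pid,user,args` layout. The user column
# shows "oracle1+" because ps shortened the name "oracle1234".
sample='17837 oracle1+ /u01/app/oracle/product/12.1.0/dbhome_1/bin/tnslsnr LISTENER -inherit'

listener=LISTENER
ORACLE_OWNER=oracle1234

# Filter chain of the unpatched agent: also greps for the full user name.
old_filter() {
    grep '[t]nslsnr' | grep -w "$listener" | grep -w "$ORACLE_OWNER"
}

# Same chain without the user-name grep, for use when ps itself has
# already restricted the list to the owner's processes.
new_filter() {
    grep '[t]nslsnr' | grep -w "$listener"
}

# Sketch of the proposed approach: let ps select the user with -U.
# Defined only; calling it needs a running listener to show anything.
show_procs_by_user() {
    ps -U "$ORACLE_OWNER" -o pid,user,args | new_filter
}

if printf '%s\n' "$sample" | old_filter >/dev/null; then
    echo "old filter: listener found"
else
    echo "old filter: listener NOT found"
fi

if printf '%s\n' "$sample" | new_filter >/dev/null; then
    echo "new filter: listener found"
else
    echo "new filter: listener NOT found"
fi
```

The old chain misses the sample line because neither "oracle1+" nor the path component "oracle" matches "oracle1234" as a whole word, while the chain without the user-name grep still finds the listener.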