Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets there.

Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against the components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED".

If you cannot log in to RH Jira, please consult article #7032570. Failing that, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry; the e-mail creates a ServiceNow ticket with Red Hat.

Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and tagged with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", will have a little "two-footprint" icon next to it, and will direct you to the "RHEL project" in Red Hat Jira (issue links are of the form "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). The same link will be available in a blue banner at the top of the page informing you that the bug has been migrated.

Bug 1695024

Summary: oraasm: false positive monitoring op result [RHEL 7]
Product: Red Hat Enterprise Linux 7
Reporter: Josef Zimek <pzimek>
Component: resource-agents
Assignee: Oyvind Albrigtsen <oalbrigt>
Status: CLOSED WONTFIX
QA Contact: cluster-qe <cluster-qe>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.6
CC: agk, cluster-maint, david.deaderick, fdinitto, nwahl, sbradley
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1786812 (view as bug list)
Environment:
Last Closed: 2020-11-11 21:38:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1786812

Description Josef Zimek 2019-04-02 10:17:39 UTC
Description of problem:

The monitoring operation of the oraasm resource can produce a false positive: the resource stays marked as "Started" while some of its key components are not running. The root cause boils down to the way we check the status; we rely solely on the return code of the following command:

        su - $OCF_RESKEY_user -c ". $ORA_ENVF; crsctl check has | grep -q \"CRS-4638\""


However, this command succeeds even when the smon process (asm_smon_+ASM) is not running. Without that process the database cannot serve content, yet the resource is kept in the "Started" state.



Version-Release number of selected component (if applicable):
resource-agents-4.1.1-12.el7_6.8.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Start oraasm resource
2. Stop the asm_smon_+ASM process
3. The resource will remain in "Started" state despite ASM not running


Actual results:
The monitoring operation returns a success return code because `crsctl check has | grep -q "CRS-4638"` returns 0 even when ASM is not running.


Expected results:
The monitoring operation should also verify that the ASM processes are actually running.


Additional info:
In addition to a process check, part of the monitoring could be querying the database to make sure it is able to serve content.
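A minimal sketch of what a stricter monitor could look like, combining the existing CRS-4638 check with a process check. This is illustrative only, not the shipped agent code: the helper names (`has_is_online`, `smon_is_running`, `oraasm_monitor`) are invented, and the functions take command output as arguments so the logic can be exercised without a live Oracle stack (in the real agent, the first argument would come from `crsctl check has` and the second from something like `ps -eo args=`).

```shell
# Hypothetical stricter monitor: require BOTH the HAS stack to report
# online AND an ASM SMON process to be present.

has_is_online() {
    # $1: output of `crsctl check has`
    echo "$1" | grep -q "CRS-4638"
}

smon_is_running() {
    # $1: a process listing with command names only (e.g. `ps -eo args=`),
    # so the asm_smon_+ASM command appears at the start of its line
    echo "$1" | grep -q "^asm_smon_+ASM"
}

oraasm_monitor() {
    # $1: crsctl output, $2: process listing
    if has_is_online "$1" && smon_is_running "$2"; then
        echo started
        return 0
    fi
    echo stopped
    return 7   # OCF_NOT_RUNNING
}
```

With the outputs captured in this report, the "WORKING FINE" scenario would yield "started" and the false-positive scenario "stopped", since the second check fails when asm_smon_+ASM is absent.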

Comment 2 Josef Zimek 2019-04-02 11:43:14 UTC
More details about status check:


WORKING FINE scenario:
======================

# pcs status
ORAASM        (ocf::heartbeat:oraasm):        Started nodeA       <------------


# ps -ef | grep smon
grid      9270     1  0 Mar29 ?        00:00:04 asm_smon_+ASM     <------------
root     29586 23312  0 15:59 pts/0    00:00:00 grep --color=auto smon


# /data01/app/12.2.0/grid_122010/bin/crsctl check has
CRS-4638: Oracle High Availability Services is online             <-------------


# su - grid
$ sqlplus / as sysasm

SQL*Plus: Release 12.2.0.1.0 Production on Mon Apr 1 16:10:28 2019

Copyright (c) 1982, 2016, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production



SQL> select instance_name, host_name, status from v$instance;

INSTANCE_NAME
----------------
HOST_NAME                                                        STATUS
---------------------------------------------------------------- ------------
+ASM
nodeA                                                            STARTED




FALSE POSITIVE scenario (smon process not running):
===================================================

# pcs status
ORAASM        (ocf::heartbeat:oraasm):        Started nodeA       <------------

# ps -ef | grep smon
root     29586 23312  0 15:59 pts/0    00:00:00 grep --color=auto smon


# /data01/app/12.2.0/grid_122010/bin/crsctl check has
CRS-4638: Oracle High Availability Services is online             <-------------



# su - grid
$ sqlplus / as sysasm

SQL*Plus: Release 12.2.0.1.0 Production on Mon Apr 1 16:14:45 2019

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> select instance_name, host_name, status from v$instance;
select instance_name, host_name, status from v$instance
*
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 0
Session ID: 0 Serial number: 0






So regardless of whether the smon process is running, `crsctl check has` returns:
CRS-4638: Oracle High Availability Services is online

The resource agent treats this as OK and reports the resource as "Started". So we need a more reliable monitoring check.

NOTE: I am no ASM expert, but smon seems to be just one of the possible ASM background processes; ASM can apparently spawn various processes named asm_*.
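Building on that note, a broader check could count all asm_* background processes rather than relying on smon alone. A sketch under the naming assumption above (the helper name and the pattern are hypothetical; the listing argument stands in for `ps -eo args=` so the helper is testable offline):

```shell
# Hypothetical helper: given a command-only process listing, count ASM
# background processes (asm_smon_+ASM, asm_pmon_+ASM, ...). Assumes the
# asm_<name>_+ASM naming convention noted above.
count_asm_procs() {
    echo "$1" | grep -c "^asm_[a-z0-9]*_+ASM"
}
```

The monitor could then fail if the count drops to zero, or below some expected minimum.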

Comment 5 Chris Williams 2020-11-11 21:38:35 UTC
Red Hat Enterprise Linux 7 shipped its final minor release on September 29th, 2020. 7.9 was the last minor release scheduled for RHEL 7.
From initial triage it does not appear that the remaining Bugzillas meet the inclusion criteria for Maintenance Phase 2, and they will now be closed.

From the RHEL life cycle page:
https://access.redhat.com/support/policy/updates/errata#Maintenance_Support_2_Phase
"During Maintenance Support 2 Phase for Red Hat Enterprise Linux version 7,Red Hat defined Critical and Important impact Security Advisories (RHSAs) and selected (at Red Hat discretion) Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available."

If this BZ was closed in error and meets the above criteria, please re-open it, flag it for 7.9.z, provide suitable business and technical justifications, and follow the process for Accelerated Fixes:
https://source.redhat.com/groups/public/pnt-cxno/pnt_customer_experience_and_operations_wiki/support_delivery_accelerated_fix_release_handbook  

Feature Requests can be re-opened and moved to RHEL 8 if the desired functionality is not already present in the product.

Please reach out to the applicable Product Experience Engineer[0] if you have any questions or concerns.  

[0] https://bugzilla.redhat.com/page.cgi?id=agile_component_mapping.html&product=Red+Hat+Enterprise+Linux+7