Bug 2073254

Summary: oracle processes are terminated before stopping oracle resource during shutdown
Product: Red Hat Enterprise Linux 8 Reporter: Takayuki Nagata <tnagata>
Component: resource-agents Assignee: Oyvind Albrigtsen <oalbrigt>
Status: ASSIGNED --- QA Contact: cluster-qe <cluster-qe>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.5 CC: agk, cluster-maint, fdinitto, nwahl, sbradley
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug

Description Takayuki Nagata 2022-04-08 04:02:30 UTC
Description of problem:
When the system is shut down or rebooted, the following message is logged because the Oracle processes are terminated before the oracle resource is stopped.

Apr  6 10:34:53 node1 oracle(ins_pkg-oracle-INS)[675071]: INFO: Oracle instance INS already stopped

Version-Release number of selected component (if applicable):
resource-agents-4.1.1-98.el8_5.2.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Configure pacemaker cluster with oracle resource
2. Start cluster service
3. Shut down or reboot the system.

Actual results:
Oracle processes are terminated before the oracle resource is stopped, as shown below.

Apr  6 10:34:53 node1 systemd[1]: session-c3.scope: Killing process 7540 (ora_pmon_ins) with signal SIGTERM.
Apr  6 10:34:53 node1 systemd[1]: session-c3.scope: Killing process 7544 (ora_clmn_ins) with signal SIGTERM.
Apr  6 10:34:53 node1 systemd[1]: session-c3.scope: Killing process 7548 (ora_psp0_ins) with signal SIGTERM.
Apr  6 10:34:53 node1 systemd[1]: session-c3.scope: Killing process 7552 (ora_vktm_ins) with signal SIGTERM.
:
Apr  6 10:34:53 node1 systemd[1]: session-c3.scope: Killing process 35942 (ora_w009_ins) with signal SIGTERM.
Apr  6 10:34:53 node1 systemd[1]: session-c3.scope: Killing process 37693 (ora_m003_ins) with signal SIGTERM.
Apr  6 10:34:53 node1 systemd[1]: session-c3.scope: Killing process 1170709 (ora_m004_ins) with signal SIGTERM.
Apr  6 10:34:53 node1 systemd[1]: Stopping Session c3 of user oracle.
:
Apr  6 10:34:53 node1 pacemakerd[2137]: notice: Caught 'Terminated' signal
:
Apr  6 10:34:53 node1 oracle(ins_pkg-oracle-INS)[675071]: INFO: Oracle instance INS already stopped
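
The ordering in the excerpt above can be checked mechanically. The following is a minimal sketch, not part of resource-agents: the function name killed_before_pacemaker_stop is hypothetical, and the patterns assume syslog-style lines in the same format as the excerpt. It reads a log on stdin and succeeds if systemd kills an ora_* process before pacemakerd reports catching the termination signal.

```shell
# Hypothetical helper: exit status 0 if, in the log on stdin, systemd kills an
# ora_* process before pacemakerd logs "Caught 'Terminated' signal".
killed_before_pacemaker_stop() {
    awk '
        # First line where systemd kills an Oracle background process
        /systemd\[[0-9]+\]: .*Killing process [0-9]+ \(ora_/ { if (!kill_line) kill_line = NR }
        # First line where pacemakerd starts its own shutdown
        /pacemakerd\[[0-9]+\]: .*Caught .Terminated. signal/  { if (!stop_line) stop_line = NR }
        # Succeed only if a kill was seen and it precedes (or has no) pacemakerd line
        END { exit !(kill_line && (!stop_line || kill_line < stop_line)) }
    '
}
```

For example, it could be fed /var/log/messages, or the journal of the previous boot if persistent journaling is enabled: `killed_before_pacemaker_stop < /var/log/messages && echo "oracle processes were killed first"`.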

Expected results:
The oracle resource can stop the processes correctly.

Additional info:

Comment 1 Reid Wahl 2022-05-28 23:48:28 UTC
This would probably benefit from a closer look at the sosreports via the support case. This does not appear to be a problem with the resource agents. The logs presented in comment 0 show that systemd is killing the oracle processes before pacemaker and the resource agents even attempt to stop oracle.

Comment 2 Takayuki Nagata 2022-05-30 05:26:57 UTC
"su - $ORACLE_OWNER -s /bin/sh -c ..." creates processes under a new .scope unit. I think that the cause of this issue is the unit, and I think that adding dependencies to the unit could fix the issue.
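
One way to confirm which unit owns the Oracle processes is to read their cgroup membership from /proc. This is a rough sketch under assumptions: the function names unit_from_cgroup_line and units_for_pattern are made up for illustration, and the pattern "ora_pmon" is taken from the log in comment 0.

```shell
# Print the last component of a cgroup path, i.e. the systemd unit name.
# Input: one line in /proc/<pid>/cgroup format, for example
#   0::/user.slice/user-1001.slice/session-c3.scope
unit_from_cgroup_line() {
    # Strip everything up to and including the last "/"
    printf '%s\n' "${1##*/}"
}

# Hypothetical helper: for every process matching the pattern in $1,
# print "<pid> <owning unit>".
units_for_pattern() {
    for pid in $(pgrep -f "$1"); do
        # Accept either the unified (cgroup v2) entry or the name=systemd entry
        line=$(grep -E '^(0::|[0-9]+:name=systemd:)' "/proc/$pid/cgroup" 2>/dev/null | head -n 1)
        if [ -n "$line" ]; then
            printf '%s %s\n' "$pid" "$(unit_from_cgroup_line "$line")"
        fi
    done
}
```

Running `units_for_pattern ora_pmon` on an affected node would be expected to print a session scope such as session-c3.scope rather than pacemaker.service, matching the systemd messages in comment 0. `systemctl status <pid>` shows the same information interactively.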