Bug 581515 - oracledb.sh script assumes it needs to start Enterprise Manager and iSQL*Plus when oracle 10G
Summary: oracledb.sh script assumes it needs to start Enterprise Manager and iSQL*Plus...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: rgmanager
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 587399
Reported: 2010-04-12 13:53 UTC by Craig Kornmesser
Modified: 2010-04-29 19:37 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 587399 (view as bug list)
Environment:
Last Closed: 2010-04-29 19:33:00 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Cluster.conf file from cluster (3.37 KB, text/plain)
2010-04-12 14:13 UTC, Craig Kornmesser
oracledb.sh file with the changes I've had to make to make sure it starts Oracle via the cluster (20.35 KB, text/plain)
2010-04-12 14:16 UTC, Craig Kornmesser

Description Craig Kornmesser 2010-04-12 13:53:40 UTC
Description of problem:
When the DBAs install Oracle, they don't install Enterprise Manager or iSQL*Plus.  The oracledb.sh script does not check whether these components are installed; it just assumes they are and tries to start them.  Because the components are not installed, they fail to start, the script fails, and then the cluster service fails.

Version-Release number of selected component (if applicable):
All versions of rgmanager.

How reproducible:
Always, unless you comment out the following section of the oracledb.sh script.
#       if [ "$ORACLE_TYPE" = "base-em" ]; then
#               action "Starting iSQL*Plus:" isqlplusctl start || return 1
#               action "Starting Oracle EM DB Console:" emctl start dbconsole || return 1
#       elif [ "$ORACLE_TYPE" = "ias" ]; then
#               action "Starting Oracle EM:" emctl start em || return 1
#               action "Starting iAS Infrastructure:" opmnctl startall || return 1
#       fi


Steps to Reproduce:
1. Install Oracle 10G without iSQL*Plus and the Oracle EM DB Console.
2. Cluster an Oracle 10G DB.
3. Start, stop or move it between nodes.
  
Actual results:

The oracle service fails.
Apr 11 06:42:46 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:46 cusnwd0v kernel: EXT3 FS on dm-0, internal journal
Apr 11 06:42:46 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:47 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:47 cusnwd0v kernel: EXT3 FS on dm-1, internal journal
Apr 11 06:42:47 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:47 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:47 cusnwd0v kernel: EXT3 FS on dm-2, internal journal
Apr 11 06:42:47 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:47 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:47 cusnwd0v kernel: EXT3 FS on dm-3, internal journal
Apr 11 06:42:47 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:47 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:47 cusnwd0v kernel: EXT3 FS on dm-4, internal journal
Apr 11 06:42:47 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:48 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:48 cusnwd0v kernel: EXT3 FS on dm-5, internal journal
Apr 11 06:42:48 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:48 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:48 cusnwd0v kernel: EXT3 FS on dm-7, internal journal
Apr 11 06:42:48 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:49 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:49 cusnwd0v kernel: EXT3 FS on dm-6, internal journal
Apr 11 06:42:49 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:49 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:49 cusnwd0v kernel: EXT3 FS on dm-8, internal journal
Apr 11 06:42:49 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:49 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:49 cusnwd0v kernel: EXT3 FS on dm-9, internal journal
Apr 11 06:42:49 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:50 cusnwd0v kernel: kjournald starting.  Commit interval 5 seconds
Apr 11 06:42:50 cusnwd0v kernel: EXT3 FS on dm-10, internal journal
Apr 11 06:42:50 cusnwd0v kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 11 06:42:52 cusnwd0v luci[8567]: Unable to retrieve batch 1675237673 status from cusnwd0v-ic.carrier.utc.com:11111: module scheduled for execution
Apr 11 06:43:04 cusnwd0v last message repeated 2 times
Apr 11 06:43:06 cusnwd0v cat: 
Apr 11 06:43:06 cusnwd0v cat: SQL*Plus: Release 10.2.0.4.0 - Production on Sun Apr 11 06:42:53 2010
Apr 11 06:43:06 cusnwd0v cat: 
Apr 11 06:43:06 cusnwd0v cat: Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Apr 11 06:43:06 cusnwd0v cat: 
Apr 11 06:43:06 cusnwd0v cat: Connected to an idle instance.
Apr 11 06:43:06 cusnwd0v cat: 
Apr 11 06:43:06 cusnwd0v cat: SQL> ORACLE instance started.
Apr 11 06:43:06 cusnwd0v cat: 
Apr 11 06:43:06 cusnwd0v cat: Total System Global Area 1.6106E+10 bytes
Apr 11 06:43:06 cusnwd0v cat: Fixed Size		    2112088 bytes
Apr 11 06:43:06 cusnwd0v cat: Variable Size		 6777996712 bytes
Apr 11 06:43:06 cusnwd0v cat: Database Buffers	 9294577664 bytes
Apr 11 06:43:06 cusnwd0v cat: Redo Buffers		   31440896 bytes
Apr 11 06:43:06 cusnwd0v cat: Database mounted.
Apr 11 06:43:06 cusnwd0v cat: Database opened.
Apr 11 06:43:06 cusnwd0v cat: SQL> Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
Apr 11 06:43:06 cusnwd0v cat: With the Partitioning, Data Mining and Real Application Testing options
Apr 11 06:43:10 cusnwd0v clurgmgrd[8632]: <notice> start on oracledb "WCHILL1P" returned 1 (generic error) 
Apr 11 06:43:10 cusnwd0v clurgmgrd[8632]: <warning> #68: Failed to start service:WNDCHLLDB; return value: 1 
Apr 11 06:43:10 cusnwd0v clurgmgrd[8632]: <notice> Stopping service service:WNDCHLLDB 
Apr 11 06:43:11 cusnwd0v luci[8567]: Unable to retrieve batch 1675237673 status from cusnwd0v-ic.carrier.utc.com:11111: module scheduled for execution
Apr 11 06:43:14 cusnwd0v clurgmgrd[8632]: <notice> stop on oracledb "WCHILL1P" returned 1 (generic error) 
Apr 11 06:43:17 cusnwd0v luci[8567]: Unable to retrieve batch 1675237673 status from cusnwd0v-ic.carrier.utc.com:11111: module scheduled for execution
Apr 11 06:43:24 cusnwd0v luci[8567]: Unable to retrieve batch 1675237673 status from cusnwd0v-ic.carrier.utc.com:11111: module scheduled for execution
Apr 11 06:43:24 cusnwd0v clurgmgrd: [8632]: <notice> Forcefully unmounting /wchillp/redo_logsb 
Apr 11 06:43:25 cusnwd0v clurgmgrd: [8632]: <warning> killing process 12608 (oracle oracle /wchillp/redo_logsb) 
Apr 11 06:43:25 cusnwd0v clurgmgrd: [8632]: <warning> killing process 12656 (oracle oracle /wchillp/redo_logsb) 
Apr 11 06:43:30 cusnwd0v luci[8567]: Unable to retrieve batch 1675237673 status from cusnwd0v-ic.carrier.utc.com:11111: module scheduled for execution
Apr 11 06:43:31 cusnwd0v clurgmgrd: [8632]: <notice> Forcefully unmounting /wchillp/app/oracle 
Apr 11 06:43:32 cusnwd0v clurgmgrd: [8632]: <warning> killing process 12682 (oracle tnslsnr /wchillp/app/oracle) 
Apr 11 06:43:37 cusnwd0v luci[8567]: Unable to retrieve batch 1675237673 status from cusnwd0v-ic.carrier.utc.com:11111: module scheduled for execution
Apr 11 06:43:37 cusnwd0v clurgmgrd[8632]: <crit> #12: RG service:WNDCHLLDB failed to stop; intervention required 
Apr 11 06:43:37 cusnwd0v clurgmgrd[8632]: <notice> Service service:WNDCHLLDB is failed 
Apr 11 06:43:38 cusnwd0v clurgmgrd[8632]: <crit> #13: Service service:WNDCHLLDB failed to stop cleanly 
Apr 11 06:43:43 cusnwd0v luci[8567]: Unable to retrieve batch 1675237673 status from cusnwd0v-ic.carrier.utc.com:11111: clusvcadm start failed to start WNDCHLLDB: 
Apr 11 07:13:25 cusnwd0v clurgmgrd[8632]: <notice> Stopping service service:WNDCHLLDB 
Apr 11 07:13:26 cusnwd0v clurgmgrd[8632]: <notice> Service service:WNDCHLLDB is disabled 


Expected results:

The oracledb.sh script should check whether Enterprise Manager or iSQL*Plus is installed rather than assume that those components are there.  For now, whenever I patch the Linux server and rgmanager gets updated, I need to edit the oracledb.sh file again and comment out the following lines.

#       if [ "$ORACLE_TYPE" = "base-em" ]; then
#               action "Starting iSQL*Plus:" isqlplusctl start || return 1
#               action "Starting Oracle EM DB Console:" emctl start dbconsole || return 1
#       elif [ "$ORACLE_TYPE" = "ias" ]; then
#               action "Starting Oracle EM:" emctl start em || return 1
#               action "Starting iAS Infrastructure:" opmnctl startall || return 1
#       fi
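
A minimal sketch of the kind of check requested above, not the agent's actual code. The helper names `have_tool` and `start_optional` are hypothetical, and the sketch assumes that a control tool missing from PATH means the component is not installed, which may not hold for every Oracle layout.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the named tool can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical wrapper: run "tool args..." only when the tool exists.
# A missing tool is reported and treated as success, so the optional
# component is skipped instead of failing the whole start operation.
# Usage: start_optional "label" tool args...
start_optional() {
    local label=$1 tool=$2
    shift 2
    if ! have_tool "$tool"; then
        echo "Skipping $label: $tool not found"
        return 0
    fi
    "$tool" "$@"
}
```

Under those assumptions, a line such as `start_optional "iSQL*Plus" isqlplusctl start || return 1` would skip the component when `isqlplusctl` is absent and still report a real start failure when it is present.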

Additional info:

Comment 1 Lon Hohberger 2010-04-12 13:57:53 UTC
Please attach your cluster.conf

Comment 2 Craig Kornmesser 2010-04-12 14:13:24 UTC
Created attachment 405972 [details]
Cluster.conf file from cluster

Comment 3 Craig Kornmesser 2010-04-12 14:16:18 UTC
Created attachment 405974 [details]
oracledb.sh file with the changes I've had to make to make sure it starts Oracle via the cluster

Comment 4 Lon Hohberger 2010-04-16 19:23:14 UTC
	    <longdesc lang="en">
		This is the Oracle installation type:
		base - Database Instance and Listener only
		base-em (or 10g) - Database, Listener, Enterprise Manager,
				   and iSQL*Plus
		ias (or 10g-ias)
            </longdesc>

Sounds like setting it to "base" should work for your configuration without editing the script.
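
The aliases in the description above can be read as a small mapping from the configured type to one of three installation types. The following is only an illustration of that reading, not code taken from oracledb.sh; the function name `normalize_oracle_type` is hypothetical.

```shell
#!/bin/bash
# Hypothetical normalizer: maps the type values listed in the longdesc
# onto the three installation types it describes.
normalize_oracle_type() {
    case "$1" in
        base)         echo "base" ;;      # instance and listener only
        base-em|10g)  echo "base-em" ;;   # adds Enterprise Manager and iSQL*Plus
        ias|10g-ias)  echo "ias" ;;
        *)            echo "unknown type: $1" >&2; return 1 ;;
    esac
}
```

Read this way, `type="10g"` selects the Enterprise Manager and iSQL*Plus path, which matches the start failure reported above, while `type="base"` avoids it.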

Comment 5 Craig Kornmesser 2010-04-16 20:15:36 UTC
I understand I could set it to base, but where would you set it so that you don't run the risk of it getting overwritten?  Any changes to the oracledb.sh file get overwritten when an update is applied to the rgmanager package, which is exactly what happened here.  The script identifies any 10G install as base-em, which is incorrect.

Comment 6 Lon Hohberger 2010-04-16 20:38:30 UTC
Changing this line in cluster.conf:

  <oracledb home="/wchillp/app/oracle/product/10.2.0" name="WCHILL1P" type="10g" user="oracle" vhost="vip-windchilldb.carrier.utc.com"/>

to:

  <oracledb home="/wchillp/app/oracle/product/10.2.0" name="WCHILL1P" type="base" user="oracle" vhost="vip-windchilldb.carrier.utc.com"/>

... should do it.

Comment 7 Lon Hohberger 2010-04-16 20:59:30 UTC
(Don't forget to change the config version and so forth; making a change to the agent will cause the Oracle instance to be restarted, so do it at your next maintenance window)

Comment 8 Lon Hohberger 2010-04-16 21:00:23 UTC
Oops -- making a change to the resource line in cluster.conf, not the agent, will cause the resource instance to be restarted.
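
As a rough sketch of the edit described in Comment 6, the change can be previewed on a copy before touching the live file. The helper name `set_oracledb_type_base` is hypothetical, and the sketch assumes the `oracledb` element sits on a single line, as in the example above. It does not bump the config version or propagate the file; those steps still have to be done with the usual cluster tooling.

```shell
#!/bin/bash
# Hypothetical helper: print file $1 with type="10g" changed to type="base"
# on the <oracledb> line whose name attribute is $2.
# The original file is left untouched; the result goes to stdout.
set_oracledb_type_base() {
    sed -e "/<oracledb .*name=\"$2\"/ s/type=\"10g\"/type=\"base\"/" "$1"
}
```

For example, `set_oracledb_type_base /etc/cluster/cluster.conf WCHILL1P > /tmp/cluster.conf.new` followed by `diff /etc/cluster/cluster.conf /tmp/cluster.conf.new` would show the single changed line for review.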

Comment 9 Craig Kornmesser 2010-04-21 18:15:46 UTC
Oh, OK... I guess. I wish this were better documented, and that the agent configuration in luci gave you a pull-down menu for the type and let you choose from the three options: base, base-em or base-ias.

I had contacted Red Hat support when I was initially trying to get this to work, but they were not very helpful.  All they did was have me create my own Oracle agent, which was still not the right answer.

Comment 10 Lon Hohberger 2010-04-29 19:33:00 UTC
I'm sorry that Red Hat Support did not answer your question as required.

If you'd like, we can clone this bug against the luci interface and/or Documentation for clarification as to what the base/base-em/base-ias possibilities mean.

However, as far as rgmanager is concerned, this isn't a bug.

