815964 – Monitoring Probes gives a error deadlock message (ORA-00060) RHNSAT.RHN_SYNCH_PROBE_STATE

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Spacewalk
Classification:	Community
Component:	Server
Sub Component:
Version:	1.8
Hardware:	All
OS:	All
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Stephen Herr
QA Contact:	Red Hat Satellite QA List
Docs Contact:
URL:
Whiteboard:
Depends On:	653464 674071
Blocks:	space18

Reported:	2012-04-24 22:04 UTC by Stephen Herr
Modified:	2012-11-01 16:21 UTC
CC List:	7 users
Fixed In Version:	spacewalk-web-1.8.34-1
Clone Of:	674071
Environment:
Last Closed:	2012-11-01 16:21:10 UTC
Embargoed:

Description Stephen Herr 2012-04-24 22:04:38 UTC

+++ This bug was initially created as a clone of Bug #674071 +++

Description of problem:

When a large number of monitoring probes run against large tables (a lot of data), the RHN probes report a deadlock error.

Version-Release number of selected component (if applicable):
Red Hat Network Satellite 5.3 

How reproducible:
100%

Steps to Reproduce:
1. Create a lot of probes
2. Run the probes
  
Actual results:

RHN probes show deadlock messages

Expected results:

RHN Probes run as expected

Additional info:

====== ISSUE 
Customer is continuing to see "deadlock detected" messages from Taskomatic:

***
INFO   | jvm 1    | 2011/01/14 08:17:03 | Caused by: 
INFO   | jvm 1    | 2011/01/14 08:17:03 | java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
INFO   | jvm 1    | 2011/01/14 08:17:03 | ORA-06512: at "RHNSAT.RHN_SYNCH_PROBE_STATE", line 4
INFO   | jvm 1    | 2011/01/14 08:17:03 | ORA-06512: at line 1
INFO   | jvm 1    | 2011/01/14 08:17:03 | 
INFO   | jvm 1    | 2011/01/14 08:17:03 |       ... 18 more
INFO   | jvm 1    | 2011/01/14 08:17:03 | 2011-01-14 08:17:03,145 [DefaultQuartzScheduler_Worker-0] ERROR com.redhat.rhn.taskomatic.core.SchedulerKernel - org.quartz.JobExecutionException: com.redhat.rhn.common.db.WrappedSQLException: ORA-00060: deadlock detected while waiting for resource
INFO   | jvm 1    | 2011/01/14 08:17:03 | ORA-06512: at "RHNSAT.RHN_SYNCH_PROBE_STATE", line 4
INFO   | jvm 1    | 2011/01/14 08:17:03 | ORA-06512: at line 1
INFO   | jvm 1    | 2011/01/14 08:17:03 |  [See nested exception: com.redhat.rhn.common.db.WrappedSQLException: ORA-00060: deadlock detected while waiting for resource
INFO   | jvm 1    | 2011/01/14 08:17:03 | ORA-06512: at "RHNSAT.RHN_SYNCH_PROBE_STATE", line 4
INFO   | jvm 1    | 2011/01/14 08:17:03 | ORA-06512: at line 1
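
To gauge how often the error occurs, the wrapper log can be scanned for ORA-00060 lines. The following is a minimal Python sketch, not part of the original report. It assumes every log line uses the "LEVEL | jvm N | timestamp | message" prefix seen in the excerpt above; the function name summarize_deadlocks is hypothetical.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt above:
# "INFO   | jvm 1    | 2011/01/14 08:17:03 | <message>"
WRAPPER_LINE = re.compile(
    r"^\w+\s*\|\s*jvm \d+\s*\|\s*(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s*\|\s?(.*)$"
)
# ORA-06512 frames that name a PL/SQL object, e.g.
# ORA-06512: at "RHNSAT.RHN_SYNCH_PROBE_STATE", line 4
PLSQL_FRAME = re.compile(r'ORA-06512: at "([^"]+)", line (\d+)')


def summarize_deadlocks(lines):
    """Return (sorted timestamps that mention ORA-00060,
    Counter of PL/SQL objects named in ORA-06512 frames)."""
    timestamps = set()
    objects = Counter()
    for raw in lines:
        match = WRAPPER_LINE.match(raw)
        if not match:
            continue  # skip lines that do not follow the assumed layout
        stamp, message = match.groups()
        if "ORA-00060" in message:
            timestamps.add(stamp)
        frame = PLSQL_FRAME.search(message)
        if frame:
            objects[frame.group(1)] += 1
    return sorted(timestamps), objects
```

Passing the lines of the Taskomatic log to summarize_deadlocks gives the distinct seconds at which a deadlock was logged and how often each PL/SQL object appears in the error stack.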


===== Analysis

Looking at the Oracle trace logs, we could identify the SQL statements involved in the ORA-00060 deadlock.

DEADLOCK DETECTED ( ORA-00060 )
Current SQL Statement:

      UPDATE PROBE_STATE
      SET    LAST_CHECK = TO_DATE(:p1, 'YYYY-MM-DD HH24:MI:SS'),
             STATE = :p2,
             OUTPUT = :p3
      WHERE  SCOUT_ID = :p4
      AND    PROBE_ID = :p5

End of information on OTHER waiting sessions.
Current SQL statement for this session:
UPDATE RHN_PROBE_STATE SET STATE = 'PENDING', OUTPUT = 'Awaiting update' WHERE LAST_CHECK < ( SELECT ( SYSDATE - GREATEST(15 / 60 / 24, ((3 * RHN_DEPLOYED_PROBE.CHECK_INTERVAL_MINUTES) / 60 / 24))) FROM RHN_DEPLOYED_PROBE WHERE RHN_DEPLOYED_PROBE.RECID = RHN_PROBE_STATE.PROBE_ID )
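
The trace shows two sessions updating probe state rows: a per-probe update keyed by SCOUT_ID and PROBE_ID, and a bulk update that marks stale rows as PENDING. One plausible reading, which the trace does not confirm, is that both sessions hold row locks inside a transaction and take overlapping rows in a different order. The Python sketch below only illustrates that general lock-ordering pattern with a toy lock table. The session names, row ids, and the function find_deadlock are hypothetical, and this is not the fix that was committed.

```python
def find_deadlock(lock_orders):
    """Simulate sessions that lock rows one at a time, round-robin.

    lock_orders maps a session name to the ordered list of row ids it locks.
    Locks are held until the session finishes, as within a transaction.
    Returns the sessions that form a wait-for cycle, or None if all finish.
    """
    owner = {}                               # row id -> session holding its lock
    position = {s: 0 for s in lock_orders}   # index of the next row each session needs
    waiting_for = {}                         # blocked session -> session it waits on
    active = set(lock_orders)

    while active:
        progressed = False
        for session in sorted(active):
            rows = lock_orders[session]
            if position[session] == len(rows):
                # Session is done: release every lock it holds.
                for row in [r for r, s in owner.items() if s == session]:
                    del owner[row]
                active.discard(session)
                waiting_for.pop(session, None)
                progressed = True
                continue
            row = rows[position[session]]
            holder = owner.get(row)
            if holder is None or holder == session:
                owner[row] = session
                position[session] += 1
                waiting_for.pop(session, None)
                progressed = True
            else:
                waiting_for[session] = holder
        if not progressed:
            # Every active session is blocked: walk wait-for edges to a repeat.
            current = sorted(active)[0]
            seen = []
            while current not in seen:
                seen.append(current)
                current = waiting_for[current]
            return seen[seen.index(current):]
    return None
```

With find_deadlock({"bulk": [1, 2], "scout": [2, 1]}) each session ends up waiting on a row the other holds, so a cycle is returned. With both sessions using the order [1, 2], the second session simply waits and both complete.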

========================================

Querying the customer DB internally:

SQL> SELECT probe_type as "Probe Type", count(*) as "Total Probes" from rhn_probe GROUP BY probe_type;

Probe Type      Total Probes
--------------- ------------
check                   3237
suite                    124

SQL>

Comment 1 Stephen Herr 2012-04-24 22:07:45 UTC

fixed in Spacewalk master: 6f79cf500064b4bf0d5055fef4756351da912ad3

Comment 2 Stephen Herr 2012-09-05 15:09:30 UTC

and b7b676e05053795e7c8202bd7484a51cc49781c6

Comment 3 Stephen Herr 2012-09-05 15:33:10 UTC

and 6da40f3e97b78d807546dca6b315e2c05b80ab04

Comment 4 Jan Pazdziora (Red Hat) 2012-10-30 19:25:22 UTC

Moving ON_QA. Packages that address this bugzilla should now be available in yum repos at http://yum.spacewalkproject.org/nightly/

Comment 5 Jan Pazdziora (Red Hat) 2012-11-01 16:21:10 UTC

Spacewalk 1.8 has been released: https://fedorahosted.org/spacewalk/wiki/ReleaseNotes18
