Bug 1093265
Summary: | Database deadlock errors when attempting to apply alert definitions updates | ||||||
---|---|---|---|---|---|---|---|
Product: | [JBoss] JBoss Operations Network | Reporter: | Larry O'Leary <loleary> | ||||
Component: | Database | Assignee: | RHQ Project Maintainer <rhq-maint> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | JON 3.2 | CC: | jshaughn, lkrejci, myarboro | ||||
Target Milestone: | ER02 | Keywords: | Triaged | ||||
Target Release: | JON 3.3.0 | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: |
If a user made a change to an alert template, saved it, realized that something was missed, made a second change to the same alert template and then resaved it, a database deadlock error could occur if the updates followed in quick succession. The fix adds a longer button unavailability timeout to prevent concurrent updates.
|
Story Points: | --- | ||||
Clone Of: | Environment: | ||||||
Last Closed: | 2014-12-11 14:05:17 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Although probably not the issue given the error we're seeing, all Oracle installs for 3.2.0 should apply:

<no-tx-separate-pools>true</no-tx-separate-pools>

to their RHQDS declaration in the standalone-full.conf. This is fixed with the application of the 3.2.1 CP. I'd recommend making this change and re-evaluating.

One other thought is that I don't think we prevent the GUI from updating a template in quick succession. Maybe not even that quick, just while the prior update is still in progress. The log almost looks like updateTemplate was invoked 3 times in very quick succession...

Interestingly, the failure does not seem to occur while applying the template changes to the potentially large number of affected resources, but rather while updating the template itself.

Also, if this is easily repeated in the errant environment, try turning on debug level server logging to gather some extra information.

This issue occurs when the same alert template is updated more than once within a short period of time. In other words:

- make a change to an alert template
- save it
- realize you missed something and make a second change to the same alert template
- save it

It isn't clear how updates are persisted to the database, but it is clear that the same template can be updated while the first update is not yet saved. This results in a database deadlock that prevents the update action from completing.

I think the solution here is to disable the UI button until the prior request is completed.

master

commit f5bf43c71e26bd98d88a55686bf32487580d806a
Author: Jay Shaughnessy <jshaughn>
Date: Fri Aug 1 15:20:42 2014 -0400

Added some longer button disablement to avoid concurrent updates. If this "cheap" fix isn't sufficient we'd have to look at something server-side, and more substantial.

release/jon3.3.x:

commit c015d106cc97ab3e95775cf61b0f220a8ff63c43
Author: Jay Shaughnessy <jshaughn>
Date: Fri Aug 1 15:20:42 2014 -0400

[1093265] Added some longer button disablement to avoid concurrent updates. If this "cheap" fix isn't sufficient we'd have to look at something server-side, and more substantial.

(cherry picked from commit f5bf43c71e26bd98d88a55686bf32487580d806a)
Signed-off-by: Lukas Krejci <lkrejci>

Moving to ON_QA as available for test with the following brew build:
https://brewweb.devel.redhat.com//buildinfo?buildID=381194
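The suggestion in the comments, blocking a second save while the prior request is still in progress, can be sketched as a small guard. This is only an illustration and not the code from the commits above, which lengthen the button disablement timeout instead. The class name `SingleFlightGuard` and its methods are hypothetical and do not exist in RHQ.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical guard that admits one save at a time.
 * Not RHQ code: all names here are invented for illustration.
 */
final class SingleFlightGuard {
    private final AtomicBoolean inFlight = new AtomicBoolean(false);

    /** Returns true if the caller may start a save, false if one is still running. */
    boolean tryBegin() {
        return inFlight.compareAndSet(false, true);
    }

    /** Call from the request's completion callback, on success and on failure. */
    void end() {
        inFlight.set(false);
    }

    public static void main(String[] args) {
        SingleFlightGuard guard = new SingleFlightGuard();
        System.out.println(guard.tryBegin()); // true: first save starts
        System.out.println(guard.tryBegin()); // false: second click is rejected
        guard.end();                          // first save has completed
        System.out.println(guard.tryBegin()); // true: a new save may start
    }
}
```

A guard like this lives in one browser session, so it would not stop two different users, or two servers, from updating the same template at once. That case is what the "server-side, and more substantial" remark in the commit message would have to cover.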
Created attachment 891392 [details]
Excerpt from server log showing deadlock messages/exceptions/stacks

Description of problem:
Deadlock errors are reported, resulting in database failures. From the stack trace (in the log excerpt) it appears these happen when one or more alert definitions are being updated by an alert template.

Caused by: java.sql.BatchUpdateException: ORA-00060: deadlock detected while waiting for resource

The DBA's analysis provided the following report:

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning and Real Application Testing options
System name: Linux
Node name: nummela
Release: 2.6.32-358.6.2.el6.x86_64
Version: #1 SMP Tue May 14 15:48:21 EDT 2013
Machine: x86_64
Redo thread mounted by this instance: 1
Oracle process number: 232
Unix process pid: 49078, image: oracle@nummela

*** 2014-04-24 13:29:48.469
*** SESSION ID:(426.30254) 2014-04-24 13:29:48.469
*** CLIENT ID:() 2014-04-24 13:29:48.469
*** SERVICE NAME:(SYS$USERS) 2014-04-24 13:29:48.469
*** MODULE NAME:(JDBC Thin Client) 2014-04-24 13:29:48.469
*** ACTION NAME:() 2014-04-24 13:29:48.469

*** 2014-04-24 13:29:48.469
DEADLOCK DETECTED ( ORA-00060 )

[Transaction Deadlock]

The following deadlock is not an ORACLE error. It is a deadlock due to user error in the design of an application or from issuing incorrect ad-hoc SQL. The following information may aid in determining the deadlock:

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TM-0019b065-00000000       232     537    SX             98      73    SX   SSX
TM-0019b0cb-00000000        98      73    SX            232     537    SX   SSX

session 426: DID 0001-00E8-01508E45   session 62: DID 0001-0062-015559F7
session 62: DID 0001-0062-015559F7    session 426: DID 0001-00E8-01508E45

Rows waited on:
Session 426: no row
Session 62: no row

----- Information for the OTHER waiting sessions -----
Session 62:
sid: 62 ser: 1049 audsid: 2329692 user: 109/JON
flags: (0x1100041) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40009) -/-/INC
pid: 87 O/S info: user: orac, term: UNKNOWN, ospid: 38349
image: oracle@nummela
client details:
O/S info: user: jbossadm, term: unknown, ospid: 1234
machine: jboss-on-01.example.com program: JDBC Thin Client
application name: JDBC Thin Client, hash value=2546894660
current SQL:
delete from RHQ_CONFIG where id=:1
----- End of information for the OTHER waiting sessions -----

Information for THIS session:

----- Current SQL Statement for this session (sql_id=gc2srj09d9kss) -----
delete from RHQ_CONFIG where id=:1

*** 2014-04-24 13:29:48.567
Attempting to break deadlock by signaling ORA-00060

Version-Release number of selected component (if applicable):
3.2.0.GA

Additional info:
This JBoss ON system is made up of two servers. At the time of the deadlock, server 02 did not report any errors and did not appear to be attempting any database purge or other jobs.
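The deadlock graph in the DBA report shows a cycle: each of the two sessions holds an SX lock on one TM resource and waits for an SSX lock on the resource the other session holds. The sketch below is only an in-JVM analogy of that shape, using two `ReentrantLock` objects acquired in opposite order. It does not model Oracle's lock modes and is not RHQ code; the class name `OppositeOrderDemo`, the lock names, and the 200 ms timeout are assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative only: two workers take two locks in opposite order. */
final class OppositeOrderDemo {

    /** Returns, per worker, whether it obtained its second lock before timing out. */
    static boolean[] run() {
        ReentrantLock resourceA = new ReentrantLock();
        ReentrantLock resourceB = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] gotSecond = new boolean[2];

        Thread s1 = new Thread(() ->
            session(0, resourceA, resourceB, bothHoldFirst, bothTried, gotSecond));
        Thread s2 = new Thread(() ->
            session(1, resourceB, resourceA, bothHoldFirst, bothTried, gotSecond));
        s1.start();
        s2.start();
        try {
            s1.join();
            s2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return gotSecond;
    }

    private static void session(int idx, ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                                boolean[] gotSecond) {
        first.lock();
        try {
            // Wait until both workers hold their first lock, so the cycle is guaranteed.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // A timed attempt stands in for the database's deadlock detection.
            boolean ok = second.tryLock(200, TimeUnit.MILLISECONDS);
            gotSecond[idx] = ok;
            if (ok) {
                second.unlock();
            }
            // Keep the first lock until both attempts have finished.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("worker 1 got second lock: " + result[0]);
        System.out.println("worker 2 got second lock: " + result[1]);
    }
}
```

Neither worker can obtain its second lock while the other still holds it, which is the same circular wait the graph reports. Serializing the two updates, or always taking the locks in one fixed order, removes the cycle.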