Bug 909935 - engine: live snapshot fails with nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000460: Error checking for a transaction
Summary: engine: live snapshot fails with nested exception is java.sql.SQLException: j...
Keywords:
Status: CLOSED DUPLICATE of bug 885460
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.2.0
Assignee: Maor
QA Contact: Dafna Ron
URL:
Whiteboard: storage
Depends On:
Blocks: 902824
 
Reported: 2013-02-11 13:34 UTC by Dafna Ron
Modified: 2016-02-10 17:45 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-04-04 07:15:45 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
amureini: Triaged+


Attachments
logs (632.24 KB, application/x-gzip)
2013-02-11 13:34 UTC, Dafna Ron

Description Dafna Ron 2013-02-11 13:34:26 UTC
Created attachment 696085
logs

Description of problem:

I ran multiple live snapshots (created several VMs, then created a live snapshot on each VM).
One of the snapshots failed with: nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000460: Error checking for a transaction

Version-Release number of selected component (if applicable):

si27

How reproducible:

Steps to Reproduce:
1. Create 10 VMs from a template using a pool
2. Run the VMs
3. Create a live snapshot on each VM
  
Actual results:

A snapshot failed with: nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000460: Error checking for a transaction

Expected results:

The live snapshots should not fail.

Additional info: logs

Comment 2 Haim 2013-03-11 13:23:37 UTC
Marking as a regression: this scenario used to work in 3.1.0.

Comment 3 Ayal Baron 2013-04-03 07:34:08 UTC
IIUC the failed live snapshot took more than 5 minutes. What I'd like to understand is why it takes so long.

Comment 4 Maor 2013-04-03 16:45:52 UTC
> IIuc the failed live snapshot took more than 5 minutes.  What I'd like to
> understand is why it takes so long.
I didn't see in the logs that it took more than 5 minutes.
From what I have seen, the engine got an exception after we rolled back and tried to call rollbackQuota. rollbackQuota tried to fetch the storage pool, but since the transaction had already been rolled back, there was no transaction to use in the DB.
IMO the fix for BZ909937 should keep us from getting to that rollback issue, but there should still be a proper fix for rollbackQuota after compensation.

Comment 5 Maor 2013-04-03 16:53:17 UTC
I suspect the issue here is that the storage pool is not initialized at the end command phase. Therefore, when we roll back, rollbackQuota first tries to get it from memory and, if it is not there, tries to fetch it from the DB.
This could happen with any command that rolls back at the end command phase.
Ofri, can you confirm this is the case here?
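
To illustrate the suspected pattern, here is a minimal Java sketch of a "memory first, DB second" lookup that breaks once the transaction is gone. All class and method names below (FakeTransaction, FakeStoragePoolDao, QuotaCommandSketch) are hypothetical stand-ins, not taken from the ovirt-engine source; the real rollbackQuota is assumed to behave roughly like this.

```java
// Hypothetical stand-in for a transaction that can be aborted.
class FakeTransaction {
    boolean active = true;
    void abort() { active = false; }
}

// Hypothetical DAO: every call needs a live transaction to reach the DB.
class FakeStoragePoolDao {
    private final FakeTransaction tx;
    FakeStoragePoolDao(FakeTransaction tx) { this.tx = tx; }

    String fetchStoragePool() {
        if (!tx.active) {
            throw new IllegalStateException("Transaction is not active");
        }
        return "storage-pool";
    }
}

// Hypothetical command holding an in-memory copy of the storage pool.
class QuotaCommandSketch {
    private final FakeStoragePoolDao dao;
    private String storagePool; // null until the first successful fetch

    QuotaCommandSketch(FakeStoragePoolDao dao) { this.dao = dao; }

    // Memory first, DB second.
    String getStoragePool() {
        if (storagePool == null) {
            storagePool = dao.fetchStoragePool();
        }
        return storagePool;
    }

    // Returns true if the quota rollback could resolve the storage pool.
    boolean rollbackQuota() {
        try {
            return getStoragePool() != null;
        } catch (IllegalStateException e) {
            return false; // no cached copy and no usable transaction
        }
    }
}
```

In this sketch, a command that fetched the storage pool before the abort can still roll back its quota, while a command that never initialized it cannot.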

Comment 6 Maor 2013-04-04 07:15:45 UTC

*** This bug has been marked as a duplicate of bug 885460 ***

Comment 7 Maor 2013-04-04 13:47:45 UTC
Logs of both bugs show the same issue:

rollbackQuota(ImportVmCommand.java:1005) [engine-bll.jar:]
....
javax.resource.ResourceException: IJ000459: Transaction is not active: tx=TransactionImple < ac, BasicAction: 0:ffff7f000001:2a04c229:50be1126:5aeb9 status: ActionStatus.ABORT_ONLY >
According to comment 4 of BZ885460.

The problem in this bug is that, because of a deadlock (described in BZ909937), the transaction was aborted and the compensation flow was executed.
As part of the compensation flow, the engine tried to execute rollbackQuota and fetch the storage pool in order to revert the quota assignment.
Since the transaction was aborted, the storage pool could not be fetched, and therefore we got the same exception as in BZ885460.
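
The sequence described above can be sketched as a small Java simulation. This is only an illustration under stated assumptions: AbortOnlyTx and CompensationFlowSketch are hypothetical names, the deadlock is simulated with a plain RuntimeException, and the prefetchPool option is one conceivable mitigation, not the fix applied for BZ885460.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical transaction that can only be marked abort-only.
class AbortOnlyTx {
    private boolean abortOnly;
    void markAbortOnly() { abortOnly = true; }
    boolean isUsable() { return !abortOnly; }
}

class CompensationFlowSketch {
    // Runs the flow and returns the ordered list of events.
    // prefetchPool = true resolves the storage pool before the
    // transactional work starts (hypothetical mitigation).
    static List<String> run(boolean prefetchPool) {
        List<String> events = new ArrayList<>();
        AbortOnlyTx tx = new AbortOnlyTx();
        String cachedPool = prefetchPool ? "storage-pool" : null;

        try {
            events.add("execute");
            // Simulated deadlock: the transaction becomes abort-only.
            tx.markAbortOnly();
            throw new RuntimeException("deadlock detected");
        } catch (RuntimeException deadlock) {
            events.add("compensate");
            // rollbackQuota needs the storage pool; without a cached
            // copy it would have to query the DB in the aborted transaction.
            if (cachedPool == null && !tx.isUsable()) {
                events.add("rollbackQuota failed: transaction is not active");
            } else {
                events.add("rollbackQuota ok");
            }
        }
        return events;
    }
}
```

Calling CompensationFlowSketch.run(false) reproduces the failing order of events (execute, compensate, failed rollbackQuota), while run(true) shows the same flow completing because no DB access is needed during compensation.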

