Bug 1264677
Summary: | Can't create a template with Cinder disks | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Ori Gofen <ogofen> | ||||||
Component: | ovirt-engine | Assignee: | Daniel Erez <derez> | ||||||
Status: | CLOSED ERRATA | QA Contact: | Ori Gofen <ogofen> | ||||||
Severity: | urgent | Docs Contact: | |||||||
Priority: | unspecified | ||||||||
Version: | 3.6.0 | CC: | acanan, amureini, derez, gklein, lsurette, masayag, mlipchuk, ogofen, pkliczew, rbalakri, Rhev-m-bugs, rnori, yeylon, ykaul | ||||||
Target Milestone: | ovirt-3.6.0-rc3 | ||||||||
Target Release: | 3.6.0 | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | ovirt-engine-3.6.0.1 | Doc Type: | Bug Fix | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | |||||||||
: | 1264691 (view as bug list) | Environment: | |||||||
Last Closed: | 2016-03-09 21:13:31 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Bug Depends On: | |||||||||
Bug Blocks: | 1135132, 1252958, 1253215 | ||||||||
Attachments: |
|
Description
Ori Gofen
2015-09-20 14:31:35 UTC
Hi Ravi, The mentioned code is located at 'AddVmTemplateCommand -> addVmTemplateCinderDisks()', and is invoked from within a transaction. Perhaps there's an issue with Future's get() and transactions on JBoss EAP? Thanks! Please provide engine log with debug level and corresponding vdsm log. Also, please describe the flow, as I see future objects being used there both to get a return value from running async engine commands, and probably also VDSM ones. Also, please verify you run using the exact same version (engine+vdsm), both in jboss EAP and wildfly versions, as I find it hard to believe it is related to that. (In reply to Oved Ourfali from comment #3) > Also, please describe the flow, as I see future objects being used there > both to get a return value from running async engine commands, and probably > also VDSM ones. > The relevant flow consists of a VM with a single Cinder disk (i.e. no VDSM tasks, only CoCo). The flow is as follows: AddVmTemplateCommand -> executeCommand() -> addVmTemplateCinderDisks method (inside a transaction) -> Future 'VdcActionType.CloneCinderDisks' -> stuck on future.get(). > Also, please verify you run using the exact same version (engine+vdsm), both > in jboss EAP and wildfly versions, as I find it hard to believe it is > related to that. It was just a guess.. As the same scenario works correctly in oVirt stable (and not in RHEV 3.6 beta), so it could be any other difference (DB version/jboss/java/etc). @Ori - can you please attach the logs with debug level.. (https://bugzilla.redhat.com/show_bug.cgi?id=1264677#c2) But there was no difference between latest upstream and downstream builds? Also, we will also need a cinder instance to work with. Can you send its details offline? By examining the log, it is shown that the duration of the command exceeds 10 minutes: Start: 2015-08-30 15:54:41,331 INFO [org.ovirt.engine.core.bll.AddVmTemplateCommand] (ajp-/127.0.0.1:8702-9) [4566b62d] Running command: AddVmTemplateCommand internal: false. Entities affected : ID: 00000001-0001-0001-0001-00000000033a Type: StoragePoolAction group CREATE_TEMPLATE with role type USER End: 2015-08-30 16:04:42,758 ERROR [org.ovirt.engine.core.bll.AddVmTemplateCommand] (ajp-/127.0.0.1:8702-9) [] Command 'org.ovirt.engine.core.bll.AddVmTemplateCommand' failed: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000460: Error checking for a transaction ... Caused by: javax.resource.ResourceException: IJ000459: Transaction is not active: And this makes sense since the default timeout of engine's transaction is 600 seconds: <coordinator-environment default-timeout="600"/> I think it makes perfect sense to re-shape that command and exclude the template/disks creation part from the transaction in AddVmTemplateCommand.executeCommand(). It might be that on the environment where the scenario worked - the command didn't face the transaction timeout (better connectivity/smaller disk or other reason for not hitting timeout). Created attachment 1075328 [details]
vdsm.log
Created attachment 1075329 [details]
engine.log
Attached are logs from my env. (In reply to Moti Asayag from comment #6) > By examining the log, it is shown that the duration of the command exceeds > 10 minutes: > > Start: > 2015-08-30 15:54:41,331 INFO > [org.ovirt.engine.core.bll.AddVmTemplateCommand] (ajp-/127.0.0.1:8702-9) > [4566b62d] Running command: AddVmTemplateCommand internal: false. Entities > affected : ID: 00000001-0001-0001-0001-00000000033a Type: StoragePoolAction > group CREATE_TEMPLATE with role type USER > > > End: > > 2015-08-30 16:04:42,758 ERROR > [org.ovirt.engine.core.bll.AddVmTemplateCommand] (ajp-/127.0.0.1:8702-9) [] > Command 'org.ovirt.engine.core.bll.AddVmTemplateCommand' failed: Could not > get JDBC Connection; nested exception is java.sql.SQLException: > javax.resource.ResourceException: IJ000460: Error checking for a transaction > ... > Caused by: javax.resource.ResourceException: IJ000459: Transaction is not > active: > > > And this makes sense since the default timeout of engine's transaction is > 600 seconds: > <coordinator-environment default-timeout="600"/> > > I think it makes perfect sense to re-shape that command and exclude the > template/disks creation part from the transaction in > AddVmTemplateCommand.executeCommand(). > > It might be that on the environment where the scenario worked - the command > didn't face the transaction timeout (better connectivity/smaller disk or > other reason for not hitting timeout). Yeah, thought about that as well. However, the timeout repeats constantly on multiple rhev environments. Whereas, works constantly on multiple oVirt envs... (In reply to Oved Ourfali from comment #5) > But there was no difference between latest upstream and downstream builds? > > Also, we will also need a cinder instance to work with. Can you send its > details offline? Sure, I'll send it offline by request. Ori has one as well if needed. Based on Moti's comment, and offline discussion, moving to storage to consider applying Moti's suggestion, and check whether it solves the issue. Returning needinfo from comment #1 The Future.get operation can block only if the child command's execute method is not completing. I would check to see if CloneSingleCinderDiskCommand.executeCommand method is completing, If not I would think there is an underlying issue there. Verified on rhevm-3.6.0.1-0.1.el6.noarch Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-0376.html |