Description of problem:

1. At 13:41:06, admin attaches export domain to DC.

2017-08-21 13:41:06,655+10 INFO [org.ovirt.engine.core.bll.storage.domain.AttachStorageDomainToPoolCommand] (org.ovirt.thread.pool-6-thread-48) [b56aacc1-c32e-49be-8c67-15897edb9783] Running command: AttachStorageDomainToPoolCommand internal: false. Entities affected : ID: 92db7237-df7e-4e08-bbb3-3c040bfef826 Type: StorageAction group MANIPULATE_STORAGE_DOMAIN with role type ADMIN, ID: 59938be9-0258-02fc-02b4-0000000000aa Type: StoragePoolAction group MANIPULATE_STORAGE_DOMAIN with role type ADMIN

2. Then at 13:41:39, a GetImagesListVDSCommand returning 50+ images (highly populated Export SD).

3. Then a sequence of GetVolumesListVDSCommand and GetImageInfoVDSCommand for the list in [2]. This loops for 5 minutes, getting the info for all the images sequentially.

4. At the 5 minute mark, at 13:46:39:

2017-08-21 13:46:39,036+10 ERROR [org.ovirt.engine.core.bll.storage.disk.image.GetUnregisteredDiskQuery] (org.ovirt.thread.pool-6-thread-48) [b56aacc1-c32e-49be-8c67-15897edb9783] Query 'GetUnregisteredDiskQuery' failed: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000460: Error checking for a transaction

2017-08-21 13:46:39,036+10 ERROR [org.ovirt.engine.core.bll.storage.disk.image.GetUnregisteredDiskQuery] (org.ovirt.thread.pool-6-thread-48) [b56aacc1-c32e-49be-8c67-15897edb9783] Exception: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000460: Error checking for a transaction

If I reduce the number of images so that the loop at [3] takes less than 5 minutes, everything works fine.

The logs are similar to bugzilla #1446878 but I don't see any deadlock in postgres logs.

Is the loop on getting the volume infos for all images inside a 5 minute transaction that times out?
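The timing in the log excerpts above can be checked with a short script. This is only a minimal sketch: the helper names (`parse_engine_timestamp`, `seconds_between`) are made up for illustration and are not part of ovirt-engine, and it assumes every engine.log line starts with a 23-character `YYYY-MM-DD HH:MM:SS,mmm` stamp, as the two lines quoted in this report do. The 13:41:39 time for GetImagesListVDSCommand is given only to the second, so it is entered by hand.

```python
from datetime import datetime


def parse_engine_timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' stamp of an engine.log line.

    Assumption: the stamp occupies the first 23 characters. The UTC offset
    that follows (e.g. '+10') is ignored, since all lines here share it.
    """
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def seconds_between(start: datetime, end: datetime) -> float:
    """Return the elapsed time from start to end in seconds."""
    return (end - start).total_seconds()


# Log lines quoted in this report, truncated after the logger name.
attach_line = "2017-08-21 13:41:06,655+10 INFO [org.ovirt.engine.core.bll.storage.domain.AttachStorageDomainToPoolCommand]"
error_line = "2017-08-21 13:46:39,036+10 ERROR [org.ovirt.engine.core.bll.storage.disk.image.GetUnregisteredDiskQuery]"

attach_time = parse_engine_timestamp(attach_line)
error_time = parse_engine_timestamp(error_line)

# Step 2 is reported only as "13:41:39", so milliseconds are unknown.
image_list_time = datetime(2017, 8, 21, 13, 41, 39)

since_attach = seconds_between(attach_time, error_time)
since_image_list = seconds_between(image_list_time, error_time)

print(f"attach -> error:     {since_attach:.3f} s")
print(f"image list -> error: {since_image_list:.3f} s")
```

The gap between the image list and the JDBC error comes out at about 300 seconds, which is consistent with the question about a 5 minute transaction timeout but does not prove it on its own.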
Version-Release number of selected component (if applicable):
rhevm-4.1.4.2-0.1.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create NFS Share
2. Create Export Domain on NFS share
3. Maintenance and Detach Export Domain
4. Use virt-v2v to populate it with ~50+ disks.
   # for ((n=0;n<50;n++)); do virt-v2v -o rhev -os 10.64.24.33:/exports/data7 rhel7.3; done
5. Attach Export Domain

* You may need more than 50 images if the Storage/Hosts are fast. Mine is quite slow.

Actual results:
Export Domain fails to attach

Expected results:
Export Domain attached
Benny, the attached patch is merged. Should this be MODIFIED, or are we waiting for anything else?
Moved to MODIFIED
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason: [No relevant external trackers attached] For more info please contact: rhv-devops
Verified with the following code:
--------------------------------------
ovirt-engine-4.2.0-0.5.master.el7.noarch
vdsm-4.20.8-53.gitc3edfc0.el7.centos.x86_64

Verified with the following scenario:
--------------------------------------
1. Created 50 VMs with disks and exported them to the export domain
2. Set the export domain to maintenance and detached it
3. Attached the export domain again >>>>> the attach operation has undergone significant performance improvements and it took a few seconds to complete
4. VM Import displays the VMs within a few seconds too.

Moving to VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:1488