+++ This bug is a downstream clone. The original bug is: +++
+++ bug 1430865 +++
======================================================================

Description of problem:
Unable to attach a data storage domain that was detached/removed.

Version-Release number of selected component (if applicable):
4.0.6.3-0.1.el7ev

How reproducible:
Not always

Steps to Reproduce:
Unsure - the OVS network in another data center went offline while the data storage domain was being detached.

Actual results:
Adding the storage domain back and attaching it fails with the following errors:

[....]
2017-03-09 11:47:59,861 ERROR [org.ovirt.engine.core.bll.storage.disk.image.RegisterDiskCommand] (org.ovirt.thread.pool-6-thread-33) [2ffff1f4] Transaction rolled-back for command 'org.ovirt.engine.core.bll.storage.disk.image.RegisterDiskCommand'.
2017-03-09 11:47:59,861 ERROR [org.ovirt.engine.core.bll.storage.disk.image.RegisterDiskCommand] (org.ovirt.thread.pool-6-thread-33) [2ffff1f4] Transaction rolled-back for command 'org.ovirt.engine.core.bll.storage.disk.image.RegisterDiskCommand'.
2017-03-09 11:47:59,861 INFO  [org.ovirt.engine.core.utils.transaction.TransactionSupport] (org.ovirt.thread.pool-6-thread-33) [2ffff1f4] transaction rolled back
2017-03-09 11:47:59,862 ERROR [org.ovirt.engine.core.bll.storage.domain.AttachStorageDomainToPoolCommand] (org.ovirt.thread.pool-6-thread-33) [2ffff1f4] Command 'org.ovirt.engine.core.bll.storage.domain.AttachStorageDomainToPoolCommand' failed: CallableStatementCallback; SQL [{call insertunregistereddiskstovms(?, ?, ?, ?)}]; ERROR: duplicate key value violates unique constraint "pk_unregistered_disks_to_vms"
  Detail: Key (disk_id, entity_id, storage_domain_id)=(f1f4cd1e-6bc7-4915-991c-49cb0abb9aef, a5e20a64-4160-4963-8dc0-68b86410c910, 9b161463-ce93-45e7-8f14-074d670dd32b) already exists.
  Where: SQL statement "INSERT INTO unregistered_disks_to_vms ( disk_id, entity_id, entity_name, storage_domain_id ) VALUES ( v_disk_id, v_entity_id, v_entity_name, v_storage_domain_id )"
  PL/pgSQL function insertunregistereddiskstovms(uuid,uuid,character varying,uuid) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "pk_unregistered_disks_to_vms"
[....]

Expected results:
Attaching a storage domain back to its original location should work properly.

Additional info:
All engine.log output is located here: http://pastebin.test.redhat.com/463201 - see lines 409 through 426 for the messages above.

(Originally by Sam Yangsao)
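[Editor's note] For reference, the failure reduces to the composite primary key named in the error. The sketch below reproduces the symptom in plain SQL; the column types are inferred from the insertunregistereddiskstovms(uuid, uuid, character varying, uuid) signature in the log, and the table definition and the 'example-vm' name are assumptions, not the real engine schema:

-- Hypothetical schema, inferred from the stored-procedure signature and the
-- constraint named in the error above (not the actual engine DDL).
CREATE TABLE unregistered_disks_to_vms (
    disk_id           UUID NOT NULL,
    entity_id         UUID NOT NULL,
    entity_name       VARCHAR(255),
    storage_domain_id UUID NOT NULL,
    CONSTRAINT pk_unregistered_disks_to_vms
        PRIMARY KEY (disk_id, entity_id, storage_domain_id)
);

-- The first insert succeeds.
INSERT INTO unregistered_disks_to_vms VALUES
    ('f1f4cd1e-6bc7-4915-991c-49cb0abb9aef',
     'a5e20a64-4160-4963-8dc0-68b86410c910',
     'example-vm',
     '9b161463-ce93-45e7-8f14-074d670dd32b');

-- An identical second insert raises the same error seen in engine.log,
-- because all three key columns repeat:
INSERT INTO unregistered_disks_to_vms VALUES
    ('f1f4cd1e-6bc7-4915-991c-49cb0abb9aef',
     'a5e20a64-4160-4963-8dc0-68b86410c910',
     'example-vm',
     '9b161463-ce93-45e7-8f14-074d670dd32b');
-- ERROR: duplicate key value violates unique constraint
--        "pk_unregistered_disks_to_vms"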
This constraint was introduced in RHV 4.0.1 as part of the fix for bug 1302780. Maor - can you take a look please? (Originally by Allon Mureinik)
Hi Sam,

Just wanted to clarify something: you wrote that you have one DC with 2 clusters, Dell-cluster and HP-cluster. By clusters, did you mean Linux clusters providing High Availability for the RHEV-M?

(Originally by Maor Lipchuk)
(In reply to Maor from comment #5)
> Hi Sam,
>
> Just wanted to clarify something: you wrote that you have one DC with 2
> clusters, Dell-cluster and HP-cluster. By clusters, did you mean Linux
> clusters providing High Availability for the RHEV-M?

Hey Maor,

It's 1 data center with 2 clusters. No HA for the RHV-M :)

Thanks!

(Originally by Sam Yangsao)
Hi,

I will take a look at it first thing tomorrow morning.
Thank you for the info.

(Originally by Maor Lipchuk)
Hi Sam,

I think I found the issue. Thank you very much for your help and for the access to your env - that was very helpful and reduced the time it took to find the issue.

It looks like there were VMs with disks and snapshots, and some of the snapshots were deleted before an OVF update was written to the OVF_STORE disk. At that point in time the OVF of each VM indicated that the disks contained snapshots, while those disks actually had no snapshots.

Once the storage domain was attached, those disks were fetched as potential disks to register, and they were also part of the VMs. There seems to be a bug in the OVF XML parser that adds those disks to the VMs they are attached to: since the XML was not updated after the snapshots were removed, each disk was initialized with the VMs it was attached to, but those entries were actually the same VM repeated, and that caused the SQL exception.

Steps to reproduce:
1. Create a VM with disks and a snapshot
2. Delete the snapshot
3. Force-remove the storage domain (do not deactivate it, since deactivation would update the OVF_STORE)
4. Try to attach the storage domain back to the Data Center

I will post a patch that fixes it.
Thank you again for your help.

(Originally by Maor Lipchuk)
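[Editor's note] The actual patch addresses the engine's OVF parsing so the same VM is not added twice for a disk. Purely to illustrate how repeated (disk, VM, domain) tuples interact with the constraint, a hypothetical database-side guard - an assumption, not the shipped change - could make the stored procedure idempotent:

-- Hypothetical variant of the procedure from the error above: skip tuples
-- that are already registered, so feeding the same (disk, VM, domain) pair
-- twice no longer aborts the whole attach transaction. WHERE NOT EXISTS is
-- used rather than ON CONFLICT so the sketch also runs on the older
-- PostgreSQL versions shipped with RHEL 7.
CREATE OR REPLACE FUNCTION insertunregistereddiskstovms (
    v_disk_id UUID,
    v_entity_id UUID,
    v_entity_name VARCHAR(255),
    v_storage_domain_id UUID)
RETURNS VOID AS $$
BEGIN
    INSERT INTO unregistered_disks_to_vms (
        disk_id, entity_id, entity_name, storage_domain_id)
    SELECT v_disk_id, v_entity_id, v_entity_name, v_storage_domain_id
    WHERE NOT EXISTS (
        SELECT 1
        FROM unregistered_disks_to_vms
        WHERE disk_id = v_disk_id
          AND entity_id = v_entity_id
          AND storage_domain_id = v_storage_domain_id);
END; $$
LANGUAGE plpgsql;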
(In reply to Maor from comment #13)
> Hi Sam,
>
> I think I found the issue. [...]
>
> I will post a patch that fixes it.
> Thank you again for your help.

Awesome, thanks for your hard work in finding this, Maor.

(Originally by Sam Yangsao)
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason: [FOUND CLONE FLAGS: ['rhevm-4.1.z', 'rhevm-4.2-ga'], ] For more info please contact: rhv-devops (Originally by rhev-integ)
Verified with the following code:
---------------------------------
vdsm-4.19.11-1.el7ev.x86_64
rhevm-4.1.2-0.1.el7

Steps to reproduce:
------------------------------------------
1. Create a VM with disks and a snapshot
2. Delete the snapshot
3. Force-remove the storage domain (do not deactivate it, since deactivation would update the OVF_STORE)
4. Try to attach the storage domain back to the Data Center

Moving to VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1280