Bug 1430865 - ERROR: duplicate key value violates unique constraint "pk_unregistered_disks_to_vms"
Summary: ERROR: duplicate key value violates unique constraint "pk_unregistered_disks_to_vms"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.0.6
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.2.0
Target Release: 4.2.0
Assignee: Maor
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks: 1446920
 
Reported: 2017-03-09 18:15 UTC by Sam Yangsao
Modified: 2023-09-07 18:51 UTC
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, if a snapshot of a disk attached to a virtual machine was deleted and the user tried to attach the storage domain containing this virtual machine before the OVF_STORE had been updated with the change, the attachment operation would fail. Because the OVF indicated the presence of a disk with a snapshot, this disk was fetched as a potential disk to register, even though it was already part of a virtual machine. In the current release, the disks are counted only once and the storage domain can be attached.
Clone Of:
Clones: 1446920
Environment:
Last Closed: 2018-05-15 17:41:09 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
ylavi: testing_plan_complete?




Links:
  Red Hat Product Errata RHEA-2018:1488 (last updated 2018-05-15 17:42:35 UTC)
  oVirt gerrit 74280 (master, MERGED): core: Use set for disk ids fetched from VM's OVF. (last updated 2020-10-07 10:31:08 UTC)
  oVirt gerrit 74320 (ovirt-engine-4.1, MERGED): core: Use set for disk ids fetched from VM's OVF. (last updated 2020-10-07 10:31:08 UTC)

Description Sam Yangsao 2017-03-09 18:15:35 UTC
Description of problem:

Unable to attach a data storage domain that was detached/removed

Version-Release number of selected component (if applicable):

4.0.6.3-0.1.el7ev

How reproducible:

Not always

Steps to Reproduce:

Unsure. While the data storage domain was being detached, the OVS network went offline in another data center.

Actual results:

Adding the storage domain back and attaching fails with the following errors:

[....]
2017-03-09 11:47:59,861 ERROR [org.ovirt.engine.core.bll.storage.disk.image.RegisterDiskCommand] (org.ovirt.thread.pool-6-thread-33) [2ffff1f4] Transaction rolled-back for command 'org.ovirt.engine.core.bll.storage.disk.image.RegisterDiskCommand'.
2017-03-09 11:47:59,861 ERROR [org.ovirt.engine.core.bll.storage.disk.image.RegisterDiskCommand] (org.ovirt.thread.pool-6-thread-33) [2ffff1f4] Transaction rolled-back for command 'org.ovirt.engine.core.bll.storage.disk.image.RegisterDiskCommand'.
2017-03-09 11:47:59,861 INFO  [org.ovirt.engine.core.utils.transaction.TransactionSupport] (org.ovirt.thread.pool-6-thread-33) [2ffff1f4] transaction rolled back
2017-03-09 11:47:59,862 ERROR [org.ovirt.engine.core.bll.storage.domain.AttachStorageDomainToPoolCommand] (org.ovirt.thread.pool-6-thread-33) [2ffff1f4] Command 'org.ovirt.engine.core.bll.storage.domain.AttachStorageDomainToPoolCommand' failed: CallableStatementCallback; SQL [{call insertunregistereddiskstovms(?, ?, ?, ?)}]; ERROR: duplicate key value violates unique constraint "pk_unregistered_disks_to_vms"
  Detail: Key (disk_id, entity_id, storage_domain_id)=(f1f4cd1e-6bc7-4915-991c-49cb0abb9aef, a5e20a64-4160-4963-8dc0-68b86410c910, 9b161463-ce93-45e7-8f14-074d670dd32b) already exists.
  Where: SQL statement "INSERT INTO unregistered_disks_to_vms (
        disk_id,
        entity_id,
        entity_name,
        storage_domain_id
        )
    VALUES (
        v_disk_id,
        v_entity_id,
        v_entity_name,
        v_storage_domain_id
        )"
PL/pgSQL function insertunregistereddiskstovms(uuid,uuid,character varying,uuid) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "pk_unregistered_disks_to_vms"
[....]

Expected results:

Attaching a storage domain back to its original location should work properly 

Additional info:

All engine.log output is located here:

http://pastebin.test.redhat.com/463201 - see lines 409 through 426 for the messages above.
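
For context on the database-level failure: the stack trace above shows the engine calling the insertunregistereddiskstovms stored procedure, and the constraint fires when the same (disk_id, entity_id, storage_domain_id) key is inserted a second time. The following is a minimal JDBC sketch of that second insert failing; the connection details and the VM name are placeholders, not values from this environment, and only the procedure call and the UUIDs are taken from the log above.

// Minimal sketch (hypothetical connection details) showing how a second call to
// insertunregistereddiskstovms with the same (disk_id, entity_id, storage_domain_id)
// key violates pk_unregistered_disks_to_vms, as reported in the log above.
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.UUID;

public class DuplicateUnregisteredDiskDemo {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection string and credentials; point this at a test copy of the engine DB.
        try (Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/engine", "engine", "password")) {
            UUID diskId = UUID.fromString("f1f4cd1e-6bc7-4915-991c-49cb0abb9aef");
            UUID vmId   = UUID.fromString("a5e20a64-4160-4963-8dc0-68b86410c910");
            UUID sdId   = UUID.fromString("9b161463-ce93-45e7-8f14-074d670dd32b");

            insert(con, diskId, vmId, "example-vm", sdId);  // first insert succeeds
            insert(con, diskId, vmId, "example-vm", sdId);  // second insert raises the duplicate key error
        }
    }

    // Same call the engine issues, taken from the stack trace:
    // insertunregistereddiskstovms(uuid, uuid, character varying, uuid)
    private static void insert(Connection con, UUID diskId, UUID vmId,
                               String vmName, UUID sdId) throws SQLException {
        try (CallableStatement cs =
                 con.prepareCall("{call insertunregistereddiskstovms(?, ?, ?, ?)}")) {
            cs.setObject(1, diskId);
            cs.setObject(2, vmId);
            cs.setString(3, vmName);
            cs.setObject(4, sdId);
            cs.execute();
        }
    }
}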

Comment 4 Allon Mureinik 2017-03-12 14:28:23 UTC
This constraint was introduced in RHV 4.0.1 as part of the fix for bug 1302780. Maor - can you take a look please?

Comment 5 Maor 2017-03-12 15:35:29 UTC
Hi Sam,

Just wanted to clarify something.
You wrote that you have one DC with 2 clusters, Dell-cluster and HP-cluster.
By clusters, did you mean Linux clusters used to provide High Availability for RHEV-M?

Comment 6 Sam Yangsao 2017-03-13 15:42:37 UTC
(In reply to Maor from comment #5)

Hey Maor,

It's 1 Data center with 2 clusters.  No HA for the RHV-M :)

Thanks!

Comment 10 Maor 2017-03-18 22:03:53 UTC
Hi,

I will take a look at it first thing tomorrow morning.
Thank you for the info

Comment 13 Maor 2017-03-19 13:10:22 UTC
Hi Sam,

I think I found the issue. Thank you very much for your help and for the access to your env; that was very helpful and reduced the time it took to find the issue.

It looks like there were VMs with disks and snapshots, and some of the snapshots were deleted before the OVF in the OVF_STORE disk was updated.
At that point in time the OVF of the VM still indicated that the disks contained snapshots, while those disks actually had no snapshots any more.
Once the storage domain was attached, those disks were fetched as potential disks to register even though they were also part of the VMs.

There seems to be a bug in the OVF XML parser that adds those disks to the VMs they are attached to. Because the XML was not updated after the removal of the snapshots, the same disk was associated with the VM it is attached to more than once; those entries all referred to the same VM, and the duplicate insert caused the SQL exception.

Steps to reproduce:
1. Create a VM with disks and a snapshot
2. Delete the snapshot
3. Force remove the storage domain (do not deactivate it, since deactivating would update the OVF_STORE)
4. Try to attach the storage domain back to the Data Center

I will post a patch that fixes it.
Thank you again for your help
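
The gerrit patches linked above are titled "core: Use set for disk ids fetched from VM's OVF." Below is a rough, self-contained sketch of that idea, not the actual engine code; the helper method and the duplicated id are illustrative only. It shows why collecting the disk ids from a stale OVF into a Set means each disk is registered once, so insertunregistereddiskstovms is never called twice with the same key.

// Illustrative sketch, assuming a stale OVF in which the same disk appears twice
// (once as the active disk and once via an entry left over from a deleted snapshot).
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class OvfDiskIdDedupSketch {

    // Hypothetical helper: disk ids as they might be read from a stale OVF.
    static List<String> diskIdsFromStaleOvf() {
        return Arrays.asList(
                "f1f4cd1e-6bc7-4915-991c-49cb0abb9aef",   // active disk
                "f1f4cd1e-6bc7-4915-991c-49cb0abb9aef");  // same disk, via the deleted snapshot's entry
    }

    public static void main(String[] args) {
        // Keeping the ids in a List preserves the duplicate and leads to two identical inserts.
        List<String> asList = diskIdsFromStaleOvf();

        // Collecting them into a Set (the approach in the fix) keeps each disk id only once.
        Set<String> asSet = new LinkedHashSet<>(diskIdsFromStaleOvf());

        System.out.println("List size (pre-fix behaviour): " + asList.size()); // 2
        System.out.println("Set size (post-fix behaviour):  " + asSet.size()); // 1
    }
}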

Comment 14 Sam Yangsao 2017-03-20 13:49:50 UTC
(In reply to Maor from comment #13)

Awesome, thanks for your hard work in finding this, Maor.

Comment 15 rhev-integ 2017-04-26 10:50:22 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[FOUND CLONE FLAGS: ['rhevm-4.1.z', 'rhevm-4.2-ga'], ]

For more info please contact: rhv-devops

Comment 17 Raz Tamir 2017-05-23 22:44:51 UTC
Verified with our automation on ovirt-4.2.0-0.0.master.20170519193842.gitf4353fb6.el7.centos.

No failures when importing and attaching a storage domain with unregistered disks.

Comment 20 errata-xmlrpc 2018-05-15 17:41:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1488

Comment 21 Franta Kust 2019-05-16 13:08:09 UTC
BZ<2>Jira Resync

