Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1087577

Summary: compensation when removing a DC fails
Product: [Retired] oVirt
Reporter: Martin Mucha <mmucha>
Component: ovirt-engine-core
Assignee: Yevgeny Zaspitsky <yzaspits>
Status: CLOSED CURRENTRELEASE
QA Contact: Martin Pavlik <mpavlik>
Severity: high
Docs Contact:
Priority: high
Version: 3.5
CC: amureini, bugs, gklein, iheim, laravot, masayag, michal.skrivanek, mmucha, mpavlik, rbalakri, yeylon, yzaspits
Target Milestone: ---
Keywords: CodeChange, Triaged
Target Release: 3.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: network
Fixed In Version: ovirt-engine-3.5.0_beta
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-10-17 12:45:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Network
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Martin Mucha 2014-04-14 16:59:08 UTC
Description of problem:

When
org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand#executeCommand
fails, the method
org.ovirt.engine.core.bll.CommandBase#internalCompensate
is called and itself fails with:

org.springframework.dao.DataIntegrityViolationException: CallableStatementCallback; SQL [{call insertnetwork(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}]; ERROR: insert or update on table "network" violates foreign key constraint "fk_network_storage_pool"
  Detail: Key (storage_pool_id)=(c61ddffb-ca19-40fd-9da7-db6eb5bb23c6) is not present in table "storage_pool".
  Where: SQL statement "INSERT INTO network(addr, description, free_text_comment, id, name, subnet, gateway, type, vlan_id, stp, storage_pool_id, mtu, vm_network, provider_network_provider_id, provider_network_external_id, qos_id, label)
	VALUES(v_addr, v_description, v_free_text_comment, v_id, v_name, v_subnet, v_gateway, v_type, v_vlan_id, v_stp, v_storage_pool_id, v_mtu, v_vm_network, v_provider_network_provider_id, v_provider_network_external_id, v_qos_id, v_label)"
PL/pgSQL function insertnetwork(character varying,character varying,text,uuid,character varying,character varying,character varying,integer,integer,boolean,uuid,integer,boolean,uuid,text,uuid,text) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: insert or update on table "network" violates foreign key constraint "fk_network_storage_pool"
  Detail: Key (storage_pool_id)=(c61ddffb-ca19-40fd-9da7-db6eb5bb23c6) is not present in table "storage_pool".
  Where: SQL statement "INSERT INTO network(addr, description, free_text_comment, id, name, subnet, gateway, type, vlan_id, stp, storage_pool_id, mtu, vm_network, provider_network_provider_id, provider_network_external_id, qos_id, label)
	VALUES(v_addr, v_description, v_free_text_comment, v_id, v_name, v_subnet, v_gateway, v_type, v_vlan_id, v_stp, v_storage_pool_id, v_mtu, v_vm_network, v_provider_network_provider_id, v_provider_network_external_id, v_qos_id, v_label)"
PL/pgSQL function insertnetwork(character varying,character varying,text,uuid,character varying,character varying,character varying,integer,integer,boolean,uuid,integer,boolean,uuid,text,uuid,text) line 3 at SQL statement


How reproducible:
100%


Steps to Reproduce:

Throw an exception in
org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand#executeCommand
just before the call to
setSucceeded(true);
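
The reproduction step can be pictured with a minimal sketch. It assumes a much-simplified command lifecycle in which a failed execute step triggers compensation. The classes SketchCommand and FailingRemoveSketch, the run() and compensate() methods, and the injectFault flag are hypothetical stand-ins for illustration; they are not the real CommandBase or RemoveStoragePoolCommand code.

```java
// Hypothetical, simplified stand-in for a command base class (not the real CommandBase).
abstract class SketchCommand {
    private boolean succeeded;
    private boolean compensated;

    protected abstract void executeCommand();

    protected void setSucceeded(boolean value) { succeeded = value; }
    boolean isSucceeded() { return succeeded; }
    boolean wasCompensated() { return compensated; }

    // Placeholder for re-inserting whatever the command deleted.
    protected void compensate() { compensated = true; }

    // Assumed lifecycle: run the command, and compensate if it did not succeed.
    final void run() {
        try {
            executeCommand();
        } catch (RuntimeException e) {
            succeeded = false;
        }
        if (!succeeded) {
            compensate();
        }
    }
}

// Hypothetical command that mimics the reproduction step from this report.
class FailingRemoveSketch extends SketchCommand {
    private final boolean injectFault;

    FailingRemoveSketch(boolean injectFault) { this.injectFault = injectFault; }

    @Override
    protected void executeCommand() {
        // The removal of the networks and of the pool would happen here.
        if (injectFault) {
            // Injected failure, placed right before the success flag is set.
            throw new RuntimeException("injected failure");
        }
        setSucceeded(true);
    }
}
```

With injectFault set to true, run() leaves the command unsuccessful and calls compensate(), which is the point at which the foreign key error above was observed in the real engine.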

Actual results:


Expected results:


Additional info:

Comment 1 Allon Mureinik 2014-04-15 10:45:29 UTC
The failure seems to come from compensating the DC's networks. Moti, can you please take a look?

Comment 2 Liron Aravot 2014-05-28 06:49:19 UTC
The bug is that the compensation context has no ordering, so it isn't aware that it must re-insert the pool before performing the other inserts in order not to break the schema. To solve it for that command only, we can remove the pool and clear the compensation in the same transaction (which will also solve it for the QoS removal). If we need it for more use cases (we mostly have operations on "atomic" objects), we can open an infra RFE for that.

Also, note that you should fail the command after executing the removal of the storage pool.
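
To illustrate why ordering matters, here is a minimal sketch under stated assumptions: an in-memory stand-in for the storage_pool and network tables with the foreign key between them, and snapshots recorded in deletion order (networks first, then the pool). All names here (CompensationOrderSketch, Snapshot, FakeDb, compensate, deletionOrder) are hypothetical and do not exist in ovirt-engine. Replaying in reverse is shown only as one way to respect the dependency; it is not the approach proposed above, which is to remove the pool and clear the compensation in the same transaction.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; none of these types exist in ovirt-engine.
class CompensationOrderSketch {

    /** Snapshot of a deleted row: its table, its id, and the pool it belongs to (null for a pool). */
    static final class Snapshot {
        final String table;
        final String id;
        final String storagePoolId;

        Snapshot(String table, String id, String storagePoolId) {
            this.table = table;
            this.id = id;
            this.storagePoolId = storagePoolId;
        }
    }

    /** Tiny in-memory stand-in for the two tables and the foreign key between them. */
    static final class FakeDb {
        final Set<String> storagePools = new HashSet<>();
        final Set<String> networks = new HashSet<>();

        void insert(Snapshot s) {
            if (s.table.equals("storage_pool")) {
                storagePools.add(s.id);
                return;
            }
            // Mimics fk_network_storage_pool: the parent row must already exist.
            if (!storagePools.contains(s.storagePoolId)) {
                throw new IllegalStateException(
                        "network " + s.id + " references missing storage_pool " + s.storagePoolId);
            }
            networks.add(s.id);
        }
    }

    /** Snapshots in the order the rows were deleted: children first, then the parent. */
    static List<Snapshot> deletionOrder() {
        List<Snapshot> snapshots = new ArrayList<>();
        snapshots.add(new Snapshot("network", "net-1", "pool-1"));
        snapshots.add(new Snapshot("network", "net-2", "pool-1"));
        snapshots.add(new Snapshot("storage_pool", "pool-1", null));
        return snapshots;
    }

    /** Re-inserts the snapshots; returns true only if every insert succeeded. */
    static boolean compensate(FakeDb db, List<Snapshot> recorded, boolean reverse) {
        Deque<Snapshot> replay = new ArrayDeque<>();
        for (Snapshot s : recorded) {
            if (reverse) {
                replay.addFirst(s); // last deleted is restored first
            } else {
                replay.addLast(s);  // same order as the deletion
            }
        }
        try {
            for (Snapshot s : replay) {
                db.insert(s);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("same order: " + compensate(new FakeDb(), deletionOrder(), false));
        System.out.println("reverse order: " + compensate(new FakeDb(), deletionOrder(), true));
    }
}
```

In this model, replaying in deletion order attempts to insert a network before its pool and fails, which mirrors the fk_network_storage_pool violation in the description, while the reverse replay restores the pool first and succeeds.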

Comment 3 Sandro Bonazzola 2014-06-11 07:05:04 UTC
This is an automated message:
oVirt 3.4.2 has been released.
This bug has been re-targeted from 3.4.2 to 3.4.3 since priority or severity were high or urgent.

Comment 5 Allon Mureinik 2014-07-07 21:28:57 UTC
Meni, Aharon - please review the bug's description and steps to reproduce - a DC is removed, and fails on some compensation related to networks.
This has nothing to do with storage except for the unfortunate mistake of confusing the logical DC in the engine's code with VDSM's Storage Pool.

Comment 6 Meni Yakove 2014-07-28 04:47:43 UTC
How can we test this?

Comment 7 Martin Mucha 2014-08-08 07:30:57 UTC
Sorry, I don't have exact information about how this bug was solved; I only reported it. As far as I know, the problem was caused by the wrong order of executing compensation contexts after a command fails. A test scenario could be: "try to make DC removal fail and check whether compensation was executed properly", *but* that isn't very accurate, since it could fail for many reasons in many places. I did it by throwing an exception as the last statement before the command is marked as successfully finished, and I don't know whether it can be accurately tested without modifying the code.

Comment 8 Martin Pavlik 2014-09-11 12:59:17 UTC
Based on comment 7, marking this as a code change and verified.

Comment 9 Sandro Bonazzola 2014-10-17 12:45:14 UTC
oVirt 3.5 has been released and should include the fix for this issue.

Comment 10 Yevgeny Zaspitsky 2015-11-11 18:02:16 UTC
@see comment #7