Bug 1528316 - engine-setup fails upgrading a system with template subversions
Summary: engine-setup fails upgrading a system with template subversions
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Network
Version: 4.2.0
Hardware: Unspecified
OS: Linux
high
high
Target Milestone: ovirt-4.2.2
: ---
Assignee: Eli Mesika
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks: 1547038 1687707
TreeView+ depends on / blocked
 
Reported: 2017-12-21 14:16 UTC by Giorgio Biacchi
Modified: 2019-10-29 07:08 UTC (History)
16 users (show)

Fixed In Version: ovirt-engine-4.2.2.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1547038 (view as bug list)
Environment:
Last Closed: 2018-03-29 10:54:15 UTC
oVirt Team: Infra
rule-engine: ovirt-4.2+
rule-engine: blocker+


Attachments (Terms of Use)
engine-setup logs (633.88 KB, text/plain)
2017-12-21 14:16 UTC, Giorgio Biacchi
no flags Details
Engine db dump (10.91 MB, text/x-vhdl)
2017-12-27 14:30 UTC, Giorgio Biacchi
no flags Details


Links
System ID Priority Status Summary Last Updated
oVirt gerrit 87636 master ABANDONED engine: unify guids in vminterface - add new entity_guid 2020-05-28 12:43:30 UTC
oVirt gerrit 87637 master ABANDONED engine: unify guids in vminterface - remove old columns 2020-05-28 12:43:30 UTC
oVirt gerrit 87638 master ABANDONED engine: unify guids in vminterface - all the rest 2020-05-28 12:43:30 UTC
oVirt gerrit 88379 ovirt-engine-4.2 MERGED db: workaround for fk conflict 2020-05-28 12:43:30 UTC

Internal Links: 1685890

Description Giorgio Biacchi 2017-12-21 14:16:32 UTC
Created attachment 1370913 [details]
engine-setup logs

Description of problem:

engine-setup fails with the following message:

[ ERROR ] Failed to execute stage 'Setup validation': Failed checking
Engine database: an exception occurred while validating the Engine
database, please check the logs for getting more info:
Constraint violation found in  vm_interface (vmt_guid) |1

It seems that the problem is caused by templates subversions because manually running fkvalidator.sh brings me to subversion templates guids.

Version-Release number of selected component (if applicable):

4.1.8.2-1.el7.centos

How reproducible:

run engine-setup to upgrade to 4.2.0

Steps to Reproduce:
1. create a template and a template subversion
2. run engine-setup to upgrade to 4.2.0
3.

Actual results:

engine-setup fails finding database constraints violations

Expected results:

engine-setup terminates successfully

Additional info:

Comment 1 Sandro Bonazzola 2017-12-21 14:43:34 UTC
Tomas, can you give some hint on the constraint failure here?
Any workaround or way to fix templates subversions to make this work?
Do we have a bug in the constraints?

Comment 2 Arik 2017-12-25 15:04:04 UTC
(In reply to Sandro Bonazzola from comment #1)
The table vm_static contains two relevant columns: vm_guid, vmt_guid.
VM whose id is 1 and is based on template 2 will have:
vm_guid=1, vmt_guid=2
Template whose id is 2 will have:
vm_guid=2 and vmt_guid=2
A template version whose id is 3 and is a version of template 2 will have:
vm_guid=3, vmt_guid=2.

Now the table vm_interface has the following constraint:
"fk_vm_interface_vm_static_template" FOREIGN KEY (vmt_guid) REFERENCES vm_static(vm_guid) ON DELETE CASCADE

That constraint is correct since vmt_guid in vm_interface should (if it's not NULL) reference an existing vm_guid in vm_static.
However, fkvalidator_sp.sql generates the following validation for this constraint:

select  (vmt_guid)  from public.vm_interface where  (vmt_guid) IS NOT NULL and  (vmt_guid)  not in (select vmt_guid from public.vm_static);

This is wrong since the last part should actually be:
... not in (select vm_guid from public.vm_static)
(vmt_guid needs to be replaced with vm_guid)

Now imagine that the mentioned template version 3 has an interface, that interface is set (in vm_interfaces) with vmt_guid=3 but there is no vmt_guid=3 in vm_static. That's why that validation fails for template versions.

Moving to infra to fix the fkvalidator_sp.sql.

Comment 3 Eli Mesika 2017-12-26 11:35:01 UTC
Can you please attach the 4.1.8 PG DB dump as well ? , Thanks

Comment 4 Eli Mesika 2017-12-26 12:03:33 UTC
(In reply to Arik from comment #2)

> 
> Now the table vm_interface has the following constraint:
> "fk_vm_interface_vm_static_template" FOREIGN KEY (vmt_guid) REFERENCES
> vm_static(vm_guid) ON DELETE CASCADE
> 
> That constraint is correct since vmt_guid in vm_interface should (if it's
> not NULL) reference an existing vm_guid in vm_static.

The problem is in the vm_interface definition :

     Column      |           Type           |                     Modifiers                      
-----------------+--------------------------+----------------------------------------------------
 id              | uuid                     | not null
 vm_guid         | uuid                     | 
 vmt_guid        | uuid                     | 
 mac_addr        | character varying(20)    | 
 name            | character varying(50)    | not null
 speed           | integer                  | 
 type            | integer                  | default 0
 _create_date    | timestamp with time zone | default ('now'::text)::timestamp without time zone
 _update_date    | timestamp with time zone | 
 linked          | boolean                  | not null default true
 vnic_profile_id | uuid                     | 


Each row here had either a valid vm_guid and NULL vmt_guid or vice verse.

I think that the right way that follows DB schema rules is to separate this table to two tables 

vm_interface that has only vm_guid 
vm_template_interface that has only vmt_guid 

Than the constraint mentioned is replaced with two simple constraints

ON vm_interface :

fk_vm_interface_vm_static_template" FOREIGN KEY (vm_guid) REFERENCES
vm_static(vm_guid) ON DELETE CASCADE

ON vm_template_interface:

fk_vm_template_interface_vm_static_template" FOREIGN KEY (vmt_guid) REFERENCES
vm_static(vm_guid) ON DELETE CASCADE

Then the relevant SPs and DAL should be changed to handle all of that.

This is the right way to go IMO , fkvalidator.sh is generic and can  not be modified to special cases. The right approach IMO is to fix the DB schema where we abused it for "simplicity" which costs us in maintenance.

Comment 5 Eli Mesika 2017-12-26 12:05:21 UTC
(In reply to Eli Mesika from comment #4)

> ON vm_template_interface:
> 
> fk_vm_template_interface_vm_static_template" FOREIGN KEY (vmt_guid)
> REFERENCES
> vm_static(vm_guid) ON DELETE CASCADE

Sorry, this should be :

 fk_vm_template_interface_vm_static_template" FOREIGN KEY (vmt_guid)
 REFERENCES
 vm_static(vmt_guid) ON DELETE CASCADE

Comment 6 Oved Ourfali 2017-12-26 13:07:02 UTC
So this should probably move to network I guess.

Comment 7 Arik 2017-12-26 13:11:03 UTC
(In reply to Oved Ourfali from comment #6)
> So this should probably move to network I guess.

Yep, but the solution would be a bit different than the one described in comment 4 -
we can unify both vm_guid and vmt_guid in the vm_interface table to a single column and distinguish between templates and VMs according to the vm_type column in vm_static.

Comment 8 Giorgio Biacchi 2017-12-27 14:30:55 UTC
Created attachment 1372801 [details]
Engine db dump

Comment 9 Jiri Belka 2018-01-05 10:06:10 UTC
We got same issue on our brq-setup (a long existing env originating in 3.1).

Is there any progress with this BZ? To upgrade our env is good opportunity to discover potentionally other issues which exist only on long-existing env.

Comment 10 Dan Kenigsberg 2018-01-05 11:28:23 UTC
Eli, can you supply a workaround to disable this upgrade-time validation, since a writing a proper fix proves to take a little while?


Jiri, which is the version of Engine that you wish to upgrade? Eli tells me that this bug has existed since 3.6.

Comment 11 Jiri Belka 2018-01-05 12:11:08 UTC
(In reply to Dan Kenigsberg from comment #10)

> Jiri, which is the version of Engine that you wish to upgrade?

rhevm-4.1.7.6-0.1.el7.noarch (hm, I'll need to update to latest 4.1.8 then).

> Eli tells me that this bug has existed since 3.6.

I hit this issue now, and as I said, this env originates in 3.1 with various hacks/workarounds to pass all upgrade issues we have met (and thanks to this - they were reported/solved).

Comment 12 eraviv 2018-01-07 06:27:28 UTC
Hi Jiri,
A customer bumped into this issue because he had template versions. Do you also have template versions, and if so, do you maybe know since when (i.e. on which rhevm version were they created)?

thanks

Comment 13 Eli Mesika 2018-01-07 09:19:44 UTC
(In reply to Dan Kenigsberg from comment #10)
> Eli, can you supply a workaround to disable this upgrade-time validation,
> since a writing a proper fix proves to take a little while?

There is not workaround for that since the fkvalidator.sh script uses the PG information schema tables to do its magic.

We have to fix the DB and remove the incorrect FKs using a engine-setup plugin 

I will write and submit a patch doing exactly that ASAP

Comment 14 Eli Mesika 2018-01-07 15:06:08 UTC
Meanwhile you can do the following manually :

Before upgrade :

ALTER TABLE vm_interface DROP CONSTRAINT IF EXISTS fk_vm_interface_vm_static;
ALTER TABLE vm_interface DROP CONSTRAINT IF EXISTS fk_vm_interface_vm_static_template;

After upgrade :

ALTER TABLE ONLY vm_interface
    ADD CONSTRAINT fk_vm_interface_vm_static FOREIGN KEY (vm_guid) REFERENCES vm_static(vm_guid) ON DELETE CASCADE;


ALTER TABLE ONLY vm_interface
    ADD CONSTRAINT fk_vm_interface_vm_static_template FOREIGN KEY (vmt_guid) REFERENCES vm_static(vm_guid) ON DELETE CASCADE;

Comment 15 Jiri Belka 2018-01-19 14:17:11 UTC
(In reply to Eli Mesika from comment #14)
> Meanwhile you can do the following manually :
> 
> Before upgrade :
> 
> ALTER TABLE vm_interface DROP CONSTRAINT IF EXISTS fk_vm_interface_vm_static;
> ALTER TABLE vm_interface DROP CONSTRAINT IF EXISTS
> fk_vm_interface_vm_static_template;
> 
> After upgrade :
> 
> ALTER TABLE ONLY vm_interface
>     ADD CONSTRAINT fk_vm_interface_vm_static FOREIGN KEY (vm_guid)
> REFERENCES vm_static(vm_guid) ON DELETE CASCADE;
> 
> 
> ALTER TABLE ONLY vm_interface
>     ADD CONSTRAINT fk_vm_interface_vm_static_template FOREIGN KEY (vmt_guid)
> REFERENCES vm_static(vm_guid) ON DELETE CASCADE;

I tried this and our brq-setup is on 4.1 now.

Comment 16 Jiri Belka 2018-01-22 11:45:41 UTC
(In reply to Jiri Belka from comment #15)
> (In reply to Eli Mesika from comment #14)
> > Meanwhile you can do the following manually :
> > 
> > Before upgrade :
> > 
> > ALTER TABLE vm_interface DROP CONSTRAINT IF EXISTS fk_vm_interface_vm_static;
> > ALTER TABLE vm_interface DROP CONSTRAINT IF EXISTS
> > fk_vm_interface_vm_static_template;
> > 
> > After upgrade :
> > 
> > ALTER TABLE ONLY vm_interface
> >     ADD CONSTRAINT fk_vm_interface_vm_static FOREIGN KEY (vm_guid)
> > REFERENCES vm_static(vm_guid) ON DELETE CASCADE;
> > 
> > 
> > ALTER TABLE ONLY vm_interface
> >     ADD CONSTRAINT fk_vm_interface_vm_static_template FOREIGN KEY (vmt_guid)
> > REFERENCES vm_static(vm_guid) ON DELETE CASCADE;
> 
> I tried this and our brq-setup is on 4.1 now.

                                       ^^^^ 4.2 of course.

Comment 17 Dan Kenigsberg 2018-02-14 09:00:20 UTC
Jiri, can you please try to upgrade from 4.0 with subversions to 4.1? We would like to make sure if this bug reproduces with postgress 9.2.

Comment 18 Jiri Belka 2018-02-19 15:08:32 UTC
(In reply to Dan Kenigsberg from comment #17)
> Jiri, can you please try to upgrade from 4.0 with subversions to 4.1? We
> would like to make sure if this bug reproduces with postgress 9.2.

I just did clean install of 4.0, created 1 template and 1 subtemplate and upgraded to 4.1 without any issue.

Comment 19 Jiri Belka 2018-02-19 15:19:29 UTC
And subsequent upgrade to 4.2 failed as expected:

...
[ INFO  ] Checking the Engine database consistency
[ ERROR ] Failed to execute stage 'Setup validation': Failed checking Engine database: an exception occurred while validating the Engine database, please check the logs for getting more info:
         Constraint violation found in  vm_interface (vmt_guid) |1
         
...

Comment 20 Dan Kenigsberg 2018-02-20 11:57:44 UTC
I've created bug 1547038 to track the network-specific problem. I leave this bug to understand why our db validation is suddenly more pedantic, and move it to Infra.

Comment 21 Jiri Belka 2018-02-22 11:49:03 UTC
ovirt-engine-4.2.1.3-0.1.el7.noarch to ovirt-engine-4.2.2.1-0.1.el7.noarch (4.2.2-2).


- before upgrade

engine=# ALTER TABLE vm_interface DROP CONSTRAINT IF EXISTS fk_vm_interface_vm_static;
NOTICE:  constraint "fk_vm_interface_vm_static" of relation "vm_interface" does not exist, skipping
ALTER TABLE
engine=# ALTER TABLE vm_interface DROP CONSTRAINT IF EXISTS fk_vm_interface_vm_static_template;
ALTER TABLE
engine=# \q

- after upgrade

engine=# ALTER TABLE ONLY vm_interface  ADD CONSTRAINT fk_vm_interface_vm_static FOREIGN KEY (vm_guid) REFERENCES vm_static(vm_guid) ON DELETE CASCADE;
ERROR:  insert or update on table "vm_interface" violates foreign key constraint "fk_vm_interface_vm_static"
DETAIL:  Key (vm_guid)=(87693c3d-2515-44c2-adf1-1488e07bdfcd) is not present in table "vm_static".
engine-#     ADD CONSTRAINT fk_vm_interface_vm_static_template FOREIGN KEY (vmt_guid) REFERENCES vm_static(vm_guid) ON DELETE CASCADE;
ALTER TABLE
engine=#

Comment 22 Michael Burman 2018-03-06 07:48:07 UTC
Dan, 

Is upgrade from 4.2.2.1-0.1.el7 >> 4.2.2.2-1 with template subversion is good for verification of this bug? or i MUST start with 4.1.10.1??

Comment 23 Martin Perina 2018-03-06 08:08:13 UTC
Not included in ovirt-engine-4.2.2.2 moving back to MODIFIED

Comment 24 Michael Burman 2018-03-06 08:23:38 UTC
Ok good to know, i just wasted my time on testing it.
Failed to upgrade from 4.2.2.1-0.1.el7 >> 4.2.2.2-1 with template subversion on this bug, which means that testing from 4.2 > 4.2 is good enough to verify this bug.
No need to start from 4.1.10.1(just answered to my self)

Comment 25 Michael Burman 2018-03-18 08:06:40 UTC
Verified on - 4.2.2.4-0.1.el7

Comment 26 Sandro Bonazzola 2018-03-29 10:54:15 UTC
This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.