Bug 1940448

Summary: Upgrade to 4.4.5 fails schema upgrade if user_profiles table contains duplicate entries
Product: [oVirt] ovirt-engine Reporter: Rik Theys <rik.theys>
Component: Database.CoreAssignee: Eli Mesika <emesika>
Status: CLOSED CURRENTRELEASE QA Contact: Guilherme Santos <gdeolive>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 4.4.5.10CC: aoconnor, bugs, gdeolive, mperina, rszwajko
Target Milestone: ovirt-4.4.5-1Flags: pm-rhel: ovirt-4.4+
aoconnor: blocker+
Target Release: 4.4.5.11   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-engine-4.4.5.11 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-04-15 07:12:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Rik Theys 2021-03-18 12:50:14 UTC
Description of problem:
During the upgrade of oVirt from 4.4.4 to 4.4.5.10 packages and running engine-setup, the upgrade fails the upgrade of the database schema.

The log shows:

psql:/usr/share/ovirt-engine/dbscripts/common_sp.sql:1209: NOTICE:  drop cascades to function fn_db_get_async_tasks()
NOTICE:  drop cascades to trigger delete_disk_image_dynamic_for_image on table images
psql:/usr/share/ovirt-engine/dbscripts/upgrade/04_04_0970_make_user_profiles_generic.sql:45: ERROR:  could not create unique index "pk_property_id"
DETAIL:  Key (property_id)=(00000000-0000-0000-0000-000000000000) is duplicated.
FATAL: Cannot execute sql command: --file=/usr/share/ovirt-engine/dbscripts/upgrade/04_04_0970_make_user_profiles_generic.sql
 
Looking at the user_profiles table, it seems there are multiple entries with all-zero values for the ssh_public_key_id field.

engine=# select * from user_profiles;
              profile_id              |               user_id                |                                       ssh_public_key                                        |          ssh_public_key_id           
--------------------------------------+--------------------------------------+---------------------------------------------------------------------------------------------+--------------------------------------
 a1b2a3e6-a99f-4da6-ba0b-7b8a2b9a97b0 | ad756760-33d7-42ef-80a9-ad1754f9798a |                                                                                             | 00000000-0000-0000-0000-000000000000
 238cb3dd-0a15-48f3-b9c7-1474daee4626 | 7d05da90-33d2-48b1-b07a-1f350bdd59a5 |                                                                                             | 00000000-0000-0000-0000-000000000000
 80a24e5c-f3ab-4cda-bebc-8b6a092bd11b | 8b10a8b4-0686-42e0-ad6c-8fea2fab583a | ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIErV4zwa6Unm/IGzWmpxSdrmk8SmDIJZrFylLyBMWK8Y poro@errai | 3086a093-de88-47fc-badf-667aa8f72d31
(3 rows)


Version-Release number of selected component (if applicable):
ovirt-engine-dbscripts-4.4.5.10-1.el8.noarch


How reproducible:
100% if you have multiple entries with all-zeros id.

Steps to Reproduce:
1. Have multiple entries in user_profiles with an all-zero's id for the ssh public key. No idea how the system ended up in this situation.
2. Try to upgrade from 4.4.4 to 4.4.5.10
3.

Actual results:
Upgrade fails. The setup rolls back the database, but ovirt-engine fails to start. A downgrade of the packages and rerunning engine-setup with --offline is needed to get it to run again.

Expected results:
Upgrade succeeds

Additional info:

As discussed on IRC in #ovirt with dupondje and rszwajko, the solution seems to be to remove the entries without an SSH key. I have not yet tested that this indeed resolves the issue as the system is currently in use.

This bug is opened so this can be automatically performed on upgrade in the future.

Comment 1 RHEL Program Management 2021-03-18 12:53:22 UTC
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.

Comment 2 rszwajko 2021-03-18 12:59:29 UTC
ssh_public_key_id is effectively an UUID as it is used by REST as resource identifier. The new code guarantees that using a constraint.
In this case it's clearly an invalid state.
Impacted entries have no SSH key so they would be removed anyway during the migration. However since this case did occur we probably cannot rely that ssh_public_key_id are unique even for valid entries.

Comment 3 Eli Mesika 2021-03-18 13:24:20 UTC
Can we have a db dump before the upgrade to investigate?

Comment 4 Rik Theys 2021-03-18 13:44:42 UTC
(In reply to Eli Mesika from comment #3)
> Can we have a db dump before the upgrade to investigate?

Is there an easy way to anonymize the dump? It seem to contain credentials and all the configuration or the VM's etc.
Is there any specific information you are looking for from the dump?

Regards,
Rik

Comment 5 Martin Perina 2021-03-18 18:16:35 UTC
(In reply to Rik Theys from comment #4)
> (In reply to Eli Mesika from comment #3)
> > Can we have a db dump before the upgrade to investigate?
> 
> Is there an easy way to anonymize the dump? It seem to contain credentials
> and all the configuration or the VM's etc.
> Is there any specific information you are looking for from the dump?
> 
> Regards,
> Rik

There is no such way to anonymize database dump, you would just need to trust Red Hat. But for now it looks good, we have been able to provide a fix based on the user_profiles table content you have provided, so the dump is not needed for now.

Thanks!

Comment 6 Guilherme Santos 2021-04-12 17:37:52 UTC
Verified on:
ovirt-engine-4.4.5.11-0.1.el8ev.noarch
    
Steps:    
1. have a 4.4.4 engine use ovirt-aaa-jdbc-tool to add some users and reset its passwords
# rpm -q ovirt-engine
ovirt-engine-4.4.4.7-0.2.el8ev.noarch
# ovirt-aaa-jdbc-tool user add eqqqwe \
    --attribute=firstName=sssaohn \
    --attribute=lastName=Dsssoe \
    --attribute=email=sas
...
# ovirt-aaa-jdbc-tool user password-reset jdoe --password-valid-to="2025-08-15 10:30:00Z"
...

2. On webadmin, give login permission for the users

3. Have 000000-0000... ssh_public_key_id for some of the users (either by manually editing in the db or unselecting "Persist grid settings" without adding any public_ssh_key in user options in webadmin)
engine=# select user_id, ssh_public_key_id from user_profiles;
               user_id                |          ssh_public_key_id           
--------------------------------------+--------------------------------------
 f5f961a0-9b8a-11eb-ad3a-001a4ad04f00 | 863aee36-7f63-4532-88f6-60382ba85eec
 64817775-7d89-46c4-a30b-e346e55f2cfb | 00000000-0000-0000-0000-000000000000
 81ee0000-1949-43f4-9b47-fd66c8ab811a | 00000000-0000-0000-0000-000000000000
 fd4c0026-b48e-4fbb-af02-6a5309232a06 | b6ad06f1-e5ac-49fd-93dc-7fccd719d8ac
 49d69817-ed43-4934-bfc4-fa2c14d3b303 | 5eb555a7-6e18-44b8-80c4-e995f394d702
(5 rows)

4. update engine to 4.4.5

Results:
update finished successfully