Bug 2125228

Summary: [13->16.2] Undercloud upgrade fails on nova_db_sync_stein
Product: Red Hat OpenStack Reporter: Sergii Golovatiuk <sgolovat>
Component: openstack-tripleo-heat-templates Assignee: Bogdan Dobrelya <bdobreli>
Status: CLOSED ERRATA QA Contact: Khomesh Thakre <kthakre>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 16.2 (Train) CC: arcsingh, bdobreli, dasmith, eglynn, jhakimra, jpretori, jschluet, kchamart, kthakre, mburns, sbauza, sgordon, vromanso
Target Milestone: z4 Keywords: Regression, Triaged
Target Release: 16.2 (Train on RHEL 8.4)   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20221010235131.e0d438c.el8ost Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 2130140 Environment:
Last Closed: 2022-12-07 19:24:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2130140    

Description Sergii Golovatiuk 2022-09-08 11:46:18 UTC
Description of problem:

The undercloud upgrade from OSP 13 to 16.2 fails in the nova_db_sync_stein container.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy OSP13 and prepare the environment for the upgrade.
2. Run "openstack undercloud upgrade --yes".


Actual results:
The upgrade fails with the following container log:

[root@undercloud-0 stack]# podman logs 4839a7ebbd46
sudo: unable to send audit message: Operation not permitted
sudo: unable to send audit message: Operation not permitted
ERROR: Could not access cell0.
Has the nova_api database been created?
Has the nova_cell0 database been created?
Has "nova-manage api_db sync" been run?
Has "nova-manage cell_v2 map_cell0" been run?
Is [api_database]/connection set in nova.conf?
Is the cell0 database connection URL correct?
Error: (pymysql.err.InternalError) (1054, "Unknown column 'cell_mappings.disabled' in 'field list'")
[SQL: SELECT cell_mappings.created_at AS cell_mappings_created_at, cell_mappings.updated_at AS cell_mappings_updated_at, cell_mappings.id AS cell_mappings_id, cell_mappings.uuid AS cell_mappings_uuid, cell_mappings.name AS cell_mappings_name, cell_mappings.transport_url AS cell_mappings_transport_url, cell_mappings.database_connection AS cell_mappings_database_connection, cell_mappings.disabled AS cell_mappings_disabled
FROM cell_mappings
WHERE cell_mappings.uuid = %(uuid_1)s
 LIMIT %(param_1)s]
[parameters: {'uuid_1': '00000000-0000-0000-0000-000000000000', 'param_1': 1}]
(Background on this error at: http://sqlalche.me/e/2j85)
sudo: unable to send audit message: Operation not permitted
Running batches of 50 until complete
Error attempting to run <function create_incomplete_consumers at 0x7fcf8439cd08>
Error attempting to run <function populate_queued_for_delete at 0x7fcf82261620>
6 rows matched query migrate_empty_ratio, 6 migrated
Error attempting to run <function fill_virtual_interface_list at 0x7fcf8225aae8>
Error attempting to run <function populate_user_id at 0x7fcf82261730>
Error attempting to run <function create_incomplete_consumers at 0x7fcf8439cd08>
Error attempting to run <function populate_queued_for_delete at 0x7fcf82261620>
Error attempting to run <function fill_virtual_interface_list at 0x7fcf8225aae8>
Error attempting to run <function populate_user_id at 0x7fcf82261730>
+---------------------------------------------+--------------+-----------+
|                  Migration                  | Total Needed | Completed |
+---------------------------------------------+--------------+-----------+
|         create_incomplete_consumers         |      0       |     0     |
| delete_build_requests_with_no_instance_uuid |      0       |     0     |
|         fill_virtual_interface_list         |      0       |     0     |
|             migrate_empty_ratio             |      6       |     6     |
|          migrate_keypairs_to_api_db         |      0       |     0     |
|       migrate_quota_classes_to_api_db       |      0       |     0     |
|        migrate_quota_limits_to_api_db       |      0       |     0     |
|          migration_migrate_to_uuid          |      0       |     0     |
|     populate_missing_availability_zones     |      0       |     0     |
|          populate_queued_for_delete         |      0       |     0     |
|               populate_user_id              |      0       |     0     |
|                populate_uuids               |      0       |     0     |
|     service_uuids_online_data_migration     |      0       |     0     |
+---------------------------------------------+--------------+-----------+
Some migrations failed unexpectedly. Check log for details.

[root@undercloud-0 stack]# podman ps -a | grep 4839a7ebbd46
4839a7ebbd46  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp15-openstack-nova-conductor:20200115.1            /usr/bin/bootstra...  20 hours ago  Exited (2) 20 hours ago          nova_db_sync_stein
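
For triage, the online data migrations that failed can be pulled out of a log like the one above with a short script. This is only a sketch: the sample lines are copied from the log in this report, while the function name failed_migrations and the regular expression are illustrative and not part of nova or TripleO.

```python
import re
from collections import Counter

# Sample lines copied from the nova_db_sync_stein container log in this report.
LOG = """\
Running batches of 50 until complete
Error attempting to run <function create_incomplete_consumers at 0x7fcf8439cd08>
Error attempting to run <function populate_queued_for_delete at 0x7fcf82261620>
6 rows matched query migrate_empty_ratio, 6 migrated
Error attempting to run <function fill_virtual_interface_list at 0x7fcf8225aae8>
Error attempting to run <function populate_user_id at 0x7fcf82261730>
"""

# Hypothetical pattern: matches the "Error attempting to run" lines seen above.
FAILED = re.compile(r"Error attempting to run <function (\w+) at 0x[0-9a-f]+>")


def failed_migrations(log_text):
    """Count how often each online data migration reported an error."""
    return Counter(FAILED.findall(log_text))


if __name__ == "__main__":
    for name, count in sorted(failed_migrations(LOG).items()):
        print(f"{name}: failed {count} time(s)")
```

In practice the text would come from "podman logs <container id>" rather than an embedded string.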


Expected results:

The undercloud upgrade completes successfully.

Additional info:

I performed a root cause analysis (RCA); here is what I found.

Comparing the paunch configs for 16.1 and 16.2 shows the following difference:

[root@undercloud-0 ~]# diff -ubr 1 /var/lib/tripleo-config/container-startup-config/step_3/hashed-nova_api_db_sync_stein.json
--- 1   2022-09-08 11:30:13.112478005 +0000
+++ /var/lib/tripleo-config/container-startup-config/step_3/hashed-nova_api_db_sync_stein.json  2022-09-08 11:18:15.567444086 +0000
@@ -7,8 +7,8 @@
   ],
   "detach": false,
   "environment": {
-    "TRIPLEO_DEPLOY_IDENTIFIER": "1662562103",
-    "TRIPLEO_CONFIG_HASH": "aea856f83d5616154dd5dcf7d2e06081"
+    "TRIPLEO_DEPLOY_IDENTIFIER": "1662634871",
+    "TRIPLEO_CONFIG_HASH": "c4506d96554c2fa3e551c1e9f8cfa5ad-c4506d96554c2fa3e551c1e9f8cfa5ad"
   },
   "image": "undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp15-openstack-nova-api:20200115.1",
   "net": "host",
@@ -26,8 +26,8 @@
     "/etc/puppet:/etc/puppet:ro",
     "/var/log/containers/nova:/var/log/nova:z",
     "/var/log/containers/httpd/nova-api:/var/log/httpd:z",
-    "/var/lib/kolla/config_files/nova_api_db_sync.json:/var/lib/kolla/config_files/config.json:ro",
-    "/var/lib/config-data/puppet-generated/nova:/var/lib/kolla/config_files/src:ro",
+    "/var/lib/config-data/nova/etc/my.cnf.d/tripleo.cnf:/etc/my.cnf.d/tripleo.cnf:ro",
+    "/var/lib/config-data/nova/etc/nova/:/etc/nova/:ro",
     "/var/lib/container-config-scripts/:/container-config-scripts/:ro"
   ]
 }

Also, if I run

paunch debug --file /var/lib/tripleo-config/container-startup-config/step_3/hashed-nova_api_db_sync_stein.json --container nova_api_db_sync_stein --interactive --shell --action run

I don't see an /etc/nova/ that contains the MySQL connection string. All I see is the default nova.conf that ships with the RPM.
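
The bind-mount change shown in the diff can also be checked programmatically. The sketch below compares the volume lists of two container startup configs. It assumes the list is stored under a "volumes" key (the diff does not show the key name), and the function name volume_changes is illustrative. The sample entries are copied from the diff in this report.

```python
def volume_changes(old_cfg, new_cfg):
    """Return (removed, added) bind mounts between two startup configs.

    Assumes each config is a dict with a "volumes" list of
    "host_path:container_path:mode" strings.
    """
    old = set(old_cfg.get("volumes", []))
    new = set(new_cfg.get("volumes", []))
    return sorted(old - new), sorted(new - old)


# Volume entries copied from the diff of hashed-nova_api_db_sync_stein.json.
COMMON = [
    "/etc/puppet:/etc/puppet:ro",
    "/var/log/containers/nova:/var/log/nova:z",
    "/var/log/containers/httpd/nova-api:/var/log/httpd:z",
    "/var/lib/container-config-scripts/:/container-config-scripts/:ro",
]
OLD = {"volumes": COMMON + [
    "/var/lib/kolla/config_files/nova_api_db_sync.json:/var/lib/kolla/config_files/config.json:ro",
    "/var/lib/config-data/puppet-generated/nova:/var/lib/kolla/config_files/src:ro",
]}
NEW = {"volumes": COMMON + [
    "/var/lib/config-data/nova/etc/my.cnf.d/tripleo.cnf:/etc/my.cnf.d/tripleo.cnf:ro",
    "/var/lib/config-data/nova/etc/nova/:/etc/nova/:ro",
]}

if __name__ == "__main__":
    removed, added = volume_changes(OLD, NEW)
    for entry in removed:
        print("-", entry)
    for entry in added:
        print("+", entry)
```

On a real system the two dicts would be loaded with json.load from the old and new config files instead of being embedded.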

Comment 17 errata-xmlrpc 2022-12-07 19:24:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8794