Bug 2125228 - [13->16.2] Undercloud upgrade fails on nova_db_sync_stein
Summary: [13->16.2] Undercloud upgrade fails on nova_db_sync_stein
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.2 (Train)
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: z4
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Bogdan Dobrelya
QA Contact: Khomesh Thakre
URL:
Whiteboard:
Depends On:
Blocks: 2130140
 
Reported: 2022-09-08 11:46 UTC by Sergii Golovatiuk
Modified: 2022-12-07 19:25 UTC

Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20221010235131.e0d438c.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2130140
Environment:
Last Closed: 2022-12-07 19:24:40 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 856531 0 None MERGED [Train-Only] Restore nova_api_db_sync_stein vols 2022-09-13 12:59:54 UTC
Red Hat Issue Tracker OSP-18623 0 None None None 2022-09-08 12:16:30 UTC
Red Hat Product Errata RHBA-2022:8794 0 None None None 2022-12-07 19:25:31 UTC

Description Sergii Golovatiuk 2022-09-08 11:46:18 UTC
Description of problem:

The undercloud upgrade from OSP 13 to 16.2 fails: the nova_db_sync_stein container exits with an error.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy OSP13.
2. Prepare the environment for the upgrade.
3. Run "openstack undercloud upgrade --yes".


Actual results:
The upgrade fails. The logs of the failed container show:

[root@undercloud-0 stack]# podman logs 4839a7ebbd46
sudo: unable to send audit message: Operation not permitted
sudo: unable to send audit message: Operation not permitted
ERROR: Could not access cell0.
Has the nova_api database been created?
Has the nova_cell0 database been created?
Has "nova-manage api_db sync" been run?
Has "nova-manage cell_v2 map_cell0" been run?
Is [api_database]/connection set in nova.conf?
Is the cell0 database connection URL correct?
Error: (pymysql.err.InternalError) (1054, "Unknown column 'cell_mappings.disabled' in 'field list'")
[SQL: SELECT cell_mappings.created_at AS cell_mappings_created_at, cell_mappings.updated_at AS cell_mappings_updated_at, cell_mappings.id AS cell_mappings_id, cell_mappings.uuid AS cell_mappings_uuid, cell_mappings.name AS cell_mappings_name, cell_mappings.transport_url AS cell_mappings_transport_url, cell_mappings.database_connection AS cell_mappings_database_connection, cell_mappings.disabled AS cell_mappings_disabled
FROM cell_mappings
WHERE cell_mappings.uuid = %(uuid_1)s
 LIMIT %(param_1)s]
[parameters: {'uuid_1': '00000000-0000-0000-0000-000000000000', 'param_1': 1}]
(Background on this error at: http://sqlalche.me/e/2j85)
sudo: unable to send audit message: Operation not permitted
Running batches of 50 until complete
Error attempting to run <function create_incomplete_consumers at 0x7fcf8439cd08>
Error attempting to run <function populate_queued_for_delete at 0x7fcf82261620>
6 rows matched query migrate_empty_ratio, 6 migrated
Error attempting to run <function fill_virtual_interface_list at 0x7fcf8225aae8>
Error attempting to run <function populate_user_id at 0x7fcf82261730>
Error attempting to run <function create_incomplete_consumers at 0x7fcf8439cd08>
Error attempting to run <function populate_queued_for_delete at 0x7fcf82261620>
Error attempting to run <function fill_virtual_interface_list at 0x7fcf8225aae8>
Error attempting to run <function populate_user_id at 0x7fcf82261730>
+---------------------------------------------+--------------+-----------+
|                  Migration                  | Total Needed | Completed |
+---------------------------------------------+--------------+-----------+
|         create_incomplete_consumers         |      0       |     0     |
| delete_build_requests_with_no_instance_uuid |      0       |     0     |
|         fill_virtual_interface_list         |      0       |     0     |
|             migrate_empty_ratio             |      6       |     6     |
|          migrate_keypairs_to_api_db         |      0       |     0     |
|       migrate_quota_classes_to_api_db       |      0       |     0     |
|        migrate_quota_limits_to_api_db       |      0       |     0     |
|          migration_migrate_to_uuid          |      0       |     0     |
|     populate_missing_availability_zones     |      0       |     0     |
|          populate_queued_for_delete         |      0       |     0     |
|               populate_user_id              |      0       |     0     |
|                populate_uuids               |      0       |     0     |
|     service_uuids_online_data_migration     |      0       |     0     |
+---------------------------------------------+--------------+-----------+
Some migrations failed unexpectedly. Check log for details.

[root@undercloud-0 stack]# podman ps -a | grep 4839a7ebbd46
4839a7ebbd46  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp15-openstack-nova-conductor:20200115.1            /usr/bin/bootstra...  20 hours ago  Exited (2) 20 hours ago          nova_db_sync_stein
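
When triaging a log like the one above, the schema mismatch can be extracted mechanically. The following is a minimal sketch, not part of the original report: the helper name find_unknown_columns is hypothetical, and the pattern only assumes the MySQL error 1054 text exactly as it appears in this log.

```python
import re

# Matches the MySQL 1054 error text as printed in the container log above.
UNKNOWN_COLUMN = re.compile(
    r"""\(1054, "Unknown column '([\w.]+)' in '[^']+'"\)"""
)


def find_unknown_columns(log_text):
    """Return the sorted, de-duplicated table.column names reported as unknown."""
    return sorted({m.group(1) for m in UNKNOWN_COLUMN.finditer(log_text)})


# Sample line copied from the log in this report.
sample = (
    "Error: (pymysql.err.InternalError) "
    "(1054, \"Unknown column 'cell_mappings.disabled' in 'field list'\")"
)
print(find_unknown_columns(sample))  # ['cell_mappings.disabled']
```

A non-empty result suggests the database schema is older than the code querying it, which is consistent with a schema sync that did not reach the real database.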


Expected results:

The undercloud upgrade completes successfully.

Additional info:

I performed a root cause analysis (RCA). Here is what I found.

Comparing the paunch configs for 16.1 and 16.2 shows that the volume mounts of nova_api_db_sync_stein differ:

[root@undercloud-0 ~]# diff -ubr 1 /var/lib/tripleo-config/container-startup-config/step_3/hashed-nova_api_db_sync_stein.json
--- 1   2022-09-08 11:30:13.112478005 +0000
+++ /var/lib/tripleo-config/container-startup-config/step_3/hashed-nova_api_db_sync_stein.json  2022-09-08 11:18:15.567444086 +0000
@@ -7,8 +7,8 @@
   ],
   "detach": false,
   "environment": {
-    "TRIPLEO_DEPLOY_IDENTIFIER": "1662562103",
-    "TRIPLEO_CONFIG_HASH": "aea856f83d5616154dd5dcf7d2e06081"
+    "TRIPLEO_DEPLOY_IDENTIFIER": "1662634871",
+    "TRIPLEO_CONFIG_HASH": "c4506d96554c2fa3e551c1e9f8cfa5ad-c4506d96554c2fa3e551c1e9f8cfa5ad"
   },
   "image": "undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp15-openstack-nova-api:20200115.1",
   "net": "host",
@@ -26,8 +26,8 @@
     "/etc/puppet:/etc/puppet:ro",
     "/var/log/containers/nova:/var/log/nova:z",
     "/var/log/containers/httpd/nova-api:/var/log/httpd:z",
-    "/var/lib/kolla/config_files/nova_api_db_sync.json:/var/lib/kolla/config_files/config.json:ro",
-    "/var/lib/config-data/puppet-generated/nova:/var/lib/kolla/config_files/src:ro",
+    "/var/lib/config-data/nova/etc/my.cnf.d/tripleo.cnf:/etc/my.cnf.d/tripleo.cnf:ro",
+    "/var/lib/config-data/nova/etc/nova/:/etc/nova/:ro",
     "/var/lib/container-config-scripts/:/container-config-scripts/:ro"
   ]
 }
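
The same comparison can be narrowed to the bind mounts only, which filters out the expected changes in TRIPLEO_DEPLOY_IDENTIFIER and TRIPLEO_CONFIG_HASH. This is a sketch under assumptions: the key name "volumes" is not visible in the diff hunk and is assumed here, and volume_changes and load_config are hypothetical helper names.

```python
import json


def volume_changes(old_cfg, new_cfg, key="volumes"):
    """Return (removed, added) bind mounts between two container configs.

    The "volumes" key is an assumption about the JSON layout.
    """
    old = old_cfg.get(key, [])
    new = new_cfg.get(key, [])
    removed = [v for v in old if v not in new]
    added = [v for v in new if v not in old]
    return removed, added


def load_config(path):
    """Load one container startup config from a JSON file."""
    with open(path) as fh:
        return json.load(fh)


# Sample data reduced to the mounts shown in the diff above.
shared = "/var/lib/container-config-scripts/:/container-config-scripts/:ro"
old_cfg = {"volumes": [
    "/var/lib/kolla/config_files/nova_api_db_sync.json:/var/lib/kolla/config_files/config.json:ro",
    "/var/lib/config-data/puppet-generated/nova:/var/lib/kolla/config_files/src:ro",
    shared,
]}
new_cfg = {"volumes": [
    "/var/lib/config-data/nova/etc/my.cnf.d/tripleo.cnf:/etc/my.cnf.d/tripleo.cnf:ro",
    "/var/lib/config-data/nova/etc/nova/:/etc/nova/:ro",
    shared,
]}

removed, added = volume_changes(old_cfg, new_cfg)
for mount in removed:
    print("-", mount)
for mount in added:
    print("+", mount)
```

To compare the real files, pass the results of load_config() for the saved copy and for the current hashed-nova_api_db_sync_stein.json instead of the sample dictionaries.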

Also, if I run

paunch debug --file /var/lib/tripleo-config/container-startup-config/step_3/hashed-nova_api_db_sync_stein.json --container nova_api_db_sync_stein --interactive --shell --action run

the /etc/nova/ directory inside the container does not contain the MySQL connection string. All I see is the default nova.conf that ships with the RPM.
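
The check described above (is [api_database]/connection actually set in the nova.conf the container sees?) can be scripted. The following is a minimal sketch using Python's standard configparser rather than oslo.config, so it assumes a file simple enough for that parser. The function name connection_is_set and the sample connection URL are hypothetical.

```python
import configparser


def connection_is_set(conf_text, section="api_database", option="connection"):
    """Return True if the option has a non-empty, uncommented value."""
    # strict=False tolerates repeated options; interpolation=None keeps
    # '%' characters in URLs from being interpreted.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return bool(parser.get(section, option, fallback="").strip())


# A packaged default typically leaves the option commented out.
default_conf = "[DEFAULT]\n#debug = false\n\n[api_database]\n#connection = <None>\n"

# Hypothetical configured file; the URL is a placeholder, not from the report.
configured_conf = (
    "[api_database]\n"
    "connection = mysql+pymysql://nova_api:secret@192.0.2.1/nova_api\n"
)

print(connection_is_set(default_conf))
print(connection_is_set(configured_conf))
```

Running this against the text of /etc/nova/nova.conf from inside the debug container would show whether the generated configuration was mounted or only the packaged default is present.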

Comment 17 errata-xmlrpc 2022-12-07 19:24:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8794

