
Bug 1851239

Summary: [13->16.1] Instances post upgrade shutdown on "nova-compute attempted direct database access which is not allowed by policy"
Product: Red Hat OpenStack
Reporter: Archit Modi <amodi>
Component: openstack-tripleo-heat-templates
Assignee: Ollie Walsh <owalsh>
Status: CLOSED CURRENTRELEASE
QA Contact: David Rosenfeld <drosenfe>
Severity: urgent
Priority: urgent
Version: 16.1 (Train)
CC: bdobreli, dciabrin, jfrancoa, jpretori, lmiccini, mburns, morazi, owalsh, rrasouli, smooney, stephenfin
Target Milestone: z3
Keywords: Triaged, UpgradeBlocker
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2020-11-20 14:48:10 UTC
Type: Bug
Bug Depends On: 1849235

Description Archit Modi 2020-06-25 21:07:56 UTC
Description of problem: After the 13->16.1 upgrade completes successfully, instances that were spun up before or during the upgrade (on RHOS 13) are shut down due to the following error:
nova.exception.DBNotAllowed: nova-compute attempted direct database access which is not allowed by policy

How reproducible: always

Steps to Reproduce:
1. Deploy RHOS 13 and start the upgrade process to 16.1
2. During the upgrade, spin up an instance and live migrate it to another compute host
3. After a successful FFWD2 upgrade, the instance status is shown as ACTIVE but its power state is Shutdown (task state powering-off)

Actual results:
# spin up an instance
+--------------------------------------+-----------+--------+------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+
| ID                                   | Name      | Status | Task State | Power State | Networks                           | Image Name                   | Image ID                             | Flavor Name | Flavor ID                            | Availability Zone | Host                   | Properties |
+--------------------------------------+-----------+--------+------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+
| 6f9d6d6f-6d32-4350-a217-896ea440fcec | test-7370 | ACTIVE | None       | Running     | private=192.168.100.13, 10.0.0.216 | cirros-0.4.0-x86_64-disk.img | 7f21c88a-daad-4e17-890c-b5ab5fdc07e7 | m1.tiny     | f4c46705-fff1-45b3-ab78-248bb6951e7f | nova              | compute-1.redhat.local |            |
+--------------------------------------+-----------+--------+------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+
(qe-Cloud-0) [stack@undercloud-0 ~]$ openstack server migrate test-7370 --block-migration --live-migration
(qe-Cloud-0) [stack@undercloud-0 ~]$ openstack server list --long
+--------------------------------------+-----------+-----------+------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+
| ID                                   | Name      | Status    | Task State | Power State | Networks                           | Image Name                   | Image ID                             | Flavor Name | Flavor ID                            | Availability Zone | Host                   | Properties |
+--------------------------------------+-----------+-----------+------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+
| 6f9d6d6f-6d32-4350-a217-896ea440fcec | test-7370 | MIGRATING | migrating  | Running     | private=192.168.100.13, 10.0.0.216 | cirros-0.4.0-x86_64-disk.img | 7f21c88a-daad-4e17-890c-b5ab5fdc07e7 | m1.tiny     | f4c46705-fff1-45b3-ab78-248bb6951e7f | nova              | compute-1.redhat.local |            |
+--------------------------------------+-----------+-----------+------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+
(qe-Cloud-0) [stack@undercloud-0 ~]$ openstack server list --long
+--------------------------------------+-----------+--------+------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+
| ID                                   | Name      | Status | Task State | Power State | Networks                           | Image Name                   | Image ID                             | Flavor Name | Flavor ID                            | Availability Zone | Host                   | Properties |
+--------------------------------------+-----------+--------+------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+
| 6f9d6d6f-6d32-4350-a217-896ea440fcec | test-7370 | ACTIVE | None       | Running     | private=192.168.100.13, 10.0.0.216 | cirros-0.4.0-x86_64-disk.img | 7f21c88a-daad-4e17-890c-b5ab5fdc07e7 | m1.tiny     | f4c46705-fff1-45b3-ab78-248bb6951e7f | nova              | compute-0.redhat.local |            |
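+--------------------------------------+-----------+--------+------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+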
# post ffwd2 upgrade
(qe-Cloud-0) [stack@undercloud-0 ~]$ openstack server list --long
+--------------------------------------+-----------+--------+--------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+
| ID                                   | Name      | Status | Task State   | Power State | Networks                           | Image Name                   | Image ID                             | Flavor Name | Flavor ID                            | Availability Zone | Host                   | Properties |
+--------------------------------------+-----------+--------+--------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+
| 6f9d6d6f-6d32-4350-a217-896ea440fcec | test-7370 | ACTIVE | powering-off | Shutdown    | private=192.168.100.13, 10.0.0.216 | cirros-0.4.0-x86_64-disk.img | 7f21c88a-daad-4e17-890c-b5ab5fdc07e7 | m1.tiny     | f4c46705-fff1-45b3-ab78-248bb6951e7f | nova              | compute-0.redhat.local |            |
+--------------------------------------+-----------+--------+--------------+-------------+------------------------------------+------------------------------+--------------------------------------+-------------+--------------------------------------+-------------------+------------------------+------------+

Expected results:
Instances remain running (power state Running) after the upgrade
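
The failing state shown above (status ACTIVE combined with power state Shutdown) can be spotted with a short script. This is only a minimal sketch: it assumes the output of `openstack server list --long -f json` has been captured as a string and that the JSON keys match the column headers in the tables above ("Status", "Power State"). The function name `find_mismatched_instances` is hypothetical and not part of any OpenStack tooling.

```python
import json


def find_mismatched_instances(server_list_json):
    """Return servers reported as ACTIVE whose power state is not Running.

    Assumes JSON keys match the CLI column headers ("Status", "Power State").
    """
    servers = json.loads(server_list_json)
    return [
        server for server in servers
        if server.get("Status") == "ACTIVE"
        and server.get("Power State") != "Running"
    ]


# Sample row using the values from the post-upgrade table in this report.
sample = json.dumps([
    {
        "ID": "6f9d6d6f-6d32-4350-a217-896ea440fcec",
        "Name": "test-7370",
        "Status": "ACTIVE",
        "Task State": "powering-off",
        "Power State": "Shutdown",
        "Host": "compute-0.redhat.local",
    }
])

print([server["Name"] for server in find_mismatched_instances(sample)])
# prints ['test-7370']
```

Run against the listing taken before the upgrade (power state Running), the same function would return an empty list.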

Additional info:

Comment 2 Ollie Walsh 2020-07-08 15:17:33 UTC
Right now this only occurs when UpgradeLevelNovaCompute is set to auto. There is no reason to set this for FFU, so the simple fix in https://bugzilla.redhat.com/show_bug.cgi?id=1849235 is sufficient for now.
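
As a rough pre-upgrade check for the trigger condition, something like the following could flag an environment that sets the parameter to auto. This is a sketch under assumptions: the environment file is assumed to be already loaded into a Python dict with the parameter under the usual `parameter_defaults` key, and the helper name `uses_auto_compute_level` is made up for illustration.

```python
def uses_auto_compute_level(environment):
    """Return True if the environment sets UpgradeLevelNovaCompute to 'auto'.

    `environment` is assumed to be a dict parsed from a heat environment
    file, with parameters under the 'parameter_defaults' key.
    """
    params = environment.get("parameter_defaults") or {}
    return params.get("UpgradeLevelNovaCompute") == "auto"


# Hypothetical environment contents, for illustration only.
env_with_auto = {"parameter_defaults": {"UpgradeLevelNovaCompute": "auto"}}
env_without = {"parameter_defaults": {}}

print(uses_auto_compute_level(env_with_auto))  # prints True
print(uses_auto_compute_level(env_without))    # prints False
```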

Long term, we need to rework the nova database config in t-h-t. I've started this in https://review.opendev.org/718552. However, this would cause issues for deployments that omit controllers (e.g. scale-up deployments sometimes do this to speed up the process), so I need to implement an alternative approach for this case.

Comment 13 Ollie Walsh 2020-11-20 14:48:10 UTC
Closing this as the upgrade issue was resolved. Will track the larger TripleO refactor in BZ1899982.