Bug 2181897

Summary: [Multi-RHEL] Server failed to reach VERIFY_RESIZE status: group ID permissions issue on 'nova_compute' and 'nova_migration_target' containers
Product: Red Hat OpenStack Reporter: Miro Tomaska <mtomaska>
Component: openstack-tripleo-common Assignee: Bogdan Dobrelya <bdobreli>
Status: CLOSED ERRATA QA Contact: James Parker <jparker>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 17.1 (Wallaby) CC: alifshit, aopincar, bdobreli, dasmith, eglynn, jbadiapa, jhakimra, jparker, jpretori, kchamart, mburns, owalsh, pgrist, rdiazcam, sbauza, sgordon, slinaber, svyas, vromanso
Target Milestone: ga Keywords: Triaged
Target Release: 17.1 Flags: mtomaska: needinfo-
mtomaska: needinfo-
bdobreli: needinfo-
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-common-15.4.1-1.20230518211050.cbb03c0.el9ost Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-08-16 01:14:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2016660    

Description Miro Tomaska 2023-03-26 18:34:47 UTC
Description of problem:
Multiple tempest tests are failing with an error like this:
Details: (ServerActionsTestJSON:test_resize_server_revert_with_volume_attached) Server dfc0d66a-05ea-4733-8477-b8bc451329c9 failed to reach VERIFY_RESIZE status and task state "None" within the required time (300 s). Current status: ACTIVE. Current task state: None.

An example of a failed test:
https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/custom/job/custom-17.1_compact-director-rhel-9.2-virthost-3cont_2comp_3ceph_1freeipa_4comprhel8-ipv4-geneve-ceph-tls-multirhel/2/testReport/junit/tempest.api.compute.servers.test_server_actions/ServerActionsTestJSON/test_resize_server_revert_id_c03aab19_adb1_44f5_917d_c419577e9e68_/

It is possible that this is not a product issue but a test-only issue. Regardless, I wanted to have it officially documented. Kashyap Chamarthy has already been informed about this BZ.


Version-Release number of selected component (if applicable):
17.1

How reproducible:
100% on a multirhel deployment

Steps to Reproduce:
1. Deploy a 17.1 multi-RHEL environment; the following Jenkins job can be used for it:
   https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/custom/job/custom-17.1_compact-director-rhel-9.2-virthost-3cont_2comp_3ceph_1freeipa_4comprhel8-ipv4-geneve-ceph-tls-multirhel/
2. Run the tempest.api.compute.* tests.
3. Notice that they fail with the error mentioned above.

Actual results:


Expected results:


Additional info:
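The summary attributes the failure to a group ID permissions issue between the 'nova_compute' and 'nova_migration_target' containers. This report does not say which group is affected, so the following is only a minimal sketch of how such a mismatch could be spotted: it compares two /etc/group-style listings and reports groups whose numeric GIDs differ. The function names and the sample data (including the GID values) are hypothetical and are not taken from the affected deployment. In practice the two listings might be captured from each container, for example with `podman exec <container> cat /etc/group`, which is an assumption about the environment.

```python
def parse_group_file(text):
    """Map group name -> GID from /etc/group-style text (name:password:GID:members)."""
    gids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3 or not fields[2].isdigit():
            continue  # skip malformed entries
        gids[fields[0]] = int(fields[2])
    return gids


def gid_mismatches(groups_a, groups_b):
    """Return {name: (gid_a, gid_b)} for groups present in both listings with different GIDs."""
    return {
        name: (groups_a[name], groups_b[name])
        for name in sorted(groups_a.keys() & groups_b.keys())
        if groups_a[name] != groups_b[name]
    }


# Hypothetical sample listings; the real group names and GIDs are not given in this report.
SAMPLE_A = "root:x:0:\nnova:x:42436:\nqemu:x:107:nova\n"
SAMPLE_B = "root:x:0:\nnova:x:42436:\nqemu:x:42427:nova\n"

if __name__ == "__main__":
    a = parse_group_file(SAMPLE_A)
    b = parse_group_file(SAMPLE_B)
    for name, (gid_a, gid_b) in gid_mismatches(a, b).items():
        print(f"{name}: {gid_a} != {gid_b}")
```

Any group reported by this comparison would be a candidate for the permissions problem, since files written under one GID may not be accessible to the same group name mapped to a different GID on the other side.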

Comment 6 Artom Lifshitz 2023-04-05 15:34:26 UTC
Asking for blocker+ because migrations will not work until this is fixed.

Comment 7 Bogdan Dobrelya 2023-04-11 13:40:55 UTC
Submitted the downstream patch before the upstream one got merged (low attention from core reviewers is keeping it pending).

Comment 10 melanie witt 2023-04-26 16:30:11 UTC
*** Bug 2186553 has been marked as a duplicate of this bug. ***

Comment 18 Bogdan Dobrelya 2023-05-04 15:47:51 UTC
A follow-up patch has been linked.

Comment 21 Bogdan Dobrelya 2023-05-15 13:19:26 UTC
One more fix has been added upstream.

Comment 54 errata-xmlrpc 2023-08-16 01:14:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577