Bug 2041360

Summary: Error resizing root disk instance
Product: Red Hat OpenStack
Reporter: Parag Ambre <pambre>
Component: openstack-tripleo-heat-templates
Assignee: Bogdan Dobrelya <bdobreli>
Status: CLOSED ERRATA
QA Contact: Jason Grosso <jgrosso>
Severity: medium
Priority: high
Version: 16.2 (Train)
CC: alifshit, astupnik, bdobreli, dasmith, eglynn, igallagh, jgrosso, jhakimra, kchamart, lherbolt, mariel, mburns, mschuppe, pgodwin, pweeks, sbauza, sgordon, smooney, vromanso
Target Milestone: z6
Keywords: Reopened, Triaged
Target Release: 16.2 (Train on RHEL 8.4)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20230309004940.f1d609a.el8ost tripleo-ansible-0.8.1-2.20230817005023.123ce73.el8ost
Doc Type: No Doc Update
Last Closed: 2023-11-08 19:18:30 UTC
Type: Bug

Description Parag Ambre 2022-01-17 08:23:16 UTC
Description of problem:
The customer is trying to extend the root disk of a VM by resizing the instance to a different flavor.

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 16.2.0 GA (Train)

How reproducible:


Steps to Reproduce:
1. Create the instance with the flavor
2. Try extending the flavor using the openstack server resize ...
3. The resize fails with the error below:
~~~
|                                     | \'\'\nStderr: "qemu-img: Could not open \'/var/lib/nova/instances/60ae1016-9180-4352-aa7f-66b4f3f780f4/disk\': Could not open  |
|                                     | \'/var/lib/nova/instances/60ae1016-9180-4352-aa7f-66b4f3f780f4/disk\': Permission denied\\n"\n'
~~~

Actual results:
After I set "allow_resize_to_same_host=true" on all controllers, the resize works. It fails when the parameter is left at its default value of false.
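
To confirm which value is actually in effect on a node, the option can be read back from the nova configuration file. The helper below is only a sketch: the function name nova_conf_option is hypothetical, the path to nova.conf has to be supplied by the caller because it depends on the deployment, and the function ignores INI section headers and simply reports the last uncommented assignment it finds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of an option from an INI-style file.
# Usage: nova_conf_option <path-to-nova.conf> <option-name>
# Returns non-zero if the option is not set (commented-out lines are ignored).
nova_conf_option() {
    local file="$1" option="$2"
    awk -F= -v opt="$option" '
        # Match lines whose key (text before the first "=") is exactly the option.
        $1 ~ ("^[ \t]*" opt "[ \t]*$") {
            sub(/^[^=]*=[ \t]*/, "")   # strip "key =" and keep the value
            val = $0
        }
        END { if (val != "") print val; else exit 1 }
    ' "$file"
}
```

For example, `nova_conf_option /path/to/nova.conf allow_resize_to_same_host` prints the configured value, or exits non-zero when the option is unset and the default of false applies.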

Expected results:


Additional info:

Comment 2 David Vallee Delisle 2022-01-18 17:11:43 UTC
I was able to reproduce this issue on the master branch. I'm trying to determine where it is coming from.

On one hand, when an instance is shut off, the disk file is owned by root. When it's started, the disk file is owned by qemu. It's always been like that.

On the other hand, when cold migrating (or resizing, for that matter), the nova-migration-wrapper sudos as root for some reason, and it's been like that since 2017 [1]. I believe this was to address the disk file ownership mentioned previously.

[1] https://review.rdoproject.org/r/c/openstack/nova-distgit/+/6378

Comment 3 David Vallee Delisle 2022-01-18 19:23:58 UTC
The issue I had on the master branch is caused by a recent change unrelated to RHOSP 16.2.

I proposed a new tempest test for this kind of scenario:
https://review.opendev.org/c/openstack/tempest/+/825161

Comment 4 David Vallee Delisle 2022-01-19 16:57:39 UTC
Hello,

We've discussed this bug in our team meeting, and we have some observations:

- While this might not be the root cause, we're wondering why you have this added to your sshd config?
UsePrivilegeSeparation sandbox

Also, can you please provide the output of these 2 commands in each of these states:
- When instance is healthy and running
- When instance is shutdown cleanly
- When instance is resizing while allow_resize_to_same_host=False (default)

Commands:
ls -ltraR /var/lib/nova/instances
ls -ltraRZ /var/lib/nova/instances

Another thing, is this issue only happening with this specific instance or with other instances? Can you reproduce the issue with another test instance? Is it only happening on this specific compute node?

We think it's strange that root owns the disk file when the instance is shut off. We expect the file to be owned by the nova uid (42436) when the instance is off and by qemu:qemu when it's running. Qemu is apparently managing these permissions, and it reverts them. If you shut down another instance, do you see the same behavior?
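
For comparing the states requested above, a one-line-per-file view of ownership may be easier to diff than the recursive listings. The function below is only a sketch: the name list_instance_ownership is hypothetical, it assumes GNU find and stat, and it does not show SELinux labels, so the ls -Z output is still needed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "owner:group mode path" for every entry under
# an instances directory. Defaults to the path quoted in this bug.
list_instance_ownership() {
    local dir="${1:-/var/lib/nova/instances}"
    # %U = owner name, %G = group name, %a = octal mode, %n = file name
    find "$dir" -exec stat -c '%U:%G %a %n' {} +
}
```

Running `list_instance_ownership > ownership-running.txt` in each state and diffing the resulting files should make it clear whether the disk file switches between root, nova and qemu.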

Comment 70 errata-xmlrpc 2023-11-08 19:18:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.2.6 (Train) bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6307