Bug 2226366 - [RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z1
Target Release: 17.1
Assignee: Eric Harney
QA Contact: Yosi Ben Shimon
Docs Contact: Ian Frangs
URL:
Whiteboard:
Duplicates: 2229174
Depends On:
Blocks:
 
Reported: 2023-07-25 19:57 UTC by Eric Harney
Modified: 2023-09-20 00:30 UTC (History)
CC List: 11 users

Fixed In Version: openstack-cinder-18.2.2-1.20230518161045.el9ost
Doc Type: Bug Fix
Doc Text:
Before this update, when retyping `in-use` Red Hat Ceph Storage (RHCS) volumes to store the volume in a different pool than its current location, data could be corrupted or lost. With this update, the Block Storage RHCS back end resolves this issue.
Clone Of:
: 2229174 (view as bug list)
Environment:
Last Closed: 2023-09-20 00:29:46 UTC
Target Upstream Version:
Embargoed:


Attachments
reproduction notes (11.49 KB, text/plain)
2023-07-25 19:57 UTC, Eric Harney


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 2019190 | 0 | None | None | None | 2023-07-25 19:57:21 UTC
OpenStack gerrit | 893863 | 0 | None | NEW | Volume retype/migrate: verify instance after an hard reboot | 2023-09-06 11:17:31 UTC
Red Hat Issue Tracker | OSP-26895 | 0 | None | None | None | 2023-07-25 19:57:49 UTC
Red Hat Product Errata | RHBA-2023:5138 | 0 | None | None | None | 2023-09-20 00:30:04 UTC

Description Eric Harney 2023-07-25 19:57:21 UTC
Created attachment 1979481
reproduction notes

Description of problem:
Retyping an in-use RBD volume moves the RBD image but does not update the volume's location in the instance that is using it.


Steps to Reproduce:
1.  Have an RBD volume attached to an instance
2.  Retype the volume with migration to a type that places it in a different RBD pool (a different c-vol backend)
3.  Compare the RBD volume's location (rbd ls volumes) with the location recorded in the instance's domain XML (virsh dumpxml).
4.  Reboot the instance w/ openstack server reboot
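
The comparison in step 3 can be sketched in Python. This is a minimal illustration, not part of the original report: it assumes the libvirt domain XML describes each RBD disk with a <source protocol='rbd' name='pool/image'> element, and the helper names (rbd_sources, stale_sources), the sample XML, and the pool and image names are all hypothetical. A match on the pool name alone does not prove the attachment is healthy; the check only flags disks that still reference a pool other than the retype target.

```python
import xml.etree.ElementTree as ET
from typing import List


def rbd_sources(domain_xml: str) -> List[str]:
    """Return the 'pool/image' name of every RBD-backed disk in a domain XML."""
    root = ET.fromstring(domain_xml)
    names = []
    for source in root.findall("./devices/disk/source"):
        # Only network disks served over the rbd protocol are of interest.
        if source.get("protocol") == "rbd" and source.get("name"):
            names.append(source.get("name"))
    return names


def stale_sources(domain_xml: str, expected_pool: str) -> List[str]:
    """Return RBD sources whose pool differs from the retype target pool."""
    return [
        name
        for name in rbd_sources(domain_xml)
        if name.split("/", 1)[0] != expected_pool
    ]


# Hypothetical, trimmed `virsh dumpxml` output; names are made up.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='network' device='disk'>
      <source protocol='rbd' name='volumes/volume-1111'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    # Assumes the volume was retyped into a pool named "volumes2".
    for name in stale_sources(SAMPLE_XML, expected_pool="volumes2"):
        print("instance still references", name)
```

In practice the XML would come from the output of virsh dumpxml for the affected instance, and the expected pool from the backend behind the new volume type.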

Actual results:
Instance cannot boot.

Additional info:
The upstream bug contains more detailed notes on reproduction:
https://bugs.launchpad.net/cinder/+bug/2019190
https://paste.openstack.org/raw/bNpzkjbeXrmTCwNHfDGs/

Comment 8 Brian Rosmaita 2023-08-04 15:07:18 UTC
The "known issue" BZ for 17.1 GA is https://bugzilla.redhat.com/show_bug.cgi?id=2229174

Comment 9 Andy Stillman 2023-08-09 13:26:28 UTC
*** Bug 2229174 has been marked as a duplicate of this bug. ***

Comment 25 Luigi Toscano 2023-09-14 13:53:32 UTC
After manually applying the reproduction steps, which confirmed the fix, the scenario was further verified by running the tests from the WIP tempest and cinder-tempest-plugin patches, which can reproduce the problem, namely:
- https://review.opendev.org/c/openstack/tempest/+/890360
- https://review.opendev.org/c/openstack/cinder-tempest-plugin/+/894189

All of those tests pass now (they failed before the fix). Kudos to Yosi for most of the verification.

openstack-cinder-18.2.2-1.20230518161045.el9ost.noarch
python3-cinder-18.2.2-1.20230518161045.el9ost.noarch
python3-cinder-common-18.2.2-1.20230518161045.el9ost.noarch

Comment 31 errata-xmlrpc 2023-09-20 00:29:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:5138

