Bug 2226366 - [RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
Keywords:
Status: MODIFIED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z1
Target Release: ---
Assignee: Eric Harney
QA Contact: Evelina Shames
Docs Contact: Andy Stillman
URL:
Whiteboard:
Duplicates: 2229174
Depends On:
Blocks:
 
Reported: 2023-07-25 19:57 UTC by Eric Harney
Modified: 2023-08-16 21:40 UTC (History)

Fixed In Version: openstack-cinder-18.2.2-17.1.20230816200905.f6b44fc.el9osttrunk
Doc Type: Known Issue
Doc Text:
There is currently a known issue when using a Red Hat Ceph Storage (RHCS) back end for volumes that can prevent instances from being rebooted, and may lead to data corruption. This occurs when all of the following conditions are met:

* RHCS is the back end for instance volumes.
* RHCS has multiple storage pools for volumes.
* A volume is being retyped where the new type requires the volume to be stored in a different pool than its current location.
* The retype call uses the `on-demand` migration_policy.
* The volume is attached to an instance.

Workaround: Do not retype `in-use` volumes that meet all of these listed conditions.
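
The conditions in the Doc Text can be read as a single all-must-hold check. The following is a minimal sketch of that check, not Cinder code: the `RetypeRequest` class, its field names, and `is_affected` are hypothetical names introduced only for illustration, and the caller is assumed to gather the values by some other means.

```python
from dataclasses import dataclass


@dataclass
class RetypeRequest:
    """Hypothetical summary of a planned retype; not a Cinder data structure."""
    backend_is_rhcs: bool   # RHCS is the back end for instance volumes
    rhcs_pool_count: int    # number of RHCS pools used for volumes
    current_pool: str       # pool the volume is stored in now
    target_pool: str        # pool required by the new volume type
    migration_policy: str   # migration_policy passed to the retype call
    volume_status: str      # e.g. "in-use" when attached to an instance


def is_affected(req: RetypeRequest) -> bool:
    """Return True only when every condition listed in the Doc Text holds."""
    return (
        req.backend_is_rhcs
        and req.rhcs_pool_count > 1
        and req.current_pool != req.target_pool
        and req.migration_policy == "on-demand"
        and req.volume_status == "in-use"
    )
```

If `is_affected` returns True for a planned retype, the workaround above applies: do not retype the volume while it is `in-use`.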
Clone Of:
Clones: 2229174
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Attachments
reproduction notes (11.49 KB, text/plain)
2023-07-25 19:57 UTC, Eric Harney


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 2019190 | 0 | None | None | None | 2023-07-25 19:57:21 UTC
Red Hat Issue Tracker | OSP-26895 | 0 | None | None | None | 2023-07-25 19:57:49 UTC

Description Eric Harney 2023-07-25 19:57:21 UTC
Created attachment 1979481
reproduction notes

Description of problem:
Retyping an in-use RBD volume moves the RBD image but does not update the volume's location in the instance that is using it.


Steps to Reproduce:
1.  Have an RBD volume attached to an instance.
2.  Retype the volume with migration to a type that moves it to a different RBD pool (a different cinder-volume backend).
3.  Compare the RBD volume's location (rbd ls volumes) with the location recorded in the instance VM (virsh dumpxml).
4.  Reboot the instance with openstack server reboot.
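
The comparison in step 3 can be sketched in Python, assuming the `virsh dumpxml` output and the per-pool `rbd ls` listings have already been captured as strings and sets. The function names `rbd_sources` and `stale_disks`, and the sample pool name `volumes2`, are hypothetical; the sketch relies only on the libvirt convention that an RBD disk source carries `protocol='rbd'` and a `name` of the form `pool/image`.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET


def rbd_sources(domain_xml: str) -> list[tuple[str, str]]:
    """Return (pool, image) pairs for each RBD-backed disk in a libvirt domain XML."""
    root = ET.fromstring(domain_xml)
    pairs = []
    for source in root.findall("./devices/disk/source[@protocol='rbd']"):
        # libvirt stores the RBD location as "pool/image" in the name attribute
        pool, _, image = source.get("name", "").partition("/")
        pairs.append((pool, image))
    return pairs


def stale_disks(domain_xml: str,
                pool_images: dict[str, set[str]]) -> list[tuple[str, str]]:
    """List disks whose image is absent from the pool the instance points at.

    pool_images maps a pool name to the image names that `rbd ls <pool>`
    reported for it (collected by the caller; an assumption of this sketch).
    """
    return [
        (pool, image)
        for pool, image in rbd_sources(domain_xml)
        if image not in pool_images.get(pool, set())
    ]


if __name__ == "__main__":
    # Hypothetical sample data: the image now lives in "volumes2",
    # but the instance definition still references "volumes".
    sample_xml = (
        "<domain><devices><disk type='network' device='disk'>"
        "<source protocol='rbd' name='volumes/volume-1234'/>"
        "</disk></devices></domain>"
    )
    print(stale_disks(sample_xml, {"volumes": set(), "volumes2": {"volume-1234"}}))
```

A non-empty result would indicate the mismatch described in this bug: the instance still references the old pool after the image was moved.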

Actual results:
Instance cannot boot.

Additional info:
The upstream bug contains more detailed notes on reproduction:
https://bugs.launchpad.net/cinder/+bug/2019190
https://paste.openstack.org/raw/bNpzkjbeXrmTCwNHfDGs/

Comment 8 Brian Rosmaita 2023-08-04 15:07:18 UTC
The "known issue" BZ for 17.1 GA is https://bugzilla.redhat.com/show_bug.cgi?id=2229174

Comment 9 Andy Stillman 2023-08-09 13:26:28 UTC
*** Bug 2229174 has been marked as a duplicate of this bug. ***

