Bug 1421347
| Summary: | cinder retype --migration-policy on-demand issues in OSP-5 | | |
| --- | --- | --- | --- |
| Product: | Red Hat OpenStack | Reporter: | Robin Cernin <rcernin> |
| Component: | openstack-cinder | Assignee: | Jon Bernard <jobernar> |
| Status: | CLOSED WONTFIX | QA Contact: | Tzach Shefi <tshefi> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 5.0 (RHEL 7) | CC: | acanan, asoni, dhill, eglynn, eharney, geguileo, jobernar, jwaterwo, mflusche, nkinder, pgrist, srevivo, tshefi |
| Target Milestone: | async | Keywords: | Reopened, Triaged, ZStream |
| Target Release: | 5.0 (RHEL 7) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-04-17 23:12:38 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Robin Cernin
2017-02-11 09:41:53 UTC
Possibly related to this upstream bug, which sounds like the same symptom and has a proposed fix: https://bugs.launchpad.net/cinder/+bug/1657806

Comparing the libvirt output with the nova output, we saw that different volumes appear to be attached.

nova shows connected:

```
6e8693e5-6979-4a7a-8f65-95b0ff8b8bdb   retyping
8a369f26-7fd9-4d63-a863-eb96118909e1
9b02a589-dae8-4780-9a9d-20124cf52adf
```

libvirt shows:

```
3a600a0e-ef48-4189-8e05-450d8efe30e5
173e7cf1-f59c-4a43-b6a3-a00fbe4d49f1
17cbe359-9866-415d-97c8-a62866cf0721   attaching
```

Looking at the database, the device that is stuck attaching is the migrated volume that is stuck retyping:

```
MariaDB [cinder]> select * from volumes where id LIKE '%f0721'\G
*************************** 1. row ***************************
           created_at: 2017-03-02 05:47:12
           updated_at: 2017-03-02 06:12:14
           deleted_at: NULL
              deleted: 0
                   id: 17cbe359-9866-415d-97c8-a62866cf0721
               ec2_id: NULL
              user_id: fe4729cf6a7e462fb8925daf36ae3c8e
           project_id: ac59371df2ba455b939d6ed2f79c6b04
                 host: ha-controller@backend_netapp10
                 size: 500
    availability_zone: nova
        instance_uuid: a50d8b51-0361-490e-9281-844b3f3a0e8c
           mountpoint: /dev/vdc
          attach_time: 2017-03-02T05:48:07.657648
               status: attaching
        attach_status: detached
         scheduled_at: 2017-03-02 05:47:12
          launched_at: 2017-03-02 06:12:13
        terminated_at: NULL
         display_name: lgposput00329_vdc_restore_2017-03-02_05:42:08_recovery_
  display_description: NULL
    provider_location: gso-e3-affnas01-svm1_lif2.gso.aexp.com:/ge3affnas01_ipc2_cinder9
        provider_auth: NULL
          snapshot_id: d052588d-a0d8-401a-b3fa-de4bab611570
       volume_type_id: 41130358-8d77-447f-af5e-6e276193276b
         source_volid: NULL
             bootable: 0
        attached_host: NULL
    provider_geometry: NULL
             _name_id: NULL
    encryption_key_id: NULL
     migration_status: target:6e8693e5-6979-4a7a-8f65-95b0ff8b8bdb   <----
```

I suspect that the failure here is that nova does not correctly communicate to cinder that the volume has been successfully attached, although I didn't see any obvious sign of this in the nova logs. The customer's controller logs are incomplete, so I have requested that they upload them again.

Red Hat OpenStack Platform version 5 is now End-of-Life, and as such will not have further updates. See https://access.redhat.com/support/policy/updates/openstack/platform/ for full support lifecycle details.

After much effort and internal testing, we have determined that this operation (live volume migration during retype) is not stable in the OSP-5 release, nor did we come up with any viable fixes, as a great deal of rework was done to improve this area in later and more recent releases.

It is worth noting at least the suggestions around the keystone token timer and other settings that could time out these very long and problematic operations. For example, raising the token expiration time to 2 days by executing something like this on each keystone node:

```
$ openstack-config --set /etc/keystone/keystone.conf token expiration 172800
$ service openstack-keystone restart
```

That said, the only recommendation here is to do an offline migration and look for alternatives for moving the data.
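As a rough illustration of the database check above: volumes whose `migration_status` still carries a `target:<uuid>` value while their `status` sits in `attaching` or `retyping` are the likely stuck candidates. The sketch below uses an in-memory sqlite3 table as a stand-in for cinder's MariaDB `volumes` table (the column names and the `target:` convention follow the query output above; the table is heavily abbreviated and the sample rows are hypothetical apart from the IDs quoted in this report).

```python
# Minimal sketch: flag volumes stuck mid-retype by inspecting migration_status.
# sqlite3 stands in for cinder's MariaDB 'volumes' table; schema abbreviated.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE volumes (id TEXT, status TEXT, "
    "attach_status TEXT, migration_status TEXT)"
)
conn.executemany(
    "INSERT INTO volumes VALUES (?, ?, ?, ?)",
    [
        # a healthy attached volume, not migrating
        ("8a369f26-7fd9-4d63-a863-eb96118909e1", "in-use", "attached", None),
        # stuck: still 'attaching', migration_status names the retype target
        ("17cbe359-9866-415d-97c8-a62866cf0721", "attaching", "detached",
         "target:6e8693e5-6979-4a7a-8f65-95b0ff8b8bdb"),
    ],
)

# A migration_status that still names a target means the migration never
# completed; combined with status 'attaching'/'retyping' it is likely stuck.
stuck = conn.execute(
    "SELECT id, status, migration_status FROM volumes "
    "WHERE migration_status LIKE 'target:%' "
    "AND status IN ('attaching', 'retyping')"
).fetchall()
for vol_id, status, mig in stuck:
    print(vol_id, status, mig)
```

Against the real deployment the equivalent query would be run in the MariaDB shell, as shown in the comment above.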