Created attachment 1266509 [details]
Cinder logs and config

Description of problem:
On an RBD mirroring deployment, replicated volumes remain in status "error" after a cinder failover-host command. The replicated volume's snapshot, however, remains in status "available".

Version-Release number of selected component (if applicable):
rhel7.3
openstack-cinder-10.0.1-0.20170310192919.b05afc3.el7ost.noarch
python-cinderclient-1.11.0-1.el7ost.noarch
puppet-cinder-10.3.0-1.el7ost.noarch
python-cinder-10.0.1-0.20170310192919.b05afc3.el7ost.noarch

How reproducible:

Steps to Reproduce:
1. Create 3 Cinder volumes, two replicated (volume type REPL) and one non-replicated (a sketch of how such a replicated volume type is typically created is included under Additional info below).

cinder list
+--------------------------------------+-----------+-------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name        | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+-------------+------+-------------+----------+-------------+
| 95bf822c-5390-4618-bd26-494e4767435d | available | NoneReplvol | 1    | -           | false    |             |
| a21307f8-5c58-4ad4-8245-92fbe05d3951 | available | RepVolume2  | 1    | REPL        | false    |             |
| e7447352-06fa-443b-ab18-9be707b90f8b | available | RepVolume1  | 1    | REPL        | false    |             |
+--------------------------------------+-----------+-------------+------+-------------+----------+-------------+

2. Create snapshots of all volumes:

cinder snapshot-list
+--------------------------------------+--------------------------------------+-----------+-----------------+------+
| ID                                   | Volume ID                            | Status    | Name            | Size |
+--------------------------------------+--------------------------------------+-----------+-----------------+------+
| 0ae49ade-13ca-4e4c-b894-26650a457352 | e7447352-06fa-443b-ab18-9be707b90f8b | available | SnapRepVolume1  | 1    |
| 5726b798-8ece-4984-bd36-5728bdd6b177 | a21307f8-5c58-4ad4-8245-92fbe05d3951 | available | SnapRepVolume2  | 1    |
| c51fba12-e31e-4b3d-9da1-6f32ffae13bc | 95bf822c-5390-4618-bd26-494e4767435d | available | SnapNoneReplvol | 1    |
+--------------------------------------+--------------------------------------+-----------+-----------------+------+

I've since deleted two snapshots from the list above, but that is irrelevant to this bug.

3. Fail over the backend, wait a few minutes, then check the service list:

# cinder failover-host hostgroup@tripleo_ceph
# cinder service-list --withreplication
+------------------+------------------------+------+----------+-------+----------------------------+--------------------+-------------------+--------+-----------------+
| Binary           | Host                   | Zone | Status   | State | Updated_at                 | Replication Status | Active Backend ID | Frozen | Disabled Reason |
+------------------+------------------------+------+----------+-------+----------------------------+--------------------+-------------------+--------+-----------------+
| cinder-scheduler | hostgroup              | nova | enabled  | up    | 2017-03-26T12:00:34.000000 |                    |                   |        | -               |
| cinder-volume    | hostgroup@tripleo_ceph | nova | disabled | up    | 2017-03-26T12:00:33.000000 | failed-over        | cephb             | False  | failed-over     |
+------------------+------------------------+------+----------+-------+----------------------------+--------------------+-------------------+--------+-----------------+

As can be seen, the system has now failed over to cephb.

4. Notice that all three volumes are now in error state; this is expected, but only for the non-replicated volume!
[stack@undercloud-0 ~]$ cinder list
+--------------------------------------+--------+-------------+------+-------------+----------+-------------+
| ID                                   | Status | Name        | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+--------+-------------+------+-------------+----------+-------------+
| 95bf822c-5390-4618-bd26-494e4767435d | error  | NoneReplvol | 1    | -           | false    |             |
| a21307f8-5c58-4ad4-8245-92fbe05d3951 | error  | RepVolume2  | 1    | REPL        | false    |             |
| e7447352-06fa-443b-ab18-9be707b90f8b | error  | RepVolume1  | 1    | REPL        | false    |             |
+--------------------------------------+--------+-------------+------+-------------+----------+-------------+

5. Even more odd, the snapshot of a replicated volume (the only snapshot left before the failover) is still available:

[stack@undercloud-0 ~]$ cinder snapshot-list
+--------------------------------------+--------------------------------------+-----------+----------------+------+
| ID                                   | Volume ID                            | Status    | Name           | Size |
+--------------------------------------+--------------------------------------+-----------+----------------+------+
| 5726b798-8ece-4984-bd36-5728bdd6b177 | a21307f8-5c58-4ad4-8245-92fbe05d3951 | available | SnapRepVolume2 | 1    |
+--------------------------------------+--------------------------------------+-----------+----------------+------+

Actual results:
All volumes are in error state; this is expected only for the non-replicated volume. It is also odd that a replicated volume's snapshot is available while its base volume isn't.

Expected results:
Non-replicated volumes should indeed be in error state, but replicated volumes should be available.

Additional info:
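
For reference, the REPL volume type used in step 1 is a replicated volume type. The exact commands used to create it were not captured in this report; a minimal sketch, assuming the backend name tripleo_ceph seen in the service list above, would be:

cinder type-create REPL
cinder type-key REPL set volume_backend_name=tripleo_ceph
cinder type-key REPL set replication_enabled='<is> True'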
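
Once the primary cluster is healthy again, failing back should be possible with something along these lines (the "default" backend_id is the convention for returning to the primary backend; this was not exercised as part of this report):

cinder failover-host hostgroup@tripleo_ceph --backend_id default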
Retargeting the replication testing bugs that were found to OSP 12.
Will be resolved when replication is correctly deployed (OSP 17) and TripleO can deploy a replicated volume backend.
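
As a follow-up note, the RBD replication configuration that TripleO would need to render looks roughly like the cinder.conf snippet below. This is only a sketch: the backend_id cephb matches the Active Backend ID reported in the service list above, but the pool, Ceph user and secondary ceph.conf path are assumptions for illustration.

[tripleo_ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = tripleo_ceph
rbd_pool = volumes
rbd_user = openstack
rbd_ceph_conf = /etc/ceph/ceph.conf
replication_device = backend_id:cephb,conf:/etc/ceph/cephb.conf,user:openstack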