Description of problem:
When an overcloud with MDS services running on the controller nodes is upgraded, ceph-ansible, due to its limitations, sets all of the MDS services as active instead of 1 active and 2 standby.

After the upgrade the Ceph cluster status is:

$ sudo ceph -s
  cluster:
    id:     c18db176-6a52-11e8-818d-525400af03d6
    health: HEALTH_WARN
            insufficient standby MDS daemons available
            clock skew detected on mon.controller-0

  services:
    mon: 3 daemons, quorum controller-1,controller-2,controller-0
    mgr: controller-2(active), standbys: controller-0, controller-1
    mds: cephfs-3/3/3 up {0=controller-0=up:active,1=controller-2=up:active,2=controller-1=up:active}
    osd: 5 osds: 5 up, 5 in

  data:
    pools:   8 pools, 288 pgs
    objects: 57 objects, 192 kB
    usage:   549 MB used, 99235 MB / 99784 MB avail
    pgs:     288 active+clean

Version-Release number of selected component (if applicable):
ceph-ansible-3.1.0-0.1.rc3.el7cp.noarch
puppet-tripleo-8.3.2-6.el7ost.noarch
openstack-tripleo-puppet-elements-8.0.0-2.el7ost.noarch
openstack-tripleo-common-8.6.1-19.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP 12 overcloud with MDS services
2. Upgrade to OSP 13 and Ceph 3.x
3. Validate the status of the Ceph cluster

Actual results:
As shown in the description, the Ceph cluster health is HEALTH_WARN, with 3 active MDS services and no standby.

Expected results:
The default overcloud status for MDS services, with 1 active and 2 standby.

Additional info:
More details on this at: https://bugzilla.redhat.com/show_bug.cgi?id=1415236#c9
Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate.

Regards,
Giri
(In reply to Giridhar Ramaraju from comment #10)
> Updating the QA Contact to Hemant. Hemant will be rerouting them to the
> appropriate QE Associate.
>
> Regards,
> Giri

The validation from the TripleO side is blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1757570.

Ceph QE can validate this bug by upgrading Ceph 2.5 -> Ceph 3.1 with MDS running.
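For the validation step, a minimal sketch of a check that could be run against the "mds:" line of `ceph -s` after the upgrade. It only parses text in the format shown in the description; the function names are hypothetical, and the healthy sample line (including the ", 2 up:standby" suffix) is an assumption about what a correct cluster would print, not output captured from this environment.

```python
import re

# Matches rank entries such as "0=controller-0=up:active" in the mds summary line.
RANK_RE = re.compile(r"(\d+)=([\w.-]+)=up:([\w-]+)")
# Assumed format of the standby suffix, e.g. ", 2 up:standby".
STANDBY_RE = re.compile(r"(\d+)\s+up:standby")


def count_active(mds_line):
    """Count the MDS ranks reported as up:active."""
    return sum(1 for _rank, _name, state in RANK_RE.findall(mds_line)
               if state == "active")


def count_standby(mds_line):
    """Return the standby count, or 0 if the line reports none."""
    match = STANDBY_RE.search(mds_line)
    return int(match.group(1)) if match else 0


def has_expected_layout(mds_line, active=1, standby=2):
    """True if the line shows the expected active/standby split."""
    return (count_active(mds_line) == active
            and count_standby(mds_line) == standby)


# Line taken from the cluster status in the description of this bug.
reported = ("mds: cephfs-3/3/3 up {0=controller-0=up:active,"
            "1=controller-2=up:active,2=controller-1=up:active}")
# Hypothetical line for a cluster with 1 active and 2 standby MDS daemons.
healthy = "mds: cephfs-1/1/1 up {0=controller-0=up:active}, 2 up:standby"

if __name__ == "__main__":
    print("reported layout ok:", has_expected_layout(reported))
    print("healthy layout ok:", has_expected_layout(healthy))
```

With the status from the description this check fails (3 active, 0 standby), which matches the reported behaviour.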