rhel-osp-director: after scaling down the number of Ceph nodes (from 3 to 1), the osdmap still reports 3 osds: 1 up, 1 in.

Environment (OSP-D-7.2-GA):
----------------------------
ceph-0.94.1-13.el7cp.x86_64
ceph-mon-0.94.1-13.el7cp.x86_64
ceph-osd-0.94.1-13.el7cp.x86_64
ceph-common-0.94.1-13.el7cp.x86_64
python-rdomanager-oscplugin-0.0.10-22.el7ost.noarch
python-heatclient-0.6.0-1.el7ost.noarch
openstack-heat-api-2015.1.2-4.el7ost.noarch
heat-cfntools-1.2.8-2.el7.noarch
openstack-heat-templates-0-0.8.20150605git.el7ost.noarch
instack-0.0.7-2.el7ost.noarch
instack-undercloud-2.1.2-36.el7ost.noarch

Description:
------------
I attempted to scale my 3-node Ceph deployment down to 1 Ceph node. After the scale-down, `ceph status` still reports 3 OSDs with only 1 up, while the real situation is 1 disk available and 1 up.

Deployment command:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 --ceph-storage-scale 3 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml --ntp-server 10.5.26.10 --timeout 90

Before scale-down:
------------------
[root@overcloud-controller-0 ~]# ceph status
    cluster 4d13892c-b403-11e5-8522-525400c91767
     health HEALTH_OK
     monmap e1: 3 mons at {overcloud-controller-0=192.168.0.10:6789/0,overcloud-controller-1=192.168.0.11:6789/0,overcloud-controller-2=192.168.0.12:6789/0}
            election epoch 12, quorum 0,1,2 overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
     osdmap e135: 3 osds: 3 up, 3 in
      pgmap v476: 224 pgs, 4 pools, 45659 kB data, 19 objects
            28331 MB used, 953 GB / 981 GB avail
                 224 active+clean
  client io 81 B/s wr, 0 op/s

-------------------------------------------------------------------------------

Scale-down command:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 --ceph-storage-scale 3 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml --ntp-server 10.5.26.10 --timeout 90

After scale-down:
-----------------
[root@overcloud-controller-0 ~]# ceph status
    cluster 4d13892c-b403-11e5-8522-525400c91767
     health HEALTH_WARN
            183 pgs degraded
            63 pgs stale
            183 pgs stuck degraded
            2 pgs stuck inactive
            63 pgs stuck stale
            185 pgs stuck unclean
            183 pgs stuck undersized
            183 pgs undersized
            recovery 15/38 objects degraded (39.474%)
            too many PGs per OSD (311 > max 300)
            pool vms pg_num 64 > pgp_num 56
            pool images pg_num 64 > pgp_num 56
            pool volumes pg_num 64 > pgp_num 56
     monmap e1: 3 mons at {overcloud-controller-0=192.168.0.10:6789/0,overcloud-controller-1=192.168.0.11:6789/0,overcloud-controller-2=192.168.0.12:6789/0}
            election epoch 12, quorum 0,1,2 overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
     osdmap e144: 3 osds: 1 up, 1 in
      pgmap v727: 248 pgs, 4 pools, 45659 kB data, 19 objects
            12852 MB used, 38334 MB / 51187 MB avail
            15/38 objects degraded (39.474%)
                 183 active+undersized+degraded
                  63 stale+active+clean
                   2 creating
  client io 61 B/s rd, 0 op/s

Ceph health:
------------
[root@overcloud-controller-0 ~]# ceph health
HEALTH_WARN 183 pgs degraded; 63 pgs stale; 183 pgs stuck degraded; 2 pgs stuck inactive; 63 pgs stuck stale; 185 pgs stuck unclean; 183 pgs stuck undersized; 183 pgs undersized; recovery 15/38 objects degraded (39.474%); too many PGs per OSD (311 > max 300); pool vms pg_num 64 > pgp_num 56; pool images pg_num 64 > pgp_num 56; pool volumes pg_num 64 > pgp_num 56
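(Editor's note, not part of the original report: until scale-down is supported by the director, stale osdmap entries like the ones above are usually cleared by hand with the standard Ceph OSD removal sequence: mark the OSD out, remove it from the CRUSH map, delete its auth key, and remove it from the osdmap. A minimal sketch; the helper function name is made up, and it only prints the commands unless --run is passed.)

```shell
# Hypothetical helper: emit (or, with --run, execute) the standard cleanup
# sequence for an OSD id that no longer has a backing disk after scale-down.
remove_dead_osd() {
    osd_id="$1"
    mode="${2:---dry-run}"
    for cmd in \
        "ceph osd out osd.${osd_id}" \
        "ceph osd crush remove osd.${osd_id}" \
        "ceph auth del osd.${osd_id}" \
        "ceph osd rm osd.${osd_id}"
    do
        if [ "$mode" = "--run" ]; then
            $cmd            # execute against the live cluster
        else
            echo "$cmd"     # dry run: just show what would be done
        fi
    done
}
```

For the report above one would run this once per stale id (osd.1 and osd.2), on a controller with admin keyring access.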
*** Bug 1311997 has been marked as a duplicate of this bug. ***
Scaling down the number of Ceph nodes is not supported so far; supporting it would need to be an RFE. We will, however, need to prevent users from reaching this state.
Patch is merged upstream.
Verified on OSP-d 9:

[stack@undercloud72 ~]$ rpm -qa | grep heat
openstack-heat-api-6.0.0-3.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-8.el7ost.noarch
openstack-heat-common-6.0.0-3.el7ost.noarch
openstack-tripleo-heat-templates-2.0.0-8.el7ost.noarch
heat-cfntools-1.3.0-2.el7ost.noarch
openstack-heat-templates-0-0.8.20150605git.el7ost.noarch
openstack-heat-api-cfn-6.0.0-3.el7ost.noarch
python-heatclient-1.0.0-1.el7ost.noarch
openstack-tripleo-heat-templates-kilo-2.0.0-8.el7ost.noarch
openstack-heat-engine-6.0.0-3.el7ost.noarch

[root@undercloud72 ~]# nova list
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks              |
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+
| e7882f51-cea8-4fd6-9825-d4ce94209f66 | overcloud-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.168.0.8  |
| b12ab653-57d4-44a8-9eac-62114ac0fc36 | overcloud-cephstorage-1 | ACTIVE | -          | Running     | ctlplane=192.168.0.7  |
| ffb95cfc-cb2f-4baa-8f28-6f6accbf3efb | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.168.0.11 |
| 00b2396a-19da-4cf9-a666-3e0ce0ed659c | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.168.0.10 |
| 0c271562-1660-467e-97d4-f89a5f454407 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.168.0.12 |
| 4139094d-6f7f-48ce-a715-67c603083b1a | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.168.0.9  |
| 09888732-7c25-4d97-9c9d-9a9d9cfa0b7a | overcloud-novacompute-1 | ACTIVE | -          | Running     | ctlplane=192.168.0.13 |
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+

Scale-down command:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 2 --ceph-storage-scale 1 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml
..
..
2016-05-31 22:28:51 [overcloud-CephStorageNodesPostDeployment-cbztvrgeq2kl-ExtraConfig-jmi4qykg2gpw]: UPDATE_COMPLETE  Stack UPDATE completed successfully
2016-05-31 22:28:52 [ExtraConfig]: UPDATE_COMPLETE  state changed
2016-05-31 22:28:53 [NetworkDeployment]: SIGNAL_COMPLETE  Unknown
2016-05-31 22:28:57 [overcloud-CephStorageNodesPostDeployment-cbztvrgeq2kl]: UPDATE_COMPLETE  Stack UPDATE completed successfully
Stack overcloud UPDATE_COMPLETE
Overcloud Endpoint: http://10.19.184.210:5000/v2.0
Overcloud Deployed

[stack@undercloud72 ~]$ nova list
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks              |
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+
| e7882f51-cea8-4fd6-9825-d4ce94209f66 | overcloud-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.168.0.8  |
| ffb95cfc-cb2f-4baa-8f28-6f6accbf3efb | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.168.0.11 |
| 00b2396a-19da-4cf9-a666-3e0ce0ed659c | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.168.0.10 |
| 0c271562-1660-467e-97d4-f89a5f454407 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.168.0.12 |
| 4139094d-6f7f-48ce-a715-67c603083b1a | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.168.0.9  |
| 09888732-7c25-4d97-9c9d-9a9d9cfa0b7a | overcloud-novacompute-1 | ACTIVE | -          | Running     | ctlplane=192.168.0.13 |
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+
failed_qa; moved back for further investigation.

After scaling down to 1 Ceph node, `ceph status` shows:

[heat-admin@overcloud-controller-1 ~]$ sudo ceph status
    cluster 1442b054-d029-11e5-bc14-525400c91767
     health HEALTH_WARN
            192 pgs degraded
            192 pgs stuck degraded
            192 pgs stuck unclean
            192 pgs stuck undersized
            192 pgs undersized
     monmap e1: 3 mons at {overcloud-controller-0=10.19.105.15:6789/0,overcloud-controller-1=10.19.105.14:6789/0,overcloud-controller-2=10.19.105.11:6789/0}
            election epoch 8, quorum 0,1,2 overcloud-controller-2,overcloud-controller-1,overcloud-controller-0
     osdmap e19: 2 osds: 1 up, 1 in
      pgmap v1605: 192 pgs, 5 pools, 0 bytes data, 0 objects
            4102 MB used, 414 GB / 437 GB avail
                 192 active+undersized+degraded

------------------------------------------------------
Result:   osdmap e19: 2 osds: 1 up, 1 in
Expected: osdmap e19: 1 osds: 1 up, 1 in
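(Editor's note, not part of the thread: the result-vs-expected mismatch above can be detected mechanically by parsing the osdmap line that `ceph status` or `ceph osd stat` prints and comparing the total OSD count with the "up" count. A hedged POSIX sh sketch; the function name and messages are invented for illustration.)

```shell
# Hypothetical check: given an osdmap summary line such as
#   "osdmap e19: 2 osds: 1 up, 1 in"
# report whether any registered OSDs are not actually up.
check_osdmap_line() {
    line="$1"
    # total OSDs registered in the osdmap
    total=$(echo "$line" | sed -n 's/.* \([0-9][0-9]*\) osds:.*/\1/p')
    # OSDs currently up
    up=$(echo "$line" | sed -n 's/.* \([0-9][0-9]*\) up.*/\1/p')
    if [ "$total" = "$up" ]; then
        echo "OK: all $total osds up"
    else
        echo "WARN: $((total - up)) stale osdmap entries (total=$total, up=$up)"
    fi
}
```

On a controller this could be driven by something like check_osdmap_line "$(sudo ceph osd stat)"; against the output above it would flag one stale entry.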
Could you provide system logs? I want to see the Ceph OSD/mon logs and the Puppet run logs. Thanks.
Scaling down to one node is not supported. Therefore this bug is being closed as not a supported configuration.
Closed; no need for the needinfo.