Description of problem:

A deployment of an overcloud with RGW failed with the following errors:

2018-03-29 10:08:30,058 p=5245 u=mistral | failed: [192.168.24.8] (item={u'rule_name': u'', u'pg_num': 32, u'name': u'vms'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-2", "ceph", "--cluster", "ceph", "osd", "pool", "create", "vms", "32", "32", "replicated"], "delta": "0:00:01.070260", "end": "2018-03-29 14:08:28.030349", "item": {"name": "vms", "pg_num": 32, "rule_name": ""}, "msg": "non-zero return code", "rc": 34, "start": "2018-03-29 14:08:26.960089", "stderr": "Error ERANGE: pg_num 32 size 3 would mean 672 total pgs, which exceeds max 600 (mon_max_pg_per_osd 200 * num_in_osds 3)", "stderr_lines": ["Error ERANGE: pg_num 32 size 3 would mean 672 total pgs, which exceeds max 600 (mon_max_pg_per_osd 200 * num_in_osds 3)"], "stdout": "", "stdout_lines": []}

2018-03-29 10:08:31,360 p=5245 u=mistral | failed: [192.168.24.8] (item={u'rule_name': u'', u'pg_num': 32, u'name': u'volumes'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-2", "ceph", "--cluster", "ceph", "osd", "pool", "create", "volumes", "32", "32", "replicated"], "delta": "0:00:01.064665", "end": "2018-03-29 14:08:29.333740", "item": {"name": "volumes", "pg_num": 32, "rule_name": ""}, "msg": "non-zero return code", "rc": 34, "start": "2018-03-29 14:08:28.269075", "stderr": "Error ERANGE: pg_num 32 size 3 would mean 672 total pgs, which exceeds max 600 (mon_max_pg_per_osd 200 * num_in_osds 3)", "stderr_lines": ["Error ERANGE: pg_num 32 size 3 would mean 672 total pgs, which exceeds max 600 (mon_max_pg_per_osd 200 * num_in_osds 3)"], "stdout": "", "stdout_lines": []}

The number of OSDs used in the validation is wrong: it assumes that only 3 OSDs will be deployed.
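To make the arithmetic in the ERANGE message easier to follow, here is a minimal Python sketch of the limit check as the error text describes it. It is not the Ceph monitor's actual code. The function names are hypothetical, and the figure of 576 already-allocated PG replicas is only inferred from the log (672 reported total minus the 32 * 3 replicas of the new pool), assuming the reported total is the existing replicas plus pg_num * size.

```python
# Sketch of the PG limit check as described by the ERANGE message.
# Function names are hypothetical; this is not Ceph source code.

def max_total_pgs(mon_max_pg_per_osd, num_in_osds):
    """Upper bound on PG replicas: per-OSD limit times the number of 'in' OSDs."""
    return mon_max_pg_per_osd * num_in_osds


def projected_total_pgs(existing_pg_replicas, pg_num, size):
    """PG replicas after creating a pool with pg_num PGs and the given replica size."""
    return existing_pg_replicas + pg_num * size


def pool_fits(existing_pg_replicas, pg_num, size, mon_max_pg_per_osd, num_in_osds):
    """True if the new pool stays within the limit, False if it would be rejected."""
    projected = projected_total_pgs(existing_pg_replicas, pg_num, size)
    return projected <= max_total_pgs(mon_max_pg_per_osd, num_in_osds)


# Assumption: 576 existing replicas, inferred as 672 - 32 * 3 from the log.
EXISTING = 672 - 32 * 3

if __name__ == "__main__":
    # With 3 'in' OSDs (what the monitor saw): 672 > 600, so creation is rejected.
    print(pool_fits(EXISTING, 32, 3, 200, 3))
    # With the 5 OSDs the reporter expected: 672 <= 1000, so creation would pass.
    print(pool_fits(EXISTING, 32, 3, 200, 5))
```

Under these assumptions, the same pool creation that fails against 3 "in" OSDs would fit within the limit once all 5 OSDs are counted.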
Version-Release number of selected component (if applicable):

ansible-tripleo-ipsec-8.1.1-0.20180303222819.8f5369a.el7ost.noarch
openstack-tripleo-heat-templates-8.0.0-0.20180304031148.el7ost.noarch
openstack-tripleo-common-containers-8.5.1-0.20180304032202.e8d9da9.el7ost.noarch
openstack-tripleo-puppet-elements-8.0.0-0.20180304005217.dabb361.el7ost.noarch
python-tripleoclient-9.1.1-0.20180305094421.90727db.el7ost.noarch
openstack-tripleo-image-elements-8.0.0-0.20180304011935.e427c90.el7ost.noarch
puppet-tripleo-8.3.1-0.20180304033908.ed3285e.el7ost.noarch
openstack-tripleo-common-8.5.1-0.20180304032202.e8d9da9.el7ost.noarch
openstack-tripleo-validations-8.3.1-0.20180304031640.d5546cd.el7ost.noarch
ceph-ansible-3.1.0-0.1.beta4.el7cp.noarch

How reproducible:

100%

Steps to Reproduce:

1. Deploy an overcloud with RGW and a single node that will run 5 OSDs.

Actual results:

The deployment fails; the validation reports an insufficient number of OSDs in the cluster.

Expected results:

The deployment succeeds. The validation runs after the deployment has created the OSDs, so it sees the correct number of OSDs in the cluster.

Additional info:

Deployment command:

openstack overcloud deploy \
    --timeout 100 \
    --templates /usr/share/openstack-tripleo-heat-templates \
    --stack overcloud \
    --libvirt-type kvm \
    --ntp-server clock.redhat.com \
    --environment-file /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml \
    --environment-file /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-rgw.yaml \
    -e /home/stack/virt/internal.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e /home/stack/virt/network/network-environment.yaml \
    -e /home/stack/virt/hostnames.yml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
    -e /home/stack/virt/debug.yaml \
    -e /home/stack/virt/ceph-single-host-mode.yaml \
    -e /home/stack/virt/nodes_data.yaml \
    -e /home/stack/virt/docker-images.yaml \
    --log-file overcloud_deployment_13.log
Hi Yogev,

Please see the duplicate bug and try to resolve this as described in the following comment:

https://bugzilla.redhat.com/show_bug.cgi?id=1539852#c17

John

*** This bug has been marked as a duplicate of bug 1539852 ***