Created attachment 1460856 [details]
pgcalc for derived values for OpenStack Ceph Pools

Description of problem:
Deployed OSP13 GA with Ceph Storage and configured the pg count for each of the 5 OpenStack Ceph pools with the following snippet from our templates:

CephPools:
  - name: images
    pg_num: 256
    rule_name: ""
  - name: metrics
    pg_num: 16
    rule_name: ""
  - name: backups
    pg_num: 16
    rule_name: ""
  - name: vms
    pg_num: 1024
    rule_name: ""
  - name: volumes
    pg_num: 256
    rule_name: ""

That is a total of 1568 pgs. We have a total of 20 configured OSDs in this testbed, so we should be able to have 4000 pgs (mon_max_pg_per_osd 200 * 20 OSDs).

The deploy failed in ceph-ansible with:
https://gist.githubusercontent.com/akrzos/809f744fbc95110b0b89d7fae30082c0/raw/c5a67eb6686ad466f75f2d2045e637a73186c080/gistfile1.txt

I am not sure why the check only counted 10 OSDs (2000 = mon_max_pg_per_osd 200 * num_in_osds 10), or why the calculated required pgs (3936) is much higher than what we actually configured (1568).

Version-Release number of selected component (if applicable):
OSP13 with Ceph Storage - GA build

Undercloud:
(undercloud) [stack@b04-h01-1029p ~]$ rpm -qa | grep ceph
puppet-ceph-2.5.0-1.el7ost.noarch
ceph-ansible-3.1.0-0.1.rc9.el7cp.noarch

Controller:
[root@overcloud-controller-0 ~]# rpm -qa | grep ceph
collectd-ceph-5.8.0-10.el7ost.x86_64
puppet-ceph-2.5.0-1.el7ost.noarch
ceph-mds-12.2.4-10.el7cp.x86_64
libcephfs2-12.2.4-10.el7cp.x86_64
ceph-base-12.2.4-10.el7cp.x86_64
ceph-radosgw-12.2.4-10.el7cp.x86_64
python-cephfs-12.2.4-10.el7cp.x86_64
ceph-selinux-12.2.4-10.el7cp.x86_64
ceph-mon-12.2.4-10.el7cp.x86_64
ceph-common-12.2.4-10.el7cp.x86_64

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Overcloud deployment failed

Expected results:
Ceph Ansible should not block the deployment

Additional info:
This bug seems similar to https://bugzilla.redhat.com/show_bug.cgi?id=1578086, which is marked as fixed in an earlier ceph-ansible version than the one used here.
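For reference, here is a minimal sketch (plain Python, not taken from ceph-ansible) of the pg budget arithmetic as I understand it. The pool names and pg_num values are copied from the CephPools snippet above; the replica_size of 3 is an assumption on my part (the default pool size), since the gist does not show it explicitly:

# Hypothetical sketch of the pg-per-OSD budget check; not ceph-ansible code.
pools = {
    "images": 256,
    "metrics": 16,
    "backups": 16,
    "vms": 1024,
    "volumes": 256,
}

mon_max_pg_per_osd = 200   # Ceph default per-OSD pg limit
num_osds = 20              # OSDs actually deployed in this testbed
replica_size = 3           # assumed default pool size (not shown in the gist)

requested_pgs = sum(pools.values())          # 1568 pgs requested across pools
allowed_pgs = mon_max_pg_per_osd * num_osds  # 4000 pgs allowed with 20 OSDs

print("requested pg_num total:", requested_pgs)
print("allowed (mon_max_pg_per_osd * num_osds):", allowed_pgs)
print("pg replicas with size=%d: %d" % (replica_size, requested_pgs * replica_size))

# With only num_in_osds=10 counted, the budget drops to 200 * 10 = 2000,
# which matches the limit reported in the failed ceph-ansible check.

Even with replication factored in (1568 * 3 = 4704), I cannot reproduce the 3936 figure from the gist, which is part of why the calculation is confusing to me.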
Alex, would you please say why you see this issue as different than https://bugzilla.redhat.com/show_bug.cgi?id=1578086 ?
(In reply to Ken Dreyer (Red Hat) from comment #3)
> Alex, would you please say why you see this issue as different than
> https://bugzilla.redhat.com/show_bug.cgi?id=1578086 ?

Hi Ken,

I am not positive it is a different issue; I was just instructed to open a new bug while reporting this on the #ceph-dfg IRC channel. Since the original bug was already marked as fixed and I was using a build newer than the one the fix had been posted to, I did not think opening a new bug would be a bad idea. If it is the same issue, by all means use the existing bug, or it may make sense to track this as a new one, whichever works best for you. Just let me know if you need to check anything on our systems to help fix it.

Thanks,
-Alex
Updating the QA Contact to Hemant. Hemant will reroute it to the appropriate QE Associate.

Regards,
Giri
Setting the severity of this defect to "High" with a bulk update. Please refine it to a more accurate value, as defined by the severity definitions in https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity
Verified
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:4353