Bug 1510056

Summary: [UPDATES] OC update failed: Error EINVAL: specified pgp_num 128 > pg_num 32
Product: Red Hat OpenStack
Reporter: Yurii Prokulevych <yprokule>
Component: puppet-ceph
Assignee: Sofer Athlan-Guyot <sathlang>
Status: CLOSED ERRATA
QA Contact: Yogev Rabl <yrabl>
Severity: urgent
Priority: urgent
Version: 10.0 (Newton)
CC: augol, gfidente, jjoyce, jschluet, mbracho, mbultel, sathlang, sclewis, scohen, slinaber, tvignaud, yrabl
Target Milestone: z6
Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)
Flags: scohen: needinfo+
Hardware: Unspecified
OS: Unspecified
Fixed In Version: puppet-ceph-2.4.1-2.el7ost
Last Closed: 2017-11-15 13:47:19 UTC
Type: Bug
Bug Blocks: 1510975

Description Yurii Prokulevych 2017-11-06 16:06:43 UTC
Description of problem:
-----------------------
Minor update of RHOS-10 (rhel_7.4_update) to z6 failed. Output of the failures listing:
openstack stack failures list overcloud

overcloud.AllNodesDeploySteps.ControllerDeployment_Step4.1:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: 61bfa6f1-f646-4614-a371-abb0a73fd222
  status: UPDATE_FAILED
  status_reason: |
    UPDATE aborted
  deploy_stdout: |
    ...
    Debug: Using settings: adding file resource 'rrddir': 'File[/var/lib/puppet/rrd]{:path=>"/var/lib/puppet/rrd", :mode=>"750", :owner=>"puppet", :group=>"puppet", :ensure=>:directory, :loglevel=>:debug, :links=>:follow, :backup=>false}'
    Debug: /File[/var/lib/puppet/rrd]/seluser: Found seluser default 'system_u' for /var/lib/puppet/rrd
    Debug: /File[/var/lib/puppet/rrd]/selrole: Found selrole default 'object_r' for /var/lib/puppet/rrd
    Debug: /File[/var/lib/puppet/rrd]/seltype: Found seltype default 'puppet_var_lib_t' for /var/lib/puppet/rrd
    Debug: /File[/var/lib/puppet/rrd]/selrange: Found selrange default 's0' for /var/lib/puppet/rrd
    Debug: Finishing transaction 283874100
    Debug: Received report to process from controller-1.localdomain
    Debug: Evicting cache entry for environment 'production'
    Debug: Caching environment 'production' (ttl = 0 sec)
    Debug: Processing report from controller-1.localdomain with processor Puppet::Reports::Store
    (truncated, view all with --long)
  deploy_stderr: |
    thout storeconfigs
    Warning: Not collecting exported resources without storeconfigs
    Warning: Not collecting exported resources without storeconfigs
overcloud.AllNodesDeploySteps.ControllerDeployment_Step4.2:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: e92bcc00-f9ae-4fc2-a9be-20d1e90d2655
  status: UPDATE_FAILED
  status_reason: |
    Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
  deploy_stdout: |
    ...
    Debug: Using settings: adding file resource 'rrddir': 'File[/var/lib/puppet/rrd]{:path=>"/var/lib/puppet/rrd", :mode=>"750", :owner=>"puppet", :group=>"puppet", :ensure=>:directory, :loglevel=>:debug, :links=>:follow, :backup=>false}'
    Debug: /File[/var/lib/puppet/rrd]/seluser: Found seluser default 'system_u' for /var/lib/puppet/rrd
    Debug: /File[/var/lib/puppet/rrd]/selrole: Found selrole default 'object_r' for /var/lib/puppet/rrd
    Debug: /File[/var/lib/puppet/rrd]/seltype: Found seltype default 'puppet_var_lib_t' for /var/lib/puppet/rrd
    Debug: /File[/var/lib/puppet/rrd]/selrange: Found selrange default 's0' for /var/lib/puppet/rrd
    Debug: Finishing transaction 273410780
    Debug: Received report to process from controller-2.localdomain
    Debug: Evicting cache entry for environment 'production'
    Debug: Caching environment 'production' (ttl = 0 sec)
    Debug: Processing report from controller-2.localdomain with processor Puppet::Reports::Store
    (truncated, view all with --long)
  deploy_stderr: |
    p_num 128 returned 22 instead of one of [0]
    Warning: /Firewall[998 log all]: Skipping because of failed dependencies
    Warning: /Firewall[999 drop all]: Skipping because of failed dependencies



Version-Release number of selected component (if applicable):
-------------------------------------------------------------
libcephfs1-10.2.7-48.el7cp.x86_64
ceph-osd-10.2.7-48.el7cp.x86_64
ceph-radosgw-10.2.7-48.el7cp.x86_64
puppet-ceph-2.3.0-7.el7ost.noarch
ceph-base-10.2.7-48.el7cp.x86_64
ceph-selinux-10.2.7-48.el7cp.x86_64
python-cephfs-10.2.7-48.el7cp.x86_64
ceph-common-10.2.7-48.el7cp.x86_64
ceph-mon-10.2.7-48.el7cp.x86_64

puppet-ceph-2.3.0-7.el7ost.noarch (undercloud)

How reproducible:
-----------------


Steps to Reproduce:
1. Install RHOS-10 rhel_7.4_update with RHEL-7.4
2. Install the z6 repos on the undercloud (UC) and overcloud (OC)
3. Apply the patch on the UC: https://review.openstack.org/#/c/517356/
4. Update the UC
5. Follow the procedure to update the OC (update the deployment plan, then run the update)


Actual results:
---------------
Update failed

Expected results:
-----------------
OC is successfully updated

Additional info:
----------------
Virtual setup: 3 controllers + 2 computes + 3 Ceph nodes

Comment 2 Giulio Fidente 2017-11-06 16:48:14 UTC
Until z6, pg_num and pgp_num defaulted to 32 instead of 128.

The problem is that puppet-ceph tries to set new pgp_num values after pg_num is set, we need a require rule on pg_num instead.

As a workaround one might use the pre-existing values:

parameter_defaults:
  ExtraConfig:
    ceph::profile::params::osd_pool_default_pg_num: 32
    ceph::profile::params::osd_pool_default_pgp_num: 32
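
To sanity-check override values like the ones above before an update, something along these lines could be used. This is only a sketch: the hiera keys are the ones from the workaround, but check_pool_overrides and its current_pg_num argument are hypothetical names, and the current value would have to be read from the running cluster by other means.

```python
# Hiera keys taken from the workaround above.
PG_KEY = "ceph::profile::params::osd_pool_default_pg_num"
PGP_KEY = "ceph::profile::params::osd_pool_default_pgp_num"


def check_pool_overrides(extra_config, current_pg_num):
    """Return a list of problems with the proposed pool defaults.

    Hypothetical helper: current_pg_num is the pg_num already set on
    the existing pools (32 in this report).
    """
    pg_num = extra_config.get(PG_KEY)
    pgp_num = extra_config.get(PGP_KEY)
    if pg_num is None or pgp_num is None:
        return ["both pg_num and pgp_num overrides must be set"]

    problems = []
    if pgp_num > pg_num:
        problems.append(f"pgp_num {pgp_num} > requested pg_num {pg_num}")
    if pgp_num > current_pg_num:
        # Only safe if pg_num is raised before pgp_num is applied.
        problems.append(
            f"pgp_num {pgp_num} > current pg_num {current_pg_num}; "
            "depends on the order in which the settings are applied"
        )
    return problems


workaround = {PG_KEY: 32, PGP_KEY: 32}
z6_defaults = {PG_KEY: 128, PGP_KEY: 128}
print(check_pool_overrides(workaround, current_pg_num=32))
print(check_pool_overrides(z6_defaults, current_pg_num=32))
```

With the pre-existing values nothing changes on the pools, so the check reports no problems; with the new 128 defaults it flags the ordering dependency.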

Comment 3 Giulio Fidente 2017-11-06 16:50:06 UTC
> The problem is that puppet-ceph tries to set new pgp_num values after pg_num
> is set, we need a require rule on pg_num instead.

Correction: puppet-ceph tries to set the new pgp_num values *before* pg_num is set.
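
The ordering issue can be illustrated with a toy model. This is not Ceph's or puppet-ceph's code: ToyPool and apply_in_order are made-up names, and the only behaviour modelled is the one from the error message, namely that a pgp_num larger than the pool's pg_num is rejected with EINVAL.

```python
import errno


class ToyPool:
    """Toy stand-in for a Ceph pool; NOT Ceph's actual implementation."""

    def __init__(self, pg_num, pgp_num):
        self.pg_num = pg_num
        self.pgp_num = pgp_num

    def set_pg_num(self, value):
        self.pg_num = value

    def set_pgp_num(self, value):
        # Mirrors the reported error: pgp_num may not exceed pg_num.
        if value > self.pg_num:
            raise OSError(
                errno.EINVAL,
                f"specified pgp_num {value} > pg_num {self.pg_num}",
            )
        self.pgp_num = value


def apply_in_order(pool, steps):
    """Apply (setting, value) steps in order.

    Returns 0 on success or the errno of the first failing step.
    """
    for setting, value in steps:
        try:
            getattr(pool, f"set_{setting}")(value)
        except OSError as exc:
            return exc.errno
    return 0


# Pools created before z6 start at 32/32; the new defaults are 128/128.
wrong_order = apply_in_order(ToyPool(32, 32), [("pgp_num", 128), ("pg_num", 128)])
right_order = apply_in_order(ToyPool(32, 32), [("pg_num", 128), ("pgp_num", 128)])
print(wrong_order, right_order)
```

Setting pgp_num first fails with EINVAL, which matches the "returned 22 instead of one of [0]" line in deploy_stderr; setting pg_num first succeeds. A require rule on pg_num enforces the second ordering.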

Comment 9 Yogev Rabl 2017-11-10 13:02:08 UTC
Verified

Comment 11 errata-xmlrpc 2017-11-15 13:47:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3231