Bug 1433942 - The validator "Check the number of processes" fails for a default undercloud installation
Summary: The validator "Check the number of processes" fails for a default undercloud ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-validations
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 11.0 (Ocata)
Assignee: Florian Fuchs
QA Contact: Ola Pavlenko
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-03-20 12:21 UTC by Udi
Modified: 2017-05-17 20:09 UTC
CC: 7 users

Fixed In Version: openstack-tripleo-validations-5.5.0-1.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-05-17 20:09:43 UTC




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:1245 normal SHIPPED_LIVE Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory 2017-05-17 23:01:50 UTC
OpenStack gerrit 452746 None None None 2017-04-05 13:36:14 UTC

Description Udi 2017-03-20 12:21:34 UTC
Description of problem:
I installed OSP11 on bare metal and ran the validator "Check the number of OpenStack processes on undercloud". It fails with:

Task 'Verify the number of running processes per OpenStack service' failed:
Host: localhost
Message: There are 9 heat-engine processes running. Having more than 8 risks running out of memory.

Task 'Verify the number of running processes per OpenStack service' failed:
Host: localhost
Message: There are 9 nova-api processes running. Having more than 8 risks running out of memory.
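The failure messages above come from a check that counts running processes per OpenStack service and fails when any count exceeds a fixed maximum. A minimal sketch of that kind of check (this is illustrative, not the actual tripleo-validations code; the limit of 8 matches the messages above, and the service names and function are hypothetical):

```python
from collections import Counter

def check_process_counts(cmdlines, service_names, max_processes=8):
    """Return one failure message per service whose process count exceeds the limit.

    cmdlines: iterable of process command-line strings, e.g. the output
    lines of `ps -eo args`.
    """
    counts = Counter()
    for cmdline in cmdlines:
        for name in service_names:
            if name in cmdline:
                counts[name] += 1
    return [
        "There are %d %s processes running. Having more than %d risks "
        "running out of memory." % (count, name, max_processes)
        for name, count in counts.items()
        if count > max_processes
    ]
```

On a default undercloud with more than 8 CPUs, services that size their worker pools by CPU count (nova-api, heat-engine) exceed the hard-coded limit, so the check fails out of the box.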


Version-Release number of selected component (if applicable):
openstack-tripleo-validations-5.4.0-4.el7ost.noarch
openstack-tripleo-common-6.0.1-0.20170307123121.2c9fa69.el7ost.noarch
puppet-tripleo-6.3.0-1.el7ost.noarch


How reproducible:
100%


Steps to Reproduce:
1. Install a default undercloud and run the validator


Actual results:
Validator fails.


Expected results:
We should comply with our own guidelines and recommendations. The validator should pass if the user didn't make a configuration change.

Comment 2 Florian Fuchs 2017-04-03 13:17:37 UTC
The maximum number of processes is currently hard-coded to 4. But some processes (like nova or heat) set the number of workers to higher values, especially on hosts with a larger number of CPUs.

I suggest setting the maximum number in this validation to the number of CPUs on the undercloud.

Upstream patch: https://review.openstack.org/#/c/452746/

Comment 4 Florian Fuchs 2017-04-10 20:00:34 UTC
Updated the patch with a different fix, based on review feedback.

Comment 6 Florian Fuchs 2017-04-11 11:25:59 UTC
(In reply to Florian Fuchs from comment #2)
> The maximum number of processes is currently hard-coded to 4. But some
> processes (like nova or heat) set the number of workers to higher values,
> especially on hosts with a larger number of CPUs.
> 
> I suggest setting the maximum number in this validation to the number of CPUs
> on the undercloud.
> 
> Upstream patch: https://review.openstack.org/#/c/452746/

Correction: The max number is 8, not 4. Also, considering the original intention of the validation, my suggestion to set the max number equal to the number of CPUs doesn't make much sense: that is exactly the condition the validation is supposed to prevent in the first place (services eating up memory in setups with a large number of CPUs). So, until a better way is found to realistically assess a meaningful maximum number of processes per service, the proposed upstream fix makes the validation succeed with warnings instead of letting it fail.
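The behavioural change described here can be sketched as follows (a hedged illustration of the approach, not the actual patch from review 452746): high process counts are reported as warnings, and the validation itself always succeeds.

```python
def evaluate(counts, max_processes=8):
    """Evaluate per-service process counts, warning instead of failing.

    counts: mapping of service name to number of running processes.
    Returns (success, warnings); success is always True, and each service
    over the limit contributes a warning message instead of a failure.
    """
    warnings = [
        "There are %d %s processes running. Having more than %d risks "
        "running out of memory." % (count, name, max_processes)
        for name, count in counts.items()
        if count > max_processes
    ]
    return True, warnings
```

This keeps the memory-pressure signal visible to the operator without blocking a default undercloud installation.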

Comment 9 Jason E. Rist 2017-04-11 15:41:25 UTC
Removing Regression based on the logic in comment #5; removing blocker since that was the intended action before comment #7.

Comment 11 Udi 2017-04-27 13:16:18 UTC
Verified in openstack-tripleo-validations-5.5.0-1.el7ost.noarch. The validator passes with a warning.

Comment 12 errata-xmlrpc 2017-05-17 20:09:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

