Bug 1361285
Summary: | Glance deployed with single worker | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Sai Sindhur Malleni <smalleni> |
Component: | openstack-tripleo-heat-templates | Assignee: | Jiri Stransky <jstransk> |
Status: | CLOSED ERRATA | QA Contact: | Avi Avraham <aavraham> |
Severity: | unspecified | Docs Contact: | |
Priority: | medium | ||
Version: | 10.0 (Newton) | CC: | cyril, dbecker, egafford, eglynn, emacchi, fpercoco, jason.dobies, jschluet, jstransk, jtaleric, mburns, mcornea, morazi, pgrist, rhel-osp-director-maint, scohen, smalleni, srevivo |
Target Milestone: | ga | Keywords: | Triaged |
Target Release: | 10.0 (Newton) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-tripleo-heat-templates-5.0.0-1.1.el7ost | Doc Type: | Enhancement |
Doc Text: |
Feature: Glance configured with more workers by default.
Reason: Improved performance.
Result: Glance API and Registry are deployed with more workers by default. The count scales automatically with the number of processors.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2016-12-14 15:47:33 UTC | Type: | Bug |
Description
Sai Sindhur Malleni
2016-07-28 16:55:36 UTC
Changed component from openstack-tripleo to director as it was the downstream product I was testing.

The previous Rally results used Glance image create with the image URL being external to our network. So I believe the slow timings we saw in some cases were due to the image download itself rather than the image create. I retested with the image hosted locally and the timings improved (no timeouts seen even with a single worker). But the timings are much better when the number of workers equals the number of cores, as seen below. The scenario is Glance image create and delete at concurrency=64.

1 worker: http://10.12.23.106:9000/20160728-183814/all-rally-run-0.html#/GlanceImages.create_and_delete_image-2
32 workers (no. of cores): http://10.12.23.106:9000/20160728-185531/all-rally-run-0.html#/GlanceImages.create_and_delete_image-2

I can confirm that setting `workers` to 0 does not mean it'll use the N# of cores. In Glance 0 is translated to a single pool. In order to make it use the number of cores, the workers option should be set to None. While this behavior is not what one would expect, it's been like that since Glance was created and our config files default to None.

The problem seems to be in tripleo as it's setting the value for this option to 0[0] and it should probably be changed there.

[0] https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/glance-api.yaml#L38-L41

So, do we want to move to deploying with workers=cores?

I'd recommend not setting `workers` at all in the configuration file. Just let Glance find the number of cores itself.

Moving this to puppet as this might need to be fixed there.

(In reply to Flavio Percoco from comment #5)
> I can confirm that setting `workers` to 0 does not mean it'll use the N# of
> cores. In Glance 0 is translated to a single pool. In order to make it use
> the number of cores, the workers option should be set to None.
> While this behavior is not what one would expect, it's been like that since
> Glance was created and our config files default to None.
>
> The problem seems to be in tripleo as it's setting the value for this option
> to 0[0] and it should probably be changed there.
>
> [0] https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/glance-api.yaml#L38-L41

I can't see any way to fix it in the template because None/null is not a valid default value. So the options are either:

1. make Glance treat 0 the same as null, which I think is consistent with how most other projects now do things?
2. special-case this in puppet-glance to do the same (e.g. translate the 0 into None)

We can't really remove this interface completely because it will break backwards compatibility for users of the templates.

I would:

1) submit the bug upstream in launchpad/puppet-glance
2) change default values in glance::api::workers to be $os_service_default, so the value will be unset by default and we'll rely on what Glance uses by default.

Please move the bug upstream and close it.

Just a note, I am not working on the bug presently, I just gave some direction on how to proceed here.

So it's clearly a bug in TripleO Heat templates or in Glance, as Steven also mentioned. It is not a bug in puppet-glance, because puppet-glance provides the interface to configure the number of workers, with a default value equal to the number of processors, which is what many other modules do. So we need to either patch TripleO to change the default GlanceWorkers parameter from 0 to something else (None?) or patch Glance to accept the '0' string value.

I'm moving the bug out of puppet-glance to openstack-glance, but feel free to move it where you think it needs to be fixed.

@Emilien et al. - it looks like 0 is reserved for testing/profiling/etc [1]

This is a bug with our puppet modules, we should not even set workers.
[1] https://github.com/openstack/glance/blob/474d8d05c438d7f6934019489195416993c2e013/glance/common/wsgi.py#L317

> This is a bug with our puppet modules, we should not even set workers.

That is wrong, the option is available in Glance:
https://github.com/openstack/glance/blob/474d8d05c438d7f6934019489195416993c2e013/glance/common/wsgi.py#L83-L86

puppet-glance just provides the interface to configure it or not. If you think we should not configure it, just set it to $os_service_default in puppet-glance or set it to "undef" in tripleo.

(In reply to Emilien Macchi from comment #14)
> > This is a bug with our puppet modules, we should not even set workers.
>
> That is wrong, the option is available in Glance:
> https://github.com/openstack/glance/blob/474d8d05c438d7f6934019489195416993c2e013/glance/common/wsgi.py#L83-L86

Yeah but the default (None) is the right one to keep as it sets the value to the number of CPUs. We don't set that in our RPMs either.

> puppet-glance just provides the interface to configure it or not. If you
> think we should not configure it, just set it to $os_service_default in
> puppet-glance or set it to "undef" in tripleo.

++ This sounds like the right solution.

Hi Jiri,

What's the status on this one? Is there a patch related to this? Is it still possible to fix in TripleO for 10? Trying to determine whether we need to push this to 11 and triage there.

Thanks!
- Elise

Hi Elise,

there's no patch I know of. I see the BZ has been triaged into Storage DFG so this probably hasn't been assigned to a correct dev yet. (I'm just the default person in the assignee field for all t-h-t BZs.)

Regarding a possible fix for 10: I think it would either have to land before Thursday's OpenStack feature freeze, or if this arguably causes performance issues, then it might be considered valid for backporting to stable/newton even after feature freeze.
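For reference, the `workers` semantics discussed in the comments above can be sketched in Python. This is an illustrative model only, not Glance's actual code: `resolve_workers` and `describe_mode` are hypothetical names, and falling back to `os.cpu_count()` (with a floor of 1) is an assumption standing in for whatever CPU detection Glance really uses.

```python
import os
from typing import Optional


def resolve_workers(configured: Optional[int],
                    cpu_count: Optional[int] = None) -> int:
    """Hypothetical model of the behaviour described in this thread.

    None (option unset) -> use the number of CPUs.
    Any integer, including 0 -> used exactly as given.
    """
    if configured is None:
        # Assumption: detect CPUs via os.cpu_count() when not supplied,
        # and never return less than 1.
        detected = cpu_count if cpu_count is not None else os.cpu_count()
        return detected or 1
    # An explicit 0 is NOT translated to "number of cores".
    return configured


def describe_mode(workers: int) -> str:
    """Describe how the service would run for a resolved worker value."""
    if workers == 0:
        # Per the thread: 0 means a single pool (testing/profiling use).
        return "single in-process pool"
    return f"{workers} worker processes"
```

Under this model, `describe_mode(resolve_workers(0, cpu_count=32))` reports a single pool, while `resolve_workers(None, cpu_count=32)` yields 32, which is why leaving the option unset is the suggested fix.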
I can see a change submitted upstream: https://review.openstack.org/#/c/350219/

I have to agree with Flavio here: if you really want to use #cores, you should set the option to 'None' (see glance.common.wsgi.get_num_workers). Fixing this in TripleO seems to be the right thing to do.

Hi Jiri,

Moving to ON_DEV to reflect your progress.

verified

[root@undercloud-0 ~]# rpm -q openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-5.1.0-7.el7ost.noarch

# The number of cores
[root@undercloud-0 ~]# nproc
4

[root@controller-0 ~]# grep ^workers /etc/glance/*
/etc/glance/glance-api.conf:workers = 4
/etc/glance/glance-registry.conf:workers = 4

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html
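The manual check in the verification comment above (comparing `nproc` with the `workers` lines in the Glance config files) could be automated along these lines. This is a minimal sketch, assuming the option lives in the `[DEFAULT]` section and that the file is simple enough for Python's `configparser` to read; `configured_workers` and `matches_cpu_count` are hypothetical helper names.

```python
import configparser
from typing import Optional


def configured_workers(conf_text: str) -> Optional[int]:
    """Return the `workers` value from [DEFAULT], or None if it is unset."""
    # strict=False tolerates duplicate options; interpolation=None keeps
    # any '%' characters in other options from raising errors.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    raw = parser.defaults().get("workers")
    return None if raw is None else int(raw)


def matches_cpu_count(conf_text: str, cpus: int) -> bool:
    """True when the configured worker count equals the given CPU count."""
    return configured_workers(conf_text) == cpus


# Sample mirroring the verified output: 4 cores, workers = 4.
SAMPLE_CONF = "[DEFAULT]\nworkers = 4\n"
```

With the sample above, `matches_cpu_count(SAMPLE_CONF, 4)` holds. On a real controller the text would come from reading `/etc/glance/glance-api.conf`, and the CPU count from `os.cpu_count()`.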