Bug 1310178

Summary: undercloud's nova.conf scheduler_max_attempts should be dynamic depending on the number of nodes
Product: Red Hat OpenStack Reporter: Sylvain Bauza <sbauza>
Component: instack-undercloud Assignee: Dmitry Tantsur <dtantsur>
Status: CLOSED CURRENTRELEASE QA Contact: Omri Hochman <ohochman>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 8.0 (Liberty) CC: cylopez, dbecker, dtantsur, mburns, morazi, rhel-osp-director-maint, scorcora, sputhenp, vcojot
Target Milestone: ga Keywords: TestOnly
Target Release: 10.0 (Newton)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: instack-undercloud-2.2.7-3.el7ost.noarch Doc Type: Bug Fix
Last Closed: 2016-06-23 18:21:06 UTC Type: Bug

Description Sylvain Bauza 2016-02-19 16:35:43 UTC
Description of problem:

At the moment, scheduler_max_attempts in the undercloud's nova.conf is left at its default of 3.
Given that we have a design problem between Ironic and Nova, most of the time we end up with some undercloud provisioning failing because the placement is suboptimal and we hit the upper bound of retries.

Increasing that retry number doesn't really fix the problem but it's a nice workaround for helping customers to deploy Director.

Since we can't really assume a fixed maximum at which to cap the number of retries, we should ideally figure out the number of hosts to provision in the undercloud and use that number as the value for scheduler_max_attempts.
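
A minimal sketch of the idea above, assuming the node count is already known (how to obtain it from Ironic is not covered here). `suggested_max_attempts` is a hypothetical helper for illustration, not part of instack-undercloud or Nova:

```python
def suggested_max_attempts(node_count, nova_default=3):
    """Return a retry cap at least as large as the number of nodes.

    nova_default is the stock scheduler_max_attempts value (3) mentioned
    in this report; the result never drops below it.
    """
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    return max(nova_default, node_count)


# Small deployments keep the default, larger ones scale with the node count.
for nodes in (1, 3, 25):
    print(nodes, suggested_max_attempts(nodes))
```

Keeping the stock default as a floor means a one- or two-node deployment would not end up with fewer retries than it has today.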

Comment 2 Dmitry Tantsur 2016-02-25 15:17:47 UTC
Hi! I don't think we can realistically reconfigure Nova just before deployment (it will even require root access). So I think increasing the retries number e.g. to 30 would do the trick. Anyway I know that people prefer to deploy in bulks of 20-30.

Comment 3 Dmitry Tantsur 2016-02-25 15:24:18 UTC
Upstream patch posted.

Comment 5 Dmitry Tantsur 2016-03-02 14:07:25 UTC
Patches merged upstream, waiting for the next rebase now.

Comment 7 Mike Burns 2016-04-07 21:11:06 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 8 Dmitry Tantsur 2016-04-08 07:58:23 UTC
Sigh, I'm so bad at updating my bugs.... Mike, I'm sorry again, this bug did make it in OSPd8...

Comment 14 Omri Hochman 2016-05-29 22:38:39 UTC
(In reply to Dmitry Tantsur from comment #2)
> Hi! I don't think we can realistically reconfigure Nova just before
> deployment (it will even require root access). So I think increasing the
> retries number e.g. to 30 would do the trick. Anyway I know that people
> prefer to deploy in bulks of 20-30.

Verified with ospd-9 

[stack@undercloud72 ~]$ rpm -qa | grep undercloud
instack-undercloud-4.0.0-2.el7ost.noarch

vi /etc/nova/nova.conf

# * Services that use this:
#
#     ``nova-scheduler``
#
# * Related options:
#
#     None
#  (integer value)
#scheduler_max_attempts=3
scheduler_max_attempts=30
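
The same check could be scripted instead of opening the file in an editor. The following is only a sketch: it assumes the option lives in the `[DEFAULT]` section (the excerpt above does not show the section header), and it parses an inline sample rather than the real /etc/nova/nova.conf. `read_max_attempts` is a hypothetical helper:

```python
import configparser


def read_max_attempts(conf_text, fallback=3):
    """Return the effective scheduler_max_attempts from nova.conf-style text.

    Commented-out lines such as '#scheduler_max_attempts=3' are ignored;
    if the option is not set, the fallback (Nova's default of 3) is returned.
    """
    # interpolation=None so '%' characters in other options are left alone;
    # strict=False tolerates duplicated options in hand-edited files.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.getint("DEFAULT", "scheduler_max_attempts", fallback=fallback)


# Sample modelled on the excerpt above; the [DEFAULT] header is an assumption.
SAMPLE = """\
[DEFAULT]
#  (integer value)
#scheduler_max_attempts=3
scheduler_max_attempts=30
"""

print(read_max_attempts(SAMPLE))
```

To run it against a real undercloud, the file contents would be read first and passed in as `conf_text`.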