Running "openstack overcloud deploy" within 24 hours of last run fails at Controller Step 3. This is because Swift needs to wait 24 hours before it can rebalance.

The Puppet error log from Step 3 on a controller node:

[root@overcloud-controller-0 deployed]# cat 2016-10-17-07-21-39Z-cf8c55e0-90ba-46ad-b601-4641bf17d04f-stderr.log
Warning: Scope(Class[Cinder::Api]): keystone_enabled is deprecated, use auth_strategy instead.
Warning: Scope(Class[Keystone]): Fernet token is recommended in Mitaka release. The default for token_provider will be changed to 'fernet' in O release.
Warning: Scope(Class[Keystone]): admin_password is required, please set admin_password to a value != admin_token. admin_token will be removed in a later release
Warning: Scope(Class[Keystone::Roles::Admin]): the main class is setting the admin password differently from this\ class when calling bootstrap. This will lead to the password\ flip-flopping and cause authentication issues for the admin user.\ Please ensure that keystone::roles::admin::password and\ keystone::admin_password are set the same.
Warning: Scope(Class[Heat]): keystone_user_domain_id is deprecated, use the name option instead.
Warning: Scope(Class[Heat]): keystone_project_domain_id is deprecated, use the name option instead.
Warning: Scope(Class[Nova]): Could not look up qualified variable '::nova::scheduler::filter::cpu_allocation_ratio'; class ::nova::scheduler::filter has not been evaluated
Warning: Scope(Class[Nova]): Could not look up qualified variable '::nova::scheduler::filter::ram_allocation_ratio'; class ::nova::scheduler::filter has not been evaluated
Warning: Scope(Class[Nova]): Could not look up qualified variable '::nova::scheduler::filter::disk_allocation_ratio'; class ::nova::scheduler::filter has not been evaluated
Warning: Scope(Class[Mongodb::Server]): Replset specified, but no replset_members or replset_config provided.
Warning: Scope(Class[Ceilometer]): Both $metering_secret and $telemetry_secret defined, using $telemetry_secret
Warning: Scope(Haproxy::Config[haproxy]): haproxy: The $merge_options parameter will default to true in the next major release. Please review the documentation regarding the implications.
Error: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[account]/Exec[rebalance_account]: Failed to call refresh: swift-ring-builder /etc/swift/account.builder rebalance 999 returned 1 instead of one of [0]
Error: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[account]/Exec[rebalance_account]: swift-ring-builder /etc/swift/account.builder rebalance 999 returned 1 instead of one of [0]
Error: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[container]/Exec[rebalance_container]: Failed to call refresh: swift-ring-builder /etc/swift/container.builder rebalance 999 returned 1 instead of one of [0]
Error: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[container]/Exec[rebalance_container]: swift-ring-builder /etc/swift/container.builder rebalance 999 returned 1 instead of one of [0]
Error: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[object]/Exec[rebalance_object]: Failed to call refresh: swift-ring-builder /etc/swift/object.builder rebalance 999 returned 1 instead of one of [0]
Error: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[object]/Exec[rebalance_object]: swift-ring-builder /etc/swift/object.builder rebalance 999 returned 1 instead of one of [0]

Running the rebalance command manually causes the following:

[root@overcloud-controller-0 deployed]# swift-ring-builder /etc/swift/account.builder rebalance 999
No partitions could be reassigned.
The time between rebalances must be at least min_part_hours: 24 hours (20:59:53 remaining)

However, you can use the -f option to force a rebalance:

[root@overcloud-controller-0 deployed]# swift-ring-builder /etc/swift/account.builder rebalance 999 -f
Reassigned 0 (0.00%) partitions. Balance is now 100.00. Dispersion is now 0.00
-------------------------------------------------------------------------------
NOTE: Balance of 100.00 indicates you should push this ring, wait at least 24 hours, and rebalance/repush.
-------------------------------------------------------------------------------

Otherwise, it might be an idea to add a check to see if a rebalance is possible.

This applies to OSP10 using puppet-tripleo 5.2.0-1.el7ost
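
One way the suggested check could look is sketched below. It is only an illustration, not what puppet-tripleo does: it reads the "(HH:MM:SS remaining)" text from the swift-ring-builder message quoted above and reports whether a rebalance is currently possible. The function names are hypothetical, and the message format is assumed from this one sample, so it may differ between Swift versions.

```python
import re
from datetime import timedelta
from typing import Optional

# Pattern modelled on the message quoted in this report, e.g.
# "min_part_hours: 24 hours (20:59:53 remaining)".
# Assumption: other Swift versions may word this differently.
_REMAINING_RE = re.compile(r"\((\d+):(\d{2}):(\d{2}) remaining\)")


def remaining_wait(builder_output: str) -> Optional[timedelta]:
    """Return the wait reported by swift-ring-builder, or None if absent."""
    match = _REMAINING_RE.search(builder_output)
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def rebalance_possible(builder_output: str) -> bool:
    """True when the output does not report a pending min_part_hours wait."""
    wait = remaining_wait(builder_output)
    return wait is None or wait.total_seconds() == 0
```

Fed the output of the manual run above, remaining_wait would report the 20:59:53 wait and rebalance_possible would return False, so a caller could skip the rebalance instead of failing the step.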
min_part_hours is set to 1 hour by default in Newton. It can be set to a different value using the SwiftMinPartHours parameter. There is one caveat with this approach: if the input rings differ slightly (because each node built its own ring and rebalanced at a slightly different time), some nodes might rebalance while others do not. To fully fix this, we need to fix bz#1310865 first, and then add some kind of preliminary check (because there might still be nodes that rebalance at a different time). A rebalance should fail only on real errors; otherwise it should simply be skipped.
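
The "fail only on real errors, otherwise skip" behaviour could be sketched roughly as follows. This is a hypothetical wrapper, not the eventual fix: rebalance_or_skip and run_command are illustrative names, and the exit-code convention (0 for success, 1 for a warning such as "No partitions could be reassigned", anything else for an error) is an assumption based on the "returned 1" results in the Puppet log above.

```python
import subprocess
from typing import Callable, Sequence

# Assumed exit-code convention for swift-ring-builder:
# 0 = success, 1 = warning (nothing to reassign), other = real error.
EXIT_SUCCESS = 0
EXIT_WARNING = 1


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def rebalance_or_skip(builder_file: str, seed: int = 999,
                      runner: Callable[[Sequence[str]], int] = run_command) -> str:
    """Attempt a rebalance; return 'rebalanced' or 'skipped', raise on errors.

    The default of 999 mirrors the argument in the log above (treated here
    as the rebalance seed).
    """
    status = runner(["swift-ring-builder", builder_file, "rebalance", str(seed)])
    if status == EXIT_SUCCESS:
        return "rebalanced"
    if status == EXIT_WARNING:
        # A warning is not a failure, so the deployment should continue.
        return "skipped"
    raise RuntimeError(
        f"rebalance of {builder_file} failed with status {status}")
```

The runner argument exists only so the decision logic can be exercised without a real ring; calling rebalance_or_skip("/etc/swift/account.builder") with the default runner would invoke the same command that appears in the log.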
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.
Closing this bug; there is another one which is ON_QA and is basically the same. Turns out that warnings (but not failures) during rebalance aborted the deployment.

*** This bug has been marked as a duplicate of bug 1437499 ***