Bug 1228373 - Gears from a scaled application are not evenly distributed across nodes in the district or zone
Summary: Gears from a scaled application are not evenly distributed across nodes in th...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 2.2.0
Hardware: Unspecified
OS: Linux
unspecified
urgent
Target Milestone: ---
: ---
Assignee: Timothy Williams
QA Contact: libra bugs
URL:
Whiteboard:
Depends On: 1234603
Blocks: 1267728
Reported: 2015-06-04 19:13 UTC by Timothy Williams
Modified: 2019-09-12 08:31 UTC

Fixed In Version: rubygem-openshift-origin-msg-broker-mcollective-1.35.3.1-1.el6op rubygem-openshift-origin-controller-1.37.3.1-1.el6op
Doc Type: Bug Fix
Doc Text:
When determining which servers were available for gear placement, the least_preferred_servers variable could include every available server. When that happened, the last server in the list was chosen every time. In addition, nodes only update their facts at a one-minute interval, so gears created for the same application within the same minute (for example, when scaling a cartridge up by many gears at once) did not take each other's placement into account. Combined, these issues resulted in very uneven gear spreading for scaled applications. This bug fix updates OpenShift Enterprise so that if all available servers are passed in least_preferred_servers, the list is effectively ignored, because every server is equally least preferred. Gear placement now also takes into account the placement of the application's other gears. As a result, gears of scaled applications are now spread evenly across nodes in districts and zones, including when an application is scaled up by multiple gears at once.
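The stale-facts problem described in the Doc Text can be illustrated with a small simulation. This is only a sketch in Python, not the broker's Ruby code: the function name `place_gears`, the `track_pending` flag, and the lowest-count selection rule are assumptions made for illustration. It shows why gears placed within the same facts interval pile up on one node unless the placements already made in that interval are counted.

```python
from collections import Counter


def place_gears(reported, new_gears, track_pending=True):
    """Place new_gears gears one at a time on the least-loaded node.

    reported maps node name -> gear count from the last facts update.
    The reported counts are treated as stale: they do not change during
    the call, mimicking gears created within the same facts interval.
    When track_pending is True, placements made in this call are added
    to the stale counts before each choice (hypothetical fixed behavior).
    """
    pending = {node: 0 for node in reported}
    placements = []
    for _ in range(new_gears):
        # Lowest load wins; ties are broken by node name for determinism.
        target = min(reported, key=lambda n: (reported[n] + pending[n], n))
        if track_pending:
            pending[target] += 1
        placements.append(target)
    return placements


nodes = {"node1": 0, "node2": 0, "node3": 0, "node4": 0}
stale = Counter(place_gears(nodes, 80, track_pending=False))
fixed = Counter(place_gears(nodes, 80, track_pending=True))
print("stale facts only:", dict(stale))
print("with pending placements:", dict(fixed))
```

With stale counts alone every choice sees the same numbers, so the same node wins each time; counting the pending placements yields an even spread.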
Clone Of:
: 1267728 (view as bug list)
Environment:
Last Closed: 2015-10-19 08:45:48 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2015:1844
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat OpenShift Enterprise 2.2.7 security, bug fix and enhancement update
Last Updated: 2015-09-30 20:35:28 UTC

Description Timothy Williams 2015-06-04 19:13:09 UTC
Description of problem:
When an application is scaled up, the gears are not evenly distributed across nodes in the district or zone. 

Version-Release number of selected component (if applicable):
2.2.5

How reproducible:
Always

Steps to Reproduce:
1. With 4 nodes, create a single district under one region with two zones (two nodes per zone).
2. Create a scaled application in this district.
3. Make the application highly available (HA).
4. Scale the application to a high number of gears.
5. Run `oo-admin-ctl-gears list | wc -l` on each node to count its gears.

Actual results:
(After scaling to 80 gears)
node1 (zone1): 11
node2 (zone1): 17
node3 (zone2): 14
node4 (zone2): 38
zone1: 28
zone2: 52

Expected results:
node1 (zone1): 20
node2 (zone1): 20
node3 (zone2): 20
node4 (zone2): 20
zone1: 40
zone2: 40

Additional info:
If zones are not present, the gears are distributed just as unevenly across the district.
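The expected results above are simply an even split of the gears across zones, and then across the nodes in each zone. A minimal Python sketch of that arithmetic follows; the helper name `even_split` is hypothetical and is not part of any OpenShift tool.

```python
def even_split(total, buckets):
    """Split total items across buckets as evenly as possible.

    Any remainder is handed out one item at a time to the first buckets.
    """
    base, extra = divmod(total, buckets)
    return [base + (1 if i < extra else 0) for i in range(buckets)]


# Figures from this report: 80 gears, 2 zones, 2 nodes per zone.
per_zone = even_split(80, 2)
per_node = [even_split(zone_total, 2) for zone_total in per_zone]
print("expected per zone:", per_zone)
print("expected per node:", per_node)

# Observed node counts from "Actual results", for comparison.
actual = [11, 17, 14, 38]
print("observed total:", sum(actual))
print("observed max-min spread:", max(actual) - min(actual))
```

A spread of 0 or 1 between the fullest and emptiest node is what an even distribution would produce; the observed counts are far from that.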

Comment 2 Timothy Williams 2015-06-09 19:57:50 UTC
https://github.com/openshift/origin-server/pull/6160

Comment 4 openshift-github-bot 2015-07-01 15:13:29 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/99d3dcc1dba1c5a1b7f28ecffff7fe65923f6a5b
Ignore least preferred servers if all servers are least preferred

Bug 1228373
Bug link https://bugzilla.redhat.com/show_bug.cgi?id=1228373

If an application is already scaled out so that it has a gear on every available node, all nodes are considered 'least_preferred_servers'. This causes rpc_find_all_available to return only one server, leaving select_best_fit_node only one server to choose from. The result was sporadic and uneven gear placement for highly scaled applications.

This fix ignores the least_preferred_servers list if it includes all available servers.
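The rule in this commit message can be sketched roughly as follows. This is a Python illustration only; the actual change is in the Ruby code of origin-server, and the function name `filter_candidates` and its arguments are assumptions rather than the real API.

```python
def filter_candidates(available, least_preferred):
    """Return the servers to offer for gear placement.

    Least-preferred servers are normally filtered out. If every available
    server is least preferred, the list carries no information, so it is
    ignored and all available servers are returned.
    """
    least = set(least_preferred)
    if all(server in least for server in available):
        return list(available)
    return [server for server in available if server not in least]


# The application already has a gear on every node: the list is ignored.
print(filter_candidates(["node1", "node2"], ["node1", "node2"]))
# Only some nodes are least preferred: they are filtered out as before.
print(filter_candidates(["node1", "node2", "node3"], ["node1"]))
```

Returning the full list in the first case gives the node selection step every server to choose from instead of a single one.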

Comment 5 Timothy Williams 2015-07-13 18:49:13 UTC
The commit pushed only partially resolves the issue. Distribution is better, but not even as expected.

A complete fix is being investigated in online bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1234603

Comment 15 Ma xiaoqiang 2015-09-18 06:02:13 UTC
The fixed packages are not in the puddle.

After installation, QE checked the issue:
[root@broker ~]# oo-admin-ctl-gears list|wc -l
15
[root@node1 ~]# oo-admin-ctl-gears list|wc -l
16
[root@node2 ~]# oo-admin-ctl-gears list|wc -l
8
[root@node3 ~]# oo-admin-ctl-gears list |wc -l
21

QE checked the installed packages:
# rpm -qa | grep rubygem-openshift-origin-controller
rubygem-openshift-origin-controller-1.36.2.3-1.el6op.noarch
# rpm -qa | grep rubygem-openshift-origin-msg-broker-mcollective
rubygem-openshift-origin-msg-broker-mcollective-1.34.1.1-1.el6op.noarch

Both versions are older than those listed in Fixed In Version, so the puddle does not contain the fix.

Comment 17 Ma xiaoqiang 2015-09-22 01:52:22 UTC
Checked on puddle [2.2.7/2015-09.21.1]

After installation, QE checked the issue:
[root@broker ~]# oo-admin-ctl-gears list | wc -l
20
[root@node1 ~]# oo-admin-ctl-gears list | wc -l
20
[root@node2 ~]# oo-admin-ctl-gears list | wc -l
20
[root@node3 ~]# oo-admin-ctl-gears list | wc -l
20

Moving this issue to VERIFIED.

Comment 20 errata-xmlrpc 2015-09-30 16:37:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1844.html

