Bug 1362666 - oo-admin-move should move gears to nodes with enough free space + buffer space
Summary: oo-admin-move should move gears to nodes with enough free space + buffer space
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Unknown
Version: 2.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Sally
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On: 1122084
Blocks: 1277547
Reported: 2016-08-02 19:39 UTC by Rory Thrasher
Modified: 2016-08-24 19:47 UTC

Fixed In Version: rubygem-openshift-origin-msg-broker-mcollective-1.36.2.2-1.el6op, rubygem-openshift-origin-node-1.38.6.3-1.el6op, openshift-origin-msg-node-mcollective-1.30.2.2-1.el6op
Doc Type: Bug Fix
Doc Text:
Cause: A gear move did not take into account the amount of free space available on the destination node.
Consequence: Gears could be moved to a node with less free space than the gear required, causing gears on that node to fail.
Fix: The gear move process now considers the free space on each node when determining which node to move the gear to.
Result: Gears are no longer moved to a node whose storage space is inadequate for the gear.
Clone Of: 1122084
Environment:
Last Closed: 2016-08-24 19:47:16 UTC
Target Upstream Version:
Embargoed:



Links
System: Red Hat Product Errata
ID: RHSA-2016:1773
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat OpenShift Enterprise 2.2.10 security, bug fix, and enhancement update
Last Updated: 2016-08-24 23:41:18 UTC

Comment 3 Johnny Liu 2016-08-08 10:59:26 UTC
Retested this bug with openshift-origin-msg-node-mcollective-1.30.2.1-1.el6op.noarch from the 2.2/2016-08-05.1 puddle; the test failed.

The move still fails even when the target node has enough disk space under /var/lib/openshift for the gear:

# oo-admin-move --gear_uuid jialiu-ruby18app-1 -i node1.ose22-auto.com.cn
URL: http://ruby18app-jialiu.ose22-auto.com.cn
Login: jialiu
App UUID: 57a864f182611da20d0002b1
Gear UUID: 57a864f182611da20d0002b1
DEBUG: Source district uuid: 57a8243482611d52e5000001
DEBUG: Destination district uuid: 57a8243482611d52e5000001
DEBUG: Getting existing app 'ruby18app' status before moving
DEBUG: Gear component 'ruby-1.8' was running
DEBUG: Unpublishing routing information for gear 'jialiu-ruby18app-1'
DEBUG: Stopping existing app cartridge 'ruby-1.8' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Moving failed.  Rolling back gear 'jialiu-ruby18app-1' in 'ruby18app' with delete on 'node1.ose22-auto.com.cn'
Gear 'jialiu-ruby18app-1' cannot be moved to 'node1.ose22-auto.com.cn'.  Not enough disk space, node would be > 95% full after move.

It seems the following two lines of code were not merged into the RPM package:
+Facter.add(:node_disk_free) { setcode { results['node_disk_free'] } }
+Facter.add(:node_total_size) { setcode { results['node_total_size'] } }
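
For reference, the kind of check implied by the "node would be > 95% full after move" message can be sketched as below. This is only an illustration, not the actual broker code: the helper names, the kilobyte unit, and the sample numbers are assumptions, and only the fact names node_disk_free and node_total_size and the 95% limit come from this report. If those facts are missing on the node, a check like this has no valid inputs, which would be consistent with the failure above.

```ruby
# Hypothetical helpers, not the actual OpenShift broker implementation.
# total_kb and free_kb stand in for the node_total_size and node_disk_free
# facts; gear_kb is the disk usage of the gear being moved. All three are
# assumed to share one unit (kilobytes here).
MAX_PERCENT_FULL = 95

# Percentage of the destination volume that would be in use after the move.
def percent_full_after_move(total_kb, free_kb, gear_kb)
  raise ArgumentError, 'total_kb must be positive' unless total_kb > 0

  used_after = (total_kb - free_kb) + gear_kb
  used_after * 100.0 / total_kb
end

# A move is allowed only if the node stays at or below the limit.
def move_allowed?(total_kb, free_kb, gear_kb, limit = MAX_PERCENT_FULL)
  percent_full_after_move(total_kb, free_kb, gear_kb) <= limit
end

# Illustrative numbers only: a 1,000,000 KB volume with 100,000 KB free.
puts move_allowed?(1_000_000, 100_000, 40_000) # 94% full after the move
puts move_allowed?(1_000_000, 100_000, 60_000) # 96% full after the move
```

With these made-up inputs the first move stays under the limit and the second exceeds it.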

Comment 6 Johnny Liu 2016-08-09 01:48:16 UTC
Verified this bug with the 2.2/2016-08-08.1 puddle: PASS.

node1:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_dhcp128178-lv_root
                       18G  6.1G   11G  38% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm
/dev/vda1             477M   99M  353M  22% /boot
/dev/loop0            7.8G   36M  7.4G   1% /var/lib/openshift

node2:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_dhcp128178-lv_root
                       18G   13G  3.4G  80% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm
/dev/vda1             477M   99M  353M  22% /boot
/dev/loop0            7.8G  7.1G  320M  96% /var/lib/openshift
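
The /var/lib/openshift rows above are the ones that matter for the move. As a rough aid for reading them, the following sketch parses `df -h` text and reports the current usage of a given mount point. The helper names are hypothetical, and the fix itself relies on node facts rather than on parsing `df` output like this.

```ruby
# Sketch only: parses human-readable `df -h` text such as the output above.
UNITS = { 'K' => 1024, 'M' => 1024**2, 'G' => 1024**3, 'T' => 1024**4 }.freeze

# Convert a df -h size such as "7.8G" or "320M" to bytes. A bare number is
# treated as bytes, which covers the "0" entries.
def human_to_bytes(text)
  match = text.match(/\A(\d+(?:\.\d+)?)([KMGT])?\z/)
  raise ArgumentError, "unrecognised size: #{text}" unless match

  (match[1].to_f * UNITS.fetch(match[2], 1)).round
end

# Return size/used/avail/use% for the filesystem mounted on `mount`, or nil.
# Only the last five columns are read, so rows where df wrapped a long
# device name onto its own line are still handled.
def df_row(df_text, mount)
  df_text.each_line do |line|
    fields = line.split
    next unless fields.length >= 5 && fields.last == mount

    size, used, avail, pct = fields[-5, 4]
    return { size: human_to_bytes(size), used: human_to_bytes(used),
             avail: human_to_bytes(avail), use_percent: pct.to_i }
  end
  nil
end

# Rows copied from the node1 and node2 listings above.
NODE1_DF = "/dev/loop0            7.8G   36M  7.4G   1% /var/lib/openshift\n"
NODE2_DF = <<~DF
  Filesystem            Size  Used Avail Use% Mounted on
  /dev/mapper/vg_dhcp128178-lv_root
                         18G   13G  3.4G  80% /
  /dev/loop0            7.8G  7.1G  320M  96% /var/lib/openshift
DF

node1 = df_row(NODE1_DF, '/var/lib/openshift')
node2 = df_row(NODE2_DF, '/var/lib/openshift')
puts "node1 in use: #{node1[:use_percent]}%"
puts "node2 in use: #{node2[:use_percent]}%"
```

node2 is already at 96% before any gear is added, so any move to it exceeds the 95% limit, while node1 has ample room.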

Moving a gear from node1 to node2 is rejected, as expected:
# oo-admin-move --gear_uuid jialiu-php54app-1 -i node2.ose22-auto.com.cn
URL: http://php54app-jialiu.ose22-auto.com.cn
Login: jialiu
App UUID: 57a8643882611da20d00029b
Gear UUID: 57a8643882611da20d00029b
DEBUG: Source district uuid: 57a8243482611d52e5000001
DEBUG: Destination district uuid: 57a8243482611d52e5000001
DEBUG: Getting existing app 'php54app' status before moving
DEBUG: Gear component 'php-5.4' was running
DEBUG: Unpublishing routing information for gear 'jialiu-php54app-1'
DEBUG: Stopping existing app cartridge 'php-5.4' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Moving failed.  Rolling back gear 'jialiu-php54app-1' in 'php54app' with delete on 'node2.ose22-auto.com.cn'
Gear 'jialiu-php54app-1' cannot be moved to 'node2.ose22-auto.com.cn'.  Not enough disk space, node would be > 95% full after move.


Moving a gear from node2 to node1 succeeds:
# oo-admin-move --gear_uuid jialiu-php53app-1 -i node1.ose22-auto.com.cn
URL: http://php53app-jialiu.ose22-auto.com.cn
Login: jialiu
App UUID: 57a85a0482611da20d000134
Gear UUID: 57a85a0482611da20d000134
DEBUG: Source district uuid: 57a8243482611d52e5000001
DEBUG: Destination district uuid: 57a8243482611d52e5000001
DEBUG: Getting existing app 'php53app' status before moving
DEBUG: Gear component 'php-5.3' was running
DEBUG: Unpublishing routing information for gear 'jialiu-php53app-1'
DEBUG: Stopping existing app cartridge 'php-5.3' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Creating new account for gear 'jialiu-php53app-1' on node1.ose22-auto.com.cn
DEBUG: Moving content for app 'php53app', gear 'jialiu-php53app-1' to node1.ose22-auto.com.cn
Agent pid 17734
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 17734 killed;
DEBUG: Moving system components for app 'php53app', gear 'jialiu-php53app-1' to node1.ose22-auto.com.cn
Agent pid 17742
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 17742 killed;
DEBUG: Starting cartridge 'php-5.3' in 'php53app' after move on node1.ose22-auto.com.cn
DEBUG: Fixing DNS and mongo for gear 'jialiu-php53app-1' after move
DEBUG: Changing server identity of 'jialiu-php53app-1' from 'node2.ose22-auto.com.cn' to 'node1.ose22-auto.com.cn'
DEBUG: Updating routing information for gear 'jialiu-php53app-1' after move
DEBUG: Deconfiguring old app 'php53app' on node2.ose22-auto.com.cn after move
Successfully moved gear with uuid 'jialiu-php53app-1' of app 'php53app' from 'node2.ose22-auto.com.cn' to 'node1.ose22-auto.com.cn'

Comment 8 errata-xmlrpc 2016-08-24 19:47:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-1773.html

