Description of problem:
Last night, oo-admin-move moved a 21GB gear to a node that had 21.1GB free. This set off our paging alerts because the node was essentially full (only 100MB remained free). While there was technically enough space on the node, the move left the node unusable. When a gear is moved, the destination host needs adequate free space for the gear plus a buffer. The buffer could be a number of GB or a % of free space.

Version-Release number of selected component (if applicable):
openshift-origin-broker-util-1.26.3-1.el6oso.noarch

How reproducible:
I would assume very reproducible.

Actual results:
The gear was moved to the new node, but there was not enough disk space left after the move.

Expected results:
The gear is moved only to a node that still has a buffer of free space after the move.
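The requested behaviour can be illustrated with a minimal sketch. This is not the oo-admin-move implementation; the function name and the `buffer_mb` / `buffer_pct` parameters are hypothetical, and sizes are given in MB (taking 1GB as 1000MB for simplicity).

```python
def has_room_with_buffer(free_mb, gear_mb, buffer_mb=0, buffer_pct=0.0):
    """Return True if the node still has the required reserve free after the move.

    buffer_mb  : absolute reserve in MB (hypothetical parameter)
    buffer_pct : reserve as a percentage of the node's current free space
                 (hypothetical parameter)
    The larger of the two reserves applies.
    """
    reserve_mb = max(buffer_mb, free_mb * buffer_pct / 100.0)
    return free_mb - gear_mb >= reserve_mb


# The case from this report: a 21GB gear and a node with 21.1GB free.
print(has_room_with_buffer(21100, 21000))                  # no buffer
print(has_room_with_buffer(21100, 21000, buffer_mb=1000))  # 1GB buffer
print(has_room_with_buffer(21100, 21000, buffer_pct=5))    # 5% of free space
```

With no buffer the move is accepted, which matches what happened; with either a 1GB or a 5% buffer the same move would be refused.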
https://github.com/openshift/origin-server/pull/6382 merged
Verified this bug on devenv_5830 with the following steps:

1) Launch an ec2 instance and set up a multi-node env
2) Create a district and add the nodes to it
3) Create an app
4) Create a big data file in the gear on node1 to reduce the available disk space to 1G, using
   dd if=/dev/zero of=swapfile4 bs=1024 count=7262144
5) Create another app on the other node (node2)
6) Create a big data file of about 1G in this app
7) Move this app to node1

# oo-admin-move --gear_uuid 58219aecc69622f2ad00024b -i ip-172-18-2-156
URL: http://app1-zzhao.dev.rhcloud.com
Login: zzhao
App UUID: 58219aecc69622f2ad00024b
Gear UUID: 58219aecc69622f2ad00024b
DEBUG: Source district uuid: 533183076375773558341632
DEBUG: Destination district uuid: 533183076375773558341632
DEBUG: Getting existing app 'app1' status before moving
DEBUG: Gear component 'php-5.4' was running
DEBUG: Stopping existing app cartridge 'php-5.4' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Moving failed. Rolling back gear '58219aecc69622f2ad00024b' in 'app1' with delete on 'ip-172-18-2-156'
Gear '58219aecc69622f2ad00024b' cannot be moved to 'ip-172-18-2-156'. Not enough disk space, node would be > 95% full after move
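The error message above suggests a check on the projected disk usage of the destination node. A rough sketch of such a check follows; the function names are hypothetical, the 95% default is taken only from the error message, and the disk sizes are made-up illustrative values, since the actual node sizes are not given in this report.

```python
def percent_full_after_move(total_kb, used_kb, gear_kb):
    """Projected usage of the destination node, in percent, once the gear lands."""
    return 100.0 * (used_kb + gear_kb) / total_kb


def can_move(total_kb, used_kb, gear_kb, max_percent=95.0):
    """Allow the move only if the node would not exceed max_percent usage."""
    return percent_full_after_move(total_kb, used_kb, gear_kb) <= max_percent


# Hypothetical destination node: 8GiB disk with 7GiB used (about 1GiB available).
TOTAL_KB = 8 * 1024 * 1024
USED_KB = 7 * 1024 * 1024

# Moving a gear of about 1GiB would fill the node completely, so it is refused.
print(can_move(TOTAL_KB, USED_KB, 1024 * 1024))
# A 100MiB gear would leave the node under the threshold, so it is allowed.
print(can_move(TOTAL_KB, USED_KB, 100 * 1024))
```

A percentage threshold scales with node size, whereas a fixed number of GB gives the same reserve on every node; either satisfies the request in the original description.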