Description of problem:
When an administrator uses "oo-admin-move" to move a gear with many server aliases, the target node reloads apache and openshift-sni-proxy once per alias, which unnecessarily taxes the node and can cause the gear move to time out and fail.

Version-Release number of selected component (if applicable):
rubygem-openshift-origin-node-1.34.2-1.el6oso.noarch

How reproducible:
Often

Steps to Reproduce:
1. Create an app
2. Create many aliases (in our case, there were 135)
3. Attempt to move the app using oo-admin-move

Actual results:
Timeout

Expected results:
This operation should only require a single reload each of apache and openshift-sni-proxy, after all aliases have been populated.
There's at least one non-trivial change to make here, I think. In the v2 model code:

https://github.com/openshift/origin-server/blob/master/node/lib/openshift-origin-node/model/v2_cart_model.rb#L1079

There needs to be some aggregation of alias elements to feed to the frontend plugin. However, the plugin interface works against a single entry, and it typically acquires a lock and reloads after processing that single entry:

https://github.com/openshift/origin-server/blob/master/plugins/frontend/apache-vhost/lib/openshift/runtime/frontend/http/plugins/apache-vhost.rb#L145

So, at a minimum, the model code should aggregate, and the plugin interface should be updated to support an array of entries to process within a single lock/reload context.

There may also be some other aggregation necessary in the proxy code:

https://github.com/openshift/origin-server/blob/master/plugins/msg-broker/mcollective/lib/openshift/mcollective_application_container_proxy.rb#L2038

I'm not totally familiar with the proxy logic for the move/frontend stuff, so I'm not sure whether there's more to be done. I'm hoping Rajat will have some insight there.

My initial impression is that while this is certainly fixable, it's probably a bit much to throw in as a bugfix and should be marked UpcomingRelease. I'll mark it now, and let Jhon (or somebody) put it back into the release if we can get a more optimistic estimate.
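To make the proposed change concrete, here is a minimal sketch of the difference between per-entry and batched processing. It is not the real apache-vhost plugin or its interface: the class SketchFrontendPlugin, the methods add_alias and add_aliases, and the in-memory Mutex and reload counter are all hypothetical stand-ins for whatever lock and reload the actual plugin performs.

```ruby
# Hypothetical stand-in for a frontend plugin. The real plugin's lock and
# httpd reload are replaced by a Mutex and a counter for illustration.
class SketchFrontendPlugin
  attr_reader :aliases, :reload_count

  def initialize
    @lock = Mutex.new
    @aliases = []
    @reload_count = 0
  end

  # Per-entry style: one lock acquisition and one reload per alias.
  def add_alias(name)
    @lock.synchronize do
      write_alias(name)
      reload
    end
  end

  # Batched style: one lock acquisition and one reload for the whole array.
  def add_aliases(names)
    return if names.empty?
    @lock.synchronize do
      names.each { |name| write_alias(name) }
      reload
    end
  end

  private

  # Assumption: stands in for writing the alias into the frontend config.
  def write_alias(name)
    @aliases << name unless @aliases.include?(name)
  end

  # Assumption: stands in for the expensive frontend reload.
  def reload
    @reload_count += 1
  end
end

names = (1..135).map { |i| "doo#{i}.com" }

per_alias = SketchFrontendPlugin.new
names.each { |n| per_alias.add_alias(n) }

batched = SketchFrontendPlugin.new
batched.add_aliases(names)

puts "per-alias reloads: #{per_alias.reload_count}"  # 135
puts "batched reloads:   #{batched.reload_count}"    # 1
```

With the 135 aliases from the report, the per-entry path performs 135 reloads while the batched path performs one, which is the behavior the expected results ask for.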
WIP PR: https://github.com/openshift/origin-server/pull/6425
Commit pushed to master at https://github.com/openshift/origin-server https://github.com/openshift/origin-server/commit/5749235413c6e42cc744cfa25f6b5adead68f2e7 Bug 1194029 - add multiple app aliases in a single httpd reload
Verified this bug on devenv_5829 with the following steps:
1. Set up a multi-node env
2. Create the district and add the node
3. Create one non-scalable app and one scalable app
4. Add 135 aliases to the app
5. Move the 2 apps

rhc app show app1s
app1s @ http://app1s-zzhao.dev.rhcloud.com/ (uuid: 58219d00c69622f2ad00047e)
----------------------------------------------------------------------------
  Domain:     zzhao
  Created:    4:38 AM
  Gears:      1 (defaults to small)
  Git URL:    ssh://58219d00c69622f2ad00047e.rhcloud.com/~/git/app1s.git/
  SSH:        58219d00c69622f2ad00047e.rhcloud.com
  Deployment: auto (on git push)
  Aliases:    doo1.com, doo2.com, doo3.com, doo4.com, doo5.com, doo6.com, doo7.com, doo8.com, doo9.com, doo10.com, doo11.com, doo12.com, doo13.com, doo14.com, doo15.com, doo16.com, doo17.com, doo18.com, doo19.com, doo20.com, doo21.com, doo22.com, doo23.com, doo24.com, doo25.com, doo26.com, doo27.com, doo28.com, doo29.com, doo30.com, doo31.com, doo32.com, doo33.com, doo34.com, doo35.com, doo36.com, doo37.com, doo38.com, doo39.com, doo40.com, doo41.com, doo42.com, doo43.com, doo44.com, doo45.com, doo46.com, doo47.com, doo48.com, doo49.com, doo50.com, doo51.com, doo52.com, doo53.com, doo54.com, doo55.com, doo56.com, doo57.com, doo58.com, doo59.com, doo60.com, doo61.com, doo62.com, doo63.com, doo64.com, doo65.com, doo66.com, doo67.com, doo68.com, doo69.com, doo70.com, doo71.com, doo72.com, doo73.com, doo74.com, doo75.com, doo76.com, doo77.com, doo78.com, doo79.com, doo80.com, doo81.com, doo82.com, doo83.com, doo84.com, doo85.com, doo86.com, doo87.com, doo88.com, doo89.com, doo90.com, doo91.com, doo92.com, doo93.com, doo94.com, doo95.com, doo96.com, doo97.com, doo98.com, doo99.com, doo100.com, doo101.com, doo102.com, doo103.com, doo104.com, doo105.com, doo106.com, doo107.com, doo108.com, doo109.com, doo110.com, doo111.com, doo112.com, doo113.com, doo114.com, doo115.com, doo116.com, doo117.com, doo118.com, doo119.com, doo120.com, doo121.com, doo122.com, doo123.com, doo124.com, doo125.com, doo126.com, doo127.com, doo128.com, doo129.com, doo130.com, doo131.com, doo132.com, doo133.com, doo134.com, doo135.com

  haproxy-1.4 (Web Load Balancer)
  -------------------------------
    Gears: Located with php-5.4

  php-5.4 (PHP 5.4)
  -----------------
    Scaling: x1 (minimum: 1, maximum: available) on small gears

[root@ip-172-18-12-116 ~]# oo-admin-move --gear_uuid 58219d00c69622f2ad00047e -i ip-172-18-2-156
URL: http://app1s-zzhao.dev.rhcloud.com
Login: zzhao
App UUID: 58219d00c69622f2ad00047e
Gear UUID: 58219d00c69622f2ad00047e
DEBUG: Source district uuid: 533183076375773558341632
DEBUG: Destination district uuid: 533183076375773558341632
DEBUG: Getting existing app 'app1s' status before moving
DEBUG: Gear component 'php-5.4' was running
DEBUG: Stopping existing app cartridge 'php-5.4' before moving
DEBUG: Stopping existing app cartridge 'haproxy-1.4' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Creating new account for gear '58219d00c69622f2ad00047e' on ip-172-18-2-156
DEBUG: Moving content for app 'app1s', gear '58219d00c69622f2ad00047e' to ip-172-18-2-156
Agent pid 17924
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 17924 killed;
DEBUG: Moving system components for app 'app1s', gear '58219d00c69622f2ad00047e' to ip-172-18-2-156
Agent pid 17948
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 17948 killed;
DEBUG: Starting cartridge 'haproxy-1.4' in 'app1s' after move on ip-172-18-2-156
DEBUG: Starting cartridge 'php-5.4' in 'app1s' after move on ip-172-18-2-156
DEBUG: Fixing DNS and mongo for gear '58219d00c69622f2ad00047e' after move
DEBUG: Changing server identity of '58219d00c69622f2ad00047e' from 'ip-172-18-12-116' to 'ip-172-18-2-156'
DEBUG: Deconfiguring old app 'app1s' on ip-172-18-12-116 after move
Successfully moved gear with uuid '58219d00c69622f2ad00047e' of app 'app1s' from 'ip-172-18-12-116' to 'ip-172-18-2-156'
Moving to ON_QA, as https://github.com/openshift/origin-server/pull/6427 has been merged.
Verified this bug according to comment 5
We apologize, however, we do not plan to address this report at this time. The majority of our active development is for the v3 version of OpenShift. If you would like for Red Hat to reconsider this decision, please reach out to your support representative. We are very sorry for any inconvenience this may cause.