Description of problem:
Each gear now produces its own ssh key pair to be shared with the other gears of the app. With a bulk scale-up call (e.g. set min gears to 'n+current_count'), since all n gears are created sequentially, n op-groups are created to add the ssh keys. This makes the scale-up call take much longer.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Create a scalable app
2. Issue a scale-up of several gears, e.g. rhc cartridge scale --min 16
3. Look at the broker logs and compare the time breakdown of the actual scale-up op-group vs. the add_ssh_key op-groups that are executed subsequently.

Actual results:
For a scale-up by n gears, n op-groups are created for add-ssh-keys

Expected results:
Possibly optimize the op-group creation. The end goal should be to improve the time taken to scale up by 16 gears.

Additional info:
More than 50% of the time is spent in ssh key installation.
The optimization will be implemented with this Trello card - https://trello.com/c/SDNpDl4G/190-merge-add-ssh-key-op-groups-during-scale-up
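For illustration only, the idea of merging the add-ssh-key op-groups can be sketched as below. This is a toy model, not the broker's code: the Op and OpGroup structs, the key naming, and both helper methods are hypothetical names made up for the example.

```ruby
# Hypothetical stand-ins; these are NOT the broker's real op/op-group classes.
Op = Struct.new(:type, :gear, :key_name)
OpGroup = Struct.new(:ops)

# Models the reported behaviour: one add_ssh_key op-group per new gear.
def per_gear_op_groups(new_gears)
  new_gears.map do |gear|
    OpGroup.new([Op.new(:add_ssh_key, gear, "key-#{gear}")])
  end
end

# Collapses the ops of all given op-groups into a single op-group.
def merge_op_groups(groups)
  OpGroup.new(groups.flat_map(&:ops))
end

new_gears = (1..16).map { |i| "gear#{i}" }
groups = per_gear_op_groups(new_gears)
merged = merge_op_groups(groups)

puts "op-groups before merge: #{groups.size}"
puts "op-groups after merge: 1 (carrying #{merged.ops.size} ops)"
```

The number of key additions stays the same in this model; only the per-op-group overhead is paid once instead of n times.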
We will be fixing this as a bug rather than dealing with it as a user story.
Fixed with --> https://github.com/openshift/origin-server/pull/4565
Actually fixed with: https://github.com/openshift/origin-server/pull/4568
Commits pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/d18f26afe056fa353eb4317dd64093ae63567cc9
Bug 1049044: Creating a single sshkey for each scalable application

https://github.com/openshift/origin-server/commit/de2a8afe7d50cb7e08b5810c6df4abbc7262cc0e
Merge pull request #4568 from danmcp/bug1049044

Merged by openshift-bot
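Going only by the commit title above, the approach appears to be one key pair per scalable application rather than one per gear. A minimal sketch of that "generate once, reuse for every gear" pattern follows. The ScalableApp class and its methods are hypothetical and do not reflect the actual origin-server implementation; the 2048-bit RSA key is also an assumption.

```ruby
require "openssl"

# Hypothetical stand-in for an application; NOT the broker's Application model.
class ScalableApp
  attr_reader :keygen_count

  def initialize
    @keygen_count = 0
    @key = nil
  end

  # Generates the app-level key pair on first use and reuses it afterwards.
  def ssh_key
    @key ||= begin
      @keygen_count += 1
      OpenSSL::PKey::RSA.new(2048) # key size is an assumption for the example
    end
  end

  # Every gear is handed the same public key (the gear name is ignored).
  def public_key_for(_gear_name)
    ssh_key.public_key.to_pem
  end
end

app = ScalableApp.new
pems = (1..16).map { |i| app.public_key_for("gear#{i}") }

puts "key pairs generated: #{app.keygen_count}"
puts "distinct public keys handed out: #{pems.uniq.size}"
```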
Tested on devenv_4264 and found the following problems:

1. Caught exceptions in development.log of the broker when scaling up applications. This is reproducible.
- Create one scalable ruby-1.9 app and set its min scale to 16. The following exception was caught 16 times in the log:

--------<log>--------------
2014-01-23 06:14:32.634 [FATAL] ActionController::RoutingError (No route matches [GET] "/"):
  actionpack (3.2.8) lib/action_dispatch/middleware/debug_exceptions.rb:21:in `call'
  actionpack (3.2.8) lib/action_dispatch/middleware/show_exceptions.rb:56:in `call'
  railties (3.2.8) lib/rails/rack/logger.rb:26:in `call_app'
  railties (3.2.8) lib/rails/rack/logger.rb:16:in `call'
  actionpack (3.2.8) lib/action_dispatch/middleware/request_id.rb:22:in `call'
  rack (1.4.1) lib/rack/methodoverride.rb:21:in `call'
  rack (1.4.1) lib/rack/runtime.rb:17:in `call'
  activesupport (3.2.8) lib/active_support/cache/strategy/local_cache.rb:72:in `call'
  rack (1.4.1) lib/rack/lock.rb:15:in `call'
  actionpack (3.2.8) lib/action_dispatch/middleware/static.rb:62:in `call'
  rack-cache (1.2) lib/rack/cache/context.rb:136:in `forward'
  rack-cache (1.2) lib/rack/cache/context.rb:245:in `fetch'
  rack-cache (1.2) lib/rack/cache/context.rb:185:in `lookup'
  rack-cache (1.2) lib/rack/cache/context.rb:66:in `call!'
  rack-cache (1.2) lib/rack/cache/context.rb:51:in `call'
  railties (3.2.8) lib/rails/engine.rb:479:in `call'
  railties (3.2.8) lib/rails/application.rb:223:in `call'
  railties (3.2.8) lib/rails/railtie/configurable.rb:30:in `method_missing'
  passenger (3.0.21) lib/phusion_passenger/rack/request_handler.rb:97:in `process_request'
  passenger (3.0.21) lib/phusion_passenger/abstract_request_handler.rb:521:in `accept_and_process_next_request'
  passenger (3.0.21) lib/phusion_passenger/abstract_request_handler.rb:274:in `main_loop'
  passenger (3.0.21) lib/phusion_passenger/rack/application_spawner.rb:206:in `start_request_handler'
  passenger (3.0.21) lib/phusion_passenger/rack/application_spawner.rb:79:in `block in spawn_application'
  passenger (3.0.21) lib/phusion_passenger/utils.rb:470:in `safe_fork'
  passenger (3.0.21) lib/phusion_passenger/rack/application_spawner.rb:64:in `spawn_application'
  passenger (3.0.21) lib/phusion_passenger/spawn_manager.rb:264:in `spawn_rack_application'
  passenger (3.0.21) lib/phusion_passenger/spawn_manager.rb:137:in `spawn_application'
  passenger (3.0.21) lib/phusion_passenger/spawn_manager.rb:275:in `handle_spawn_application'
  passenger (3.0.21) lib/phusion_passenger/abstract_server.rb:357:in `server_main_loop'
  passenger (3.0.21) lib/phusion_passenger/abstract_server.rb:206:in `start_synchronously'
  passenger (3.0.21) helper-scripts/passenger-spawn-server:102:in `<main>'
 (pid:2701)
--------<end of log>--------------

2. Unable to ssh from the haproxy gear to new gears (non-haproxy gears); the .openshift_ssh/config and known_hosts files are root owned

[r19s-jhou.dev.rhcloud.com 52e0f8463c4fcb5ad5000006]\> ssh -v 52e0f8c33c4fcb15ef000002.rhcloud.com
OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010
Can't open user config file /var/lib/openshift/52e0f8463c4fcb5ad5000006//.openshift_ssh/config: Permission denied

[root@ip-10-239-57-70 52e0f8463c4fcb5ad5000006]# ls -lZ .openshift_ssh
-rw-rw----. root root system_u:object_r:openshift_var_lib_t:s0 config
-rw-------. 52e0f8463c4fcb5ad5000006 52e0f8463c4fcb5ad5000006 unconfined_u:object_r:openshift_var_lib_t:s0:c0,c1000 id_rsa
-rw-------. 52e0f8463c4fcb5ad5000006 52e0f8463c4fcb5ad5000006 unconfined_u:object_r:openshift_var_lib_t:s0:c0,c1000 id_rsa.pub
-rw-rw----. root root system_u:object_r:openshift_var_lib_t:s0 known_hosts

3. Make the app HA: the REST API returns success, but no new gears were added to the application. The application still consumes 16 gears with only 1 haproxy instance.

[root@ip-10-239-57-70 openshift]# pwd
/var/lib/openshift
[root@ip-10-239-57-70 openshift]# ls |grep jhou |wc -l
16
[root@ip-10-239-57-70 openshift]# tree -L 2|grep haproxy
| |-- haproxy

irb(main):001:0> require '/var/www/openshift/broker/config/environment'
=> true
irb(main):012:0> app=Application.find_by(name: "r19s")
=> #<Application _id: 52e0f8463c4fcb5ad5000006, _type: nil, created_at: 2014-01-23 11:08:54 UTC, updated_at: 2014-01-23 11:34:24 UTC, name: "r19s", canonical_name: "r19s", domain_requires: [], group_overrides: [{"components"=>[{"cart"=>"ruby-1.9", "comp"=>"ruby-1.9"}, {"cart"=>"haproxy-1.4", "comp"=>"web_proxy", "min_gears"=>2, "max_gears"=>-1}], "min_gears"=>16, "max_gears"=>-1}], domain_id: "52e0f7863c4fcb5ad5000003", domain_namespace: "jhou", owner_id: "52e0f7863c4fcb5ad5000001", builder_id: nil, downloaded_cart_map: {}, default_gear_size: "small", scalable: true, ha: true, init_git_url: nil, analytics: {"user_agent"=>"rhc/1.19.4 (ruby 1.9.3; x86_64-linux) (API [1.1, 1.2, 1.3, 1.4, 1.5, 1.6]) (2.3.4.1, ruby 1.9.3 (2013-06-27))"}, secret_token: "xJ7XEPn7laBGh3r3_bIOOv0fUbpAuazJLhZIeO3elq_PiNppsLWBp6HuvYVO_7j0FleOu0WcJET0fRZf9Hd1aIQibvK5Xed7tfk9qWihr0UOURz3OjICkVstWpgQ_IVx", config: {"auto_deploy"=>true, "deployment_branch"=>"master", "keep_deployments"=>1, "deployment_type"=>"git"}, meta: nil, uuid: "52e0f8463c4fcb074b000001">
irb(main):015:0> app.ha
=> true
irb(main):013:0> app.gears.length
=> 16
The first error mentioned -- [FATAL] ActionController::RoutingError (No route matches [GET] "/") -- is one we've been seeing for a long time. It doesn't impact the functioning of OpenShift. Please feel free to file a separate bug for this (or we may already have one, I'm not sure). For the 3rd error mentioned (making app HA doesn't add 2nd HAProxy gear), please file a separate bug, as this bug is just about SSH keys.
Jhon, It looks like you changed these perms. Was it related to the bug or did it just seem like the right thing to do? -Dan
The third issue is expected; it is not a bug. Make-HA does not scale up automatically if there are more than 2 gears present. The user is expected to do that. We will add an appropriate message for the user.
Commits pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/83cd9f259a650b70a0990b0ba27112c63b2fd749
Bug 1049044 - Restore setting ssh config settings for gear

* Setting the file permissions was incorrectly removed
* https://bugzilla.redhat.com/show_bug.cgi?id=1049044#c8

https://github.com/openshift/origin-server/commit/a9b3154ba6d1d71d7dd9e881fb40c8dbc7e547ff
Bug 1049044 - Create more of .openshift_ssh environment
Verified on devenv_4269

ssh from the haproxy gear to other gears is successful. The ssh config file permission issue is fixed.

[root@ip-10-236-136-111 .openshift_ssh]# ls -Zl
total 8
-rw-rw----. 1 system_u:object_r:openshift_var_lib_t:s0:c0,c1000 52e2013d6e8d6b7ba0000027 52e2013d6e8d6b7ba0000027    0 Jan 24 00:59 config
-rw-------. 1 system_u:object_r:openshift_var_lib_t:s0:c0,c1000 52e2013d6e8d6b7ba0000027 52e2013d6e8d6b7ba0000027 1675 Jan 24 00:59 id_rsa
-rw-------. 1 system_u:object_r:openshift_var_lib_t:s0:c0,c1000 52e2013d6e8d6b7ba0000027 52e2013d6e8d6b7ba0000027  424 Jan 24 00:59 id_rsa.pub
-rw-rw----. 1 system_u:object_r:openshift_var_lib_t:s0:c0,c1000 52e2013d6e8d6b7ba0000027 52e2013d6e8d6b7ba0000027    0 Jan 24 00:59 known_hosts
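The ownership part of this verification could also be scripted. The following is a rough sketch, not part of the fix: misowned_files is a hypothetical helper, and the demo runs against a throwaway temp directory rather than a real gear's .openshift_ssh directory.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: returns the names of entries in dir whose owner uid
# differs from expected_uid (e.g. root-owned files in a gear's ssh directory).
def misowned_files(dir, expected_uid)
  Dir.children(dir).sort.select do |name|
    File.lstat(File.join(dir, name)).uid != expected_uid
  end
end

# Demo on a temp directory; files created here belong to the current user.
Dir.mktmpdir do |dir|
  %w[config id_rsa id_rsa.pub known_hosts].each do |name|
    FileUtils.touch(File.join(dir, name))
  end
  puts "mis-owned entries: #{misowned_files(dir, Process.euid).inspect}"
end
```

On a node, the same helper could be pointed at a gear's .openshift_ssh directory with the gear user's uid, which can be looked up with Etc.getpwnam from Ruby's standard library.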