Bug 1049044 - Scale up by 'n' gears causes inefficient installation of ssh keys
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jhon Honce
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-01-06 21:03 UTC by Rajat Chopra
Modified: 2015-05-14 23:33 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-02-26 19:09:18 UTC
Target Upstream Version:
Embargoed:



Description Rajat Chopra 2014-01-06 21:03:01 UTC
Description of problem:
Each gear now produces its own ssh key pair to be shared with other gears of the app. With a bulk scale-up call (e.g. set min gears to 'n+current_count'), since all n gears are created sequentially, n op-groups are created to add the ssh keys.

As a result, the time taken by the scale-up call degrades badly as n grows.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Create a scalable app
2. Issue a scale-up of several gears e.g. rhc cartridge scale --min 16
3. Look at the broker logs and compare the time spent in the actual scale-up op-group with the time spent in the add_ssh_key op-groups that are executed afterwards.

Actual results:
For a scale-up by n gears, n op-groups are created for add-ssh-keys.

Expected results:
Possibly optimize the op-group creation. The end goal should be to improve the time taken to scale-up by 16 gears.

Additional info:
More than 50% of time is spent in ssh key installation.
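
To make the cost concrete, here is a minimal sketch of the difference between one add-ssh-key op-group per new gear and a single merged op-group. It is an illustration only: `OpGroup`, `per_gear_key_op_groups`, `merged_key_op_group` and the `key-for-...` strings are hypothetical names, not the origin-server broker classes, and the fix that eventually landed may take a different approach.

```ruby
# Hypothetical model of op-groups; NOT the origin-server implementation.
OpGroup = Struct.new(:kind, :ops)

# Behaviour described in this bug: one op-group per newly created gear.
def per_gear_key_op_groups(new_gears)
  new_gears.map do |gear|
    OpGroup.new(:add_ssh_key, [{ gear: gear, key: "key-for-#{gear}" }])
  end
end

# Merged variant: a single op-group carries the key ops for all new gears.
def merged_key_op_group(new_gears)
  ops = new_gears.map { |gear| { gear: gear, key: "key-for-#{gear}" } }
  ops.empty? ? [] : [OpGroup.new(:add_ssh_key, ops)]
end

new_gears = (1..15).map { |i| "gear#{i}" }
puts per_gear_key_op_groups(new_gears).length # one op-group per gear
puts merged_key_op_group(new_gears).length    # a single op-group
```

The number of key operations is the same in both cases; only the number of op-groups that have to be created and executed changes.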

Comment 1 Rajat Chopra 2014-01-21 19:29:43 UTC
The optimization will be implemented with this Trello card - 
https://trello.com/c/SDNpDl4G/190-merge-add-ssh-key-op-groups-during-scale-up

Comment 2 Abhishek Gupta 2014-01-23 04:10:58 UTC
We will be fixing this as a bug rather than dealing with it as a user story.

Comment 3 Abhishek Gupta 2014-01-23 04:11:53 UTC
Fixed with --> https://github.com/openshift/origin-server/pull/4565

Comment 4 Dan McPherson 2014-01-23 05:38:30 UTC
Actually fixed with:  https://github.com/openshift/origin-server/pull/4568

Comment 5 openshift-github-bot 2014-01-23 06:41:43 UTC
Commits pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/d18f26afe056fa353eb4317dd64093ae63567cc9
Bug 1049044: Creating a single sshkey for each scalable application

https://github.com/openshift/origin-server/commit/de2a8afe7d50cb7e08b5810c6df4abbc7262cc0e
Merge pull request #4568 from danmcp/bug1049044

Merged by openshift-bot

Comment 6 Jianwei Hou 2014-01-23 12:15:57 UTC
Tested on devenv_4264 and found following problems:

1. Caught exceptions in development.log of broker when scaling up applications. This is reproducible.
 - Create one scalable ruby-1.9 app and set its min scale to 16. The following exception was caught 16 times in the log:
--------<log>--------------
2014-01-23 06:14:32.634 [FATAL] ActionController::RoutingError (No route matches [GET] "/"):
actionpack (3.2.8) lib/action_dispatch/middleware/debug_exceptions.rb:21:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/show_exceptions.rb:56:in `call'
railties (3.2.8) lib/rails/rack/logger.rb:26:in `call_app'
railties (3.2.8) lib/rails/rack/logger.rb:16:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/request_id.rb:22:in `call'
rack (1.4.1) lib/rack/methodoverride.rb:21:in `call'
rack (1.4.1) lib/rack/runtime.rb:17:in `call'
activesupport (3.2.8) lib/active_support/cache/strategy/local_cache.rb:72:in `call'
rack (1.4.1) lib/rack/lock.rb:15:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/static.rb:62:in `call'
rack-cache (1.2) lib/rack/cache/context.rb:136:in `forward'
rack-cache (1.2) lib/rack/cache/context.rb:245:in `fetch'
rack-cache (1.2) lib/rack/cache/context.rb:185:in `lookup'
rack-cache (1.2) lib/rack/cache/context.rb:66:in `call!'
rack-cache (1.2) lib/rack/cache/context.rb:51:in `call'
railties (3.2.8) lib/rails/engine.rb:479:in `call'
railties (3.2.8) lib/rails/application.rb:223:in `call'
railties (3.2.8) lib/rails/railtie/configurable.rb:30:in `method_missing'
passenger (3.0.21) lib/phusion_passenger/rack/request_handler.rb:97:in `process_request'
passenger (3.0.21) lib/phusion_passenger/abstract_request_handler.rb:521:in `accept_and_process_next_request'
passenger (3.0.21) lib/phusion_passenger/abstract_request_handler.rb:274:in `main_loop'
passenger (3.0.21) lib/phusion_passenger/rack/application_spawner.rb:206:in `start_request_handler'
passenger (3.0.21) lib/phusion_passenger/rack/application_spawner.rb:79:in `block in spawn_application'
passenger (3.0.21) lib/phusion_passenger/utils.rb:470:in `safe_fork'
passenger (3.0.21) lib/phusion_passenger/rack/application_spawner.rb:64:in `spawn_application'
passenger (3.0.21) lib/phusion_passenger/spawn_manager.rb:264:in `spawn_rack_application'
passenger (3.0.21) lib/phusion_passenger/spawn_manager.rb:137:in `spawn_application'
passenger (3.0.21) lib/phusion_passenger/spawn_manager.rb:275:in `handle_spawn_application'
passenger (3.0.21) lib/phusion_passenger/abstract_server.rb:357:in `server_main_loop'
passenger (3.0.21) lib/phusion_passenger/abstract_server.rb:206:in `start_synchronously'
passenger (3.0.21) helper-scripts/passenger-spawn-server:102:in `<main>' (pid:2701)

--------<end of log>--------------

2. Unable to ssh from the haproxy gear to the new (non-haproxy) gears. The .openshift_ssh/config and known_hosts files are owned by root:
[r19s-jhou.dev.rhcloud.com 52e0f8463c4fcb5ad5000006]\> ssh -v 52e0f8c33c4fcb15ef000002.rhcloud.com
OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010
Can't open user config file /var/lib/openshift/52e0f8463c4fcb5ad5000006//.openshift_ssh/config: Permission denied

[root@ip-10-239-57-70 52e0f8463c4fcb5ad5000006]# ls -lZ .openshift_ssh
-rw-rw----. root root system_u:object_r:openshift_var_lib_t:s0 config
-rw-------. 52e0f8463c4fcb5ad5000006 52e0f8463c4fcb5ad5000006 unconfined_u:object_r:openshift_var_lib_t:s0:c0,c1000 id_rsa
-rw-------. 52e0f8463c4fcb5ad5000006 52e0f8463c4fcb5ad5000006 unconfined_u:object_r:openshift_var_lib_t:s0:c0,c1000 id_rsa.pub
-rw-rw----. root root system_u:object_r:openshift_var_lib_t:s0 known_hosts

3. After making the app HA, the REST API returns success, but no new gears were added to the application. The application still consumes 16 gears with only 1 haproxy instance:
[root@ip-10-239-57-70 openshift]# pwd
/var/lib/openshift

[root@ip-10-239-57-70 openshift]# ls |grep jhou |wc -l
16

[root@ip-10-239-57-70 openshift]# tree -L 2|grep haproxy

| |-- haproxy



irb(main):001:0> require '/var/www/openshift/broker/config/environment'
=> true

irb(main):012:0> app=Application.find_by(name: "r19s") 
=> #<Application _id: 52e0f8463c4fcb5ad5000006, _type: nil, created_at: 2014-01-23 11:08:54 UTC, updated_at: 2014-01-23 11:34:24 UTC, name: "r19s", canonical_name: "r19s", domain_requires: [], group_overrides: [{"components"=>[{"cart"=>"ruby-1.9", "comp"=>"ruby-1.9"}, {"cart"=>"haproxy-1.4", "comp"=>"web_proxy", "min_gears"=>2, "max_gears"=>-1}], "min_gears"=>16, "max_gears"=>-1}], domain_id: "52e0f7863c4fcb5ad5000003", domain_namespace: "jhou", owner_id: "52e0f7863c4fcb5ad5000001", builder_id: nil, downloaded_cart_map: {}, default_gear_size: "small", scalable: true, ha: true, init_git_url: nil, analytics: {"user_agent"=>"rhc/1.19.4 (ruby 1.9.3; x86_64-linux) (API [1.1, 1.2, 1.3, 1.4, 1.5, 1.6]) (2.3.4.1, ruby 1.9.3 (2013-06-27))"}, secret_token: "xJ7XEPn7laBGh3r3_bIOOv0fUbpAuazJLhZIeO3elq_PiNppsLWBp6HuvYVO_7j0FleOu0WcJET0fRZf9Hd1aIQibvK5Xed7tfk9qWihr0UOURz3OjICkVstWpgQ_IVx", config: {"auto_deploy"=>true, "deployment_branch"=>"master", "keep_deployments"=>1, "deployment_type"=>"git"}, meta: nil, uuid: "52e0f8463c4fcb074b000001">

irb(main):015:0> app.ha
=> true

irb(main):013:0> app.gears.length
=> 16

Comment 7 Andy Goldstein 2014-01-23 13:49:16 UTC
The first error mentioned -- [FATAL] ActionController::RoutingError (No route matches [GET] "/") -- is one we've been seeing for a long time. It doesn't impact the functioning of OpenShift. Please feel free to file a separate bug for this (or we may already have one, I'm not sure).

For the 3rd error mentioned (making app HA doesn't add 2nd HAProxy gear), please file a separate bug, as this bug is just about SSH keys.

Comment 8 Dan McPherson 2014-01-23 17:24:31 UTC
Jhon,

  It looks like you changed these perms.  Was it related to the bug or did it just seem like the right thing to do?

-Dan

Comment 9 Rajat Chopra 2014-01-23 18:10:09 UTC
Third issue is expected. Not a bug.
Make-HA does not scale up automatically if there are more than 2 gears present; the user is expected to do that. We will add an appropriate message for the user.

Comment 10 openshift-github-bot 2014-01-23 23:44:44 UTC
Commits pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/83cd9f259a650b70a0990b0ba27112c63b2fd749
Bug 1049044 - Restore setting ssh config settings for gear

* Setting the file permissions was incorrectly removed
* https://bugzilla.redhat.com/show_bug.cgi?id=1049044#c8

https://github.com/openshift/origin-server/commit/a9b3154ba6d1d71d7dd9e881fb40c8dbc7e547ff
Bug 1049044 - Create more of .openshift_ssh environment

Comment 11 Jianwei Hou 2014-01-24 06:18:19 UTC
Verified on devenv_4269

ssh from haproxy gear to other gears is successful. The ssh config file permission issue is fixed.

[root@ip-10-236-136-111 .openshift_ssh]# ls -Zl
total 8
-rw-rw----. 1 system_u:object_r:openshift_var_lib_t:s0:c0,c1000 52e2013d6e8d6b7ba0000027 52e2013d6e8d6b7ba0000027    0 Jan 24 00:59 config
-rw-------. 1 system_u:object_r:openshift_var_lib_t:s0:c0,c1000 52e2013d6e8d6b7ba0000027 52e2013d6e8d6b7ba0000027 1675 Jan 24 00:59 id_rsa
-rw-------. 1 system_u:object_r:openshift_var_lib_t:s0:c0,c1000 52e2013d6e8d6b7ba0000027 52e2013d6e8d6b7ba0000027  424 Jan 24 00:59 id_rsa.pub
-rw-rw----. 1 system_u:object_r:openshift_var_lib_t:s0:c0,c1000 52e2013d6e8d6b7ba0000027 52e2013d6e8d6b7ba0000027    0 Jan 24 00:59 known_hosts
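
The ownership check in the listing above could also be scripted. The following is a rough sketch, assuming the gear's Unix user is named after the gear UUID (as the `ls` output suggests); `wrongly_owned` is a hypothetical helper, not part of OpenShift.

```ruby
# Hypothetical helper: return the names of entries in dir whose owner
# uid differs from expected_uid (sorted for stable output).
def wrongly_owned(dir, expected_uid)
  Dir.children(dir).sort.select do |name|
    File.lstat(File.join(dir, name)).uid != expected_uid
  end
end

# Possible usage on a node (gear UUID taken from the listing above):
#   require 'etc'
#   gear = '52e2013d6e8d6b7ba0000027'
#   uid  = Etc.getpwnam(gear).uid
#   p wrongly_owned("/var/lib/openshift/#{gear}/.openshift_ssh", uid)
```

An empty result would mean every file in the directory is owned by the gear user. This checks ownership only, not the mode bits or the SELinux context.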

