Bug 985496

Summary: oo-admin-chk level1 times out
Product: OpenShift Online
Reporter: Sten Turpin <sten>
Component: Pod
Assignee: Abhishek Gupta <abhgupta>
Status: CLOSED CURRENTRELEASE
QA Contact: libra bugs <libra-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 2.x
CC: dmcphers, jhou, rchopra, twiest, xtian
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-08-07 22:55:29 UTC
Type: Bug

Description Sten Turpin 2013-07-17 15:45:26 UTC
Description of problem: oo-admin-chk level 1 times out


Version-Release number of selected component (if applicable): openshift-origin-broker-util-1.10.6-1.el6oso.noarch


How reproducible: always, in openshift.com production


Steps to Reproduce:
1. execute oo-admin-chk --level 1 on a node
2. wait 40-150 minutes

Actual results:

Stack trace: 

Started at: 2013-07-17 10:37:49 -0400
             /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/networking.rb:306:in `rescue in receive_message_on_socket': Operation failed with the following exception: Connection timed out (Mongo::ConnectionFailure)
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/networking.rb:298:in `receive_message_on_socket'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/networking.rb:159:in `receive_header'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/networking.rb:150:in `receive'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/networking.rb:117:in `receive_message'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/cursor.rb:529:in `send_get_more'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/cursor.rb:463:in `refresh'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/cursor.rb:124:in `next'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/cursor.rb:285:in `each'
        from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.10.7/lib/openshift/data_store.rb:23:in `block in find'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/collection.rb:276:in `find'
        from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.10.7/lib/openshift/data_store.rb:22:in `find'
        from /usr/sbin/oo-admin-chk:245:in `<main>'



Expected results:
oo-admin-chk level 1 report, in a more reasonable timeframe

Additional info:

Comment 1 Dan McPherson 2013-07-17 16:06:15 UTC
If you take away all the bloat inside the block you get:

>> require "/var/www/openshift/broker/config/environment"
=> true
>>
?> start_time = Time.now
=> 2013-07-17 09:45:00 -0400
>> apps = []
=> []
>> app_selection = {:fields => ["name", "uuid", "created_at", "domain_id", "group_instances.gears._id","group_instances.gears.uuid", "group_instances.gears.uid", "group_instances.gears.server_identity", "group_instances._id", "component_instances._id", "component_instances.cartridge_name", "component_instances.group_instance_id", "group_overrides", "app_ssh_keys.name", "app_ssh_keys.content"], :timeout => false}
=> {:fields=>["name", "uuid", "created_at", "domain_id", "group_instances.gears._id", "group_instances.gears.uuid", "group_instances.gears.uid", "group_instances.gears.server_identity", "group_instances._id", "component_instances._id", "component_instances.cartridge_name", "component_instances.group_instance_id", "group_overrides", "app_ssh_keys.name", "app_ssh_keys.content"], :timeout=>false}
>> app_query = {"group_instances.gears.0" => {"$exists" => true}}
=> {"group_instances.gears.0"=>{"$exists"=>true}}
>> OpenShift::DataStore.find(:applications, app_query, app_selection) do |app|
?>   apps << app
>> end
=> nil
>>
?> puts apps.size
149766
=> nil
>> puts Time.now - start_time
130.756245042
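
The session above measures wall-clock time with `Time.now - start_time`, which at an interactive prompt may also include time spent typing or pasting. A minimal sketch of the same measurement as a script, using Ruby's standard `Benchmark.realtime`. The `timed` helper is hypothetical, and the loop is only a stand-in for the `OpenShift::DataStore.find` call, so that the sketch runs without a broker environment:

```ruby
require "benchmark"

# Hypothetical helper: runs the block once and returns [result, elapsed_seconds].
def timed
  result = nil
  elapsed = Benchmark.realtime { result = yield }
  [result, elapsed]
end

# Stand-in for the data-store scan: collects items into an array the same way
# the `apps << app` block does. A real run would call
# OpenShift::DataStore.find(:applications, app_query, app_selection) here.
apps, seconds = timed do
  collected = []
  (1..1000).each { |doc| collected << doc }
  collected
end

puts "#{apps.size} documents in #{format('%.3f', seconds)}s"
```

Because only the block is timed, the figure excludes anything that happens between statements.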

Comment 2 Rajat Chopra 2013-07-17 17:37:16 UTC
About time we employ multiple threads/processes.
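
One way to read this suggestion, as a rough sketch only and not necessarily the change that was shipped: split the scan into contiguous slices and fetch each slice in its own thread. All names here (`DOCS`, `fetch_slice`, `partition`, `parallel_scan`) are hypothetical, and the in-memory array stands in for a data-store query restricted to one key range.

```ruby
# Stand-in data set. In the real tool each slice would be a separate
# data-store query restricted to one key range (an assumption for illustration).
DOCS = (1..10_000).map { |i| { "_id" => i, "name" => "app#{i}" } }.freeze

# Hypothetical: fetch one slice of the collection by index range.
def fetch_slice(range)
  DOCS[range]
end

# Split 0...total into at most `parts` contiguous, non-overlapping ranges.
def partition(total, parts)
  return [] if total.zero?
  size = (total.to_f / parts).ceil
  (0...total).step(size).map { |start| start...[start + size, total].min }
end

# Fetch every slice in its own thread and concatenate the results in order.
def parallel_scan(total, workers)
  threads = partition(total, workers).map do |range|
    Thread.new { fetch_slice(range) }
  end
  # Thread#value joins the thread and returns the block's result.
  threads.flat_map(&:value)
end

apps = parallel_scan(DOCS.size, 4)
puts apps.size
```

On MRI, threads mainly help when the work is I/O-bound, such as waiting on a database round trip. CPU-bound processing of the results would more likely need separate processes.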

Comment 4 Jianwei Hou 2013-07-31 02:46:00 UTC
Verified on devenv_3588

The time taken to query apps is greatly reduced. No timeout is reported when running oo-admin-chk at level 1.

irb(main):001:0> require "/var/www/openshift/broker/config/environment"
=> true
irb(main):002:0> start_time = Time.now
=> 2013-07-30 22:41:11 -0400
irb(main):003:0> apps = []
=> []
irb(main):004:0> app_selection = {:fields => ["name", "uuid", "created_at", "domain_id", "group_instances.gears._id","group_instances.gears.uuid", "group_instances.gears.uid", "group_instances.gears.server_identity", "group_instances._id", "component_instances._id", "component_instances.cartridge_name", "component_instances.group_instance_id", "group_overrides", "app_ssh_keys.name", "app_ssh_keys.content"], :timeout => false}
=> {:fields=>["name", "uuid", "created_at", "domain_id", "group_instances.gears._id", "group_instances.gears.uuid", "group_instances.gears.uid", "group_instances.gears.server_identity", "group_instances._id", "component_instances._id", "component_instances.cartridge_name", "component_instances.group_instance_id", "group_overrides", "app_ssh_keys.name", "app_ssh_keys.content"], :timeout=>false}
irb(main):005:0> app_query = {"group_instances.gears.0" => {"$exists" => true}}
=> {"group_instances.gears.0"=>{"$exists"=>true}}
irb(main):006:0> OpenShift::DataStore.find(:applications, app_query, app_selection) do |app|
irb(main):007:1* apps << app
irb(main):008:1> end
=> nil
irb(main):009:0> puts apps.size
2
=> nil
irb(main):010:0> puts Time.now - start_time
48.302894661

Comment 6 Abhishek Gupta 2013-07-31 18:24:28 UTC
Marking it as verified again. Will reopen if it's still a problem in PROD.

Comment 14 Jianwei Hou 2013-08-07 03:21:15 UTC
This bug is verified on devenv-stage_439