Bug 1043414

Summary: Could not scale-up jbosseap application
Product: OpenShift Container Platform Reporter: Gaoyun Pei <gpei>
Component: ImageStreams Assignee: Brenton Leanhardt <bleanhar>
Status: CLOSED ERRATA QA Contact: libra bugs <libra-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 2.0.0 CC: bleanhar, gpei, kcleveng
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-05-15 14:40:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
mcollective.log none

Description Gaoyun Pei 2013-12-16 09:23:24 UTC
Description of problem:
Scaling up a jbosseap-6 app fails.
The issue can only be reproduced in my test environment, where it blocks the ose-2.0 performance testing.

Version-Release number of selected component (if applicable):
2.0/2013-11-26.1
openshift-origin-cartridge-jbosseap-2.11.1-2.el6op.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a scalable jbosseap-6 application, then try to scale it up.

[root@broker ~]# rhc cartridge scale -c jbosseap -a app1 --min 2
Using jbosseap-6 (JBoss Enterprise Application Platform 6.1.0) for 'jbosseap'
This operation will run until the application is at the minimum scale and may take several minutes.
Setting scale range for jbosseap-6 ... 
Unable to complete the requested operation due to: The server node.stress.com that your application is running on failed to respond in time.
This may be due to a system restart..
Reference ID: 537b720228b754532aab761a59fca5c7


Errors appear in mcollective.log; the full log of the scale-up attempt is in the attachment.
...
E, [2013-12-13T17:35:26.085081 #24344] ERROR -- : openshift.rb:310:in `rescue in with_container_from_args' Activation of new gears failed: 52aad3fe34c48c5f1a000064: Gear activation failed: Shell command '/sbin/runuser -s /bin/sh 52aad3c334c48c5f1a00003a -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c0,c1000' /bin/sh -c \"/usr/bin/oo-ssh 52aad3fe34c48c5f1a000064.com gear activate 8a1b0be4 --as-json --post-install --no-rotation\""' returned an error. rc=255
E, [2013-12-13T17:35:26.085284 #24344] ERROR -- : openshift.rb:311:in `rescue in with_container_from_args' /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.17.4/lib/openshift-origin-node/model/application_container_ext/cartridge_actions.rb:1381:in `update_cluster'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:998:in `block in oo_update_cluster'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:299:in `with_container_from_args'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:997:in `oo_update_cluster'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:139:in `execute_action'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:104:in `cartridge_do_action'
/opt/rh/ruby193/root/usr/share/ruby/mcollective/rpc/agent.rb:86:in `handlemsg'
/opt/rh/ruby193/root/usr/share/ruby/mcollective/agents.rb:126:in `block (2 levels) in dispatch'
/opt/rh/ruby193/root/usr/share/ruby/timeout.rb:69:in `timeout'
/opt/rh/ruby193/root/usr/share/ruby/mcollective/agents.rb:125:in `block in dispatch'


2. Because the failure looked like an mcollective timeout, I raised the timeout parameters, restarted the related services, and tried again:
ProxyTimeout 600 in /etc/httpd/conf.d/000002_openshift_origin_broker_proxy.conf
MCOLLECTIVE_TIMEOUT=600 in /etc/openshift/plugins.d/openshift-origin-msg-broker-mcollective.conf
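
To confirm that both values were actually picked up, the two settings can be read back from the file contents. This is only a minimal sketch: the function name parse_timeouts is hypothetical, and it assumes the proxy file uses an Apache-style "ProxyTimeout N" directive and the plugin file a KEY=VALUE line, as quoted above. It returns the first uncommented match in each text.

```python
import re


def parse_timeouts(proxy_conf: str, mcollective_conf: str) -> dict:
    """Extract the two timeout values (seconds) from config file text.

    Hypothetical helper; a missing or commented-out setting yields None.
    """
    result = {"ProxyTimeout": None, "MCOLLECTIVE_TIMEOUT": None}

    # Apache-style directive, e.g. "ProxyTimeout 600"
    match = re.search(r'^\s*ProxyTimeout\s+(\d+)\s*$', proxy_conf, re.MULTILINE)
    if match:
        result["ProxyTimeout"] = int(match.group(1))

    # KEY=VALUE assignment, e.g. MCOLLECTIVE_TIMEOUT=600 (optionally quoted)
    match = re.search(
        r'^\s*MCOLLECTIVE_TIMEOUT\s*=\s*"?(\d+)"?\s*$',
        mcollective_conf,
        re.MULTILINE,
    )
    if match:
        result["MCOLLECTIVE_TIMEOUT"] = int(match.group(1))

    return result
```

Passing the text of the two files named above should report 600 for both keys if the change is in place.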

[root@broker ~]# rhc cartridge scale -c jbosseap -a app1 --min 2
Using jbosseap-6 (JBoss Enterprise Application Platform 6.1.0) for 'jbosseap'
This operation will run until the application is at the minimum scale and may take several minutes.
Setting scale range for jbosseap-6 ... 
Unable to complete the requested operation due to: An invalid exit code (1) was returned from the server node.stress.com.  This indicates an
unexpected problem during the execution of your request..
Reference ID: c38f1ed0f96896e10ec3abdc35f206a7

mcollective.log shows:
E, [2013-12-13T17:49:59.893311 #1946] ERROR -- : openshift.rb:310:in `rescue in with_container_from_args' execution expired
E, [2013-12-13T17:49:59.895048 #1946] ERROR -- : openshift.rb:311:in `rescue in with_container_from_args' /opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:286:in `join'
/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:286:in `block in wait_for_threads'
/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:284:in `each'
/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:284:in `wait_for_threads'
/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:69:in `block in in_threads'
/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:324:in `kill_on_ctrl_c'
/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:69:in `in_threads'
/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:178:in `work_in_threads'
/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:104:in `map'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.17.4/lib/openshift-origin-node/model/application_container_ext/cartridge_actions.rb:1068:in `with_gear_rotation'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.17.4/lib/openshift-origin-node/model/application_container_ext/cartridge_actions.rb:795:in `activate'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.17.4/lib/openshift-origin-node/model/application_container_ext/cartridge_actions.rb:1375:in `update_cluster'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:998:in `block in oo_update_cluster'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:299:in `with_container_from_args'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:997:in `oo_update_cluster'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:139:in `execute_action'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:104:in `cartridge_do_action'
/opt/rh/ruby193/root/usr/share/ruby/mcollective/rpc/agent.rb:86:in `handlemsg'
/opt/rh/ruby193/root/usr/share/ruby/mcollective/agents.rb:126:in `block (2 levels) in dispatch'
/opt/rh/ruby193/root/usr/share/ruby/timeout.rb:69:in `timeout'
/opt/rh/ruby193/root/usr/share/ruby/mcollective/agents.rb:125:in `block in dispatch'
I, [2013-12-13T17:49:59.895302 #1946]  INFO -- : openshift.rb:150:in `execute_action' Finished executing action [update-cluster] (1)
I, [2013-12-13T17:49:59.918838 #1946]  INFO -- : openshift.rb:114:in `cartridge_do_action' cartridge_do_action failed (1)
------
execution expired
------)
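
For anyone going through the attached mcollective.log, the ERROR entries and their stack frames can be pulled out with a short script. The sketch below assumes every entry starts with the "E, [timestamp #pid] ERROR -- : message" layout seen in the excerpts above and that lines not matching it (such as backtrace frames) belong to the preceding entry. The names ENTRY, parse_entries and errors_only are hypothetical.

```python
import re
from typing import List

# Header layout as seen in the excerpts: "E, [2013-12-13T17:49:59.893311 #1946] ERROR -- : msg"
ENTRY = re.compile(
    r"^[A-Z], \[(?P<time>\S+) #(?P<pid>\d+)\]\s+(?P<level>[A-Z]+) -- [^:]*: (?P<message>.*)$"
)


def parse_entries(log_text: str) -> List[dict]:
    """Split log text into entries; non-header lines attach to the previous entry."""
    entries: List[dict] = []
    for line in log_text.splitlines():
        match = ENTRY.match(line)
        if match:
            entry = match.groupdict()
            entry["continuation"] = []  # backtrace frames and other follow-up lines
            entries.append(entry)
        elif entries and line.strip():
            entries[-1]["continuation"].append(line)
    return entries


def errors_only(entries: List[dict]) -> List[dict]:
    """Keep only entries logged at ERROR severity."""
    return [entry for entry in entries if entry["level"] == "ERROR"]
```

Running errors_only(parse_entries(text)) on the log text gives one dictionary per ERROR entry, with the timestamp, PID, message and any attached frames.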

Actual results:

Expected results:
The scalable jbosseap app can be scaled up.

Additional info:

Comment 1 Gaoyun Pei 2013-12-16 09:23:57 UTC
Created attachment 837160 [details]
mcollective.log

Comment 10 Gaoyun Pei 2014-03-25 06:48:22 UTC
I reinstalled the blade with ose-2.1-2014-03-21.4 and still could not create a scalable jbosseap app, which is really disappointing.


The client reports the same error:
[root@broker ~]# rhc app create app1 jbosseap -s
Using jbosseap-6 (JBoss Enterprise Application Platform 6) for 'jbosseap'

Application Options
-------------------
Domain:     1234
Cartridges: jbosseap-6
Gear Size:  default
Scaling:    yes

Creating application 'app1' ... 
Unable to complete the requested operation due to: The server node.stress.com that your application is running on failed to respond in time.
This may be due to a system restart.
Reference ID: 8ff93eb60851b8d9a651f1c0fb7ac0e0



Here are the errors logged in platform.log while creating this app:
...
March 25 02:33:00 INFO 5331230a34c48c8fb7000064 start against 'jbosseap'
March 25 02:33:34 INFO AdminGearsControl: initialized for gear(s) 5331230a34c48c8fb7000064
  AdminGearsControl: initialized with timeout 360s
  AdminGearsControl: initialized with 1 process per CPU
March 25 02:33:34 ERROR (24510) Stopping gear 5331230a34c48c8fb7000064 ... [ FAILED ]
  undefined method `values' for nil:NilClass
March 25 02:33:34 INFO 5331230a34c48c8fb7000064 start against 'haproxy'
March 25 02:33:34 INFO Shell command '/sbin/runuser -s /bin/sh 5331230a34c48c8fb7000064 -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c3,c526' /bin/sh -c \"set -e; /var/lib/openshift/5331230a34c48c8fb7000064/haproxy/bin/control start \""' ran. rc=0 out=HAProxy instance is started

March 25 02:33:34 INFO The file context database is being reloaded.
March 25 02:33:34 INFO 5331230a34c48c8fb7000064 start against 'jbosseap'
March 25 02:33:35 INFO Shell command '/sbin/runuser -s /bin/sh 5331230a34c48c8fb7000064 -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c3,c526' /bin/sh -c \"set -e; /var/lib/openshift/5331230a34c48c8fb7000064/jbosseap/bin/control start \""' ran. rc=0 out=Application is already running

March 25 02:33:35 INFO (24510) Starting gear 5331230a34c48c8fb7000064 ... [ OK ]
March 25 02:33:35 ERROR Gear: 5331230a34c48c8fb7000064 failed, Error: undefined method `values' for nil:NilClass
March 25 02:33:35 INFO Gear: 5331230a34c48c8fb7000064 failed, Exception: #<NoMethodError: undefined method `values' for nil:NilClass>
March 25 02:33:35 INFO Gear: 5331230a34c48c8fb7000064 failed, Exception: #<NoMethodError: undefined method `values' for nil:NilClass>
March 25 02:33:35 INFO Gear: 5331230a34c48c8fb7000064 failed, Backtrace: ["/opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/user_interaction.rb:237:in `method_missing'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.22.4/lib/openshift-origin-node/model/application_container_ext/cartridge_actions.rb:1646:in `update_proxy_status'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.22.4/lib/openshift-origin-node/model/application_container.rb:469:in `stop_gear'", "/usr/sbin/oo-admin-ctl-gears:177:in `block (2 levels) in restart'", "/usr/sbin/oo-admin-ctl-gears:115:in `block in gear_action'", "/opt/rh/ruby193/root/usr/share/ruby/timeout.rb:69:in `timeout'", "/usr/sbin/oo-admin-ctl-gears:114:in `gear_action'", "/usr/sbin/oo-admin-ctl-gears:177:in `block in restart'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:345:in `call'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:345:in `call_with_index'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:274:in `process_incoming_jobs'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:257:in `block in worker'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:250:in `fork'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:250:in `worker'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:238:in `block in create_workers'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:237:in `each'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:237:in `create_workers'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:201:in `work_in_processes'", "/opt/rh/ruby193/root/usr/share/gems/gems/parallel-0.8.0/lib/parallel.rb:106:in `map'", "/usr/sbin/oo-admin-ctl-gears:173:in `restart'", "/usr/sbin/oo-admin-ctl-gears:403:in `block (2 levels) in 
<main>'", "/opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'", "/opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'", "/opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:155:in `run'", "/opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:385:in `run_active_command'", "/opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:62:in `run!'", "/opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/delegates.rb:11:in `run!'", "/opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/import.rb:10:in `block in <top (required)>'"]
March 25 02:33:35 INFO Gear: 5331230a34c48c8fb7000064 output, undefined method `values' for nil:NilClass
March 25 02:33:35 INFO Gear: 5331230a34c48c8fb7000064 output, undefined method `values' for nil:NilClass
...
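
To see which gears fail and with what message, the "Gear: <id> failed, Error: ..." lines can be tallied. This is a rough sketch that assumes the platform.log line layout shown in the excerpt above; GEAR_ERROR and summarize_gear_errors are hypothetical names.

```python
import re
from collections import Counter

# Matches lines such as:
# "March 25 02:33:35 ERROR Gear: 5331230a34c48c8fb7000064 failed, Error: undefined method ..."
GEAR_ERROR = re.compile(
    r"ERROR Gear: (?P<gear>[0-9a-f]+) failed, Error: (?P<error>.*)$"
)


def summarize_gear_errors(log_text: str) -> Counter:
    """Count (gear id, error message) pairs found in platform.log text."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = GEAR_ERROR.search(line)
        if match:
            counts[(match.group("gear"), match.group("error"))] += 1
    return counts
```

On the excerpt above this would group the failure under the single gear 5331230a34c48c8fb7000064 with the NoMethodError message.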

Comment 13 Gaoyun Pei 2014-03-26 05:39:59 UTC
After increasing the configuration settings in /etc/openshift/resource_limits.conf as Comment 12 said, a scalable jbosseap app could be created and scaled up smoothly.
Moving this bug to VERIFIED.