Bug 1002919

Summary: Scalable apps fail to restart after upgrade
Product: OpenShift Container Platform Reporter: Johnny Liu <jialiu>
Component: Node Assignee: Brenton Leanhardt <bleanhar>
Status: CLOSED ERRATA QA Contact: libra bugs <libra-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 1.2.1 CC: baulakh, bleanhar, libra-onpremise-devel
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rubygem-openshift-origin-controller-1.9.16.2-1 Doc Type: Bug Fix
Doc Text:
Some applications failed to restart after upgrading from OpenShift Enterprise 1.1 to 1.2 because the web_proxy gear was not always the first gear in a gear group. This has been fixed and applications now restart correctly when upgrading OpenShift Enterprise.
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-09-25 15:30:52 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---

Description Johnny Liu 2013-08-30 08:52:37 UTC
Description of problem:
Some scalable applications fail to restart after the upgrade.

Version-Release number of selected component (if applicable):
Upgrade from ose-1.1.3 to ose-1.2.2

How reproducible:
Sometimes

Steps to Reproduce:
1. Set up an ose-1.1.3 environment.
2. Create all supported scalable applications, e.g. php, python, perl, ruby-1.8, ruby-1.9, jbossews-1.0, jbosseap-6.0.
3. Upgrade this environment from ose-1.1.3 to ose-1.2.2.
4. Try to restart these scalable applications.

Actual results:
The ruby-1.8, ruby-1.9 and jbosseap-6.0 applications fail to restart.

The mcollective log shows the following error:
I, [2013-08-30T04:37:09.777419 #6431]  INFO -- : openshift.rb:132:in `execute_parallel_action' execute_parallel_action call - [{:tag=>"", :gear=>"c8b10e1078b241bfb895d3e263e7fe26", :job=>{:cartridge=>"openshift-origin-node", :action=>"app-state-show", :args=>{"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"c8b10e1078b241bfb895d3e263e7fe26", "--with-container-name"=>"c8b10e1078", "--with-namespace"=>"jialiu", "--with-uid"=>1046, "--with-request-id"=>"68dc5f2e72485531b37e699f808113db"}}, :result_stdout=>"started", :result_stderr=>"", :result_exit_code=>0}, {:tag=>"", :gear=>"40fd34b98a3a4419be8f069ac645c372", :job=>{:cartridge=>"openshift-origin-node", :action=>"app-state-show", :args=>{"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"40fd34b98a3a4419be8f069ac645c372", "--with-container-name"=>"40fd34b98a", "--with-namespace"=>"jialiu", "--with-uid"=>1044, "--with-request-id"=>"68dc5f2e72485531b37e699f808113db"}}, :result_stdout=>"started", :result_stderr=>"", :result_exit_code=>0}, {:tag=>"", :gear=>"42f906444c5844b0ae33cec02f12d8db", :job=>{:cartridge=>"openshift-origin-node", :action=>"app-state-show", :args=>{"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-container-name"=>"ruby19scal", "--with-namespace"=>"jialiu", "--with-uid"=>1045, "--with-request-id"=>"68dc5f2e72485531b37e699f808113db"}}, :result_stdout=>"started", :result_stderr=>"", :result_exit_code=>0}]
I, [2013-08-30T04:41:57.960224 #6431]  INFO -- : openshift.rb:51:in `cartridge_do_action' cartridge_do_action call / action: cartridge_do, agent=openshift, data={:cartridge=>"mysql-5.1",
 :action=>"restart",
 :args=>
  {"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db",
   "--with-app-name"=>"ruby19scal",
   "--with-container-uuid"=>"c8b10e1078b241bfb895d3e263e7fe26",
   "--with-container-name"=>"c8b10e1078",
   "--with-namespace"=>"jialiu",
   "--with-uid"=>1046,
   "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b",
   "--cart-name"=>"mysql-5.1",
   "--component-name"=>"mysql-5.1",
   "--with-software-version"=>"",
   "--cartridge-vendor"=>""},
 :process_results=>true}

I, [2013-08-30T04:41:57.964089 #6431]  INFO -- : openshift.rb:52:in `cartridge_do_action' cartridge_do_action validation = mysql-5.1 restart {"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"c8b10e1078b241bfb895d3e263e7fe26", "--with-container-name"=>"c8b10e1078", "--with-namespace"=>"jialiu", "--with-uid"=>1046, "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b", "--cart-name"=>"mysql-5.1", "--component-name"=>"mysql-5.1", "--with-software-version"=>"", "--cartridge-vendor"=>""}
I, [2013-08-30T04:41:57.964671 #6431]  INFO -- : openshift.rb:91:in `execute_action' Executing action [restart] using method oo_restart with args [{"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"c8b10e1078b241bfb895d3e263e7fe26", "--with-container-name"=>"c8b10e1078", "--with-namespace"=>"jialiu", "--with-uid"=>1046, "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b", "--cart-name"=>"mysql-5.1", "--component-name"=>"mysql-5.1", "--with-software-version"=>"", "--cartridge-vendor"=>""}]
I, [2013-08-30T04:42:01.002235 #6431]  INFO -- : openshift.rb:100:in `execute_action' Finished executing action [restart] (0)
I, [2013-08-30T04:42:01.002559 #6431]  INFO -- : openshift.rb:71:in `cartridge_do_action' cartridge_do_action reply (0):
------

------)
I, [2013-08-30T04:42:01.095801 #6431]  INFO -- : openshift.rb:51:in `cartridge_do_action' cartridge_do_action call / action: cartridge_do, agent=openshift, data={:cartridge=>"ruby-1.9",
 :action=>"restart",
 :args=>
  {"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db",
   "--with-app-name"=>"ruby19scal",
   "--with-container-uuid"=>"40fd34b98a3a4419be8f069ac645c372",
   "--with-container-name"=>"40fd34b98a",
   "--with-namespace"=>"jialiu",
   "--with-uid"=>1044,
   "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b",
   "--cart-name"=>"ruby-1.9",
   "--component-name"=>"ruby-1.9",
   "--with-software-version"=>"",
   "--cartridge-vendor"=>""},
 :process_results=>true}

I, [2013-08-30T04:42:01.096157 #6431]  INFO -- : openshift.rb:52:in `cartridge_do_action' cartridge_do_action validation = ruby-1.9 restart {"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"40fd34b98a3a4419be8f069ac645c372", "--with-container-name"=>"40fd34b98a", "--with-namespace"=>"jialiu", "--with-uid"=>1044, "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b", "--cart-name"=>"ruby-1.9", "--component-name"=>"ruby-1.9", "--with-software-version"=>"", "--cartridge-vendor"=>""}
I, [2013-08-30T04:42:01.097378 #6431]  INFO -- : openshift.rb:91:in `execute_action' Executing action [restart] using method oo_restart with args [{"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"40fd34b98a3a4419be8f069ac645c372", "--with-container-name"=>"40fd34b98a", "--with-namespace"=>"jialiu", "--with-uid"=>1044, "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b", "--cart-name"=>"ruby-1.9", "--component-name"=>"ruby-1.9", "--with-software-version"=>"", "--cartridge-vendor"=>""}]
I, [2013-08-30T04:42:02.142080 #6431]  INFO -- : openshift.rb:100:in `execute_action' Finished executing action [restart] (0)
I, [2013-08-30T04:42:02.142435 #6431]  INFO -- : openshift.rb:71:in `cartridge_do_action' cartridge_do_action reply (0):
------
restarting Ruby cart

------)
I, [2013-08-30T04:42:02.232419 #6431]  INFO -- : openshift.rb:51:in `cartridge_do_action' cartridge_do_action call / action: cartridge_do, agent=openshift, data={:cartridge=>"ruby-1.9",
 :action=>"restart",
 :args=>
  {"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db",
   "--with-app-name"=>"ruby19scal",
   "--with-container-uuid"=>"42f906444c5844b0ae33cec02f12d8db",
   "--with-container-name"=>"ruby19scal",
   "--with-namespace"=>"jialiu",
   "--with-uid"=>1045,
   "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b",
   "--cart-name"=>"ruby-1.9",
   "--component-name"=>"ruby-1.9",
   "--with-software-version"=>"",
   "--cartridge-vendor"=>""},
 :process_results=>true}

I, [2013-08-30T04:42:02.232813 #6431]  INFO -- : openshift.rb:52:in `cartridge_do_action' cartridge_do_action validation = ruby-1.9 restart {"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-container-name"=>"ruby19scal", "--with-namespace"=>"jialiu", "--with-uid"=>1045, "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b", "--cart-name"=>"ruby-1.9", "--component-name"=>"ruby-1.9", "--with-software-version"=>"", "--cartridge-vendor"=>""}
I, [2013-08-30T04:42:02.233108 #6431]  INFO -- : openshift.rb:91:in `execute_action' Executing action [restart] using method oo_restart with args [{"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-container-name"=>"ruby19scal", "--with-namespace"=>"jialiu", "--with-uid"=>1045, "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b", "--cart-name"=>"ruby-1.9", "--component-name"=>"ruby-1.9", "--with-software-version"=>"", "--cartridge-vendor"=>""}]
I, [2013-08-30T04:42:03.192828 #6431]  INFO -- : openshift.rb:100:in `execute_action' Finished executing action [restart] (0)
I, [2013-08-30T04:42:03.193162 #6431]  INFO -- : openshift.rb:71:in `cartridge_do_action' cartridge_do_action reply (0):
------
restarting Ruby cart

------)
I, [2013-08-30T04:42:03.266899 #6431]  INFO -- : openshift.rb:51:in `cartridge_do_action' cartridge_do_action call / action: cartridge_do, agent=openshift, data={:cartridge=>"haproxy-1.4",
 :action=>"restart",
 :args=>
  {"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db",
   "--with-app-name"=>"ruby19scal",
   "--with-container-uuid"=>"40fd34b98a3a4419be8f069ac645c372",
   "--with-container-name"=>"40fd34b98a",
   "--with-namespace"=>"jialiu",
   "--with-uid"=>1044,
   "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b",
   "--cart-name"=>"haproxy-1.4",
   "--component-name"=>"web_proxy",
   "--with-software-version"=>"",
   "--cartridge-vendor"=>""},
 :process_results=>true}

I, [2013-08-30T04:42:03.267047 #6431]  INFO -- : openshift.rb:52:in `cartridge_do_action' cartridge_do_action validation = haproxy-1.4 restart {"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"40fd34b98a3a4419be8f069ac645c372", "--with-container-name"=>"40fd34b98a", "--with-namespace"=>"jialiu", "--with-uid"=>1044, "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b", "--cart-name"=>"haproxy-1.4", "--component-name"=>"web_proxy", "--with-software-version"=>"", "--cartridge-vendor"=>""}
I, [2013-08-30T04:42:03.267290 #6431]  INFO -- : openshift.rb:91:in `execute_action' Executing action [restart] using method oo_restart with args [{"--with-app-uuid"=>"42f906444c5844b0ae33cec02f12d8db", "--with-app-name"=>"ruby19scal", "--with-container-uuid"=>"40fd34b98a3a4419be8f069ac645c372", "--with-container-name"=>"40fd34b98a", "--with-namespace"=>"jialiu", "--with-uid"=>1044, "--with-request-id"=>"b5258d03aad29207aaa8740e853e3c4b", "--cart-name"=>"haproxy-1.4", "--component-name"=>"web_proxy", "--with-software-version"=>"", "--cartridge-vendor"=>""}]
E, [2013-08-30T04:42:03.271669 #6431] ERROR -- : openshift.rb:170:in `rescue in with_container_from_args' Failed to get cartridge 'haproxy-1.4' from  in gear 40fd34b98a3a4419be8f069ac645c372: Cartridge directory not found for haproxy-1.4
E, [2013-08-30T04:42:03.271770 #6431] ERROR -- : openshift.rb:171:in `rescue in with_container_from_args' /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.9.14.3/lib/openshift-origin-node/model/v2_cart_model.rb:145:in `rescue in get_cartridge'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.9.14.3/lib/openshift-origin-node/model/v2_cart_model.rb:138:in `get_cartridge'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.9.14.3/lib/openshift-origin-node/model/v2_cart_model.rb:1197:in `start_cartridge'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.9.14.3/lib/openshift-origin-node/model/application_container.rb:663:in `restart'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:675:in `block in oo_restart'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:166:in `with_container_from_args'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:674:in `oo_restart'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:93:in `execute_action'
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:65:in `cartridge_do_action'
/opt/rh/ruby193/root/usr/share/ruby/mcollective/rpc/agent.rb:86:in `handlemsg'
/opt/rh/ruby193/root/usr/share/ruby/mcollective/agents.rb:126:in `block (2 levels) in dispatch'
/opt/rh/ruby193/root/usr/share/ruby/timeout.rb:69:in `timeout'
/opt/rh/ruby193/root/usr/share/ruby/mcollective/agents.rb:125:in `block in dispatch'
I, [2013-08-30T04:42:03.271889 #6431]  INFO -- : openshift.rb:100:in `execute_action' Finished executing action [restart] (-1)
I, [2013-08-30T04:42:03.271991 #6431]  INFO -- : openshift.rb:73:in `cartridge_do_action' cartridge_do_action failed (-1)
------
Failed to get cartridge 'haproxy-1.4' from  in gear 40fd34b98a3a4419be8f069ac645c372: Cartridge directory not found for haproxy-1.4
------)
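
When scanning a long mcollective log for this failure, it can help to pull the cartridge name and gear UUID out of the error line. The following is only a minimal sketch: `FAILED_CART` and `parse_cartridge_failure` are hypothetical names that do not exist in OpenShift, and the pattern is derived solely from the single error message quoted above, so other message variants may not match.

```ruby
# Hypothetical helper, not part of OpenShift or mcollective.
# The pattern is based only on the one error line quoted in this report.
FAILED_CART = /Failed to get cartridge '(?<cartridge>[^']+)' from\s+in gear (?<gear>\h{32}): (?<reason>.+)\z/

# Returns a hash with the cartridge, gear UUID and reason, or nil if the
# line is not a "Failed to get cartridge" message.
def parse_cartridge_failure(line)
  m = FAILED_CART.match(line.chomp)
  return nil unless m
  { cartridge: m[:cartridge], gear: m[:gear], reason: m[:reason] }
end

line = "Failed to get cartridge 'haproxy-1.4' from  in gear " \
       "40fd34b98a3a4419be8f069ac645c372: Cartridge directory not found for haproxy-1.4"
p parse_cartridge_failure(line)
```

The gear UUID extracted this way can then be compared against the gear list reported by rhc.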

$ rhc app show -a ruby19scal --gears
Password: ******
ID                               State   Cartridges           Size  SSH URL
-------------------------------- ------- -------------------- ----- ----------------------------------------------------------
c8b10e1078b241bfb895d3e263e7fe26 started mysql-5.1            small c8b10e1078b241bfb895d3e263e7fe26.com
40fd34b98a3a4419be8f069ac645c372 started ruby-1.9 haproxy-1.4 small 40fd34b98a3a4419be8f069ac645c372.com
42f906444c5844b0ae33cec02f12d8db started ruby-1.9 haproxy-1.4 small 42f906444c5844b0ae33cec02f12d8db.com

The log shows OpenShift trying to restart the haproxy-1.4 cartridge on a web gear (40fd34b98a3a4419be8f069ac645c372) instead of on the head gear.
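
The mismatch can be illustrated with a small sketch. This is not the broker's actual code or the actual fix: `Gear`, `proxy_gear_by_position` and `proxy_gear_by_component` are hypothetical names, and the component placement in the sample data is an assumption for illustration. The log confirms only that haproxy-1.4 has no cartridge directory on gear 40fd34b98a3a4419be8f069ac645c372; placing web_proxy on 42f906444c5844b0ae33cec02f12d8db is assumed because that gear's UUID matches the app UUID.

```ruby
# Hypothetical model, not the OpenShift broker's real classes.
Gear = Struct.new(:uuid, :components)

# Fragile: assumes the proxy always lives on the first gear in the group.
def proxy_gear_by_position(gears)
  gears.first
end

# More robust: look for the gear that actually carries the component.
def proxy_gear_by_component(gears, component = "web_proxy")
  gears.find { |g| g.components.include?(component) }
end

# Assumed layout for illustration (see the note above about what the log confirms).
gears = [
  Gear.new("40fd34b98a3a4419be8f069ac645c372", ["ruby-1.9"]),
  Gear.new("42f906444c5844b0ae33cec02f12d8db", ["ruby-1.9", "web_proxy"]),
]

puts proxy_gear_by_position(gears).uuid
puts proxy_gear_by_component(gears).uuid
```

With this ordering, the positional lookup picks the web gear, which would reproduce the "Cartridge directory not found" failure, while the component lookup picks the gear that is assumed to host the proxy.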

Expected results:
Scalable applications should restart successfully.

Additional info:

Comment 5 Brenton Leanhardt 2013-09-02 19:32:42 UTC
For now the workaround for this bug is to scale the cartridge back down to one gear by setting min and max to 1.  Then scale back up.  Restarts will work after that.

I will keep looking into the issue, but for now I don't believe it is a regression.  I'm going to move it to the next release since it should not block the current fixes.

Comment 14 Johnny Liu 2013-09-13 05:43:24 UTC
Verified this bug using the 1.2/2013-09-12.1 puddle; it passes.

Comment 18 errata-xmlrpc 2013-09-25 15:30:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1275.html