Bug 1035120 - Fail to move scale app for all cartridges
Summary: Fail to move scale app for all cartridges
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Pod
Version: 2.x
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Assignee: Ravi Sankar
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-11-27 06:34 UTC by zhaozhanqi
Modified: 2015-05-15 00:23 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-02-26 19:08:25 UTC
Target Upstream Version:



Description zhaozhanqi 2013-11-27 06:34:53 UTC
Description of problem:
Create a scalable app, then move it to another node. The move fails.

Version-Release number of selected component (if applicable):
devenv_4076

How reproducible:
always

Steps to Reproduce:
1. Set up a multi-node environment
2. Create a scalable app
3. Move the app's gear to another node

Actual results:
oo-admin-move --gear_uuid 5295862324987b5aa00003e0 -i ip-10-100-198-140
URL: http://zqphps-zqd.dev.rhcloud.com
Login: zzhao
App UUID: 5295862324987b5aa00003e0
Gear UUID: 5295862324987b5aa00003e0
DEBUG: Source district uuid: 529563a624987b27b8000001
DEBUG: Destination district uuid: 529563a624987b27b8000001
DEBUG: Getting existing app 'zqphps' status before moving
DEBUG: Gear component 'php-5.3' was running
DEBUG: Stopping existing app cartridge 'php-5.3' before moving
DEBUG: Force stopping existing app cartridge 'php-5.3' before moving
DEBUG: Stopping existing app cartridge 'haproxy-1.4' before moving
DEBUG: Creating new account for gear 'zqphps' on ip-10-100-198-140
DEBUG: Moving content for app 'zqphps', gear 'zqphps' to ip-10-100-198-140
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Agent pid 27816
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 27816 killed;
DEBUG: Moving system components for app 'zqphps', gear 'zqphps' to ip-10-100-198-140
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Agent pid 27884
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 27884 killed;
DEBUG: Starting cartridge 'php-5.3' in 'zqphps' after move on ip-10-100-198-140
DEBUG: Starting cartridge 'haproxy-1.4' in 'zqphps' after move on ip-10-100-198-140
DEBUG: Fixing DNS and mongo for gear 'zqphps' after move
DEBUG: Changing server identity of 'zqphps' from 'ip-10-80-190-124' to 'ip-10-100-198-140'
DEBUG: Moving failed.  Rolling back gear 'zqphps' in 'zqphps' with delete on 'ip-10-100-198-140'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.18.0/lib/openshift/mcollective_application_container_proxy.rb:1975:in `block in move_gear': undefined local variable or method `gi_comps' for #<OpenShift::MCollectiveApplicationContainerProxy:0x00000008012e28> (NameError)
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.18.0/lib/openshift/mcollective_application_container_proxy.rb:1974:in `each'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.18.0/lib/openshift/mcollective_application_container_proxy.rb:1974:in `move_gear'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.18.0/lib/openshift/mcollective_application_container_proxy.rb:1895:in `block in move_gear_secure'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.18.0/app/models/application.rb:1577:in `run_in_application_lock'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.18.0/lib/openshift/mcollective_application_container_proxy.rb:1894:in `move_gear_secure'
	from /usr/sbin/oo-admin-move:112:in `<main>'

Expected results:

The gear should move successfully.

Additional info:

Comment 1 Abhishek Gupta 2013-11-27 17:40:24 UTC
Fixed with --> https://github.com/openshift/origin-server/pull/4267

A variable was renamed but the older variable name was missed in one place.
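
A minimal Ruby sketch of how a missed rename like this surfaces only at run time. The class, the method, and the `gear_comps` name are hypothetical stand-ins; only `gi_comps` comes from the trace above, and this is not the actual broker code.

```ruby
# Hypothetical stand-in for the proxy class; not the real broker code.
class MoveSketch
  # Assume the parameter was renamed to gear_comps (hypothetical new name),
  # but one use of the old name gi_comps was left behind inside the block.
  def start_order(cart_names, gear_comps)
    started = []
    cart_names.each do |cart|
      # Stale reference: Ruby resolves this name only when the line executes.
      started << cart if gi_comps.include?(cart)
    end
    started
  end
end

# Runs the sketch and reports a NameError instead of letting it propagate.
def try_move(cart_names)
  MoveSketch.new.start_order(cart_names, cart_names)
rescue NameError => e
  "NameError: #{e.name}"
end

puts try_move([]).inspect            # block never runs, so nothing is raised
puts try_move(['php-5.3']).inspect   # the stale name is evaluated and raises
```

Because the file still parses and loads, a missed rename of this kind stays hidden until a code path reaches the stale line, which is consistent with the move failing late, inside `move_gear`.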

Comment 4 zhaozhanqi 2013-11-28 04:53:03 UTC
Tested on devenv_4081; moving a scalable app with a DB cartridge still fails. For example:

1) Create a scalable app
2) Add the MongoDB cartridge to the app
3) Move the app

 oo-admin-move --gear_uuid 5296c0cf4e6278e300000304 -i ip-10-242-83-231
URL: http://setherpad-zqd.dev.rhcloud.com
Login: zzhao
App UUID: 5296c0cf4e6278e300000304
Gear UUID: 5296c0cf4e6278e300000304
DEBUG: Source district uuid: 43df785e57dd11e3b05b12313b014c19
DEBUG: Destination district uuid: 43df785e57dd11e3b05b12313b014c19
DEBUG: Getting existing app 'setherpad' status before moving
DEBUG: Gear component 'nodejs-0.6' was running
DEBUG: Stopping existing app cartridge 'nodejs-0.6' before moving
DEBUG: Force stopping existing app cartridge 'nodejs-0.6' before moving
DEBUG: Stopping existing app cartridge 'haproxy-1.4' before moving
DEBUG: Creating new account for gear 'setherpad' on ip-10-242-83-231
DEBUG: Moving content for app 'setherpad', gear 'setherpad' to ip-10-242-83-231
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Agent pid 9528
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 9528 killed;
DEBUG: Moving system components for app 'setherpad', gear 'setherpad' to ip-10-242-83-231
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Agent pid 9590
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 9590 killed;
DEBUG: Starting cartridge 'nodejs-0.6' in 'setherpad' after move on ip-10-242-83-231
DEBUG: Starting cartridge 'haproxy-1.4' in 'setherpad' after move on ip-10-242-83-231
DEBUG: Starting cartridge 'mongodb-2.2' in 'setherpad' after move on ip-10-242-83-231
DEBUG: Moving failed.  Rolling back gear 'setherpad' in 'setherpad' with delete on 'ip-10-242-83-231'
An invalid exit code (1) was returned from the server ip-10-242-83-231.  This indicates an unexpected problem during the execution of your request.


development.log

2013-11-27 23:30:55.400 [DEBUG] DEBUG: rpc_client.custom_request('cartridge_do', {:cartridge=>"haproxy-1.4", :action=>"start", :args=>{"--with-app-uuid"=>"5296c0cf4e6278e300000304", "--with-app-name"=>"setherpad", "--with-container-uuid"=>"5296c0cf4e6278e300000304", "--with-container-name"=>"setherpad", "--with-namespace"=>"zqd", "--with-uid"=>5696, "--with-request-id"=>nil, "--cart-name"=>"haproxy-1.4", "--component-name"=>"web_proxy", "--with-software-version"=>"1.4", "--cartridge-vendor"=>"redhat"}}, ip-10-242-83-231, {'identity' => ip-10-242-83-231}) (Request ID: ) (pid:9083)
2013-11-27 23:30:57.846 [DEBUG] DEBUG: [#<MCollective::RPC::Result:0x000000067d15f8 @agent="openshift", @action="cartridge_do", @results={:sender=>"ip-10-242-83-231", :statuscode=>0, :statusmsg=>"OK", :data=>{:time=>nil, :output=>"HAProxy instance is started\n", :exitcode=>0, :addtl_params=>nil}}>] (Request ID: ) (pid:9083)
2013-11-27 23:30:57.847 [DEBUG] DEBUG: MCollective Response Time (execute_direct: start): 2.447125069s  (Request ID: ) (pid:9083)
2013-11-27 23:30:57.847 [DEBUG] DEBUG: Starting cartridge 'mongodb-2.2' in 'setherpad' after move on ip-10-242-83-231 (pid:9083)
2013-11-27 23:30:57.849 [DEBUG] DEBUG: rpc_client.custom_request('cartridge_do', {:cartridge=>"mongodb-2.2", :action=>"start", :args=>{"--with-app-uuid"=>"5296c0cf4e6278e300000304", "--with-app-name"=>"setherpad", "--with-container-uuid"=>"5296c0cf4e6278e300000304", "--with-container-name"=>"setherpad", "--with-namespace"=>"zqd", "--with-uid"=>5696, "--with-request-id"=>nil, "--cart-name"=>"mongodb-2.2", "--component-name"=>"mongodb-2.2", "--with-software-version"=>"2.2", "--cartridge-vendor"=>"redhat"}}, ip-10-242-83-231, {'identity' => ip-10-242-83-231}) (Request ID: ) (pid:9083)
2013-11-27 23:30:57.995 [DEBUG] DEBUG: [#<MCollective::RPC::Result:0x00000006fe02a8 @agent="openshift", @action="cartridge_do", @results={:sender=>"ip-10-242-83-231", :statuscode=>1, :statusmsg=>"cartridge_do_action failed 1. Output Failed to get cartridge 'mongodb-2.2' from  in gear 5296c0cf4e6278e300000304: Cartridge directory not found for mongodb-2.2", :data=>{:time=>nil, :output=>"Failed to get cartridge 'mongodb-2.2' from  in gear 5296c0cf4e6278e300000304: Cartridge directory not found for mongodb-2.2", :exitcode=>1, :addtl_params=>nil}}>] (Request ID: ) (pid:9083)
2013-11-27 23:30:57.995 [DEBUG] DEBUG: MCollective Response Time (execute_direct: start): 0.146339524s  (Request ID: ) (pid:9083)
2013-11-27 23:30:57.995 [DEBUG] DEBUG: server results: Failed to get cartridge 'mongodb-2.2' from  in gear 5296c0cf4e6278e300000304: Cartridge directory not found for mongodb-2.2 (pid:9083)
2013-11-27 23:30:58.669 [DEBUG] DEBUG: Moving failed.  Rolling back gear 'setherpad' in 'setherpad' with delete on 'ip-10-242-83-231' (pid:9083)

Comment 5 Meng Bo 2013-12-03 08:32:17 UTC
2013-12-03 03:29:23.850 [DEBUG] DEBUG: [#<MCollective::RPC::Result:0x00000006ba5c60 @agent="openshift", @action="cartridge_do", @results={:sender=>"ip-10-178-24-29", :statuscode=>1, :statusmsg=>"cartridge_do_action failed 1. Output Failed to get cartridge 'haproxy-1.4' from  in gear ba233b0e5bed11e3981022000ab31424: Cartridge directory not found for haproxy-1.4", :data=>{:time=>nil, :output=>"Failed to get cartridge 'haproxy-1.4' from  in gear ba233b0e5bed11e3981022000ab31424: Cartridge directory not found for haproxy-1.4", :exitcode=>1, :addtl_params=>nil}}>] (Request ID: ) (pid:11001)
2013-12-03 03:29:23.851 [DEBUG] DEBUG: MCollective Response Time (execute_direct: start): 0.180535092s  (Request ID: ) (pid:11001)
2013-12-03 03:29:23.852 [DEBUG] DEBUG: server results: Failed to get cartridge 'haproxy-1.4' from  in gear ba233b0e5bed11e3981022000ab31424: Cartridge directory not found for haproxy-1.4 (pid:11001)
2013-12-03 03:29:23.928 [DEBUG] DEBUG: Moving failed.  Rolling back gear 'ba233b0e5bed11e3981022000ab31424' in 'perl1s' with delete on 'ip-10-178-24-29' (pid:11001)


Reproduced when moving a scaled-up web gear of a scalable app.

Comment 6 Meng Bo 2013-12-03 09:14:36 UTC
This blocks most move-related testing for scalable apps.

Comment 8 zhaozhanqi 2013-12-05 02:52:12 UTC
Tested on devenv_4097; the bug still reproduces.


MCollective::RPC::Result:0x000000071051b0 @agent="openshift", @action="cartridge_do", @results={:sender=>"ip-10-152-143-223", :statuscode=>1, :statusmsg=>"cartridge_do_action failed 1. Output Failed to get cartridge 'mysql-5.1' from  in gear 529fe354b36e97cf2d000007: Cartridge directory not found for mysql-5.1", :data=>{:time=>nil, :output=>"Failed to get cartridge 'mysql-5.1' from  in gear 529fe354b36e97cf2d000007: Cartridge directory not found for mysql-5.1", :exitcode=>1, :addtl_params=>nil}}>] (Request ID: ) (pid:8770)
2013-12-04 21:25:28.481 [DEBUG] DEBUG: MCollective Response Time (execute_direct: start): 0.108789375s  (Request ID: ) (pid:8770)
2013-12-04 21:25:28.482 [DEBUG] DEBUG: server results: Failed to get cartridge 'mysql-5.1' from  in gear 529fe354b36e97cf2d000007: Cartridge directory not found for mysql-5.1 (pid:8770)
2013-12-04 21:25:28.552 [DEBUG] DEBUG: Moving failed.  Rolling back gear 'zqphps' in 'zqphps' with delete on 'ip-10-152-143-223' (pid:8770)
2013-12-04 21:25:28.554 [DEBUG] DEBUG: rpc_client.custom_request('cartridge_do', {:cartridge=>"openshift-origin-node", :action=>"app-destroy", :args=>{"--with-app-uuid"=>"529fe354b36e97cf2d000007", "--with-app-name"=>"zqphps", "--with-container-uuid"=>"529fe354b36e97cf2d000007", "--with-container-name"=>"zqphps", "--with-namespace"=>"zqd", "--with-uid"=>5082, "--with-request-id"=>nil, "--skip-hooks"=>true, "--cart-name"=>"openshift-origin-node"}}, ip-10-152-143-223, {'identity' => ip-10-152-143-223}) (Request ID: ) (pid:8770)
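
The same "Cartridge directory not found" message appears in each log excerpt above (mongodb-2.2, haproxy-1.4, mysql-5.1). A small Ruby sketch for pulling the cartridge and gear pairs out of a broker log follows. The regular expression is derived only from the lines quoted in this report, and `missing_carts` is a hypothetical helper name, so treat it as an illustration and not a supported tool.

```ruby
# Pattern taken from the "server results" lines quoted in this report.
# \h matches a hexadecimal digit in Ruby regular expressions.
MISSING_CART = /Failed to get cartridge '([^']+)' from\s+in gear (\h+): Cartridge directory not found/

# Returns unique [cartridge, gear_uuid] pairs found in the given log text.
# The message is logged several times per failure, hence the uniq.
def missing_carts(log_text)
  log_text.scan(MISSING_CART).uniq
end

sample = "DEBUG: server results: Failed to get cartridge 'mysql-5.1' from  in gear " \
         "529fe354b36e97cf2d000007: Cartridge directory not found for mysql-5.1 (pid:8770)"
missing_carts(sample).each do |cart, gear|
  puts "start issued for #{cart} on gear #{gear}, which has no such cartridge directory"
end
```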

Comment 9 Ravi Sankar 2013-12-05 19:29:24 UTC
Fixed in https://github.com/openshift/origin-server/pull/4293

Comment 10 openshift-github-bot 2013-12-06 01:13:20 UTC
Commits pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/d93394511657b4b3b117e54c965ad88e32893243
Bug 1035120 - Fix oo-admin-move: Don't issue start on carts that does not belong to the gear

https://github.com/openshift/origin-server/commit/e7be8b0bdca0ac7b4778c93b445a1b8b74b25ff1
Merge pull request #4293 from pravisankar/dev/ravi/bug1035120

Merged by openshift-bot
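
A minimal Ruby sketch of the idea named in the commit title: issue a start only for cartridges that actually live on the gear being moved. The data shapes and the `carts_to_start` helper are hypothetical and are not taken from the merged change; the assumption that the DB cartridge sits on a separate gear is inferred from the "Cartridge directory not found" errors above.

```ruby
# Hypothetical helper, not the broker's real model or the merged fix.
# app_carts:  every cartridge in the application, in start order.
# gear_carts: cartridges actually installed on the gear being moved.
def carts_to_start(app_carts, gear_carts)
  # Keep application order, but drop anything the gear does not host.
  app_carts.select { |cart| gear_carts.include?(cart) }
end

app_carts = ['nodejs-0.6', 'haproxy-1.4', 'mongodb-2.2']
head_gear = ['nodejs-0.6', 'haproxy-1.4'] # assumption: mongodb-2.2 is on its own gear

carts_to_start(app_carts, head_gear).each do |cart|
  puts "would start #{cart}"
end
```

With this filtering, the failed start request for 'mongodb-2.2' seen in Comment 4 would never be sent to the destination node for that gear.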

Comment 11 zhaozhanqi 2013-12-06 04:54:21 UTC
Tested on devenv_4102, moving both within a district and across districts; both work. Marking this bug as 'verified'.

oo-admin-move --gear_uuid 52a13d05cc8e79612900006f -i ip-10-138-21-213
URL: http://zqphps-zqd.dev.rhcloud.com
Login: zzhao+1
App UUID: 52a13d05cc8e79612900006f
Gear UUID: 52a13d05cc8e79612900006f
DEBUG: Source district uuid: 52a13766cc8e791c1e000001
DEBUG: Destination district uuid: 52a13766cc8e791c1e000001
DEBUG: Getting existing app 'zqphps' status before moving
DEBUG: Gear component 'php-5.3' was running
DEBUG: Stopping existing app cartridge 'php-5.3' before moving
DEBUG: Force stopping existing app cartridge 'php-5.3' before moving
DEBUG: Stopping existing app cartridge 'haproxy-1.4' before moving
DEBUG: Creating new account for gear 'zqphps' on ip-10-138-21-213
DEBUG: Moving content for app 'zqphps', gear 'zqphps' to ip-10-138-21-213
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Agent pid 29612
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 29612 killed;
DEBUG: Moving system components for app 'zqphps', gear 'zqphps' to ip-10-138-21-213
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Agent pid 29673
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 29673 killed;
DEBUG: Starting cartridge 'php-5.3' in 'zqphps' after move on ip-10-138-21-213
DEBUG: Starting cartridge 'haproxy-1.4' in 'zqphps' after move on ip-10-138-21-213
DEBUG: Fixing DNS and mongo for gear 'zqphps' after move
DEBUG: Changing server identity of 'zqphps' from 'ip-10-236-185-92' to 'ip-10-138-21-213'
DEBUG: Deconfiguring old app 'zqphps' on ip-10-236-185-92 after move
Successfully moved gear with uuid '52a13d05cc8e79612900006f' of app 'zqphps' from 'ip-10-236-185-92' to 'ip-10-138-21-213'

