Bug 963156

Summary: Script error when moving a gear to another node
Product: OKD
Reporter: zhaozhanqi <zzhao>
Component: Containers
Assignee: Dan McPherson <dmcphers>
Status: CLOSED CURRENTRELEASE
QA Contact: libra bugs <libra-bugs>
Severity: high
Priority: high
Version: 2.x
CC: claudianus, dmcphers, xtian
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Doc Type: Bug Fix
Last Closed: 2013-06-11 03:58:07 UTC
Type: Bug
Attachments: log (flags: none)

Description zhaozhanqi 2013-05-15 09:37:16 UTC
Description of problem:
A script error occurs when moving a gear to another node.

Version-Release number of selected component (if applicable):
devenv_3227

How reproducible:
always

Steps to Reproduce:
1. Set up a multi-node environment (nodes 'a' and 'b').
2. Create a district and add nodes 'a' and 'b' to it.
3. Create one app.
4. Move the app to the other node.
  
Actual results:

[root@ip-10-62-23-236 ~]# oo-admin-move --gear_uuid 165155074000759201726464 -i ip-10-83-87-100
URL: http://jbosstest-zqd.dev.rhcloud.com
Login: zzhao
App UUID: 5193382b468a8587cb000122
Gear UUID: 5193382b468a8587cb000122
DEBUG: Source district uuid: 51933513468a8511f2000001
DEBUG: Destination district uuid: 51933513468a8511f2000001
DEBUG: Getting existing app 'jbosstest' status before moving
DEBUG: Error performing status on existing app on try 1: undefined method `[]' for nil:NilClass
DEBUG: Error performing status on existing app on try 2: undefined method `[]' for nil:NilClass
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.9.1/lib/openshift/mcollective_application_container_proxy.rb:2094:in `block in get_cart_status': undefined method `[]' for nil:NilClass (NoMethodError)
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.9.1/lib/openshift/mcollective_application_container_proxy.rb:2238:in `block in do_with_retry'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.9.1/lib/openshift/mcollective_application_container_proxy.rb:2236:in `each'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.9.1/lib/openshift/mcollective_application_container_proxy.rb:2236:in `do_with_retry'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.9.1/lib/openshift/mcollective_application_container_proxy.rb:2092:in `get_cart_status'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.9.1/lib/openshift/mcollective_application_container_proxy.rb:2068:in `get_app_status'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.9.1/lib/openshift/mcollective_application_container_proxy.rb:1794:in `move_gear'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.9.1/lib/openshift/mcollective_application_container_proxy.rb:1766:in `block in move_gear_secure'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.9.1/app/models/application.rb:1258:in `run_in_application_lock'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.9.1/lib/openshift/mcollective_application_container_proxy.rb:1765:in `move_gear_secure'
	from /usr/sbin/oo-admin-move:112:in `<main>'


Expected results:
The gear is moved to the other node successfully.

Additional info:

Comment 1 Xiaoli Tian 2013-05-15 10:37:46 UTC
*** Bug 963157 has been marked as a duplicate of this bug. ***

Comment 2 Dan McPherson 2013-05-15 23:59:41 UTC
Looks like the issue is this logic:

function send_stopped_status {
    _state=`get_app_state`

    case "$_state" in
      idle)     send_attr "status=ALREADY_IDLED" ;;
      stopped)  send_attr "status=ALREADY_STOPPED" ;;
      *)
          if [ -f $APP_DIR/run/stop_lock ]
          then
              if oo-frontend-check-idle --with-container-uuid "$uuid"
              then
                  send_attr "status=ALREADY_IDLED"
              else
                  send_attr "status=ALREADY_STOPPED"
              fi
          fi
          ;;
    esac
}



This logic is missing from v2.
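
For illustration only, the function quoted above can be exercised on its own by stubbing its helpers. This is a minimal sketch, not the actual node code: the bodies of get_app_state and send_attr, the check_idle function (standing in for oo-frontend-check-idle), and the APP_STATE and IDLE variables are all hypothetical stand-ins, and the real output format of send_attr is not shown in this report.

```shell
#!/bin/sh
# Hypothetical stand-ins for helpers that the real node scripts provide.
APP_STATE=started            # pretend gear state returned by get_app_state
IDLE=no                      # pretend result of the frontend idle check
APP_DIR=$(mktemp -d)         # scratch directory standing in for the gear dir

get_app_state() { echo "$APP_STATE"; }
send_attr()     { echo "$1"; }          # assumed: just print the attribute
check_idle()    { [ "$IDLE" = "yes" ]; } # stand-in for oo-frontend-check-idle

# Same branching as the function quoted in the comment above.
send_stopped_status() {
    _state=$(get_app_state)
    case "$_state" in
      idle)    send_attr "status=ALREADY_IDLED" ;;
      stopped) send_attr "status=ALREADY_STOPPED" ;;
      *)
          # Any other state: report a status only if a stop_lock exists.
          if [ -f "$APP_DIR/run/stop_lock" ]; then
              if check_idle; then
                  send_attr "status=ALREADY_IDLED"
              else
                  send_attr "status=ALREADY_STOPPED"
              fi
          fi
          ;;
    esac
}

APP_STATE=stopped; send_stopped_status   # prints status=ALREADY_STOPPED
APP_STATE=started; send_stopped_status   # prints nothing: no stop_lock yet
mkdir -p "$APP_DIR/run" && touch "$APP_DIR/run/stop_lock"
send_stopped_status                      # prints status=ALREADY_STOPPED
```

In this sketch a status attribute is emitted only for the idle and stopped states, or when a stop_lock file is present. A caller that expects a status attribute gets none in the remaining case, or when the function is absent altogether.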

Comment 4 Dan McPherson 2013-05-16 03:05:01 UTC
https://github.com/openshift/origin-server/pull/2500

Comment 5 zhaozhanqi 2013-05-16 07:22:41 UTC
Created attachment 748624
log

Comment 6 zhaozhanqi 2013-05-16 07:24:28 UTC
The issue can still be reproduced on devenv_3231.

The development.log is attached in comment 5.

Comment 7 Dan McPherson 2013-05-16 14:45:30 UTC
According to the PR, the change didn't go in until devenv_3232:

Online Merge Results: SUCCESS (https://ci.dev.openshift.redhat.com/jenkins/job/test_pull_requests/2590/) (Image: devenv_3232)

Comment 8 zhaozhanqi 2013-05-17 01:55:33 UTC
Verified this bug on devenv_3235.

[root@ip-10-152-186-180 ~]# oo-admin-move --gear_uuid 51958c94444fff0a52000001 -i ip-10-165-14-227
URL: http://zqphp-zqd.dev.rhcloud.com
Login: zzhao
App UUID: 51958c94444fffe43c000006
Gear UUID: 51958c94444fffe43c000006
DEBUG: Source district uuid: c81bb5dabe9311e28eed22000a98bab4
DEBUG: Destination district uuid: c81bb5dabe9311e28eed22000a98bab4
DEBUG: Getting existing app 'zqphp' status before moving
DEBUG: Gear component 'php-5.3' was running
DEBUG: Stopping existing app cartridge 'php-5.3' before moving
DEBUG: Force stopping existing app cartridge 'php-5.3' before moving
DEBUG: Creating new account for gear 'zqphp' on ip-10-165-14-227
DEBUG: Moving content for app 'zqphp', gear 'zqphp' to ip-10-165-14-227
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Warning: Permanently added '10.152.186.180' (RSA) to the list of known hosts.
Warning: Permanently added '10.165.14.227' (RSA) to the list of known hosts.
Agent pid 9941
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 9941 killed;
DEBUG: Moving system components for app 'zqphp', gear 'zqphp' to ip-10-165-14-227
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Agent pid 10002
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 10002 killed;
DEBUG: Starting cartridge 'php-5.3' in 'zqphp' after move on ip-10-165-14-227
DEBUG: Fixing DNS and mongo for gear 'zqphp' after move
DEBUG: Changing server identity of 'zqphp' from 'ip-10-152-186-180' to 'ip-10-165-14-227'
DEBUG: Deconfiguring old app 'zqphp' on ip-10-152-186-180 after move
Successfully moved gear with uuid '51958c94444fff0a52000001' of app 'zqphp' from 'ip-10-152-186-180' to 'ip-10-165-14-227'

Comment 9 claudianus 2013-05-24 17:20:02 UTC
I get the same error (see below) when moving MySQL from one node to another. I can move all the other gears without problems. I am using OSE 1.1. I also suspect that this applies to any embedded cartridge, such as the database cartridges.

App UUID: 65ddaa7339504b01acc7695bbd30cefe
Gear UUID: e7b66d9cf7d4465bac76bee1559c1355
DEBUG: Source district uuid: NONE
DEBUG: Destination district uuid: NONE
DEBUG: Getting existing app 'scaledphp' status before moving
DEBUG: Gear component 'php-5.3' was running
DEBUG: Stopping existing app cartridge 'mysql-5.1' before moving
DEBUG: Performing cartridge level pre-move for embedded mysql-5.1 for 'scaledphp' on node1.oseoloncadjaibr.csb
DEBUG: Performing cartridge level post-move for embedded mysql-5.1 for 'scaledphp' on node1.oseoloncadjaibr.csb
ERROR: Error performing cartridge level post-move for embedded mysql-5.1 for 'scaledphp' on node1.oseoloncadjaibr.csb: Node execution failure (invalid exit code from node).  If the problem persists please contact Red Hat support.
/usr/lib/ruby/gems/1.8/gems/openshift-origin-msg-broker-mcollective-1.0.5/lib/openshift-origin-msg-broker-mcollective/lib/openshift/mcollective_application_container_proxy.rb:1265:in `run_cartridge_command': Node execution failure (invalid exit code from node).  If the problem persists please contact Red Hat support. (OpenShift::NodeException)
        from /usr/lib/ruby/gems/1.8/gems/openshift-origin-msg-broker-mcollective-1.0.5/lib/openshift-origin-msg-broker-mcollective/lib/openshift/mcollective_application_container_proxy.rb:731:in `send'
        from /usr/lib/ruby/gems/1.8/gems/openshift-origin-msg-broker-mcollective-1.0.5/lib/openshift-origin-msg-broker-mcollective/lib/openshift/mcollective_application_container_proxy.rb:731:in `move_gear_pre'
        from /usr/lib/ruby/gems/1.8/gems/openshift-origin-msg-broker-mcollective-1.0.5/lib/openshift-origin-msg-broker-mcollective/lib/openshift/mcollective_application_container_proxy.rb:708:in `each'
        from /usr/lib/ruby/gems/1.8/gems/openshift-origin-msg-broker-mcollective-1.0.5/lib/openshift-origin-msg-broker-mcollective/lib/openshift/mcollective_application_container_proxy.rb:708:in `move_gear_pre'
        from /usr/lib/ruby/gems/1.8/gems/openshift-origin-msg-broker-mcollective-1.0.5/lib/openshift-origin-msg-broker-mcollective/lib/openshift/mcollective_application_container_proxy.rb:769:in `move_gear'
        from /usr/sbin/oo-admin-move:111