Bug 802653 - migration got failure for some apps
Summary: migration got failure for some apps
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Pod
Version: 1.x
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Rob Millner
QA Contact: libra bugs
URL:
Whiteboard:
Duplicates: 803213
Depends On:
Blocks:
Reported: 2012-03-13 08:01 UTC by Meng Bo
Modified: 2013-11-18 00:38 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-03-19 18:22:50 UTC
Target Upstream Version:
Embargoed:



Description Meng Bo 2012-03-13 08:01:35 UTC
Description of problem:
The app home changed as of devenv_1658, so migration on an older instance should move the existing apps to the new home. Instead, migration fails with node exit code 127.

Version-Release number of selected component (if applicable):
devenv_1659

How reproducible:
always

Steps to Reproduce:
1. Start an old instance
2. Create any type of app
3. Run the update and the migration
  
Actual results:

Migrating app 'py1' with uuid '71d428e83ed74fc3b1692cfa0fa80351' on node 'ip-10-110-162-138' for user: bmeng+1
Failed to migrate with cmd: './migrate-2.0.7 --rhlogin 'bmeng+1' --migrate-app 'py1'' after 2 tries with exception: Failed migrating app. Rerun with: ./migrate-2.0.7 --rhlogin 'bmeng+1' --migrate-app 'py1'
./migrate-2.0.7:44:in `migrate_app'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `process_results_with_block'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:450:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:129:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `loop'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `req'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:123:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:446:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:257:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:193:in `method_missing'
./migrate-2.0.7:32:in `migrate_app'
/var/www/stickshift/broker/lib/express/broker/application_container_proxy.rb:607:in `rpc_exec'
./migrate-2.0.7:31:in `migrate_app'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
./migrate-2.0.7:30:in `migrate_app'
./migrate-2.0.7:449:in `migrate_from_file'
./migrate-2.0.7:447:in `each'
./migrate-2.0.7:447:in `migrate_from_file'
./migrate-2.0.7:559
Output:
Migrating app on node with: ./migrate-2.0.7 --rhlogin 'bmeng+1' --migrate-app 'py1'
Migrate on node output: Application not found to migrate: /var/lib/stickshift/71d428e83ed74fc3b1692cfa0fa80351

Migrate on node exit code: 127
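
The "Application not found to migrate" message suggests the node-side script looked for the app only under the new home. As a rough illustration of the check involved, the sketch below reports which home directory actually holds an app's UUID directory. The function names are hypothetical, and the two paths are assumptions taken from the paths quoted in this report (/var/lib/stickshift from the migrate output, /var/lib/libra from the passwd entry); this is not the logic of migrate-2.0.7 itself.

```python
import os

# Assumed locations, inferred from paths quoted in this bug report.
OLD_HOME = "/var/lib/libra"
NEW_HOME = "/var/lib/stickshift"


def locate_app_home(uuid, homes=(NEW_HOME, OLD_HOME)):
    """Return the first home directory containing <uuid>/, or None."""
    for home in homes:
        if os.path.isdir(os.path.join(home, uuid)):
            return home
    return None


def needs_move(uuid, old_home=OLD_HOME, new_home=NEW_HOME):
    """True when the app directory exists only under the old home."""
    return locate_app_home(uuid, (new_home, old_home)) == old_home
```

A pre-flight check along these lines could flag apps that still need to be moved before the migration is attempted.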


Expected results:
Migration should complete without errors.

Additional info:

Comment 1 Meng Bo 2012-03-13 09:23:41 UTC
After manually moving the app directory to the new app home (/var/lib/stickshift), migration succeeded.
However, accessing the migrated app via ssh gives the following error:

[mengbo@localhost ~]$ ssh 71d428e83ed74fc3b1692cfa0fa80351.rhcloud.com
Traceback (most recent call last):
  File "/usr/bin/trap-user", line 110, in <module>
    read_env_vars()
  File "/usr/bin/trap-user", line 52, in read_env_vars
    for env in os.listdir(envdir):
OSError: [Errno 2] No such file or directory: '/var/lib/libra/71d428e83ed74fc3b1692cfa0fa80351/.env/'
Connection to py1-bmeng1dev.dev.rhcloud.com closed.

Checking on the instance:
[root@ip-10-110-162-138 bin]# grep 71d428e83ed74fc3b1692cfa0fa80351 /etc/passwd
71d428e83ed74fc3b1692cfa0fa80351:x:503:503:libra guest:/var/lib/libra/71d428e83ed74fc3b1692cfa0fa80351:/usr/bin/trap-user
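
The passwd entry above still points at the old home, which is why trap-user looks for .env/ under /var/lib/libra. A minimal sketch for spotting such entries follows; the function name is hypothetical, and the default old home is an assumption taken from the entry shown above.

```python
def stale_passwd_entries(passwd_text, old_home="/var/lib/libra"):
    """Yield (login, home) for passwd entries whose home is under old_home."""
    prefix = old_home.rstrip("/") + "/"
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # A passwd entry has 7 fields: login, password, uid, gid,
        # gecos, home directory, shell. Skip anything else.
        if len(fields) != 7:
            continue
        login, home = fields[0], fields[5]
        if home.startswith(prefix):
            yield login, home
```

Feeding it the contents of /etc/passwd would list every app user whose home directory has not been updated yet.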

Comment 2 Krishna Raman 2012-03-13 19:00:07 UTC
Migration script added.

Please follow manual migration steps before running migration script
https://engineering.redhat.com/trac/Libra/ticket/149

Comment 3 Meng Bo 2012-03-14 09:01:38 UTC
Checked on devenv_1661.
The migration failure still occurs for the jenkins app.

The output follows:

Migrating app 'jenkins' with uuid 'eecb2be89d2a4ceb876a6c64ff991165' on node 'ip-10-114-33-152' for user: bmeng+1
Failed to migrate with cmd: './migrate-2.0.7 --rhlogin 'bmeng+1' --migrate-app 'jenkins'' after 2 tries with exception: Node execution failure (invalid exit code from node).  If the problem persists please contact Red Hat support.
/var/www/stickshift/broker/lib/express/broker/application_container_proxy.rb:894:in `run_cartridge_command'
./migrate-2.0.7:161:in `send'
./migrate-2.0.7:161:in `redeploy_httpd_proxy'
./migrate-2.0.7:47:in `migrate_app'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `process_results_with_block'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:450:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:129:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `loop'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `req'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:123:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:446:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:257:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:193:in `method_missing'
./migrate-2.0.7:32:in `migrate_app'
/var/www/stickshift/broker/lib/express/broker/application_container_proxy.rb:611:in `rpc_exec'
./migrate-2.0.7:31:in `migrate_app'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
./migrate-2.0.7:30:in `migrate_app'
./migrate-2.0.7:458:in `migrate_from_file'
./migrate-2.0.7:456:in `each'
./migrate-2.0.7:456:in `migrate_from_file'
./migrate-2.0.7:568
Output:
Migrating app on node with: ./migrate-2.0.7 --rhlogin 'bmeng+1' --migrate-app 'jenkins'

Comment 4 Meng Bo 2012-03-15 10:36:25 UTC
During migration, some apps migrate without error while others
return 'Node Execution Failure'.
After migration, the successfully migrated apps can be visited via the web but
cannot be controlled (start|stop|restart). The failed ones can neither be
visited via the web nor be controlled.

Comment 5 Rob Millner 2012-03-16 02:11:16 UTC
*** Bug 803213 has been marked as a duplicate of this bug. ***

Comment 6 Rob Millner 2012-03-16 02:16:17 UTC
We also ended up missing migration steps (updating /etc/passwd) in testing.  The additional node-specific steps should be scripted in the future.

Commits f29bd21 and f68088 fixed a variable parser issue that was also causing the above failure.

I can now migrate successfully by starting with devenv_stage-143, creating a bunch of apps, updating the instance (devenv sync), and following the migrate procedure.
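
As an illustration of how the passwd update mentioned in this comment might be scripted, the sketch below rewrites the home-directory field of a single passwd line from the old home to the new one. The function name and both default paths are assumptions based on the paths quoted earlier in this report; it is not the script that was actually committed. On a live node, `usermod -d` is the safer way to change a user's home than editing the file directly.

```python
def rewrite_home(line, old_home="/var/lib/libra", new_home="/var/lib/stickshift"):
    """Return the passwd line (without trailing newline) with its home moved.

    Lines that are not 7-field passwd entries, or whose home is not under
    old_home, are returned unchanged.
    """
    stripped = line.rstrip("\n")
    fields = stripped.split(":")
    if len(fields) != 7:
        return stripped
    prefix = old_home.rstrip("/") + "/"
    if fields[5].startswith(prefix):
        # Keep the per-app suffix (the UUID) and swap only the parent directory.
        fields[5] = new_home.rstrip("/") + "/" + fields[5][len(prefix):]
    return ":".join(fields)
```

Only the sixth field is touched, so the login, uid, gid, and shell (/usr/bin/trap-user in the entry shown in Comment 1) are preserved.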

Comment 7 Meng Bo 2012-03-16 08:30:15 UTC
Checked on devenv_stage_144 with the latest migration script and steps.
Migration works fine now.
Marking the bug as verified.

Comment 8 Johnny Liu 2012-03-16 11:33:31 UTC
During migration, some issues with jenkins and metrics were found, so this bug is being closed and two new bugs have been filed to track them.

Bug 804010 - jenkins push build will hang there after migration
Bug 804009 - got 404 error when visit metrics page after migration

