Bug 802653 - migration got failure for some apps
Product: OpenShift Origin
Classification: Red Hat
Component: Pod
Priority: high
Severity: medium
Assigned To: Rob Millner
QA Contact: libra bugs
Keywords: Triaged
Duplicates: 803213
Reported: 2012-03-13 04:01 EDT by Meng Bo
Modified: 2013-11-17 19:38 EST (History)

Doc Type: Bug Fix
Last Closed: 2012-03-19 14:22:50 EDT
Description Meng Bo 2012-03-13 04:01:35 EDT
Description of problem:
The app home directory changed as of devenv_1658, so migration on an older instance should move the existing apps to the new home. Instead, migration fails with node exit code 127.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Start an old instance.
2. Create any type of app.
3. Update the instance and run the migration.
Actual results:

Migrating app 'py1' with uuid '71d428e83ed74fc3b1692cfa0fa80351' on node 'ip-10-110-162-138' for user: bmeng+1@redhat.com
Failed to migrate with cmd: './migrate-2.0.7 --rhlogin 'bmeng+1@redhat.com' --migrate-app 'py1'' after 2 tries with exception: Failed migrating app. Rerun with: ./migrate-2.0.7 --rhlogin 'bmeng+1@redhat.com' --migrate-app 'py1'
./migrate-2.0.7:44:in `migrate_app'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `process_results_with_block'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:450:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:129:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `loop'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `req'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:123:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:446:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:257:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:193:in `method_missing'
./migrate-2.0.7:32:in `migrate_app'
/var/www/stickshift/broker/lib/express/broker/application_container_proxy.rb:607:in `rpc_exec'
./migrate-2.0.7:31:in `migrate_app'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
./migrate-2.0.7:30:in `migrate_app'
./migrate-2.0.7:449:in `migrate_from_file'
./migrate-2.0.7:447:in `each'
./migrate-2.0.7:447:in `migrate_from_file'
./migrate-2.0.7:559
Migrating app on node with: ./migrate-2.0.7 --rhlogin 'bmeng+1@redhat.com' --migrate-app 'py1'
Migrate on node output: Application not found to migrate: /var/lib/stickshift/71d428e83ed74fc3b1692cfa0fa80351

Migrate on node exit code: 127

Expected results:
Migration should complete without errors.

Additional info:
Comment 1 Meng Bo 2012-03-13 05:23:41 EDT
After manually moving the app directory to the new app home (/var/lib/stickshift), migration succeeded.
But when trying to access the migrated app via ssh, I got the following error:

[mengbo@localhost ~]$ ssh 71d428e83ed74fc3b1692cfa0fa80351@py1-bmeng1dev.dev.rhcloud.com
Traceback (most recent call last):
  File "/usr/bin/trap-user", line 110, in <module>
  File "/usr/bin/trap-user", line 52, in read_env_vars
    for env in os.listdir(envdir):
OSError: [Errno 2] No such file or directory: '/var/lib/libra/71d428e83ed74fc3b1692cfa0fa80351/.env/'
Connection to py1-bmeng1dev.dev.rhcloud.com closed.

Checking on the instance:
[root@ip-10-110-162-138 bin]# grep 71d428e83ed74fc3b1692cfa0fa80351 /etc/passwd
71d428e83ed74fc3b1692cfa0fa80351:x:503:503:libra guest:/var/lib/libra/71d428e83ed74fc3b1692cfa0fa80351:/usr/bin/trap-user
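
The passwd entry above still points the account's home at the old location, which is why trap-user looks for .env under /var/lib/libra. A minimal sketch of how such stale entries could be spotted, assuming standard seven-field passwd lines. The function name `stale_home_entries` and the sample text are hypothetical and not part of the migration tooling:

```python
# Old app home, as seen in the passwd entry quoted above.
OLD_HOME_BASE = "/var/lib/libra"


def stale_home_entries(passwd_text, old_base=OLD_HOME_BASE):
    """Return (user, home) pairs whose home directory is still under old_base."""
    stale = []
    for line in passwd_text.splitlines():
        # Skip blank lines and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(":")
        # A passwd entry has exactly seven colon-separated fields.
        if len(fields) != 7:
            continue
        user, home = fields[0], fields[5]
        if home == old_base or home.startswith(old_base + "/"):
            stale.append((user, home))
    return stale


# Sample input built from the entry in this comment plus an unrelated account.
sample = (
    "71d428e83ed74fc3b1692cfa0fa80351:x:503:503:libra guest:"
    "/var/lib/libra/71d428e83ed74fc3b1692cfa0fa80351:/usr/bin/trap-user\n"
    "root:x:0:0:root:/root:/bin/bash\n"
)

for user, home in stale_home_entries(sample):
    print(user, home)
```

On a node, the same check could be run against the contents of /etc/passwd instead of the sample string.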
Comment 2 Krishna Raman 2012-03-13 15:00:07 EDT
Migration script added.

Please follow the manual migration steps before running the migration script.
Comment 3 Meng Bo 2012-03-14 05:01:38 EDT
Checked on devenv_1661; migration still fails for the jenkins app.

The output is as follows:

Migrating app 'jenkins' with uuid 'eecb2be89d2a4ceb876a6c64ff991165' on node 'ip-10-114-33-152' for user: bmeng+1@redhat.com
Failed to migrate with cmd: './migrate-2.0.7 --rhlogin 'bmeng+1@redhat.com' --migrate-app 'jenkins'' after 2 tries with exception: Node execution failure (invalid exit code from node).  If the problem persists please contact Red Hat support.
/var/www/stickshift/broker/lib/express/broker/application_container_proxy.rb:894:in `run_cartridge_command'
./migrate-2.0.7:161:in `send'
./migrate-2.0.7:161:in `redeploy_httpd_proxy'
./migrate-2.0.7:47:in `migrate_app'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `process_results_with_block'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:450:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:129:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `loop'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `req'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:123:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:446:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:257:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:193:in `method_missing'
./migrate-2.0.7:32:in `migrate_app'
/var/www/stickshift/broker/lib/express/broker/application_container_proxy.rb:611:in `rpc_exec'
./migrate-2.0.7:31:in `migrate_app'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
./migrate-2.0.7:30:in `migrate_app'
./migrate-2.0.7:458:in `migrate_from_file'
./migrate-2.0.7:456:in `each'
./migrate-2.0.7:456:in `migrate_from_file'
./migrate-2.0.7:568
Migrating app on node with: ./migrate-2.0.7 --rhlogin 'bmeng+1@redhat.com' --migrate-app 'jenkins'
Comment 4 Meng Bo 2012-03-15 06:36:25 EDT
During migration, some apps migrate without error and others return
'Node Execution Failure'.
After migration, the successfully migrated apps can be visited via web but
cannot be controlled (start|stop|restart). The failed ones cannot be visited
via web and cannot be controlled either.
Comment 5 Rob Millner 2012-03-15 22:11:16 EDT
*** Bug 803213 has been marked as a duplicate of this bug. ***
Comment 6 Rob Millner 2012-03-15 22:16:17 EDT
We also ended up missing migration steps (updating /etc/passwd) in testing.  The additional node-specific steps should be scripted in the future.

Commits f29bd21 and f68088 fixed a variable parser issue which was also causing the above issue.

I can now migrate successfully by starting with devenv_stage-143, creating a bunch of apps, updating it (devenv sync), and following the migrate procedure.
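
The /etc/passwd update mentioned above could be scripted roughly as follows. This is only a sketch, not the actual migration step: `rewrite_home` is a hypothetical helper, the old base /var/lib/libra is taken from the passwd entry in comment 1, and the new base /var/lib/stickshift is assumed from the path in the node output in the description. It only transforms one line of text and does not touch any file.

```python
def rewrite_home(passwd_line, old_base="/var/lib/libra",
                 new_base="/var/lib/stickshift"):
    """Return passwd_line with its home field moved from old_base to new_base.

    Lines that are not seven-field passwd entries, or whose home is not
    under old_base, are returned unchanged (minus any trailing newline).
    """
    line = passwd_line.rstrip("\n")
    fields = line.split(":")
    if len(fields) != 7:
        return line
    home = fields[5]
    if home == old_base or home.startswith(old_base + "/"):
        # Keep the per-app suffix (the uuid directory) and swap only the base.
        fields[5] = new_base + home[len(old_base):]
    return ":".join(fields)


# Entry quoted in comment 1.
entry = (
    "71d428e83ed74fc3b1692cfa0fa80351:x:503:503:libra guest:"
    "/var/lib/libra/71d428e83ed74fc3b1692cfa0fa80351:/usr/bin/trap-user"
)
print(rewrite_home(entry))
```

On a real node, `usermod -d` is the usual way to change an account's home directory, rather than editing /etc/passwd text directly.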
Comment 7 Meng Bo 2012-03-16 04:30:15 EDT
Checked on devenv_stage_144 with the latest migration script and steps.
Migration works fine now.
Marking bug as verified.
Comment 8 Johnny Liu 2012-03-16 07:33:31 EDT
During migration, we found some issues with jenkins and metrics, so I am closing this bug and filing two separate bugs to track them.

Bug 804010 - jenkins push build will hang there after migration
Bug 804009 - got 404 error when visit metrics page after migration
