Description of problem:
The app home changed as of devenv_1658, so migration on an older instance should move the old apps to the new home. Instead, migration fails with node exit code 127.

Version-Release number of selected component (if applicable):
devenv_1659

How reproducible:
always

Steps to Reproduce:
1. Start an old instance.
2. Create an app of any type.
3. Update the instance and run the migration.

Actual results:
Migrating app 'py1' with uuid '71d428e83ed74fc3b1692cfa0fa80351' on node 'ip-10-110-162-138' for user: bmeng+1
Failed to migrate with cmd: './migrate-2.0.7 --rhlogin 'bmeng+1' --migrate-app 'py1'' after 2 tries with exception: Failed migrating app. Rerun with: ./migrate-2.0.7 --rhlogin 'bmeng+1' --migrate-app 'py1'
./migrate-2.0.7:44:in `migrate_app'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `process_results_with_block'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:450:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:129:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `loop'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `req'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:123:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:446:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:257:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:193:in `method_missing'
./migrate-2.0.7:32:in `migrate_app'
/var/www/stickshift/broker/lib/express/broker/application_container_proxy.rb:607:in `rpc_exec'
./migrate-2.0.7:31:in `migrate_app'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
./migrate-2.0.7:30:in `migrate_app'
./migrate-2.0.7:449:in `migrate_from_file'
./migrate-2.0.7:447:in `each'
./migrate-2.0.7:447:in `migrate_from_file'
./migrate-2.0.7:559
Output:
Migrating app on node with: ./migrate-2.0.7 --rhlogin 'bmeng+1' --migrate-app 'py1'
Migrate on node output: Application not found to migrate: /var/lib/stickshift/71d428e83ed74fc3b1692cfa0fa80351
Migrate on node exit code: 127

Expected results:
Migration should complete without errors.

Additional info:
After manually moving the app directory to the new app home (/var/lib/stickshift), migration succeeded. However, accessing the migrated app over ssh fails with the following error:

[mengbo@localhost ~]$ ssh 71d428e83ed74fc3b1692cfa0fa80351.rhcloud.com
Traceback (most recent call last):
  File "/usr/bin/trap-user", line 110, in <module>
    read_env_vars()
  File "/usr/bin/trap-user", line 52, in read_env_vars
    for env in os.listdir(envdir):
OSError: [Errno 2] No such file or directory: '/var/lib/libra/71d428e83ed74fc3b1692cfa0fa80351/.env/'
Connection to py1-bmeng1dev.dev.rhcloud.com closed.

Checking on the instance shows that the passwd entry for the app user still points to the old home under /var/lib/libra:

[root@ip-10-110-162-138 bin]# grep 71d428e83ed74fc3b1692cfa0fa80351 /etc/passwd
71d428e83ed74fc3b1692cfa0fa80351:x:503:503:libra guest:/var/lib/libra/71d428e83ed74fc3b1692cfa0fa80351:/usr/bin/trap-user
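To check a node for other accounts in the same state, something like the following sketch could be used. It only parses passwd-style text and reports entries whose home directory is still under the old root. The function name stale_homes and the sample data are hypothetical, and the old root /var/lib/libra is taken from the passwd entry above; this is not part of the migration script.

```python
OLD_ROOT = "/var/lib/libra"  # old app home, as seen in the passwd entry above


def stale_homes(passwd_text, old_root=OLD_ROOT):
    """Return (user, home) pairs whose home directory is still under old_root."""
    stale = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) != 7:
            continue  # skip blank or malformed lines
        user, home = fields[0], fields[5]
        if home == old_root or home.startswith(old_root + "/"):
            stale.append((user, home))
    return stale


# Hypothetical sample: one system account and the app user from this report.
sample = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "71d428e83ed74fc3b1692cfa0fa80351:x:503:503:libra guest:"
    "/var/lib/libra/71d428e83ed74fc3b1692cfa0fa80351:/usr/bin/trap-user\n"
)

for user, home in stale_homes(sample):
    print(user, home)
```

On a real node the input would be the contents of /etc/passwd, read with open("/etc/passwd").read().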
Migration script added. Please follow the manual migration steps before running the migration script: https://engineering.redhat.com/trac/Libra/ticket/149
Checked on devenv_1661; the migration failure still exists for the jenkins app. The output is as follows:

Migrating app 'jenkins' with uuid 'eecb2be89d2a4ceb876a6c64ff991165' on node 'ip-10-114-33-152' for user: bmeng+1
Failed to migrate with cmd: './migrate-2.0.7 --rhlogin 'bmeng+1' --migrate-app 'jenkins'' after 2 tries with exception: Node execution failure (invalid exit code from node). If the problem persists please contact Red Hat support.
/var/www/stickshift/broker/lib/express/broker/application_container_proxy.rb:894:in `run_cartridge_command'
./migrate-2.0.7:161:in `send'
./migrate-2.0.7:161:in `redeploy_httpd_proxy'
./migrate-2.0.7:47:in `migrate_app'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:258:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `call'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:511:in `process_results_with_block'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:450:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:129:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `loop'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:124:in `req'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
/usr/lib/ruby/site_ruby/1.8/mcollective/client.rb:123:in `req'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:446:in `call_agent'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:257:in `custom_request'
/usr/lib/ruby/site_ruby/1.8/mcollective/rpc/client.rb:193:in `method_missing'
./migrate-2.0.7:32:in `migrate_app'
/var/www/stickshift/broker/lib/express/broker/application_container_proxy.rb:611:in `rpc_exec'
./migrate-2.0.7:31:in `migrate_app'
/usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
./migrate-2.0.7:30:in `migrate_app'
./migrate-2.0.7:458:in `migrate_from_file'
./migrate-2.0.7:456:in `each'
./migrate-2.0.7:456:in `migrate_from_file'
./migrate-2.0.7:568
Output:
Migrating app on node with: ./migrate-2.0.7 --rhlogin 'bmeng+1' --migrate-app 'jenkins'
During migration, some apps migrate without error and others return 'Node Execution Failure'. After migration, the successfully migrated apps can be visited via the web but cannot be controlled (start|stop|restart). The failed ones can neither be visited via the web nor be controlled.
*** Bug 803213 has been marked as a duplicate of this bug. ***
We ended up missing migration steps as well (updating /etc/passwd) in testing. The additional node-specific steps should be scripted in the future. Commits f29bd21 and f68088 fixed a variable parser issue which was also causing the above issue. I can now migrate successfully by starting with devenv_stage-143, creating a bunch of apps, updating the instance (devenv sync), and following the migration procedure.
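As a rough illustration of how the /etc/passwd update could be scripted, here is a minimal sketch. It rewrites the home directory field of app users from the old root to the new one and returns the new text without touching the disk. The function name rewrite_homes is hypothetical; the old root /var/lib/libra and the trap-user shell come from the passwd entry quoted earlier in this bug, and the new root /var/lib/stickshift is assumed from the path in the migrate output. It is not the procedure from the migration ticket.

```python
def rewrite_homes(passwd_text,
                  old_root="/var/lib/libra",
                  new_root="/var/lib/stickshift",   # assumed new app home
                  shell="/usr/bin/trap-user"):
    """Return passwd_text with app-user homes moved from old_root to new_root.

    Only entries whose login shell is `shell` and whose home lies under
    old_root are changed; every other line is passed through untouched.
    """
    out = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if (len(fields) == 7
                and fields[6] == shell
                and fields[5].startswith(old_root + "/")):
            # Swap only the leading old_root prefix, keep the uuid part.
            fields[5] = new_root + fields[5][len(old_root):]
            line = ":".join(fields)
        out.append(line)
    trailing = "\n" if passwd_text.endswith("\n") else ""
    return "\n".join(out) + trailing


before = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "71d428e83ed74fc3b1692cfa0fa80351:x:503:503:libra guest:"
    "/var/lib/libra/71d428e83ed74fc3b1692cfa0fa80351:/usr/bin/trap-user\n"
)
print(rewrite_homes(before), end="")
```

Returning the rewritten text instead of editing the file in place makes it possible to diff the result against the current /etc/passwd before applying it, and to keep a backup.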
Checked on devenv_stage_144 with the latest migration script and steps. Migration works fine now. Marking the bug as verified.
During migration, some issues with jenkins and metrics were found, so I am closing this bug and filing two new bugs to track them:
Bug 804010 - jenkins push build will hang there after migration
Bug 804009 - got 404 error when visit metrics page after migration