Description of problem:
oo-admin-move stacktraces on a particular gear in prod

Version-Release number of selected component (if applicable):
openshift-origin-broker-util-1.23.8-1.el6oso.noarch

How reproducible:
Always

Steps to Reproduce:
1. Attempt to move the gear

Actual results:
$ sudo oo-admin-move --gear_uuid 533835c74382ec81be00024a --change_district -i ex-std-node429.prod.rhcloud.com
URL: http://[redacted].rhcloud.com
Login: [redacted]
App UUID: 532cfc07500446a0a0000119
Gear UUID: 533835c74382ec81be00024a
DEBUG: Source district uuid: 517be5425973ca09b4000001
DEBUG: Destination district uuid: 536930ec5973ca5d9a000001
DEBUG: Getting existing app '[redacted]' status before moving
DEBUG: Gear component 'php-5.4' was running
DEBUG: Stopping existing app cartridge 'mysql-5.5' before moving
DEBUG: Reserved uid '3805' on district: '536930ec5973ca5d9a000001'
DEBUG: Creating new account for gear '533835c74382ec81be00024a' on ex-std-node429.prod.rhcloud.com
DEBUG: Moving content for app '[redacted]', gear '533835c74382ec81be00024a' to ex-std-node429.prod.rhcloud.com
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Agent pid 2550
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 2550 killed;
DEBUG: Moving system components for app '[redacted]', gear '533835c74382ec81be00024a' to ex-std-node429.prod.rhcloud.com
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Agent pid 2605
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 2605 killed;
DEBUG: Starting cartridge 'mysql-5.5' in '[redacted]' after move on ex-std-node429.prod.rhcloud.com
DEBUG: Fixing DNS and mongo for gear '533835c74382ec81be00024a' after move
DEBUG: Changing server identity of '533835c74382ec81be00024a' from 'ex-std-node74.prod.rhcloud.com' to 'ex-std-node429.prod.rhcloud.com'
DEBUG: Moving failed. Rolling back gear '533835c74382ec81be00024a' in '[redacted]' with delete on 'ex-std-node429.prod.rhcloud.com'
Failed to execute: 'control update-cluster' for /var/lib/openshift/532cfc07500446a0a0000119/haproxy
----

Expected results:
The gear should be moved successfully.

Additional info:
Created attachment 893795
redacted snippets from the broker log
> gems/json-1.8.1/lib/json/common.rb:67: [BUG] unknown type 0x22 (0xc given)\nruby 1.9.3p448 (2013-06-27) [x86_64-linux]

> scl enable ruby193 "ruby -v"
ruby 1.9.3p448 (2013-06-27) [x86_64-linux]

According to https://github.com/tyne/tyne/issues/19#issuecomment-27873129, this is related to the combination of the json-1.8.1 gem and ruby 1.9.3p448. They advise downgrading ruby to 1.9.3p125 and claim that it fixes the bug. I don't think that is the right direction for us.
It seems like the json gem is built against the wrong version of ruby (not against the ruby193 scl). We can check with:

$ readelf /var/lib/openshift/532cfc07500446a0a0000119/.gem/gems/json-1.8.1/lib/json/ext/parser.so -a | grep libruby
sten:
> readelf /var/lib/openshift/532cfc07500446a0a0000119/.gem/gems/json-1.8.1/lib/json/ext/parser.so -a | grep libruby
> 0x0000000000000001 (NEEDED) Shared library: [libruby.so.1.8]

So it's confirmed: the json rubygem in the gear's .gem directory is built against the wrong ruby version. It links to libruby 1.8 instead of 1.9.3.

Closing CANTFIX.
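For anyone who wants to repeat the readelf check across several gears, here is a minimal sketch that pulls the libruby dependency out of readelf output. The function names are hypothetical, and the expected soname prefix `libruby.so.1.9` for the ruby193 scl is an assumption that should be verified on a node before relying on it.

```python
import re

# Matches dynamic-section entries such as:
#   0x0000000000000001 (NEEDED) Shared library: [libruby.so.1.8]
NEEDED_LIBRUBY = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[(libruby[^\]]*)\]")


def linked_libruby(readelf_output):
    """Return every libruby soname listed as NEEDED in readelf output text."""
    return NEEDED_LIBRUBY.findall(readelf_output)


def built_for(readelf_output, expected_prefix):
    """True only if a libruby dependency exists and all of them match the prefix."""
    libs = linked_libruby(readelf_output)
    return bool(libs) and all(lib.startswith(expected_prefix) for lib in libs)


# Sample line taken from the readelf output reported in this bug.
sample = "0x0000000000000001 (NEEDED) Shared library: [libruby.so.1.8]"

print(linked_libruby(sample))               # sonames found in the sample
print(built_for(sample, "libruby.so.1.9"))  # assumed ruby193 soname prefix
```

The sketch only parses text, so the readelf output has to be captured separately (for example by redirecting the command above to a file) and passed in as a string.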