Some users with gears using the ruby 1.9 cartridge have seen a problem where they can no longer ssh into (or push code to) their gear due to an "argument list too long" error. This is caused by a bug in how ruby/env/LD_LIBRARY_PATH and ruby/env/MANPATH are written.
A bug in the Bash SDK path_append function means duplicate entries are not removed during path_remove calls. Each time a ruby-1.9 cartridge is built, update-configuration fires, so LD_LIBRARY_PATH and MANPATH are appended to themselves and keep growing.
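To illustrate the kind of behaviour the fix needs, here is a minimal sketch of an append that stays the same size no matter how often it runs. The function name path_append_unique is hypothetical and this is not the SDK's actual path_append code; it only shows one way to skip elements that are already present in a colon-separated value.

```shell
#!/bin/bash
# Hypothetical helper (NOT the OpenShift SDK implementation): append elements
# to a colon-separated path value, skipping any element already present.
path_append_unique() {
  local current="$1"; shift
  local element
  for element in "$@"; do
    if [ -z "$element" ]; then
      continue                                   # ignore empty elements
    fi
    case ":${current}:" in
      *":${element}:"*) ;;                       # already present, skip it
      *) current="${current:+${current}:}${element}" ;;
    esac
  done
  printf '%s\n' "$current"
}

# Simulate three builds that each try to add the same directory.
value=""
for build in 1 2 3; do
  value=$(path_append_unique "$value" /opt/rh/ruby193/root/usr/lib64)
done
echo "$value"   # prints /opt/rh/ruby193/root/usr/lib64
```

With a check like this, repeated update-configuration runs would leave the value unchanged instead of growing it on every build.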
If it helps, here is an example of the output from my failed "git push origin" session:

remote: Running build on Ruby cart
remote: Argument list too long - set -e; source /var/lib/openshift/4fcf6f211aeb4279b098ba74635a1e3b/app-root/runtime/repo/.openshift/action_hooks/pre_start_ruby-1.9; /var/lib/openshift/4fcf6f211aeb4279b098ba74635a1e3b/haproxy/bin/control start; source /var/lib/openshift/4fcf6f211aeb4279b098ba74635a1e3b/app-root/runtime/repo/.openshift/action_hooks/post_start_ruby-1.9

And this one is from an attempt to establish an SSH connection to my application gear:

Traceback (most recent call last):
  File "/usr/bin/oo-trap-user", line 266, in <module>
    os.execv(cmd, [cmd] + allargs)
OSError: [Errno 7] Argument list too long

P.S. I have a Ruby on Rails application.
Fixed on stage: https://github.com/openshift/origin-server/pull/2784
Fixed on master: https://github.com/openshift/origin-server/pull/2785
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/8eec159dc32d57055d1408a86db545e150650efc
Bug 971460 - Refactor path_append/prepend to accept multiple elements
* Original code only supported single argument substitutions
* Update function declarations to match rest of file
* Add functional tests
* Fixed issue in Ruby cartridge where scl use was duping path elements
* Bumped Ruby cartridge version number
* Fixed CartridgeRepository to rebuild index correctly
Reproduced this issue on devenv-stage_367 with a simple Rails app.

1. Create a ruby-1.9 app: $ rhc app create ruby19 ruby-1.9 -p$passwd
2. Make it a Rails app: $ rails new ruby19
3. Git push the app.
4. Check the env for the app.
5. Git push several times and check the env for the app again.

MANPATH and LD_LIBRARY_PATH keep growing after each push, until the gear can no longer be reached over ssh because of the "argument list too long" error.

Verified on devenv_3336 with the same steps; the issue has been fixed there. Will move the bug to verified after the PR is merged into the latest devenv-stage AMI.
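For the "check the env" steps, a small helper makes the growth easier to see than reading the raw values. This is only a sketch to run inside the gear's shell; the function name count_path_entries is made up for illustration and it assumes bash and awk are available.

```shell
#!/bin/bash
# Hypothetical helper: report how many entries a colon-separated value has
# and how many of them are distinct. Empty entries are ignored.
count_path_entries() {
  printf '%s' "$1" | awk -v RS=':' '
    length($0) > 0 { total++; if (!seen[$0]++) unique++ }
    END { printf "total=%d unique=%d\n", total, unique }'
}

# Report on the two variables affected by this bug.
for name in MANPATH LD_LIBRARY_PATH; do
  echo "${name}: $(count_path_entries "${!name:-}")"
done
```

On an affected gear the total should climb after every push while the unique count stays the same; on a fixed gear the two numbers should match.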
Hello, I'm a bit confused. Comment #4 says that the fix has been pushed to master (the commit is visible on GitHub), but Comment #5 says that the issue is still reproducible in development. Here is my current output:

[swimming-bsmgroup.rhcloud.com logs]\> echo $MANPATH|wc -c
71645
[swimming-bsmgroup.rhcloud.com logs]\> echo $LD_LIBRARY_PATH|wc -c
63457

My Apache reports this in the error log:

[Sat Jun 08 11:04:34 2013] [notice] Apache/2.2.15 (Unix) Phusion_Passenger/3.0.17 configured -- resuming normal operations
*** glibc detected *** PassengerHelperAgent: double free or corruption (!prev): 0x00007f0afc0271e0 ***
*** glibc detected *** PassengerHelperAgent: malloc(): memory corruption: 0x00007f0afc034430 ***
*** Exception ArgumentError in spawn manager (odd number of arguments for Hash) (process 23730, thread #<Thread:0x00000000de8118>):
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/lib/phusion_passenger/spawn_manager.rb:270:in `[]'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/lib/phusion_passenger/spawn_manager.rb:270:in `handle_spawn_application'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/lib/phusion_passenger/abstract_server.rb:357:in `server_main_loop'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/lib/phusion_passenger/abstract_server.rb:206:in `start_synchronously'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/passenger-spawn-server:102:in `<main>'
/opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `initialize': Connection refused - connect(2) (Errno::ECONNREFUSED)
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `new'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `connect'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:86:in `socket'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:90:in `head_request'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:145:in `<main>'
/opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `initialize': Connection refused - connect(2) (Errno::ECONNREFUSED)
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `new'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `connect'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:86:in `socket'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:90:in `head_request'
    from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:145:in `<main>'
*** glibc detected *** PassengerHelperAgent: malloc(): memory corruption: 0x00007f0afc034430 ***

It is a scaled Ruby on Rails application. Apache tries to spawn new processes, but each one stays around, and within a few minutes I can observe dozens of Apache workers with no traffic to the application (except from HAProxy).

Best regards,
Boris
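The "Argument list too long" error (errno 7, E2BIG) is raised when the arguments and environment handed to exec are too large, so it can help to see which variables dominate the environment. The following is a rough sketch rather than an official diagnostic: largest_env_vars is a hypothetical name, it relies on the bash builtin compgen, and it reports sizes in characters as bash counts them.

```shell
#!/bin/bash
# Hypothetical helper: list the largest exported environment variables,
# biggest first, as "<size> <name>". Sizes are character counts.
largest_env_vars() {
  local limit="${1:-5}"
  local name value
  for name in $(compgen -e); do        # compgen -e lists exported variables
    value="${!name}"
    printf '%d %s\n' "${#value}" "$name"
  done | sort -rn | head -n "$limit"
}

largest_env_vars 5

# Upper bound the system reports for arguments plus environment, in bytes.
getconf ARG_MAX
```

Note that echo $MANPATH|wc -c, as used above, counts one extra byte for the trailing newline, so its figures will be one higher than the sizes printed here.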
@Boris Mironov Comment #4 means the fix has been pushed to the master branch and is waiting to be merged into the latest build. As you can see from the two pull requests in comment #3, the fix has to go to two branches, master and stage. As of comment #5, the fix had been merged into the master build, but the stage branch was not ready yet. Once the fix is merged into the latest stage build and tested by our QE, it will be pushed to PRODUCTION, and you will see the fix then. Thanks.
This is verified on devenv-stage_369 as well.

Steps:
1. Create a Rails app: rhc app create rails ruby-1.9 mysql-5.1 --from-code git://github.com/openshift/rails-examplgit
2. Clone the app locally and trigger a git push.
3. Watch the output during the git push; the above errors are gone.
4. Push several times and ssh into the gear:

[rails-jhou.dev.rhcloud.com 927153883858076067430400]\> echo $MANPATH
/opt/rh/ruby193/root/usr/share/man
[rails-jhou.dev.rhcloud.com 927153883858076067430400]\> echo $LD_LIBRARY_PATH
/opt/rh/ruby193/root/usr/lib64

For an existing app whose environment variables have already grown too long, I have verified that they are corrected by a git push once the fix is in.

@Boris, the fix has passed validation on our test servers; the bug will be fixed once it is pushed to production (I do not know the exact time, maybe within a day). You can try again then. Thanks for your information.
Also verified on STG
Verified this with v2 -> v2 migration. Since the devenv-stage_365 AMI has been wiped from EC2, I had to upgrade from devenv-stage_353 to 369 and migrate from v1 to v2, then recreate the bug manually and fix it with a migration.

Steps (after the instance is upgraded to devenv-stage_369 and the cartridges are migrated to v2):
1. On the node, modify ruby/env/OPENSHIFT_PHP_IDENT, changing the value: redhat:ruby:1.9:0.0.2 => redhat:ruby:1.9:0.0.1
2. Edit ruby/env/LD_LIBRARY_PATH and ruby/env/MANPATH, appending long duplicate values to these environment variables.
3. rhc-admin-migrate --version 2.0.28a
4. After the migration is complete, ssh into the gear and check the values of the above environment variables again.

Result: the environment variables are corrected.

[rubyargs-jhou.dev.rhcloud.com 3908d182d0bf11e29ebe22000a8dad5a]\> env|grep -e LD -e MAN
MANPATH=/opt/rh/ruby193/root/usr/share/man
LD_LIBRARY_PATH=/opt/rh/ruby193/root/usr/lib64

Correction to comment 8: this bug will get fixed once the fix is deployed. No need to do a git push again.
*** Bug 972707 has been marked as a duplicate of this bug. ***
Hello, I'm still confused. This is the error when I try to establish an SSH connection to my application gear:

Traceback (most recent call last):
  File "/usr/bin/oo-trap-user", line 266, in <module>
    os.execv(cmd, [cmd] + allargs)
OSError: [Errno 7] Argument list too long

I don't know how to resolve this. I just want to delete my gear, that is all!
Same here. I cannot log in to my gear. Created another bug, #972392.
*** Bug 972392 has been marked as a duplicate of this bug. ***