Bug 971460 - duplicate entries in environment variables causing "argument list too long" errors
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Jhon Honce
QA Contact: libra bugs
URL:
Whiteboard:
Duplicates: 972392 972707
Depends On:
Blocks:
 
Reported: 2013-06-06 15:10 UTC by Andy Grimm
Modified: 2016-11-08 03:47 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-06-11 04:18:11 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 972392 0 unspecified CLOSED Gear is dead? 2021-02-22 00:41:40 UTC

Internal Links: 972392

Description Andy Grimm 2013-06-06 15:10:11 UTC
Some users with gears using the ruby 1.9 cartridge have seen a problem where they can no longer ssh into (or push code to) their gear due to an "argument list too long" error.  This is caused by a bug in how ruby/env/LD_LIBRARY_PATH and ruby/env/MANPATH are written.

Comment 1 Dan Mace 2013-06-06 15:12:12 UTC
A bug in the Bash SDK path_append function causes a failure to remove duplicate entries during path_remove calls; each time a ruby-1.9 cart is built, update-configuration fires, causing LD_LIBRARY_PATH and MANPATH to append to themselves and grow.
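A minimal sketch of an append that stays idempotent, assuming a colon-separated value. This is an illustration only, not the actual Bash SDK code; the function body and its calling convention (current value first, then the elements to append) are assumptions:

```shell
#!/bin/bash
# Hypothetical sketch, not the OpenShift Bash SDK implementation.
# Usage: path_append CURRENT_VALUE ELEMENT...
# Prints CURRENT_VALUE with each ELEMENT added only if it is not already
# present. Existing duplicates and empty entries are dropped as well.
path_append() {
    local current="$1"; shift
    local -a parts=() result=()
    local candidate existing seen

    # Split the existing value on ':'.
    IFS=':' read -r -a parts <<< "$current"

    # Walk the old entries first, then the new ones, keeping first occurrences.
    for candidate in "${parts[@]}" "$@"; do
        [ -n "$candidate" ] || continue
        seen=0
        for existing in "${result[@]}"; do
            if [ "$existing" = "$candidate" ]; then seen=1; break; fi
        done
        if [ "$seen" -eq 0 ]; then result+=("$candidate"); fi
    done

    # Join the surviving entries with ':'.
    local IFS=':'
    printf '%s\n' "${result[*]}"
}

# Repeated "builds" no longer grow the value.
value="/opt/rh/ruby193/root/usr/lib64"
for _ in 1 2 3; do
    value="$(path_append "$value" "/opt/rh/ruby193/root/usr/lib64")"
done
echo "$value"
```

With this shape, calling the function on every build leaves the variable at a single entry instead of appending it to itself.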

Comment 2 Boris Mironov 2013-06-07 00:30:09 UTC
In case it helps, here is example output from my failed "git push origin" session:

remote: Running build on Ruby cart
remote: Argument list too long - set -e; source /var/lib/openshift/4fcf6f211aeb4279b098ba74635a1e3b/app-root/runtime/repo/.openshift/action_hooks/pre_start_ruby-1.9; /var/lib/openshift/4fcf6f211aeb4279b098ba74635a1e3b/haproxy/bin/control start; source /var/lib/openshift/4fcf6f211aeb4279b098ba74635a1e3b/app-root/runtime/repo/.openshift/action_hooks/post_start_ruby-1.9


And this one is from an attempt to establish an SSH connection to my application gear:
Traceback (most recent call last):
  File "/usr/bin/oo-trap-user", line 266, in <module>
    os.execv(cmd, [cmd] + allargs)
OSError: [Errno 7] Argument list too long


P.S. I have a Ruby on Rails application.
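
For context on the "Errno 7" above: execve() fails with E2BIG when the arguments plus the environment exceed a system limit, so an oversized environment alone can block new processes. A rough sketch for comparing the two numbers follows. `env_size_bytes` is a hypothetical helper, the figure it prints is only an approximation, and the effective limit on a given gear may differ from what `getconf ARG_MAX` reports:

```shell
#!/bin/bash
# Hypothetical helper: approximate size of the current environment in bytes.
# Counts every NAME=value entry plus one separator byte, as printed by env.
env_size_bytes() {
    env | wc -c | tr -d ' '
}

printf 'environment: %s bytes\n' "$(env_size_bytes)"
printf 'ARG_MAX:     %s bytes\n' "$(getconf ARG_MAX)"
```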

Comment 4 openshift-github-bot 2013-06-07 03:13:14 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/8eec159dc32d57055d1408a86db545e150650efc
Bug 971460 - Refactor path_append/prepend to accept multiple elements

* Original code only supported single argument substitutions
* Update function declarations to match rest of file
* Add functional tests
* Fixed issue in Ruby cartridge where scl use was duping path elements
* Bumped Ruby cartridge version number
* Fixed CartridgeRepository to rebuild index correctly

Comment 5 Meng Bo 2013-06-08 09:12:48 UTC
Reproduced this issue on devenv-stage_367 with a simple Rails app.

1. Create ruby1.9 app
$rhc app create ruby19 ruby-1.9 -p$passwd
2. Make it a rails app
$rails new ruby19
3. Git push the app
4. Check the env for the app
5. Git push several times and check the env for the app

MANPATH and LD_LIBRARY_PATH keep growing after each push, until SSH login to the app fails with the "argument list too long" error.
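
One way to make the "check the env" steps above concrete is to print the length and duplicate count of a path-style variable after each push. This is a sketch only; `report_path_var` is a hypothetical helper, not part of rhc or the gear tooling:

```shell
#!/bin/bash
# Hypothetical helper: summarize a colon-separated environment variable.
# Usage: report_path_var VARIABLE_NAME
report_path_var() {
    local name="$1"
    local value="${!name}"        # indirect expansion: value of the named variable
    local -a parts=()
    local part total=0 unique=0
    local seen=":"                # ':'-delimited list of entries already counted

    IFS=':' read -r -a parts <<< "$value"
    for part in "${parts[@]}"; do
        [ -n "$part" ] || continue
        total=$((total + 1))
        case "$seen" in
            *":$part:"*) ;;       # already counted, so this one is a duplicate
            *) seen="$seen$part:"; unique=$((unique + 1)) ;;
        esac
    done

    printf '%s: %d characters, %d entries, %d unique\n' \
        "$name" "${#value}" "$total" "$unique"
}

report_path_var MANPATH
report_path_var LD_LIBRARY_PATH
```

On an affected gear the entry count would be expected to climb with each push while the unique count stays flat.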


Verified on devenv_3336 with the same steps.

The issue has been fixed.

Will move the bug to VERIFIED after the PR is merged into the latest devenv-stage AMI.

Comment 6 Boris Mironov 2013-06-08 15:31:44 UTC
Hello,

I'm a bit confused. Comment #4 says that the fix has been pushed to master (the commit is visible on GitHub), but Comment #5 says that the issue is still reproducible in development.

Here is my current output:
[swimming-bsmgroup.rhcloud.com logs]\> echo $MANPATH|wc -c
71645
[swimming-bsmgroup.rhcloud.com logs]\> echo $LD_LIBRARY_PATH|wc -c
63457

My Apache error log reports:
[Sat Jun 08 11:04:34 2013] [notice] Apache/2.2.15 (Unix) Phusion_Passenger/3.0.17 configured -- resuming normal operations
*** glibc detected *** PassengerHelperAgent: double free or corruption (!prev): 0x00007f0afc0271e0 ***
*** glibc detected *** PassengerHelperAgent: malloc(): memory corruption: 0x00007f0afc034430 ***
*** Exception ArgumentError in spawn manager (odd number of arguments for Hash) (process 23730, thread #<Thread:0x00000000de8118>):
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/lib/phusion_passenger/spawn_manager.rb:270:in `[]'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/lib/phusion_passenger/spawn_manager.rb:270:in `handle_spawn_application'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/lib/phusion_passenger/abstract_server.rb:357:in `server_main_loop'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/lib/phusion_passenger/abstract_server.rb:206:in `start_synchronously'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/passenger-spawn-server:102:in `<main>'
/opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `initialize': Connection refused - connect(2) (Errno::ECONNREFUSED)
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `new'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `connect'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:86:in `socket'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:90:in `head_request'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:145:in `<main>'
/opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `initialize': Connection refused - connect(2) (Errno::ECONNREFUSED)
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `new'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:105:in `connect'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:86:in `socket'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:90:in `head_request'
        from /opt/rh/ruby193/root/usr/share/gems/gems/passenger-3.0.17/helper-scripts/prespawn:145:in `<main>'
*** glibc detected *** PassengerHelperAgent: malloc(): memory corruption: 0x00007f0afc034430 ***


It is a scaled Ruby on Rails application. Apache tries to spawn new processes, but each one lingers, and within a few minutes I can observe dozens of Apache workers with no traffic to the application (except from HAProxy).

Best regards,
Boris

Comment 7 Meng Bo 2013-06-09 02:19:32 UTC
@Boris Mironov
Comment #4 means the fix has been pushed to the master branch and is waiting to be merged into the latest build.

As the two pull requests in comment #3 show, the fix needs to be pushed to two branches: master and stage.

As of comment #5, the fix had been merged into the master build, but the stage branch was not ready yet.
Once the fix is merged into the latest stage build and tested by our QE, it will be pushed to PRODUCTION.

You will see the fix then.

Thanks.

Comment 8 Jianwei Hou 2013-06-09 03:06:56 UTC
This is verified on devenv-stage_369 as well.

Steps:
1. Create rails app
rhc app create rails ruby-1.9 mysql-5.1 --from-code git://github.com/openshift/rails-example.git
2. Clone the app locally and trigger a git push
3. Watch the output during the git push; the errors above are gone
4. Push several times and ssh into the gear

[rails-jhou.dev.rhcloud.com 927153883858076067430400]\> echo $MANPATH
/opt/rh/ruby193/root/usr/share/man
[rails-jhou.dev.rhcloud.com 927153883858076067430400]\> echo $LD_LIBRARY_PATH
/opt/rh/ruby193/root/usr/lib64

For existing apps whose environment variables have already grown too long, I have verified that a git push corrects them once the fix is in.

@Boris, the fix has passed validation on our test servers; the bug will be fixed once it is pushed to production (I don't know the exact time, maybe within a day). You can try again then. Thanks for the information.

Comment 9 Jianwei Hou 2013-06-09 03:30:17 UTC
Also verified on STG

Comment 10 Jianwei Hou 2013-06-09 08:31:37 UTC
Verified this with a v2 -> v2 migration. Since the devenv-stage_365 AMI has been wiped from EC2, I had to upgrade from devenv-stage_353 to 369 and migrate from v1 to v2, then recreate the bug manually and fix it with the migration.

Steps(after instance is upgraded to devenv-stage_369 and cartridges are migrated to v2):
1. On the node, modify ruby/env/OPENSHIFT_RUBY_IDENT, changing the value:
redhat:ruby:1.9:0.0.2 => redhat:ruby:1.9:0.0.1
2. Edit ruby/env/LD_LIBRARY_PATH and ruby/env/MANPATH, appending long duplicate values to these environment variables

3. rhc-admin-migrate --version 2.0.28a
4. After the migration is complete, ssh into the gear and check the values of the above environment variables again

Result:
Environment variables are corrected
[rubyargs-jhou.dev.rhcloud.com 3908d182d0bf11e29ebe22000a8dad5a]\> env|grep -e LD -e MAN
MANPATH=/opt/rh/ruby193/root/usr/share/man
LD_LIBRARY_PATH=/opt/rh/ruby193/root/usr/lib64

Correction to comment 8: this bug will get fixed once it's deployed. No need to do a git push again.

Comment 11 Jhon Honce 2013-06-10 13:45:50 UTC
*** Bug 972707 has been marked as a duplicate of this bug. ***

Comment 12 Wagner Fonseca 2013-06-10 16:41:38 UTC
Hello,

I'm still confused.

Error when establishing an SSH connection to my application gear:

Traceback (most recent call last):
  File "/usr/bin/oo-trap-user", line 266, in <module>
    os.execv(cmd, [cmd] + allargs)
OSError: [Errno 7] Argument list too long

I don't know how to resolve this.

I just want to delete my gear, that's all!

Comment 13 inst 2013-06-12 06:27:02 UTC
Same here. Cannot log in to my gear.
Created another bug, #972392.

Comment 14 Andy Grimm 2013-06-12 18:25:55 UTC
*** Bug 972392 has been marked as a duplicate of this bug. ***

