Bug 1016917 - System Ruby segfaults, git push and ctl_app are both impossible
Status: CLOSED CURRENTRELEASE
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 1.x
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Assigned To: Hiro Asari
QA Contact: libra bugs
Keywords: SupportQuestion
Depends On:
Blocks:
Reported: 2013-10-08 18:45 EDT by steven.merrill
Modified: 2013-11-03 20:51 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1026970 (view as bug list)
Environment:
Last Closed: 2013-10-17 09:34:06 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Description steven.merrill 2013-10-08 18:45:15 EDT
Description of problem:

Our client has a Silver subscription to OpenShift Online. On application 5238c2aa5973cabd1d0001d3 / utilities-allyou.rhcloud.com, something is very broken in the system Ruby stack. This prevents "ctl_app" from working on the gear and also prevents git pushes from working, since Ruby is used for the post-receive hook.

Version-Release number of selected component (if applicable):

OpenShift Online Silver

How reproducible:

Every invocation of OpenShift's ruby binary.

Steps to Reproduce:
1. rhc ssh utilities
2. ctl_all
3. <get stacktrace>

Actual results:

I can run neither ctl_app nor git push without getting a backtrace.

Expected results:

I can run both ctl_app and git push.

Additional info:

The stacktrace I am getting is available on GitHub: https://gist.github.com/smerrill/a9398db646dffc2996b9. The application ID is 5238c2aa5973cabd1d0001d3, and the application URL is http://utilities-allyou.rhcloud.com/.
Comment 1 Andy Grimm 2013-10-09 14:39:18 EDT
This problem is definitely related to the gems installed in the gear's .gem directory.  Ideally, commands like ctl_app and the postreceive hook should be unsetting GEM_HOME, so that this interference does not happen.

I am able to take that .gem directory and reproduce the issue outside of the gear, so I should be able to file a bug against the json gem for that after I debug a little.

I'll talk to engineering about what the best workaround is here.  I could change the commit hook to unset GEM_HOME.  For other actions, the user can ssh in, unset GEM_HOME, and then run ctl_app.  (They would then need to set GEM_HOME again elsewhere, such as in a pre_start script, I think.)
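
As a rough sketch of the manual workaround described above, a small wrapper can run a single command with GEM_HOME removed from its environment, so the variable does not have to be set again afterwards. The function name `without_gem_home` is hypothetical and is not part of OpenShift; the `restart` action shown in the usage comment is likewise an assumption.

```shell
# Hypothetical helper (not part of OpenShift): run one command with GEM_HOME
# removed from its environment, leaving the calling shell untouched.
without_gem_home() {
    # The subshell keeps the unset local to this single invocation.
    ( unset GEM_HOME; "$@" )
}

# Example usage on the gear, assuming ctl_app accepts a "restart" action:
#   without_gem_home ctl_app restart
```

Because the unset happens inside a subshell, the gear's own GEM_HOME stays in place for the application's build steps.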
Comment 2 Andy Grimm 2013-10-09 15:19:47 EDT
When I first looked at this, I failed to notice that this app was not set up using the ruby cartridge, and thus was defaulting to ruby-1.8 executables.  So what happened here is that the user ran "gem install", and the json 1.8.0 gem was downloaded and compiled against ruby-1.8.  So this is the reason for the segfault, rather than some bug in json or our ruby-1.9 interpreter.
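
A quick way to spot this kind of mismatch is to compare the interpreter that is first on PATH with the directory gems are being loaded from. The helper below is only a diagnostic sketch; the name `report_ruby_env` is hypothetical, and it assumes nothing beyond a POSIX shell and, optionally, a `ruby` binary.

```shell
# Hypothetical diagnostic: show where gems resolve from and which ruby
# would be picked up by an unqualified "ruby" invocation.
report_ruby_env() {
    echo "GEM_HOME=${GEM_HOME:-<unset>}"
    if command -v ruby >/dev/null 2>&1; then
        echo "ruby=$(command -v ruby)"
        ruby -v    # interpreter version that "gem install" builds against
    else
        echo "ruby=<not found>"
    fi
}

report_ruby_env
```

If the version printed here differs from the interpreter used by the gear scripts, any native extension under GEM_HOME may have been built for the wrong Ruby.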

I've talked to Clayton, and he agrees that we should be ignoring the gear's GEM_HOME for gear/ctl_app operations.

For now, I have added "unset GEM_HOME" in both the git pre-receive and post-receive hooks for this gear.  Please try this and see if it allows you to do git pushes.  Then we'll deal with any remaining issues.
Comment 3 Andy Grimm 2013-10-09 15:37:05 EDT
Reproducer for this:

rhc app create bz1016917 --from-code https://github.com/a13m/bz1016917
Comment 4 steven.merrill 2013-10-09 16:06:59 EDT
OpenShift Team,

Thanks for the sleuthing on this.

This is indeed primarily a PHP app, but we're taking advantage of the fact that Ruby is available to run a phing build and also to install some gems so that we can compile CSS with Compass. I may see whether we can make bundler install to a local directory so that it doesn't change the json gem that the gear script needs.
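
One possible shape for that change, sketched under assumptions: the helper name `bundle_install_local`, the `DRY_RUN` switch, and the `vendor/bundle` target are all hypothetical. `bundle install --path` is the option available in Bundler releases of this era; newer Bundler versions deprecate it in favour of `bundle config set path`.

```shell
# Hypothetical build-step helper: install gems under the application's own
# directory instead of the gear-wide GEM_HOME.
bundle_install_local() {
    target="${1:-vendor/bundle}"    # assumed install directory
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of running it.
        echo "bundle install --path $target"
        return 0
    fi
    bundle install --path "$target"
}

# Example usage in a build script:
#   bundle_install_local
#   bundle exec compass compile
```

Gems installed this way are only visible through `bundle exec`, which is the point: the gear scripts would no longer pick up the application's json gem.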
Comment 5 Hiro Asari 2013-10-11 14:34:56 EDT
A temporary fix is to require 'json' at the correct time. This is "fixed" on the master branch, so this issue does not happen on devenv.

The stage branch PR is https://github.com/openshift/origin-server/pull/3863
Comment 6 Meng Bo 2013-10-12 04:49:45 EDT
Checked on STG(devenv-stage_496) with the app in comment#3, the ctl_app/gear command works well in rhcsh.

[bz1016917-bmengstg.stg.rhcloud.com 52590769dbd93ca387000073]\> gear restart
Cart to restart?
1. php-5.3
?  1
Restarting PHP cartridge


The fix will be pushed to PROD on or about Oct. 16th.