Bug 1288637

Summary: gear start aborts prematurely if a single cartridge fails to start
Product: OpenShift Online
Reporter: Andy Grimm <agrimm>
Component: Containers
Assignee: Sally <somalley>
Status: CLOSED DEFERRED
QA Contact: Chao Yang <chaoyang>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 2.x
CC: agrimm, aos-bugs, erich, jgoulding, jokerman, mmccomas, rthrashe, somalley, tiwillia
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-01 21:51:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1277547

Description Andy Grimm 2015-12-04 20:15:57 UTC
Description of problem:

When a gear contains multiple cartridges, the start process exits completely as soon as a single cartridge's start command returns a non-zero exit code.  This can lead to situations where an app is rendered non-functional due to a bug in a secondary cartridge.

Version-Release number of selected component (if applicable):

rubygem-openshift-origin-node-1.38.4-1.el6oso.noarch

How reproducible:

often 

Steps to Reproduce:
1. Create an app with several cartridges, for example php, mysql, and https://github.com/wshearn/openshift-cartridge-osbs-client.git
2. Edit backup-client/bin/control and remove the "|| true" from the curl command lines
3. Stop the gear
4. Start the gear
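
The effect of step 2 can be checked in isolation. The sketch below is not the cartridge's control script; it runs two stand-in command lines through `sh`, with `false` used in place of the failing curl call (an assumption made purely for illustration), and shows that `|| true` hides the non-zero exit status from the caller.

```python
import subprocess


def exit_status(command: str) -> int:
    """Run a shell command line with sh and return its exit status."""
    return subprocess.run(["sh", "-c", command]).returncode


# `false` stands in for a curl call that cannot reach its server (assumption).
masked = exit_status("false || true")  # failure hidden by "|| true"
unmasked = exit_status("false")        # failure propagates to the caller

print(masked, unmasked)
```

With `|| true` in place the start command always reports success, so removing it is what exposes the curl failure to the gear start process.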

Actual results:

In the application we observed, the mysql cartridge started first, followed by the backup-client cartridge.  The backup-client cartridge failed (because there was no server configured for curl to connect to), and the php cartridge's start script was never called.

Expected results:

The php cartridge should start regardless of whether the backup-client cartridge starts successfully.
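
The difference between the observed and the expected behavior can be sketched as two start loops. This is a minimal illustration and not the code in rubygem-openshift-origin-node: the function names, the `Cartridge` type, and the simulated exit codes are all hypothetical, and the cartridge order simply mirrors the one reported above.

```python
from typing import Callable, Dict, List, Tuple

# (cartridge name, start function returning an exit code); all hypothetical.
Cartridge = Tuple[str, Callable[[], int]]


def start_abort_on_failure(cartridges: List[Cartridge]) -> Dict[str, int]:
    """Observed behavior: stop at the first non-zero exit code."""
    results: Dict[str, int] = {}
    for name, start in cartridges:
        results[name] = start()
        if results[name] != 0:
            break  # remaining cartridges are never started
    return results


def start_all(cartridges: List[Cartridge]) -> Dict[str, int]:
    """Expected behavior: attempt every cartridge and record each exit code."""
    return {name: start() for name, start in cartridges}


cartridges: List[Cartridge] = [
    ("mysql", lambda: 0),
    ("backup-client", lambda: 1),  # simulated start failure
    ("php", lambda: 0),
]

aborted = start_abort_on_failure(cartridges)
attempted = start_all(cartridges)
failures = [name for name, code in attempted.items() if code != 0]
print(sorted(aborted), failures)
```

In the second loop the failure is still recorded, so the caller can report it after php has been given a chance to start.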

Comment 2 Miciah Dashiel Butler Masters 2015-12-11 20:06:02 UTC
Does it make sense in general to start the framework when the database fails to start?

Could you use Configure-Order in your custom cartridge's manifest to start the framework cartridge first? Would that be a satisfactory solution?

Comment 3 Andy Grimm 2016-01-29 16:32:37 UTC
We had a face-to-face discussion about this one, but I don't recall the specifics.  This seems to be a pretty rare case, so I think we can defer it for now.  I'll re-raise it if we see other occurrences.

Comment 4 Timothy Williams 2016-02-01 21:51:40 UTC
Closing this for now as per Andy's comment. Let's re-open this if we see any other interesting occurrences.