Bug 1039787 - downloaded carts get removed on rollback failure
Summary: downloaded carts get removed on rollback failure
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Pod
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
: ---
Assignee: Rajat Chopra
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-12-10 01:49 UTC by Rajat Chopra
Modified: 2015-05-15 00:23 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-30 00:52:58 UTC
Target Upstream Version:



Description Rajat Chopra 2013-12-10 01:49:04 UTC
Description of problem:
An application can get into a 'stuck' state if the node stalls in the middle of installing a 'downloadable' cartridge.
Essentially, if the rollback of an operation also fails, the downloaded cartridge gets removed from the application's db, in spite of the component still being there.
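
The failure mode described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the names App, StalledNode, add_downloaded_cart and keep_record_on_failed_rollback are hypothetical and do not come from origin-server, and the "fixed" branch shows one plausible way to avoid the inconsistency, not necessarily how the actual fix works.

```python
class NodeError(Exception):
    """Simulated node action returning a non-zero exit code."""


class StalledNode:
    """Hypothetical node on which every action fails."""

    def configure(self, cart_name):
        raise NodeError(f"configure {cart_name} failed")

    def deconfigure(self, cart_name):
        raise NodeError(f"deconfigure {cart_name} failed")


class App:
    def __init__(self):
        self.components = []         # components the app believes it has
        self.downloaded_carts = {}   # cart name -> manifest kept in the db

    def is_consistent(self):
        # Every component must still have its downloaded manifest on record.
        return all(name in self.downloaded_carts for name in self.components)


def try_rollback(app, node, cart_name):
    """Undo the component; return False if the node action fails."""
    try:
        node.deconfigure(cart_name)
    except NodeError:
        return False
    app.components.remove(cart_name)
    return True


def add_downloaded_cart(app, node, cart_name, manifest,
                        keep_record_on_failed_rollback):
    app.downloaded_carts[cart_name] = manifest
    app.components.append(cart_name)
    try:
        node.configure(cart_name)
    except NodeError:
        rolled_back = try_rollback(app, node, cart_name)
        # Assumed buggy behavior: the record is dropped even though the
        # rollback failed and the component is still present.
        if rolled_back or not keep_record_on_failed_rollback:
            del app.downloaded_carts[cart_name]
        raise


def simulate(keep_record_on_failed_rollback):
    app = App()
    try:
        add_downloaded_cart(app, StalledNode(), "tungsten",
                            {"name": "tungsten"},
                            keep_record_on_failed_rollback)
    except NodeError:
        pass  # the user sees an error in both variants
    return app
```

In this model, simulate(False) leaves a component without a manifest on record, which is the 'stuck' state; simulate(True) keeps the two in step so a later rollback still has something to work with.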

Version-Release number of selected component (if applicable):


How reproducible:
Rare. But can be simulated.

Steps to Reproduce:
1. In a dev environment, modify the node mcollective action file (openshift.rb) so that it returns a non-zero exit code on oo_update_cluster and oo_app_destroy.
2. With the above changes in place and mcollective restarted, try to add a downloadable cartridge to an existing app. Example URL: http://cartreflect-claytondev.rhcloud.com/reflect?github=narmitag/openshift-origin-cartridge-tungsten

Actual results:
The app reports an error and stays stuck. No further operations are possible because the last operation cannot be rolled back (the downloaded cart that it was operating on has been removed from the db).

Expected results:
The app should report an error but should not be inoperable afterwards.
After the fix, the app should be operable again once the mcollective actions report correctly (zero exit codes, indicating that the node is back to normal).
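
The expected behavior can be sketched as a rollback that is retried once the node recovers. This is a minimal model, assuming the pending rollback needs the downloaded cartridge's manifest from the db; Node, App, retry_rollback and pending_rollback are hypothetical names for illustration, not broker code.

```python
class NodeError(Exception):
    """Simulated non-zero exit code from a node action."""


class Node:
    def __init__(self, healthy):
        self.healthy = healthy

    def remove_component(self, manifest):
        if not self.healthy:
            raise NodeError("node action failed")


class App:
    def __init__(self, components, downloaded_carts, pending_rollback):
        self.components = list(components)
        self.downloaded_carts = dict(downloaded_carts)
        self.pending_rollback = pending_rollback  # cart awaiting rollback


def retry_rollback(app, node):
    """Return True if the pending rollback completed, False if it must wait."""
    cart_name = app.pending_rollback
    manifest = app.downloaded_carts.get(cart_name)
    if manifest is None:
        # Without the record there is nothing to roll back against:
        # this is the 'stuck' state from the actual results.
        raise LookupError(f"no manifest on record for {cart_name}")
    try:
        node.remove_component(manifest)
    except NodeError:
        return False  # node still unhealthy; keep the op pending
    app.components.remove(cart_name)
    del app.downloaded_carts[cart_name]
    app.pending_rollback = None
    return True
```

With the record kept, retry_rollback returns False while the node is failing and True once it is healthy again, after which the app accepts new operations. With the record removed, it raises LookupError even on a healthy node.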



Additional info:

Comment 1 Rajat Chopra 2013-12-10 01:59:38 UTC
Fixed with stage pull request - https://github.com/openshift/origin-server/pull/4308

Master pull request - https://github.com/openshift/origin-server/pull/4307

Comment 2 Liang Xia 2013-12-11 08:20:21 UTC
Followed the steps in comment #0 and got the result below on devenv-stage_609:

# rhc app create app1 php-5.3 --from-code http://cartreflect-claytondev.rhcloud.com/reflect?github=narmitag/openshift-origin-cartridge-tungsten
Application Options
-------------------
Domain:      lxia
Cartridges:  php-5.3
Source Code: http://cartreflect-claytondev.rhcloud.com/reflect?github=narmitag/openshift-origin-cartridge-tungsten
Gear Size:   default
Scaling:     no

Creating application 'app1' ... 
Unable to complete the requested operation due to: An invalid exit code (131) was returned from the server
ip-10-28-67-145.  This indicates an unexpected problem during the execution of your request..
Reference ID: 8df732a71ba49c39b0d8f42589991472

There is also no record in mongo for this app.


Hi Rajat Chopra,
Could you please check whether this is expected? Thanks in advance.
Liang

Comment 3 Rajat Chopra 2013-12-11 17:31:50 UTC
The given downloadable cartridge is an embedded cartridge, not a framework one, so it cannot be used to create an app.
Instead of doing 'rhc app create --from-code', just create a regular app, say php-5.3, and then add the tungsten cartridge to it using 'rhc cartridge add'.

Make sure the app created is a scalable one.

Comment 4 Liang Xia 2013-12-12 05:12:37 UTC
Verified on devenv_4125.

# rhc cartridge add http://cartreflect-claytondev.rhcloud.com/reflect?github=narmitag/openshift-origin-cartridge-tungsten -a phps
The cartridge 'http://cartreflect-claytondev.rhcloud.com/reflect?github=narmitag/openshift-origin-cartridge-tungsten' will be downloaded
and installed
Adding http://cartreflect-claytondev.rhcloud.com/reflect?github=narmitag/openshift-origin-cartridge-tungsten to application 'phps' ... 
Unable to complete the requested operation due to: An invalid exit code (100) was returned from the server domU-12-31-39-04-35-BC.  This
indicates an unexpected problem during the execution of your request..
Reference ID: 033af50843c0dbc930ed43523d74bb21

# rhc app show phps
phps @ http://phps-lxia.dev.rhcloud.com/ (uuid: 52a93d608cdc1fe434000029)
-------------------------------------------------------------------------
  Domain:     lxia
  Created:    Dec 11 11:36 PM
  Gears:      1 (defaults to small)
  Git URL:    ssh://52a93d608cdc1fe434000029.rhcloud.com/~/git/phps.git/
  SSH:        52a93d608cdc1fe434000029.rhcloud.com
  Deployment: auto (on git push)

  php-5.3 (PHP 5.3)
  -----------------
    Scaling: x1 (minimum: 1, maximum: available) on small gears

  haproxy-1.4 (Web Load Balancer)
  -------------------------------
    Gears: Located with php-5.3

# rhc app restart phps
RESULT:
phps restarted

