Bug 1103740 - The secondary gear isn't stopped after restore snapshot for scalable stop app
Summary: The secondary gear isn't stopped after restore snapshot for scalable stop app
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: ImageStreams
Version: 2.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Brenton Leanhardt
QA Contact: libra bugs
URL:
Whiteboard:
Duplicates: 1110077
Depends On: 1101499 1110077
Blocks:
Reported: 2014-06-02 12:50 UTC by Brenton Leanhardt
Modified: 2014-08-04 13:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
After stopping a scaled application and saving a snapshot, the head gear for the application remained stopped after restoring the snapshot, which was expected. However, secondary gears were started after the restore, which was not the expected behavior. This issue was caused by certain post restore logic not being applied to all gears in the application. This bug fix updates the post-restore logic, and secondary gears are now stopped along with the head gear after restoring a stopped application.
Clone Of: 1101499
Environment:
Last Closed: 2014-08-04 13:27:16 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2014:0999
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Enterprise 2.1.4 bug fix and enhancement update
Last Updated: 2014-08-04 17:26:43 UTC

Description Brenton Leanhardt 2014-06-02 12:50:18 UTC
+++ This bug was initially created as a clone of Bug #1101499 +++

Description of problem:
Given a scalable app (e.g., myperl510s) with its min scaling value set to 2: stop the app, save a snapshot, then restore the snapshot and check the gear states. The haproxy gear is found started after the restore.

[rayzhang@ray Work]$ rhc app stop -a myperl510s 
RESULT:
myperl510s stopped
[rayzhang@ray Work]$ rhc app show --gear -a myperl510s 
ID                       State   Cartridges            Size  SSH URL
------------------------ ------- --------------------- ----- ------------------------------------------------------------------------
5384a2035be46d2b6e000784 stopped haproxy-1.4 perl-5.10 small 5384a2035be46d2b6e000784.rhcloud.com
5384a2765be46d007500005b stopped haproxy-1.4 perl-5.10 small 5384a2765be46d007500005b.rhcloud.com

[rayzhang@ray Work]$ rhc snapshot restore -a myperl510s -f  myperl510s.tar.gz 
Restoring from snapshot myperl510s.tar.gz to application 'myperl510s' ... done
[rayzhang@ray Work]$ rhc app show --gear -a myperl510s 
ID                       State   Cartridges            Size  SSH URL
------------------------ ------- --------------------- ----- ------------------------------------------------------------------------
5384a2035be46d2b6e000784 stopped haproxy-1.4 perl-5.10 small 5384a2035be46d2b6e000784.rhcloud.com
5384a2765be46d007500005b started haproxy-1.4 perl-5.10 small 5384a2765be46d007500005b.rhcloud.com

Version-Release number of selected component (if applicable):
devenv_4815

How reproducible:
always

Steps to Reproduce:
1. Create a scalable app and set the min scaling value to 2
# rhc app create myperl510s perl-5.10 -s
# rhc cartridge scale -a myperl510s -c perl-5.10 --min 2
2. Stop the app and save a snapshot
# rhc app stop myperl510s
# rhc snapshot save myperl510s
# rhc app show --gear -a myperl510s
3. Restore the snapshot and check the gear states
# rhc snapshot restore myperl510s
# rhc app show --gear -a myperl510s
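
The steps above could be scripted roughly as follows. This is a minimal sketch, assuming a configured rhc client on the PATH; `repro_commands` and `run_repro` are hypothetical helper names, and the rhc arguments are the ones shown in this report (with `--gears` spelled out in full).

```python
import subprocess
from typing import List


def repro_commands(app: str, cartridge: str = "perl-5.10") -> List[List[str]]:
    """Return the rhc invocations from the reproduction steps, in order."""
    return [
        # Step 1: create a scalable app and raise the minimum gear count to 2.
        ["rhc", "app", "create", app, cartridge, "-s"],
        ["rhc", "cartridge", "scale", "-a", app, "-c", cartridge, "--min", "2"],
        # Step 2: stop the app, save a snapshot, and record the gear states.
        ["rhc", "app", "stop", app],
        ["rhc", "snapshot", "save", app],
        ["rhc", "app", "show", "--gears", "-a", app],
        # Step 3: restore the snapshot and check the gear states again.
        ["rhc", "snapshot", "restore", app],
        ["rhc", "app", "show", "--gears", "-a", app],
    ]


def run_repro(app: str) -> None:
    """Run each step in turn, aborting on the first command that fails."""
    for cmd in repro_commands(app):
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_repro("myperl510s")` would execute the sequence against a live broker; `repro_commands` on its own only builds the argument lists and touches nothing.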

Actual results:
The haproxy gear starts after restoring a snapshot of a stopped app.

Expected results:
The haproxy gear should remain stopped after restoring a snapshot of a stopped app.

Additional info:

--- Additional comment from Ben Parees on 2014-05-27 12:28:54 EDT ---

Michal, you worked most recently in the snapshot/restore app state logic, can you take a look?

--- Additional comment from Michal Fojtik on 2014-05-28 08:50:54 EDT ---

Jakub is working on this one atm.

--- Additional comment from Jakub Hadvig on 2014-05-28 10:43:43 EDT ---

Lei Zhang, the secondary gear doesn't contain HAproxy, only the cartridge with the application. The problem is that upon the restore the secondary gear is not stopped.

--- Additional comment from Lei Zhang on 2014-05-28 22:43:53 EDT ---

Jakub Hadvig, yes, correct! I have updated the title. Thank you!

--- Additional comment from Michal Fojtik on 2014-05-29 06:14:09 EDT ---

Lei Zhang: Is this a regression? Did this work before?

--- Additional comment from Lei Zhang on 2014-05-29 07:25:37 EDT ---

Michal Fojtik, no, STG also has this issue.

[lijun@ray Work]$ rhc snapshot save myperl510s 
Pulling down a snapshot of application 'myperl510s' to myperl510s.tar.gz ... done
[lijun@ray Work]$ rhc app show --gear -a myperl510s
ID                       State   Cartridges            Size  SSH URL
------------------------ ------- --------------------- ----- --------------------------------------------------------------------------
538710dcdbd93cce240019b7 stopped haproxy-1.4 perl-5.10 small 538710dcdbd93cce240019b7.rhcloud.com
5387112f2587c84434000f59 stopped haproxy-1.4 perl-5.10 small 5387112f2587c84434000f59.rhcloud.com
[lijun@ray Work]$ rhc snapshot restore -a myperl510s  -f myperl510s.tar.gz 
Restoring from snapshot myperl510s.tar.gz to application 'myperl510s' ... done
[lijun@ray Work]$ rhc app show --gear -a myperl510s
ID                       State   Cartridges            Size  SSH URL
------------------------ ------- --------------------- ----- --------------------------------------------------------------------------
538710dcdbd93cce240019b7 stopped haproxy-1.4 perl-5.10 small 538710dcdbd93cce240019b7.rhcloud.com
5387112f2587c84434000f59 started haproxy-1.4 perl-5.10 small 5387112f2587c84434000f59.rhcloud.com

--- Additional comment from Michal Fojtik on 2014-05-29 07:40:24 EDT ---

The problem here is that the application is scaled up when you do the restore, so there are actually *2* gears that need restoring, but we restore only the primary gear.

So we have (at least ;-) two options here:

1) We would have to scale down prior to restore (e.g., in the snapshots.rb#prepare_for_restore method) and then, AFTER the restore succeeds, scale back up. I see this as too much work for a 'bug' fix, and if we want to go this way, I would prefer to move this into Trello.

2) We can tell users to scale their apps down before they run restore; in other words, update the documentation. In that case, the restore will do the right thing.

Dan, Ben -> Thoughts?

--- Additional comment from Michal Fojtik on 2014-05-29 07:41:19 EDT ---

Marking this as UpcomingRelease as this is not regression.

--- Additional comment from Andy Goldstein on 2014-05-29 10:48:03 EDT ---

We do restore all child gears in addition to the head gear. The issue is that child web gears aren't being stopped like they should. My guess is that https://github.com/openshift/origin-server/blob/master/node/lib/openshift-origin-node/model/application_container_ext/snapshots.rb#L209, which issues a stop if the pre-restore state was stopped, only applies to the current gear, and it's not a full app stop (all gears).

--- Additional comment from Andy Goldstein on 2014-05-29 11:02:01 EDT ---

And just for clarity, we don't need to scale down prior to restore, or tell users to scale down and then scale up. The only bug here is that child web gears that are supposed to be stopped post-restore (because the app was stopped) end up started.
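
The diagnosis above can be illustrated with a toy model. This is only a sketch of the described behavior, not the origin-server code: `Gear`, `restore_stopping_current_gear_only`, and `restore_stopping_all_gears` are hypothetical names, and the assumption that a restore leaves each gear running before any stop is issued is made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gear:
    """Hypothetical stand-in for a gear; not the origin-server model."""
    name: str
    state: str  # "started" or "stopped"


def restore_stopping_current_gear_only(gears: List[Gear], pre_restore_state: str) -> None:
    """Reported behavior: the post-restore stop reaches only the head gear."""
    for gear in gears:
        gear.state = "started"  # assumption: a restore leaves the gear running
    if pre_restore_state == "stopped":
        gears[0].state = "stopped"  # head gear only; child gears keep running


def restore_stopping_all_gears(gears: List[Gear], pre_restore_state: str) -> None:
    """Expected behavior: a stopped app has every gear stopped after restore."""
    for gear in gears:
        gear.state = "started"  # same assumption as above
    if pre_restore_state == "stopped":
        for gear in gears:
            gear.state = "stopped"
```

With a head gear and one child gear, both stopped before the restore, the first function leaves the child gear started, which matches the output in the description; the second leaves both stopped.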

--- Additional comment from openshift-github-bot on 2014-06-01 17:03:18 EDT ---

Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/da7e8c83fa197cf7765a410e3b163d5b7b9ef16d
Bug 1101499: Stopping secondary gear after restore snapshot for scaleable app

Comment 1 Brenton Leanhardt 2014-07-15 12:57:38 UTC
Upstream commits:

commit da7e8c83fa197cf7765a410e3b163d5b7b9ef16d
Author: jhadvig <jhadvig>
Date:   Fri May 30 14:32:16 2014 +0200

    Bug 1101499: Stopping secondary gear after restore snapshot for scaleable app

commit e0daa900c1d55a5bfe74287661adb178ef1680ec
Author: jhadvig <jhadvig>
Date:   Wed Jun 4 12:43:44 2014 +0200

    Bug 1101499: Adjusting logic of gear state restoration

Comment 4 Yanping Zhang 2014-07-16 05:42:43 UTC
Verified on: OpenShiftEnterpriseErrata/2.1.z/2014-07-15.1

Steps to verify:
1. Create a scalable app and set the min scaling value to 2
# rhc app create perlone perl-5.10 -s
# rhc cartridge scale -a perlone -c perl-5.10 --min 2
2. Stop the app and save a snapshot
# rhc app stop perlone
# rhc snapshot save perlone
# rhc app show -a perlone --gears
3. Restore the snapshot and check the gear states
# rhc snapshot restore perlone
# rhc app show -a perlone --gears

Actual results:
2.
# rhc app stop perlone
RESULT:
perlone stopped

# rhc snapshot save perlone
Pulling down a snapshot of application 'perlone' to perlone.tar.gz ... done

# rhc app show -a perlone --gears
ID                       State   Cartridges            Size  SSH URL
------------------------ ------- --------------------- ----- ------------------------------------------------------------------------------
53c60ddcdb26c87359000053 stopped haproxy-1.4 perl-5.10 small 53c60ddcdb26c87359000053.com.cn
53c60e46db26c87359000073 stopped haproxy-1.4 perl-5.10 small 53c60e46db26c87359000073.com.cn
3.
# rhc snapshot restore perlone
Restoring from snapshot perlone.tar.gz to application 'perlone' ... done

# rhc app show -a perlone --gears
ID                       State   Cartridges            Size  SSH URL
------------------------ ------- --------------------- ----- ------------------------------------------------------------------------------
53c60ddcdb26c87359000053 stopped haproxy-1.4 perl-5.10 small 53c60ddcdb26c87359000053.com.cn
53c60e46db26c87359000073 stopped haproxy-1.4 perl-5.10 small 53c60e46db26c87359000073.com.cn
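
The visual check of the gear table could be automated along these lines. This is a minimal sketch, assuming the text layout of `rhc app show --gears` shown above (a header row, a dashed separator row, then one row per gear with the ID in the first column and the state in the second); `gear_states` and `not_stopped` are hypothetical helper names.

```python
from typing import Dict, List


def gear_states(table: str) -> Dict[str, str]:
    """Map each gear ID to its state, read from the first two columns."""
    states = {}
    for line in table.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(fields) < 2 or fields[0] == "ID" or set(fields[0]) == {"-"}:
            continue
        states[fields[0]] = fields[1]
    return states


def not_stopped(table: str) -> List[str]:
    """Return the IDs of gears whose state is anything other than 'stopped'."""
    return [gid for gid, state in gear_states(table).items() if state != "stopped"]


# Post-restore output copied from the verification above (separator shortened).
AFTER_RESTORE = """\
ID                       State   Cartridges            Size  SSH URL
------------------------ ------- --------------------- ----- ----------
53c60ddcdb26c87359000053 stopped haproxy-1.4 perl-5.10 small 53c60ddcdb26c87359000053.com.cn
53c60e46db26c87359000073 stopped haproxy-1.4 perl-5.10 small 53c60e46db26c87359000073.com.cn
"""

print(not_stopped(AFTER_RESTORE))  # prints [] because both gears stayed stopped
```

Run against the pre-fix output from the description, `not_stopped` would instead return the ID of the secondary gear that was left started.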

Comment 5 Brenton Leanhardt 2014-07-22 17:24:20 UTC
*** Bug 1110077 has been marked as a duplicate of this bug. ***

Comment 7 errata-xmlrpc 2014-08-04 13:27:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0999.html

