Bug 1218049 - Failed to repair the HA app with the head gear located on the unresponsive node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 2.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Timothy Williams
QA Contact: libra bugs
URL:
Whiteboard:
Depends On: 1102557
Blocks:
Reported: 2015-05-04 06:29 UTC by Evgheni Dereveanchin
Modified: 2019-07-11 09:04 UTC
CC List: 17 users

Fixed In Version: openshift-origin-broker-util-1.35.2.3-1
Doc Type: Bug Fix
Doc Text:
Previously, when a node containing a head gear for a scaled application was lost and the gear could not be recovered, running the oo-admin-repair tool still attempted to recover the gear. The tool then reported an error and the gear was not recovered, but changes were made to the broker's database. As a result, running the tool again reported all gears had been recovered, even though the head gear still did not exist. This bug fix updates oo-admin-repair to distinguish applications with lost head gears from regular lost gears from scaled applications. The tool now offers to delete any applications with lost head gears, informing the administrator to first re-create such applications from source or recent backups and move any existing alias before proceeding.
Clone Of: 1102557
Environment:
Last Closed: 2015-07-21 19:12:40 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2015:1463
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Enterprise 2.2.6 bug fix and enhancement update
Last Updated: 2015-07-21 23:11:33 UTC

Comment 1 Evgheni Dereveanchin 2015-05-04 06:30:52 UTC
Cloned this bug from Online because we are hitting it in Enterprise as well. Enterprise users are able to create HA apps, which makes the problem more critical there.

Comment 10 openshift-github-bot 2015-05-15 19:53:04 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/71f7ade01db3f830f5aae3480095932e62584d83
oo-admin-repair should properly handle HA apps with deleted head gears

Bug 1218049
Bugzilla Link https://bugzilla.redhat.com/show_bug.cgi?id=1218049
oo-admin-repair previously attempted to resolve HA applications with missing head gears as though they were simply missing a web-proxy gear. When a head gear is missing, the application cannot be recovered. oo-admin-repair should offer to cleanly remove the application rather than attempt to fix it (and break it more in the process).
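
The distinction described in this commit message can be illustrated with a minimal shell sketch. It is not the actual oo-admin-repair code (which is part of origin-server); the function name `classify_lost_gears` and the four-column input format `app gear_id role server` are assumptions made up for this example. An app whose head gear sits on an unresponsive node is reported as unrecoverable, and only lost non-head gears of other apps are reported as repairable.

```shell
#!/bin/sh
# classify_lost_gears NODE...
# Hypothetical helper, not part of OpenShift. Reads gear records on stdin,
# one per line, in an assumed "app gear_id role server" format, where role
# is "head" or "other". The arguments are the unresponsive node names.
# Prints "unrecoverable APP" for each app whose head gear is on a listed
# node, and "repairable APP GEAR" for lost non-head gears of other apps.
classify_lost_gears() {
    awk -v lost="$*" '
        BEGIN {
            n = split(lost, names, " ")
            for (i = 1; i <= n; i++) down[names[i]] = 1
        }
        !($4 in down) { next }                 # gear is on a responsive node
        $3 == "head"  { dead[$1] = 1; next }   # lost head gear: app cannot be repaired
        { pending[++count] = $1 " " $2 }       # lost non-head gear: decide at END
        END {
            for (app in dead) print "unrecoverable", app
            for (i = 1; i <= count; i++) {
                split(pending[i], f, " ")
                # Skip gears of apps already marked unrecoverable.
                if (!(f[1] in dead)) print "repairable", f[1], f[2]
            }
        }
    '
}
```

Deferring the non-head gears until all records have been read means the result does not depend on whether the head gear's record appears before or after the other gears of the same app. The order of the "unrecoverable" lines is unspecified, so pipe the output through `sort` if a stable order is needed.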

Comment 13 Ma xiaoqiang 2015-05-19 08:52:00 UTC
Checked on puddle [2.2.6/2015-05-18.1]

1. Create a scalable app.
# rhc app create xiaomn1 php-5.4 -s
2. Enable HA for the app.
# rhc app enable-ha xiaomn1
3. Find the node hosting the head gear and stop MCollective on it.
# /etc/init.d/ruby193-mcollective stop
4. Run the repair for removed nodes on the broker.
# oo-admin-repair --removed-nodes
Started at: 2015-05-19 08:47:36 UTC
Total gears found in mongo: 4
Servers that are unresponsive:
        Server: node1.ose22-auto.com.cn (district: default-small), Confirm [yes/no]: yes

Some servers are unresponsive: node1.ose22-auto.com.cn

Found 1 HA applications that cannot be recovered due to missing head gear.
xiaomn1 (id: 555af7af82611d7870000001)

These apps should be re-created and any existing aliases moved for recovery before deletion.
Do you want to delete all of them now [yes/no]: yes


Do you want to delete unresponsive servers from their respective districts [yes/no]: yes

Finished at: 2015-05-19 08:48:10 UTC
Total time: 33.561s
SUCCESS


The node is removed from the environment.
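
Before answering "yes" to the deletion prompt, an administrator may want a record of which applications need to be re-created. The following is a minimal sketch that extracts them from saved oo-admin-repair output. The function name `list_unrecoverable_apps` is hypothetical, and the parsing assumes the output layout matches the single transcript above: a "Found N HA applications that cannot be recovered due to missing head gear." line, then one `name (id: ...)` line per app, ended by a blank line.

```shell
#!/bin/sh
# list_unrecoverable_apps
# Hypothetical helper, not part of OpenShift. Reads saved oo-admin-repair
# output on stdin and prints "name id" for every application listed under
# the "cannot be recovered due to missing head gear" heading.
# Assumption: the list ends at the first blank line, as in the transcript.
list_unrecoverable_apps() {
    awk '
        /HA applications that cannot be recovered due to missing head gear/ {
            in_list = 1
            next
        }
        in_list && /^[ \t]*$/ { in_list = 0 }   # blank line closes the list
        in_list && /\(id: [0-9a-f]+\)/ {
            id = $NF                 # last field looks like "555a...0001)"
            sub(/\)$/, "", id)       # drop the trailing parenthesis
            print $1, id
        }
    '
}
```

For example, if the session output was captured to a file (the file name here is illustrative), `list_unrecoverable_apps < repair.log` prints one line per affected app, which can then be used to re-create the apps and move aliases before they are deleted.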

Comment 17 openshift-github-bot 2015-06-11 23:32:03 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/60abe78eecdcd9cd029864fbc2af04d71eca1888
Test incorrectly expects less usage records

Bug 1218049
A test modified for bz 1218049 incorrectly expects a certain number of usage records for a test. The test creates an additional gear that was not previously accounted for.

Comment 21 errata-xmlrpc 2015-07-21 19:12:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1463.html

