Bug 1025048 - Tracking: oo-admin-chk detects apps whose hosts have changed names
Summary: Tracking: oo-admin-chk detects apps whose hosts have changed names
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 1.0.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Luke Meyer
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-10-30 20:42 UTC by Luke Meyer
Modified: 2017-03-08 17:35 UTC

Fixed In Version: openshift-origin-broker-util-1.9.15-1
Doc Type: Enhancement
Doc Text:
If the hostname of a node host changes after its initial setup, applications with existing gears on that node host can become inaccessible to OpenShift Enterprise users, and many other problems can result as well. To help detect this issue, the oo-admin-chk command now checks whether the current hostname of each reporting node host matches the hostname recorded in the datastore for every gear discovered on that node host. The check may briefly report a mismatch if a gear is being moved while the command is running; any other reported mismatch should be treated as a legitimate issue. Detecting and correcting this issue also helps prevent future upgrade problems.
Clone Of:
Environment:
Last Closed: 2014-01-13 15:06:34 UTC
Target Upstream Version:
Embargoed:



Links
System ID: Red Hat Product Errata RHBA-2014:0019
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Enterprise 1.2.7 bug fix and enhancement update
Last Updated: 2014-01-13 20:02:48 UTC

Description Luke Meyer 2013-10-30 20:42:45 UTC
Description of problem:
Cherry-picked a few lines of code into oo-admin-chk so that it reports an error when someone has changed the hostname (server identity) of a node and it no longer matches what is recorded with the apps.

I would like to push this out in 1.2 so that people have a way to detect changed hostnames before the problem bites them during the upgrade to 2.0 (as we found with one customer when testing 1.1 => 1.2; it was a nasty problem to diagnose).

Origin commit:
commit ce2881c65a9bda2a841cabc12204e4d7cdcde896
Author: Luke Meyer <lmeyer>
Date:   Tue Oct 15 18:03:41 2013 -0400
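
For readers who want the idea of the check in code, here is a minimal sketch in Python. It is not the code from the commit above (oo-admin-chk itself is a Ruby tool); the function name find_identity_mismatches and the two input dictionaries are hypothetical stand-ins for "gears as recorded in the datastore" and "gears as reported by each node". Only the wording of the error message is taken from the tool's actual output.

```python
# Illustrative sketch only; NOT the Ruby code from oo-admin-chk.
# The names and data shapes here are hypothetical.

def find_identity_mismatches(db_gears, node_gears):
    """Return error strings for gears whose reporting node differs from the datastore.

    db_gears:   dict mapping gear UUID -> server_identity recorded in the datastore
    node_gears: dict mapping reporting node hostname -> iterable of gear UUIDs
                found on that node
    """
    errors = []
    for node, gear_ids in sorted(node_gears.items()):
        for gear_id in sorted(gear_ids):
            recorded = db_gears.get(gear_id)
            if recorded is None:
                # Gear is unknown to the datastore: a different kind of
                # inconsistency, outside the scope of this sketch.
                continue
            if recorded != node:
                errors.append(
                    f"Gear {gear_id} exists but node '{node}' does not match "
                    f"DB server_identity '{recorded}'"
                )
    return errors


if __name__ == "__main__":
    db = {"5271f5a17d59db442c00015d": "node2.ose-1029.com.cn"}
    nodes = {"node3.ose-1028.com.cn": ["5271f5a17d59db442c00015d"]}
    for line in find_identity_mismatches(db, nodes):
        print(line)
```

A gear that is being moved between nodes while the data is collected could show up as a mismatch in a comparison like this, so a single report is worth re-checking before acting on it.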

Comment 3 Ma xiaoqiang 2013-11-04 06:57:29 UTC
Checked on puddle [2.0-2013-10-31.2]:

# oo-admin-chk -l 1
Started at: 2013-11-03 23:51:45 -0700
Time to fetch mongo data: 0.079s
Total gears found in mongo: 72
Time to get all gears from nodes: 20.446s
Total gears found on the nodes: 72
Total nodes that responded : 2
Time to get all sshkeys for all gears from nodes: 20.143s
Total gears found on the nodes: 73
Total nodes that responded : 2
Check failed.
District 'dist2' has (5989) available UIDs but (5985) available capacity
WARNING: Only checked the first 4000 errors for false positives.
Please refer to the oo-admin-repair tool to resolve some of these inconsistencies.
Total time: 43.766s
Finished at: 2013-11-03 23:52:29 -0700
[root@broker ~]# oo-admin-chk -l 1
Started at: 2013-11-03 23:55:04 -0700
Time to fetch mongo data: 0.081s
Total gears found in mongo: 72
Time to get all gears from nodes: 20.44s
Total gears found on the nodes: 72
Total nodes that responded : 2
Time to get all sshkeys for all gears from nodes: 20.16s
Total gears found on the nodes: 73
Total nodes that responded : 2
Check failed.
District 'dist2' has (5989) available UIDs but (5985) available capacity
Gear 5271f5a17d59db442c00015d exists but node 'node3.ose-1028.com.cn' does not match DB server_identity 'node2.ose-1029.com.cn'
Gear 5271f5a17d59db442c00015e exists but node 'node3.ose-1028.com.cn' does not match DB server_identity 'node2.ose-1029.com.cn'
Gear 5271f6b07d59dbf5900000ff exists but node 'node3.ose-1028.com.cn' does not match DB server_identity 'node2.ose-1029.com.cn'
Gear 5271f6b07d59dbf590000100 exists but node 'node3.ose-1028.com.cn' does not match DB server_identity 'node2.ose-1029.com.cn'
Gear 5272002b7d59db2ee4000143 exists but node 'node3.ose-1028.com.cn' does not match DB server_identity 'node2.ose-1029.com.cn'
Gear 527202867d59dbda310000d8 exists but node 'node3.ose-1028.com.cn' does not match DB server_identity 'node2.ose-1029.com.cn'
Gear 527221a67d59db12b7000266 exists but node 'node3.ose-1028.com.cn' does not match DB server_identity 'node2.ose-1029.com.cn'
Gear 527223587d59db12b70002ee exists but node 'node3.ose-1028.com.cn' does not match DB server_identity 'node2.ose-1029.com.cn'
Gear 527223c57d59dbe3d5000164 exists but node 'node3.ose-1028.com.cn' does not match DB server_identity 'node2.ose-1029.com.cn'
Gear 527202cc7d59dbea19000089 exists but node 'node3.ose-1028.com.cn' does not match DB server_identity 'node2.ose-1029.com.cn'
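
When many gears are affected, it can help to group the mismatch lines by node. The following is a small sketch, assuming the output of oo-admin-chk has been captured as text and that the message wording matches the lines shown above; the helper name group_mismatches is hypothetical and is not part of any OpenShift tool.

```python
import re
from collections import defaultdict

# Matches the mismatch lines as they appear in the output above.
MISMATCH_RE = re.compile(
    r"^Gear (?P<gear>\S+) exists but node '(?P<node>[^']+)' "
    r"does not match DB server_identity '(?P<recorded>[^']+)'$"
)


def group_mismatches(output):
    """Group gear UUIDs by (reporting node, recorded server_identity).

    output: the captured text of an oo-admin-chk run.
    Lines that are not mismatch reports are ignored.
    """
    grouped = defaultdict(list)
    for line in output.splitlines():
        match = MISMATCH_RE.match(line.strip())
        if match:
            key = (match["node"], match["recorded"])
            grouped[key].append(match["gear"])
    return dict(grouped)


if __name__ == "__main__":
    import sys

    # Usage (assumed): oo-admin-chk -l 1 | python group_mismatches.py
    for (node, recorded), gears in group_mismatches(sys.stdin.read()).items():
        print(f"{len(gears)} gear(s) on '{node}' recorded as '{recorded}'")
```

Each key in the result pairs the hostname a node currently reports with the server_identity stored in the datastore, which makes it easier to see whether every mismatch traces back to one renamed node.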

Comment 5 errata-xmlrpc 2014-01-13 15:06:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0019.html

