Bug 971355 - gear ssh url is reported incorrectly for secondary gears in migrated scaled apps
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 1.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: John W. Lamb
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-06-06 10:54 UTC by Johnny Liu
Modified: 2017-03-08 17:35 UTC

Fixed In Version: rubygem-openshift-origin-controller-1.9.13-1.1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-06-28 15:47:35 UTC
Target Upstream Version:
Embargoed:



Description Johnny Liu 2013-06-06 10:54:37 UTC
Description of problem:
Old gear DNS records are not updated to the new format after upgrade.

Version-Release number of selected component (if applicable):
1.2/2013-06-05.9

How reproducible:
Always

Steps to Reproduce:
1. Set up an ose-1.1.3 environment.
2. Create a scalable app and embed the mysql cartridge.
3. Follow http://etherpad.corp.redhat.com/OSE-1-2-upgrade-notes to perform the upgrade.
4. After the upgrade, run the following command:
$ rhc app show pythonscal --gears
Password: ******

ID                               State   Cartridges                                Size  SSH URL
-------------------------------- ------- ----------------------------------------- ----- --------------------------------------------------------------------------------------
4941e92aa5714fc1885ed93bd1ca75d4 started python-2.6 jenkins-client-1.4 haproxy-1.4 small 4941e92aa5714fc1885ed93bd1ca75d4.com
b716573b545b43a2858a18ee78ae5dbb started python-2.6 jenkins-client-1.4 haproxy-1.4 small b716573b545b43a2858a18ee78ae5dbb.com
f9f90dd76ec14bf5992fec192fd8aab6 started mysql-5.1                                 small f9f90dd76ec14bf5992fec192fd8aab6.com


Actual results:
After the upgrade, the gears' DNS names are pythonscal-jialiu.ose11test.com, b716573b545b43a2858a18ee78ae5dbb-jialiu.ose11test.com, and f9f90dd76ec14bf5992fec192fd8aab6-jialiu.ose11test.com. Try to ping one of them:
$ ping f9f90dd76ec14bf5992fec192fd8aab6-jialiu.ose11test.com
ping: unknown host f9f90dd76ec14bf5992fec192fd8aab6-jialiu.ose11test.com

Checking the DNS records shows:
[root@broker ~]# named-checkzone -Dj ose11test.com /var/named/dynamic/ose11test.com.db|grep f9f90
f9f90dd76e-jialiu.ose11test.com.	      60 IN CNAME	node1.ose11test.com.
[root@broker ~]# named-checkzone -Dj ose11test.com /var/named/dynamic/ose11test.com.db|grep b716
b716573b54-jialiu.ose11test.com.	      60 IN CNAME	node1.ose11test.com.

The old DNS records are still there, not the new ones.
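The zone check above can also be scripted. The following is a minimal sketch, assuming the five-field line layout (owner, TTL, class, type, target) seen in the `named-checkzone -D` output pasted in this report. The helper name `cname_owners` is hypothetical, and the sample data is copied from the two records shown above.

```python
def cname_owners(zone_dump: str) -> set:
    """Return the owner names of all CNAME records in a zone dump."""
    owners = set()
    for line in zone_dump.splitlines():
        fields = line.split()
        # Assumed layout: owner, TTL, class, type, target
        if len(fields) == 5 and fields[2] == "IN" and fields[3] == "CNAME":
            owners.add(fields[0].rstrip("."))
    return owners


# Sample lines copied from the named-checkzone output in this report
ZONE_DUMP = """\
f9f90dd76e-jialiu.ose11test.com.\t      60 IN CNAME\tnode1.ose11test.com.
b716573b54-jialiu.ose11test.com.\t      60 IN CNAME\tnode1.ose11test.com.
"""

owners = cname_owners(ZONE_DUMP)
full = "f9f90dd76ec14bf5992fec192fd8aab6-jialiu.ose11test.com"
short = "f9f90dd76e-jialiu.ose11test.com"
print(full in owners)   # False: no record exists for the full-UUID name
print(short in owners)  # True: the short-name record is present
```

In this sample the short label equals the first 10 characters of the gear UUID. That is an observation from the data in this report, not a documented rule.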


Expected results:
Old DNS records should be updated to the new ones.

Additional info:

Comment 2 Luke Meyer 2013-06-10 19:04:15 UTC
Need to do some spelunking to figure out where this was supposed to happen... Given it's a broker-side change, may need some special logic.

Comment 3 Luke Meyer 2013-06-13 19:31:26 UTC
The DNS shouldn't have changed; the problem is that the gear name changed during the migration, and it shouldn't have. I think I have a fix.

Comment 4 Gaoyun Pei 2013-06-14 10:26:53 UTC
Tested this issue on the latest puddle.

Set up an OSE environment using puddle 1.1.z/2013-06-11.2.
Created a scalable app and embedded the postgresql cartridge.
Then updated this environment to puddle 1.2/2013-06-13.1.

Checked the gears' DNS:
[root@broker ~]# rhc app show jbewsscal --gears
ID                               State   Cartridges               Size  SSH URL
-------------------------------- ------- ------------------------ ----- -----------------------------------------------------------------------------------
7991bdf194f7433790d5fc77fd5c86be started haproxy-1.4 jbossews-1.0 small 7991bdf194f7433790d5fc77fd5c86be.com
c9cea283cbf44340bfdbbad3cc4a01e9 started haproxy-1.4 jbossews-1.0 small c9cea283cbf44340bfdbbad3cc4a01e9.com
47b76b79c6e34e00a607368985274cc0 started postgresql-8.4           small 47b76b79c6e34e00a607368985274cc0.com

This command returns the wrong SSH URL, which cannot be accessed.

[root@broker ~]# ping c9cea283cbf44340bfdbbad3cc4a01e9-jia.ose11test.com
ping: unknown host c9cea283cbf44340bfdbbad3cc4a01e9-jia.ose11test.com

The correct URL should be c9cea283cb-jia.ose11test.com
[root@broker ~]# ping c9cea283cb-jia.ose11test.com
PING node.ose11test.com (10.4.59.182) 56(84) bytes of data.
64 bytes from vm-182-59-4-10.ose.phx2.redhat.com (10.4.59.182): icmp_seq=1 ttl=62 time=0.553 ms
...

Checked the app info in MongoDB: the scaled-up gear "c9cea283cbf44340bfdbbad3cc4a01e9" still has "name" : "c9cea283cb". Maybe we should use the gear "name" for the URL address instead of the gear UUID.
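The suggestion above can be sketched as follows. This is only an illustration in Python, not the broker's actual code (the broker is not written in Python). The function `gear_hostname` and the `"uuid"` key are hypothetical; the `"name"` key and the values come from the MongoDB record quoted above.

```python
def gear_hostname(gear: dict, namespace: str, domain: str) -> str:
    """Build a gear's DNS name from its stored name, not its UUID."""
    # Prefer the stored gear name; fall back to the UUID only if no name is set.
    label = gear.get("name") or gear["uuid"]
    return f"{label}-{namespace}.{domain}"


# Values taken from this comment; the dict layout is an assumption.
gear = {"uuid": "c9cea283cbf44340bfdbbad3cc4a01e9", "name": "c9cea283cb"}
print(gear_hostname(gear, "jia", "ose11test.com"))
# prints c9cea283cb-jia.ose11test.com
```

Using the stored name keeps the reported URL consistent with the existing DNS record for migrated gears, whose name differs from their UUID.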

Comment 5 Luke Meyer 2013-06-14 20:22:53 UTC
You're absolutely right. The ssh url is being reported incorrectly. I fixed this upstream with https://github.com/openshift/origin-server/pull/2855 and will pull it in as appropriate.

The upgrade is correct now (I believe), but we should leave this bug to track the ssh url still being wrong.

Comment 6 Luke Meyer 2013-06-16 00:27:35 UTC
origin-server:
commit ed1add091c0c35186a8e295263d0bbe409cfeaeb
Author: Luke Meyer <lmeyer>
Date:   Fri Jun 14 16:15:19 2013 -0400

Comment 7 Luke Meyer 2013-06-17 20:42:30 UTC
enterprise-server:
commit 70b7d2e3f7613ec661db6ce583d791c198b38b6d
Author: Luke Meyer <lmeyer>
Date:   Fri Jun 14 16:15:19 2013 -0400

This is in the buildvm puddle today but not RC1.

Comment 8 Luke Meyer 2013-06-18 13:31:38 UTC
This is included in http://download.lab.bos.redhat.com/rel-eng/OpenShiftEnterprise/1.2/2013-06-17.3 and presumably all future RCs/puddles. I'm leaving it as ON_DEV as I'm not sure if QE is testing that as RC1 right now or the earlier puddle they started with Monday. Feel free to verify this once using the more recent puddle.

Comment 9 Johnny Liu 2013-06-19 13:56:57 UTC
Tested this bug using http://buildvm-devops.usersys.redhat.com/puddle/build/OpenShiftEnterprise/1.2/2013-06-17.1/; it is indeed fixed.

$ rhc app show jbeapscal --gears
Password: ******

ID                               State     Cartridges               Size  SSH URL
-------------------------------- --------- ------------------------ ----- ----------------------------------------------------------------
50070176f43640e783d9f76e99c90e28 deploying haproxy-1.4 jbosseap-6.0 small 50070176f43640e783d9f76e99c90e28.com
aaf9fa6dff63468281006967b788e715 stopped   haproxy-1.4 jbosseap-6.0 small aaf9fa6dff63468281006967b788e715.com
5919a615d68d4a38bce7e15569ded303 started   mysql-5.1                small 5919a615d68d4a38bce7e15569ded303.com
[jialiu@jialiu-pc1 jbeapscal]$ ping aaf9fa6dff-jialiu.ose11test.com
PING node.ose11test.com (10.4.59.143) 56(84) bytes of data.
64 bytes from vm-143-59-4-10.ose.phx2.redhat.com (10.4.59.143): icmp_req=1 ttl=56 time=637 ms
64 bytes from vm-143-59-4-10.ose.phx2.redhat.com (10.4.59.143): icmp_req=2 ttl=56 time=701 ms
^C
--- node.ose11test.com ping statistics ---
3 packets transmitted, 2 received, 33% packet loss, time 2321ms
rtt min/avg/max/mdev = 637.588/669.790/701.993/32.212 ms

[jialiu@jialiu-pc1 jbeapscal]$ ping 5919a615d6-jialiu.ose11test.com
PING node.ose11test.com (10.4.59.143) 56(84) bytes of data.
64 bytes from vm-143-59-4-10.ose.phx2.redhat.com (10.4.59.143): icmp_req=1 ttl=56 time=281 ms
^C
--- node.ose11test.com ping statistics ---
2 packets transmitted, 1 received, 50% packet loss, time 1000ms
rtt min/avg/max/mdev = 281.204/281.204/281.204/0.000 ms


Once this bug is moved to "ON_QA", QE will verify it.
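The ping checks in this comment could be repeated for every gear with a small script. The following is a minimal sketch, assuming name resolution is the only thing being checked. The helper `unresolvable` and the stub resolver are hypothetical. The stub exists so the example runs without access to the test environment, and the full-UUID hostname is built by analogy with the failing names in earlier comments.

```python
import socket


def unresolvable(hostnames, resolve=socket.gethostbyname):
    """Return the hostnames that fail to resolve."""
    failed = []
    for host in hostnames:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(host)
    return failed


# Stub data: the two names that answered ping in this comment.
KNOWN = {
    "aaf9fa6dff-jialiu.ose11test.com": "10.4.59.143",
    "5919a615d6-jialiu.ose11test.com": "10.4.59.143",
}


def stub_resolve(host):
    """Stand-in resolver so the example works offline."""
    try:
        return KNOWN[host]
    except KeyError:
        raise socket.gaierror(f"unknown host {host}")


hosts = list(KNOWN) + ["aaf9fa6dff63468281006967b788e715-jialiu.ose11test.com"]
print(unresolvable(hosts, resolve=stub_resolve))
# prints ['aaf9fa6dff63468281006967b788e715-jialiu.ose11test.com']
```

Against a live environment, drop the `resolve` argument to use the system resolver.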

Comment 10 Luke Meyer 2013-06-19 21:03:41 UTC
Sorry, it was only ON_DEV due to my misunderstanding.

Comment 11 Johnny Liu 2013-06-20 09:20:15 UTC
Verified this bug using http://buildvm-devops.usersys.redhat.com/puddle/build/OpenShiftEnterprise/1.2/2013-06-19.2/; result: PASS.

Comment 12 Luke Meyer 2013-06-28 15:47:35 UTC
Closing all bugs introduced, fixed, and verified during 1.2 release work (thus never shipped).

