Bug 892781 - Race condition adding multiple SSH keys to gears
Summary: Race condition adding multiple SSH keys to gears
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 1.0.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Miciah Dashiel Butler Masters
QA Contact: libra bugs
URL:
Whiteboard:
Depends On: 876942
Blocks:
 
Reported: 2013-01-07 20:20 UTC by Miciah Dashiel Butler Masters
Modified: 2017-03-08 17:34 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 876942
Environment:
Last Closed: 2013-01-31 20:34:25 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2013:0220 0 normal SHIPPED_LIVE Important: Red Hat OpenShift Enterprise 1.1 update 2013-02-01 01:23:24 UTC

Description Miciah Dashiel Butler Masters 2013-01-07 20:20:44 UTC
I have been unable to reproduce this bug following the steps in bug 876942 comment 13 or the steps in bug 878171 comment 0.  However, we received a report of a very similar issue from a customer, and so I am cloning the bug report and backporting the workaround: https://github.com/openshift/enterprise-server/pull/17

+++ This bug was initially created as a clone of Bug #876942 +++

Description of problem:
Create a scalable app on devenv-stage_249, then upgrade the instance to devenv_2475.
Embed a db in the existing scalable app, then change the repo and git push.
The newly embedded db fails to restart during git push.

Version-Release number of selected component (if applicable):
From devenv-stage_249 to devenv_2475

How reproducible:
always

Steps to Reproduce:
1. Create a scalable app on devenv-stage_249
2. Upgrade the instance to devenv_2475
3. Add a new db to the existing app
4. Make a change and git push
  
Actual results:
Following error appears during git push.

remote: Failed to start postgresql-8.4

Expected results:
Database should be started successfully during git push.

Additional info:
The db can be started and restarted from the CLI successfully.

[root@F17x64-openshift py1s]# rhc cartridge status postgresql-8.4 -p123

RESULT:
PostgreSQL server instance is running

[root@F17x64-openshift py1s]# rhc cartridge restart postgresql-8.4 -p123

RESULT:
postgresql-8.4 restarted!

--- Additional comment from Hou Jianwei on 2012-11-15 06:21:25 EST ---

This seems fine on INT; it did not reproduce on INT (devenv_2474).

--- Additional comment from Rob Millner on 2012-11-15 19:08:16 EST ---

I can't reproduce this with the devenv_2480 package set.

Launched devenv-stage_249, created a scalable php app, upgraded to devenv_2480 and rebooted.

Added postgresql, and it went into a running state.

$ rhc cartridge add -p blahblahblah -a rmtest -c postgresql-8.4
Adding 'postgresql-8.4' to application 'rmtest'
Success
postgresql-8.4
==============
  Properties
  ==========
    Username       = admin
    Password       = ymwCgXxgUVVe
    Database Name  = rmtest
    Connection URL = postgresql://c908973716-rmillner207.dev.rhcloud.com:35546/

$ rhc cartridge status postgresql-8.4 -p vostok08

RESULT:
PostgreSQL server instance is running


Updated the app...

$ echo "" >> README ; git commit -a -m 'foo'; git push

The results will be attached as remote.txt.  It appeared to work.

$ grep -i postgres remote.txt 
remote: PostgreSQL server instance already running


$ rhc cartridge status postgresql-8.4 -p vostok08

RESULT:
PostgreSQL server instance is running

--- Additional comment from Rob Millner on 2012-11-15 19:08:49 EST ---

Created attachment 646070 [details]
output of git push

--- Additional comment from Rob Millner on 2012-11-15 19:10:59 EST ---

Passing to Q/E to see if they can get it to reproduce on the later build.

If you happen to see it, the logs from each gear and the mcollective logs for adding the postgresql cartridge would be helpful.  Thanks!

--- Additional comment from Meng Bo on 2012-11-16 02:36:28 EST ---

Checked this issue again after upgrading to devenv_2485.

After more digging, I found the real issue: newly added gears for a scalable app are missing one of the two ssh keys (default & haproxy).

This affects not only the newly added db gears but also the scaled-up gears.

Gears missing the default ssh key do not affect git push, but the user cannot ssh to the gear from the client side.
Gears missing the haproxy ssh key fail during git push and cannot be accessed from the haproxy gear, but can be accessed from the client side.

The scalable app ruby18s has 3 dbs and 2 web gears:

    mysql-5.1      = mysql://b662d2d079-bmeng1dev.dev.rhcloud.com:35631/
    mongodb-2.2    = mongodb://641bae88f5-bmeng1dev.dev.rhcloud.com:35706/
    postgresql-8.4 = postgresql://19d4d70da8-bmeng1dev.dev.rhcloud.com:35711/

a2ff18f412e94c01a232781025352a42.107.14:ruby-1.8;a2ff18f412-bmeng1dev.dev.rhcloud.com
ae5c1ed4a5684cb084b525db862ead2d.107.14:ruby-1.8;ae5c1ed4a5-bmeng1dev.dev.rhcloud.com

The mysql gear and the first web gear existed before the upgrade; the other three were added after it.
Checking .ssh/authorized_keys for these gears:
[root@ip-10-46-107-14 openshift]# cat b662d2d079-bmeng1dev/.ssh/authorized_keys 
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-b662d2d0793b46189969841feffc83bbdefault
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa yyyy OPENSHIFT-b662d2d0793b46189969841feffc83bbhaproxy

[root@ip-10-46-107-14 openshift]# cat 641bae88f5-bmeng1dev/.ssh/authorized_keys 
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-641bae88f58144e3ae68ab2c670d3ab6default

[root@ip-10-46-107-14 openshift]# cat 19d4d70da8-bmeng1dev/.ssh/authorized_keys 
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa yyyy OPENSHIFT-19d4d70da85d47719578a573e8644e9fhaproxy

[root@ip-10-46-107-14 openshift]# cat a2ff18f412e94c01a232781025352a42/.ssh/authorized_keys 
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa yyyy OPENSHIFT-a2ff18f412e94c01a232781025352a42haproxy
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-a2ff18f412e94c01a232781025352a42default

[root@ip-10-46-107-14 openshift]# cat ae5c1ed4a5684cb084b525db862ead2d/.ssh/authorized_keys 
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-ae5c1ed4a5684cb084b525db862ead2ddefault
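
The manual check above can be automated. The following is a minimal sketch, assuming each gear directory holds a .ssh/authorized_keys file and that the key comment (the last field on each line) ends in "default" or "haproxy" as in the listings above. The function name missing_keys and the REQUIRED_SUFFIXES tuple are hypothetical and not part of OpenShift.

```python
import os

# Assumption: the two keys are identified by the suffix of the key comment,
# e.g. OPENSHIFT-<uuid>default and OPENSHIFT-<uuid>haproxy.
REQUIRED_SUFFIXES = ("default", "haproxy")


def missing_keys(gear_root):
    """Return {gear directory name: [missing key names]} for gears under gear_root."""
    report = {}
    for gear in sorted(os.listdir(gear_root)):
        path = os.path.join(gear_root, gear, ".ssh", "authorized_keys")
        if not os.path.isfile(path):
            continue  # not a gear directory, or no key file yet
        with open(path) as f:
            # The key comment is the last whitespace-separated field of each line.
            comments = [line.split()[-1] for line in f if line.strip()]
        missing = [
            suffix
            for suffix in REQUIRED_SUFFIXES
            if not any(comment.endswith(suffix) for comment in comments)
        ]
        if missing:
            report[gear] = missing
    return report
```

Calling missing_keys("/var/lib/openshift") on a node would list only the gears that lack one of the two keys. Keys with user-chosen names would need a different matching rule.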


Attached the mcollective log captured while adding the db.

--- Additional comment from Meng Bo on 2012-11-16 02:37:56 EST ---

Created attachment 646217 [details]
mcollective_log_during_add_db

--- Additional comment from Hou Jianwei on 2012-11-16 03:00:16 EST ---

The same issue as in comment 5 has been reproduced on INT as well.

--- Additional comment from Rob Millner on 2012-11-16 12:16:11 EST ---

*** Bug 877300 has been marked as a duplicate of this bug. ***

--- Additional comment from Rob Millner on 2012-11-16 12:20:24 EST ---

The ssh keys are added in a parallel call and executed locally in separate threads.

Both threads call add_ssh_key, which loads the existing key file, adds the new key to it and writes a new key file.  There's a substantial window where each thread can read an empty key file, add its own key and write it back out, so whichever thread writes last overwrites the other's key.

This should reproduce regardless of whether the app was originally created on stg or the latest package set.
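
The interleaving described in this comment can be replayed without threads. The sketch below is an illustration only, not the OpenShift code: load_keys, write_keys and simulate_lost_update are hypothetical helpers, and the two "workers" are simulated by ordering the reads and writes by hand so the unlucky schedule happens every time.

```python
import os


def load_keys(path):
    """Read the key file as a list of lines; a missing file counts as empty."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return f.read().splitlines()


def write_keys(path, keys):
    """Rewrite the whole key file from the given list."""
    with open(path, "w") as f:
        f.write("".join(key + "\n" for key in keys))


def simulate_lost_update(path, key_a, key_b):
    """Replay the bad schedule: both workers read before either one writes."""
    seen_by_a = load_keys(path)             # worker A sees an empty file
    seen_by_b = load_keys(path)             # worker B also sees an empty file
    write_keys(path, seen_by_a + [key_a])   # A writes its key
    write_keys(path, seen_by_b + [key_b])   # B overwrites the file; A's key is lost
    return load_keys(path)
```

Starting from an empty key file, simulate_lost_update leaves only the second key behind, which matches the gears above that ended up with a single entry.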

--- Additional comment from manoj on 2012-11-16 13:40:36 EST ---

To be resolved today.

--- Additional comment from Rob Millner on 2012-11-16 14:06:43 EST ---

After discussion, we're going to disable threading for the affected commands until we can come up with a better concurrency management solution.

https://github.com/openshift/origin-server/pull/938
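
This report does not say what the better concurrency management solution would look like, and the pull request above only disables threading. One common option, sketched here purely as an assumption, is to hold an exclusive file lock around the whole read-modify-write. The add_ssh_key below is a hypothetical stand-in that borrows the name from the earlier comment; the ".lock" and ".tmp" sidecar files and add_keys_in_parallel are likewise invented for the example. It relies on fcntl.flock, so it runs on Unix-like systems only.

```python
import fcntl
import os
import threading


def add_ssh_key(path, key_line):
    """Add key_line to the key file at path while holding an exclusive lock."""
    with open(path + ".lock", "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until no one else holds it
        try:
            existing = []
            if os.path.exists(path):
                with open(path) as f:
                    existing = f.read().splitlines()
            if key_line not in existing:
                existing.append(key_line)
            # Write to a temporary file and rename it so readers never see
            # a half-written key file.
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                f.write("".join(line + "\n" for line in existing))
            os.replace(tmp_path, path)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def add_keys_in_parallel(path, key_lines):
    """Add every key from its own thread and return the sorted file contents."""
    threads = [
        threading.Thread(target=add_ssh_key, args=(path, key_line))
        for key_line in key_lines
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    with open(path) as f:
        return sorted(f.read().splitlines())
```

Because each call opens the lock file separately, the threads block one another even inside a single process, so both keys survive regardless of scheduling.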

--- Additional comment from Rob Millner on 2012-11-16 14:23:54 EST ---

User story US3121 was added to re-enable threading.

--- Additional comment from Hou Jianwei on 2012-11-19 03:33:23 EST ---

Verified on devenv_2497

Steps:
1. Create domain, add sshkeys

2. Create scalable application
rhc app create php1s php-5.3 -s

3. Scale up this application using the REST API

4. Embed a db cartridge in this application, e.g.
rhc cartridge add mysql-5.1 -a php1s

5. SSH into the instance and check authorized_keys under /var/lib/openshift/$GEAR_UUID/.ssh
For newly added gears (the scaled-up gear and the standalone db gear), both the default key and the haproxy key should be present in the authorized_keys file

6. SSH into the application's haproxy gear and, from there, SSH into the standalone db gear and the scaled-up gear

7. Connect to mysql from any of these gears

Result:
All operations succeeded.

--- Additional comment from Rob Millner on 2012-11-19 14:52:25 EST ---

*** Bug 878171 has been marked as a duplicate of this bug. ***

Comment 3 Brenton Leanhardt 2013-01-15 07:21:35 UTC
This should ship with the next RC build.

Comment 4 xjia 2013-01-17 10:11:33 UTC
Version:
http://download.lab.bos.redhat.com/rel-eng/OpenShiftEnterprise/1.1/2013-01-16.1

Verify:
Create 10 apps and add 6 ssh keys. Then check the number of keys on the broker and the node.

[root@broker openshift]# find . -name authorized_keys -print0 | xargs -0 grep jia | wc -l
24
[root@broker openshift]# ssh node1
root@node1's password: 
Last login: Thu Jan 17 04:54:38 2013 from vm-188-59-4-10.ose.phx2.redhat.com
[root@node1 ~]# cd /var/lib/openshift/
[root@node1 openshift]# find . -name authorized_keys -print0 | xargs -0 grep jia | wc -l
36

24 + 36 = 60, which matches the expected 10 apps x 6 keys = 60 entries.

Six ssh-keys:
jiaclient1	AAAB3Nza..U+h7RHJJ	Delete
jiaclient2	AAAB3Nza..IXQBhESJ	Delete
jiaclient3	AAAB3Nza..pdWRdwg5	Delete
jiaclient4	AAAB3Nza..GLPRkcKF	Delete
jiaclient	AAAB3Nza..xIqWXxW7	Delete
jiaclient5	AAAB3Nza..7wImUBWl	Delete

Ten apps:
App @ http://app-jia.test.com/ 
App1 @ http://app1-jia.test.com/
App2 @ http://app2-jia.test.com/
App3 @ http://app3-jia.test.com/ 
App4 @ http://app4-jia.test.com/ 
App5 @ http://app5-jia.test.com/ 
App6 @ http://app6-jia.test.com/ 
App7 @ http://app7-jia.test.com/ 
App8 @ http://app8-jia.test.com/ 
App9 @ http://app9-jia.test.com/

Comment 6 errata-xmlrpc 2013-01-31 20:34:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0220.html

