Bug 876942

Summary: Race condition adding multiple SSH keys to gears
Product: OKD
Component: Containers
Version: 2.x
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Reporter: Meng Bo <bmeng>
Assignee: Rob Millner <rmillner>
QA Contact: libra bugs <libra-bugs>
CC: jhou, mfisher, ramr
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-12-19 19:26:25 UTC
Bug Blocks: 892781

Attachments:
output of git push
mcollective_log_during_add_db

Description Meng Bo 2012-11-15 10:47:42 UTC
Description of problem:
Create a scalable app on devenv-stage_249, then upgrade the instance to devenv_2475.
Embed a db in the existing scalable app, then change the repo and git push.
The newly embedded db fails to restart during git push.

Version-Release number of selected component (if applicable):
From devenv-stage_249 to devenv_2475

How reproducible:
always

Steps to Reproduce:
1. Create a scalable app on devenv-stage_249
2. Upgrade the instance to devenv_2475
3. Add a new db to the existing app
4. Make a change and git push
  
Actual results:
The following error appears during git push:

remote: Failed to start postgresql-8.4

Expected results:
Database should be started successfully during git push.

Additional info:
The db can be started and restarted from the CLI successfully.

[root@F17x64-openshift py1s]# rhc cartridge status postgresql-8.4 -p123

RESULT:
PostgreSQL server instance is running

[root@F17x64-openshift py1s]# rhc cartridge restart postgresql-8.4 -p123

RESULT:
postgresql-8.4 restarted!

Comment 1 Jianwei Hou 2012-11-15 11:21:25 UTC
This seems fine on INT; it did not reproduce on INT (devenv_2474).

Comment 2 Rob Millner 2012-11-16 00:08:16 UTC
I can't reproduce this with the devenv_2480 package set.

Launched devenv-stage_249, created a scalable php app, upgraded to devenv_2480 and rebooted.

Added postgresql, and it went into a running state.

$ rhc cartridge add -p blahblahblah -a rmtest -c postgresql-8.4
Adding 'postgresql-8.4' to application 'rmtest'
Success
postgresql-8.4
==============
  Properties
  ==========
    Username       = admin
    Password       = ymwCgXxgUVVe
    Database Name  = rmtest
    Connection URL = postgresql://c908973716-rmillner207.dev.rhcloud.com:35546/

$ rhc cartridge status postgresql-8.4 -p vostok08

RESULT:
PostgreSQL server instance is running


Updated the app...

$ echo "" >> README ; git commit -a -m 'foo'; git push

The results will be attached as remote.txt.  It appeared to work.

$ grep -i postgres remote.txt 
remote: PostgreSQL server instance already running


$ rhc cartridge status postgresql-8.4 -p vostok08

RESULT:
PostgreSQL server instance is running

Comment 3 Rob Millner 2012-11-16 00:08:49 UTC
Created attachment 646070
output of git push

Comment 4 Rob Millner 2012-11-16 00:10:59 UTC
Passing to Q/E to see if they can get it to reproduce on the later build.

If you happen to see it, the logs from each gear and the mcollective logs for adding the postgresql cartridge would be helpful.  Thanks!

Comment 5 Meng Bo 2012-11-16 07:36:28 UTC
Checked this issue again after upgrading to devenv_2485.

After more digging, I found the real issue: newly added gears for a scalable app are missing one of the two ssh keys (default and haproxy).

This affects not only the newly added db gears but also the scaled-up gears.

Gears missing the default ssh key do not affect git push, but the user cannot ssh to the gear from the client side.
Gears missing the haproxy ssh key fail during git push and cannot be accessed from the haproxy gear, but can be accessed from the client side.

The scalable app ruby18s has 3 dbs and 2 web gears:

    mysql-5.1      = mysql://b662d2d079-bmeng1dev.dev.rhcloud.com:35631/
    mongodb-2.2    = mongodb://641bae88f5-bmeng1dev.dev.rhcloud.com:35706/
    postgresql-8.4 = postgresql://19d4d70da8-bmeng1dev.dev.rhcloud.com:35711/

a2ff18f412e94c01a232781025352a42.107.14:ruby-1.8;a2ff18f412-bmeng1dev.dev.rhcloud.com
ae5c1ed4a5684cb084b525db862ead2d.107.14:ruby-1.8;ae5c1ed4a5-bmeng1dev.dev.rhcloud.com

The mysql gear and the first web gear existed before the upgrade; the other three were added after the upgrade.
Checking .ssh/authorized_keys for these gears:
[root@ip-10-46-107-14 openshift]# cat b662d2d079-bmeng1dev/.ssh/authorized_keys 
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-b662d2d0793b46189969841feffc83bbdefault
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa yyyy OPENSHIFT-b662d2d0793b46189969841feffc83bbhaproxy

[root@ip-10-46-107-14 openshift]# cat 641bae88f5-bmeng1dev/.ssh/authorized_keys 
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-641bae88f58144e3ae68ab2c670d3ab6default

[root@ip-10-46-107-14 openshift]# cat 19d4d70da8-bmeng1dev/.ssh/authorized_keys 
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa yyyy OPENSHIFT-19d4d70da85d47719578a573e8644e9fhaproxy

[root@ip-10-46-107-14 openshift]# cat a2ff18f412e94c01a232781025352a42/.ssh/authorized_keys 
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa yyyy OPENSHIFT-a2ff18f412e94c01a232781025352a42haproxy
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-a2ff18f412e94c01a232781025352a42default

[root@ip-10-46-107-14 openshift]# cat ae5c1ed4a5684cb084b525db862ead2d/.ssh/authorized_keys 
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-ae5c1ed4a5684cb084b525db862ead2ddefault


Attached the mcollective log captured while adding the db.
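
The manual check of each gear's authorized_keys can be scripted. The following is only a sketch, not part of any OpenShift tooling: it assumes the key comment has the form OPENSHIFT-<gear uuid><key name> as in the listings above, and that every gear should carry both the default and the haproxy key. The function names and the base directory argument are hypothetical.

```python
import os

# Key names every gear is assumed to need (taken from the listings above).
EXPECTED_KEY_NAMES = ("default", "haproxy")


def missing_key_names(authorized_keys_text, expected=EXPECTED_KEY_NAMES):
    """Return the expected key names that no OPENSHIFT-* comment ends with."""
    present = set()
    for line in authorized_keys_text.splitlines():
        fields = line.split()
        # The key comment is the last whitespace-separated field on the line.
        if not fields or not fields[-1].startswith("OPENSHIFT-"):
            continue
        for name in expected:
            if fields[-1].endswith(name):
                present.add(name)
    return [name for name in expected if name not in present]


def gears_missing_keys(base_dir):
    """Map each gear directory under base_dir to the key names it lacks."""
    report = {}
    for gear in sorted(os.listdir(base_dir)):
        key_file = os.path.join(base_dir, gear, ".ssh", "authorized_keys")
        if not os.path.isfile(key_file):
            continue
        with open(key_file) as fh:
            missing = missing_key_names(fh.read())
        if missing:
            report[gear] = missing
    return report
```

Calling gears_missing_keys on the gear base directory (the directory the cat commands above were run from) would return only the gears with an incomplete key file, keyed by directory name.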

Comment 6 Meng Bo 2012-11-16 07:37:56 UTC
Created attachment 646217
mcollective_log_during_add_db

Comment 7 Jianwei Hou 2012-11-16 08:00:16 UTC
Same issue as comment 5 has been reproduced on INT as well.

Comment 8 Rob Millner 2012-11-16 17:16:11 UTC
*** Bug 877300 has been marked as a duplicate of this bug. ***

Comment 9 Rob Millner 2012-11-16 17:20:24 UTC
The ssh keys are added in a parallel call and execute locally in separate threads.

Both threads call add_ssh_key, which loads the existing key file, adds the new key to it and writes a new key file.  There's a substantial window where each thread can read an empty key file, add its own key and write it back out, so whichever thread writes last overwrites the other thread's key.

This should reproduce regardless of whether the app was originally created on stg or the latest package set.
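
To make the window concrete, here is a small Python sketch of the same read-modify-write pattern. It is not the origin-server code (which is Ruby); all function names, the key strings and the ".lock" file are hypothetical. The first function replays the bad interleaving by hand so the result is deterministic; the second shows one possible way to close the window, by holding an exclusive flock(2) lock across the whole cycle (Unix only).

```python
import fcntl
import os
import tempfile
import threading


def read_keys(path):
    """Return the non-empty lines in the key file (empty list if it is missing)."""
    if not os.path.exists(path):
        return []
    with open(path) as fh:
        return [line.rstrip("\n") for line in fh if line.strip()]


def write_keys(path, keys):
    """Overwrite the key file with the given lines."""
    with open(path, "w") as fh:
        fh.writelines(k + "\n" for k in keys)


def lost_update(path):
    """Replay the race by hand: both callers read before either one writes."""
    seen_by_a = read_keys(path)  # caller A loads an empty key file
    seen_by_b = read_keys(path)  # caller B loads the same empty key file
    write_keys(path, seen_by_a + ["ssh-rsa xxxx default"])
    write_keys(path, seen_by_b + ["ssh-rsa yyyy haproxy"])  # overwrites A's key
    return read_keys(path)


def add_key_locked(path, key):
    """Add a key while holding an exclusive lock for the read-modify-write."""
    with open(path + ".lock", "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until no other holder remains
        try:
            keys = read_keys(path)
            if key not in keys:
                write_keys(path, keys + [key])
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)


def add_both_with_lock(path):
    """Add both keys from separate threads and return the resulting file."""
    keys = ["ssh-rsa xxxx default", "ssh-rsa yyyy haproxy"]
    threads = [threading.Thread(target=add_key_locked, args=(path, k)) for k in keys]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(read_keys(path))


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    print(lost_update(os.path.join(workdir, "racy_keys")))
    print(add_both_with_lock(os.path.join(workdir, "locked_keys")))
```

In the unlocked replay only the haproxy key survives, which matches the gears above that ended up with a single entry. With the lock, each caller sees the other's write before adding its own key.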

Comment 10 manoj 2012-11-16 18:40:36 UTC
To be resolved today.

Comment 11 Rob Millner 2012-11-16 19:06:43 UTC
After discussion, we're going to disable threading for the affected commands until we can come up with a better concurrency management solution.

https://github.com/openshift/origin-server/pull/938

Comment 12 Rob Millner 2012-11-16 19:23:54 UTC
User story US3121 was added to re-enable threading.

Comment 13 Jianwei Hou 2012-11-19 08:33:23 UTC
Verified on devenv_2497

Steps:
1. Create domain, add sshkeys

2. Create scalable application
rhc app create php1s php-5.3 -s

3. Scale up this application using the REST API

4. Embed a db cartridge in this application, e.g.
rhc cartridge add mysql-5.1 -a php1s

5. SSH into the instance and check authorized_keys under /var/lib/openshift/$GEAR_UUID/.ssh
For newly added gears (the scaled-up gear and the standalone db gear), both the default key and the haproxy key should be present in the authorized_keys file

6. SSH into the application's haproxy gear, and from there ssh into the standalone db gear and the scaled-up gear

7. Connect to mysql from any of these gears

Result:
All operations succeeded.

Comment 14 Rob Millner 2012-11-19 19:52:25 UTC
*** Bug 878171 has been marked as a duplicate of this bug. ***