Description of problem:
Create a scalable app on devenv-stage_249, upgrade the instance to devenv_2475, embed a db into the existing scalable app, then change the repo and git push. The newly embedded db fails to restart during git push.

Version-Release number of selected component (if applicable):
From devenv-stage_249 to devenv_2475

How reproducible:
Always

Steps to Reproduce:
1. Create a scalable app on devenv-stage_249
2. Upgrade the instance to devenv_2475
3. Add a new db to the existing app
4. Make some change and git push

Actual results:
The following error appears during git push:
remote: Failed to start postgresql-8.4

Expected results:
The database should start successfully during git push.

Additional info:
The db can be started and restarted from the CLI successfully.

[root@F17x64-openshift py1s]# rhc cartridge status postgresql-8.4 -p123
RESULT:
PostgreSQL server instance is running

[root@F17x64-openshift py1s]# rhc cartridge restart postgresql-8.4 -p123
RESULT:
postgresql-8.4 restarted!
This seems fine on INT; it didn't reproduce on INT (devenv_2474).
I can't reproduce this with the devenv_2480 package set. Launched devenv-stage_249, created a scalable php app, upgraded to devenv_2480 and rebooted. Added postgresql, and it went into a running state.

$ rhc cartridge add -p blahblahblah -a rmtest -c postgresql-8.4
Adding 'postgresql-8.4' to application 'rmtest'
Success
postgresql-8.4
==============
Properties
==========
Username = admin
Password = ymwCgXxgUVVe
Database Name = rmtest
Connection URL = postgresql://c908973716-rmillner207.dev.rhcloud.com:35546/

$ rhc cartridge status postgresql-8.4 -p vostok08
RESULT:
PostgreSQL server instance is running

Updated the app...

$ echo "" >> README ; git commit -a -m 'foo'; git push

The results will be attached as remote.txt. It appeared to work.

$ grep -i postgres remote.txt
remote: PostgreSQL server instance already running

$ rhc cartridge status postgresql-8.4 -p vostok08
RESULT:
PostgreSQL server instance is running
Created attachment 646070 [details] output of git push
Passing to Q/E to see if they can get it to reproduce on the later build. If you happen to see it, the logs from each gear and the mcollective logs from adding the postgresql cartridge would be helpful. Thanks!
Checked this issue again after upgrading to devenv_2485. Digging further, I found the real issue: newly added gears for a scalable app are missing one of the two ssh keys (default & haproxy). This affects not only the newly added db gears but also the scaled-up gears.

For the gears missing the default ssh key, git push is unaffected, but the user cannot ssh into the gear from the client side. For the gears missing the haproxy ssh key, git push fails and the gear cannot be accessed from the haproxy gear, but it can still be accessed from the client side.

For the scalable app ruby18s there are 3 dbs and 2 web gears:

mysql-5.1 = mysql://b662d2d079-bmeng1dev.dev.rhcloud.com:35631/
mongodb-2.2 = mongodb://641bae88f5-bmeng1dev.dev.rhcloud.com:35706/
postgresql-8.4 = postgresql://19d4d70da8-bmeng1dev.dev.rhcloud.com:35711/
a2ff18f412e94c01a232781025352a42.107.14:ruby-1.8;a2ff18f412-bmeng1dev.dev.rhcloud.com
ae5c1ed4a5684cb084b525db862ead2d.107.14:ruby-1.8;ae5c1ed4a5-bmeng1dev.dev.rhcloud.com

The mysql cartridge and the first gear existed before the upgrade; the other three were added after the upgrade.
Checked the .ssh/authorized_keys for these gears:

[root@ip-10-46-107-14 openshift]# cat b662d2d079-bmeng1dev/.ssh/authorized_keys
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-b662d2d0793b46189969841feffc83bbdefault
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa yyyy OPENSHIFT-b662d2d0793b46189969841feffc83bbhaproxy

[root@ip-10-46-107-14 openshift]# cat 641bae88f5-bmeng1dev/.ssh/authorized_keys
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-641bae88f58144e3ae68ab2c670d3ab6default

[root@ip-10-46-107-14 openshift]# cat 19d4d70da8-bmeng1dev/.ssh/authorized_keys
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa yyyy OPENSHIFT-19d4d70da85d47719578a573e8644e9fhaproxy

[root@ip-10-46-107-14 openshift]# cat a2ff18f412e94c01a232781025352a42/.ssh/authorized_keys
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa yyyy OPENSHIFT-a2ff18f412e94c01a232781025352a42haproxy
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa xxxx OPENSHIFT-a2ff18f412e94c01a232781025352a42default

Attached the mcollective log during add db.
Created attachment 646217 [details] mcollective_log_during_add_db
Same issue as comment 5 has been reproduced on INT as well.
*** Bug 877300 has been marked as a duplicate of this bug. ***
The ssh keys are added in a parallel call and executed locally in separate threads. Both threads call add_ssh_key, which loads the existing key file, adds the new key to it, and writes a new key file. There's a substantial window in which each thread can read an empty key file, add its own key, and write it back out, so the other thread's key is lost. This should reproduce regardless of whether the app was originally created on stg or on the latest package set.
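The lost-update window described above can be sketched in Ruby. This is a hypothetical illustration, not the actual origin-server code: `add_ssh_key_unsafe` mirrors the read-append-write cycle, and `add_ssh_key_locked` shows one way to close the window with an exclusive flock (function names and the locking approach are assumptions for illustration).

```ruby
require 'tmpdir'

# Unsafe: two concurrent callers can both read the same initial contents,
# each append their own key, and the second write silently drops the
# first caller's key (last writer wins).
def add_ssh_key_unsafe(path, key)
  contents = File.exist?(path) ? File.read(path) : ""
  File.write(path, contents + key + "\n")
end

# One fix: hold an exclusive flock across the whole read-modify-write
# cycle so concurrent callers serialize instead of clobbering each other.
def add_ssh_key_locked(path, key)
  File.open(path, File::RDWR | File::CREAT) do |f|
    f.flock(File::LOCK_EX)
    contents = f.read
    f.rewind
    f.write(contents + key + "\n")
    f.flush
    f.truncate(f.pos)
  end # lock released when the file is closed
end
```

With the locked variant, running the two key additions from separate threads (as the broker's parallel call does) always leaves both the default and haproxy keys in the file.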
To be resolved today.
After discussion; we're going to disable threading for the affected commands until we can come up with a better concurrency management solution. https://github.com/openshift/origin-server/pull/938
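The shape of the interim fix can be sketched as follows. This is an assumed illustration, not the code from the linked PR: running the per-gear key-update jobs serially instead of in one thread per job removes the overlapping read-modify-write on the authorized_keys file, at the cost of some latency.

```ruby
# Racy: one thread per job; jobs that touch the same key file can
# interleave their read-modify-write cycles and lose updates.
def run_key_updates_parallel(jobs)
  jobs.map { |job| Thread.new { job.call } }.each(&:join)
end

# Interim workaround: run the jobs one after another, so no two
# writers ever touch the same authorized_keys file at once.
def run_key_updates_serial(jobs)
  jobs.each(&:call)
end
```

Serial execution is a stopgap; US3121 below tracks restoring parallelism once proper concurrency management (e.g. per-file locking) is in place.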
User story US3121 was added to re-enable threading.
Verified on devenv_2497.

Steps:
1. Create a domain and add ssh keys
2. Create a scalable application: rhc app create php1s php-5.3 -s
3. Scale up this application using the REST API
4. Embed a db cartridge into this application, e.g. rhc cartridge add mysql-5.1 -a php1s
5. SSH into the instance and check authorized_keys under /var/lib/openshift/$GEAR_UUID/.ssh — for newly added gears (the scaled-up gear and the standalone db gear), both the default key and the haproxy key should be present in the authorized_keys file
6. ssh into the application's haproxy gear and, from there, ssh into the standalone db gear and the scaled-up gear
7. Connect to mysql from each of these gears

Result: all operations succeeded.
*** Bug 878171 has been marked as a duplicate of this bug. ***