Description of problem:
When creating a new gear, an ssh key is added for it. This ssh key is added to the application mongo document and sent over to other gears. However, if the gear creation/configuration fails and is rolled back, the ssh key still persists and is present in both mongo as well as on other gears of the application.

Version-Release number of selected component (if applicable):

How reproducible:
Always, whenever a new gear creation fails and is rolled back

Steps to Reproduce:
1. Create a scalable application
2. Add a db cartridge or scale up the web_framework cartridge
3. Insert an error to induce a failure in the configure/post-configure step.

Actual results:
After step #3, the operation is rolled back and the new gear is removed. However, the ssh key added for this gear is still present in mongo and even added to the other gears.

Expected results:
The gear's ssh key should be removed from mongo and from all the other gears in case of a rollback in the gear creation/configuration process.

Additional info:
The operation to add the gear's ssh key to the other gears is managed through a separate op_group. In case of a rollback, this new op_group is not triggered/executed and is executed the next time the user performs an action on the application.
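The expected rollback behaviour described above can be sketched as follows. This is a minimal in-memory model, not the origin-server implementation: the `App` class, `add_gear`, `rollback_gear`, and the gear ids are hypothetical names chosen only for illustration, and the block passed to `add_gear` stands in for the configure/post-configure step.

```ruby
# Hypothetical in-memory model of the expected behaviour; not origin-server code.
class App
  attr_reader :app_ssh_keys, :gears

  def initialize
    @app_ssh_keys = [] # stands in for app_ssh_keys in the mongo document
    @gears = {}        # gear id => key names in that gear's authorized_keys
  end

  # Registers the new gear's key, distributes it to the existing gears, then
  # runs the configure step (the block). If that step raises, everything that
  # was registered for the new gear is undone before the error propagates.
  def add_gear(gear_id)
    key_name = "application-#{gear_id}"
    @app_ssh_keys << key_name
    @gears.each_value { |keys| keys << key_name }
    @gears[gear_id] = @app_ssh_keys.dup
    begin
      yield gear_id if block_given?
    rescue StandardError
      rollback_gear(gear_id, key_name)
      raise
    end
  end

  private

  # Removes the gear, its key in the application document, and the copies of
  # that key held by the remaining gears.
  def rollback_gear(gear_id, key_name)
    @gears.delete(gear_id)
    @app_ssh_keys.delete(key_name)
    @gears.each_value { |keys| keys.delete(key_name) }
  end
end

app = App.new
app.add_gear("gear1")
begin
  app.add_gear("gear2") { raise "configure failed" }
rescue RuntimeError => e
  puts "rolled back: #{e.message}"
end
p app.app_ssh_keys
p app.gears
```

In this model the key cleanup is part of the same rollback path as the gear removal, so it does not depend on a later operation being run against the application.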
Fixed with --> https://github.com/openshift/origin-server/pull/4294
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/5ffd5388260aa44d6ae79154999de74a6417a886
Fix for bug 1039151
Tested on devenv-stage_604

1. Create a scalable application
2. Add mysql to the application
3. Induce an error in the configure method in /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.18.2/lib/openshift-origin-node/model/v2_cart_model.rb
eg:
    def configure(cartridge_name, template_git_url = nil, manifest = nil)
      return -1
      exit
      ....
4. Restart broker/mcollective, clear broker cache
5. Query mongo and there are 2 application sshkeys present

libra_rs:PRIMARY> db.applications.findOne({name:"php1s"},{app_ssh_keys:1})
{
    "_id" : ObjectId("52a55699109f4d2029000004"),
    "app_ssh_keys" : [
        {
            "_id" : ObjectId("52a556bd109f4d2029000024"),
            "_type" : "ApplicationSshKey",
            "component_id" : ObjectId("52a55699109f4d2029000004"),
            "content" : "AAAAB3NzaC1yc2EAAAABIwAAAQEA4dyeqAqSmSY8o6/K36fv36mrsuHD6WPoa22GamFdHINDFyALygZb8m0kO2EUOeIylw0SUtHC/SVR0Ix79yy+MA+wbeIff/ptpVMHj1i6dkxN22cSKjX6VBW/M2K0HR/fNL3QUBHR2ozTlpKzShPnUuFFvUfosqXSO/92W1opYXINEUBBzmaxTB0Iv3KT0pEN+DerG/23nZyj7svMzlYf63sVCcwM7ar5By+3uwYj1qVm2Vz37weHOTPDfFT5IiGfPRkOkxRVZ6LgaedTYH3PFimzRxAfFM3IXREHiBP+8COODn2aVDTwtHrTxBRJpjP4O0RjhNX9C0b7IllEyrWW0Q==",
            "name" : "application-52a55699109f4d2029000004",
            "type" : "ssh-rsa"
        },
        {
            "_id" : ObjectId("52a556ed109f4d2029000040"),
            "_type" : "ApplicationSshKey",
            "component_id" : ObjectId("52a556cd109f4d2029000027"),
            "content" : "AAAAB3NzaC1yc2EAAAABIwAAAQEA/QZtE5/1wuWLk0/+gF2+Wwvpieg/7NG08M89esFjZ/+0BBoiUHJxCjFrDDFfNRpNhWlSgqlq2ALdy84uUKXZZZlLAUnInk5Zw/lMKEhMotpyQGIE/HfivG7wkYOPXsJxnYyrSgTobdhy8HTWyOXnnsc8UNccB7B9PzM9wU2rycih9NstDZoGzNSYwoiCOA/0uZIbae+K0lXNJevNJFkYjfgiENbZZnHMvgfoxEYPdUDBXg8tw4HkX4BVesm4r33eU6+OE8HZUFki9zJpUYtg+e84YtxLCJJMWTaLyP467Mm/GCgWDOzNIaeIVmYUdGeMY11mvibNBYy2hFzrTF225w==",
            "name" : "application-d0df9a2c609311e3a44022000a9704fb",
            "type" : "ssh-rsa"
        }
    ]
}

6. Scale up the application. Gear creation fails; querying mongo again shows 3 application sshkeys, and the pending ops cannot be cleared.
I'm not sure how to induce the error correctly. According to my testing, the error I induced has a destructive impact on the application: the pending ops remain and can't be cleared. @abhgupta could you please give me more information on how to induce the error? Thanks!
Subsequently, just remove the error that you induced and try to perform any operation on the application. That should clear the stuck pending_op and let you test this issue.
You'll have to restart the broker after you change the code to remove the error.
Thanks, tested on devenv-stage_609. The pending_ops were rolled back. However, after the rollback, the application ssh key of the new gear was still in mongo and could still be added to new gears of the app. The test steps were the same as in comment 3.

The app now has 3 gears, but the authorized_keys of the gear has 4 application ssh keys and 1 user ssh key:

1 master % rhc app-show php1s --gears
ID                               State   Cartridges          Size  SSH URL
-------------------------------- ------- ------------------- ----- --------------------------------------------
52a7fc136e8f04b70b000026         started php-5.3 haproxy-1.4 small 52a7fc136e8f04b70b000026.rhcloud.com
52a7fc416e8f042f03000002         started mysql-5.1           small 52a7fc416e8f042f03000002.rhcloud.com
13c0b466622911e3a8ed1231390e8c89 started mongodb-2.2         small 13c0b466622911e3a8ed1231390e8c89.rhcloud.com

[root@domU-12-31-39-0E-8C-89 .ssh]# pwd
/var/lib/openshift/13c0b466622911e3a8ed1231390e8c89/.ssh
[root@domU-12-31-39-0E-8C-89 .ssh]# cat authorized_keys
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAp85i/mFPDGVpKAEDGwAD+qCKqJ8C0UP5tom/TIVlYAWhSyCooxN1mdXXFgzrce7Jw8DdoGB9EsXLIjxDJ7ZHzXHulsdXW9XZTGhMbZN3W3Z+Z1a71qhY9Vd9TimHX2CwI47e/4KLzp/LLL672hchmHRwgQwAIunMy3IKp14/YZptSsPtf5x0XZnN+7PVGB1L6D2amGbOluexaAP7DHxfw0R2AiieoijTvfaLmChN0KckGD4ri3HlmvPhUHYH7qVw7DRSB3ZptL1omWwgGCp0mLU7SC/xXlJK6RW4hhyl3Q6+QHbhGLjGNDTCVaGQ0Cwgz3/HCGya3tEQwUUNrfu1iQ== OPENSHIFT-13c0b466622911e3a8ed1231390e8c89-application-52a7fc136e8f04b70b000026
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA8AYc52f6eDiwcsM/opOpHFjqbltvd8eCzw6alGjZ1TbR0rLW411t5dgx0gXLLrbrratJG4lhlEurKPdGxAOa/SAnyAQ8qGI7aF3zOY5jFhP7Yp+SmWVAR7k6eTWkjhTaFtWX6z8chicDWbnLFOIWucjg6Xst3brzNQrEmuFdp9LEY2+KI2oqGWx2mmiKXBTdDEnpHaHFv/zIGNxkhVzyyIGFBpQBdOeYkDgffcXVpLVfXWZfBhQyI09uvBT/hWseBpcs10S7spA8P5uh0ivGW2PMPDh02wh9VDPlOR5t7zhXb+cxZx9MQN98XIf73jBY0e3RBUYNFD4pbWTtr8ZkAQ== OPENSHIFT-13c0b466622911e3a8ed1231390e8c89-application-52a7fc416e8f042f03000002
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEArtPPE20apufUH7xbjPDJR+fV5b1F1k2J8AOLpm5O/SGC0TnPc0ZqGFzp5+OQ2HnZmlb4HeJag/BHjjpfHarUSgX16SdAvRJ1Xzux4yY3pLWNM5v1DDMUMFUzmEJRCh89itzhtiZoBK/OiWog6Cr8Z28F3FAEeEX3sc2Nnqjb2bc402ODqB2GdeuWTmUWkzGinn5yDT41Pu/hm6/8W8Arh9+bSeGwtktY3jkkJEv9A/taqRulxhVWOzzaoYR/DF94EyXrIwXWtmsHWwSoDDDw01Q4JMKu5/Flb+ktPZK+VaG54KtZL4infWbE/Qk6DaFd9PD/KdKJhUJz8Wia7DUO5w== OPENSHIFT-13c0b466622911e3a8ed1231390e8c89-application-52a7fd086e8f0442dc000001
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAzwG1cSCnAQ7E7ufZ1+W6jhrc9yQWjteoub76/LuGH9kIMljqOZxZPzjDL3z/4TLzhxjv8zAbYrh0Rw/wUsURbe1qVzRSyAmD9LQwPkKzg2rsXMsIfjSiHWHzZ573jY6Q04cVTW3x/Wmf5x4OXIjz9CIYdWKsWGBrd690TdjT7tfbLzUfKnjYgTLPZgPYqKnM1w0Z6v/3JX6NpUmfcdHPAX857kZAdVUf7hZPBxMI2l3/rglyQsj0xYDElFqaAi6jbNSv89YIKfHIEEzjMpWIn3SomoBEEUsu96qtnjQ0MuHZdUTfwf/tUpBuJOXiN4M7eYeFL50PxEjyOsaRsWBAxQ== OPENSHIFT-13c0b466622911e3a8ed1231390e8c89-application-13c0b466622911e3a8ed1231390e8c89
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDh43XWg7LERxupWx/Ym9hswfA6loRkpMi5JOfX5C49RW6M6JnpyF53u8/VFYYADN8YN+9wwUEPrUTi6LsZAheAfgw5nxc3VTqFaiHuwFP9oc8yPCgclm+vC0Fn3S2foAjHqO1+fRDYDztAciD0uBT+pWqMsMrqdTdpHvgc6U6tYdKRqKCwNjo4K2bueq3MUMOPdxJp1SvrVJK0I+BNhG4iSaGFt++2dYr2X389kQ+6MNYpJpOuqXC734wWm27CQLvYtWRr3QXfKSV3HnybdYZPc/ffV7xE6MLMkbznBnCgdM8b8AvB41xMz5OvnyGf8+tGDrtKN+qL2+pLBPgZbxoF OPENSHIFT-13c0b466622911e3a8ed1231390e8c89-52a7d7b96e8f046b24000001-default
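The mismatch in the listing above can be spotted mechanically by comparing the key comments in authorized_keys with the gear ids reported by `rhc app-show --gears`. The following is a minimal sketch, not part of any OpenShift tool: `stale_app_keys` and `APP_KEY_COMMENT` are hypothetical names, the comment format is inferred only from the entries shown in this report, and `KEYBODY` is a placeholder for the real key material.

```ruby
# Comment format inferred from the entries in this report (an assumption):
#   OPENSHIFT-<gear uuid>-application-<source gear id>
APP_KEY_COMMENT = /\AOPENSHIFT-\h+-application-(\h+)\z/

# Returns the source gear ids of application keys that do not belong to any
# current gear. User keys (comments ending in "-default") never match.
def stale_app_keys(authorized_keys, gear_ids)
  authorized_keys.each_line.map do |line|
    comment = line.split.last
    next unless comment
    match = APP_KEY_COMMENT.match(comment)
    match[1] if match && !gear_ids.include?(match[1])
  end.compact
end

# Gear ids as listed by `rhc app-show php1s --gears` in this report.
gear_ids = %w[
  52a7fc136e8f04b70b000026
  52a7fc416e8f042f03000002
  13c0b466622911e3a8ed1231390e8c89
]

# KEYBODY is a placeholder; only the trailing comment field matters here.
prefix = 'command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa KEYBODY'
gear = "13c0b466622911e3a8ed1231390e8c89"
sample = [
  "#{prefix} OPENSHIFT-#{gear}-application-52a7fc136e8f04b70b000026",
  "#{prefix} OPENSHIFT-#{gear}-application-52a7fc416e8f042f03000002",
  "#{prefix} OPENSHIFT-#{gear}-application-52a7fd086e8f0442dc000001",
  "#{prefix} OPENSHIFT-#{gear}-application-13c0b466622911e3a8ed1231390e8c89",
  "#{prefix} OPENSHIFT-#{gear}-52a7d7b96e8f046b24000001-default"
].join("\n")

p stale_app_keys(sample, gear_ids)
```

For the data in this comment, the only application key without a matching gear is the one for 52a7fd086e8f0442dc000001, the gear whose creation was rolled back.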
Fixed with --> https://github.com/openshift/origin-server/pull/4328
The above PR isn't merged in devenv-stage_614, so I tested it by deploying the fix to my env.

1. Create a scalable application with a db cart
rhc create-app php1s php-5.3 mysql-5.1 -s
2. Induce an error in the configure method in /openshift-origin-node/model/v2_cart_model.rb
eg:
    def configure(cartridge_name, template_git_url = nil, manifest = nil)
      return -1
      exit
      ....
3. Restart broker/mcollective, clear broker cache
4. Scale up this app:
rhc cartridge-scale php-5.3 -a php1s --min 2
This step fails since the error is induced.
5. Query the application document from mongo shell: there are 2 app ssh keys, the new gear's ssh key is not added
6. Check .ssh/authorized_keys: there are 1 user ssh key and 2 app ssh keys. The result is as expected.
7. Fix the error, restart services, and clear broker cache
8. Do any operation against this app, eg: restart
9. Query the application document from mongo shell again
10. Check the contents of .ssh/authorized_keys
11. oo-admin-chk -l 1

Result:
After steps 9, 10 and 11: the new gear's ssh key is removed from mongo and node, and oo-admin-chk passed.

This bug is fixed by the above pull request. Waiting for the merge to have it verified.
The proposed fix for this bug is undergoing review and changes.
The fix is now merged into master
Tested on devenv_4147 and this issue is reproduced. I'm afraid the fix has not been merged into devenv_4147 and devenv-stage_619 yet. I've compared the code of the fix from https://github.com/openshift/origin-server/pull/4328 with the actual code on the instance, and they are different. So assigning back.
I have just verified that the fix is part of the latest devenv. Please verify.
Verified on devenv_4154

1. Create a scalable application with a db cart
rhc create-app php1s php-5.3 mysql-5.1 -s
2. Induce an error in the configure method in /openshift-origin-node/model/v2_cart_model.rb
eg:
    def configure(cartridge_name, template_git_url = nil, manifest = nil)
      return -1
      exit
      ....
3. Restart broker/mcollective, clear broker cache
4. Scale up this app:
rhc cartridge-scale php-5.3 -a php1s --min 2
This step fails since the error is induced.
5. Query the application document from mongo shell: there are 2 app ssh keys, the new gear's ssh key is not added

domU-12-31-39-07-BA-57(mongod-2.4.6)[PRIMARY] openshift_broker_dev> db.applications.findOne({},{app_ssh_keys:1})
{
  "_id": ObjectId("52b264a6e551c7b047000023"),
  "app_ssh_keys": [
    {
      "_id": ObjectId("52b264aae551c7b04700003d"),
      "_type": "ApplicationSshKey",
      "component_id": ObjectId("52b264a6e551c7b047000023"),
      "content": "AAAAB3NzaC1yc2EAAAABIwAAAQEAwpAtQ5eshd8Ae1zykLtz+RwuI1BiXoWRKTX/qwdpQad3XOACnfWez+F6U3qgIxE5ed5MZPo4X6HzPR20IkwxUvG6Ag8ERV8I/dui3r4XH5Zv9LmrFuv8Q6OroRMQ95pZMMbAk0pV/zhTbRGKAWoCb36ECYxmfTZudgKfmxZAkJKRfU+M6bC/nbMH7v5Ca2OmpRiOFbh4snKP2ZLZB0quGHb+EF+rlWZIcYph96fr2KkR2WHUwbxLCW+UwhjAFlnUhJIUai5gj1/dzrqHOq59n64Sy1xEY6IkNCOD2LYahvZt6Euw9HZAbDSD7mtUQQBw6RlXrahHfBODWkAYv8XURw==",
      "created_at": ISODate("2013-12-19T03:14:50.708Z"),
      "name": "application-52b264a6e551c7b047000023",
      "type": "ssh-rsa"
    },
    {
      "_id": ObjectId("52b264d4e551c7b047000058"),
      "_type": "ApplicationSshKey",
      "component_id": ObjectId("52b264cfe551c7b047000045"),
      "content": "AAAAB3NzaC1yc2EAAAABIwAAAQEA1qGw2Cf8sisacsV0+TWqMVUlVULrJTEai7/3qZtctReVeR9lJ63Ye7MXnN/AlnNHtrFEflzAgShHcfcECP3CleeX437WLtdt9GK3CM0UaeFT7d3YEJL1h7WGX4ggEQP12BjBTcVwDF3P1EvGE8xSZlSVg+wydPzMRn3XNl4ROeoPhd2/4r41W1cdFsh5erlgDl40V8OU3LFxVkqc+IIByRhere36YxAvGhAITdRdC72P1jX3TxqKJLuRjmZXRQGfy+Z9sx0kw3Hlhswt9tRNyq9iOjYOOSMB69GGRjFVSt+VRqksKu61Ntr6LP8jjy8P+5fICz7ghjLQdig97/XB8w==",
      "created_at": ISODate("2013-12-19T03:15:32.092Z"),
      "name": "application-52b264cfe551c75766000004",
      "type": "ssh-rsa"
    }
  ]
}

6. Check .ssh/authorized_keys: there are 1 user ssh key and 2 app ssh keys. The result is as expected.

[root@domU-12-31-39-07-BA-57 ~]# cat /var/lib/openshift/php1s-jhou/.ssh/authorized_keys
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAwpAtQ5eshd8Ae1zykLtz+RwuI1BiXoWRKTX/qwdpQad3XOACnfWez+F6U3qgIxE5ed5MZPo4X6HzPR20IkwxUvG6Ag8ERV8I/dui3r4XH5Zv9LmrFuv8Q6OroRMQ95pZMMbAk0pV/zhTbRGKAWoCb36ECYxmfTZudgKfmxZAkJKRfU+M6bC/nbMH7v5Ca2OmpRiOFbh4snKP2ZLZB0quGHb+EF+rlWZIcYph96fr2KkR2WHUwbxLCW+UwhjAFlnUhJIUai5gj1/dzrqHOq59n64Sy1xEY6IkNCOD2LYahvZt6Euw9HZAbDSD7mtUQQBw6RlXrahHfBODWkAYv8XURw== OPENSHIFT-52b264a6e551c7b047000023-application-52b264a6e551c7b047000023
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDh43XWg7LERxupWx/Ym9hswfA6loRkpMi5JOfX5C49RW6M6JnpyF53u8/VFYYADN8YN+9wwUEPrUTi6LsZAheAfgw5nxc3VTqFaiHuwFP9oc8yPCgclm+vC0Fn3S2foAjHqO1+fRDYDztAciD0uBT+pWqMsMrqdTdpHvgc6U6tYdKRqKCwNjo4K2bueq3MUMOPdxJp1SvrVJK0I+BNhG4iSaGFt++2dYr2X389kQ+6MNYpJpOuqXC734wWm27CQLvYtWRr3QXfKSV3HnybdYZPc/ffV7xE6MLMkbznBnCgdM8b8AvB41xMz5OvnyGf8+tGDrtKN+qL2+pLBPgZbxoF OPENSHIFT-52b264a6e551c7b047000023-52b25443e551c776b5000001-default
command="/usr/bin/oo-trap-user",no-X11-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA1qGw2Cf8sisacsV0+TWqMVUlVULrJTEai7/3qZtctReVeR9lJ63Ye7MXnN/AlnNHtrFEflzAgShHcfcECP3CleeX437WLtdt9GK3CM0UaeFT7d3YEJL1h7WGX4ggEQP12BjBTcVwDF3P1EvGE8xSZlSVg+wydPzMRn3XNl4ROeoPhd2/4r41W1cdFsh5erlgDl40V8OU3LFxVkqc+IIByRhere36YxAvGhAITdRdC72P1jX3TxqKJLuRjmZXRQGfy+Z9sx0kw3Hlhswt9tRNyq9iOjYOOSMB69GGRjFVSt+VRqksKu61Ntr6LP8jjy8P+5fICz7ghjLQdig97/XB8w== OPENSHIFT-52b264a6e551c7b047000023-application-52b264cfe551c75766000004

7. Fix the error, restart services, and clear broker cache
8. Do any operation against this app, eg: restart
9. Query the application document from mongo shell again
10. Check the contents of .ssh/authorized_keys
11. oo-admin-chk -l 1

[root@domU-12-31-39-07-BA-57 ~]# oo-admin-chk -l 1
Started at: 2013-12-19 03:20:20 UTC
Total gears found in mongo: 2
Total gears found on the nodes: 2
Total nodes that responded: 1
Finished at: 2013-12-19 03:21:01 UTC
Total time: 40.854s
SUCCESS

This bug is verified.