Description of problem:
When an SSH key is added to an account, a queued pending_ops entry with type "UserSshKey" is added to the cloud_users collection in the datastore. This entry should be cleared once the operation has executed successfully, but it stays queued.

Version-Release number of selected component (if applicable):
On devenv_3758

How reproducible:
Always

Steps to Reproduce:
1. Add an SSH key to an account
   rhc sshkey-add default ~/.ssh/id_rsa.pub
2. Connect to the datastore and query cloud_users
   db.cloud_users.findOne({"login":"jhou"})
3. Remove the SSH key
   rhc sshkey-remove default
4. Connect to the datastore and query cloud_users
   db.cloud_users.findOne({"login":"jhou"})
5. Manually clear the pending_ops and query again
   oo-admin-clear-pending-ops -t 0

Actual results:
After step 2:
<snip>
"pending_ops" : [
    {
        "_id" : ObjectId("522d2926c7c23db863000004"),
        "arguments" : {
            "_id" : ObjectId("522d2926c7c23db863000005"),
            "type" : "ssh-rsa",
            "name" : "default",
            "content" : "AAAAB3NzaC1yc2EAAAADAQABAAABAQDh43XWg7LERxupWx/Ym9hswfA6loRkpMi5JOfX5C49RW6M6JnpyF53u8/VFYYADN8YN+9wwUEPrUTi6LsZAheAfgw5nxc3VTqFaiHuwFP9oc8yPCgclm+vC0Fn3S2foAjHqO1+fRDYDztAciD0uBT+pWqMsMrqdTdpHvgc6U6tYdKRqKCwNjo4K2bueq3MUMOPdxJp1SvrVJK0I+BNhG4iSaGFt++2dYr2X389kQ+6MNYpJpOuqXC734wWm27CQLvYtWRr3QXfKSV3HnybdYZPc/ffV7xE6MLMkbznBnCgdM8b8AvB41xMz5OvnyGf8+tGDrtKN+qL2+pLBPgZbxoF",
            "_type" : "UserSshKey"
        },
        "completed_domain_ids" : [ ],
        "created_at" : ISODate("2013-09-09T01:49:26.196Z"),
        "on_completion_method" : null,
        "on_domain_ids" : [ ObjectId("522d291fc7c23db863000003") ],
        "op_type" : "add_ssh_key",
        "state" : "queued",
        "updated_at" : ISODate("2013-09-09T01:49:26.197Z")
    },
</snip>

After step 4:
<snip>
    {
        "_id" : ObjectId("522d33f0c7c23db86300009b"),
        "arguments" : {
            "_id" : ObjectId("522d2926c7c23db863000005"),
            "_type" : "UserSshKey",
            "content" : "AAAAB3NzaC1yc2EAAAADAQABAAABAQDh43XWg7LERxupWx/Ym9hswfA6loRkpMi5JOfX5C49RW6M6JnpyF53u8/VFYYADN8YN+9wwUEPrUTi6LsZAheAfgw5nxc3VTqFaiHuwFP9oc8yPCgclm+vC0Fn3S2foAjHqO1+fRDYDztAciD0uBT+pWqMsMrqdTdpHvgc6U6tYdKRqKCwNjo4K2bueq3MUMOPdxJp1SvrVJK0I+BNhG4iSaGFt++2dYr2X389kQ+6MNYpJpOuqXC734wWm27CQLvYtWRr3QXfKSV3HnybdYZPc/ffV7xE6MLMkbznBnCgdM8b8AvB41xMz5OvnyGf8+tGDrtKN+qL2+pLBPgZbxoF",
            "name" : "default",
            "type" : "ssh-rsa"
        },
        "completed_domain_ids" : [ ],
        "created_at" : ISODate("2013-09-09T02:35:28.448Z"),
        "on_completion_method" : null,
        "on_domain_ids" : [ ObjectId("522d291fc7c23db863000003") ],
        "op_type" : "delete_ssh_key",
        "state" : "queued",
        "updated_at" : ISODate("2013-09-09T02:35:28.450Z")
</snip>

After step 5:
[root@ip-10-147-219-9 ~]# oo-admin-clear-pending-ops -t 0
Failed to clear op for user (chunchen) - #<PendingUserOps _id: 522d3064c7c23db863000026, _type: nil, created_at: 2013-09-09 02:20:20 UTC, updated_at: 2013-09-09 02:20:20 UTC, op_type: :add_ssh_key, state: "queued", arguments: {"_id"=>"522d3064c7c23db863000027", "type"=>"ssh-rsa", "name"=>"chunchenWin7x64P", "content"=>"AAAAB3NzaC1yc2EAAAABIwAAAQEA1D+z8HNYsqJmp1zvU7jBsAifkN19vC30p3VbNW+sLlck7YzM9C7057Dby0vN0nczfX6CMw6oXQSTL+qz3uInoMKtuuB61qL91cUGAvGirKqr2F36i/Ksg0gqgwpumQdlCdWq7PpI3zENNtnzisqFWBIyX7z1IgJNAYOXKa20qHy2BWKHPzM27Ffi3Hfa7z75rQtO3A4yGLjn2nCKfC9G1eiVu4a4cEmzTRYyNqHd28lZ17H09IAMW49fAR9mn/WQVh3RTINkPjVsul4y6PH3X0GMnsJjZkdyWVa1aPPXDV02EN8jonQyF1sjst7TlRV6n9fmGBuEs9wCYaRElicgrQ==", "_type"=>"UserSshKey"}, on_domain_ids: ["522d2a16c7c23db863000024"], completed_domain_ids: [], on_completion_method: nil>
Failed to clear op for user (jhou) - #<PendingUserOps _id: 522d2926c7c23db863000004, _type: nil, created_at: 2013-09-09 01:49:26 UTC, updated_at: 2013-09-09 01:49:26 UTC, op_type: :add_ssh_key, state: "queued", arguments: {"_id"=>"522d2926c7c23db863000005", "type"=>"ssh-rsa", "name"=>"default", "content"=>"AAAAB3NzaC1yc2EAAAADAQABAAABAQDh43XWg7LERxupWx/Ym9hswfA6loRkpMi5JOfX5C49RW6M6JnpyF53u8/VFYYADN8YN+9wwUEPrUTi6LsZAheAfgw5nxc3VTqFaiHuwFP9oc8yPCgclm+vC0Fn3S2foAjHqO1+fRDYDztAciD0uBT+pWqMsMrqdTdpHvgc6U6tYdKRqKCwNjo4K2bueq3MUMOPdxJp1SvrVJK0I+BNhG4iSaGFt++2dYr2X389kQ+6MNYpJpOuqXC734wWm27CQLvYtWRr3QXfKSV3HnybdYZPc/ffV7xE6MLMkbznBnCgdM8b8AvB41xMz5OvnyGf8+tGDrtKN+qL2+pLBPgZbxoF", "_type"=>"UserSshKey"}, on_domain_ids: ["522d291fc7c23db863000003"], completed_domain_ids: [], on_completion_method: nil>
Failed to clear op for user (jhou) - #<PendingUserOps _id: 522d33f0c7c23db86300009b, _type: nil, created_at: 2013-09-09 02:35:28 UTC, updated_at: 2013-09-09 02:35:28 UTC, op_type: :delete_ssh_key, state: "queued", arguments: {"_id"=>"522d2926c7c23db863000005", "_type"=>"UserSshKey", "content"=>"AAAAB3NzaC1yc2EAAAADAQABAAABAQDh43XWg7LERxupWx/Ym9hswfA6loRkpMi5JOfX5C49RW6M6JnpyF53u8/VFYYADN8YN+9wwUEPrUTi6LsZAheAfgw5nxc3VTqFaiHuwFP9oc8yPCgclm+vC0Fn3S2foAjHqO1+fRDYDztAciD0uBT+pWqMsMrqdTdpHvgc6U6tYdKRqKCwNjo4K2bueq3MUMOPdxJp1SvrVJK0I+BNhG4iSaGFt++2dYr2X389kQ+6MNYpJpOuqXC734wWm27CQLvYtWRr3QXfKSV3HnybdYZPc/ffV7xE6MLMkbznBnCgdM8b8AvB41xMz5OvnyGf8+tGDrtKN+qL2+pLBPgZbxoF", "name"=>"default", "type"=>"ssh-rsa"}, on_domain_ids: ["522d291fc7c23db863000003"], completed_domain_ids: [], on_completion_method: nil>
0 applications were cleaned up.
2 users were cleaned up.
0 domains were cleaned up.

After running the command, querying mongo again shows that the pending_ops are still there.

Expected results:
The pending_ops should be cleared when the operations are executed successfully.

Additional info:
This is a bug in the PendingUserOps logic: ssh key propagation now happens in the application only, and there is no domain propagation at this point, so these ops never register as completed.

The minimal fix is to change CloudUser#run_jobs from:

    op.delete if op.completed?

to:

    op.delete

The larger change is to remove on_domain and completed_domains and replace them with just completed_applications. When Dan and I spoke about not tracking individual completion of apps, we weren't terribly concerned, because a failed operation will get rerun and any job that has already been applied is a no-op. So it causes some extra processing but is still safe.

This also means that users currently have queues that are filling up. When this fix is applied, those queues will get cleared.
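The effect of the guard described in the comment above can be illustrated with a small standalone sketch. It is not the origin-server code: PendingOp, execute, and the require_completed flag are hypothetical stand-ins, and the body of completed? is an assumption (an op counts as complete once every id in on_domain_ids also appears in completed_domain_ids).

```ruby
# Hypothetical stand-in for PendingUserOps; only fields shown in the report are modeled.
PendingOp = Struct.new(:op_type, :on_domain_ids, :completed_domain_ids) do
  # Assumed definition: complete once every targeted domain has reported back.
  def completed?
    (on_domain_ids - completed_domain_ids).empty?
  end
end

# Stand-in for the real propagation work. Key propagation is per application,
# so nothing here ever appends to op.completed_domain_ids.
def execute(op)
  op
end

# Simplified model of the queue-draining step in CloudUser#run_jobs.
# Returns the ops that remain queued after one pass.
# require_completed: true  models `op.delete if op.completed?`
# require_completed: false models the unconditional `op.delete`
def run_jobs(pending_ops, require_completed:)
  pending_ops.reject do |op|
    execute(op)
    require_completed ? op.completed? : true
  end
end

queue = [
  PendingOp.new(:add_ssh_key, ["522d291fc7c23db863000003"], []),
  PendingOp.new(:delete_ssh_key, ["522d291fc7c23db863000003"], [])
]

stuck = run_jobs(queue, require_completed: true)
cleared = run_jobs(queue, require_completed: false)

puts "guarded delete leaves #{stuck.size} op(s) queued"
puts "unconditional delete leaves #{cleared.size} op(s) queued"
```

Under these assumptions the guarded version never drains the queue, because completed_domain_ids stays empty, which matches the "state" : "queued" entries in the report. The unconditional delete relies on the point made above that rerunning an already-applied job is a no-op.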
Fixed with --> https://github.com/openshift/origin-server/pull/3592
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/998e2517c935ace0fb9fa058d0c5c553d782d18f
Fix for bug 1005631
Verified on devenv_3770, the pending_ops with UserSshKey type is cleared after ssh key has been added.