Description of problem:
Our monitoring in INT created an application, but is now unable to delete it. When I try to manually delete the app, I get this:

$ rhc app delete -k -p $pass --confirm -a chkexsrv2
Deleting application 'chkexsrv2' ...
The server did not respond correctly. This may be an issue with the server configuration or with your connection to the server (such as a Web proxy or firewall). Please verify that you can access the OpenShift server https://localhost/broker/rest/domains/nagiosmonitor/applications/chkexsrv2

The mcollective log from this host looks like this:

I, [2013-03-06T14:55:51.016675 #13831] INFO -- : openshift.rb:34:in `cartridge_do_action' cartridge_do_action call / action: cartridge_do, agent=openshift, data={:cartridge=>"openshift-origin-node", :action=>"app-destroy", :args=> {"--with-app-uuid"=>"5136b0c76cec0e399f0000df", "--with-app-name"=>"chkexsrv2", "--with-container-uuid"=>"5136b0c76cec0e399f0000df", "--with-container-name"=>"chkexsrv2", "--with-namespace"=>"nagiosmonitor", "--with-uid"=>5388, "--with-request-id"=>"03e4855dda03cbb6373817bf1052aa96", "--cart-name"=>"openshift-origin-node"}, :process_results=>true}
I, [2013-03-06T14:55:51.017030 #13831] INFO -- : openshift.rb:35:in `cartridge_do_action' cartridge_do_action validation = openshift-origin-node app-destroy {"--with-app-uuid"=>"5136b0c76cec0e399f0000df", "--with-app-name"=>"chkexsrv2", "--with-container-uuid"=>"5136b0c76cec0e399f0000df", "--with-container-name"=>"chkexsrv2", "--with-namespace"=>"nagiosmonitor", "--with-uid"=>5388, "--with-request-id"=>"03e4855dda03cbb6373817bf1052aa96", "--cart-name"=>"openshift-origin-node"}
I, [2013-03-06T14:55:51.017362 #13831] INFO -- : openshift.rb:74:in `execute_action' Executing action [app-destroy] using method oo_app_destroy with args [{"--with-app-uuid"=>"5136b0c76cec0e399f0000df", "--with-app-name"=>"chkexsrv2", "--with-container-uuid"=>"5136b0c76cec0e399f0000df", "--with-container-name"=>"chkexsrv2", "--with-namespace"=>"nagiosmonitor", "--with-uid"=>5388, "--with-request-id"=>"03e4855dda03cbb6373817bf1052aa96", "--cart-name"=>"openshift-origin-node"}]
I, [2013-03-06T14:55:51.027281 #13831] INFO -- : openshift.rb:168:in `rescue in oo_app_destroy' ERROR: unable to destroy user account 5136b0c76cec0e399f0000df
I, [2013-03-06T14:55:51.028915 #13831] INFO -- : openshift.rb:169:in `rescue in oo_app_destroy' ["/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.5.13/lib/openshift-origin-node/model/unix_user.rb:171:in `destroy'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.5.13/lib/openshift-origin-node/model/v1_cart_model.rb:61:in `destroy'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.5.13/lib/openshift-origin-node/model/application_container.rb:127:in `destroy'", "/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:166:in `oo_app_destroy'", "/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:76:in `execute_action'", "/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:48:in `cartridge_do_action'", "/opt/rh/ruby193/root/usr/share/ruby/mcollective/rpc/agent.rb:86:in `handlemsg'", "/opt/rh/ruby193/root/usr/share/ruby/mcollective/agents.rb:126:in `block (2 levels) in dispatch'", "/opt/rh/ruby193/root/usr/share/ruby/timeout.rb:68:in `timeout'", "/opt/rh/ruby193/root/usr/share/ruby/mcollective/agents.rb:125:in `block in dispatch'"]
I, [2013-03-06T14:55:51.029095 #13831] INFO -- : openshift.rb:83:in `execute_action' Finished executing action [app-destroy] (-1)
I, [2013-03-06T14:55:51.029263 #13831] INFO -- : openshift.rb:56:in `cartridge_do_action' cartridge_do_action failed (-1) ------ ERROR: unable to destroy user account 5136b0c76cec0e399f0000df ------)
I, [2013-03-06T14:55:54.171602 #13831] INFO -- : openshift.rb:34:in `cartridge_do_action' cartridge_do_action call / action: cartridge_do, agent=openshift, data={:cartridge=>"openshift-origin-node", :action=>"app-destroy", :args=> {"--with-app-uuid"=>"5136b0c76cec0e399f0000df", "--with-app-name"=>"chkexsrv2", "--with-container-uuid"=>"5136b0c76cec0e399f0000df", "--with-container-name"=>"chkexsrv2", "--with-namespace"=>"nagiosmonitor", "--with-uid"=>5388, "--with-request-id"=>"387867483a843088764957a26403f486", "--cart-name"=>"openshift-origin-node"}, :process_results=>true}
I, [2013-03-06T14:55:54.172302 #13831] INFO -- : openshift.rb:35:in `cartridge_do_action' cartridge_do_action validation = openshift-origin-node app-destroy {"--with-app-uuid"=>"5136b0c76cec0e399f0000df", "--with-app-name"=>"chkexsrv2", "--with-container-uuid"=>"5136b0c76cec0e399f0000df", "--with-container-name"=>"chkexsrv2", "--with-namespace"=>"nagiosmonitor", "--with-uid"=>5388, "--with-request-id"=>"387867483a843088764957a26403f486", "--cart-name"=>"openshift-origin-node"}
I, [2013-03-06T14:55:54.172674 #13831] INFO -- : openshift.rb:74:in `execute_action' Executing action [app-destroy] using method oo_app_destroy with args [{"--with-app-uuid"=>"5136b0c76cec0e399f0000df", "--with-app-name"=>"chkexsrv2", "--with-container-uuid"=>"5136b0c76cec0e399f0000df", "--with-container-name"=>"chkexsrv2", "--with-namespace"=>"nagiosmonitor", "--with-uid"=>5388, "--with-request-id"=>"387867483a843088764957a26403f486", "--cart-name"=>"openshift-origin-node"}]
I, [2013-03-06T14:55:54.178087 #13831] INFO -- : openshift.rb:168:in `rescue in oo_app_destroy' ERROR: unable to destroy user account 5136b0c76cec0e399f0000df
I, [2013-03-06T14:55:54.178311 #13831] INFO -- : openshift.rb:169:in `rescue in oo_app_destroy' ["/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.5.13/lib/openshift-origin-node/model/unix_user.rb:171:in `destroy'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.5.13/lib/openshift-origin-node/model/v1_cart_model.rb:61:in `destroy'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.5.13/lib/openshift-origin-node/model/application_container.rb:127:in `destroy'", "/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:166:in `oo_app_destroy'", "/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:76:in `execute_action'", "/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/agent/openshift.rb:48:in `cartridge_do_action'", "/opt/rh/ruby193/root/usr/share/ruby/mcollective/rpc/agent.rb:86:in `handlemsg'", "/opt/rh/ruby193/root/usr/share/ruby/mcollective/agents.rb:126:in `block (2 levels) in dispatch'", "/opt/rh/ruby193/root/usr/share/ruby/timeout.rb:68:in `timeout'", "/opt/rh/ruby193/root/usr/share/ruby/mcollective/agents.rb:125:in `block in dispatch'"]
I, [2013-03-06T14:55:54.178579 #13831] INFO -- : openshift.rb:83:in `execute_action' Finished executing action [app-destroy] (-1)
I, [2013-03-06T14:55:54.178746 #13831] INFO -- : openshift.rb:56:in `cartridge_do_action' cartridge_do_action failed (-1) ------ ERROR: unable to destroy user account 5136b0c76cec0e399f0000df ------)

Version-Release number of selected component (if applicable):
rhc-node-1.5.8-1.el6oso.x86_64

How reproducible:
unknown. Once the app is stuck like this, all attempts to destroy it fail.

Steps to Reproduce:
1. unknown

Actual results:
Can't delete the app.

Expected results:
Should always be able to delete an app.
The root cause of the observed error message is that the system is in a state where the user is missing from /etc/passwd but the broker still considers the application to exist. The UnixUser class reads /etc/passwd to get the location of the user's home directory. Because the user's uid is passed in from mcollective, the logic intended to check for the case of an already existing user passes: https://github.com/openshift/origin-server/blob/master/node/lib/openshift-origin-node/model/unix_user.rb#L164 The exception on the following line is then raised because @homedir is nil.
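To make the failure mode concrete, here is a minimal Ruby sketch of the state described above. It is not the code from unix_user.rb or from any fix; the class GearUser, the exception UserDeletionError, the methods destroy_strict and destroy_tolerant, and the hash standing in for /etc/passwd are all hypothetical names chosen for illustration. It assumes only what the comment states: the uid comes from the caller, the home directory comes from the passwd lookup, and a nil home directory triggers the "unable to destroy user account" error.

```ruby
# Hypothetical stand-in for the node's user model; not the real UnixUser.
class UserDeletionError < StandardError; end

class GearUser
  attr_reader :uuid, :uid, :homedir

  # passwd is a Hash of uuid => { uid:, dir: } standing in for /etc/passwd.
  # uid may be supplied by the caller (as mcollective does via --with-uid).
  def initialize(uuid, uid, passwd)
    @uuid = uuid
    entry = passwd[uuid]
    @uid = uid || (entry && entry[:uid])
    @homedir = entry && entry[:dir] # nil when the passwd entry is missing
  end

  # Assumed shape of the failing path: the early-return guard keys on uid,
  # which the caller always supplies, so a missing passwd entry falls through
  # to the sanity check and raises.
  def destroy_strict
    return :already_gone if @uid.nil? && @homedir.nil?
    if @uid.nil? || @homedir.nil? || @uuid.nil?
      raise UserDeletionError, "ERROR: unable to destroy user account #{@uuid}"
    end
    :destroyed
  end

  # One possible tolerant variant (an assumption, not the merged change):
  # treat a missing home directory as "nothing left to remove on this node".
  def destroy_tolerant
    return :already_gone if @homedir.nil?
    :destroyed
  end
end

# Usage: the gear is known to the broker but absent from the passwd data.
passwd = {}
user = GearUser.new("5136b0c76cec0e399f0000df", 5388, passwd)

begin
  user.destroy_strict
rescue UserDeletionError => e
  puts e.message
end

puts user.destroy_tolerant
```

The point of the sketch is only that a guard keyed on the uid cannot detect a missing passwd entry when the uid is always supplied by the caller; whichever value is actually derived from /etc/passwd (here the home directory) is the one that reveals the inconsistent state.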
Addendum to above comment: I am trying to determine how the system got into this state, as this could be due to another bug.
https://github.com/openshift/origin-server/pull/1575
Pull request has been merged.
Checked on devenv-stage_313, issue has been fixed.

1. Create app
2. Remove the user of the gear from /etc/passwd
3. Try to delete the app

App is deleted successfully. Mark bug as verified.