Bug 1013529

Summary: [origin_broker_98]Met error when deleting jenkins app by oo-admin-repair --removed-nodes
Product: OpenShift Online
Reporter: zhaozhanqi <zzhao>
Component: Pod
Assignee: Ravi Sankar <rpenta>
Status: CLOSED CURRENTRELEASE
QA Contact: libra bugs <libra-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 2.x
CC: pruan, rpenta, xtian
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-10-17 13:33:10 UTC
Type: Bug

Description zhaozhanqi 2013-09-30 09:51:56 UTC
Description of problem:
Given a jenkins app whose gear's node is down, running 'oo-admin-repair --removed-nodes' fails with the error 'Unable to delete application with id: 5248e067f474ca643d000027, error: Failed to correctly execute all parallel operations'.

Version-Release number of selected component (if applicable):
devenv_3844

How reproducible:
Random

Steps to Reproduce:
1. Create a jenkins server app.
2. Stop the node that hosts the jenkins gear.
3. Run oo-admin-repair --removed-nodes.
4. Run rhc app show jenkins.

Actual results:
oo-admin-repair --removed-nodes
Started at: 2013-09-30 05:24:06 -0400
Time to fetch mongo data: 0.196s
Total gears found in mongo: 36
Servers that are unresponsive:
	Server: ip-10-179-17-116 (district: dist1), Confirm [yes/no]: 
yes
	Server: ip-10-185-17-117 (district: dist2), Confirm [yes/no]: 
yes
Check failed.
Some servers are unresponsive: ip-10-179-17-116, ip-10-185-17-117


Do you want to delete unresponsive servers from their respective districts [yes/no]: no 
Found 13 unresponsive unscalable apps:
jenkins (id: 5248e067f474ca643d000027)
ews2 (id: 5248e0c8f474ca643d000054)
zqjbossas2 (id: 5248e142f474ca7f5f000068)
zqpy33 (id: 5248e7e2f474ca36aa0001bf)
zqeap2 (id: 5248e3f4f474ca36aa0000e3)
zqphp (id: 5248e839f474ca97f9000139)
zqzend (id: 5248e8cef474ca97f9000161)
zqperl (id: 5248e92bf474ca97f9000182)
zqdiy (id: 5248ee4ff474ca97f90001e2)
jenkins (id: 52492537f474ca97f900027a)
zqjbossas1 (id: 5248e004f474ca36aa00000d)
zqnodejs (id: 5248ee77f474ca97f90001fc)
jbosstest1 (id: 5249115ff474ca97f900022b)
These apps can not be recovered. Do you want to delete all of them [yes/no]: yes
Unable to delete application with id: 5248e067f474ca643d000027, error: Failed to correctly execute all parallel operations
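
For triage, the transcript above can be reduced to the apps that were offered for deletion and the ones whose deletion failed. The following is a minimal Python sketch that assumes only the two line formats visible in this transcript ("name (id: ...)" and "Unable to delete application with id: ..., error: ..."); the function and pattern names are hypothetical and not part of oo-admin-repair. Results are keyed by app id because two apps in the list share the name jenkins.

```python
import re

# Line formats copied from the oo-admin-repair transcript; names are hypothetical.
APP_LINE = re.compile(r"^(?P<name>\S+) \(id: (?P<id>[0-9a-f]{24})\)$")
FAIL_LINE = re.compile(
    r"^Unable to delete application with id: (?P<id>[0-9a-f]{24}), "
    r"error: (?P<error>.+)$"
)


def summarize_repair_output(text):
    """Return (apps, failures) parsed from oo-admin-repair console output."""
    apps = {}      # app id -> app name
    failures = {}  # app id -> error message
    for raw in text.splitlines():
        line = raw.strip()
        match = APP_LINE.match(line)
        if match:
            apps[match.group("id")] = match.group("name")
            continue
        match = FAIL_LINE.match(line)
        if match:
            failures[match.group("id")] = match.group("error")
    return apps, failures


# Shortened excerpt of the transcript in this report.
sample = """\
Found 13 unresponsive unscalable apps:
jenkins (id: 5248e067f474ca643d000027)
ews2 (id: 5248e0c8f474ca643d000054)
jenkins (id: 52492537f474ca97f900027a)
These apps can not be recovered. Do you want to delete all of them [yes/no]: yes
Unable to delete application with id: 5248e067f474ca643d000027, error: Failed to correctly execute all parallel operations
"""

apps, failures = summarize_repair_output(sample)
for app_id, error in failures.items():
    print(f"{apps.get(app_id, '?')} ({app_id}): {error}")
```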

After the failed deletion, the jenkins app from step 4 is still listed, but with 0 gears:

rhc domain show
Domain zqd
----------
  Created:            Sep 29 10:18 PM
  Allowed Gear Sizes: small

  jenkins @ http://jenkins-zqd.dev.rhcloud.com/ (uuid: 5248e067f474ca643d000027)
  ------------------------------------------------------------------------------
    Domain:  zqd
    Created: Sep 29 10:22 PM
    Gears:   0 (defaults to small)  >>>>>>>>>>>>>>>>>> 0
    Git URL: ssh:///~/git/jenkins.git/
    SSH:     ssh://


Expected results:

The jenkins app and its gear should be deleted without error.

Additional info:

Comment 1 zhaozhanqi 2013-09-30 10:14:56 UTC
Development.log


2013-09-30 06:13:51.715 [DEBUG] Execute NotifyAppDeleteOp (pid:23306)
2013-09-30 06:13:51.820 [DEBUG] Execute DeleteCompOp (pid:23306)
2013-09-30 06:13:51.859 [DEBUG] Execute UnsubscribeConnectionsOp (pid:23306)
2013-09-30 06:13:51.870 [DEBUG] Execute DeleteGroupInstanceOp (pid:23306)
2013-09-30 06:13:51.899 [DEBUG] Execute ExecuteConnectionsOp (pid:23306)
2013-09-30 06:13:51.952 [DEBUG] Execute UpdateAppConfigOp (pid:23306)
2013-09-30 06:13:52.332 [DEBUG] DEBUG: Output of parallel execute: [{:tag=>{"op_id"=>"52494edff474ca5389000011"}, :gear=>"5248e26af474ca36aa000072", :job=>{:cartridge=>"openshift-origin-node", :action=>"authorized-ssh-key-remove", :args=>{"--with-app-uuid"=>"5248e26af474ca36aa000072", "--with-app-name"=>"stestmongo", "--with-container-uuid"=>"5248e26af474ca36aa000072", "--with-container-name"=>"stestmongo", "--with-namespace"=>"zqd", "--with-uid"=>1024, "--with-request-id"=>nil, "--with-ssh-key"=>"AAAAB3NzaC1yc2EAAAABIwAAAQEArOXv7Up0So/xp3p5WxPe1gDYAy2K5kRSXkDnf6CbFGI3NxC7qO91VNop9oRN6EeY4lwU3f5r0lAz084Uxdk2TPUpnOLFS99p2Xu+P1UKzZ3+BA5met1tbNrU+LuzbT6KZsxPXrXDHWwDM85/3p+yEQP6clZy7Vd9Q/pL3BRPDYFlQPmV7IkiW9SGwkDE6DUPrzm/leoP0v5/qmb2FheJJS5jhNO+BXJ7jkyTlnr2frNpEWj90xOgt9+FMoFLoBbx/AYn2yqWEdGu4cQX9VIWrQiUtEOtZAD4LtzU7Zj1XF/8IrEXZf3egH/Ttd8DFN9UDqaiWVPtTeB0X7J+WTDmZw==", "--with-ssh-comment"=>"domain-jenkins2"}}, :result_stdout=>"No such file or directory - /var/lib/openshift/5248e26af474ca36aa000072/.ssh/authorized_keys", :result_stderr=>"", :result_exit_code=>-1}], exitcode: 0, from: ip-10-184-22-132  (Request ID: ) (pid:23306)
2013-09-30 06:13:52.355 [DEBUG] DEBUG: MCollective Response Time (execute_parallel): 0.076117891s  (Request ID: ) (pid:23306)
2013-09-30 06:13:52.360 [DEBUG] DEBUG: server results: No such file or directory - /var/lib/openshift/5248e26af474ca36aa000072/.ssh/authorized_keys (pid:23306)
2013-09-30 06:13:52.552 [ERROR] Failed to correctly execute all parallel operations (pid:23306)
2013-09-30 06:13:52.557 [ERROR] ["/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_app_op_group.rb:129:in `execute'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:1378:in `run_jobs'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:357:in `block in remove_ssh_keys'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:1422:in `run_in_application_lock'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:351:in `remove_ssh_keys'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_ops/remove_system_ssh_keys_domain_op.rb:7:in `block in execute'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_ops/remove_system_ssh_keys_domain_op.rb:7:in `each'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_ops/remove_system_ssh_keys_domain_op.rb:7:in `execute'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/domain.rb:233:in `run_jobs'", "/opt/rh/ruby193/root/usr/share/gems/gems/mongoid-3.0.21/lib/mongoid/relations/proxy.rb:143:in `method_missing'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:671:in `remove_features'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:693:in `destroy_app'", "/usr/sbin/oo-admin-repair:812:in `block in <main>'", "/usr/sbin/oo-admin-repair:810:in `each'", "/usr/sbin/oo-admin-repair:810:in `<main>'"] (pid:23306)
2013-09-30 06:13:52.559 [ERROR] Failed to correctly execute all parallel operations (pid:23306)
2013-09-30 06:13:52.564 [DEBUG] ["/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_app_op_group.rb:129:in `execute'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:1378:in `run_jobs'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:357:in `block in remove_ssh_keys'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:1422:in `run_in_application_lock'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:351:in `remove_ssh_keys'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_ops/remove_system_ssh_keys_domain_op.rb:7:in `block in execute'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_ops/remove_system_ssh_keys_domain_op.rb:7:in `each'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_ops/remove_system_ssh_keys_domain_op.rb:7:in `execute'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/domain.rb:233:in `run_jobs'", "/opt/rh/ruby193/root/usr/share/gems/gems/mongoid-3.0.21/lib/mongoid/relations/proxy.rb:143:in `method_missing'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:671:in `remove_features'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:693:in `destroy_app'", "/usr/sbin/oo-admin-repair:812:in `block in <main>'", "/usr/sbin/oo-admin-repair:810:in `each'", "/usr/sbin/oo-admin-repair:810:in `<main>'"] (pid:23306)
2013-09-30 06:13:52.604 [ERROR] Failed to correctly execute all parallel operations (pid:23306)
2013-09-30 06:13:52.607 [ERROR] ["/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_app_op_group.rb:129:in `execute'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:1378:in `run_jobs'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:357:in `block in remove_ssh_keys'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:1422:in `run_in_application_lock'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:351:in `remove_ssh_keys'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_ops/remove_system_ssh_keys_domain_op.rb:7:in `block in execute'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_ops/remove_system_ssh_keys_domain_op.rb:7:in `each'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/pending_ops/remove_system_ssh_keys_domain_op.rb:7:in `execute'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/domain.rb:233:in `run_jobs'", "/opt/rh/ruby193/root/usr/share/gems/gems/mongoid-3.0.21/lib/mongoid/relations/proxy.rb:143:in `method_missing'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:671:in `remove_features'", "/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.15.5/app/models/application.rb:693:in `destroy_app'", "/usr/sbin/oo-admin-repair:812:in `block in <main>'", "/usr/sbin/oo-admin-repair:810:in `each'", "/usr/sbin/oo-admin-repair:810:in `<main>'"] (pid:23306)
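
The "Output of parallel execute" line above carries the detail that explains the error: the failing job is authorized-ssh-key-remove on gear 5248e26af474ca36aa000072 (app stestmongo), not on the jenkins gear itself, and it exits with -1 because authorized_keys is missing. The sketch below pulls those fields out of such a line with regular expressions. It assumes the Ruby-inspect layout shown in this log; the helper name and field keys are hypothetical, and the sample line is abbreviated (most job arguments, including the SSH key, are left out).

```python
import re

# Patterns for the Ruby-inspect hash printed by the broker; keys are hypothetical.
FIELDS = {
    "gear": re.compile(r':gear=>"([0-9a-f]+)"'),
    "app_name": re.compile(r'"--with-app-name"=>"([^"]+)"'),
    "action": re.compile(r':action=>"([^"]+)"'),
    "stdout": re.compile(r':result_stdout=>"((?:[^"\\]|\\.)*)"'),
    "exit_code": re.compile(r":result_exit_code=>(-?\d+)"),
}


def parse_job_result(line):
    """Extract the first job result found in a parallel-execute log line."""
    result = {}
    for key, pattern in FIELDS.items():
        match = pattern.search(line)
        result[key] = match.group(1) if match else None
    if result["exit_code"] is not None:
        result["exit_code"] = int(result["exit_code"])
    return result


# Abbreviated copy of the log line from this report.
sample = (
    '[{:tag=>{"op_id"=>"52494edff474ca5389000011"}, '
    ':gear=>"5248e26af474ca36aa000072", '
    ':job=>{:cartridge=>"openshift-origin-node", '
    ':action=>"authorized-ssh-key-remove", '
    ':args=>{"--with-app-name"=>"stestmongo", '
    '"--with-ssh-comment"=>"domain-jenkins2"}}, '
    ':result_stdout=>"No such file or directory - '
    '/var/lib/openshift/5248e26af474ca36aa000072/.ssh/authorized_keys", '
    ':result_stderr=>"", :result_exit_code=>-1}]'
)

job = parse_job_result(sample)
if job["exit_code"] != 0:
    print(f'{job["action"]} failed on gear {job["gear"]} '
          f'({job["app_name"]}): {job["stdout"]}')
```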

Comment 2 Ravi Sankar 2013-09-30 18:47:45 UTC
Unable to reproduce the issue on latest devenv.

Comment from agupta: the node appears to be up and running, but the gear is missing from it. This might be the result of manually deleting the gear's home directory on the node.

If the issue is still reproducible, please provide the reproduction steps and attach development.log.
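
To illustrate the hypothesis above: if a gear's home directory has been removed by hand, any attempt to edit its .ssh/authorized_keys raises "No such file or directory", which matches the result_stdout in development.log. The Python sketch below shows a key-removal routine that treats a missing file as "nothing to remove". It is an illustration only, under the assumption that each key line ends with its comment; it is not the node's actual authorized-ssh-key-remove code, nor the change that closed this bug, and the function name is hypothetical.

```python
import os
import tempfile


def remove_authorized_key(path, key_comment):
    """Drop lines ending in key_comment; return True if the file was changed.

    A missing authorized_keys file is treated as "nothing to remove"
    instead of an error.
    """
    try:
        with open(path) as handle:
            lines = handle.readlines()
    except FileNotFoundError:
        return False
    kept = [
        line for line in lines
        if not line.rstrip("\n").endswith(" " + key_comment)
    ]
    if len(kept) == len(lines):
        return False
    with open(path, "w") as handle:
        handle.writelines(kept)
    return True


with tempfile.TemporaryDirectory() as home:
    path = os.path.join(home, "authorized_keys")

    # Gear home was wiped: the file does not exist, and that is not an error.
    missing = remove_authorized_key(path, "domain-jenkins2")

    # Normal case: the file exists and contains the key to remove.
    with open(path, "w") as handle:
        handle.write("ssh-rsa AAAA user-key\nssh-rsa BBBB domain-jenkins2\n")
    removed = remove_authorized_key(path, "domain-jenkins2")
    with open(path) as handle:
        remaining = handle.read()

print(missing, removed, repr(remaining))
```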

Comment 3 Peter Ruan 2013-10-03 07:52:19 UTC
Unable to reproduce the bug on devenv_3854. Please reopen if the problem occurs again.