Bug 1322976

Summary: race condition in project deletion / recreation
Product: OKD Reporter: Andy Grimm <agrimm>
Component: Pod Assignee: Andy Goldstein <agoldste>
Status: CLOSED NOTABUG QA Contact: DeShuai Ma <dma>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 3.x CC: aos-bugs, decarr, jgoulding, mmccomas
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-03-31 20:36:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1303130    

Description Andy Grimm 2016-03-31 19:21:07 UTC
We have a monitoring script that re-uses a project name on each run to test app creation. Before it tries to create anything, it removes and recreates the project if it exists. This often fails, though, because the delete hasn't actually finished when the create is called:

[root@ded-stage-aws-master-57a6a ~]# oc new-project agrimm
Now using project "agrimm" on server "https://internal.api.ded-stage-aws.openshift.com".

You can add applications to this project with the 'new-app' command. For example, try:

    $ oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-hello-world.git

to build a new hello-world application in Ruby.
[root@ded-stage-aws-master-57a6a ~]# oc delete project agrimm && oc new-project agrimm
project "agrimm" deleted
Error from server: project "agrimm" already exists
[root@ded-stage-aws-master-57a6a ~]# oc get project | grep agrimm
[root@ded-stage-aws-master-57a6a ~]# 


We can work around it by inserting a "sleep" between the calls, but how long is deletion expected to take? Do we need to just use unique project names each time?

Comment 1 Derek Carr 2016-03-31 20:36:08 UTC
The call to oc delete project only initiates project deletion; it does not wait for deletion to finish.

Project deletion works as follows:

1. The project's phase is marked as Terminating.
2. One or more controllers see that the project is terminating and delete all of its existing content.
3. After every controller has finished, the project is removed.
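
One way to observe the intermediate state described in the steps above is to read the project's phase directly. The following is a minimal sketch, not a tested recipe: project_phase is a hypothetical helper name, and it assumes the phase is exposed at .status.phase and that the oc client in use supports -o jsonpath.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of oc): print the lifecycle phase of a project.
# Assumes the phase lives at .status.phase and that this oc build accepts
# -o jsonpath. Prints nothing if the lookup fails (for example, not found).
project_phase() {
    local project="$1"
    oc get project "$project" -o jsonpath='{.status.phase}' 2>/dev/null
}
```

Under those assumptions, running project_phase foo shortly after oc delete project foo would be expected to print Terminating until the controllers have finished.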

The right thing to do in your script is to use logic like the following:

1. oc delete project foo
2. Wait until oc get project foo returns not found (poll every 2s).
3. oc new-project foo

Comment 2 Derek Carr 2016-03-31 20:38:42 UTC
oc get project foo should be called with cluster-admin credentials, or with some other credential whose access to the project is granted outside of the project scope itself.
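
The delete, poll, recreate sequence suggested in this thread could be sketched in bash roughly as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function names and the POLL_INTERVAL and MAX_WAIT variables are hypothetical, the 300-second timeout is an arbitrary choice that does not come from the bug report, and any non-zero exit from oc get project is treated as "not found". That last assumption only holds when the command runs with credentials that can still see the project while it is terminating, as noted in comment 2.

```shell
#!/usr/bin/env bash

# Hypothetical helpers; names, variables, and the timeout are illustrative.
POLL_INTERVAL="${POLL_INTERVAL:-2}"   # seconds between polls (2s per comment 1)
MAX_WAIT="${MAX_WAIT:-300}"           # assumed upper bound, not from the report

# Poll until `oc get project NAME` fails, which this sketch treats as
# "project is gone". Returns 1 if MAX_WAIT seconds pass first.
wait_for_project_gone() {
    local project="$1"
    local interval="${2:-$POLL_INTERVAL}"
    local max_wait="${3:-$MAX_WAIT}"
    local waited=0

    while oc get project "$project" >/dev/null 2>&1; do
        if [ "$waited" -ge "$max_wait" ]; then
            echo "timed out waiting for project $project to be deleted" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}

# Delete the project if it exists, wait for deletion to finish, then recreate.
recreate_project() {
    local project="$1"

    if oc get project "$project" >/dev/null 2>&1; then
        oc delete project "$project" || return 1
        wait_for_project_gone "$project" || return 1
    fi
    oc new-project "$project"
}
```

A monitoring script would source these functions and call recreate_project agrimm in place of the chained oc delete project agrimm && oc new-project agrimm. Bounding the wait keeps the monitor from hanging indefinitely if a controller never finishes cleaning up.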