Bug 1322976 - race condition in project deletion / recreation
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OKD
Classification: Red Hat
Component: Pod
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Andy Goldstein
QA Contact: DeShuai Ma
URL:
Whiteboard:
Depends On:
Blocks: OSOPS_V3
Reported: 2016-03-31 19:21 UTC by Andy Grimm
Modified: 2016-11-08 03:48 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-31 20:36:08 UTC
Target Upstream Version:
Embargoed:


Description Andy Grimm 2016-03-31 19:21:07 UTC
We have a monitoring script that re-uses a project name on each run to test application creation. Before it tries to create anything, it deletes the project if it exists and then recreates it. This often fails, though, because the delete hasn't actually finished when the create is called:

[root@ded-stage-aws-master-57a6a ~]# oc new-project agrimm
Now using project "agrimm" on server "https://internal.api.ded-stage-aws.openshift.com".

You can add applications to this project with the 'new-app' command. For example, try:

    $ oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-hello-world.git

to build a new hello-world application in Ruby.
[root@ded-stage-aws-master-57a6a ~]# oc delete project agrimm && oc new-project agrimm
project "agrimm" deleted
Error from server: project "agrimm" already exists
[root@ded-stage-aws-master-57a6a ~]# oc get project | grep agrimm
[root@ded-stage-aws-master-57a6a ~]# 


We can insert a "sleep" in between the calls which will work around it, but how long is deletion expected to take? Do we need to just use unique project names each time?

Comment 1 Derek Carr 2016-03-31 20:36:08 UTC
The call to oc delete project only initiates project deletion.

Project deletion works as follows:

1. The project's phase is marked as Terminating.
2. One or more controllers see that the project is terminating and delete all of its existing content.
3. After every controller finishes, the project itself is removed.

The right thing to do in your script is to use logic like the following:

1. oc delete project foo
2. Wait until oc get project foo returns not found (poll every 2s)
3. oc new-project foo
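
The three steps above could be scripted roughly as follows. This is only a sketch, not a tested production script. The function names, the POLL_INTERVAL and TIMEOUT variables, and the 300-second limit are assumptions made for illustration; only the oc delete project, oc get project, and oc new-project calls and the 2-second poll come from this report. It also treats any failure of oc get project as "not found", which is a simplification.

```shell
#!/usr/bin/env bash
# Sketch: delete a project, wait until it is really gone, then recreate it.
# Run the polling with credentials that can still see the project while it
# is terminating (for example cluster-admin).

POLL_INTERVAL="${POLL_INTERVAL:-2}"   # seconds between polls (2s as suggested above)
TIMEOUT="${TIMEOUT:-300}"             # assumed upper bound, tune for your cluster

# Returns 0 once "oc get project NAME" fails, 1 if TIMEOUT expires first.
# Simplification: any failure of "oc get" is treated as "not found".
wait_for_project_gone() {
    local name="$1" waited=0
    while oc get project "$name" >/dev/null 2>&1; do
        if [ "$waited" -ge "$TIMEOUT" ]; then
            echo "timed out waiting for project \"$name\" to be removed" >&2
            return 1
        fi
        sleep "$POLL_INTERVAL"
        waited=$((waited + POLL_INTERVAL))
    done
}

# Deletes NAME if it exists, waits for the deletion to finish, then creates it.
recreate_project() {
    local name="$1"
    if oc get project "$name" >/dev/null 2>&1; then
        oc delete project "$name" || return 1
        wait_for_project_gone "$name" || return 1
    fi
    oc new-project "$name"
}
```

A monitoring script would source this and call recreate_project agrimm in place of oc delete project agrimm && oc new-project agrimm.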

Comment 2 Derek Carr 2016-03-31 20:38:42 UTC
oc get project foo should be called with cluster-admin credentials, or with some credential whose access to the project is granted outside of the project scope itself.

