Description of problem:
If the netnamespace is deleted before the project, it does not re-create the netnamespace when the project is re-created.

Version-Release number of selected component (if applicable):
Reproduced in OpenShift v4.5.14

How reproducible:
100%

Steps to Reproduce:

1. Create a new project:

[test45cluster@upi-0 ~]$ oc new-project testproject

2. Check the created project and corresponding netnamespace entry:

[test45cluster@upi-0 ~]$ oc get projects | grep testproject
testproject   Active
[test45cluster@upi-0 ~]$ oc get netnamespace | grep testproject
testproject   16699832

3. Now delete the netnamespace first and then the project:

[test45cluster@upi-0 ~]$ oc delete netnamespace testproject
netnamespace.network.openshift.io "testproject" deleted
[quicklab@upi-0 ~]$ oc delete project testproject
project.project.openshift.io "testproject" deleted
[test45cluster@upi-0 ~]$ oc get projects | grep testproject
[test45cluster@upi-0 ~]$ oc get netnamespace | grep testproject

4. Now re-create the project with the same name:

[test45cluster@upi-0 ~]$ oc new-project testproject
[test45cluster@upi-0 ~]$ oc get netnamespace | grep testproject
[test45cluster@upi-0 ~]$ oc get projects | grep testproject
testproject   Active

Actual results:
While re-creating the project with the same name, the corresponding netnamespace is not created.

Expected results:
Re-creating a project after deleting should also re-create the corresponding netnamespace.

Additional info:
If we delete the project first, then it is working as expected.
----------
[test45cluster@upi-0 ~]$ oc new-project testproject2
[test45cluster@upi-0 ~]$ oc get projects | grep testproject2
testproject2   Active
[test45cluster@upi-0 ~]$ oc get netnamespace | grep testproject2
testproject2   16474270
[test45cluster@upi-0 ~]$ oc delete project testproject2
project.project.openshift.io "testproject2" deleted
[test45cluster@upi-0 ~]$ oc get projects | grep testproject2
[test45cluster@upi-0 ~]$ oc get netnamespace | grep testproject2
[test45cluster@upi-0 ~]$ oc new-project testproject2
[test45cluster@upi-0 ~]$ oc get projects | grep testproject2
testproject2   Active
[test45cluster@upi-0 ~]$ oc get netnamespace | grep testproject2
testproject2   6028713
----------
Reproduced on a 4.5.14 GCP cluster.

Deleting netns before project:

$ oc new-project test-amma
$ oc get netnamespace | grep test-amma
test-amma   2927755
$ oc delete netnamespace test-amma
$ oc delete project test-amma
$ oc get project | grep test-amma
$ oc get netnamespace | grep test-amma
$ oc new-project test-amma
$ oc get project | grep test-amma
test-amma   Active
$ oc get netnamespace | grep test-amma
$

Deleting project before netns:

$ oc new-project test-maam
$ oc get netnamespace | grep test-maam
test-maam   13424157
$ oc delete project test-maam
project.project.openshift.io "test-maam" deleted
$ oc get netnamespace | grep test-maam
$ oc get project | grep test-maam
$ oc new-project test-maam
$ oc get project | grep test-maam
test-maam   Active
$ oc get netnamespace | grep test-maam
test-maam   6935484

My suspicion is that the expected order of deletion is to delete the project, which wipes out the netns as well. When the netns is deleted separately, it is probably not tracked/removed properly from some cache. As a result, when the project is re-created with the same name, the netns is not created, probably because there is a stale entry in some watch cache. Looking into the code to confirm what is happening.
OK, so I looked into the code, and it is as I said in the previous comment. When the namespace/project deletion is triggered on the watcher, it immediately goes on to delete the netnamespace and call the corresponding revokeVNID. It also calls "releaseNetID" to remove this netns from the vmap *masterVNIDMap cache. The logic doesn't expect the netns to be already deleted, so it errors out and returns without removing the netid from the vmap. That is why, when the project is recreated with the same name, the netid/netns is not recreated: according to the vmap cache, that netns is already supposed to exist. I'll put up a PR that can fix this.
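To illustrate the failure mode described above, here is a minimal, self-contained Go sketch. It is a toy model, not the actual openshift/sdn code: the cluster type, its fields, and the tolerateMissing flag are hypothetical names made up for this example, with vnidCache standing in for the masterVNIDMap cache and the final delete standing in for releaseNetID. Treating "not found" as success is only one plausible way to fix it and is an assumption here; the real PR may do it differently.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the API server's "not found" error (assumption).
var errNotFound = errors.New("netnamespace not found")

// cluster is a toy model: API-side NetNamespace objects plus the
// master's in-memory cache (hypothetical stand-in for masterVNIDMap).
type cluster struct {
	netNamespaces map[string]uint32
	vnidCache     map[string]uint32
	nextID        uint32
}

func newCluster() *cluster {
	return &cluster{
		netNamespaces: map[string]uint32{},
		vnidCache:     map[string]uint32{},
	}
}

// createProject allocates a NetNamespace only if the cache has no entry
// for the name, mirroring the behaviour described in the comment above.
func (c *cluster) createProject(name string) {
	if _, ok := c.vnidCache[name]; ok {
		return // cache claims the netns already exists; nothing is created
	}
	c.nextID++
	c.vnidCache[name] = c.nextID
	c.netNamespaces[name] = c.nextID
}

// deleteNetNamespace removes the API object, like "oc delete netnamespace".
func (c *cluster) deleteNetNamespace(name string) error {
	if _, ok := c.netNamespaces[name]; !ok {
		return errNotFound
	}
	delete(c.netNamespaces, name)
	return nil
}

// deleteProject models the watcher's delete handler. With
// tolerateMissing=false it returns on any error and leaves a stale cache
// entry; with tolerateMissing=true a missing netns is not treated as fatal.
func (c *cluster) deleteProject(name string, tolerateMissing bool) error {
	if err := c.deleteNetNamespace(name); err != nil {
		if !(tolerateMissing && errors.Is(err, errNotFound)) {
			return err // early return: the cache entry is never released
		}
	}
	delete(c.vnidCache, name) // stand-in for releaseNetID
	return nil
}

// reproduce runs the steps from the bug description and reports whether
// the NetNamespace exists after the project is re-created.
func reproduce(tolerateMissing bool) bool {
	c := newCluster()
	c.createProject("testproject")
	_ = c.deleteNetNamespace("testproject") // netns deleted by hand first
	_ = c.deleteProject("testproject", tolerateMissing)
	c.createProject("testproject")
	_, ok := c.netNamespaces["testproject"]
	return ok
}

func main() {
	fmt.Println("without fix, netnamespace re-created:", reproduce(false)) // false
	fmt.Println("with fix, netnamespace re-created:", reproduce(true))     // true
}
```

In this model, the first call reproduces the reported behaviour (the stale cache entry blocks re-creation), and the second shows that releasing the cache entry even when the netns is already gone lets the re-created project get a fresh netid.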
Verified this bug on 4.7.0-0.nightly-2020-11-04-224753

1. oc new-project z1
2. oc delete netnamespace z1
3. oc delete project z1
4. oc new-project z1
5. Check the netnamespace is created

oc get netnamespace | grep z1
z1   13130068
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633