Bug 1379316

Summary: [public_networking_291] Panic error in master log when adding/deleting project repeatedly
Product: OKD
Reporter: Meng Bo <bmeng>
Component: Pod
Assignee: Derek Carr <decarr>
Status: CLOSED CURRENTRELEASE
QA Contact: zhaozhanqi <zzhao>
Severity: medium
Priority: medium
Version: 3.x
CC: aos-bugs, bbennett, bmeng, dma, jliggitt, mmccomas, wmeng, xtian
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2016-12-09 21:52:47 UTC
Type: Bug
Attachments:
Description Flags
master_log_for_panic none

Description Meng Bo 2016-09-26 11:07:47 UTC
Created attachment 1204782
master_log_for_panic

Description of problem:
A panic error appeared in the master log while I was repeatedly adding and deleting projects.

Version-Release number of selected component (if applicable):
openshift v1.4.0-alpha.0+b3ec794
kubernetes v1.4.0-beta.3+d19513f
etcd 3.0.9
git commit: b3ec794bf74052d928b2927e93a32dba1738fd54

How reproducible:
unknown

Steps to Reproduce:
1. Set up a multi-node environment.
2. Create multiple projects as a user:
$ for i in {1..100}; do oc new-project userp$i ; done
3. Delete all the projects:
$ oc delete project --all
4. Run another add/delete project loop:
$ while true; do oc new-project userxx ; oc delete project userxx ; done
5. Watch the master log.
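
The steps above could be wrapped in a small script along these lines. This is only a sketch: the `OC`, `COUNT`, and `ROUNDS` variables and the function names are hypothetical conveniences, not part of the original report, and the endless `while true` loop from step 4 is replaced with a bounded loop so the script terminates on its own.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Assumes `oc` is already logged in to the cluster.
# OC, COUNT and ROUNDS are hypothetical knobs added for illustration.
OC="${OC:-oc}"          # client binary to invoke
COUNT="${COUNT:-100}"   # number of projects created in step 2
ROUNDS="${ROUNDS:-50}"  # add/delete iterations in step 4 (the report loops forever)

# Step 2: create userp1 .. userp$COUNT.
create_projects() {
  cp_i=1
  while [ "$cp_i" -le "$COUNT" ]; do
    "$OC" new-project "userp$cp_i"
    cp_i=$((cp_i + 1))
  done
}

# Step 3: delete every project visible to the current user.
delete_all_projects() {
  "$OC" delete project --all
}

# Step 4: repeatedly create and delete the same project.
# Failures (for example, the project is still terminating) are ignored,
# as in the original one-liner.
churn_project() {
  ch_n=1
  while [ "$ch_n" -le "$ROUNDS" ]; do
    "$OC" new-project userxx
    "$OC" delete project userxx
    ch_n=$((ch_n + 1))
  done
}
```

Calling `create_projects`, `delete_all_projects`, and `churn_project` in that order mirrors steps 2 to 4; the master log still has to be watched separately.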

Actual results:
A panic error appears in the master log.

Expected results:
The master should not panic.

Additional info:
Master log and core dump file attached.

Comment 2 Dan Williams 2016-09-28 18:21:13 UTC
This looks more like a bug in upstream Kubernetes than in OpenShift.  My best guess is that *namespace.DeletionTimestamp is causing the panic because the namespace object returned from retryOnConflictError() has been modified...

Filed upstream: https://github.com/kubernetes/kubernetes/issues/33676

Should we close this as UPSTREAM resolution?

Comment 3 Meng Bo 2016-09-29 02:12:29 UTC
Thanks, that is OK with me.

Comment 5 Derek Carr 2016-10-07 03:07:22 UTC
I am curious if this was an HA setup?

Comment 6 Derek Carr 2016-10-07 03:45:36 UTC
Upstream PR:
https://github.com/kubernetes/kubernetes/pull/34298

Comment 7 Meng Bo 2016-10-08 02:34:23 UTC
(In reply to Derek Carr from comment #5)
> I am curious if this was an HA setup?

In my testing, it was not an HA environment.

Comment 8 Derek Carr 2016-10-27 21:11:20 UTC
Origin PR:
https://github.com/openshift/origin/pull/11632

Comment 9 Derek Carr 2016-11-01 14:22:13 UTC
Origin PR merged.

Comment 10 zhaozhanqi 2016-11-03 09:20:54 UTC
Tested this issue on 

# openshift version
openshift v3.4.0.19+346a31d
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

Repeatedly ran the add/delete namespace steps. No panic was found in the master logs.

Verified this bug.
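
A check like the one described in this verification could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the master log is assumed to have been saved to a file whose path is passed as an argument, and a case-insensitive match on the word "panic" is an assumed heuristic rather than the exact message the master prints.

```shell
#!/bin/sh
# Hypothetical helpers for scanning a saved master log for panic messages.
# The log location is not specified in this report; pass the path explicitly.

# Print the number of lines that mention "panic" (case-insensitive).
count_panics() {
  # grep -c prints the count but exits 1 when the count is 0, so mask that status.
  grep -ci 'panic' "$1" || true
}

# Report the result and return non-zero if any panic line is present.
check_master_log() {
  if [ ! -r "$1" ]; then
    echo "cannot read log file: $1" >&2
    return 2
  fi
  cm_hits=$(count_panics "$1")
  if [ "$cm_hits" -eq 0 ]; then
    echo "no panic found in $1"
  else
    echo "$cm_hits panic line(s) found in $1"
    return 1
  fi
}
```

Running `check_master_log` against the captured log after the add/delete loop gives a quick pass/fail signal, though reading the surrounding stack trace is still needed to confirm which code path panicked.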