Description of problem:
master-controllers panic and crash repeatedly with "fatal error: concurrent map writes" during StatefulSet processing:

atomic-openshift-master-controllers[122375]: I1214 02:36:04.316381 122375 stateful_set.go:420] Syncing StatefulSet myproject/mypet with 5 pods
atomic-openshift-master-controllers[122375]: I1214 02:36:04.316442 122375 stateful_set_control.go:147] StatefulSet mypet is waiting for Pod mypet-1 to be Running and Ready
atomic-openshift-master-controllers[122375]: I1214 02:36:04.316448 122375 stateful_set.go:425] Succesfully synced StatefulSet myproject/mypet successful
atomic-openshift-master-controllers[122375]: fatal error: concurrent map writes
atomic-openshift-master-controllers[122375]: goroutine 2536 [running]:
atomic-openshift-master-controllers[122375]: runtime.throw(0x511810e, 0x15)
atomic-openshift-master-controllers[122375]: /usr/lib/golang/src/runtime/panic.go:566 +0x95 fp=0xc433a3f100 sp=0xc433a3f0e0
atomic-openshift-master-controllers[122375]: runtime.mapassign1(0x4850280, 0xc4311f2720, 0xc433a3f2e0, 0xc433a3f2d0)
atomic-openshift-master-controllers[122375]: /usr/lib/golang/src/runtime/hashmap.go:458 +0x8ef fp=0xc433a3f1e8 sp=0xc433a3f100
atomic-openshift-master-controllers[122375]: github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/api/v1.Convert_v1_Pod_To_api_Pod(0xc42d6fa900, 0xc43828db00, 0x0, 0x0, 0x7000100, 0x0)

Version-Release number of selected component (if applicable):
3.6, upgraded from 3.5

How reproducible:
Always in the customer environment; master-controllers crash every 2 minutes.

Steps to Reproduce:
1.
2.
3.

Actual results:
master-controllers crash and generate core dumps, which fill the disk.

Expected results:
No crash.

Additional info:
Dan, Tomas found: https://github.com/kubernetes/kubernetes/pull/52092

Any chance we can backport that to 3.6? (if we have not already done so)
(In reply to Michal Fojtik from comment #14)
> Dan, Tomas found: https://github.com/kubernetes/kubernetes/pull/52092
>
> Any chance we can backport that to 3.6? (if we have not already done so)

Good news: the 3.6 backport is already under way. See https://bugzilla.redhat.com/show_bug.cgi?id=1519277.
Verified according to this comment: https://bugzilla.redhat.com/show_bug.cgi?id=1519277#c17
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0113