Bug 1525799 - master-controllers panic and crash repeatedly with "fatal error: concurrent map writes"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Master
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.6.z
Assignee: Dan Mace
QA Contact: Wang Haoran
URL:
Whiteboard:
Depends On: 1519277
Blocks:
Reported: 2017-12-14 05:03 UTC by Takayoshi Kimura
Modified: 2018-01-25 22:38 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-01-23 17:59:00 UTC
Target Upstream Version:



Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 3318671 | None | None | None | 2018-01-25 22:38:50 UTC
Red Hat Product Errata | RHBA-2018:0113 | normal | SHIPPED_LIVE | OpenShift Container Platform 3.7 and 3.6 bug fix and enhancement update | 2018-01-23 22:55:59 UTC

Description Takayoshi Kimura 2017-12-14 05:03:33 UTC
Description of problem:

The master-controllers process panics and crashes repeatedly with "fatal error: concurrent map writes" during StatefulSet processing:

atomic-openshift-master-controllers[122375]: I1214 02:36:04.316381  122375 stateful_set.go:420] Syncing StatefulSet myproject/mypet with 5 pods
atomic-openshift-master-controllers[122375]: I1214 02:36:04.316442  122375 stateful_set_control.go:147] StatefulSet mypet is waiting for Pod mypet-1 to be Running and Ready
atomic-openshift-master-controllers[122375]: I1214 02:36:04.316448  122375 stateful_set.go:425] Succesfully synced StatefulSet myproject/mypet successful
atomic-openshift-master-controllers[122375]: fatal error: concurrent map writes
atomic-openshift-master-controllers[122375]: goroutine 2536 [running]:
atomic-openshift-master-controllers[122375]: runtime.throw(0x511810e, 0x15)
atomic-openshift-master-controllers[122375]: /usr/lib/golang/src/runtime/panic.go:566 +0x95 fp=0xc433a3f100 sp=0xc433a3f0e0
atomic-openshift-master-controllers[122375]: runtime.mapassign1(0x4850280, 0xc4311f2720, 0xc433a3f2e0, 0xc433a3f2d0)
atomic-openshift-master-controllers[122375]: /usr/lib/golang/src/runtime/hashmap.go:458 +0x8ef fp=0xc433a3f1e8 sp=0xc433a3f100
atomic-openshift-master-controllers[122375]: github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/api/v1.Convert_v1_Pod_To_api_Pod(0xc42d6fa900, 0xc43828db00, 0x0, 0x0, 0x7000100, 0x0)

Version-Release number of selected component (if applicable):

3.6 upgraded from 3.5

How reproducible:

Always in the customer environment; the process crashes every 2 minutes.

Steps to Reproduce:
1.
2.
3.

Actual results:

The process crashes and generates core dumps, which fill the disk.

Expected results:

no crash

Additional info:

Comment 14 Michal Fojtik 2017-12-14 10:25:24 UTC
Dan, Tomas found: https://github.com/kubernetes/kubernetes/pull/52092

Any chance we can backport that to 3.6? (if we have not already done so)

Comment 15 Dan Mace 2017-12-14 14:02:49 UTC
(In reply to Michal Fojtik from comment #14)
> Dan, Tomas found: https://github.com/kubernetes/kubernetes/pull/52092
> 
> Any chance we can backport that to 3.6? (if we have not already done so)

Good news: the 3.6 backport is already under way. See https://bugzilla.redhat.com/show_bug.cgi?id=1519277.

Comment 17 Wang Haoran 2018-01-05 03:33:35 UTC
Verified according to this comment: https://bugzilla.redhat.com/show_bug.cgi?id=1519277#c17

Comment 20 errata-xmlrpc 2018-01-23 17:59:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0113

