Bug 1473031 - fatal error: concurrent map read and map write
Status: MODIFIED
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 3.6.0
Hardware: Unspecified OS: Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: ---
Target Release: 3.7.0
Assigned To: Ben Bennett
QA Contact: zhaozhanqi
Keywords: NeedsTestCase
Depends On:
Blocks:
Reported: 2017-07-19 17:50 EDT by Eric Paris
Modified: 2017-09-18 23:10 EDT
CC List: 3 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Missing locking around a router data structure.
Consequence: The router pod would (very occasionally) crash and restart.
Fix: Add the appropriate locking.
Result: The invalid data access does not crash the router.
Type: Bug


Attachments
Logs with backtrace (38.57 KB, text/plain)
2017-07-19 17:51 EDT, Eric Paris

External Trackers
Tracker          ID     Priority  Status  Summary  Last Updated
Origin (Github)  15385  None      None    None     2017-07-21 10:42 EDT

Description Eric Paris 2017-07-19 17:50:12 EDT
I found a router that had a 'restart'.

ose-haproxy-router:v3.6.126.1

I looked at the logs for the last pod and found:

  - spec.tls.key: Invalid value: "redacted key data": unrecognized PEM block DSA PRIVATE KEY
E0718 14:14:42.201169       1 router_controller.go:311] invalid route configuration
fatal error: concurrent map read and map write
Comment 1 Eric Paris 2017-07-19 17:51 EDT
Created attachment 1301446
Logs with backtrace
Comment 3 Jordan Liggitt 2017-07-20 12:02:23 EDT
The state map is read outside of the lock on line 770:

	if existingConfig, exists := r.state[backendKey]; exists {
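
For illustration only, here is a minimal Go sketch of the hazard described above and the usual remedy: every read of the map takes the same lock that the writers take. The `router` type, its `lock` field, and the `lookup` and `set` methods are hypothetical stand-ins, not the actual router code; only the `r.state[backendKey]` read is taken from the line quoted above.

```go
package main

import (
	"fmt"
	"sync"
)

// router is a hypothetical stand-in for the real router type. Only the
// state map and a mutex guarding it are modeled here.
type router struct {
	lock  sync.Mutex
	state map[string]string
}

// lookup reads r.state while holding the lock, so the read can never
// overlap with a write made through set.
func (r *router) lookup(backendKey string) (string, bool) {
	r.lock.Lock()
	defer r.lock.Unlock()
	existingConfig, exists := r.state[backendKey]
	return existingConfig, exists
}

// set writes r.state under the same lock.
func (r *router) set(backendKey, config string) {
	r.lock.Lock()
	defer r.lock.Unlock()
	r.state[backendKey] = config
}

func main() {
	r := &router{state: map[string]string{}}

	// Readers and writers run concurrently. Without the lock in lookup,
	// the Go runtime may abort with
	// "fatal error: concurrent map read and map write".
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(2)
		go func(i int) {
			defer wg.Done()
			r.set(fmt.Sprintf("backend-%d", i), "config")
		}(i)
		go func(i int) {
			defer wg.Done()
			r.lookup(fmt.Sprintf("backend-%d", i))
		}(i)
	}
	wg.Wait()

	_, exists := r.lookup("backend-42")
	fmt.Println(exists) // prints "true": all writers have finished
}
```

Running a program like this with `go run -race` is one way to check that no unlocked access remains.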
Comment 4 openshift-github-bot 2017-07-22 07:58:47 EDT
Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/0b305fba3645f1313b54c30e7890a7a6cf4290f1
Moved locking to protect a read of a map in the router

The locking was not protecting a read, so a simultaneous write would
crash the router.  I made a bunch of new functions that implemented
the functional part of the function without the locking, then made the
locking functions acquire the lock and then call the internal part.
Then in the rename, I moved the lock acquisition earlier and called
the internal functions.

In brief: re-jiggered the code so we could lock properly.

Fixes bug 1473031 (https://bugzilla.redhat.com/show_bug.cgi?id=1473031)
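
The restructuring described in the commit message can be sketched roughly as follows. This is not the code from the commit: the `routerState` type and the `Add`, `Delete`, `Rename`, `addInternal`, and `deleteInternal` names are hypothetical, chosen only to show the pattern of lock-free internal helpers wrapped by locking functions, with the rename path taking the lock before its first map read.

```go
package main

import (
	"fmt"
	"sync"
)

// routerState is a hypothetical type used only to illustrate the pattern.
type routerState struct {
	lock  sync.Mutex
	state map[string]string
}

// addInternal and deleteInternal do the actual work without locking.
// Callers must already hold r.lock.
func (r *routerState) addInternal(key, config string) { r.state[key] = config }
func (r *routerState) deleteInternal(key string)      { delete(r.state, key) }

// Add and Delete are the locking wrappers: acquire the lock, then call
// the internal part.
func (r *routerState) Add(key, config string) {
	r.lock.Lock()
	defer r.lock.Unlock()
	r.addInternal(key, config)
}

func (r *routerState) Delete(key string) {
	r.lock.Lock()
	defer r.lock.Unlock()
	r.deleteInternal(key)
}

// Rename acquires the lock before the first read of the map and then uses
// the internal helpers. Calling Add or Delete here instead would deadlock,
// because sync.Mutex is not reentrant.
func (r *routerState) Rename(oldKey, newKey string) bool {
	r.lock.Lock()
	defer r.lock.Unlock()
	existingConfig, exists := r.state[oldKey]
	if !exists {
		return false
	}
	r.deleteInternal(oldKey)
	r.addInternal(newKey, existingConfig)
	return true
}

func main() {
	r := &routerState{state: map[string]string{}}
	r.Add("old", "config")
	fmt.Println(r.Rename("old", "new")) // prints "true"
	r.Delete("new")
	fmt.Println(r.Rename("new", "other")) // prints "false": key was deleted
}
```

Splitting each operation into a locked wrapper and an unlocked internal part lets a compound operation such as a rename hold the lock once across its read and its writes.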
