Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1552742

Summary: Hitting the health check endpoint of a pod causes the pod to go into CrashLoopBackoff
Product: OpenShift Container Platform
Reporter: Steven Walter <stwalter>
Component: Networking
Assignee: Ben Bennett <bbennett>
Networking sub component: router
QA Contact: zhaozhanqi <zzhao>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: high
Priority: unspecified
CC: aos-bugs
Version: 3.7.1
Target Milestone: ---
Target Release: 3.9.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-07 20:07:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Steven Walter 2018-03-07 16:23:00 UTC
Description of problem:
When the customer runs a request loop against the /healthz endpoint of the 3.7 router, the router pod panics.

Version-Release number of selected component (if applicable):
3.7.14

How reproducible:
Unconfirmed

Steps to Reproduce:
1. In one window, run: oc get pods -w -n default -o wide
2. In another window, run: ab -n 20000 -c 500 http://10.10.95.236:1936/healthz
3. A lighter load also triggers it: ab -n 1000 -c 15 http://10.10.95.236:1936/healthz

Actual results:
W0305 17:03:45.427711       1 reflector.go:343] github.com/openshift/origin/pkg/router/controller/factory/factory.go:108: watch of *route.Route ended with: etcdserver: mvcc: required revision has been compacted
panic: runtime error: index out of range
goroutine 4391 [running]:
github.com/openshift/origin/vendor/github.com/cockroachdb/cmux.(*ptNode).match(0xc420dffa40, 0xc42013c1d8, 0x0, 0x8, 0x1, 0x0)
        /builddir/build/BUILD/atomic-openshift-git-0.593a50e/_output/local/go/src/github.com/openshift/origin/vendor/github.com/cockroachdb/cmux/patricia.go:148 +0x197
github.com/openshift/origin/vendor/github.com/cockroachdb/cmux.(*patriciaTree).matchPrefix(0xc420043ea0, 0xf287600, 0xc421122c58, 0xf2a0700)
        /builddir/build/BUILD/atomic-openshift-git-0.593a50e/_output/local/go/src/github.com/openshift/origin/vendor/github.com/cockroachdb/cmux/patricia.go:38 +0x90
github.com/openshift/origin/vendor/github.com/cockroachdb/cmux.(*patriciaTree).(github.com/openshift/origin/vendor/github.com/cockroachdb/cmux.matchPrefix)-fm(0xf287600, 0xc421122c58, 0xc42000e7f8)
        /builddir/build/BUILD/atomic-openshift-git-0.593a50e/_output/local/go/src/github.com/openshift/origin/vendor/github.com/cockroachdb/cmux/matchers.go:23 +0x3e
github.com/openshift/origin/vendor/github.com/cockroachdb/cmux.(*cMux).serve(0xc42055db00, 0xf2fa960, 0xc42000e7f8, 0xc42043d020, 0xc4204501e0)
        /builddir/build/BUILD/atomic-openshift-git-0.593a50e/_output/local/go/src/github.com/openshift/origin/vendor/github.com/cockroachdb/cmux/cmux.go:129 +0x265
created by github.com/openshift/origin/vendor/github.com/cockroachdb/cmux.(*cMux).Serve
        /builddir/build/BUILD/atomic-openshift-git-0.593a50e/_output/local/go/src/github.com/openshift/origin/vendor/github.com/cockroachdb/cmux/cmux.go:119 +0x16c

Expected results:
No crash

Additional info:
We tried removing the readinessProbe/livenessProbe in case the crash was caused by those, but the issue persisted. I was unable to reproduce the problem in my quicklab environment (v3.6), even at 1000 connections per second.

Comment 2 Ben Bennett 2018-03-07 20:07:48 UTC

*** This bug has been marked as a duplicate of bug 1532060 ***