Description of problem:

`oc get rolebindings --all-namespaces` fails reproducibly with:

```
F0201 14:15:42.906900 209818 helpers.go:119] Error from server (InternalError): an error on the server ("This request caused apiserver to panic. Look in the logs for details.") has prevented the request from succeeding (get rolebindings.authorization.openshift.io)
```

Version-Release number of selected component (if applicable):
3.11.380

How reproducible:
Always when using `oc get rolebindings --all-namespaces`

Steps to Reproduce:
1. `$ oc get rolebindings --all-namespaces --loglevel=10`

Actual results:

```
I0201 14:15:42.906879 209818 helpers.go:201] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "an error on the server (\"This request caused apiserver to panic. Look in the logs for details.\") has prevented the request from succeeding (get rolebindings.authorization.openshift.io)",
  "reason": "InternalError",
  "details": {
    "group": "authorization.openshift.io",
    "kind": "rolebindings",
    "causes": [
      {
        "reason": "UnexpectedServerResponse",
        "message": "This request caused apiserver to panic. Look in the logs for details."
      }
    ]
  },
  "code": 500
}]
F0201 14:15:42.906900 209818 helpers.go:119] Error from server (InternalError): an error on the server ("This request caused apiserver to panic. Look in the logs for details.") has prevented the request from succeeding (get rolebindings.authorization.openshift.io)
```

###### master-logs_api_api ###### -> full message in attached file 1

```
runtime.go:67] Observed a panic: runtime error: index out of range
goroutine 598166367 [running]:
github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP.func1.1(0xc4d75496e0)
    /builddir/build/BUILD/atomic-openshift-git-0.f875174/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:104 +0xe1
panic(0x47e4d20, 0xa35caa0)
    /opt/rh/go-toolset-1.10/root/usr/lib/go-toolset-1.10-golang/src/runtime/panic.go:502 +0x229
github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAudit.func1.1(0xc44b413400, 0x7fa7cafe7cb8, 0xc421e2a080, 0xa6c23b0, 0x0, 0x0, 0x0, 0x0)
    /builddir/build/BUILD/atomic-openshift-git-0.f875174/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/endpoints/filters/audit.go:84 +0x1f8
panic(0x47e4d20, 0xa35caa0)
    /opt/rh/go-toolset-1.10/root/usr/lib/go-toolset-1.10-golang/src/runtime/panic.go:502 +0x229
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/printers.(*HumanReadablePrinter).legacyPrinterToTable(0xc421a36300, 0x7814f20, 0xc45edb3420, 0xc4216bc320, 0x1, 0x0, 0xc4d8b6f7e0)
    /builddir/build/BUILD/atomic-openshift-git-0.f875174/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/printers/humanreadable.go:671 +0xc96
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/printers.(*HumanReadablePrinter).PrintTable(0xc421a36300, 0x7814f20, 0xc45edb3420, 0x0, 0x0, 0x0, 0x0, 0x1000000, 0x0, 0x0, ...)
    /builddir/build/BUILD/atomic-openshift-git-0.f875174/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/printers/humanreadable.go:502 +0x97f
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/printers/storage.TableConvertor.ConvertToTable(0x780ba20, 0xc421a36300, 0x78621a0, 0xc48e58cc90, 0x7814f20, 0xc45edb3420, 0x7821fa0, 0xc469196f30, 0x1, 0x7, ...)
    /builddir/build/BUILD/atomic-openshift-git-0.f875174/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/printers/storage/storage.go:32 +0x8e
github.com/openshift/origin/pkg/util/registry.(*noWatchStorageErrWrapper).ConvertToTable(0xc4203e7430, 0x78621a0, 0xc48e58cc90, 0x7814f20, 0xc45edb3420, 0x7821fa0, 0xc469196f30, 0xc469196f30, 0x0, 0x0)
    /builddir/build/BUILD/atomic-openshift-git-0.f875174/_output/local/go/src/github.com/openshift/origin/pkg/util/registry/wrapper.go:51 +0x7c
github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/endpoints/handlers.transformResponseObject(0x78621a0, 0xc48e58cc90, 0x787aea0, 0xc4234d5400, 0x785cea0, 0xc421445080, 0x7822120, 0xc42010c940, 0x7809dc0, 0xc42015bf80, ...)
    /builddir/build/BUILD/atomic-openshift-git-0.f875174/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/endpoints/handlers/response.go:117 +0x49f
github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/endpoints/handlers.ListResource.func1(0x7855820, 0xc478624d58, 0xc452015000)
    /builddir/build/BUILD/atomic-openshift-git-0.f875174/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/endpoints/handlers/get.go:282 +0xa02
github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/endpoints.restfulListResource.func1(0xc48e58cb70, 0xc4e5d03b00)
    /builddir/build/BUILD/atomic-openshift-git-0.f875174/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/endpoints/installer.go:1010 +0xd0
```

Expected results:
Get a list of all rolebindings.

Additional info:
1. The command runs for 5-6 seconds before failing; it is always reproducible and always fails somewhere around the same role in the same project "XYZ".
2. Using `oc get rolebindings -n XYZ` runs fine.
3. Looping over all projects runs fine:
   `for i in $(oc get projects -o json | jq -r .items[].metadata.name); do echo "Working on Project $i:" && oc get rolebindings.rbac -n $i; done`
4. etcd seems to be okay regarding performance, with no suspicious error messages.
5. Testing was done with different oc client releases, up to 3.11.570; this doesn't change anything.
6. The .kube directory was removed for the oc client to exclude any caching issues.
7. The issue is only observed on one out of >25 clusters running the same RHOCP 3.11.380 release.
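For anyone triaging a similar panic, one way to pull the full stack out of the API server logs on a 3.11 master is sketched below. This assumes the `master-logs` helper that ships on OpenShift 3.11 control-plane hosts; the grep pattern matches the panic text quoted above.

```
# Run on a master host: dump the API server log and show context around the panic.
master-logs api api 2>&1 | grep -B 2 -A 30 'Observed a panic: runtime error: index out of range'
```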
So the command does not fail immediately, scoping it to a specific namespace works, and iterating over the namespaces works. The command only fails with `--all-namespaces`, and the "index out of range" error may mean there are more rolebindings than the table printer can handle. Can we get a count of the role bindings from the customer's iteration loop, so we can reproduce internally and see at what number the command fails? Code review could help us pin that down too. (A counting sketch follows below.)
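To gather that count, here is a minimal sketch building on the customer's per-namespace loop, which is known to work even when `--all-namespaces` panics. It assumes `jq` and a logged-in `oc` client; namespaces are listed with their binding counts, largest first, followed by the total.

```
for ns in $(oc get projects -o json | jq -r '.items[].metadata.name'); do
  # Count rolebindings in each namespace via the per-namespace call that works.
  echo "$(oc get rolebindings.rbac -n "$ns" -o name | wc -l) $ns"
done | sort -rn | awk '{sum += $1; print} END {print "total:", sum}'
```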
@agogala I am from the OpenShift support team, trying to see whether the issue is due to the number of role bindings. The customer had 181 namespaces with 3585 role bindings. I have tested 231 namespaces with 4269 role bindings, with one namespace having 200 role bindings. So it does not look like either an overall role-binding count or a per-namespace threshold issue. It could be a column/field content issue in the table, and that would be hard to reproduce.

```
[anandpaladugu@localhost github]$ oc-311 get rolebindings.rbac --all-namespaces | wc -l
4269
```

OCP versions in my setup are as below:

```
[anandpaladugu@localhost github]$ oc-311 version
oc v3.11.117
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://openshift.sharedocp311cns.lab.upshift.rdu2.redhat.com:443
openshift v3.11.570
kubernetes v1.11.0+d4cacc0
```

What version is the customer using? @maszulik seems to have tagged this with target release 3.11.Z, and I am wondering if he has more insights.
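For reference, bulk bindings for a threshold test like this can be seeded on a throwaway cluster roughly as follows (a sketch; the `rb-stress` namespace name, user names, and count are arbitrary placeholders):

```
# Create a disposable namespace and flood it with rolebindings to probe
# for a count-based threshold. Run only against a test cluster.
oc new-project rb-stress
for i in $(seq 1 200); do
  oc create rolebinding "stress-$i" \
    --clusterrole=view \
    --user="test-user-$i" \
    -n rb-stress
done
oc get rolebindings.rbac -n rb-stress -o name | wc -l
```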
> @maszulik seems to have tagged this with target release 3.11.Z, and I am wondering if he has more insights.

Based on the place in the code, it looks like the problem is specific to this particular dataset, which triggers the index-out-of-range error.
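If this needs to be narrowed down again, the same server-side code path can be exercised without `oc` by asking the API server for its Table rendering directly. The sketch below uses the standard server-side-printing content negotiation from Kubernetes 1.11; `APISERVER` is a placeholder for the cluster endpoint, and the token comes from the logged-in client.

```
APISERVER=https://master.example.com:443   # placeholder: your cluster's API endpoint
TOKEN=$(oc whoami -t)

# Request the cluster-wide rolebinding list as a server-rendered Table -- the same
# ConvertToTable -> legacyPrinterToTable conversion that panics in the stack trace above.
curl -sk \
  -H "Authorization: Bearer $TOKEN" \
  -H "Accept: application/json;as=Table;v=v1beta1;g=meta.k8s.io" \
  "$APISERVER/apis/authorization.openshift.io/v1/rolebindings"
```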
can't reproduce the issue now:

```
oc get rolebindings --all-namespaces
NAMESPACE   NAME                               ROLE                                            AGE
default     machine-config-controller-events   ClusterRole/machine-config-controller-events   122m
default     machine-config-daemon-events       ClusterRole/machine-config-daemon-events       123m
default     prometheus-k8s                     Role/prometheus-k8s                             116m
default     system:deployers                   ClusterRole/system:deployer                     118m
default     system:image-builders              ClusterRole/system:image-builder                118m
default     system:image-pullers               ClusterRole/system:image-puller                 118m
......
[root@localhost roottest]# echo $?
0
[root@localhost roottest]# oc version
oc v3.11.664
kubernetes v1.11.0+d4cacc0
```
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 3.11.664 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1033