Bug 1716550 - oc client tool randomly gives "panic: runtime error: slice bounds out of range"
Summary: oc client tool randomly gives "panic: runtime error: slice bounds out of range"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.1.z
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.3.0
Assignee: Stephen Benjamin
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-06-03 15:03 UTC by Tim Bielawa
Modified: 2020-01-23 11:04 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-23 11:04:11 UTC
Target Upstream Version:
Embargoed:


Attachments
command that exploded this time (18.64 KB, text/plain)
2019-06-03 15:03 UTC, Tim Bielawa
new explosion (19.51 KB, text/plain)
2019-08-27 14:37 UTC, Tim Bielawa


Links
System ID Private Priority Status Summary Last Updated
Github openshift oc issues 58 0 'None' closed "panic: runtime error: slice bounds out of range" while "oc adm release extract" 2021-01-11 13:21:44 UTC
Github openshift oc pull 104 0 'None' closed pkg/cli/image/extract: disable pigz to prevent race condition 2021-01-11 13:21:44 UTC
Github openshift oc pull 130 0 'None' closed cmd/oc: disable docker's use of pigz earlier 2021-01-11 13:21:44 UTC
Red Hat Product Errata RHBA-2020:0062 0 None None None 2020-01-23 11:04:41 UTC

Description Tim Bielawa 2019-06-03 15:03:29 UTC
Created attachment 1576698 [details]
command that exploded this time

Description of problem:
Sometimes running the oc client tool yields "panic: runtime error: slice bounds out of range". This appears to be random, and a follow-up run of the same command will often succeed.

Version-Release number of selected component (if applicable):
[tbielawa@buildvm ~]$ oc version
Client Version: version.Info{Major:"4", Minor:"1+", GitVersion:"v4.1.0", GitCommit:"cb455d664", GitTreeState:"clean", BuildDate:"2019-05-19T21:13:58Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}


How reproducible:
Randomly.

Steps to Reproduce:
1. GOTRACEBACK=all oc --config=/home/jenkins/kubeconfigs/art-publish.kubeconfig adm release new --from-release=registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-31-174150 --name 4.1.0 --metadata '{"description": "", "url": "https://access.redhat.com/errata/RHBA-2019:0758"}' --to-image=quay.io/openshift-release-dev/ocp-release:4.1.0 


Actual results:
Stack trace

Expected results:
New release posted

Additional info:
I recognize that this oc client tool was built on May 19. This has happened several times using older releases as well. We'll update the oc version on the build vm after this GA release job completes. If this continues happening we'll update the bug.

Comment 1 Maciej Szulik 2019-06-24 14:27:59 UTC
This should be solved in https://github.com/openshift/origin/pull/23255

Comment 3 Maciej Szulik 2019-08-26 09:08:25 UTC
This should be fixed at this point in time, since the PR from comment #1 should be included in oc by now.
Moving to qa.

Comment 4 Tim Bielawa 2019-08-27 14:36:32 UTC
Even with an oc as new as this:

> [tbielawa@buildvm ~]$ rpm -q openshift-clients
> openshift-clients-4.2.0-201908150219.git.0.f6120d9.el7.x86_64

we still see it throwing random panics.

The new attachment is the full trace from today's explosion.

Comment 5 Tim Bielawa 2019-08-27 14:37:38 UTC
Created attachment 1608651 [details]
new explosion

boom!

Comment 6 Tim Bielawa 2019-08-27 14:39:55 UTC
I have updated our installed version to openshift-clients-4.2.0-201908261819.git.0.b985ea3.el7.x86_64

Will report back if this happens again

Comment 7 Xingxing Xia 2019-08-28 06:00:55 UTC
Not sure how Tim reproduces it. Using the commands from comment 0 and comment 5, I cannot reproduce it with the oc from either the above openshift-clients-4.2.0-201908261819.git.0.b985ea3.el7.x86_64 or the latest 4.2.0-201908272219.git.0.1904cc5.el7:
for i in {1..100} 
do 
  echo "trying order: $i =========" 
  rm -rf ./mnt 
  GOTRACEBACK=all oc adm release extract --tools '--command-os=*' quay.io/openshift-release-dev/ocp-release-nightly:4.2.0-0.nightly-2019-08-27-072819 --to=./mnt/workpace/jenkins/working/aos-cd-builds/build%2Foc_sync/tools/4.2.0-0.nightly-2019-08-27-072819 
  sleep 1 
done

Since Tim can still reproduce it, and the PR referenced in comment 3 (from comment 1) was closed instead of merged, I am assigning this back.

Comment 8 Xingxing Xia 2019-08-28 07:39:13 UTC
Hmm, I ran the comment 7 loop again and waited, and it did hit the panic several times: http://file.rdu.redhat.com/~xxia/bug-1716550-recreation.txt
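
To put a number on how often the panic shows up, a loop like the one in comment 7 can count the runs whose output contains the panic text. This is only a sketch: count_panics is a hypothetical helper name (not part of oc), and it assumes the message "slice bounds out of range" appears on stdout or stderr of the failing run.

```shell
# count_panics RUNS CMD [ARGS...]
# Hypothetical helper: runs CMD the given number of times and prints how
# many runs emitted the "slice bounds out of range" panic text.
count_panics() {
  runs=$1
  shift
  hits=0
  i=1
  while [ "$i" -le "$runs" ]; do
    # Merge stderr into stdout, since Go writes panic traces to stderr.
    if "$@" 2>&1 | grep -q 'slice bounds out of range'; then
      hits=$((hits + 1))
    fi
    i=$((i + 1))
  done
  echo "$hits"
}

# Example call (image and target directory are placeholders):
#   count_panics 100 oc adm release extract --tools '--command-os=*' \
#     "$RELEASE_IMAGE" --to=./mnt/tools
```

Unlike the loop in comment 7, this sketch does not clean the target directory between runs; add that step if the extract command needs an empty destination.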

Comment 9 Maciej Szulik 2019-08-29 10:54:47 UTC
This area belongs to the installer, I'll let them deal with it. It looks to me like a timeout, maybe due to very frequent pulls.

Comment 10 Brenton Leanhardt 2019-08-29 17:17:12 UTC
We feel the severity isn't high enough to fix this in 4.2.  We rely heavily on this command in CI and don't see this problem happening often.

Comment 11 Tim Bielawa 2019-08-29 17:22:44 UTC
We rely heavily on this command in OCP releases and it happens at least once a week. The result is that we have to re-run release jobs for advisories. Not ideal.
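
Because a follow-up run of the same command usually succeeds, a stopgap for jobs like these is to retry the failing step instead of re-running the whole job. A minimal sketch, assuming a POSIX shell and that the wrapped command is safe to run more than once; retry and RETRY_DELAY are hypothetical names, not oc features.

```shell
# retry ATTEMPTS CMD [ARGS...]
# Hypothetical helper: re-runs CMD until it exits 0 or ATTEMPTS is used up.
# RETRY_DELAY (seconds between attempts, default 1) is also an assumption.
retry() {
  attempts=$1
  shift
  n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "${RETRY_DELAY:-1}"
  done
}

# Example call (image and target directory are placeholders):
#   retry 3 oc adm release extract --tools "$RELEASE_IMAGE" --to=./tools
```

This only hides the panic; it does not address whatever causes it.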

Comment 12 Tim Bielawa 2019-09-20 15:28:08 UTC
Still seeing this. New one today while trying to extract the clients for the pre-release content.

Adding new attachment log.

Comment 17 Clayton Coleman 2019-10-17 19:47:52 UTC
[tbielawa@buildvm ~]$ uname -r
3.10.0-1062.1.1.el7.x86_64

Comment 19 Stephen Benjamin 2019-10-17 19:55:59 UTC
There is a race condition in docker, see https://github.com/moby/moby/issues/39859

There's a workaround in http://github.com/openshift/oc/pull/104, since the version of docker that kubectl vendors is ancient (even in k8s 1.16).

If you can't get a newer oc, run `export MOBY_DISABLE_PIGZ=true` before running oc.
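
For scripts that should not change the environment of the whole shell, the variable can also be set for a single invocation. A small sketch: oc_nopigz and OC_BIN are hypothetical names introduced here for illustration; only MOBY_DISABLE_PIGZ comes from the workaround above.

```shell
# oc_nopigz [ARGS...]
# Hypothetical wrapper: runs oc with MOBY_DISABLE_PIGZ=true set only for
# that one process. OC_BIN (default: oc) is an assumed override so the
# binary path can be changed.
oc_nopigz() {
  MOBY_DISABLE_PIGZ=true "${OC_BIN:-oc}" "$@"
}

# Example call (image and target directory are placeholders):
#   oc_nopigz adm release extract --tools "$RELEASE_IMAGE" --to=./tools
```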

Comment 23 Johnny Liu 2019-10-18 02:35:08 UTC
Verified this bug with openshift-clients-4.3.0-201910141917.git.1.7327846.el7: PASS.

Thanks to xxia for the reproduction steps.

I can reproduce it with openshift-clients-4.2.0-201909221318.git.1.bc66c02.el7.x86_64. After upgrading the oc client to openshift-clients-4.3.0-201910141917.git.1.7327846.el7, the issue is fixed.

Comment 26 errata-xmlrpc 2020-01-23 11:04:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

