Bug 1716550

Summary: oc client tool randomly gives "panic: runtime error: slice bounds out of range"
Product: OpenShift Container Platform Reporter: Tim Bielawa <tbielawa>
Component: Installer Assignee: Stephen Benjamin <stbenjam>
Installer sub component: openshift-installer QA Contact: Johnny Liu <jialiu>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: aos-bugs, bleanhar, ccoleman, jokerman, jupierce, mmccomas, sponnaga, stbenjam, wking
Version: 4.1.z   
Target Milestone: ---   
Target Release: 4.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-01-23 11:04:11 UTC Type: Bug
Attachments:
Description Flags
command that exploded this time none
new explosion none

Description Tim Bielawa 2019-06-03 15:03:29 UTC
Created attachment 1576698 [details]
command that exploded this time

Description of problem:
Running the oc client tool sometimes yields "panic: runtime error: slice bounds out of range". The failure appears to be random, and a follow-up run of the same command often succeeds.

Version-Release number of selected component (if applicable):
[tbielawa@buildvm ~]$ oc version
Client Version: version.Info{Major:"4", Minor:"1+", GitVersion:"v4.1.0", GitCommit:"cb455d664", GitTreeState:"clean", BuildDate:"2019-05-19T21:13:58Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}


How reproducible:
Randomly.

Steps to Reproduce:
1. GOTRACEBACK=all oc --config=/home/jenkins/kubeconfigs/art-publish.kubeconfig adm release new --from-release=registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-31-174150 --name 4.1.0 --metadata '{"description": "", "url": "https://access.redhat.com/errata/RHBA-2019:0758"}' --to-image=quay.io/openshift-release-dev/ocp-release:4.1.0 


Actual results:
Stack trace

Expected results:
New release posted

Additional info:
I recognize that this oc client tool was built on May 19. This has happened several times using older releases as well. We'll update the oc version on the build vm after this GA release job completes. If this continues happening we'll update the bug.

Comment 1 Maciej Szulik 2019-06-24 14:27:59 UTC
This should be solved in https://github.com/openshift/origin/pull/23255

Comment 3 Maciej Szulik 2019-08-26 09:08:25 UTC
This should be fixed by now, since the PR from comment #1 should be included in oc.
Moving to QA.

Comment 4 Tim Bielawa 2019-08-27 14:36:32 UTC
Using an oc client as recent as this:

> [tbielawa@buildvm ~]$ rpm -q openshift-clients
> openshift-clients-4.2.0-201908150219.git.0.f6120d9.el7.x86_64

We still see it throwing random panics.

The new attachment is the full trace from today's explosion.

Comment 5 Tim Bielawa 2019-08-27 14:37:38 UTC
Created attachment 1608651 [details]
new explosion

boom!

Comment 6 Tim Bielawa 2019-08-27 14:39:55 UTC
I have updated our installed version to openshift-clients-4.2.0-201908261819.git.0.b985ea3.el7.x86_64

I will report back if this happens again.

Comment 7 Xingxing Xia 2019-08-28 06:00:55 UTC
Not sure how Tim reproduces it. Trying the commands from comment 0 and comment 5, I cannot reproduce it with oc from either the above openshift-clients-4.2.0-201908261819.git.0.b985ea3.el7.x86_64 or the latest 4.2.0-201908272219.git.0.1904cc5.el7:
for i in {1..100} 
do 
  echo "trying order: $i =========" 
  rm -rf ./mnt 
  GOTRACEBACK=all oc adm release extract --tools '--command-os=*' quay.io/openshift-release-dev/ocp-release-nightly:4.2.0-0.nightly-2019-08-27-072819 --to=./mnt/workpace/jenkins/working/aos-cd-builds/build%2Foc_sync/tools/4.2.0-0.nightly-2019-08-27-072819 
  sleep 1 
done

Since Tim still reproduces it, and the PR referenced in comment 3 (from comment 1) was closed instead of merged, assigning back.

Comment 8 Xingxing Xia 2019-08-28 07:39:13 UTC
Hmm, I ran the comment 7 command again and waited; it hit the panic several times: http://file.rdu.redhat.com/~xxia/bug-1716550-recreation.txt
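
To put a number on "several times", the loop from comment 7 could be wrapped so that it counts the runs that panic. The following is only a sketch: the function name count_panics is hypothetical, and it assumes the panic text reaches stdout or stderr of the command being run.

```shell
# count_panics RUNS CMD [ARGS...]   (hypothetical helper, not part of oc)
# Runs CMD the given number of times and prints how many runs emitted
# a Go runtime panic. The body runs in a subshell so its variables do
# not leak into the caller.
count_panics() (
  runs=$1
  shift
  hits=0
  i=1
  while [ "$i" -le "$runs" ]; do
    # Capture stdout and stderr together; a non-zero exit is expected
    # when the command panics, so it must not abort the loop.
    out=$(GOTRACEBACK=all "$@" 2>&1) || true
    case $out in
      *'panic: runtime error'*) hits=$((hits + 1)) ;;
    esac
    i=$((i + 1))
  done
  echo "$hits"
)
```

It would be called as `count_panics 100 oc adm release extract ...` with the same arguments as in comment 7. The extraction directory may still need to be cleared between runs, as the original loop does with `rm -rf ./mnt`, for example by passing a small wrapper script as the command.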

Comment 9 Maciej Szulik 2019-08-29 10:54:47 UTC
This area belongs to the installer, so I'll let them deal with it. It looks to me like a timeout, maybe due to very frequent pulls.

Comment 10 Brenton Leanhardt 2019-08-29 17:17:12 UTC
We feel the severity isn't high enough to fix this in 4.2.  We rely heavily on this command in CI and don't see this problem happening often.

Comment 11 Tim Bielawa 2019-08-29 17:22:44 UTC
We rely heavily on this command in OCP releases and it happens at least once a week. The result is that we have to re-run release jobs for advisories. Not ideal.
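
Until a fix lands, the manual re-run could in principle be automated. The sketch below retries a command only when its output contains the panic message from this bug. The name retry_on_panic is hypothetical, and whether it is safe to re-run a given oc subcommand (for example one that pushes an image) is an assumption that has to be checked per job.

```shell
# retry_on_panic MAX CMD [ARGS...]   (hypothetical helper, not part of oc)
# Runs CMD up to MAX times, retrying only when the output contains the
# "slice bounds out of range" panic. Any other result, success or
# failure, is returned immediately with the command's exit status.
# Note: stderr is merged into stdout so the output can be inspected.
retry_on_panic() (
  max=$1
  shift
  attempt=1
  while :; do
    status=0
    out=$("$@" 2>&1) || status=$?
    printf '%s\n' "$out"
    case $out in
      *'slice bounds out of range'*) ;;   # known panic, consider a retry
      *) exit "$status" ;;                # anything else is final
    esac
    if [ "$attempt" -ge "$max" ]; then
      exit "$status"                      # out of attempts, give up
    fi
    attempt=$((attempt + 1))
  done
)
```

A release job might then call `retry_on_panic 3 oc adm release extract ...` instead of invoking oc directly. This only hides the symptom; it does not address the underlying cause.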

Comment 12 Tim Bielawa 2019-09-20 15:28:08 UTC
Still seeing this. A new one occurred today while trying to extract the clients for the pre-release content.

Adding a new attachment with the log.

Comment 17 Clayton Coleman 2019-10-17 19:47:52 UTC
[tbielawa@buildvm ~]$ uname -r
3.10.0-1062.1.1.el7.x86_64

Comment 19 Stephen Benjamin 2019-10-17 19:55:59 UTC
There is a race condition in docker, see https://github.com/moby/moby/issues/39859

There's a workaround in http://github.com/openshift/oc/pull/104, as the version of docker that kubectl vendors is ancient (even in k8s 1.16).

If you can't get a newer oc, run `export MOBY_DISABLE_PIGZ=true` before running oc.
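
If exporting the variable for the whole session is undesirable, it can be scoped to a single invocation. This is a minimal sketch; the function name run_without_pigz is hypothetical, and the only thing taken from this bug is the MOBY_DISABLE_PIGZ=true setting itself.

```shell
# run_without_pigz CMD [ARGS...]   (hypothetical helper, not part of oc)
# Runs CMD with MOBY_DISABLE_PIGZ=true in its environment. The body runs
# in a subshell, so the variable is not left set in the calling shell.
run_without_pigz() (
  MOBY_DISABLE_PIGZ=true
  export MOBY_DISABLE_PIGZ
  "$@"
)
```

For example, `run_without_pigz oc adm release extract ...` would apply the workaround to that one command and return its exit status unchanged.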

Comment 23 Johnny Liu 2019-10-18 02:35:08 UTC
Verified this bug with openshift-clients-4.3.0-201910141917.git.1.7327846.el7: PASS.

Thanks to xxia for the reproduction steps.

I can reproduce it with openshift-clients-4.2.0-201909221318.git.1.bc66c02.el7.x86_64. After upgrading the oc client to openshift-clients-4.3.0-201910141917.git.1.7327846.el7, the issue is fixed.

Comment 26 errata-xmlrpc 2020-01-23 11:04:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062