Bug 1744029 - e2e flake: Failed to execute container-related commands such as logs and exec with error "tls: internal error"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.2.0
Assignee: Michael Gugino
QA Contact: Jianwei Hou
URL:
Whiteboard:
Duplicates: 1743741
Depends On:
Blocks:
 
Reported: 2019-08-21 07:27 UTC by zhou ying
Modified: 2020-01-31 21:34 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:36:58 UTC
Target Upstream Version:
Embargoed:




Links
System: Red Hat Product Errata   ID: RHBA-2019:2922   Last Updated: 2019-10-16 06:37:07 UTC

Description zhou ying 2019-08-21 07:27:30 UTC
Description of problem:
Tests failed in job:
https://prow.k8s.io/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-gcp-4.2/56

Failed cases: 
failed: (52.3s) 2019-08-20T11:05:09 "[sig-storage] In-tree Volumes [Driver: nfs] [Testpattern: Inline-volume (default fs)] subPath should support existing single file [Suite:openshift/conformance/parallel] [Suite:k8s]"
failed: (28s) 2019-08-20T11:04:25 "[sig-storage] PersistentVolumes-local  [Volume type: blockfswithformat] One pod requesting one prebound PVC should be able to mount volume and write from pod1 [Suite:openshift/conformance/parallel] [Suite:k8s]"


Errors from the failed cases:
fail [k8s.io/kubernetes/test/e2e/storage/utils/local.go:134]: Unexpected error:
    <exec.CodeExitError>: {
        Err: {
            s: "error running &{/usr/bin/kubectl [kubectl --server=https://api.ci-op-0zphwvsz-711dc.origin-ci-int-gce.dev.openshift.com:6443 --kubeconfig=/tmp/admin.kubeconfig exec --namespace=e2e-persistent-local-volumes-test-4467 hostexec-ci-op--v5m9b-w-b-7szpf.c.openshift-gce-devel-ci.internal -- nsenter --mount=/rootfs/proc/1/ns/mnt -- sh -c mkdir -p /tmp/local-volume-test-3ba1bee8-c33a-11e9-8d5d-0a58ac10dbe9 && dd if=/dev/zero of=/tmp/local-volume-test-3ba1bee8-c33a-11e9-8d5d-0a58ac10dbe9/file bs=4096 count=5120 && sudo losetup -f /tmp/local-volume-test-3ba1bee8-c33a-11e9-8d5d-0a58ac10dbe9/file] []  <nil>  Error from server: error dialing backend: remote error: tls: internal error\n [] <nil> 0xc00434ec30 exit status 1 <nil> <nil> true [0xc0037bafb0 0xc0037bafc8 0xc0037bafe0] [0xc0037bafb0 0xc0037bafc8 0xc0037bafe0] [0xc0037bafc0 0xc0037bafd8] [0x95d7a0 0x95d7a0] 0xc001b80ea0 <nil>}:\nCommand stdout:\n\nstderr:\nError from server: error dialing backend: remote error: tls: internal error\n\nerror:\nexit status 1\n",
        },
        Code: 1,
    }
    error running &{/usr/bin/kubectl [kubectl --server=https://api.ci-op-0zphwvsz-711dc.origin-ci-int-gce.dev.openshift.com:6443 --kubeconfig=/tmp/admin.kubeconfig exec --namespace=e2e-persistent-local-volumes-test-4467 hostexec-ci-op--v5m9b-w-b-7szpf.c.openshift-gce-devel-ci.internal -- nsenter --mount=/rootfs/proc/1/ns/mnt -- sh -c mkdir -p /tmp/local-volume-test-3ba1bee8-c33a-11e9-8d5d-0a58ac10dbe9 && dd if=/dev/zero of=/tmp/local-volume-test-3ba1bee8-c33a-11e9-8d5d-0a58ac10dbe9/file bs=4096 count=5120 && sudo losetup -f /tmp/local-volume-test-3ba1bee8-c33a-11e9-8d5d-0a58ac10dbe9/file] []  <nil>  Error from server: error dialing backend: remote error: tls: internal error
     [] <nil> 0xc00434ec30 exit status 1 <nil> <nil> true [0xc0037bafb0 0xc0037bafc8 0xc0037bafe0] [0xc0037bafb0 0xc0037bafc8 0xc0037bafe0] [0xc0037bafc0 0xc0037bafd8] [0x95d7a0 0x95d7a0] 0xc001b80ea0 <nil>}:
    Command stdout:
    
    stderr:
    Error from server: error dialing backend: remote error: tls: internal error
    
    error:
    exit status 1
    
occurred

fail [k8s.io/kubernetes/test/e2e/framework/util.go:2323]: Unexpected error:
    <exec.CodeExitError>: {
        Err: {
            s: "error running &{/usr/bin/kubectl [kubectl --server=https://api.ci-op-0zphwvsz-711dc.origin-ci-int-gce.dev.openshift.com:6443 --kubeconfig=/tmp/admin.kubeconfig logs nfs-server nfs-server --namespace=e2e-volume-1778] []  <nil>  Error from server: Get https://ci-op--v5m9b-w-b-7szpf.c.openshift-gce-devel-ci.internal:10250/containerLogs/e2e-volume-1778/nfs-server/nfs-server: remote error: tls: internal error\n [] <nil> 0xc002c43d40 exit status 1 <nil> <nil> true [0xc002be4020 0xc002be4038 0xc002be4050] [0xc002be4020 0xc002be4038 0xc002be4050] [0xc002be4030 0xc002be4048] [0x95d7a0 0x95d7a0] 0xc002b0cba0 <nil>}:\nCommand stdout:\n\nstderr:\nError from server: Get https://ci-op--v5m9b-w-b-7szpf.c.openshift-gce-devel-ci.internal:10250/containerLogs/e2e-volume-1778/nfs-server/nfs-server: remote error: tls: internal error\n\nerror:\nexit status 1\n",
        },
        Code: 1,
    }
    error running &{/usr/bin/kubectl [kubectl --server=https://api.ci-op-0zphwvsz-711dc.origin-ci-int-gce.dev.openshift.com:6443 --kubeconfig=/tmp/admin.kubeconfig logs nfs-server nfs-server --namespace=e2e-volume-1778] []  <nil>  Error from server: Get https://ci-op--v5m9b-w-b-7szpf.c.openshift-gce-devel-ci.internal:10250/containerLogs/e2e-volume-1778/nfs-server/nfs-server: remote error: tls: internal error
     [] <nil> 0xc002c43d40 exit status 1 <nil> <nil> true [0xc002be4020 0xc002be4038 0xc002be4050] [0xc002be4020 0xc002be4038 0xc002be4050] [0xc002be4030 0xc002be4048] [0x95d7a0 0x95d7a0] 0xc002b0cba0 <nil>}:
    Command stdout:
    
    stderr:
    Error from server: Get https://ci-op--v5m9b-w-b-7szpf.c.openshift-gce-devel-ci.internal:10250/containerLogs/e2e-volume-1778/nfs-server/nfs-server: remote error: tls: internal error
    
    error:
    exit status 1
    
occurred

Aug 20 11:04:48.000 I persistentvolume/pvc-3e773ceb-c33a-11e9-87ad-42010a000005 googleapi: Error 400: The disk resource 'projects/openshift-gce-devel-ci/zones/us-east1-c/disks/ci-op--v5m9b-dynamic-pvc-3e773ceb-c33a-11e9-87ad-42010a000005' is already being used by 'projects/openshift-gce-devel-ci/zones/us-east1-c/instances/ci-op--v5m9b-w-c-bzr9k', resourceInUseByAnotherResource
Version-Release number of selected component (if applicable):


How reproducible:
occasionally
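
For triage context: the "remote error: tls: internal error" responses above come from the kubelet endpoint on port 10250, and this is typically what the API server reports when the kubelet has no signed serving certificate to present, i.e. while the node's serving-certificate CSR is still pending (see comment 6 below). A rough way to confirm this on an affected cluster, assuming admin access and using illustrative resource names:

    # List CSRs; kubelet serving requests stuck in Pending are the usual culprit
    oc get csr
    # Inspect a pending request to confirm it is the serving certificate for the
    # node named in the failing URL (e.g. ci-op--v5m9b-w-b-7szpf...); <csr-name> is a placeholder
    oc describe csr <csr-name>
    # Cross-check that the node itself is otherwise Ready
    oc get nodes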

Comment 2 zhou ying 2019-08-22 02:15:51 UTC
In this test job, about half of the [sig-storage] test cases failed with this error.

Comment 5 Seth Jennings 2019-08-28 19:29:18 UTC
*** Bug 1743741 has been marked as a duplicate of this bug. ***

Comment 6 Alberto 2019-09-04 09:32:25 UTC
The pending CSRs issue should be covered by the PRs in https://bugzilla.redhat.com/show_bug.cgi?id=1717610 and https://bugzilla.redhat.com/show_bug.cgi?id=1746881. In addition, https://github.com/openshift/cluster-machine-approver/pull/44 adds some extra logging, so I am setting this to MODIFIED as those changes go in.
I'm still not closing this as a duplicate, though, because the machine approver is also struggling to reach the API server, so this could be a symptom of an underlying issue.
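
As a rough sketch of the manual workaround implied here (the approver namespace and deployment name are assumptions based on a default 4.x install), one can check why CSRs are not being approved and, in a disposable CI cluster, approve them by hand:

    # Inspect the cluster-machine-approver logs for rejected or throttled CSRs
    # (namespace/deployment names assumed from a default install)
    oc -n openshift-cluster-machine-approver logs deploy/machine-approver
    # Temporary workaround for a throwaway test cluster only: approve all pending CSRs
    oc get csr -o name | xargs oc adm certificate approve

Approving every CSR indiscriminately is only reasonable in a disposable test cluster; the real fix is the approver changes linked above.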

Comment 10 errata-xmlrpc 2019-10-16 06:36:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

Comment 11 Hongkai Liu 2020-01-31 21:34:49 UTC
Found another resourceInUseByAnotherResource error in CI:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-4.2/357
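
For the resourceInUseByAnotherResource symptom, a quick way to see which instance still holds the disk is to query GCE directly (the disk name and zone are placeholders to be filled in from the event text; the project comes from the event in the description above):

    # Shows the instance(s) the disk is currently attached to
    gcloud compute disks describe <disk-name> --zone <zone> \
        --project openshift-gce-devel-ci --format='value(users)'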

