Bug 1761706

Summary: [Feature:DeploymentConfig] deploymentconfigs keep the deployer pod invariant valid [Conformance] should deal with cancellation after deployer pod succeeded [Suite:openshift/conformance/parallel/minimal]
Product: OpenShift Container Platform Reporter: xiyuan
Component: openshift-apiserver Assignee: Lukasz Szaszkiewicz <lszaszki>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Xingxing Xia <xxia>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.2.0 CC: aos-bugs, lszaszki, mfojtik, tnozicka, xtian
Target Milestone: ---   
Target Release: 4.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: workloads
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-02-17 11:13:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description xiyuan 2019-10-15 06:37:32 UTC
Description of problem:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-4.2/93

fail [github.com/openshift/origin/test/extended/util/client.go:201]: Unexpected error:
    <*errors.StatusError | 0xc001524480>: {
        ErrStatus: {
            TypeMeta: {Kind: "", APIVersion: ""},
            ListMeta: {SelfLink: "", ResourceVersion: "", Continue: ""},
            Status: "Failure",
            Message: "Timeout: request did not complete within requested timeout 30s",
            Reason: "Timeout",
            Details: {Name: "", Group: "", Kind: "", UID: "", Causes: nil, RetryAfterSeconds: 0},
            Code: 504,
        },
    }
    Timeout: request did not complete within requested timeout 30s
occurred

Version-Release number of selected component (if applicable):
4.2 jobs

Comment 1 Tomáš Nožička 2019-10-22 08:52:43 UTC
That call doesn't seem to be related to the deployment test itself but to project setup in the e2e tests, specifically this request to openshift-apiserver:

ProjectClient().ProjectV1().ProjectRequests().Create()

https://github.com/openshift/origin/blob/e31fbf999be5f62613c410df18d10718d3202db9/test/extended/util/client.go#L198-L201

This might be connected to kube-apiserver (which is the aggregator) failing, but let's start with openshift-apiserver as the owner of the API; if it turns out to be the aggregator's fault, the bug can be reassigned.


from the same test run:

1 error level events were detected during this test run:

Oct 15 04:32:47.456 E kube-apiserver Kube API started failing: Get https://api.ci-op-4dsd7xsg-37d82.ci.azure.devcluster.openshift.com:6443/api/v1/namespaces/kube-system?timeout=3s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Comment 2 Tomáš Nožička 2019-10-22 08:54:15 UTC
hm, I can't reassign the component to openshift-apiserver, BZ gives me 500 :(

Comment 3 Lukasz Szaszkiewicz 2020-02-17 11:11:19 UTC
Correct, the test timed out on creating a project, after 30 seconds. The request hit the API server but for some reason didn't complete (no further logs):


1. A log line from the API server.
2019-10-15T04:59:28.0975635Z I1015 04:59:28.097470       1 trace.go:81] Trace[60316636]: "Create /apis/project.openshift.io/v1/projectrequests" (started: 2019-10-15 04:58:58.0967879 +0000 UTC m=+3111.181308601) (total time: 30.0006535s):
2019-10-15T04:59:28.0975635Z Trace[60316636]: [30.0006535s] [30.0003774s] END


2. Corresponding event from the audit log
  "objectRef": {
    "resource": "projectrequests",
    "name": "e2e-test-cli-deployment-xfwqn",
    "apiGroup": "project.openshift.io",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": {},
    "status": "Failure",
    "reason": "Timeout",
    "code": 504
  },
  "requestReceivedTimestamp": "2019-10-15T04:58:58.094051Z",
  "stageTimestamp": "2019-10-15T04:59:28.097589Z",