Bug 1822298

Summary: [sig-builds][Feature:Builds] oc new-app should fail with a --name longer than 58 characters [Suite:openshift/conformance/parallel]
Product: OpenShift Container Platform
Reporter: Dan Williams <dcbw>
Component: kube-controller-manager
Assignee: Maciej Szulik <maszulik>
Status: CLOSED DUPLICATE
QA Contact: zhou ying <yinzhou>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.5
CC: aos-bugs, gmontero, mfojtik, wzheng
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-04-09 08:51:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 1 Gabe Montero 2020-04-08 19:05:14 UTC
*** Bug 1822303 has been marked as a duplicate of this bug. ***

Comment 2 Gabe Montero 2020-04-08 19:05:42 UTC
*** Bug 1822302 has been marked as a duplicate of this bug. ***

Comment 3 Gabe Montero 2020-04-08 19:06:04 UTC
*** Bug 1822301 has been marked as a duplicate of this bug. ***

Comment 4 Gabe Montero 2020-04-08 19:06:23 UTC
*** Bug 1822300 has been marked as a duplicate of this bug. ***

Comment 5 Gabe Montero 2020-04-08 19:06:56 UTC
*** Bug 1822299 has been marked as a duplicate of this bug. ***

Comment 6 Gabe Montero 2020-04-08 19:25:46 UTC
So the kube-apiserver went degraded:

Apr 08 11:35:32.675 E clusteroperator/kube-apiserver changed Degraded to True: NodeInstaller_InstallerPodFailed: NodeInstallerDegraded: 1 nodes are failing on revision 7:\nNodeInstallerDegraded: 

and OCM initially had trouble with leader election:

fail [github.com/openshift/origin/test/extended/operators/cluster.go:114]: Expected
    <[]string | len:1, cap:1>: [
        "Pod openshift-controller-manager/controller-manager-mwgnw is not healthy: I0408 11:29:14.758226       1 controller_manager.go:39] Starting controllers on 0.0.0.0:8443 (unknown)\nI0408 11:29:14.761078       1 controller_manager.go:50] DeploymentConfig controller using images from \"registry.svc.ci.openshift.org/ci-op-0vjskh7j/stable@sha256:baf34611b723ba5e9b3ead8872fed2c8af700156096054d720d42a057f5f24be\"\nI0408 11:29:14.761264       1 controller_manager.go:56] Build controller using images from \"registry.svc.ci.openshift.org/ci-op-0vjskh7j/stable@sha256:19880395f98981bdfd98ffbfc9e4e878aa085ecf1e91f2073c24679545e41478\"\nI0408 11:29:14.761230       1 standalone_apiserver.go:98] Started health checks at 0.0.0.0:8443\nI0408 11:29:14.766485       1 leaderelection.go:242] attempting to acquire leader lease  openshift-controller-manager/openshift-master-controllers...\n",
    ]
to be empty

though perhaps it recovered, as 10 to 15 minutes later I see activity in the OCM controller-manager log, but with repeated "...forbidden: unable to create new content in namespace..." errors across all the test namespaces, like

E0408 11:49:39.086650       1 create_dockercfg_secrets.go:285] error syncing service, it will be tried again on a resync e2e-test-image-layers-x6c66/default: secrets "default-token-w5prw" is forbidden: unable to create new content in namespace e2e-test-image-layers-x6c66 because it is being terminated

where the timestamps correlate with what the sig-builds tests are seeing: the OCM is still in a progressing==true state for the 2 to 3 minutes during which they are trying to start.

Apr  8 11:48:29.859: INFO: OCM rollout still progressing or in error: True

there are also no artifacts available for the test run noted in the description (get a 404 on the artifacts link)

Sending to kube-controller-manager to see if they can correlate all the "forbidden: unable to create new content" errors I mentioned with cluster-policy-controller.

Feels like a GCP env issue, but if folks who own the stack beneath builds can clarify, that would be good.

Comment 7 Maciej Szulik 2020-04-09 08:51:05 UTC
It might well be related; I see a ton of

namespace_scc_allocation_controller.go:336] error syncing namespace, it will be retried: Operation cannot be fulfilled on namespaces "e2e-test-operators-qzmwm": the object has been modified; please apply your changes to the latest version and try again

from 11:36:56.425828 until 11:57:38.401832. 
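
The overlap with the timestamps quoted in comment 6 can be checked with a small sketch like the one below. It is only an illustration: the helper names `parse_stamp` and `in_error_window` are hypothetical, and it assumes all timestamps are from the same day (2020-04-08) and the same clock.

```python
from datetime import datetime, time

# Window in which the namespace_scc_allocation_controller errors were seen.
WINDOW_START = "11:36:56.425828"
WINDOW_END = "11:57:38.401832"


def parse_stamp(stamp: str) -> time:
    """Parse an HH:MM:SS.fraction timestamp (1 to 6 fractional digits)."""
    return datetime.strptime(stamp, "%H:%M:%S.%f").time()


def in_error_window(stamp: str) -> bool:
    """Return True if the timestamp falls inside the error window (inclusive)."""
    return parse_stamp(WINDOW_START) <= parse_stamp(stamp) <= parse_stamp(WINDOW_END)


if __name__ == "__main__":
    # Timestamps quoted in comment 6.
    quoted = {
        "OCM leader election attempt": "11:29:14.766485",
        "kube-apiserver Degraded=True": "11:35:32.675",
        "OCM rollout still progressing": "11:48:29.859",
        "create_dockercfg_secrets forbidden": "11:49:39.086650",
    }
    for label, stamp in quoted.items():
        print(f"{label}: {stamp} in window = {in_error_window(stamp)}")
```

Under those assumptions, the sig-builds "still progressing" message and the "forbidden" secret error both fall inside the window, while the kube-apiserver degradation and the initial leader election attempt precede it.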

I'm going to close this as duplicate.

*** This bug has been marked as a duplicate of bug 1820687 ***