Bug 1705103

Summary: Indeterminate end state of install 4.1 drop 4 HTB
Product: OpenShift Container Platform Reporter: jolee
Component: Build Assignee: Adam Kaplan <adam.kaplan>
Status: CLOSED INSUFFICIENT_DATA QA Contact: wewang <wewang>
Severity: low Docs Contact:
Priority: unspecified    
Version: 4.1.0 CC: aos-bugs, bparees, wzheng
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-08-22 20:55:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1664187    
Attachments:
Description                               Flags
oc get clusterversion -oyaml > cv.yaml    none
oc get co -oyaml > co.yaml                none

Description jolee 2019-05-01 13:58:45 UTC
Description of problem:

The install appears successful, with some issues in the install log (see below).

However, trying to pull images from the container catalog fails with the following event (seen both as kubeadmin and as a user created via htpasswd):

   Warning   BuildConfigInstantiateFailed   buildconfig/java-deploy-name   error instantiating Build from BuildConfig java-deploy/java-deploy-name (0): Error resolving ImageStreamTag java:8 in namespace openshift:    unable to find latest tagged image

This is a fresh install and the environment is still available, so please let me know what else is needed.



Actual results:

time="2019-04-30T08:02:32-05:00" level=debug msg=
"Still waiting for the cluster to initialize: Multiple errors are preventing progress:
* Cluster operator authentication has not yet reported success
* Cluster operator image-registry has not yet reported success
* Cluster operator ingress has not yet reported success
* Cluster operator kube-apiserver is reporting a failure: StaticPodsDegraded: nodes/ip-10-0-134-101.us-east-2.compute.internal pods/kube-apiserver-ip-10-0-134-101.us-east-2.compute.internal container=\"kube-apiserver-1\" is not ready
StaticPodsDegraded: pods \"kube-apiserver-ip-10-0-149-0.us-east-2.compute.internal\" not found
StaticPodsDegraded: pods \"kube-apiserver-ip-10-0-160-44.us-east-2.compute.internal\" not found
* Cluster operator kube-scheduler is reporting a failure: StaticPodsDegraded: nodes/ip-10-0-149-0.us-east-2.compute.internal pods/openshift-kube-scheduler-ip-10-0-149-0.us-east-2.compute.internal container=\"scheduler\" is not ready
StaticPodsDegraded: pods \"openshift-kube-scheduler-ip-10-0-160-44.us-east-2.compute.internal\" not found
StaticPodsDegraded: pods \"openshift-kube-scheduler-ip-10-0-134-101.us-east-2.compute.internal\" not found
* Cluster operator marketplace has not yet reported success
* Cluster operator monitoring has not yet reported success
* Cluster operator node-tuning has not yet reported success
* Cluster operator openshift-apiserver is reporting a failure: ResourceSyncControllerDegraded: namespaces \"openshift-apiserver\" not found
* Cluster operator service-catalog-apiserver has not yet reported success
* Cluster operator service-catalog-controller-manager has not yet reported success
* Cluster operator storage has not yet reported success
* Could not update oauthclient \"console\" (269 of 333): the server does not recognize this resource, check extension API servers
* Could not update rolebinding \"openshift/cluster-samples-operator-openshift-edit\" (232 of 333): resource may have been deleted
* Could not update servicemonitor \"openshift-apiserver-operator/openshift-apiserver-operator\" (329 of 333): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor \"openshift-authentication-operator/openshift-authentication-operator\" (305 of 333): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor \"openshift-controller-manager-operator/openshift-controller-manager-operator\" (332 of 333): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor \"openshift-image-registry/image-registry\" (311 of 333): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor \"openshift-kube-apiserver-operator/kube-apiserver-operator\" (320 of 333): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor \"openshift-kube-controller-manager-operator/kube-controller-manager-operator\" (323 of 333): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor \"openshift-kube-scheduler-operator/kube-scheduler-operator\" (326 of 333): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor \"openshift-operator-lifecycle-manager/olm-operator\" (139 of 333): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor \"openshift-service-catalog-apiserver-operator/openshift-service-catalog-apiserver-operator\" (314 of 333): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor \"openshift-service-catalog-controller-manager-operator/openshift-service-catalog-controller-manager-operator\" (317 of 333): the server does not recognize this resource, check extension API servers"
time="2019-04-30T08:04:07-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0"
time="2019-04-30T08:04:07-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: downloading update"
time="2019-04-30T08:04:07-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0"
time="2019-04-30T08:04:07-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 17% complete"
time="2019-04-30T08:04:07-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 23% complete"
time="2019-04-30T08:04:22-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 77% complete"
time="2019-04-30T08:04:38-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 84% complete"
time="2019-04-30T08:04:53-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 87% complete"
time="2019-04-30T08:05:07-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 89% complete"
time="2019-04-30T08:05:22-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 90% complete"
time="2019-04-30T08:06:23-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 96% complete"
time="2019-04-30T08:07:38-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 97% complete, waiting on authentication, ingress, monitoring, openshift-samples"
time="2019-04-30T08:09:37-05:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.1.0-rc.0: 99% complete"
time="2019-04-30T08:10:37-05:00" level=debug msg="Cluster is initialized"
time="2019-04-30T08:10:37-05:00" level=info msg="Waiting up to 10m0s for the openshift-console route to be created..."
time="2019-04-30T08:10:37-05:00" level=debug msg="Route found in openshift-console namespace: console"
time="2019-04-30T08:10:37-05:00" level=debug msg="Route found in openshift-console namespace: downloads"
time="2019-04-30T08:10:37-05:00" level=debug msg="OpenShift console route is created"
time="2019-04-30T08:10:37-05:00" level=info msg="Install complete!"
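
The operators named in the log excerpt above can be pulled out with a short filter. This is a minimal sketch that assumes the log lines look like the excerpt as pasted here (one `* Cluster operator ...` entry per line); the function name is hypothetical, and `.openshift_install.log` is assumed to be the log file in the install directory.

```shell
# Hypothetical helper: read installer log text on stdin and print the
# unique names of cluster operators listed as blocking progress.
blocking_operators() {
  sed -n 's/^\* Cluster operator \([a-z0-9-]*\) .*/\1/p' | LC_ALL=C sort -u
}

# Example, assuming the log sits in the install directory:
#   blocking_operators < .openshift_install.log
```

The filter only matches the `Cluster operator` entries, so the `Could not update ...` lines are ignored.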




Expected results:

Additional info:

Comment 1 Abhinav Dahiya 2019-05-01 14:15:53 UTC
These should provide high-level information:

```
oc --insecure-skip-tls-verify adm must-gather --dest-dir /tmp/artifacts/must-gather
oc get clusterversion -oyaml
oc get co -oyaml
```
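
As a rough way to read the cluster operator output requested here, the sketch below prints each operator with its condition states and filters for those that do not report `Available=True`. The function names are hypothetical, and the jsonpath expression assumes the usual `status.conditions` layout of the ClusterOperator resource.

```shell
# Hypothetical helpers; the names are illustrative, not part of oc.

# Print one line per cluster operator: "<name>\t<Type>=<Status> ...".
list_operator_conditions() {
  oc get co -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.conditions[*]}{.type}={.status}{" "}{end}{"\n"}{end}'
}

# Read lines in the format above on stdin and print the names of
# operators that have no "Available=True" condition.
unavailable_operators() {
  awk '
    NF == 0 { next }
    {
      ok = 0
      for (i = 2; i <= NF; i++) if ($i == "Available=True") ok = 1
      if (!ok) print $1
    }'
}

# Example (needs a logged-in oc client):
#   list_operator_conditions | unavailable_operators
```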

Comment 2 jolee 2019-05-01 15:29:25 UTC
Created attachment 1560874
oc get clusterversion -oyaml > cv.yaml

Comment 3 jolee 2019-05-01 15:29:53 UTC
Created attachment 1560875
oc get co -oyaml > co.yaml

Comment 4 jolee 2019-05-01 15:30:58 UTC
[jolee@leep50 ocp4.1-cluster]$ oc --insecure-skip-tls-verify adm must-gather 
error: unknown command "must-gather"
See 'oc adm -h' for help and examples.


^^^ Perhaps I'm missing context here, or is this just an indication that the cluster is not healthy?
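
The `unknown command` error is printed by the client's own command parser, so it more likely points to an `oc` binary that predates `must-gather` than to the state of the cluster. A minimal sketch of that check follows; the function name is hypothetical, and it assumes `oc adm --help` lists the available subcommands.

```shell
# Hypothetical helper: succeed if the help text on stdin mentions the
# given subcommand as a whole word.
help_lists_subcommand() {
  grep -qw -- "$1"
}

# Example, run against the local client:
#   oc version
#   oc adm --help 2>&1 | help_lists_subcommand must-gather \
#     && echo "must-gather is available" \
#     || echo "this oc client has no must-gather subcommand"
```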

Comment 5 Ben Parees 2019-05-02 21:44:55 UTC
Please provide the buildconfig yaml and the imagestream yaml for the imagestreams your build is referencing (it looks like you are referencing java, so "oc get is/java -n openshift -o yaml").

I am marking this low severity as it seems like a configuration problem, not a release blocker.
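
The requested objects could be collected in one step, as sketched below. The BuildConfig name `java-deploy-name` and namespace `java-deploy` are taken from the event in the description; the helper name and the output file names are hypothetical.

```shell
# Hypothetical helper: dump a BuildConfig and an image stream from the
# "openshift" namespace as YAML files into an output directory.
# Usage: collect_build_inputs <bc-namespace> <bc-name> <imagestream> <out-dir>
collect_build_inputs() {
  bc_namespace=$1
  bc_name=$2
  is_name=$3
  out_dir=$4

  mkdir -p "$out_dir" || return 1
  oc get "bc/$bc_name" -n "$bc_namespace" -o yaml > "$out_dir/bc-$bc_name.yaml" &&
    oc get "is/$is_name" -n openshift -o yaml > "$out_dir/is-$is_name.yaml"
}

# Example (needs a logged-in oc client):
#   collect_build_inputs java-deploy java-deploy-name java ./bz-1705103
```

If `status.tags` in the image stream YAML has no image listed for tag `8`, or the tag carries a failed import condition, that would be consistent with the `unable to find latest tagged image` error.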