Bug 1734786
| Summary: | Installation fails customresourcedefinitions.apiextensions.k8s.io "servicemonitors.monitoring.coreos.com" not found | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Itzik Brown <itbrown> |
| Component: | Installer | Assignee: | Luis Tomas Bolivar <ltomasbo> |
| Installer sub component: | openshift-ansible | QA Contact: | Itzik Brown <itbrown> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | ||
| Priority: | unspecified | CC: | alegrand, anpicker, erooth, gpei, ltomasbo, mloibl, pkrupa, surbania |
| Version: | 3.11.0 | ||
| Target Milestone: | --- | ||
| Target Release: | 3.11.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-08-13 14:09:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
This has nothing to do with the monitoring operator, nor with the Kuryr error shown in the kuryr-controller logs. The problem was in kuryr-cni, which was listening on a different port than the configured/expected one (due to using a newer kuryr version). As a result, containers do not get proper networking, for instance:

Events:
  Type     Reason                  Age               From                                         Message
  ----     ------                  ----              ----                                         -------
  Normal   Scheduled               1m                default-scheduler                            Successfully assigned default/router-3-deploy to infra-node-0.openshift.example.com
  Warning  FailedCreatePodSandBox  1m                kubelet, infra-node-0.openshift.example.com  Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "ad145a5d727c0807e1730de0a67c956f9cdd3ac173c0004c59ba087266eb6226" network for pod "router-3-deploy": NetworkPlugin cni failed to set up pod "router-3-deploy_default" network: Looks like http://localhost:5036/addNetwork cannot be reached. Is kuryr-daemon running?: Post http://localhost:5036/addNetwork: dial tcp [::1]:5036: connect: connection refused, failed to clean up sandbox container "ad145a5d727c0807e1730de0a67c956f9cdd3ac173c0004c59ba087266eb6226" network for pod "router-3-deploy": NetworkPlugin cni failed to teardown pod "router-3-deploy_default" network: Looks like http://localhost:5036/delNetwork cannot be reached. Is kuryr-daemon running?: Post http://localhost:5036/delNetwork: dial tcp [::1]:5036: connect: connection refused]
  Normal   SandboxChanged          2s (x10 over 1m)  kubelet, infra-node-0.openshift.example.com  Pod sandbox changed, it will be killed and re-created.

Checked with 3.11.136

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2352
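
The "connection refused" events above mean that nothing was listening on the port the CNI plugin was configured to call (5036 in the quoted URLs). A minimal Python sketch for confirming this from the affected node follows; the helper name `port_is_open` is hypothetical and is not part of kuryr or openshift-ansible. It only checks TCP reachability and does not speak the kuryr-daemon API.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests reachability.
    """
    try:
        # create_connection tries every address the host name resolves to,
        # so "localhost" covers both 127.0.0.1 and ::1.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures.
        return False
```

Running `port_is_open("localhost", 5036)` on a node would be expected to return `False` in the failing state described above, assuming 5036 is the port the CNI configuration points at, as the event messages suggest.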
Description of problem:
Installation of OpenShift with RHOS14 with the latest kuryr images fails.

...
FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (1 retries left).
fatal: [master-0.openshift.example.com]: FAILED! => {"attempts": 30, "changed": true, "cmd": ["oc", "get", "crd", "servicemonitors.monitoring.coreos.com", "-n", "openshift-monitoring", "--config=/tmp/openshift-cluster-monitoring-ansible-t52VgQ/admin.kubeconfig"], "delta": "0:00:00.190286", "end": "2019-07-31 04:31:26.602360", "msg": "non-zero return code", "rc": 1, "start": "2019-07-31 04:31:26.412074", "stderr": "No resources found.\nError from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" not found", "stderr_lines": ["No resources found.", "Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" not found"], "stdout": "", "stdout_lines": []}

The Kuryr controller logs show:
2019-07-31 08:16:32.717 1 ERROR kuryr_kubernetes.controller.drivers.namespace_subnet [-] Namespace missing CRD annotations for selecting the corresponding subnet.: KeyError: 'openstack.org/kuryr-net-crd'
2019-07-31 08:16:32.717 1 ERROR kuryr_kubernetes.controller.drivers.namespace_subnet Traceback (most recent call last):
2019-07-31 08:16:32.717 1 ERROR kuryr_kubernetes.controller.drivers.namespace_subnet   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/namespace_subnet.py", line 65, in _get_namespace_subnet_id
2019-07-31 08:16:32.717 1 ERROR kuryr_kubernetes.controller.drivers.namespace_subnet     net_crd_name = annotations[constants.K8S_ANNOTATION_NET_CRD]
2019-07-31 08:16:32.717 1 ERROR kuryr_kubernetes.controller.drivers.namespace_subnet KeyError: 'openstack.org/kuryr-net-crd'
2019-07-31 08:16:32.717 1 ERROR kuryr_kubernetes.controller.drivers.namespace_subnet
2019-07-31 08:16:32.730 1 ERROR kuryr_kubernetes.controller.drivers.namespace_subnet [-] Namespace missing CRD annotations for selecting the corresponding subnet.: KeyError: 'openstack.org/kuryr-net-crd'

Version-Release number of the following components:
v3.11.134
openshift-ansible-3.11.134-1.git.0.18e5870.el7.noarch

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

Expected results:

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
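
The KeyError in the Kuryr controller log above is raised by a plain dictionary lookup on the namespace annotations when the 'openstack.org/kuryr-net-crd' key is absent. The following Python sketch shows how such a lookup can report a missing annotation without raising. It is an illustration only, not the kuryr-kubernetes implementation: the function name `get_net_crd_name` and the constant name are hypothetical, and the input is assumed to be a namespace object in the usual Kubernetes layout with annotations under `metadata.annotations`.

```python
from typing import Optional

# Annotation key copied from the KeyError message in the controller log.
NET_CRD_ANNOTATION = "openstack.org/kuryr-net-crd"


def get_net_crd_name(namespace: dict) -> Optional[str]:
    """Return the net CRD name annotated on a namespace, or None if absent.

    Hypothetical helper: tolerates a missing or null "metadata" or
    "annotations" entry instead of raising KeyError.
    """
    metadata = namespace.get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    return annotations.get(NET_CRD_ANNOTATION)
```

A caller could treat a `None` result as "namespace not yet annotated" and retry later rather than logging a traceback; whether that fits the controller's design is an assumption, not something stated in this report.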