Bug 1661872 - Ingress operator ran crash
Summary: Ingress operator ran crash
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.1.0
Assignee: Dan Mace
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-12-24 06:48 UTC by Jian Zhang
Modified: 2022-08-04 22:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:41:27 UTC
Target Upstream Version:
Embargoed:




Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2019:0758  0        None      None    None     2019-06-04 10:41:33 UTC

Description Jian Zhang 2018-12-24 06:48:56 UTC
Description of problem:
After creating an OCP 4.0 cluster, the ingress operator crash-loops with the error below.
mac:hongli-payload jianzhang$ oc get pods
NAME                                READY     STATUS             RESTARTS   AGE
ingress-operator-6856699cbd-6b2s2   0/1       CrashLoopBackOff   21         1h
mac:hongli-payload jianzhang$ oc logs -f ingress-operator-6856699cbd-6b2s2
time="2018-12-24T06:18:44Z" level=fatal msg="failed to get dns 'cluster': no matches for kind \"DNS\" in version \"config.openshift.io/v1\""


Version-Release number of selected component (if applicable):
mac:hongli-payload jianzhang$ oc get clusterversion
NAME      VERSION                           AVAILABLE   PROGRESSING   SINCE     STATUS
version   4.0.0-0.alpha-2018-12-23-225529   True        False         1h        Cluster version is 4.0.0-0.alpha-2018-12-23-225529

How reproducible:
always

Steps to Reproduce:
1. Create an OCP 4.0 cluster with the installer, using the "registry.svc.ci.openshift.org/openshift/origin-release@sha256:e4acfa3f979ee2c66860b67ff61f24f65671b18b5a9b9a5040d4d5370be0b32d" release image.

Actual results:
The ingress operator fails to start and reports the error below:
mac:hongli-payload jianzhang$ oc get pods
NAME                                READY     STATUS             RESTARTS   AGE
ingress-operator-6856699cbd-6b2s2   0/1       CrashLoopBackOff   21         1h
mac:hongli-payload jianzhang$ oc logs -f ingress-operator-6856699cbd-6b2s2
time="2018-12-24T06:18:44Z" level=fatal msg="failed to get dns 'cluster': no matches for kind \"DNS\" in version \"config.openshift.io/v1\""

Expected results:
The ingress operator starts and runs without errors.

Additional info:
The root cause is that the CVO (cluster-version operator) did not deploy the "dnses.config.openshift.io" CRD. The query below returns nothing:
mac:hongli-payload jianzhang$ oc get crd|grep dnses.config.openshift.io
mac:hongli-payload jianzhang$ 
I extracted the release payload that the CVO applies and did not find any manifest file for this CRD:
mac:hongli-payload jianzhang$ oc adm release extract --from=registry.svc.ci.openshift.org/openshift/origin-release@sha256:e4acfa3f979ee2c66860b67ff61f24f65671b18b5a9b9a5040d4d5370be0b32d --to latest-payload
mac:latest-payload jianzhang$ ls | grep ingress
0000_70_cluster-ingress-operator_00-cluster-role.yaml
0000_70_cluster-ingress-operator_00-custom-resource-definition.yaml
0000_70_cluster-ingress-operator_00-namespace.yaml
0000_70_cluster-ingress-operator_01-cluster-role-binding.yaml
0000_70_cluster-ingress-operator_01-kube-system-aws-creds-role-binding.yaml
0000_70_cluster-ingress-operator_01-role-binding.yaml
0000_70_cluster-ingress-operator_01-role.yaml
0000_70_cluster-ingress-operator_01-service-account.yaml
0000_70_cluster-ingress-operator_02-deployment.yaml
mac:latest-payload jianzhang$ ack "dnses.config.openshift.io" .
mac:latest-payload jianzhang$
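
The payload check above can be wrapped in a small helper. This is only a sketch: the function name `payload_has_crd` is hypothetical, and it assumes the payload has already been extracted to a local directory with `oc adm release extract` as shown above. Like the `ack` call, it performs a plain substring search over the extracted files.

```shell
# payload_has_crd DIR CRD_NAME   (hypothetical helper, not part of oc)
# Succeeds (exit 0) if any file under DIR mentions CRD_NAME.
payload_has_crd() {
  dir=$1
  crd=$2
  # -r: search recursively, -q: quiet (exit status only),
  # -F: treat CRD_NAME as a fixed string so the dots are not regex wildcards.
  grep -rqF -- "$crd" "$dir"
}

# Example use against the directory extracted above:
#   payload_has_crd latest-payload dnses.config.openshift.io \
#     || echo "payload does not ship dnses.config.openshift.io"
```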

Comment 1 Jian Zhang 2018-12-24 06:51:55 UTC
Workaround:
1) Create the missing CRD, as below:
mac:hongli-payload jianzhang$ cat dnscrd.yaml 
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  creationTimestamp: 2018-12-24T01:24:26Z
  generation: 1
  name: dnses.config.openshift.io
  resourceVersion: "224"
  selfLink: /apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions/dnses.config.openshift.io
  uid: a732a0da-071a-11e9-89e5-020499d9a340
spec:
  additionalPrinterColumns:
  - JSONPath: .metadata.creationTimestamp
    description: |-
      CreationTimestamp is a timestamp representing the server time when this object was created. It is not guaranteed to be set in happens-before order across separate operations. Clients may not set this value. It is represented in RFC3339 form and is in UTC.
 
      Populated by the system. Read-only. Null for lists. More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata
    name: Age
    type: date
  group: config.openshift.io
  names:
    kind: DNS
    listKind: DNSList
    plural: dnses
    singular: dns
  scope: Cluster
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
status:
  acceptedNames:
    kind: DNS
    listKind: DNSList
    plural: dnses
    singular: dns
  conditions:
  - lastTransitionTime: 2018-12-24T01:24:26Z
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: null
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  storedVersions:
  - v1

mac:hongli-payload jianzhang$ oc get crd| grep dnses.config.openshift.io
dnses.config.openshift.io

2) Create the "DNS" object.
mac:hongli-payload jianzhang$ cat dns.yaml 
apiVersion: config.openshift.io/v1
kind: DNS
metadata:
  generation: 1
  name: cluster
spec:
  baseDomain: origin-ci-int-aws.dev.rhcloud.com
status: {}

mac:hongli-payload jianzhang$ oc get DNS
NAME      AGE
cluster   8m

Now, the ingress operator works well!
mac:hongli-payload jianzhang$ oc get pods
NAME                                READY     STATUS    RESTARTS   AGE
ingress-operator-6856699cbd-ksv6x   1/1       Running   0          8m
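
The two workaround steps could be guarded so they run only when the CRD is really absent. The following is a sketch under stated assumptions: the helper name `crd_listed` is hypothetical, it expects the output of `oc get crd -o name` on stdin, and the use of `oc create -f` to apply `dnscrd.yaml` and `dns.yaml` is an assumption, since the comment above does not show which command created the objects. It matches the full CRD name exactly, so `clusterdnses.dns.openshift.io` does not count as a hit.

```shell
# crd_listed CRD_NAME   (hypothetical helper, not part of oc)
# Reads `oc get crd -o name` output on stdin and succeeds only if
# CRD_NAME appears as an exact resource name.
crd_listed() {
  # Strip the "<resource-type>/" prefix, then require a whole-line,
  # fixed-string match on the remaining name.
  sed 's|.*/||' | grep -qxF -- "$1"
}

# Example use (needs a logged-in oc session; file names are the ones
# from the workaround above, the create commands are an assumption):
#   oc get crd -o name | crd_listed dnses.config.openshift.io \
#     || oc create -f dnscrd.yaml
#   oc create -f dns.yaml
```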

Comment 2 Dan Mace 2019-01-24 18:59:37 UTC
This bug appears to be the result of using an updated origin payload with an older cluster. Please re-test with a newer build and installer.

Comment 3 Jian Zhang 2019-01-25 05:13:20 UTC
LGTM. Verified; details below:
Cluster version is 4.0.0-0.nightly-2019-01-24-184525

mac:ocp-25 jianzhang$ oc get dns
NAME      AGE
cluster   39m
mac:ocp-25 jianzhang$ oc get pods -n openshift-ingress
NAME                              READY     STATUS    RESTARTS   AGE
router-default-6898f475f8-5cm4p   1/1       Running   0          2h
mac:ocp-25 jianzhang$ oc get crd|grep dns
clusterdnses.dns.openshift.io                                                                2019-01-25T02:53:11Z
dnses.config.openshift.io                                                                    2019-01-25T02:52:23Z

Comment 6 errata-xmlrpc 2019-06-04 10:41:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

