Bug 2086301
| Summary: | kubernetes nmstate pods are not running after creating instance | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Aleksandra Malykhin <amalykhi> | |
| Component: | Networking | Assignee: | Christoph Stäbler <cstabler> | |
| Networking sub component: | kubernetes-nmstate-operator | QA Contact: | Aleksandra Malykhin <amalykhi> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | urgent | |||
| Priority: | urgent | CC: | cstabler, mocohen | |
| Version: | 4.11 | Keywords: | Regression | |
| Target Milestone: | --- | |||
| Target Release: | 4.11.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2087060 (view as bug list) | Environment: | ||
| Last Closed: | 2022-08-10 11:12:00 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
|
Description
Aleksandra Malykhin
2022-05-15 12:50:38 UTC
Works as expected with the previous version of knmstate:

[kni@provisionhost-0-0 ~]$ oc get csv
NAME                                              DISPLAY                       VERSION               REPLACES   PHASE
kubernetes-nmstate-operator.4.11.0-202205102228   Kubernetes NMState Operator   4.11.0-202205102228              Succeeded

[kni@provisionhost-0-0 ~]$ oc get pods -n openshift-nmstate
NAME                                    READY   STATUS    RESTARTS   AGE
nmstate-cert-manager-6488cc8c6c-zw2l6   1/1     Running   0          24s
nmstate-handler-nzl25                   1/1     Running   0          23s
nmstate-handler-rwl9l                   1/1     Running   0          23s
nmstate-handler-sqxhq                   1/1     Running   0          24s
nmstate-handler-sr825                   1/1     Running   0          23s
nmstate-handler-zwhn9                   1/1     Running   0          23s
nmstate-operator-5dcdd7d4d9-sz4lj       1/1     Running   0          45s
nmstate-webhook-797c74669c-ts7mq        0/1     Running   0          23s
nmstate-webhook-797c74669c-wwwxp        0/1     Running   0          24s

I think the operator itself has some misconfigured manifests. I can see that from the nmstate-operator pod's logs and definition:
The first is from logs (Full error below):
Deployment.apps \"nmstate-webhook\" is invalid: spec.template.spec.containers[0].image: Required value\ncould not create (apps/v1, Kind=Deployment)
The second is from the container's env vars where we have nmstate as a value, instead of openshift-nmstate:
- name: HANDLER_NAMESPACE
  value: nmstate
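The two symptoms above (an empty container image and the `nmstate` namespace where `openshift-nmstate` is expected) can both be detected on a rendered manifest before it is sent to the API server. The following is a minimal sketch, not the operator's actual code: `find_manifest_problems` is a hypothetical helper, and the `rendered` dictionary is a trimmed-down stand-in for the Deployment that appears in the error log.

```python
from typing import Any


def find_manifest_problems(manifest: dict[str, Any], expected_namespace: str) -> list[str]:
    """Return human-readable problems found in a rendered workload manifest."""
    problems = []

    # Symptom 2 from the report: the object is rendered into the wrong namespace.
    namespace = manifest.get("metadata", {}).get("namespace")
    if namespace != expected_namespace:
        problems.append(
            f"metadata.namespace is {namespace!r}, expected {expected_namespace!r}"
        )

    # Symptom 1 from the report: a container without an image is rejected
    # by the API server with "Required value".
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for index, container in enumerate(pod_spec.get("containers", [])):
        if not container.get("image"):
            problems.append(
                f"spec.template.spec.containers[{index}].image: Required value"
            )
    return problems


# Hypothetical, trimmed-down copy of the rendered nmstate-webhook Deployment.
rendered = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "nmstate-webhook", "namespace": "nmstate"},
    "spec": {
        "template": {
            "spec": {"containers": [{"name": "nmstate-webhook", "image": None}]}
        }
    },
}

for problem in find_manifest_problems(rendered, "openshift-nmstate"):
    print(problem)
```

A check like this only mirrors the API server's validation for the two fields discussed here; it does not replace it.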
Full error from the nmstate-operator pod's logs:
{
"level": "error",
"ts": "2022-05-15T14:47:25.209Z",
"logger": "controller.nmstate",
"msg": "Reconciler error",
"reconciler group": "nmstate.io",
"reconciler kind": "NMState",
"name": "openshift-nmstate",
"namespace": "",
"error": "failed to apply object &{map[apiVersion:apps/v1 kind:Deployment metadata:map[labels:map[app:kubernetes-nmstate component:kubernetes-nmstate-webhook] name:nmstate-webhook namespace:nmstate ownerReferences:[map[apiVersion:nmstate.io/v1 blockOwnerDeletion:true controller:true kind:NMState name:openshift-nmstate uid:40034d2b-f8de-4c1a-bf68-755f8c6a557e]]] spec:map[replicas:2 selector:map[matchLabels:map[name:nmstate-webhook]] strategy:map[type:Recreate] template:map[metadata:map[annotations:map[description:kubernetes-nmstate-webhook resets NNCP status] labels:map[app:kubernetes-nmstate component:kubernetes-nmstate-webhook name:nmstate-webhook]] spec:map[affinity:map[] containers:[map[args:[--zap-time-encoding=iso8601] command:[manager] env:[map[name:WATCH_NAMESPACE value:] map[name:POD_NAME valueFrom:map[fieldRef:map[fieldPath:metadata.name]]] map[name:POD_NAMESPACE valueFrom:map[fieldRef:map[fieldPath:metadata.namespace]]] map[name:RUN_WEBHOOK_SERVER value:] map[name:OPERATOR_NAME value:nmstate] map[name:ENABLE_PROFILER value:False] map[name:PROFILER_PORT value:6060]] image:<nil> imagePullPolicy:Always name:nmstate-webhook ports:[map[containerPort:9443 name:webhook-server protocol:TCP]] readinessProbe:map[httpGet:map[httpHeaders:[map[name:Content-Type value:application/json]] path:/readyz port:webhook-server scheme:HTTPS] initialDelaySeconds:10 periodSeconds:10] resources:map[requests:map[cpu:30m memory:20Mi]] volumeMounts:[map[mountPath:/tmp/k8s-webhook-server/serving-certs/ name:tls-key-pair readOnly:true]]]] nodeSelector:map[kubernetes.io/arch:amd64 node-role.kubernetes.io/master:] priorityClassName:system-cluster-critical serviceAccountName:nmstate-handler tolerations:[map[effect:NoSchedule key:node-role.kubernetes.io/master operator:Exists]] topologySpreadConstraints:[map[labelSelector:map[matchLabels:map[component:kubernetes-nmstate-webhook]] maxSkew:1 topologyKey:kubernetes.io/hostname whenUnsatisfiable:DoNotSchedule]] volumes:[map[name:tls-key-pair 
secret:map[secretName:nmstate-webhook]]]]]]]}: could not create (apps/v1, Kind=Deployment) nmstate/nmstate-webhook: Deployment.apps \"nmstate-webhook\" is invalid: spec.template.spec.containers[0].image: Required value",
"errorVerbose": "Deployment.apps \"nmstate-webhook\" is invalid: spec.template.spec.containers[0].image: Required value\ncould not create (apps/v1, Kind=Deployment) nmstate/nmstate-webhook\ngithub.com/openshift/cluster-network-operator/pkg/apply.ApplyObject\n\t/workdir/vendor/github.com/openshift/cluster-network-operator/pkg/apply/apply.go:43\ngithub.com/nmstate/kubernetes-nmstate/controllers/operator.(*NMStateReconciler).renderAndApply\n\t/workdir/controllers/operator/nmstate_controller.go:311\ngithub.com/nmstate/kubernetes-nmstate/controllers/operator.(*NMStateReconciler).applyHandler\n\t/workdir/controllers/operator/nmstate_controller.go:246\ngithub.com/nmstate/kubernetes-nmstate/controllers/operator.(*NMStateReconciler).Reconcile\n\t/workdir/controllers/operator/nmstate_controller.go:125\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/workdir/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/workdir/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/workdir/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/workdir/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1581\nfailed to apply object &{map[apiVersion:apps/v1 kind:Deployment metadata:map[labels:map[app:kubernetes-nmstate component:kubernetes-nmstate-webhook] name:nmstate-webhook namespace:nmstate ownerReferences:[map[apiVersion:nmstate.io/v1 blockOwnerDeletion:true controller:true kind:NMState name:openshift-nmstate uid:40034d2b-f8de-4c1a-bf68-755f8c6a557e]]] spec:map[replicas:2 
selector:map[matchLabels:map[name:nmstate-webhook]] strategy:map[type:Recreate] template:map[metadata:map[annotations:map[description:kubernetes-nmstate-webhook resets NNCP status] labels:map[app:kubernetes-nmstate component:kubernetes-nmstate-webhook name:nmstate-webhook]] spec:map[affinity:map[] containers:[map[args:[--zap-time-encoding=iso8601] command:[manager] env:[map[name:WATCH_NAMESPACE value:] map[name:POD_NAME valueFrom:map[fieldRef:map[fieldPath:metadata.name]]] map[name:POD_NAMESPACE valueFrom:map[fieldRef:map[fieldPath:metadata.namespace]]] map[name:RUN_WEBHOOK_SERVER value:] map[name:OPERATOR_NAME value:nmstate] map[name:ENABLE_PROFILER value:False] map[name:PROFILER_PORT value:6060]] image:<nil> imagePullPolicy:Always name:nmstate-webhook ports:[map[containerPort:9443 name:webhook-server protocol:TCP]] readinessProbe:map[httpGet:map[httpHeaders:[map[name:Content-Type value:application/json]] path:/readyz port:webhook-server scheme:HTTPS] initialDelaySeconds:10 periodSeconds:10] resources:map[requests:map[cpu:30m memory:20Mi]] volumeMounts:[map[mountPath:/tmp/k8s-webhook-server/serving-certs/ name:tls-key-pair readOnly:true]]]] nodeSelector:map[kubernetes.io/arch:amd64 node-role.kubernetes.io/master:] priorityClassName:system-cluster-critical serviceAccountName:nmstate-handler tolerations:[map[effect:NoSchedule key:node-role.kubernetes.io/master operator:Exists]] topologySpreadConstraints:[map[labelSelector:map[matchLabels:map[component:kubernetes-nmstate-webhook]] maxSkew:1 topologyKey:kubernetes.io/hostname whenUnsatisfiable:DoNotSchedule]] volumes:[map[name:tls-key-pair 
secret:map[secretName:nmstate-webhook]]]]]]]}\ngithub.com/nmstate/kubernetes-nmstate/controllers/operator.(*NMStateReconciler).renderAndApply\n\t/workdir/controllers/operator/nmstate_controller.go:313\ngithub.com/nmstate/kubernetes-nmstate/controllers/operator.(*NMStateReconciler).applyHandler\n\t/workdir/controllers/operator/nmstate_controller.go:246\ngithub.com/nmstate/kubernetes-nmstate/controllers/operator.(*NMStateReconciler).Reconcile\n\t/workdir/controllers/operator/nmstate_controller.go:125\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/workdir/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/workdir/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/workdir/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/workdir/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1581",
"stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/workdir/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/workdir/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"
}
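Because the log entry above repeats the whole rendered object several times, the actual cause is easy to miss. As a rough sketch, the entry can be reduced to a few fields by taking the first line of `errorVerbose`, which in this entry holds the API server's validation message. The function name `summarize_reconciler_error` and the shortened sample entry are assumptions made for illustration; only the field names come from the log.

```python
import json


def summarize_reconciler_error(raw_entry: str) -> dict[str, str]:
    """Reduce a structured reconciler error entry to its essential fields."""
    entry = json.loads(raw_entry)
    verbose = entry.get("errorVerbose", "")
    # In the entry above, the first line of errorVerbose is the validation
    # message; everything after it is wrapping context and stack frames.
    root_cause = verbose.split("\n", 1)[0] if verbose else entry.get("error", "")
    return {
        "time": entry.get("ts", ""),
        "controller": entry.get("logger", ""),
        "object": entry.get("name", ""),
        "root_cause": root_cause,
    }


# Shortened stand-in for the full entry; the long object dump is left out.
sample_entry = json.dumps(
    {
        "level": "error",
        "ts": "2022-05-15T14:47:25.209Z",
        "logger": "controller.nmstate",
        "msg": "Reconciler error",
        "name": "openshift-nmstate",
        "errorVerbose": (
            'Deployment.apps "nmstate-webhook" is invalid: '
            "spec.template.spec.containers[0].image: Required value\n"
            "could not create (apps/v1, Kind=Deployment) nmstate/nmstate-webhook"
        ),
    }
)

summary = summarize_reconciler_error(sample_entry)
print(summary["root_cause"])
```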
My comment above refers to the same version Aleksandra mentioned: kubernetes-nmstate-operator.4.11.0-202205131159

@cstabler The behavior changed (the webhooks started), so I assume the fix is in the latest build. The handlers are not started. Please take a look.

[kni@provisionhost-0-0 ~]$ oc get csv
NAME                                              DISPLAY                       VERSION               REPLACES   PHASE
kubernetes-nmstate-operator.4.11.0-202205161927   Kubernetes NMState Operator   4.11.0-202205161927              Succeeded

[kni@provisionhost-0-0 ~]$ oc get pods
NAME                                    READY   STATUS    RESTARTS   AGE
nmstate-cert-manager-5cd964f945-dwf87   1/1     Running   0          41s
nmstate-operator-8575cdb779-nx5fm       1/1     Running   0          75s
nmstate-webhook-df8f44d8d-lhsjg         1/1     Running   0          41s
nmstate-webhook-df8f44d8d-tp67q         1/1     Running   0          41s

Logs:
2022/05/17 07:33:28 reconciling (/v1, Kind=Namespace) /openshift-nmstate
2022/05/17 07:33:28 update was successful
2022/05/17 07:33:28 reconciling (rbac.authorization.k8s.io/v1, Kind=ClusterRole) /nmstate-cluster-reader
2022/05/17 07:33:28 does not exist, creating (rbac.authorization.k8s.io/v1, Kind=ClusterRole) /nmstate-cluster-reader
2022/05/17 07:33:28 successfully created (rbac.authorization.k8s.io/v1, Kind=ClusterRole) /nmstate-cluster-reader
2022/05/17 07:33:28 reconciling (rbac.authorization.k8s.io/v1, Kind=Role) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 does not exist, creating (rbac.authorization.k8s.io/v1, Kind=Role) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 successfully created (rbac.authorization.k8s.io/v1, Kind=Role) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 reconciling (rbac.authorization.k8s.io/v1, Kind=ClusterRole) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 does not exist, creating (rbac.authorization.k8s.io/v1, Kind=ClusterRole) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 successfully created (rbac.authorization.k8s.io/v1, Kind=ClusterRole) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 reconciling (rbac.authorization.k8s.io/v1, Kind=RoleBinding) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 does not exist, creating (rbac.authorization.k8s.io/v1, Kind=RoleBinding) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 successfully created (rbac.authorization.k8s.io/v1, Kind=RoleBinding) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 reconciling (rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 does not exist, creating (rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 successfully created (rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 reconciling (/v1, Kind=ServiceAccount) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 does not exist, creating (/v1, Kind=ServiceAccount) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 successfully created (/v1, Kind=ServiceAccount) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 reconciling (apps/v1, Kind=Deployment) openshift-nmstate/nmstate-webhook
2022/05/17 07:33:28 does not exist, creating (apps/v1, Kind=Deployment) openshift-nmstate/nmstate-webhook
2022/05/17 07:33:28 successfully created (apps/v1, Kind=Deployment) openshift-nmstate/nmstate-webhook
2022/05/17 07:33:28 reconciling (apps/v1, Kind=Deployment) openshift-nmstate/nmstate-cert-manager
2022/05/17 07:33:28 does not exist, creating (apps/v1, Kind=Deployment) openshift-nmstate/nmstate-cert-manager
2022/05/17 07:33:28 successfully created (apps/v1, Kind=Deployment) openshift-nmstate/nmstate-cert-manager
2022/05/17 07:33:28 reconciling (apps/v1, Kind=DaemonSet) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 does not exist, creating (apps/v1, Kind=DaemonSet) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 successfully created (apps/v1, Kind=DaemonSet) openshift-nmstate/nmstate-handler
2022/05/17 07:33:28 reconciling (/v1, Kind=Service) openshift-nmstate/nmstate-webhook
2022/05/17 07:33:28 does not exist, creating (/v1, Kind=Service) openshift-nmstate/nmstate-webhook
2022/05/17 07:33:28 successfully created (/v1, Kind=Service) openshift-nmstate/nmstate-webhook
2022/05/17 07:33:28 reconciling (admissionregistration.k8s.io/v1, Kind=MutatingWebhookConfiguration) /nmstate
2022/05/17 07:33:28 does not exist, creating (admissionregistration.k8s.io/v1, Kind=MutatingWebhookConfiguration) /nmstate
2022/05/17 07:33:28 successfully created (admissionregistration.k8s.io/v1, Kind=MutatingWebhookConfiguration) /nmstate
2022/05/17 07:33:28 reconciling (policy/v1, Kind=PodDisruptionBudget) openshift-nmstate/nmstate-webhook
2022/05/17 07:33:28 does not exist, creating (policy/v1, Kind=PodDisruptionBudget) openshift-nmstate/nmstate-webhook
2022/05/17 07:33:28 successfully created (policy/v1, Kind=PodDisruptionBudget) openshift-nmstate/nmstate-webhook
{"level":"info","ts":"2022-05-17T07:33:28.748Z","logger":"controllers.NMState","msg":"Reconcile complete."}
E0517 07:33:43.977413 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)
E0517 07:34:33.044727 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)
E0517 07:35:05.194615 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)
E0517 07:35:36.883669 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)
E0517 07:36:17.525820 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)
E0517 07:36:47.843718 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)
E0517 07:37:30.970436 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)
E0517 07:38:14.953079 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)
E0517 07:39:01.697898 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)
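The log above contains two facts worth separating: the operator reports that it created the `nmstate-handler` DaemonSet, and its Node watch fails repeatedly afterwards. A small sketch for pulling both out of such a log follows. The regular expressions are written against the line shapes seen in this log only and are an assumption, not a documented log format; `summarize_operator_log` is a hypothetical name.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... successfully created (apps/v1, Kind=DaemonSet) openshift-nmstate/nmstate-handler
CREATED = re.compile(
    r"successfully created \((?P<group_version>[^,]*), Kind=(?P<kind>\w+)\) (?P<target>\S+)"
)
# Matches lines such as:
#   ... Failed to watch *v1.Node: unknown (get nodes)
WATCH_FAILED = re.compile(r"Failed to watch \*(?P<resource>[\w.]+): (?P<reason>.+)$")


def summarize_operator_log(lines):
    """Return (created objects, watch failure counts per resource)."""
    created = []
    watch_failures = Counter()
    for line in lines:
        match = CREATED.search(line)
        if match:
            created.append((match["kind"], match["target"]))
            continue
        match = WATCH_FAILED.search(line)
        if match:
            watch_failures[match["resource"]] += 1
    return created, watch_failures


# A few lines copied from the log above.
sample_lines = [
    "2022/05/17 07:33:28 reconciling (apps/v1, Kind=DaemonSet) openshift-nmstate/nmstate-handler",
    "2022/05/17 07:33:28 successfully created (apps/v1, Kind=DaemonSet) openshift-nmstate/nmstate-handler",
    "E0517 07:33:43.977413 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)",
    "E0517 07:34:33.044727 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:250: Failed to watch *v1.Node: unknown (get nodes)",
]

created, watch_failures = summarize_operator_log(sample_lines)
print(created)
print(dict(watch_failures))
```

A created DaemonSet with no handler pods, as seen here, suggests looking at the DaemonSet's own events next rather than at the operator log.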
Another log, from the nmstate-handler daemonset:

[kni@provisionhost-0-0 ~]$ oc -n openshift-nmstate describe ds nmstate-handler
Events:
  Type     Reason        Age                   From                  Message
  ----     ------        ----                  ----                  -------
  Warning  FailedCreate  5m15s (x20 over 21m)  daemonset-controller  Error creating: pods "nmstate-handler-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted-v2: .spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used, spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed, spec.containers[0].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used, provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]

Verified on kubernetes-nmstate-operator.4.11.0-202205171127

[kni@provisionhost-0-0 ~]$ oc get pods
NAME                                    READY   STATUS    RESTARTS   AGE
nmstate-cert-manager-6cbb7df4fd-ghsn4   1/1     Running   0          116s
nmstate-handler-jm95p                   1/1     Running   0          116s
nmstate-handler-n5b7j                   1/1     Running   0          116s
nmstate-handler-ppzkz                   1/1     Running   0          116s
nmstate-handler-rf9j7                   1/1     Running   0          116s
nmstate-handler-zj5c2                   1/1     Running   0          116s
nmstate-operator-5f8cd6bb86-6dksj       1/1     Running   0          2m23s
nmstate-webhook-54f48fd9bf-fxxgx        1/1     Running   0          116s
nmstate-webhook-54f48fd9bf-l5xzz        1/1     Running   0          116s

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069 |
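For reference, the FailedCreate event on the nmstate-handler daemonset earlier in this report packs every security context constraint (SCC) rejection into one line. The sketch below splits such a message into per-provider reasons, which makes it easier to see that `privileged` was rejected as "not usable by user or serviceaccount" rather than for a field violation. The parsing rules are inferred from that single event message and are an assumption, not a stable format; `parse_scc_rejections` is a hypothetical helper and the sample message is shortened.

```python
import re


def parse_scc_rejections(message: str) -> dict[str, list[str]]:
    """Map each SCC provider named in the message to its list of rejection reasons."""
    # The provider list sits between the first "[" and the last "]".
    body = message[message.index("[") + 1 : message.rindex("]")]
    rejections: dict[str, list[str]] = {}
    # Each entry starts with the word "provider"; names may or may not be quoted.
    for chunk in re.split(r",\s*(?=provider\s)", body):
        match = re.match(r'provider\s+"?([\w-]+)"?:\s*(.*)', chunk, re.S)
        if match:
            name, details = match.groups()
            rejections[name] = [reason.strip() for reason in details.split(", ")]
    return rejections


# Shortened version of the event message from the daemonset description.
sample_message = (
    'Error creating: pods "nmstate-handler-" is forbidden: unable to validate '
    'against any security context constraint: [provider "anyuid": Forbidden: '
    "not usable by user or serviceaccount, provider restricted-v2: "
    ".spec.securityContext.hostNetwork: Invalid value: true: Host network is "
    'not allowed to be used, spec.volumes[0]: Invalid value: "hostPath": '
    'hostPath volumes are not allowed to be used, provider "privileged": '
    "Forbidden: not usable by user or serviceaccount]"
)

parsed = parse_scc_rejections(sample_message)
unusable = [
    name
    for name, reasons in parsed.items()
    if any("not usable by user or serviceaccount" in reason for reason in reasons)
]
print(unusable)
```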