Bug 2130604
| Summary: | Unable to start/stop VM while rebooting the node where kubemacpool-mac-controller-manager pod is running | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Adolfo Aguirrezabal <aaguirre> |
| Component: | Networking | Assignee: | Ram Lavi <ralavi> |
| Status: | CLOSED ERRATA | QA Contact: | awax |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.11.0 | CC: | awax, blevin, ellorent, nrozen, phoracek, ycui |
| Target Milestone: | --- | | |
| Target Release: | 4.13.3 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | v4.13.3.rhel9-34 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-08-16 14:09:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Adolfo Aguirrezabal
2022-09-28 15:20:31 UTC
We probably don't want to go back to an active-backup architecture, but we should investigate whether there is something that could reduce the downtime. Perhaps a health check that would spawn a new KMP instance in case the old one is not responding.

What happens today if we restart the node where virt-operator is running? Do we suffer the same issue?

My mistake, I meant virt-controller, but those run on workers.

First we have to check whether the environment has multiple masters (I think that is an OpenShift requirement for a proper environment); then we can improve the situation with probes, so that the downtime becomes the pod start time instead of the node reboot time.

Note that if you reboot the node gracefully, the kubemacpool pod should not experience the downtime you mention. The scenario is relevant when you reboot the node ungracefully (for example, a node crash). For more information on graceful reboot, please see https://docs.openshift.com/container-platform/4.11/nodes/nodes/nodes-nodes-rebooting.html#nodes-nodes-rebooting-gracefully_nodes-nodes-rebooting.

In order to fix the issue for cases where the node restarts ungracefully, we need to add a toleration to the KMP-manager deployment pod. I also explored adding a liveness probe, but that doesn't work: the kubelet (the agent that probes the pod to determine whether it is alive) dies with the node, which renders the probe useless in this case. I have set it so that if the node is down for more than 1 minute, the KMP pod will be evicted to another node if one is available. PR will arrive soon.

The fix was merged on KMP. A KMP release and pinning it in CNAO are next.

Hey Anat, please provide an explanation as to why this BZ is not fixed.

Hi Ram, as you can see in the pod spec, the toleration is still set to 300:
oc get pod -n openshift-cnv kubemacpool-mac-controller-manager-649cbb596c-24qfg -oyaml

apiVersion: v1
kind: Pod
metadata:
  annotations:
    description: KubeMacPool manages MAC allocation to Pods and VMs
    k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["10.129.0.131/23"],"mac_address":"0a:58:0a:81:00:83","gateway_ips":["10.129.0.1"],"ip_address":"10.129.0.131/23","gateway_ip":"10.129.0.1"}}'
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "ovn-kubernetes",
          "interface": "eth0",
          "ips": [
              "10.129.0.131"
          ],
          "mac": "0a:58:0a:81:00:83",
          "default": true,
          "dns": {}
      }]
    k8s.v1.cni.cncf.io/networks-status: |-
      [{
          "name": "ovn-kubernetes",
          "interface": "eth0",
          "ips": [
              "10.129.0.131"
          ],
          "mac": "0a:58:0a:81:00:83",
          "default": true,
          "dns": {}
      }]
    openshift.io/scc: restricted-v2
    seccomp.security.alpha.kubernetes.io/pod: runtime/default
  creationTimestamp: "2023-03-07T08:48:03Z"
  generateName: kubemacpool-mac-controller-manager-649cbb596c-
...
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
    operator: Exists
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  volumes:
  - name: tls-key-pair
    secret:
      defaultMode: 420
      secretName: kubemacpool-service
  - name: kube-api-access-2lqx4
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
      - configMap:
          items:
          - key: service-ca.crt
            path: service-ca.crt
          name: openshift-service-ca.crt
...
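For reference, the fix described in the comments amounts to shortening the NoExecute tolerations on the manager pod template, along these lines (a minimal sketch assuming a 60-second window to match the "1 minute" mentioned above; the exact values shipped by the KMP and CNAO PRs may differ):

  tolerations:
  # Evict the pod ~60s after the node stops reporting ready,
  # instead of the Kubernetes default of 300s.
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 60
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 60

With the default 300-second toleration, a pod on an unreachable node is evicted only after 5 minutes; lowering tolerationSeconds lets the ReplicaSet reschedule KMP onto a healthy node much sooner after an ungraceful node failure.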
The reason the Kubemacpool fix wasn't enough is that the toleration is overwritten by the CNAO operator when KMP is deployed by it.

Deferring to 4.13.2 to save capacity for verifying urgent 4.13.1 bugs.

For the CNV 4.13 stable branch: https://github.com/kubevirt/cluster-network-addons-operator/pull/1581

Verified the bug on PSI cluster net-bl-4133250 (v4.13.3-250).

Version info:

[blevin@fedora kubeconfigs]$ oc get csv -n openshift-cnv
NAME                                       DISPLAY                    VERSION   REPLACES                                   PHASE
kubevirt-hyperconverged-operator.v4.13.3   OpenShift Virtualization   4.13.3    kubevirt-hyperconverged-operator.v4.13.2   Succeeded

[blevin@fedora kubeconfigs]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.9    True        False         4h32m   Cluster version is 4.13.9

Reproduced by the steps in the description. After rebooting the node, it takes approximately 2 minutes for the user to be able to use virtctl commands on the VM.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Virtualization 4.13.3 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:4664
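For anyone re-verifying on a fixed build, one way to inspect the tolerations that CNAO actually deploys is to read them off the deployment's pod template (a sketch; the deployment name is inferred from the pod's generateName shown above, and jsonpath prints the list compactly):

# Show the tolerations CNAO rendered into the KMP manager deployment.
oc get deployment -n openshift-cnv kubemacpool-mac-controller-manager \
  -o jsonpath='{.spec.template.spec.tolerations}'

On a fixed build, the NoExecute entries should show a tolerationSeconds well below the default 300; on the broken build captured above, they still read 300.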