Bug 1820472
| Field | Value |
|---|---|
| Summary | [Migration] After migrate from sdn to ovn, namespace "openshift-sdn" got stuck at terminating status |
| Product | OpenShift Container Platform |
| Component | Networking |
| Networking sub component | openshift-sdn |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | unspecified |
| Version | 4.5 |
| Target Release | 4.5.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | huirwang |
| Assignee | Peng Liu <pliu> |
| QA Contact | huirwang |
| CC | bbennett, danw, pliu, ricarril |
| Last Closed | 2020-08-04 18:07:38 UTC |
| Type | Bug |
Description
huirwang
2020-04-03 07:30:59 UTC
The namespace reports the message 'Discovery failed for some groups'. The APIs that should be served by the openshift-apiserver were not accessible because the OpenShift-SDN pods had been deleted. This left the namespace stuck in the 'Terminating' state. However, the namespace should be deleted successfully after the cluster reboot, once the cluster network is back to normal, so normally this is not an issue.

This does cause trouble when the OVN-Kubernetes network doesn't work after the cluster reboot and users want to roll back to OpenShift-SDN. None of the openshift-sdn resources can be created, because the namespace is still 'Terminating'.

So to solve this issue, I think we have 2 options:

1. In CNO, don't delete the namespace during the migration, so that we don't need to recreate it during rollback.
2. In the migration procedure document, ask users to check and force delete the namespace if needed before executing the rollback.

I'm inclined toward option 2.

We definitely want to delete the openshift-sdn namespace *eventually*. Maybe it makes sense to figure out how to tweak things so that it doesn't get deleted until after the cluster is back up and running with ovn-kubernetes. If we can't do that, then I think "the user has to manually delete the namespace if they have to roll back" is better than "the user has to manually delete the namespace even on success (if they don't want a stray unused namespace lying around forever)". (So, 2.)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409
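
For anyone scripting the "check the namespace before rollback" step from option 2, here is a minimal sketch, not taken from this bug or from the migration documentation. It assumes you have already saved the output of `oc get namespace openshift-sdn -o json` and only inspects the standard Namespace fields `status.phase` and `status.conditions`. The function name `check_namespace` and the sample data are hypothetical, and the force-delete step itself is deliberately not shown.

```python
import json
from typing import List, Tuple


def check_namespace(ns: dict) -> Tuple[bool, List[str]]:
    """Return (is_terminating, discovery_failure_messages) for a Namespace object.

    `ns` is the parsed JSON of a Namespace, e.g. the output of
    `oc get namespace openshift-sdn -o json`.
    """
    status = ns.get("status", {})
    is_terminating = status.get("phase") == "Terminating"
    # Collect condition messages that mention the discovery failure
    # quoted in the description above.
    messages = [
        cond.get("message", "")
        for cond in status.get("conditions", [])
        if cond.get("status") == "True"
        and "Discovery failed" in cond.get("message", "")
    ]
    return is_terminating, messages


if __name__ == "__main__":
    # Hypothetical sample shaped like a stuck namespace; not real cluster output.
    sample = json.loads("""
    {
      "metadata": {"name": "openshift-sdn"},
      "status": {
        "phase": "Terminating",
        "conditions": [
          {"status": "True", "message": "Discovery failed for some groups"}
        ]
      }
    }
    """)
    terminating, msgs = check_namespace(sample)
    if terminating:
        print("namespace is still Terminating; rollback would be blocked")
        for m in msgs:
            print("  condition:", m)
```

If the function reports the namespace as terminating, the rollback would hit the situation described above and the namespace would need to be cleaned up first.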