Bug 1897026 - [Migration] After updating optional network operator configuration, migration gets stuck on MCO
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Peng Liu
QA Contact: huirwang
URL:
Whiteboard:
Depends On: 1898159
Blocks:
Reported: 2020-11-12 06:02 UTC by huirwang
Modified: 2021-02-24 15:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:32:41 UTC
Target Upstream Version:
Embargoed:




Links
System                   ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata   RHSA-2020:5633  0        None      None    None     2021-02-24 15:33:13 UTC

Description huirwang 2020-11-12 06:02:08 UTC
Description of problem:
After the optional Cluster Network Operator configuration is updated, the migration gets stuck when MCO is enabled.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-11-11-220947

How reproducible:
Always


Steps to Reproduce:
1. Enable migration:
oc annotate Network.operator.openshift.io cluster "networkoperator.openshift.io/network-migration"=""
2. Stop MCO

3. Patch the network type from SDN to OVN, with an optional parameter.
oc patch Network.config.openshift.io cluster --type='merge' --patch '{"spec":{"networkType":"OVNKubernetes","clusterNetwork":[{"cidr":"10.132.0.0/14","hostPrefix":23}]}}'

oc patch Network.operator.openshift.io cluster --type='merge' --patch '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"mtu":1200}}}}'

4. Wait for the multus pods to be recreated.

5. Reboot all the nodes.

6. Enable MCO.
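
Steps 1, 3 and 4 can be collected into a small script. This is only a sketch: `run`, `migrate_steps` and the `DRY_RUN` variable are hypothetical helpers that do not appear in this report, and the multus daemonset name and namespace (`daemonset/multus` in `openshift-multus`) are assumptions. Steps 2, 5 and 6 are left out because the report does not say how MCO was stopped or how the nodes were rebooted.

```shell
#!/usr/bin/env bash

# run: hypothetical helper. Prints the command when DRY_RUN=1, otherwise executes it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# migrate_steps: steps 1, 3 and 4 from this report, in order.
migrate_steps() {
  # Step 1: enable migration via the annotation.
  run oc annotate Network.operator.openshift.io cluster \
    "networkoperator.openshift.io/network-migration="

  # Step 3: switch the network type and set the optional MTU.
  run oc patch Network.config.openshift.io cluster --type=merge \
    --patch '{"spec":{"networkType":"OVNKubernetes","clusterNetwork":[{"cidr":"10.132.0.0/14","hostPrefix":23}]}}'
  run oc patch Network.operator.openshift.io cluster --type=merge \
    --patch '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"mtu":1200}}}}'

  # Step 4: wait for the multus pods to be recreated
  # (daemonset name and namespace are assumptions).
  run oc -n openshift-multus rollout status daemonset/multus
}
```

Running `DRY_RUN=1 migrate_steps` prints the four commands without touching the cluster, which makes it easy to review the patch payloads before applying them.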


Actual results:

One node is stuck in "Ready,SchedulingDisabled" (in testing, it remained in this status for more than two hours).

huiran-mac:script hrwang$ oc get nodes
NAME                                                         STATUS                     ROLES    AGE     VERSION
huirwang-gcp1-lphgb-master-0.c.openshift-qe.internal         Ready                      master   4h15m   v1.19.2+9c2f84c
huirwang-gcp1-lphgb-master-1.c.openshift-qe.internal         Ready,SchedulingDisabled   master   4h15m   v1.19.2+9c2f84c
huirwang-gcp1-lphgb-master-2.c.openshift-qe.internal         Ready                      master   4h15m   v1.19.2+9c2f84c
huirwang-gcp1-lphgb-worker-a-gvgq6.c.openshift-qe.internal   Ready                      worker   4h4m    v1.19.2+9c2f84c
huirwang-gcp1-lphgb-worker-b-xkbsk.c.openshift-qe.internal   Ready                      worker   4h4m    v1.19.2+9c2f84c
huirwang-gcp1-lphgb-worker-c-7qqkr.c.openshift-qe.internal   Ready                      worker   4h4m    v1.19.2+9c2f84c
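
To spot the stuck node without reading the whole listing, the STATUS column can be filtered. A minimal sketch, assuming the default `oc get nodes` column layout shown above; `cordoned_nodes` is a hypothetical helper name.

```shell
# cordoned_nodes: reads `oc get nodes` output on stdin and prints the names of
# nodes whose STATUS column contains SchedulingDisabled. The header row is skipped.
cordoned_nodes() {
  awk 'NR > 1 && $2 ~ /SchedulingDisabled/ { print $1 }'
}
```

Usage: `oc get nodes | cordoned_nodes`. For the listing above this would print only the master-1 node.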

There are some errors in the pod logs below; it is not clear whether they are related to the issue.

oc logs machine-config-daemon-4twqz -n openshift-machine-config-operator -c machine-config-daemon
I1111 10:25:21.958563    4440 daemon.go:344] evicting pod openshift-etcd/etcd-quorum-guard-5c45879d65-bqxb8
E1111 10:25:21.973588    4440 daemon.go:344] error when evicting pod "etcd-quorum-guard-5c45879d65-bqxb8" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1111 10:25:26.973770    4440 daemon.go:344] evicting pod openshift-etcd/etcd-quorum-guard-5c45879d65-bqxb8
E1111 10:25:26.985799    4440 daemon.go:344] error when evicting pod "etcd-quorum-guard-5c45879d65-bqxb8" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1111 10:25:31.986000    4440 daemon.go:344] evicting pod openshift-etcd/etcd-quorum-guard-5c45879d65-bqxb8
E1111 10:25:31.995365    4440 daemon.go:344] error when evicting pod "etcd-quorum-guard-5c45879d65-bqxb8" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
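
The log shows the drain retrying the same eviction every 5 seconds, which is consistent with the node staying in SchedulingDisabled. The rejected evictions can be summarized per pod with a sketch like the following; `eviction_failures` is a hypothetical helper name, and it assumes the log line format shown above.

```shell
# eviction_failures: reads machine-config-daemon log lines on stdin and prints
# "<count> <pod>" for each pod whose eviction attempt returned an error.
eviction_failures() {
  grep 'error when evicting pod' |
    sed -E 's/.*error when evicting pod "([^"]+)".*/\1/' |
    sort | uniq -c | awk '{ print $1, $2 }'
}
```

Usage: `oc logs machine-config-daemon-4twqz -n openshift-machine-config-operator -c machine-config-daemon | eviction_failures`. If a single pod dominates the output, `oc get poddisruptionbudget -n openshift-etcd` may show why its budget currently allows no disruptions (assuming the budget lives in the same namespace as the pod).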

Expected results:
The migration from SDN to OVNKubernetes completes successfully.


Additional info:

Comment 4 Peng Liu 2020-12-07 13:58:00 UTC
This bug shall be fixed together with BZ1898159.

Comment 9 errata-xmlrpc 2021-02-24 15:32:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

