
Bug 1897026

Summary:	[Migration] After updating the optional network operator configuration, migration gets stuck on MCO
Product:	OpenShift Container Platform	Reporter:	huirwang
Component:	Networking	Assignee:	Peng Liu <pliu>
Networking sub component:	openshift-sdn	QA Contact:	huirwang
Status:	CLOSED ERRATA	Docs Contact:
Severity:	high
Priority:	high	CC:	aconstan, anusaxen, bbennett, dosmith, rbrattai, weliang
Version:	4.7
Target Milestone:	---
Target Release:	4.7.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2021-02-24 15:32:41 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1898159
Bug Blocks:

Description huirwang 2020-11-12 06:02:08 UTC

Description of problem:
After the optional Cluster Network Operator configuration is updated, the migration gets stuck when MCO is enabled again.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-11-11-220947

How reproducible:
Always

Steps to Reproduce:
1. Enable migration:
oc annotate Network.operator.openshift.io cluster "networkoperator.openshift.io/network-migration"=""

2. Stop MCO

3. Patch the network type from SDN to OVN, with optional parameters.
oc patch Network.config.openshift.io cluster --type='merge' --patch '{"spec":{"networkType":"OVNKubernetes","clusterNetwork":[{"cidr":"10.132.0.0/14","hostPrefix":23}]}}'

oc patch Network.operator.openshift.io cluster --type='merge' --patch '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"mtu":1200}}}}'

4. Wait for the multus pods to be recreated.

5. Reboot all the nodes.

6. Enable MCO.

Actual results:

One node gets stuck in "Ready,SchedulingDisabled" (in testing, it was still in this status after more than two hours).

huiran-mac:script hrwang$ oc get nodes
NAME                                                        STATUS                    ROLES   AGE    VERSION
huirwang-gcp1-lphgb-master-0.c.openshift-qe.internal        Ready                     master  4h15m  v1.19.2+9c2f84c
huirwang-gcp1-lphgb-master-1.c.openshift-qe.internal        Ready,SchedulingDisabled  master  4h15m  v1.19.2+9c2f84c
huirwang-gcp1-lphgb-master-2.c.openshift-qe.internal        Ready                     master  4h15m  v1.19.2+9c2f84c
huirwang-gcp1-lphgb-worker-a-gvgq6.c.openshift-qe.internal  Ready                     worker  4h4m   v1.19.2+9c2f84c
huirwang-gcp1-lphgb-worker-b-xkbsk.c.openshift-qe.internal  Ready                     worker  4h4m   v1.19.2+9c2f84c
huirwang-gcp1-lphgb-worker-c-7qqkr.c.openshift-qe.internal  Ready                     worker  4h4m   v1.19.2+9c2f84c
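
On a larger cluster, the cordoned node can be picked out of a listing like the one above with a small filter. This is only a sketch: the function name `cordoned_nodes` is hypothetical and not part of the original report, and it assumes the default `oc get nodes` column layout (NAME first, STATUS second).

```shell
# Hypothetical helper (not from the original report): reads `oc get nodes`
# output on stdin and prints the names of nodes whose STATUS column
# contains "SchedulingDisabled".
cordoned_nodes() {
  # NR > 1 skips the header row; $1 is NAME and $2 is STATUS.
  awk 'NR > 1 && $2 ~ /SchedulingDisabled/ { print $1 }'
}
```

Against a live cluster it would be used as `oc get nodes | cordoned_nodes`. For the listing above, the only match is huirwang-gcp1-lphgb-master-1.c.openshift-qe.internal.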

There are some errors in the pod logs below; not sure whether they are related to the issue.

oc logs machine-config-daemon-4twqz -n openshift-machine-config-operator -c machine-config-daemon
I1111 10:25:21.958563 4440 daemon.go:344] evicting pod openshift-etcd/etcd-quorum-guard-5c45879d65-bqxb8
E1111 10:25:21.973588 4440 daemon.go:344] error when evicting pod "etcd-quorum-guard-5c45879d65-bqxb8" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1111 10:25:26.973770 4440 daemon.go:344] evicting pod openshift-etcd/etcd-quorum-guard-5c45879d65-bqxb8
E1111 10:25:26.985799 4440 daemon.go:344] error when evicting pod "etcd-quorum-guard-5c45879d65-bqxb8" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1111 10:25:31.986000 4440 daemon.go:344] evicting pod openshift-etcd/etcd-quorum-guard-5c45879d65-bqxb8
E1111 10:25:31.995365 4440 daemon.go:344] error when evicting pod "etcd-quorum-guard-5c45879d65-bqxb8" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
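
To see at a glance which pods are holding up the drain, the retry lines above can be tallied per pod. This is a rough sketch: `blocked_evictions` is a hypothetical name, and it assumes the machine-config-daemon log format shown in the excerpt, with the pod name in double quotes.

```shell
# Hypothetical helper (not from the original report): reads
# machine-config-daemon log lines on stdin and prints
# "<failure count> <pod name>" for every pod whose eviction was
# rejected because of a disruption budget.
blocked_evictions() {
  # Splitting on double quotes makes the pod name field 2.
  awk -F'"' '
    /error when evicting pod/ && /disruption budget/ { count[$2]++ }
    END { for (pod in count) print count[pod], pod }
  ' | sort -k2
}
```

Piping the `oc logs` command above into `blocked_evictions` would, for this excerpt, print `3 etcd-quorum-guard-5c45879d65-bqxb8`. The budget itself lives in the same namespace as the pod, so `oc get pdb -n openshift-etcd` is one place to look next.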

Expected results:
The migration from SDN to OVN completes successfully.

Additional info:

Comment 4 Peng Liu 2020-12-07 13:58:00 UTC

This bug shall be fixed together with BZ1898159.

Comment 9 errata-xmlrpc 2021-02-24 15:32:41 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633