Bug 1952358

Summary: Openshift-apiserver CO unavailable in fresh OCP 4.7.5 installations
Product: OpenShift Container Platform Reporter: oarribas <oarribas>
Component: Machine Config Operator Assignee: Yu Qi Zhang <jerzhang>
Machine Config Operator sub component: platform-vsphere QA Contact: Michael Nguyen <mnguyen>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: urgent CC: aconstan, agogala, alexisph, alkazako, anbhat, aos-bugs, bleanhar, danw, david.karlsen, dkulkarn, huirwang, jani.eerola, jcallen, jerzhang, jima, lmohanty, mbetti, mfojtik, mgugino, mkrejci, nchoudhu, oarribas, openshift-bugs-escalate, openshift-bugzilla-robot, palonsor, palshure, rbobek, rbrattai, rsandu, scuppett, shishika, simore, sreber, srengan, wking, yhe, zzhao
Version: 4.7 Keywords: Upgrades
Target Milestone: --- Flags: jerzhang: needinfo-
Target Release: 4.8.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: UpdateRecommendationsBlocked
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: 1941246 Environment:
Last Closed: 2021-07-27 23:02:52 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1956749    

Comment 9 Ross Brattain 2021-05-06 21:48:30 UTC
Verified on 4.8.0-0.nightly-2021-05-06-092434

Hypervisor:	VMware ESXi, 7.0.1, 17460241
Model:	Amazon EC2 i3en.metal-2tb
Processor Type:	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz

Tested UPI install on vHW 14 OVN and OpenShiftSDN, then upgraded individual nodes to vHW 14-17

# for f in $(oc get nodes --no-headers -o custom-columns=N:.metadata.name ) ; do oc debug node/$f -- ethtool -k ens192 | grep udp_tnl | tee udp-$f & done

vHW 15
udp-compute-0:tx-udp_tnl-segmentation: off
udp-compute-0:tx-udp_tnl-csum-segmentation: off
vHW 16
udp-compute-1:tx-udp_tnl-segmentation: off
udp-compute-1:tx-udp_tnl-csum-segmentation: off
vHW 17
udp-control-plane-0:tx-udp_tnl-segmentation: off
udp-control-plane-0:tx-udp_tnl-csum-segmentation: off
vHW 14
udp-control-plane-1:tx-udp_tnl-segmentation: off
udp-control-plane-1:tx-udp_tnl-csum-segmentation: off
vHW 14
udp-control-plane-2:tx-udp_tnl-segmentation: off
udp-control-plane-2:tx-udp_tnl-csum-segmentation: off
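The per-node files written by tee can also be checked mechanically rather than by eye. A minimal sketch, assuming the udp-<node> files are in the current directory; the function name check_udp_tnl_off is hypothetical and not part of any OpenShift tooling:

```shell
#!/bin/bash
# Hypothetical helper: fail if any saved ethtool feature line is not "off".
# Assumes files named <prefix><node> produced by the tee command above.
check_udp_tnl_off() {
  local prefix="${1:-udp-}" bad

  # Bail out early when no result files exist, so an empty directory
  # is not mistaken for a passing check.
  if ! compgen -G "${prefix}*" > /dev/null; then
    echo "no files matching ${prefix}* found" >&2
    return 2
  fi

  # grep -H prefixes each match with its file name; the second grep keeps
  # only lines whose current state is not "off" (e.g. "on").
  bad=$(grep -H 'udp_tnl' "${prefix}"* | grep -v ': off' || true)

  if [[ -n "$bad" ]]; then
    echo "offload still enabled:" >&2
    echo "$bad" >&2
    return 1
  fi
  echo "all udp_tnl offloads are off"
}
```

Running `check_udp_tnl_off udp-` after the background jobs finish returns non-zero if any node still reports an offload as on. The same prefix also matches any udp-lower_up-224-<node> files present in the directory.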

Added secondary NIC ens224, verified offloads also disabled
# for f in $(oc get nodes --no-headers -o custom-columns=N:.metadata.name ) ; do oc debug node/$f -- ethtool -k ens224 | grep udp_tnl | tee udp-lower_up-224-$f & done

vHW 15
udp-lower_up-224-compute-0:tx-udp_tnl-segmentation: off
udp-lower_up-224-compute-0:tx-udp_tnl-csum-segmentation: off
vHW 16
udp-lower_up-224-compute-1:tx-udp_tnl-segmentation: off
udp-lower_up-224-compute-1:tx-udp_tnl-csum-segmentation: off
vHW 17
udp-lower_up-224-control-plane-0:tx-udp_tnl-segmentation: off
udp-lower_up-224-control-plane-0:tx-udp_tnl-csum-segmentation: off
vHW 14
udp-lower_up-224-control-plane-1:tx-udp_tnl-segmentation: off
udp-lower_up-224-control-plane-1:tx-udp_tnl-csum-segmentation: off
vHW 14
udp-lower_up-224-control-plane-2:tx-udp_tnl-segmentation: off
udp-lower_up-224-control-plane-2:tx-udp_tnl-csum-segmentation: off

Tested on regular OpenShiftSDN AWS IPI, verified dispatcher.d/99-vsphere-disable-tx-udp-tnl created but not activated on AWS as expected.

May 06 20:23:50.308450 ip-10-0-179-113 ignition[896]: INFO     : files: createFilesystemsFiles: createFiles: op(1a): [started]  writing file "/sysroot/etc/NetworkManager/dispatcher.d/99-vsphere-disable-tx-udp-tnl"
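For context, a NetworkManager dispatcher script can be present on every platform and still do nothing on some of them if it checks the NIC driver first. The following is only an illustrative sketch, not the file shipped by the Machine Config Operator: it assumes the script keys on the vmxnet3 driver, and the function names are hypothetical. NetworkManager passes the interface name as $1 and the action as $2.

```shell
#!/bin/bash
# Hypothetical sketch of a dispatcher.d script; the real
# 99-vsphere-disable-tx-udp-tnl may differ.

# Decide whether this event should trigger the workaround.
# Assumption: only act on "up" events for vmxnet3 (VMware) interfaces.
should_disable_offload() {
  local action="$1" driver="$2"
  [[ "$action" == "up" && "$driver" == "vmxnet3" ]]
}

disable_udp_tnl_offload() {
  local iface="$1" action="$2" driver
  # "ethtool -i" prints a "driver: <name>" line for the interface.
  driver=$(ethtool -i "$iface" 2>/dev/null | awk '/^driver:/ {print $2}')
  if should_disable_offload "$action" "$driver"; then
    ethtool -K "$iface" tx-udp_tnl-segmentation off
    ethtool -K "$iface" tx-udp_tnl-csum-segmentation off
  fi
}

# Only act when invoked with both dispatcher arguments.
if [[ $# -ge 2 ]]; then
  disable_udp_tnl_offload "$1" "$2"
fi
```

Under that assumption, an AWS instance whose NIC uses a different driver would have the file written by ignition but would never run the ethtool commands, which is consistent with "created but not activated".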

Comment 10 jima 2021-05-07 08:25:19 UTC
Upgrade from 4.7.0-0.nightly-2021-05-05-092347 to 4.8.0-0.nightly-2021-05-06-210840

Installed vSphere UPI with nightly build 4.7.0-0.nightly-2021-05-05-092347 in VMC, where:
- "platform:none" field in install-config.yaml
- hardware version is 15

After installation is completed, checked all master/worker nodes:
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on

Then upgraded to 4.8.0-0.nightly-2021-05-06-210840
$ ./oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-05-05-092347   True        True          6s      Working towards registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-05-06-210840: downloading update

$ ./oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-05-06-210840   True        False         99m     Cluster version is 4.8.0-0.nightly-2021-05-06-210840

Checked the nodes again after the upgrade; all now report the following settings:
tx-udp_tnl-segmentation: off
tx-udp_tnl-csum-segmentation: off

Comment 15 errata-xmlrpc 2021-07-27 23:02:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.