Bug 1987108
| Summary: | Networking issue with vSphere clusters running HW14 and later | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Yash Chouksey <ychoukse> | |
| Component: | Machine Config Operator | Assignee: | MCO Team <team-mco> | |
| Machine Config Operator sub component: | Machine Config Operator | QA Contact: | Rio Liu <rioliu> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | urgent | |||
| Priority: | urgent | CC: | aconstan, aojeagar, aos-bugs, apjagtap, aygarg, bbennett, boyang, cavery, c.bruyland, ChetRHosey, cstabler, david.karlsen, dhellmann, dmoessne, doshir, erich, fan-wxa, hshukla, jcallen, jerzhang, jmalde, jmaxwell, jnordell, jscalf, kahara, kgordeev, knaeem, laurent.breuskin, ldu, lmohanty, mas-hatada, mfojtik, mharri, miabbott, mkrejci, nkaushik, palonsor, pbertera, Philippe.Kapp, rbobek, rbrattai, rh-container, ribarry, rioliu, rsandu, sburke, sdodson, shujadha, simore, skrenger, skudupud, sreber, sttts, vlaad, william.caban, wking, yacao, yanyang, ykonotopov | |
| Version: | 4.8 | Keywords: | Reopened | |
| Target Milestone: | --- | |||
| Target Release: | 4.9.0 | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | EmergencyRequest UpdateRecommendationsBlocked | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1987166 1998106 (view as bug list) | Environment: | ||
| Last Closed: | 2021-10-29 15:20:17 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1723620, 1987166, 1998106 | |||
|
Comment 1
Michal Fojtik
2021-07-28 22:08:57 UTC
kube-apiserver has no connectivity to the aggregated apiservers, e.g. from master 1:
2021-07-28T13:36:45.938423139Z E0728 13:36:45.938066 20 controller.go:116] loading OpenAPI spec for "v1.route.openshift.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: error trying to reach service: dial tcp 10.130.0.6:8443: connect: no route to host
Similar messages appear in the other kube-apiserver instances, and (!) for different aggregated apiservers (metrics, oauth, ...). Looks like pod networking is down. At the same time, the openshift-apiserver (the one I checked) is up and happy (it provides route.openshift.io among other APIs).
Moving to networking.
more ./quay-io-openshift-origin-must-gather-sha256-e5e5166f37d7bd043f25276ad450f7aa57d96604e8c1a6c26ab42a9253689079/cluster-scoped-resources/config.openshift.io/infrastructures.yaml
---
apiVersion: config.openshift.io/v1
items:
- apiVersion: config.openshift.io/v1
  kind: Infrastructure
  metadata:
    creationTimestamp: "2021-07-28T12:52:57Z"
    generation: 1
    name: cluster
    resourceVersion: "659"
    uid: e58515fa-5dfc-4399-ae0d-1ac422d8792e
  spec:
    cloudConfig:
      key: config
      name: cloud-provider-config
    platformSpec:
      type: VSphere
  status:
    apiServerInternalURI: https://api-int.marineprod.scotland.gov.uk:6443
    apiServerURL: https://api.marineprod.scotland.gov.uk:6443
It seems that the internal URL used to expose the apiserver, https://api.marineprod.scotland.gov.uk:6443, is not reachable, causing a cascade of network failures.
This URL resolves to 192.168.24.116:
> et \"https://[api-int.marineprod.scotland.gov.uk]:6443/api/v1/namespaces/openshift-kube-controller-manager/pods/installer-8-3master.marineprod.scotland.gov.uk?timeout=1m0s\": dial tcp 192.168.24.116:6443
Can you verify that the URL is working correctly?
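The "no route to host" errors quoted in comment 1 name the pod endpoints that kube-apiserver could not reach. As a rough triage aid, the sketch below pulls the unique `IP:port` pairs out of log text on stdin. The function name `unreachable_endpoints` is hypothetical, and the pattern assumes IPv4 endpoints and the exact message wording shown in comment 1.

```shell
#!/bin/sh
# Sketch only: list the endpoints that appear in
# "dial tcp <ip>:<port>: connect: no route to host" messages.
# Reads log text on stdin, prints one unique "ip:port" per line.
# Assumption: IPv4 addresses and the message wording from comment 1.
unreachable_endpoints() {
  grep -oE 'dial tcp [0-9]+(\.[0-9]+){3}:[0-9]+: connect: no route to host' \
    | awk '{ sub(/:$/, "", $3); print $3 }' \
    | sort -u
}

# Example, using the log line from comment 1:
#   echo 'error trying to reach service: dial tcp 10.130.0.6:8443: connect: no route to host' \
#     | unreachable_endpoints
```

Feeding it the kube-apiserver logs from each master should show whether the unreachable endpoints are always pods on other nodes, which is the pattern reported later in this bug.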
Hello Team,
One of our customers is having the same issue: after upgrading the cluster from 4.7.21 to 4.8.3 (running on vSphere as a disconnected UPI install), the openshift-apiserver operator is degraded with the following errors.
oc get co openshift-apiserver -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  annotations:
    exclude.release.openshift.io/internal-openshift-hosted: "true"
  creationTimestamp: "2021-06-24T10:34:19Z"
  generation: 1
  name: openshift-apiserver
  resourceVersion: "20095132"
  uid: 8aff4f02-b91e-495a-93bb-2cc3b0f88045
spec: {}
status:
  conditions:
  - lastTransitionTime: "2021-08-03T19:10:58Z"
    message: All is well
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2021-08-06T11:35:15Z"
    message: 'APIServerDeploymentProgressing: deployment/apiserver.openshift-apiserver:
      0/3 pods have been updated to the latest generation'
    reason: APIServerDeployment_PodsUpdating
    status: "True"
    type: Progressing
  - lastTransitionTime: "2021-08-05T17:02:44Z"
    message: |-
      APIServicesAvailable: "apps.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
      APIServicesAvailable: "authorization.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
      APIServicesAvailable: "build.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
      APIServicesAvailable: "image.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
      APIServicesAvailable: "project.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
      APIServicesAvailable: "quota.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
      APIServicesAvailable: "route.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
      APIServicesAvailable: "security.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
      APIServicesAvailable: "template.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
    reason: APIServices_Error
    status: "False"
    type: Available
  - lastTransitionTime: "2021-06-24T10:36:24Z"
    message: All is well
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: operator.openshift.io
    name: cluster
    resource: openshiftapiservers
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-apiserver-operator
    resource: namespaces
  - group: ""
    name: openshift-apiserver
    resource: namespaces
  - group: ""
    name: openshift-etcd-operator
    resource: namespaces
  - group: ""
    name: host-etcd-2
    namespace: openshift-etcd
    resource: endpoints
  - group: controlplane.operator.openshift.io
    name: ""
    namespace: openshift-apiserver
    resource: podnetworkconnectivitychecks
  - group: apiregistration.k8s.io
    name: v1.apps.openshift.io
    resource: apiservices
  - group: apiregistration.k8s.io
    name: v1.authorization.openshift.io
    resource: apiservices
  - group: apiregistration.k8s.io
    name: v1.build.openshift.io
    resource: apiservices
  - group: apiregistration.k8s.io
    name: v1.image.openshift.io
    resource: apiservices
  - group: apiregistration.k8s.io
    name: v1.project.openshift.io
    resource: apiservices
  - group: apiregistration.k8s.io
    name: v1.quota.openshift.io
    resource: apiservices
  - group: apiregistration.k8s.io
    name: v1.route.openshift.io
    resource: apiservices
  - group: apiregistration.k8s.io
    name: v1.security.openshift.io
    resource: apiservices
  - group: apiregistration.k8s.io
    name: v1.template.openshift.io
    resource: apiservices
  versions:
  - name: operator
    version: 4.8.3
  - name: openshift-apiserver
    version: 4.8.3
We tried the following workaround, but with no luck:
--> https://access.redhat.com/solutions/5896081
The openshift-apiserver pod endpoints on port 8443 are not reachable across nodes, i.e. on master1 only the endpoint of the openshift-apiserver pod running on that node was accessible. The cluster upgrade completed, and this issue appeared only afterwards. I will be attaching the must-gather.
Hello Antonio,
The customer disabled the offloading on the primary NIC for all the nodes but still, the issue persists.
Regards,
Ayush Garg
Hi Ronak,
Do you have any idea why we are being required to disable `tx-checksum-ip-generic`. This looks similar to the previous VMXNET3 issue.
https://bugzilla.redhat.com/show_bug.cgi?id=1941714
*** *** Every customer attached to this case needs to open an immediate support case with VMware *** ***
We _need_ the following:
- vSphere version with build numbers
- Switch type
- Virtual machine hardware version
*** Bug 1997292 has been marked as a duplicate of this bug. ***
We're asking the following questions to evaluate whether or not this bug warrants blocking an upgrade edge from either the previous X.Y or X.Y.Z. The ultimate goal is to avoid delivering an update which introduces new risk or reduces cluster functionality in any way. Sample answers are provided to give more context and the UpgradeBlocker flag has been added to this bug. It will be removed if the assessment indicates that this should not block upgrade edges. The expectation is that the assignee answers these questions.
Who is impacted? If we have to block upgrade edges based on this issue, which edges would need blocking?
example: Customers upgrading from 4.y.Z to 4.y+1.z running on GCP with thousands of namespaces, approximately 5% of the subscribed fleet
example: All customers upgrading from 4.y.z to 4.y+1.z fail approximately 10% of the time
What is the impact? Is it serious enough to warrant blocking edges?
example: Up to 2 minute disruption in edge routing
example: Up to 90seconds of API downtime
example: etcd loses quorum and you have to restore from backup
How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
example: Issue resolves itself after five minutes
example: Admin uses oc to fix things
example: Admin must SSH to hosts, restore from backups, or other non standard admin activities
Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?
example: No, it’s always been like this we just never noticed
example: Yes, from 4.y.z to 4.y+1.z Or 4.y.z to 4.y.z+1
Who is impacted?
OpenShift 4.7.24+ and 4.8 clusters running atop vSphere HW14, new installs and upgrades to the affected versions
What is the impact? Is it serious enough to warrant blocking edges?
SDN Packet loss resulting in service unavailability.
How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
Unknown final remediation but a workaround of disabling tx-checksum-ip-generic has been shown to improve the situation
Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?
Yes, 4.7.24+ and 4.8.2+ are known to be affected
This is an initial assessment which will be updated when we have more information.
so just to confirm,
with VM HW version set to 15 see c#40 for details, after upgrading my cluster (actually it did not finish completely) from 4.7.21 to 4.7.24 I got the following issue:
# oc get co |grep -v "True False False"
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
authentication 4.7.24 False True True 3h23m
console 4.7.24 False False True 3h27m
monitoring 4.7.24 False True True 3h21m
openshift-apiserver 4.7.24 False False False 3h25m
operator-lifecycle-manager-packageserver 4.7.24 False True False 3h22m
[root@bastion mg]#
which did not change even after waiting nearly 4 hours. Commands took ages and creating a new project timed out with
~~~
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get projectrequests.project.openshift.io)
~~~
The offload setting at that point was:
~~~
[root@bastion mg]# for NODE in master0 master1 master2 worker0 worker1 worker2 worker3 worker4 worker5; do echo ${NODE};ssh -o StrictHostKeyChecking=no core@${NODE} sudo ethtool -k ens192 |egrep 'tx-checksum-ip-generic';done
master0
tx-checksum-ip-generic: on
master1
tx-checksum-ip-generic: on
master2
tx-checksum-ip-generic: on
worker0
tx-checksum-ip-generic: on
worker1
tx-checksum-ip-generic: on
worker2
tx-checksum-ip-generic: on
worker3
tx-checksum-ip-generic: on
worker4
tx-checksum-ip-generic: on
worker5
tx-checksum-ip-generic: on
[root@bastion mg]#
~~~
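For scripted checks, the `ethtool -k` output can be parsed instead of grepped by eye. The following is a minimal sketch: `feature_state` is a hypothetical helper name, and it assumes the usual `feature-name: on|off [fixed]` line layout seen in this bug's output.

```shell
#!/bin/sh
# Sketch only: print the state ("on" or "off") of one offload feature.
# Reads `ethtool -k <iface>` output on stdin; $1 is the feature name.
# Assumption: lines look like "tx-checksum-ip-generic: on" or
# "tx-checksum-ipv4: off [fixed]", optionally indented.
feature_state() {
  awk -v feat="$1" -F': *' '
    { name = $1; gsub(/^[ \t]+/, "", name) }      # drop leading indentation
    name == feat { split($2, a, " "); print a[1]; exit }
  '
}

# Example (on a node):
#   sudo ethtool -k ens192 | feature_state tx-checksum-ip-generic
```

Wrapped in the same `for NODE in ...` loop as above, this gives a one-word answer per node that is easy to compare before and after applying the workaround.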
(I have a must-gather and a sosreport from one master; ping me if needed.)
as soon as I change this to off
~~~
[root@bastion mg]# for NODE in master0 master1 master2 worker0 worker1 worker2 worker3 worker4 worker5; do echo ${NODE};ssh -o StrictHostKeyChecking=no core@${NODE} sudo ethtool -K ens192 tx-checksum-ip-generic off;done
master0
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
master1
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
master2
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
worker0
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
worker1
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
worker2
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
worker3
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
worker4
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
worker5
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
[root@bastion mg]#
~~~
everything is fine again almost instantly and I can create new projects
~~~
[root@bastion]# oc get co |grep -v "True False False"
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
[root@bastion]# oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
authentication 4.7.24 True False False 3m21s
baremetal 4.7.24 True False False 9h
cloud-credential 4.7.24 True False False 10h
cluster-autoscaler 4.7.24 True False False 9h
config-operator 4.7.24 True False False 9h
console 4.7.24 True False False 3m33s
csi-snapshot-controller 4.7.24 True False False 7h26m
dns 4.7.24 True False False 9h
etcd 4.7.24 True False False 9h
image-registry 4.7.24 True False False 9h
ingress 4.7.24 True False False 9h
insights 4.7.24 True False False 9h
kube-apiserver 4.7.24 True False False 9h
kube-controller-manager 4.7.24 True False False 9h
kube-scheduler 4.7.24 True False False 9h
kube-storage-version-migrator 4.7.24 True False False 7h36m
machine-api 4.7.24 True False False 9h
machine-approver 4.7.24 True False False 9h
machine-config 4.7.24 True False False 6h49m
marketplace 4.7.24 True False False 77s
monitoring 4.7.24 True False False 3m4s
network 4.7.24 True False False 9h
node-tuning 4.7.24 True False False 8h
openshift-apiserver 4.7.24 True False False 3m51s
openshift-controller-manager 4.7.24 True False False 8h
openshift-samples 4.7.24 True False False 8h
operator-lifecycle-manager 4.7.24 True False False 9h
operator-lifecycle-manager-catalog 4.7.24 True False False 9h
operator-lifecycle-manager-packageserver 4.7.24 True False False 3m50s
service-ca 4.7.24 True False False 9h
storage 4.7.24 True False False 7h27m
[root@bastion]#
~~~
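The `grep -v "True False False"` filter used in this comment depends on the exact spacing of the columns. A column-aware variant can be sketched with awk; `unhealthy_operators` is a hypothetical helper name, and the sketch assumes every row carries all of the NAME, VERSION, AVAILABLE, PROGRESSING and DEGRADED columns (a row with an empty VERSION would shift the fields).

```shell
#!/bin/sh
# Sketch only: print the names of cluster operators that are not
# Available=True / Progressing=False / Degraded=False.
# Reads `oc get co` output (with its header line) on stdin.
# Assumption: all five leading columns are present on every row.
unhealthy_operators() {
  awk 'NR > 1 && !($3 == "True" && $4 == "False" && $5 == "False") { print $1 }'
}

# Example:
#   oc get co | unhealthy_operators
```

An empty result corresponds to the healthy state shown in the listing above.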
(In reply to Joseph Callen from comment #25)
> Hi Ronak,
>
> Do you have any idea why we are being required to disable
> `tx-checksum-ip-generic`. This looks similar to the previous VMXNET3 issue.
> https://bugzilla.redhat.com/show_bug.cgi?id=1941714
What exactly is the setup and what is the issue?
Couple of questions:
Are tunnels being used? If yes, what tunneling protocol is being used?
What is the destination port being used for the tunnel?
Does the vmxnet3 driver have the fix from PR 1941714?
Thanks,
Ronak
Based on the impact statement in comment 43, we have stopped recommending folks update from versions that are not impacted to versions that are impacted [1].
[1]: https://github.com/openshift/cincinnati-graph-data/pull/1008
(In reply to Ronak Doshi from comment #46)
> (In reply to Joseph Callen from comment #25)
> > Hi Ronak,
> >
> > Do you have any idea why we are being required to disable
> > `tx-checksum-ip-generic`. This looks similar to the previous VMXNET3 issue.
> > https://bugzilla.redhat.com/show_bug.cgi?id=1941714
>
> What exactly is the setup and what is the issue?
Similar to the last udp issue. Standard and Distributed vSwitch(s). NSX-T not affected - still the same CI setup using VMC.
>
> Couple of questions:
> Are tunnels being used? If yes, what tunneling protocol is being used?
Yes, either VXLAN or GENEVE
> What is the destination port being used for the tunnel?
No idea - SDN folks on this BZ, please respond
> Does the vmxnet3 driver have the fix from PR 1941714?
Yes and even if it didn't we have a workaround in place to disable the previous issues with udp.
>
> Thanks,
> Ronak
Based on Cathy's response of changes 8.4 could it be caused by:
https://github.com/torvalds/linux/commit/8a7f280f29a80f6e0798f5d6e07c5dd8726620fe#diff-db4c3dfb5fede7bacdecc2e2c486cb29369c21885ffa6ccb6cd4220c37b0fa75
or
https://github.com/torvalds/linux/commit/1dac3b1bc66dc68dbb0c9f43adac71a7d0a0331a#diff-db4c3dfb5fede7bacdecc2e2c486cb29369c21885ffa6ccb6cd4220c37b0fa75
Ronak, can you see private comments? If not pasting from previous comment
[root@inf14:~] vsish -e cat /net/portsets/$(net-stats -l |grep master |awk '{print $4}')/ports/$(net-stats -l |grep master |awk '{print $1}')/vmxnet3/txSummary
stats of a vmxnet3 vNIC tx queue {
generation:1424
pkts tx ok:12564827
bytes tx ok:6111084746
TSO pkts tx ok:786793
TSO bytes tx ok:4352028717
unicast pkts tx ok:12564748
unicast bytes tx ok:6111081428
multicast pkts tx ok:0
multicast bytes tx ok:0
broadcast pkts tx ok:79
broadcast bytes tx ok:3318
pkts tx failure:0
pkts discarded:341556 <-------------------- *******
error when copying hdrs:0
tso header errors:0
pkt allocation failures:0
# of times a tx queue is stopped:0
failed to map some guest buffers:0
tx completion failure due to stale enableGen:0
giant tso pkts requiring more than 1 pkt handle:0
failed to split a giant tso pkt:0
giant non-tso pkts requiring more than 1 pkt handle:0
failed to create a pkt from more than 1 pkt handle:0
encap (outer) header errors:341556 <------------------------------******
encap (inner) tso header errors:0
}
(In reply to Joseph Callen from comment #49)
> (In reply to Ronak Doshi from comment #46)
> > (In reply to Joseph Callen from comment #25)
> > > Hi Ronak,
> > >
> > > Do you have any idea why we are being required to disable
> > > `tx-checksum-ip-generic`. This looks similar to the previous VMXNET3 issue.
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1941714
> >
> > What exactly is the setup and what is the issue?
>
> Similar to the last udp issue. Standard and Distributed vSwitch(s). NSX-T
> not affected - still the same CI setup using VMC.
>
> >
> > Couple of questions:
> > Are tunnels being used? If yes, what tunneling protocol is being used?
> Yes, either VXLAN or GENEVE
>
> > What is the destination port being used for the tunnel?
> No idea - SDN folks on this BZ, please respond
>
> > Does the vmxnet3 driver have the fix from PR 1941714?
> Yes and even if it didn't we have a workaround in place to disable the
> previous issues with udp.
>
> >
> > Thanks,
> > Ronak
>
> Based on Cathy's response of changes 8.4 could it be caused by:
> https://github.com/torvalds/linux/commit/8a7f280f29a80f6e0798f5d6e07c5dd8726620fe#diff-db4c3dfb5fede7bacdecc2e2c486cb29369c21885ffa6ccb6cd4220c37b0fa75
> or
> https://github.com/torvalds/linux/commit/1dac3b1bc66dc68dbb0c9f43adac71a7d0a0331a#diff-db4c3dfb5fede7bacdecc2e2c486cb29369c21885ffa6ccb6cd4220c37b0fa75
>
> Ronak, can you see private comments? If not pasting from previous comment
>
> [root@inf14:~] vsish -e cat /net/portsets/$(net-stats -l |grep master |awk '{print $4}')/ports/$(net-stats -l |grep master |awk '{print $1}')/vmxnet3/txSummary
> stats of a vmxnet3 vNIC tx queue {
> generation:1424
> pkts tx ok:12564827
> bytes tx ok:6111084746
> TSO pkts tx ok:786793
> TSO bytes tx ok:4352028717
> unicast pkts tx ok:12564748
> unicast bytes tx ok:6111081428
> multicast pkts tx ok:0
> multicast bytes tx ok:0
> broadcast pkts tx ok:79
> broadcast bytes tx ok:3318
> pkts tx failure:0
> pkts discarded:341556 <-------------------- *******
> error when copying hdrs:0
> tso header errors:0
> pkt allocation failures:0
> # of times a tx queue is stopped:0
> failed to map some guest buffers:0
> tx completion failure due to stale enableGen:0
> giant tso pkts requiring more than 1 pkt handle:0
> failed to split a giant tso pkt:0
> giant non-tso pkts requiring more than 1 pkt handle:0
> failed to create a pkt from more than 1 pkt handle:0
> encap (outer) header errors:341556 <------------------------------******
> encap (inner) tso header errors:0
> }
I cannot see private comments. Based on the counters it seems something was not as expected in the encapsulation header. In the previous udp issue, it was the destination port. So, I would like to know what destination port is used here.
> > Does the vmxnet3 driver have the fix from PR 1941714?
> Yes and even if it didn't we have a workaround in place to disable the
> previous issues with udp.
Btw, if I remember correctly, the fix was that tunnel offloads were disabled in the previous PR. If so, then how are tunnel offloads enabled here? Shouldn't they be disabled? Also, is NSX-T installed here?
Thanks,
Ronak
Also, packet capture (with --ng option) as done in PR 1941714 would be helpful and appreciated.
Hi Daniel,
Can you answer Ronak's questions regarding your OCP and vSphere cluster specifics?
Can you also provide `ethtool -k ens192` and `uname -a` for that master?
Thanks!
In all cases the kernel is 4.18.0-305.10.2.el8_4 or 4.18.0-305.12.1.el8_4.
@sdodson isn't that given the OCP version as MCO handles the masters?
In our case we'll have workers on RHEL 7.x 3.10.0-1160.36.2.el7.x86_64 while masters on 4.18.0-305.10.2.el8_4.x86_64 RHCOS.
(In reply to David J. M. Karlsen from comment #54)
> @sdodson isn't that given the OCP version as MCO handles the
> masters?
> In our case we'll have workers on RHEL 7.x 3.10.0-1160.36.2.el7.x86_64
> while masters on 4.18.0-305.10.2.el8_4.x86_64 RHCOS.
Sure, but nowhere else in this bug has it been mentioned that RHEL7 workers are involved. I think we'd probably want a unique bug to track that variant as it may require a RHEL7 kernel fix in the end. We'll also want to verify that the problem exists between two RHEL7 workers and not just between RHCOS control plane and RHEL7 workers.
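The comments above single out two `txSummary` counters, `pkts discarded` and `encap (outer) header errors`, which carry the same value in the pasted output. A small sketch for pulling a single counter out of saved `vsish` output follows; `tx_counter` is a hypothetical helper name, and the `name:value` line layout is assumed from the paste in this bug rather than from any VMware documentation.

```shell
#!/bin/sh
# Sketch only: print the numeric value of one txSummary counter.
# Reads saved vsish txSummary output on stdin; $1 is the counter name.
# Assumption: each counter is on its own line as "name:value", possibly
# indented and possibly followed by a hand-added annotation.
tx_counter() {
  awk -v name="$1" -F: '
    { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }
    key == name { v = $2; sub(/[^0-9].*$/, "", v); print v; exit }
  '
}

# Example:
#   tx_counter 'pkts discarded' < txSummary.txt
#   tx_counter 'encap (outer) header errors' < txSummary.txt
```

Sampling both counters before and after disabling `tx-checksum-ip-generic` would show whether the discards stop growing once the workaround is in place.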
Thanks Jatan, just to make my data complete:
# cat etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="8.4 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.4"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.4 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.4:GA"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/red_hat_enterprise_linux/8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.4
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.4"
#
# cat uname
Linux master1.ocp4-csa.coe.muc.redhat.com 4.18.0-305.10.2.el8_4.x86_64 #1 SMP Mon Jul 12 04:43:18 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
#cat ethtool_-k_ens192
Features for ens192:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off
tx-udp_tnl-csum-segmentation: off
tx-gso-partial: off [fixed]
tx-tunnel-remcsum-segmentation: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
tx-gso-list: off [fixed]
rx-gro-list: off
tls-hw-rx-offload: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
I've got a must-gather and a sos-report from this cluster that I am going to attach.
(In reply to Ronak Doshi from comment #51)
> Also, packet capture (with --ng option) as done in PR 1941714 would be
> helpful and appreciated.
I have captured data like so
https://bugzilla.redhat.com/show_bug.cgi?id=1941714#c10
after I moved all masters to one esxi
and I am happy to provide the data, however it is too big to attach it to this bz
If something else should be captured, pls let me know and be as specific as possible as I am neither a VMWare admin nor a NW guy ;)
(In reply to daniel from comment #62)
> (In reply to Ronak Doshi from comment #51)
> > Also, packet capture (with --ng option) as done in PR 1941714 would be
> > helpful and appreciated.
>
> I have captured data like so
> https://bugzilla.redhat.com/show_bug.cgi?id=1941714#c10
> after I moved all masters to one esxi
>
> and I am happy to provide the data, however it is too big to attach it to
> this bz
>
> If something else should be captured, pls let me know and be as specific as
> possible as I am neither a VMWare admin nor a NW guy ;)
Based on comment 58,
tx-udp_tnl-segmentation: off
tx-udp_tnl-csum-segmentation: off
The overlay offloads are disabled. This means the stack will calculate inner header checksums. So, I am not able to understand how the packets are requesting offloads. The stats shared in comment 49 are for ens192, right?
[root@inf14:~] vsish -e cat /net/portsets/$(net-stats -l |grep master |awk '{print $4}')/ports/$(net-stats -l |grep master |awk '{print $1}')/vmxnet3/txSummary
stats of a vmxnet3 vNIC tx queue {
...
pkts discarded:341556 <------------------------------******
encap (outer) header errors:341556 <------------------------------******
...
}
If so, could you capture packets on ens192 using tcpdump inside the vm when you see the issue?
Thanks,
Ronak
This bug is progressing toward closure via a workaround deployed in OpenShift, we've opened Bug 1998572 to track kernel fix so that we may remove the workaround in future OpenShift versions enabling OpenShift to make use of default offload feature set provided by vmxnet3 driver.
(In reply to Scott Dodson from comment #71)
Dear Red Hat,
> This bug is progressing toward closure via a workaround deployed in
> OpenShift, we've opened Bug 1998572 to track kernel fix so that we may
> remove the workaround in future OpenShift versions enabling OpenShift to
> make use of default offload feature set provided by vmxnet3 driver.
we think it should backport to OCP4.7 and OCP4.8 after bug 1998572 is fixed.
Does Red Hat have plan for that?
Is there any ticket to track for that?
Regards.
*** Bug 1993153 has been marked as a duplicate of this bug. ***
(In reply to weiguo fan from comment #73)
> (In reply to Scott Dodson from comment #71)
> Dear Red Hat,
>
> > This bug is progressing toward closure via a workaround deployed in
> > OpenShift, we've opened Bug 1998572 to track kernel fix so that we may
> > remove the workaround in future OpenShift versions enabling OpenShift to
> > make use of default offload feature set provided by vmxnet3 driver.
>
> we think it should backport to OCP4.7 and OCP4.8 after bug 1998572 is fixed.
> Does Red Hat have plan for that?
> Is there any ticket to track for that?
>
> Regards.
The workaround has already been backported to 4.8 and 4.7. When a kernel fix becomes available that removes the need for these workarounds we will confirm that it fixes the problem in all relevant versions of OpenShift and the workaround will be removed after that.
> The workaround has already been backported to 4.8 and 4.7. When a kernel fix
> becomes available that removes the need for these workarounds we will
> confirm that it fixes the problem in all relevant versions of OpenShift and
> the workaround will be removed after that.
Thanks for the information, Scott.
Could you kindly let us know which 4.8 and 4.7 versions include the workaround?
Regards.
The 4.8 workaround is being tracked in bug 1998106. The 4.7 workaround is being tracked in bug 1998112. Both are likely to go out with the next supported release in their respective z streams, but neither has been released yet.
*** Bug 1993723 has been marked as a duplicate of this bug. ***
Workaround for those who cannot immediately upgrade is to disable tx-checksum-ip-generic on vmxnet3 interfaces, ex:
ethtool -K ens192 tx-checksum-ip-generic off
*** Bug 1996577 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:3759
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days
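The manual workaround stated in this bug applies to vmxnet3 interfaces, and the examples hardcode `ens192`. On nodes where the interface name differs, the vmxnet3 interfaces can be discovered from sysfs first. The sketch below is a dry run that only prints the `ethtool` commands; the helper names `vmxnet3_ifaces` and `print_workaround_cmds` are hypothetical, and it assumes the standard Linux sysfs layout in which `/sys/class/net/<iface>/device/driver` is a symlink to the bound driver.

```shell
#!/bin/sh
# Sketch only: find interfaces bound to the vmxnet3 driver and print the
# workaround command for each one. Nothing is changed on the system.
# $1 (optional) is the sysfs net directory, default /sys/class/net.
vmxnet3_ifaces() {
  vx_root="${1:-/sys/class/net}"
  for vx_dev in "$vx_root"/*; do
    # Virtual devices (lo, bridges, veth) have no device/driver symlink.
    [ -L "$vx_dev/device/driver" ] || continue
    vx_drv="$(basename "$(readlink "$vx_dev/device/driver")")"
    if [ "$vx_drv" = "vmxnet3" ]; then
      printf '%s\n' "${vx_dev##*/}"
    fi
  done
  return 0
}

# Print (not run) the command from this bug for every vmxnet3 interface.
print_workaround_cmds() {
  vmxnet3_ifaces "$@" | while read -r vx_if; do
    printf 'ethtool -K %s tx-checksum-ip-generic off\n' "$vx_if"
  done
}

# Example (on a node):
#   print_workaround_cmds
```

Note that `ethtool -K` changes only the running configuration, so a setting applied this way does not survive a reboot; the fixed releases tracked in this bug carry the workaround for you.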