Bug 1926279
| Summary: | Pod ignores mtu setting from sriovNetworkNodePolicies in case of PF partitioning | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Nikita <nkononov> |
| Component: | Networking | Assignee: | Peng Liu <pliu> |
| Networking sub component: | SR-IOV | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | medium | CC: | anbhat, dosmith |
| Version: | 4.7 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.8.0 | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-07-27 22:42:10 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Verified this bug on 4.8.0-202103270026.p0 with the following configuration:
# cat *
apiVersion: v1
kind: Pod
metadata:
  generateName: testpod1
  labels:
    env: test
  annotations:
    k8s.v1.cni.cncf.io/networks: intel-netdevice-1400
spec:
  containers:
  - name: test-pod
    image: quay.io/openshifttest/hello-sdn@sha256:d5785550cf77b7932b090fcd1a2625472912fb3189d5973f177a5a2c347a1f95
---
apiVersion: v1
kind: Pod
metadata:
  generateName: testpod1
  namespace: z1
  labels:
    env: test
  annotations:
    k8s.v1.cni.cncf.io/networks: intel-netdevice-9000
spec:
  containers:
  - name: test-pod
    image: quay.io/openshifttest/hello-sdn@sha256:d5785550cf77b7932b090fcd1a2625472912fb3189d5973f177a5a2c347a1f95
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: intel-netdevice-mtu1400
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice
  nicSelector:
    pfNames:
    - ens1f0#3-4
    rootDevices:
    - '0000:3b:00.0'
    vendor: '8086'
  nodeSelector:
    feature.node.kubernetes.io/sriov-capable: 'true'
  priority: 99
  numVfs: 5
  mtu: 1400
  resourceName: intelmtu1400
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: intel-netdevice-mtu9000
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice
  nicSelector:
    pfNames:
    - ens1f0#1-2
    rootDevices:
    - '0000:3b:00.0'
    vendor: '8086'
  nodeSelector:
    feature.node.kubernetes.io/sriov-capable: 'true'
  priority: 99
  numVfs: 5
  mtu: 9000
  resourceName: intelmtu9000
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: intel-netdevice-1400
  namespace: openshift-sriov-network-operator
spec:
  ipam: |
    {
      "type": "host-local",
      "subnet": "10.56.215.0/24",
      "rangeStart": "10.56.215.171",
      "rangeEnd": "10.56.215.181",
      "routes": [{
        "dst": "0.0.0.0/0"
      }],
      "gateway": "10.56.215.1"
    }
  vlan: 0
  spoofChk: "on"
  trust: "off"
  resourceName: intelmtu1400
  networkNamespace: z2
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: intel-netdevice-9000
  namespace: openshift-sriov-network-operator
spec:
  ipam: |
    {
      "type": "host-local",
      "subnet": "10.56.217.0/24",
      "rangeStart": "10.56.217.171",
      "rangeEnd": "10.56.217.181",
      "routes": [{
        "dst": "0.0.0.0/0"
      }],
      "gateway": "10.56.217.1"
    }
  vlan: 0
  spoofChk: "on"
  trust: "off"
  resourceName: intelmtu9000
  networkNamespace: z1
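
The #3-4 and #1-2 suffixes in pfNames are the operator's PF-partitioning syntax: each policy claims only that VF index range on ens1f0, which is how two MTU policies share a single PF. The operator also renders each SriovNetwork into a NetworkAttachmentDefinition in its networkNamespace, which is the name the pod annotations reference. Hedged sanity checks (the rendered objects are assumed to keep the SriovNetwork names; the ls runs on the node and assumes the standard SR-IOV sysfs layout):

# oc get network-attachment-definitions -n z1
# oc get network-attachment-definitions -n z2
# ls -l /sys/class/net/ens1f0/device/virtfn*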
# oc exec -n z1 testpod1zzm5r -- ip a show net1
2487: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether 9a:1c:8c:34:e3:ed brd ff:ff:ff:ff:ff:ff
    inet 10.56.217.171/24 brd 10.56.217.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::981c:8cff:fe34:e3ed/64 scope link
       valid_lft forever preferred_lft forever

# oc exec -n z2 testpod1pjqrr -- ip a show net1
2490: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc mq state UP group default qlen 1000
    link/ether ee:cb:17:de:50:3b brd ff:ff:ff:ff:ff:ff
    inet 10.56.215.171/24 brd 10.56.215.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::eccb:17ff:fede:503b/64 scope link
       valid_lft forever preferred_lft forever
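
Each pod now reports the MTU from its own policy (9000 in z1, 1400 in z2). The PF itself should still carry the largest MTU among the policies that partition it, 9000 in this setup; a hedged node-side check, with the node name as a placeholder:

# oc debug node/<node> -- chroot /host ip link show ens1f0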
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
Description of problem:

If PF partitioning is configured, the VF in the pod inherits its MTU from the PF instead of using the MTU from the matching SriovNetworkNodePolicy.

Current behavior:

The SR-IOV operator configures the MTU on the PF from the largest value across the policies that share it. In this case, however, that largest MTU also ends up on every VF handed to a pod, and the corresponding values from the relevant SriovNetworkNodePolicy objects are ignored.

Issue:

The SR-IOV operator honors only the largest SriovNetworkNodePolicy.spec.mtu and ignores every other value configured by the admin. This creates a configuration conflict between policies: SriovNetworkNodePolicy A (MTU 9000), configured for user A in namespace A, changes the MTU in the environment of user B in namespace B despite SriovNetworkNodePolicy B (MTU 1500).

Possible solution:

It is correct to configure the PF with the largest MTU across the SriovNetworkNodePolicy objects, but the CNI plugin should then apply the configuration from the relevant policy to the pod instead of inheriting the PF MTU. That way, applications inside a pod respect the configured MTU. It is of course possible to overwrite this value manually inside a pod (ip link set dev X mtu Y; see the sketch after the reproduction steps below), but only if the user has sufficient permissions.

Version-Release number of selected component (if applicable):

4.7.0-fc.4
SR-IOV operator:
    Image: registry.redhat.io/openshift4/ose-sriov-network-operator@sha256:569327e96d23fa53360ade21559797c9ce27cbb61b0228bc306cafbb7897b588
    Image ID: registry.redhat.io/openshift4/ose-sriov-network-operator@sha256:383428eaee9e59e138925372a1e2029b2af4e7c1ff1841b983de804c6359e40e

How reproducible:

1. Create two SriovNetworkNodePolicy objects, one with a jumbo MTU (9000) and one with a standard MTU (1500).
2. Create the corresponding SriovNetwork for each policy.
3. Create a pod attached to the standard-MTU network.
4. Run ip link show in the pod and check the MTU: it will be 9000 instead of 1500.
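
The manual workaround mentioned under "Possible solution" above, as a minimal sketch: changing an interface MTU requires the CAP_NET_ADMIN capability in the pod's network namespace, and the namespace and pod name below are placeholders (net1 is the usual Multus secondary interface):

# oc exec -n <namespace> <pod> -- ip link set dev net1 mtu 1500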
Actual results:

ip link show:

3585: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 20:04:0f:f1:88:03 brd ff:ff:ff:ff:ff:ff

Expected results:

ip link show:

3585: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 20:04:0f:f1:88:03 brd ff:ff:ff:ff:ff:ff

Additional info:

Below is the attached configuration from my environment. The first policy supports jumbo frames (MTU 9000):

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  creationTimestamp: "2021-02-07T15:04:51Z"
  generateName: test-policy-jumbo
  generation: 1
  managedFields:
  - apiVersion: sriovnetwork.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:generateName: {}
      f:spec:
        .: {}
        f:deviceType: {}
        f:mtu: {}
        f:nicSelector:
          .: {}
          f:pfNames: {}
        f:nodeSelector:
          .: {}
          f:node-role.kubernetes.io/worker-cnf: {}
        f:numVfs: {}
        f:priority: {}
        f:resourceName: {}
      f:status: {}
    manager: sriov.test
    operation: Update
    time: "2021-02-07T15:04:51Z"
  name: test-policy-jumbohq9jq
  namespace: openshift-sriov-network-operator
  resourceVersion: "9425117"
  selfLink: /apis/sriovnetwork.openshift.io/v1/namespaces/openshift-sriov-network-operator/sriovnetworknodepolicies/test-policy-jumbohq9jq
  uid: eaa3b055-11fc-4ea5-9f7f-4a43428b820a
spec:
  deviceType: netdevice
  isRdma: false
  linkType: eth
  mtu: 9000
  nicSelector:
    pfNames:
    - ens3f0#4-5
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""
  numVfs: 6
  priority: 99
  resourceName: testresourcejumbo

The second policy uses a custom MTU of 1450:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  creationTimestamp: "2021-02-07T15:04:51Z"
  generateName: test-policy-custom
  generation: 1
  managedFields:
  - apiVersion: sriovnetwork.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:generateName: {}
      f:spec:
        .: {}
        f:deviceType: {}
        f:mtu: {}
        f:nicSelector:
          .: {}
          f:pfNames: {}
        f:nodeSelector:
          .: {}
          f:node-role.kubernetes.io/worker-cnf: {}
        f:numVfs: {}
        f:priority: {}
        f:resourceName: {}
      f:status: {}
    manager: sriov.test
    operation: Update
    time: "2021-02-07T15:04:51Z"
  name: test-policy-customn87pg
  namespace: openshift-sriov-network-operator
  resourceVersion: "9425101"
  selfLink: /apis/sriovnetwork.openshift.io/v1/namespaces/openshift-sriov-network-operator/sriovnetworknodepolicies/test-policy-customn87pg
  uid: 1d671d04-c454-402c-926c-d48316cdaf7c
spec:
  deviceType: netdevice
  isRdma: false
  linkType: eth
  mtu: 1450
  nicSelector:
    pfNames:
    - ens3f0#2-3
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""
  numVfs: 6
  priority: 99
  resourceName: testresourcecustom

Pod connected to the policy with MTU 1450:

# oc get pod testpod-vkgff -o yaml -n sriov-operator-tests
apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["10.135.1.84/23"],"mac_address":"0a:58:0a:87:01:54","gateway_ips":["10.135.0.1"],"ip_address":"10.135.1.84/23","gateway_ip":"10.135.0.1"}}'
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "",
          "interface": "eth0",
          "ips": [
              "10.135.1.84"
          ],
          "mac": "0a:58:0a:87:01:54",
          "default": true,
          "dns": {}
      },{
          "name": "sriov-operator-tests/test-sriov-static-custom",
          "interface": "net1",
          "ips": [
              "192.168.100.2"
          ],
          "mac": "4a:c7:eb:89:7c:b6",
          "dns": {},
          "device-info": {
              "type": "pci",
              "version": "1.0.0",
              "pci": {
                  "pci-address": "0000:d8:02.2"
              }
          }
      }]
    k8s.v1.cni.cncf.io/networks: "[\n\t\t{\n\t\t\t\"name\": \"test-sriov-static-custom\",\n\t\t\t\"mac\": \"20:04:0f:f1:88:03\",\n\t\t\t\"ips\": [\"192.168.100.2/24\"]\n\t\t}\n\t]"
    k8s.v1.cni.cncf.io/networks-status: |-
      [{
          "name": "",
          "interface": "eth0",
          "ips": [
              "10.135.1.84"
          ],
          "mac": "0a:58:0a:87:01:54",
          "default": true,
          "dns": {}
      },{
          "name": "sriov-operator-tests/test-sriov-static-custom",
          "interface": "net1",
          "ips": [
              "192.168.100.2"
          ],
          "mac": "4a:c7:eb:89:7c:b6",
          "dns": {},
          "device-info": {
              "type": "pci",
              "version": "1.0.0",
              "pci": {
                  "pci-address": "0000:d8:02.2"
              }
          }
      }]
    openshift.io/scc: privileged
  creationTimestamp: "2021-02-07T15:17:18Z"
  generateName: testpod-
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:k8s.ovn.org/pod-networks: {}
    manager: ovnkube
    operation: Update
    time: "2021-02-07T15:17:18Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:k8s.v1.cni.cncf.io/networks: {}
        f:generateName: {}
      f:spec:
        f:containers:
          k:{"name":"test"}:
            .: {}
            f:command: {}
            f:image: {}
            f:imagePullPolicy: {}
            f:name: {}
            f:resources:
              .: {}
              f:limits:
                f:openshift.io/testresourcecustom: {}
              f:requests:
                f:openshift.io/testresourcecustom: {}
            f:terminationMessagePath: {}
            f:terminationMessagePolicy: {}
        f:dnsPolicy: {}
        f:enableServiceLinks: {}
        f:nodeSelector:
          .: {}
          f:kubernetes.io/hostname: {}
        f:restartPolicy: {}
        f:schedulerName: {}
        f:securityContext:
          .: {}
          f:seLinuxOptions:
            f:level: {}
        f:terminationGracePeriodSeconds: {}
    manager: sriov.test
    operation: Update
    time: "2021-02-07T15:17:18Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:k8s.v1.cni.cncf.io/network-status: {}
          f:k8s.v1.cni.cncf.io/networks-status: {}
    manager: multus
    operation: Update
    time: "2021-02-07T15:17:20Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          k:{"type":"ContainersReady"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:status: {}
            f:type: {}
          k:{"type":"Initialized"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:status: {}
            f:type: {}
          k:{"type":"Ready"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:status: {}
            f:type: {}
        f:containerStatuses: {}
        f:hostIP: {}
        f:phase: {}
        f:podIP: {}
        f:podIPs:
          .: {}
          k:{"ip":"10.135.1.84"}:
            .: {}
            f:ip: {}
        f:startTime: {}
    manager: kubelet
    operation: Update
    time: "2021-02-07T15:17:23Z"
  name: testpod-vkgff
  namespace: sriov-operator-tests
  resourceVersion: "9431978"
  selfLink: /api/v1/namespaces/sriov-operator-tests/pods/testpod-vkgff
  uid: be22980a-768f-4e37-8de7-fe4401d86e6c
spec:
  containers:
  - command:
    - sleep
    - INF
    image: docker-registry.upshift.redhat.com/cnf-gotests/cnf-gotests-client:v4.7
    imagePullPolicy: IfNotPresent
    name: test
    resources:
      limits:
        openshift.io/testresourcecustom: "1"
      requests:
        openshift.io/testresourcecustom: "1"
    securityContext:
      capabilities:
        drop:
        - MKNOD
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-d2f8r
      readOnly: true
    - mountPath: /etc/podnetinfo
      name: podnetinfo
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: default-dockercfg-cxrgv
  nodeName: cnfdt13.lab.eng.tlv2.redhat.com
  nodeSelector:
    kubernetes.io/hostname: cnfdt13.lab.eng.tlv2.redhat.com
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    seLinuxOptions:
      level: s0:c26,c15
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 0
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: default-token-d2f8r
    secret:
      defaultMode: 420
      secretName: default-token-d2f8r
  - downwardAPI:
      defaultMode: 420
      items:
      - fieldRef:
          apiVersion: v1
          fieldPath: metadata.labels
        path: labels
      - fieldRef:
          apiVersion: v1
          fieldPath: metadata.annotations
        path: annotations
    name: podnetinfo
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2021-02-07T15:17:18Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2021-02-07T15:17:23Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2021-02-07T15:17:23Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2021-02-07T15:17:18Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://5ea5884825a5eb8bce9b2d6568564be6ca93a03ed43098f9a630704bf683b9ef
    image: docker-registry.upshift.redhat.com/cnf-gotests/cnf-gotests-client:v4.7
    imageID: docker-registry.upshift.redhat.com/cnf-gotests/cnf-gotests-client@sha256:ec0d2e9591e0f124be4e60a80b24ce17b552b69a5c43b638ca959c6e89eb0029
    lastState: {}
    name: test
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2021-02-07T15:17:22Z"
  hostIP: 10.46.55.27
  phase: Running
  podIP: 10.135.1.84
  podIPs:
  - ip: 10.135.1.84
  qosClass: BestEffort
  startTime: "2021-02-07T15:17:18Z"

Output of ip link show from the pod (net1 shows the PF's MTU 9000 instead of the configured 1450):

3585: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 20:04:0f:f1:88:03 brd ff:ff:ff:ff:ff:ff
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: eth0@if3614: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP mode DEFAULT group default
    link/ether 0a:58:0a:87:01:54 brd ff:ff:ff:ff:ff:ff link-netnsid 0
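
The eth0 MTU of 1400 is presumably the OVN-Kubernetes cluster-network MTU and is unrelated; the defect is net1 inheriting the PF's 9000. A hedged way to observe the inheritance on the node (PF name taken from the policies above, node name from the pod spec):

# oc debug node/cnfdt13.lab.eng.tlv2.redhat.com -- chroot /host ip link show ens3f0

Per the description, the PF reports mtu 9000 here, and every VF netdev inherits that value unless the CNI resets it from the matching policy.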