Bug 1908170

Summary: sriov network resource injector: Hugepage injection doesn't work with multiple containers
Product: OpenShift Container Platform
Reporter: zenghui.shi <zshi>
Component: Networking
Assignee: zenghui.shi <zshi>
Networking sub component: SR-IOV
QA Contact: zhaozhanqi <zzhao>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: medium
Version: 4.7
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-27 22:35:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description zenghui.shi 2020-12-16 01:38:04 UTC
Description of problem:

The hugepage Downward API was added in Kubernetes 1.20.
network-resources-injector attaches a hugepage Downward API volume to SR-IOV pods, but it injects the hugepage entries only for the first container in the pod. The remaining containers get no hugepage request/limit information.
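
The expected behavior can be sketched as follows. This is a minimal illustration in Python, not the injector's actual code (the real webhook is a separate project and its internals are not shown in this report). The function name `hugepage_downward_items` and the `HUGEPAGE_SIZES` table are hypothetical. The file naming pattern `hugepages_1G_request_<container>` is taken from the verification output later in this report; the `2M` variant is assumed by analogy. The `resourceFieldRef` keys follow the Kubernetes Downward API volume format.

```python
from typing import Any, Dict, List

# Maps a hugepage resource name to the size tag used in the file name.
# The "1G" tag matches the verification output in this report;
# the "2M" tag is an assumption made by analogy.
HUGEPAGE_SIZES = {"hugepages-1Gi": "1G", "hugepages-2Mi": "2M"}


def hugepage_downward_items(containers: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build Downward API volume items for the hugepage requests and
    limits of every container in a pod spec (hypothetical helper)."""
    items = []
    # Iterate over all containers. Handling only containers[0] is the
    # behavior this bug describes.
    for container in containers:
        resources = container.get("resources", {})
        for kind, label in (("requests", "request"), ("limits", "limit")):
            for resource_name in resources.get(kind, {}):
                size = HUGEPAGE_SIZES.get(resource_name)
                if size is None:
                    continue  # not a hugepage resource (e.g. memory, cpu)
                items.append({
                    "path": f"hugepages_{size}_{label}_{container['name']}",
                    "resourceFieldRef": {
                        "containerName": container["name"],
                        "resource": f"{kind}.{resource_name}",
                    },
                })
    return items
```

For a pod with two containers that each request and limit `hugepages-1Gi`, this sketch yields four items, one request file and one limit file per container.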

Comment 3 zhaozhanqi 2021-01-11 07:29:38 UTC
Verifying this bug requires enabling the feature gate, and that work has already moved to 4.8; see https://issues.redhat.com/browse/SDN-1375
So this bug needs to be verified on a 4.8 version.

Comment 5 zhaozhanqi 2021-04-13 09:07:04 UTC
Verified this bug on 4.8.0-202104080531.p0 on 4.8.0-0.nightly-2021-04-09-222447

Steps:

1. Create a sriovnodepolicy to initialize the VFs.
2. Create a sriovnetwork, which creates the net-attach-def.
3. Create a pod with multiple containers using the following YAML file:

# cat multi-pod.yaml 
# Pod Spec to work with SR-IOV Network Resource Injector
#  
# SR-IOV Network Resource Injector is a mutating webhook that
# adds Downward API values to the Pod Spec. So this version
# of Pod Spec has Downward API values commented out. 
apiVersion: v1
kind: Pod
metadata:
  name: test-multi-con
  annotations:
    k8s.v1.cni.cncf.io/networks: mlx277-netdevice
spec:
  containers:
  - name: sriov-example
    image: quay.io/zzhao/app-netutil
    imagePullPolicy: IfNotPresent
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /dev/hugepages
      name: hugepage
    resources:
      requests:
        memory: 1Gi
        hugepages-1Gi: 2Gi
        #hugepages-2Mi: 2048Mi
        #cpu: "4"
      limits:
        memory: 1Gi
        hugepages-1Gi: 2Gi
  - name: hello-sdn
    image: quay.io/openshifttest/hello-sdn@sha256:d5785550cf77b7932b090fcd1a2625472912fb3189d5973f177a5a2c347a1f95
    imagePullPolicy: IfNotPresent
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /dev/hugepages
      name: hugepage
    resources:
      requests:
        memory: 1Gi
        hugepages-1Gi: 2Gi
        #hugepages-2Mi: 2048Mi
        #cpu: "4"
      limits:
        memory: 1Gi
        hugepages-1Gi: 2Gi
        #hugepages-2Mi: 2048Mi
        #cpu: "4"
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages


4. Check the contents of /etc/podnetinfo in the pod:

# oc exec -n z1 test-multi-con -- ls /etc/podnetinfo
Defaulted container "sriov-example" out of: sriov-example, hello-sdn
annotations
hugepages_1G_limit_hello-sdn
hugepages_1G_limit_sriov-example
hugepages_1G_request_hello-sdn
hugepages_1G_request_sriov-example
labels
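
The check in step 4 can be scripted. The sketch below is an illustration only: the function name `missing_hugepage_files` is hypothetical, and it assumes the `hugepages_<size>_<request|limit>_<container>` naming seen in the listing above. It takes the text of the `ls /etc/podnetinfo` output and reports which expected per-container files are absent.

```python
from typing import Iterable, List


def missing_hugepage_files(listing: str, containers: Iterable[str],
                           size: str = "1G") -> List[str]:
    """Return the expected hugepage files that do not appear in the
    output of `ls /etc/podnetinfo` (hypothetical helper)."""
    # One entry per line; unrelated lines such as the
    # 'Defaulted container ...' notice simply never match.
    present = {line.strip() for line in listing.splitlines()}
    expected = [
        f"hugepages_{size}_{kind}_{name}"
        for name in containers
        for kind in ("request", "limit")
    ]
    return [path for path in expected if path not in present]


if __name__ == "__main__":
    listing = """annotations
hugepages_1G_limit_hello-sdn
hugepages_1G_limit_sriov-example
hugepages_1G_request_hello-sdn
hugepages_1G_request_sriov-example
labels
"""
    # An empty list means every container has both files.
    print(missing_hugepage_files(listing, ["sriov-example", "hello-sdn"]))
```

With the listing from this comment the result is an empty list. Before the fix, the entries for the second container would be expected to show up as missing.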


Moving this bug to VERIFIED.

Comment 8 errata-xmlrpc 2021-07-27 22:35:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438