Bug 1970409 - CNF test container: [SNO] [dpdk] All dpdk tests failed : Couldn't create NetworkAttachmentDefinition CR
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: CNF Platform Validation
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.9.0
Assignee: Sebastian Scheinkman
QA Contact: Nikita
URL:
Whiteboard:
Duplicates: 1970410
Depends On:
Blocks:
Reported: 2021-06-10 12:40 UTC by elevin
Modified: 2022-08-24 12:52 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Release Note text: When running CNF tests in regular mode on a single node, the logic that determines whether the cluster is ready is incomplete. In particular, creating an SR-IOV network does not create a network attachment definition until at least one minute has elapsed. As a result, all the DPDK tests fail in cascade. Run the CNF tests in regular mode skipping the DPDK feature when running against an installation on a single node, with the `-ginkgo.skip` parameter. Run CNF tests in Discovery mode to execute tests against an installation on a single node.
-------
Cause: When running cnf-tests on SNO in regular mode, the logic that determines whether the cluster is ready is missing some details. In particular, creating an SR-IOV network does not create a network attachment definition until a minute or more has elapsed.
Consequence: All the dpdk tests fail in cascade.
Workaround (if any): Run the cnf tests skipping the dpdk feature when running against SNO in regular mode, with the -ginkgo.skip parameter. Running in discovery mode is the recommended way to execute tests against SNO.
Result:
Clone Of:
Environment:
Last Closed: 2022-08-24 12:52:49 UTC
Target Upstream Version:
Embargoed:



Links
System: Github
ID: openshift-kni cnf-features-deploy pull 570
Private: 0
Priority: None
Status: closed
Summary: Bug 1969423: Fix dpdk test on SNO
Last Updated: 2021-07-07 12:44:21 UTC

Description elevin 2021-06-10 12:40:21 UTC
Description of problem:
All DPDK tests failed in regular mode on an SNO cluster.

2021-06-10T12:25:17.724Z	ERROR	controllers.SriovNetwork	Couldn't create NetworkAttachmentDefinition CR	{"sriovnetwork": "openshift-sriov-network-operator/test-dpdk-network", "Namespace": "dpdk-testing", "Name": "test-dpdk-network", "error": "Internal error occurred: failed calling webhook \"multus-validating-config.k8s.io\": Post \"https://multus-admission-controller.openshift-multus.svc:443/validate?timeout=30s\": x509: certificate signed by unknown authority"}


Version-Release number of selected component (if applicable):
Client Version: 4.8.0-0.nightly-2021-06-08-034312
Server Version: 4.8.0-0.nightly-2021-06-08-034312
Kubernetes Version: v1.21.0-rc.0+fec6fbc
DPDK_TESTS_IMAGE=dpdk-base:v4.8.0-8
CNF_TESTS_IMAGE=openshift4-cnf-tests:v4.8.0-48


How reproducible:


Steps to Reproduce:
1. SNO cluster
2. podman run --net=host -v /root/ocp/auth:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig -e IMAGE_REGISTRY=cnfdc8-installer:5000/rh-osbs/ -e CNF_TESTS_IMAGE=openshift4-cnf-tests:v4.8.0-48 -e DPDK_TESTS_IMAGE=dpdk-base:v4.8.0-8 -e ROLE_WORKER_CNF=master -e SCTPTEST_HAS_NON_CNF_WORKERS=false registry-proxy.engineering.redhat.com/rh-osbs/openshift4-cnf-tests:v4.8.0-48 /usr/bin/test-run.sh -ginkgo.focus="dpdk"


Actual results:

• Failure in Spec Setup (BeforeEach) [1988.664 seconds]
dpdk
/remote-source/app/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  VFS allocated for dpdk
  /remote-source/app/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:151
    Validate the build [BeforeEach]
    /remote-source/app/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:173
      Should forward and receive packets from a pod running dpdk base on a image created by building config
      /remote-source/app/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:174
      Timed out after 10.001s.
      Unexpected error:
          <*errors.StatusError | 0xc0004d8960>: {
              ErrStatus: {
                  TypeMeta: {Kind: "", APIVersion: ""},
                  ListMeta: {
                      SelfLink: "",
                      ResourceVersion: "",
                      Continue: "",
                      RemainingItemCount: nil,
                  },
                  Status: "Failure",
                  Message: "network-attachment-definitions.k8s.cni.cncf.io \"test-dpdk-network\" not found",
                  Reason: "NotFound",
                  Details: {
                      Name: "test-dpdk-network",
                      Group: "k8s.cni.cncf.io",
                      Kind: "network-attachment-definitions",
                      UID: "",
                      Causes: nil,
                      RetryAfterSeconds: 0,
                  },
                  Code: 404,
              },
          }
          network-attachment-definitions.k8s.cni.cncf.io "test-dpdk-network" not found
      occurred
      /remote-source/app/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:892
Expected results:


Additional info:

Comment 1 Federico Paolinelli 2021-06-10 12:41:49 UTC
This is a bug in the tests themselves and not in the feature.
The network attachment definition eventually gets created. The issue is caused by the timing and the reboots of the single-node OpenShift.
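
The timing problem can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the test-suite code and not the fix that was merged: `wait_for`, `NotReadyError`, and `FakeCluster` are hypothetical names, and the 75-second delay is an invented value standing in for "a minute or more". The 10-second timeout mirrors the "Timed out after 10.001s" message in the failure output.

```python
import time
from typing import Callable


class NotReadyError(Exception):
    """Raised when the resource did not appear before the deadline."""


def wait_for(
    check: Callable[[], bool],
    timeout_s: float,
    interval_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll check() until it returns True or timeout_s elapses.

    Returns the number of attempts made; raises NotReadyError on timeout.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if check():
            return attempts
        if clock() >= deadline:
            raise NotReadyError(f"not ready after {timeout_s}s ({attempts} attempts)")
        sleep(interval_s)


class FakeCluster:
    """Hypothetical stand-in for a cluster where the
    NetworkAttachmentDefinition only shows up after a delay."""

    def __init__(self, appears_at_s: float):
        self.now = 0.0
        self.appears_at_s = appears_at_s

    def clock(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds  # advance simulated time instead of blocking

    def nad_exists(self) -> bool:
        return self.now >= self.appears_at_s


if __name__ == "__main__":
    # Assumed delay of 75 s before the definition appears (illustrative only).
    cluster = FakeCluster(appears_at_s=75)
    try:
        wait_for(cluster.nad_exists, timeout_s=10, interval_s=1,
                 clock=cluster.clock, sleep=cluster.sleep)
    except NotReadyError as err:
        print("short timeout:", err)

    cluster = FakeCluster(appears_at_s=75)
    attempts = wait_for(cluster.nad_exists, timeout_s=120, interval_s=5,
                        clock=cluster.clock, sleep=cluster.sleep)
    print("long timeout succeeded after", attempts, "attempts")
```

With a 10-second window the check gives up long before the simulated definition exists, while a window longer than the creation delay succeeds. The clock and sleep functions are injected so that the simulation runs instantly.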

Comment 2 Sebastian Scheinkman 2021-06-14 15:36:56 UTC
*** Bug 1970410 has been marked as a duplicate of this bug. ***

Comment 3 elevin 2021-07-08 14:36:01 UTC
Client Version: 4.8.0-0.nightly-2021-06-19-005119
Server Version: 4.8.0-0.nightly-2021-06-19-005119
Kubernetes Version: v1.21.0-rc.0+120883f
==========================================================

podman run  --net=host -v /root/ocp/auth:/kubeconfig:Z  -v ~/reports/:/report:Z -e KUBECONFIG=/kubeconfig/kubeconfig -e IMAGE_REGISTRY=cnfdd2-installer:5000/openshift-kni -e CNF_TESTS_IMAGE=cnf-tests:4.9 -e DPDK_TESTS_IMAGE=dpdk:4.9 -e ROLE_WORKER_CNF=master -e SCTPTEST_HAS_NON_CNF_WORKERS=false quay.io/openshift-kni/cnf-tests:4.9 /usr/bin/test-run.sh -ginkgo.focus="dpdk" --report=/report --junit=/report

=========================================================


SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
S [SKIPPING] [1973.991 seconds]
dpdk
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  VFS allocated for dpdk
  /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:151
    Validate the build
    /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:173
      Should forward and receive packets from a pod running dpdk base on a image created by building config [It]
      /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:174

      skip test as we can't find a dpdk workload running with a s2i build

      /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:180
------------------------------
• [SLOW TEST:56.221 seconds]
dpdk
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  VFS allocated for dpdk
  /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:151
    Validate a DPDK workload running inside a pod
    /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:194
      Should forward and receive packets
      /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:195
------------------------------
••
------------------------------
• [SLOW TEST:5.527 seconds]
dpdk
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  VFS allocated for dpdk
  /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:151
    Validate HugePages
    /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:249
      should allocate the amount of hugepages requested
      /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:269
------------------------------
•
------------------------------
• [SLOW TEST:135.668 seconds]
dpdk
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  VFS split for dpdk and netdevice
  /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:322
    should forward and receive packets from a pod running dpdk base
    /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:347
------------------------------
•
------------------------------
• [SLOW TEST:69.216 seconds]
dpdk
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  dpdk application on different vendors
  /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:371
    Test connectivity using the requested nic
    /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:399
      Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28
      /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:430
------------------------------
S [SKIPPING] [19.061 seconds]
dpdk
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  dpdk application on different vendors
  /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:371
    Test connectivity using the requested nic
    /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:399
      Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking [It]
      /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:431

      skip nic validate as wasn't able to find a nic with vendorID 8086 and deviceID 0d58

      /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:403
------------------------------
• [SLOW TEST:47.182 seconds]
dpdk
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  dpdk application on different vendors
  /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:371
    Test connectivity using the requested nic
    /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:399
      Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
      /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:432
------------------------------
S [SKIPPING] [13.048 seconds]
dpdk
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  dpdk application on different vendors
  /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:371
    Test connectivity using the requested nic
    /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:399
      Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5] [It]
      /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:433

      skip nic validate as wasn't able to find a nic with vendorID 15b3 and deviceID 1017

      /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:403
------------------------------

• [SLOW TEST:69.986 seconds]
dpdk
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  Downward API
  /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:436
    Volume is readable in container
    /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:460
------------------------------
• [SLOW TEST:14.061 seconds]
dpdk
/go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:92
  restoring configuration
  /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:489
    should restore the cluster to the original status
    /go/src/github.com/openshift-kni/cnf-features-deploy/cnf-tests/testsuites/e2esuite/dpdk/dpdk.go:490
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
JUnit report was created: /report/cnftests-junit.xml

Ran 11 of 134 Specs in 3099.620 seconds
SUCCESS! -- 11 Passed | 0 Failed | 0 Pending | 123 Skipped

Comment 5 Carlos Goncalves 2022-08-24 12:52:49 UTC
Bulk closing of all "CNF Platform Validation" component BZs assigned to CNF Network team members and in VERIFIED status for longer than 1 month.

