Bug 2034968

Summary: [RHEL8.6] pyverbs-tests fail with 6 errors in "tests.test_mr.MWTest" for MLX4 IB and ROCE devices
Product: Red Hat Enterprise Linux 8
Reporter: Brian Chae <bchae>
Component: rdma-core
Assignee: Kamal Heib <kheib>
Status: CLOSED WONTFIX
QA Contact: Infiniband QE <infiniband-qe>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 8.6
CC: hwkernel-mgr, kheib, linville, rdma-dev-team
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1971536
Environment:
Last Closed: 2023-06-22 07:28:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1971536
Bug Blocks:

Description Brian Chae 2021-12-22 15:24:07 UTC
+++ This bug was initially created as a clone of Bug #1971536 +++

Description of problem:

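The pyverbs tests fail with 6 errored test cases in "tests.test_mr.MWTest" on all MLX4 IB and MLX4 ROCE devices. Every error is raised while allocating the memory window (MW): "Failed to allocate MW. Errno: 1, Operation not permitted".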

test_invalidate_mw_type1 (tests.test_mr.MWTest) ... ERROR
test_mw_type1 (tests.test_mr.MWTest) ... ERROR
test_mw_type2 (tests.test_mr.MWTest) ... ERROR
test_mw_type2_invalidate_dealloc (tests.test_mr.MWTest) ... ERROR
test_mw_type2_invalidate_local (tests.test_mr.MWTest) ... ERROR
test_mw_type2_invalidate_remote (tests.test_mr.MWTest) ... ERROR


Version-Release number of selected component (if applicable):

DISTRO=RHEL-9.0.0-20210610.2

Red Hat Enterprise Linux release 9.0 Beta (Plow)

Linux rdma-perf-00.lab.bos.redhat.com 5.13.0-0.rc4.33.el9.x86_64 #1 SMP Wed Jun 2 19:15:08 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.13.0-0.rc4.33.el9.x86_64 root=/dev/mapper/rhel_rdma--perf--00-root ro intel_idle.max_cstate=0 intremap=no_x2apic_optout processor.max_cstate=0 intel_iommu=on iommu=on console=tty0 rd_NO_PLYMOUTH intel_idle.max_cstate=0 intremap=no_x2apic_optout processor.max_cstate=0 resume=/dev/mapper/rhel_rdma--perf--00-swap rd.lvm.lv=rhel_rdma-perf-00/root rd.lvm.lv=rhel_rdma-perf-00/swap console=ttyS1,115200n81

rdma-core-34.0-4.el9.x86_64

linux-firmware-20210315-120.el9.noarch

Installed:
  kernel-kernel-infiniband-pyverbs-tests-1.5-1.noarch                   


How reproducible:

100%

Steps to Reproduce:
1. Set up both the RDMA server and client hosts with the above build and packages.
2. Clone "rdma-core".
3. Build "rdma-core".
4. Run the pyverbs tests:

    ./build/bin/run_tests.py -v --dev $HCA_ID

    where HCA_ID is mlx4_0 (IB) or mlx4_1 (ROCE)
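
For repeated runs, step 4 can be scripted per device. The following is only a minimal sketch: it assumes the rdma-core tree was built in the current directory and that the devices are named mlx4_0 and mlx4_1 as above. The helper names build_cmd and run_all are hypothetical and are not part of rdma-core.

```python
import subprocess

# Device names taken from this report: mlx4_0 (IB) and mlx4_1 (ROCE).
HCA_IDS = ["mlx4_0", "mlx4_1"]


def build_cmd(hca_id, runner="./build/bin/run_tests.py"):
    """Return the argv list for one verbose pyverbs run (hypothetical helper)."""
    return [runner, "-v", "--dev", hca_id]


def run_all(hca_ids=HCA_IDS):
    """Run the suite once per device and return {device: exit status}."""
    return {hca: subprocess.run(build_cmd(hca)).returncode for hca in hca_ids}
```

Calling run_all() from the top of the built rdma-core tree returns one exit status per device; a non-zero value corresponds to the "Return: 1" reported under the actual results.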



Actual results:

test_invalidate_mw_type1 (tests.test_mr.MWTest) ... ERROR
test_mw_type1 (tests.test_mr.MWTest) ... ERROR
test_mw_type2 (tests.test_mr.MWTest) ... ERROR
test_mw_type2_invalidate_dealloc (tests.test_mr.MWTest) ... ERROR
test_mw_type2_invalidate_local (tests.test_mr.MWTest) ... ERROR
test_mw_type2_invalidate_remote (tests.test_mr.MWTest) ... ERROR

======================================================================
ERROR: test_invalidate_mw_type1 (tests.test_mr.MWTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 339, in test_invalidate_mw_type1
    self.test_mw_type1()
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 333, in test_mw_type1
    self.create_players(MWRC, mw_type=e.IBV_MW_TYPE_1)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 245, in create_players
    self.client = resource(**self.dev_info, **resource_arg)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 209, in __init__
    raise ex
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 205, in __init__
    self.mw = MW(self.pd, self.mw_type)
  File "mr.pyx", line 305, in pyverbs.mr.MW.__init__
pyverbs.pyverbs_error.PyverbsRDMAError: Failed to allocate MW. Errno: 1, Operation not permitted

======================================================================
ERROR: test_mw_type1 (tests.test_mr.MWTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 333, in test_mw_type1
    self.create_players(MWRC, mw_type=e.IBV_MW_TYPE_1)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 245, in create_players
    self.client = resource(**self.dev_info, **resource_arg)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 209, in __init__
    raise ex
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 205, in __init__
    self.mw = MW(self.pd, self.mw_type)
  File "mr.pyx", line 305, in pyverbs.mr.MW.__init__
pyverbs.pyverbs_error.PyverbsRDMAError: Failed to allocate MW. Errno: 1, Operation not permitted

======================================================================
ERROR: test_mw_type2 (tests.test_mr.MWTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 346, in test_mw_type2
    self.create_players(MWRC, mw_type=e.IBV_MW_TYPE_2)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 245, in create_players
    self.client = resource(**self.dev_info, **resource_arg)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 209, in __init__
    raise ex
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 205, in __init__
    self.mw = MW(self.pd, self.mw_type)
  File "mr.pyx", line 305, in pyverbs.mr.MW.__init__
pyverbs.pyverbs_error.PyverbsRDMAError: Failed to allocate MW. Errno: 1, Operation not permitted

======================================================================
ERROR: test_mw_type2_invalidate_dealloc (tests.test_mr.MWTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 366, in test_mw_type2_invalidate_dealloc
    self.test_mw_type2()
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 346, in test_mw_type2
    self.create_players(MWRC, mw_type=e.IBV_MW_TYPE_2)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 245, in create_players
    self.client = resource(**self.dev_info, **resource_arg)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 209, in __init__
    raise ex
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 205, in __init__
    self.mw = MW(self.pd, self.mw_type)
  File "mr.pyx", line 305, in pyverbs.mr.MW.__init__
pyverbs.pyverbs_error.PyverbsRDMAError: Failed to allocate MW. Errno: 1, Operation not permitted

======================================================================
ERROR: test_mw_type2_invalidate_local (tests.test_mr.MWTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 352, in test_mw_type2_invalidate_local
    self.test_mw_type2()
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 346, in test_mw_type2
    self.create_players(MWRC, mw_type=e.IBV_MW_TYPE_2)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 245, in create_players
    self.client = resource(**self.dev_info, **resource_arg)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 209, in __init__
    raise ex
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 205, in __init__
    self.mw = MW(self.pd, self.mw_type)
  File "mr.pyx", line 305, in pyverbs.mr.MW.__init__
pyverbs.pyverbs_error.PyverbsRDMAError: Failed to allocate MW. Errno: 1, Operation not permitted

======================================================================
ERROR: test_mw_type2_invalidate_remote (tests.test_mr.MWTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 359, in test_mw_type2_invalidate_remote
    self.test_mw_type2()
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 346, in test_mw_type2
    self.create_players(MWRC, mw_type=e.IBV_MW_TYPE_2)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 245, in create_players
    self.client = resource(**self.dev_info, **resource_arg)
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 209, in __init__
    raise ex
  File "/tmp/tmp.gpc3AMoEjr/rdma-core/tests/test_mr.py", line 205, in __init__
    self.mw = MW(self.pd, self.mw_type)
  File "mr.pyx", line 305, in pyverbs.mr.MW.__init__
pyverbs.pyverbs_error.PyverbsRDMAError: Failed to allocate MW. Errno: 1, Operation not permitted

----------------------------------------------------------------------
Ran 193 tests in 3.264s

FAILED (errors=6, skipped=121)
---
- TEST RESULT FOR rdma-core
-   Test:   Run pyverbs tests
-   Result: FAIL
-   Return: 1
---
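
The log above can be condensed mechanically. The sketch below assumes the verbose unittest line format shown in this report ("name (class) ... STATUS") and the "Errno: N" text in the pyverbs exception message; the function names summarize and errno_names are hypothetical.

```python
import errno
import re
from collections import Counter

# Matches lines such as: test_mw_type1 (tests.test_mr.MWTest) ... ERROR
RESULT_RE = re.compile(r"^(\w+) \(([\w.]+)\) \.\.\. (ok|ERROR|FAIL|skipped)")
# Matches the errno number in: "Failed to allocate MW. Errno: 1, ..."
ERRNO_RE = re.compile(r"Errno: (\d+)")


def summarize(log_text):
    """Return (status counts, list of errored test ids) for a verbose log."""
    counts = Counter()
    errored = []
    for line in log_text.splitlines():
        match = RESULT_RE.match(line)
        if not match:
            continue
        name, cls, status = match.groups()
        counts[status] += 1
        if status == "ERROR":
            errored.append(f"{cls}.{name}")
    return counts, errored


def errno_names(log_text):
    """Return the symbolic names of all errno values mentioned in the log."""
    return {errno.errorcode.get(int(n), n) for n in ERRNO_RE.findall(log_text)}
```

Applied to the tracebacks in this report, errno_names maps the "Errno: 1" in every failure to EPERM, which matches the "Operation not permitted" text.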


Expected results:

All six test cases should pass with an "ok" result.

Additional info:

The only exception was the MLX4 ROCE device on "rdma-virt-01", which passed the same 6 tests when run as the RDMA client:

test_invalidate_mw_type1 (tests.test_mr.MWTest) ... ok
test_mw_type1 (tests.test_mr.MWTest) ... ok
test_mw_type2 (tests.test_mr.MWTest) ... ok
test_mw_type2_invalidate_dealloc (tests.test_mr.MWTest) ... ok
test_mw_type2_invalidate_local (tests.test_mr.MWTest) ... ok
test_mw_type2_invalidate_remote (tests.test_mr.MWTest) ... ok
test_reg_mw_wrong_type (tests.test_mr.MWTest)

https://beaker.engineering.redhat.com/recipes/10132185#task127275367
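
To compare a failing host against a passing one such as "rdma-virt-01", the two verbose logs can be diffed per test. This is a minimal sketch that assumes both logs use the unittest line format shown in this report; statuses and diff_runs are hypothetical helper names.

```python
import re

# Matches lines such as: test_mw_type1 (tests.test_mr.MWTest) ... ok
LINE_RE = re.compile(r"^(\w+) \(([\w.]+)\) \.\.\. (\w+)")


def statuses(log_text):
    """Map 'class.test_name' to its status word for every result line."""
    result = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            name, cls, status = match.groups()
            result[f"{cls}.{name}"] = status
    return result


def diff_runs(log_a, log_b):
    """Return {test: (status_a, status_b)} for tests whose status differs."""
    a, b = statuses(log_a), statuses(log_b)
    return {t: (a[t], b[t]) for t in sorted(a.keys() & b.keys()) if a[t] != b[t]}
```

Lines without a status, such as a truncated result line, are ignored rather than guessed.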

--- Additional comment from Brian Chae on 2021-06-14 11:24:33 UTC ---

pyverbs-tests package info

Installed:
  python3-pyverbs-34.0-4.el9.x86_64

--- Additional comment from Brian Chae on 2021-12-22 14:14:57 UTC ---

(In reply to Brian Chae from comment #1)
> pyverbs-tests package info
> 
> Installed:
>   python3-pyverbs-34.0-4.el9.x86_64

Per Honggang's request, here is the Beaker job ID of the test run:

https://beaker.engineering.redhat.com/jobs/6117909

Comment 3 RHEL Program Management 2023-06-22 07:28:20 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.