Bug 1850462
| Summary: | RFE: allow blacklisting of modules in rdma-init-kernel by using modprobe -b | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Stefan Assmann <sassmann> | ||||
| Component: | rdma-core | Assignee: | Honggang LI <honli> | ||||
| Status: | CLOSED ERRATA | QA Contact: | zguo <zguo> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 8.3 | CC: | honli, pkenyon, rdma-dev-team, zguo | ||||
| Target Milestone: | rc | Keywords: | FutureFeature | ||||
| Target Release: | 8.4 | Flags: | pm-rhel: mirror+ | ||||
| Hardware: | All | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | rdma-core-32.0-1.el8 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2021-05-18 14:42:46 UTC | Type: | Story | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1892502 | ||||||
| Attachments: | ||||||
Hi, Stefan

I think that invoking /sbin/modprobe with --use-blacklist is not a robust solution for bz1843840.

1)
The modprobe '--use-blacklist' option causes modprobe to apply the blacklist commands in the configuration files (if any) to module names as well.

I tested with an mlx4 IB HCA; the module is still loaded even though I blacklisted it via kernel options. You will have to add mlx4_ib to a modprobe configuration file. That sounds like you will have to create a new initramfs file for PXE installation.

[root@rdma-dev-01 ~]$ grub2-editenv - list | grep kernelopts
kernelopts=root=/dev/mapper/rhel_rdma--dev--01-root ro console=tty0 rd_NO_PLYMOUTH amd_iommu=on crashkernel=auto resume=/dev/mapper/rhel_rdma--dev--01-swap rd.lvm.lv=rhel_rdma-dev-01/root rd.lvm.lv=rhel_rdma-dev-01/swap console=ttyS1,115200 module_name.blacklist=1 rd.driver.blacklist=mlx4_ib
[root@rdma-dev-01 ~]$ cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-228.el8.x86_64 root=/dev/mapper/rhel_rdma--dev--01-root ro console=tty0 rd_NO_PLYMOUTH amd_iommu=on crashkernel=auto resume=/dev/mapper/rhel_rdma--dev--01-swap rd.lvm.lv=rhel_rdma-dev-01/root rd.lvm.lv=rhel_rdma-dev-01/swap console=ttyS1,115200 module_name.blacklist=1 rd.driver.blacklist=mlx4_ib
[root@rdma-dev-01 ~]$ grep modprobe /usr/libexec/rdma-init-kernel
/sbin/modprobe -b $module

2)
Blacklisting a module in a configuration file will not prevent the module from being loaded if it is a required or optional dependency of another module. That means '--use-blacklist' will not work for any of the non-leaf modules of driver/infiniband/ .
It will only work for RDMA hardware drivers, as they are leaf modules.

(In reply to Honggang LI from comment #2)
Hi Honggang,
thanks for looking into this.

> 1)
> option. You will have to add mlx4_ib to a modprobe configuration file.
> That sounds like you will have to create a new initramfs file for PXE
> installation.

Yes indeed. I would love to see a solution that allows blacklisting via the kernel cmd line, but I didn't find a way to do so. modprobe with --use-blacklist is the best thing I could think of. If you have a better idea I'd like to hear it.

[root@intel-purley-02 ~]# cat /etc/modprobe.d/blacklist.conf
blacklist i40iw
[root@intel-purley-02 ~]# modprobe -b i40iw
[root@intel-purley-02 ~]# lsmod |grep i40iw
[root@intel-purley-02 ~]# modprobe i40iw
[root@intel-purley-02 ~]# lsmod |grep i40iw
i40iw 221184 0

> 2)
> Blacklisting a module in a configuration file will not prevent the module
> from being loaded if it is a required or optional dependency of another
> module. That means '--use-blacklist' will not work for any of the non-leaf
> modules of driver/infiniband/ .
> It will only work for RDMA hardware drivers, as they are leaf modules.

Correct, the goal of this request is to provide us with an option to prevent specific hardware drivers from being loaded by the rdma service.
I see 2 scenarios where this would be helpful.
a) For the installer, even though it will require altering the initramfs and adding the modprobe config file.
b) In a running system where the customer wants to use, let's say, mlx4_ib, but prevent i40iw from being loaded (to conserve memory).

Again I'm all open for better ideas. :)

OK, as using --use-blacklist is harmless, let's try it in upstream. Thanks

https://github.com/linux-rdma/rdma-core/pull/797

Hardware module loading is handled by kernel-boot/rdma-hw-modules.rules
with systemd. That means the load_hardware_modules function is obsolete.
After removing it, blacklisting a module via kernel options works for me.
[root@rdma-dev-01 ~]$ cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.8.0-0.rc7.1.fc33.x86_64 root=UUID=3875ecd0-528a-4aa4-a0a7-1bacfc5ae908 ro rootflags=subvol=root console=tty0 rd_NO_PLYMOUTH amd_iommu=on console=ttyS1,115200 module_blacklist=mlx4_ib
[root@rdma-dev-01 ~]$ ibstat
[root@rdma-dev-01 ~]$ lsmod | grep mlx
mlx4_en 139264 0
mlx4_core 372736 1 mlx4_en
@Stefan, could you please test this patch for i40e?
[honli@dhcp-128-72 rdma-core (my)]$ cat 0001-redhat-delete-obsoleted-hardware-module-loading-scri.patch
From c428bcbf5e9501636e3b25d119351782aa9d2406 Mon Sep 17 00:00:00 2001
From: Honggang Li <honli>
Date: Wed, 5 Aug 2020 11:29:32 +0800
Subject: [PATCH] redhat: delete obsoleted hardware module loading script
Hardware module loading is handled by kernel-boot/rdma-hw-modules.rules
with systemd.
Signed-off-by: Honggang Li <honli>
---
redhat/rdma.conf | 2 --
redhat/rdma.kernel-init | 56 -----------------------------------------
2 files changed, 58 deletions(-)
diff --git a/redhat/rdma.conf b/redhat/rdma.conf
index f5b74b248030..53ab8fa4b1f9 100644
--- a/redhat/rdma.conf
+++ b/redhat/rdma.conf
@@ -14,5 +14,3 @@ RDS_LOAD=no
XPRTRDMA_LOAD=yes
# Load NFSoRDMA server transport module
SVCRDMA_LOAD=no
-# Load Tech Preview device driver modules
-TECH_PREVIEW_LOAD=no
diff --git a/redhat/rdma.kernel-init b/redhat/rdma.kernel-init
index c7444a1c8d77..ea9894851fba 100644
--- a/redhat/rdma.kernel-init
+++ b/redhat/rdma.kernel-init
@@ -60,9 +60,6 @@ if [ -f $CONFIG ]; then
if [ "${SVCRDMA_LOAD}" == "yes" ]; then
LOAD_ULP_MODULES="$LOAD_ULP_MODULES svcrdma"
fi
- if [ "${TECH_PREVIEW_LOAD}" == "yes" ]; then
- LOAD_TECH_PREVIEW_DRIVERS="$TECH_PREVIEW_LOAD"
- fi
else
LOAD_ULP_MODULES="ib_ipoib"
fi
@@ -96,56 +93,6 @@ load_modules()
return $RC
}
-load_hardware_modules()
-{
- local -i RC=0
-
- # We match both class NETWORK and class INFINIBAND devices since our
- # iWARP hardware is listed under class NETWORK. The side effect of
- # this is that we might cause a non-iWARP network driver to be loaded.
- udevadm trigger --subsystem-match=pci --attr-nomatch=driver --attr-match=class=0x020000 --attr-match=class=0x0c0600
- udevadm settle
- if [ -r /proc/device-tree ]; then
- if [ -n "`ls /proc/device-tree | grep lhca`" ]; then
- if ! is_loaded ib_ehca; then
- load_modules ib_ehca
- RC+=$?
- fi
- fi
- fi
- if is_loaded mlx4_core -a ! is_loaded mlx4_ib; then
- load_modules mlx4_ib
- RC+=$?
- fi
- if is_loaded mlx4_core -a ! is_loaded mlx4_en; then
- load_modules mlx4_en
- RC+=$?
- fi
- if is_loaded mlx5_core -a ! is_loaded mlx5_ib; then
- load_modules mlx5_ib
- RC+=$?
- fi
- if is_loaded cxgb4 -a ! is_loaded iw_cxgb4; then
- load_modules iw_cxgb4
- RC+=$?
- fi
- if is_loaded be2net -a ! is_loaded ocrdma; then
- load_modules ocrdma
- RC+=$?
- fi
- if is_loaded enic -a ! is_loaded usnic_verbs; then
- load_modules usnic_verbs
- RC+=$?
- fi
- if [ "${LOAD_TECH_PREVIEW_DRIVERS}" == "yes" ]; then
- if is_loaded i40e -a ! is_loaded i40iw; then
- load_modules i40iw
- RC+=$?
- fi
- fi
- return $RC
-}
-
errata_58()
{
# Check AMD chipset issue Errata #58
@@ -209,9 +156,6 @@ errata_56()
fi
}
-
-load_hardware_modules
-RC=$[ $RC + $? ]
load_modules $LOAD_CORE_MODULES
RC=$[ $RC + $? ]
load_modules $LOAD_CORE_CM_MODULES
--
2.25.4
(In reply to Honggang LI from comment #5)
> @Stefan, could you please test this patch for i40e?

Sure, but could you please attach the patch as a file? Copy/paste from bugzilla doesn't work very well.

Created attachment 1710500
patch

Please test this patch.
I've applied the patch; here are the results.
[root@intel-purley-02 ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-227.el8.x86_64 root=/dev/mapper/rhel_intel--purley--02-root ro crashkernel=auto resume=/dev/mapper/rhel_intel--purley--02-swap rd.lvm.lv=rhel_intel-purley-02/root rd.lvm.lv=rhel_intel-purley-02/swap console=ttyS0,115200n81 module_blacklist=i40iw
[root@intel-purley-02 ~]# systemctl status rdma
● rdma.service - Initialize the iWARP/InfiniBand/RDMA stack in the kernel
Loaded: loaded (/usr/lib/systemd/system/rdma.service; disabled; vendor preset: disabled)
Active: inactive (dead)
Docs: file:/etc/rdma/rdma.conf
[root@intel-purley-02 ~]# systemctl start rdma
[root@intel-purley-02 ~]# lsmod |grep i40iw
[root@intel-purley-02 ~]#
[root@intel-purley-02 ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-227.el8.x86_64 root=/dev/mapper/rhel_intel--purley--02-root ro crashkernel=auto resume=/dev/mapper/rhel_intel--purley--02-swap rd.lvm.lv=rhel_intel-purley-02/root rd.lvm.lv=rhel_intel-purley-02/swap console=ttyS0,115200n81
[root@intel-purley-02 ~]# systemctl start rdma
[root@intel-purley-02 ~]# lsmod |grep i40iw
[root@intel-purley-02 ~]#
As you can see, in both cases the i40iw driver will not get loaded with the patch applied. So I don't think applying the patch is a good idea.
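The test above compares a boot with module_blacklist=i40iw against a boot without it. As a rough sketch only, the command-line half of that check could be scripted as below. The helper name blacklisted_on_cmdline and its optional second argument are hypothetical, introduced for illustration; the sketch assumes module_blacklist= carries a comma-separated list of module names.

```shell
# Hypothetical helper (not part of rdma-core): succeed if module $1 is
# named in a module_blacklist= parameter of the kernel command line.
# $2 optionally names a file to read instead of /proc/cmdline.
blacklisted_on_cmdline()
{
    local module=$1 cmdline_file=${2:-/proc/cmdline} word name

    # Walk the whitespace-separated kernel parameters.
    for word in $(cat "$cmdline_file"); do
        case $word in
            module_blacklist=*)
                # Assumption: the value is a comma-separated module list.
                for name in $(echo "${word#module_blacklist=}" | tr ',' ' '); do
                    [ "$name" = "$module" ] && return 0
                done
                ;;
        esac
    done
    return 1
}
```

For example, `blacklisted_on_cmdline i40iw && echo "i40iw is blacklisted"` could be run before `systemctl start rdma`, followed by `lsmod | grep i40iw` to confirm the module state.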
Thanks for testing. I will check on a machine with i40iw to debug this issue.

(In reply to Stefan Assmann from comment #8)
> As you can see, in both cases the i40iw driver will not get loaded with the
> patch applied. So I don't think applying the patch is a good idea.

It works for iw_cxgb4. I tried to check on a machine with i40iw hardware. However, none of them can boot when rdma-core is installed. I also tried machine intel-purley-02.klab.eng.bos.redhat.com, which you used to test the patch.

https://beaker.engineering.redhat.com/jobs/4480036
https://beaker.engineering.redhat.com/jobs/4479888
https://beaker.engineering.redhat.com/jobs/4479886

How can I reserve the machine with a rhel-8.2 or rhel-8.3 nightly distro? Which distro did you use to test this patch?

Thanks

(In reply to Honggang LI from comment #10)
> However, none of them can boot when rdma-core is installed.

It seems to have nothing to do with the rdma-core package. Reboot hangs without the rdma-core package too.

==== console output when machine hang ====
Start PXE over IPv4.
Station IP address is 10.19.176.125
Server IP address is 10.19.42.13
NBP filename is grub/grub-el6.6.efi
NBP filesize is 254279 Bytes
Downloading NBP file...
NBP file downloaded successfully.
Press 'ESC+!' for One-Time Boot Menu

We have to manually select "One-Time Boot Menu" in the serial console window.

(In reply to Stefan Assmann from comment #8)
> [root@intel-purley-02 ~]# systemctl status rdma
> ● rdma.service - Initialize the iWARP/InfiniBand/RDMA stack in the kernel
> Loaded: loaded (/usr/lib/systemd/system/rdma.service; disabled; vendor
> preset: disabled)
> Active: inactive (dead)
  ^^^^^^^^^^^^^^^^^^^^^^^

This means "Either the one-time configuration failed to execute or not executed yet." .
Do you know why it was dead? It should be "active (exited)".

I confirmed the patch works for me. I used hpe-e910-01.ml3.eng.bos.redhat.com to test this patch. I will keep this machine for 48 hours. Please feel free to log in to hpe-e910-01.ml3.eng.bos.redhat.com and run any test you like. You can modify any file on the machine.

Thanks

(In reply to Honggang LI from comment #13)
> > [root@intel-purley-02 ~]# systemctl status rdma
> > ● rdma.service - Initialize the iWARP/InfiniBand/RDMA stack in the kernel
> > Loaded: loaded (/usr/lib/systemd/system/rdma.service; disabled; vendor
> > preset: disabled)
> > Active: inactive (dead)
> ^^^^^^^^^^^^^^^^^^^^^^^
>
> This means "Either the one-time configuration failed to execute or not
> executed yet." .
> Do you know why it was dead?

The service was not started yet.

I have provisioned intel-purley-02 with the latest RHEL8.3 build. The system is a bit finicky to provision so please go ahead and use it for testing.

I confirmed the rdma-core package works as expected on machine intel-purley-0, with i40iw blacklisted via the kernel boot option. Can you please double check it?

Honggang, I checked the system and at first glance it seems to work as expected. Can you please explain what triggers the i40iw driver load now?

$ grep i40e /usr/lib/udev/rules.d/90-rdma-hw-modules.rules
ENV{ID_NET_DRIVER}=="i40e", RUN{builtin}+="kmod load i40iw"
This udev rule triggers the i40iw driver load.
I'm not a udev expert, but we absolutely don't want to load i40iw based on the presence of i40e. At least that's how I understand the udev rule above. i40iw is a huge memory hog and we will get loads of bug reports; also, probably 95% of NICs driven by i40e don't support rdma.
The only viable solution I see for udev would be PCI ID based loading of i40iw.

(In reply to Stefan Assmann from comment #21)
> The only viable solution I see for udev would be PCI ID based loading of
> i40iw.

Can you please provide the list of PCI IDs?

I'll talk to Intel to make sure we catch all PCI IDs.

Is it possible to fix the i40iw driver to scale memory usage based on available system memory? Mellanox does this.

(In reply to Honggang LI from comment #24)
> Is it possible to fix the i40iw driver to scale memory usage based on
> available system memory? Mellanox does this.

No, Intel officially denied that request. i40iw will be phased out at some point in favor of irdma.
Still discussing the list of supported IDs with Intel.

This is the list of supported devices I got from Intel.

/* Vendor ID */
#define I40E_INTEL_VENDOR_ID 0x8086

/* Device IDs */
#define I40E_DEV_ID_X722 0x37CC
#define I40E_DEV_ID_KX_X722 0x37CE
#define I40E_DEV_ID_QSFP_X722 0x37CF
#define I40E_DEV_ID_SFP_X722 0x37D0
#define I40E_DEV_ID_1G_BASE_T_X722 0x37D1
#define I40E_DEV_ID_10G_BASE_T_X722 0x37D2
#define I40E_DEV_ID_SFP_I_X722 0x37D3
#define I40E_DEV_ID_X722_VF 0x37CD

I installed a new version of rdma-core on the machine. The new version of rdma-core migrates
rdma.service to the udev rules provided in the rdma-core/kernel-boot/ directory.
I also updated the udev rule file "90-rdma-hw-modules.rules". Please check whether the new
rdma-core build works or not. Thanks
[root@intel-purley-02 rules.d]# diff -Nurp 90-rdma-hw-modules.rules.old 90-rdma-hw-modules.rules
--- 90-rdma-hw-modules.rules.old 2020-08-24 10:28:27.333991251 -0400
+++ 90-rdma-hw-modules.rules 2020-08-24 10:28:48.691809006 -0400
@@ -10,11 +10,29 @@ ENV{ID_NET_DRIVER}=="be2net", RUN{builti
ENV{ID_NET_DRIVER}=="bnxt_en", RUN{builtin}+="kmod load bnxt_re"
ENV{ID_NET_DRIVER}=="cxgb4", RUN{builtin}+="kmod load iw_cxgb4"
ENV{ID_NET_DRIVER}=="hns", RUN{builtin}+="kmod load hns_roce"
-ENV{ID_NET_DRIVER}=="i40e", RUN{builtin}+="kmod load i40iw"
ENV{ID_NET_DRIVER}=="mlx4_en", RUN{builtin}+="kmod load mlx4_ib"
ENV{ID_NET_DRIVER}=="mlx5_core", RUN{builtin}+="kmod load mlx5_ib"
ENV{ID_NET_DRIVER}=="qede", RUN{builtin}+="kmod load qedr"
+# Because most of X722 don't support RDMA, only load i40iw for X722 with specific device IDs.
+#define I40E_INTEL_VENDOR_ID 0x8086
+#define I40E_DEV_ID_X722 0x37CC
+#define I40E_DEV_ID_KX_X722 0x37CE
+#define I40E_DEV_ID_QSFP_X722 0x37CF
+#define I40E_DEV_ID_SFP_X722 0x37D0
+#define I40E_DEV_ID_1G_BASE_T_X722 0x37D1
+#define I40E_DEV_ID_10G_BASE_T_X722 0x37D2
+#define I40E_DEV_ID_SFP_I_X722 0x37D3
+#define I40E_DEV_ID_X722_VF 0x37CD
+ENV{ID_NET_DRIVER}=="i40e", ENV{ID_VENDOR_ID}=="0x8086", ENV{ID_MODEL_ID}=="0x37cc", RUN{builtin}+="kmod load i40iw"
+ENV{ID_NET_DRIVER}=="i40e", ENV{ID_VENDOR_ID}=="0x8086", ENV{ID_MODEL_ID}=="0x37ce", RUN{builtin}+="kmod load i40iw"
+ENV{ID_NET_DRIVER}=="i40e", ENV{ID_VENDOR_ID}=="0x8086", ENV{ID_MODEL_ID}=="0x37cf", RUN{builtin}+="kmod load i40iw"
+ENV{ID_NET_DRIVER}=="i40e", ENV{ID_VENDOR_ID}=="0x8086", ENV{ID_MODEL_ID}=="0x37d0", RUN{builtin}+="kmod load i40iw"
+ENV{ID_NET_DRIVER}=="i40e", ENV{ID_VENDOR_ID}=="0x8086", ENV{ID_MODEL_ID}=="0x37d1", RUN{builtin}+="kmod load i40iw"
+ENV{ID_NET_DRIVER}=="i40e", ENV{ID_VENDOR_ID}=="0x8086", ENV{ID_MODEL_ID}=="0x37d2", RUN{builtin}+="kmod load i40iw"
+ENV{ID_NET_DRIVER}=="i40e", ENV{ID_VENDOR_ID}=="0x8086", ENV{ID_MODEL_ID}=="0x37d3", RUN{builtin}+="kmod load i40iw"
+ENV{ID_NET_DRIVER}=="i40e", ENV{ID_VENDOR_ID}=="0x8086", ENV{ID_MODEL_ID}=="0x37cd", RUN{builtin}+="kmod load i40iw"
+
# The user must explicitly load these modules via /etc/modules-load.d/ or otherwise
# rxe
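To see which PCI functions on a given machine would match the device IDs quoted above, a small sysfs scan can help. This is only an illustrative sketch: the function name find_x722 and its optional root argument are hypothetical, and it assumes the standard sysfs layout in which each entry under /sys/bus/pci/devices exposes lowercase hex `vendor` and `device` files.

```shell
# Device IDs from the list quoted in this bug (lowercase, as sysfs prints them).
X722_IDS="0x37cc 0x37cd 0x37ce 0x37cf 0x37d0 0x37d1 0x37d2 0x37d3"

# Hypothetical helper: print "<pci-address> <device-id>" for every Intel
# (vendor 0x8086) PCI function whose device ID is in X722_IDS.
# $1 optionally overrides the sysfs directory to scan.
find_x722()
{
    local root=${1:-/sys/bus/pci/devices} dev vendor device id

    for dev in "$root"/*; do
        # Skip entries that do not expose both ID files.
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        vendor=$(cat "$dev/vendor")
        device=$(cat "$dev/device")
        [ "$vendor" = "0x8086" ] || continue
        for id in $X722_IDS; do
            if [ "$device" = "$id" ]; then
                echo "${dev##*/} $device"
            fi
        done
    done
}
```

Running `find_x722` with no arguments scans the live system; an empty result suggests that none of the listed X722 IDs is present.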
Works for me on intel-purley-02.klab. Also tested that module_blacklist=i40iw works as expected.

Alaa's PR for mlx4 and mlx5 IB only HCA. I will rebase my PR on it.

https://github.com/linux-rdma/rdma-core/pull/818

(In reply to Honggang LI from comment #31)
> Alaa's PR for mlx4 and mlx5 IB only HCA. I will rebase my PR on it.
>
> https://github.com/linux-rdma/rdma-core/pull/818

New PR to address this issue:
https://github.com/linux-rdma/rdma-core/pull/835

(In reply to Honggang LI from comment #33)
> New PR to address this issue:
> https://github.com/linux-rdma/rdma-core/pull/835

Merged into upstream repo. Set devel+ flag.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (RDMA stack bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1594
Description of problem:
The load_modules() function in rdma-init-kernel currently looks like this:

load_modules()
{
    local RC=0

    for module in $*; do
        if ! /sbin/modinfo $module > /dev/null 2>&1; then
            # do not attempt to load modules which do not exist
            continue
        fi
        if ! is_loaded $module; then
            /sbin/modprobe $module
            res=$?
            RC=$[ $RC + $res ]
            if [ $res -ne 0 ]; then
                echo
                echo "Failed to load module $module"
            fi
        fi
    done
    return $RC
}

By invoking /sbin/modprobe with --use-blacklist, modprobe would honor any blacklisted modules. This would improve our leverage when we need to avoid loading a specific module, especially since rdma-core is now part of anaconda.

BZ where this would have been helpful:
https://bugzilla.redhat.com/show_bug.cgi?id=1843840
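A minimal sketch of the requested change is shown below: the same loop, but calling modprobe with -b (the short form of --use-blacklist). Several things here are assumptions made for illustration and are not taken from the shipped script: the MODPROBE and MODINFO override variables, the /sys/module based is_loaded helper (the real helper is not shown in this report), and the use of $(( )) in place of the older $[ ] arithmetic.

```shell
# Assumption: command paths are overridable so the function can be tried
# out without touching real kernel modules.
MODPROBE=${MODPROBE:-/sbin/modprobe}
MODINFO=${MODINFO:-/sbin/modinfo}

# Hypothetical stand-in for the script's is_loaded helper: treat a module
# as loaded when it has a directory under /sys/module.
is_loaded()
{
    [ -d "/sys/module/$1" ]
}

load_modules()
{
    local RC=0 res module

    for module in "$@"; do
        if ! "$MODINFO" "$module" > /dev/null 2>&1; then
            # do not attempt to load modules which do not exist
            continue
        fi
        if ! is_loaded "$module"; then
            # -b makes modprobe apply "blacklist" entries from
            # /etc/modprobe.d to the module name given here.
            "$MODPROBE" -b "$module"
            res=$?
            RC=$((RC + res))
            if [ $res -ne 0 ]; then
                echo
                echo "Failed to load module $module"
            fi
        fi
    done
    return $RC
}
```

With a line such as `blacklist i40iw` in /etc/modprobe.d/blacklist.conf, `load_modules i40iw` would then be expected to leave the module unloaded, matching the manual `modprobe -b i40iw` test in the comments.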