Bug 1012993

Summary: mlx4: Don't show RoCE interfaces if the hpn channel is not installed
Product: Red Hat Enterprise MRG
Reporter: Clark Williams <williams>
Component: realtime-kernel
Assignee: John Kacur <jkacur>
Status: CLOSED ERRATA
QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: high
Docs Contact:
Priority: high
Version: 2.4
CC: bhu, jkastner, lgoncalv
Target Milestone: 2.4.1   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: RoCE appears to be supported in the kernel even when the user-space hpn channel is not installed.
Consequence: RoCE interfaces are shown even if the hpn channel is not installed.
Fix: Check for the hpn channel before exposing RoCE interfaces.
Result: RoCE devices show up as plain 10GigE devices if the hpn channel is not installed.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-10-31 16:33:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Clark Williams 2013-09-27 14:31:11 UTC
Description of problem:


Version-Release number of selected component (if applicable):

kernel-rt-3.8.13-rt14.20.el6rt

Additional info:

Above kernel is missing commit d08e3e6e0a312410868a38c7a8433b527392c56e which disables display of HPN interfaces if the relevant HPN support libraries are not present.
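
As a rough illustration of what the missing commit does, the user-space sketch below mirrors its gating logic: when hpn is 0, the RoCE/IBoE capability bit is masked out of the flags reported by the hardware, so the port is presented as plain Ethernet. The names DEMO_CAP_FLAG_IBOE, DEMO_CAP_FLAG_OTHER, effective_caps and roce_visible are hypothetical stand-ins, and the bit positions are chosen only for illustration; this is not the driver code.

```c
#include <stdint.h>

/* Hypothetical stand-ins for driver capability bits; positions are illustrative. */
#define DEMO_CAP_FLAG_IBOE  (1ULL << 30)
#define DEMO_CAP_FLAG_OTHER (1ULL << 3)

/*
 * Return the capability flags to expose for a device.
 * With hpn == 0 the IBoE (RoCE) bit is cleared; all other bits pass through.
 */
static uint64_t effective_caps(uint64_t hw_flags, int hpn)
{
    if (hpn)
        return hw_flags;
    return hw_flags & ~DEMO_CAP_FLAG_IBOE;
}

/* Nonzero if the exposed flags still advertise RoCE/IBoE. */
static int roce_visible(uint64_t exposed_flags)
{
    return (exposed_flags & DEMO_CAP_FLAG_IBOE) != 0;
}
```

With both demo bits set in hw_flags, effective_caps(hw_flags, 0) keeps only DEMO_CAP_FLAG_OTHER, while effective_caps(hw_flags, 1) returns the flags unchanged.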

Comment 4 Jiri Kastner 2013-10-07 11:55:06 UTC
[e-ndy@dhcp-27-86 linux-review]$ check_commit_presence ~/rpmbuild/BUILD/kernel-3.8.13/linux-3.8.13.x86_64/ 949c22effa1c4ed87493b692db76f268fda09ace
Reverting 949c22effa1c4ed87493b692db76f268fda09ace (mrg-rt-3.8.13-rt14.22~1) ... Applied
Restoring . Done
1 patch(es) was found applied.

##################################################################

[e-ndy@dhcp-27-86 linux-review]$ git show 949c22effa1c4ed87493b692db76f268fda09ace
commit 949c22effa1c4ed87493b692db76f268fda09ace
Author: Doug Ledford <dledford>
Date:   Wed Aug 22 11:58:13 2012 -0500

    [netdrv] mlx4: Don't show RoCE interfaces if the hpn channel is not installed
    
    commit d08e3e6e0a312410868a38c7a8433b527392c56e upstream
    
    Message-id: <c555413a03da4140c36c833c9f817540dae890de.1331750468.git.dledford>
    Patchwork-id: 45842
    O-Subject: [Patch RHEL6] mlx4: Don't show RoCE interfaces if the hpn channel is
        not installed
    Bugzilla: 753004
    RH-Acked-by: Jay Fenlason <fenlason>
    RH-Acked-by: Ivan Vecera <ivecera>
    
    The HPN channel is where we officially support RoCE/IBoE interfaces.  When
    we first created it, the user space packages in the base channel did not
    include the RoCE/IBoE patches, however the kernel did because we don't
    have separate kernels for the HPN channel and the base channel.  This
    created confusion where the user space didn't know about RoCE/IBoE
    devices and the kernel did.  In addition, people were attempting to use
    the RoCE devices without the HPN add on, not because they were trying
    to get around buying the HPN add on, but because they didn't know it was
    even needed (and that's totally understandable when the devices showed up
    without it).  So, this patch makes the devices disappear unless the HPN
    channel is added on to the base subscription.
    
    Note: this is a "keep people honest" type change, it is not intended to
    be something people can't work around.  When people see that their RoCE
    devices don't show up by default in the base product, then they'll ask
    what they need to do to enable them, the answer will be "buy the HPN
    add on", and there will no longer be any confusion.  However, only a
    modest amount of digging would uncover that they can set the hpn
    variable themselves, so this isn't intended to be an enforcement mechanism,
    just a reduction of confusion mechanism.
    
    Tested on my cluster.  Without the libmlx4-rocee package and with this
    patch, all RoCE devices show up as plain 10GigE devices and no more using
    any of ibv_devices, ibv_devinfo, ibstat, and ibstatus (these check
    different paths to the device info, and they all consistently do not
    show the RoCE device as RDMA capable).  With the libmlx4-rocee package
    in place which includes a modprobe.d/libmlx4.conf file that sets hpn
    to 1, the RoCE devices come right back and are seen via all of the
    above listed mechanisms.
    
    Resolves: bz753004
    
    Signed-off-by: Doug Ledford <dledford>
    Signed-off-by: Aristeu Rozanski <arozansk>
    Cherry-picked-by: Clark Williams <williams>
    Signed-off-by: Clark Williams <williams>
    Cherry-picked-for: v3.8-rt
    Signed-off-by: John Kacur <jkacur>
    
    Conflicts:
        drivers/net/ethernet/mellanox/mlx4/main.c

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 16abde2..a46b44c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -57,6 +57,10 @@ MODULE_VERSION(DRV_VERSION);
 
 struct workqueue_struct *mlx4_wq;
 
+static int hpn = 0;
+module_param(hpn, int, 0644);
+MODULE_PARM_DESC(hpn, "Enable RoCE/IBoE support (implies that the packages from the HPN channel are installed).  Enabling this option without thos packages installed is specifically not supported.");
+
 #ifdef CONFIG_MLX4_DEBUG
 
 int mlx4_debug_level = 0;
@@ -273,7 +277,11 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
 
        dev->caps.max_msg_sz         = dev_cap->max_msg_sz;
        dev->caps.page_size_cap      = ~(u32) (dev_cap->min_page_sz - 1);
-       dev->caps.flags              = dev_cap->flags;
+       if (hpn)
+               dev->caps.flags      = dev_cap->flags;
+       else
+               dev->caps.flags      = (dev_cap->flags &
+                                       ~MLX4_DEV_CAP_FLAG_IBOE);
        dev->caps.flags2             = dev_cap->flags2;
        dev->caps.bmme_flags         = dev_cap->bmme_flags;
        dev->caps.reserved_lkey      = dev_cap->reserved_lkey;
[e-ndy@dhcp-27-86 linux-review]$ tail -n +277 ~/rpmbuild/BUILD/kernel-3.8.13/linux-3.8.13.x86_64/drivers/net/ethernet/mellanox/mlx4/main.c | head -n 11

	dev->caps.max_msg_sz         = dev_cap->max_msg_sz;
	dev->caps.page_size_cap	     = ~(u32) (dev_cap->min_page_sz - 1);
	if (hpn)
		dev->caps.flags	     = dev_cap->flags;
	else
		dev->caps.flags      = (dev_cap->flags &
					~MLX4_DEV_CAP_FLAG_IBOE);
	dev->caps.flags2	     = dev_cap->flags2;
	dev->caps.bmme_flags	     = dev_cap->bmme_flags;
	dev->caps.reserved_lkey	     = dev_cap->reserved_lkey;

###########################################

[e-ndy@dhcp-27-86 linux-review]$ tail -n +57 ~/rpmbuild/BUILD/kernel-3.8.13/linux-3.8.13.x86_64/drivers/net/ethernet/mellanox/mlx4/main.c | head -n 10

struct workqueue_struct *mlx4_wq;

static int hpn = 0;
module_param(hpn, int, 0644);
MODULE_PARM_DESC(hpn, "Enable RoCE/IBoE support (implies that the packages from the HPN channel are installed).  Enabling this option without thos packages installed is specifically not supported.");

#ifdef CONFIG_MLX4_DEBUG

int mlx4_debug_level = 0;
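
Besides inspecting the source tree, the parameter could also be checked on a running system. The sketch below is a minimal user-space helper under stated assumptions: it assumes the driver is loaded as mlx4_core and that, because the parameter is declared with mode 0644, its value is readable at the sysfs path in HPN_PARAM_PATH. That path and the helper names parse_hpn_text and read_hpn are not taken from this report.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed sysfs location of the parameter; not stated in this report. */
#define HPN_PARAM_PATH "/sys/module/mlx4_core/parameters/hpn"

/*
 * Parse the text read from a module parameter file ("0\n", "1\n", ...).
 * Returns 1 if enabled (nonzero), 0 if disabled, -1 if not a number.
 */
static int parse_hpn_text(const char *text)
{
    char *end;
    long value;

    if (text == NULL)
        return -1;
    value = strtol(text, &end, 10);
    if (end == text)
        return -1;
    return value != 0;
}

/* Read the parameter from `path`; -1 if the file is missing or unreadable. */
static int read_hpn(const char *path)
{
    char buf[32];
    FILE *fp = fopen(path, "r");
    int result = -1;

    if (fp == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), fp) != NULL)
        result = parse_hpn_text(buf);
    fclose(fp);
    return result;
}
```

Calling read_hpn(HPN_PARAM_PATH) would return -1 when the parameter file does not exist, which would be expected on a kernel without the patch. Since the diff above reads hpn inside mlx4_dev_cap(), the value that matters is presumably the one in effect when the device capabilities are set up, which is why the commit message describes setting it from a modprobe.d file.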

Comment 5 errata-xmlrpc 2013-10-31 16:33:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1490.html