Bug 1340914

Summary: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool
Product: Red Hat Enterprise Linux 7
Reporter: Phil Sutter <psutter>
Component: iproute
Assignee: Bill Townsend <wtownsen>
Status: CLOSED ERRATA
QA Contact: Jaroslav Aster <jaster>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.3
CC: aloughla, atragler, jaster, jsitnick, mleitner, rkhan, wtownsen
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: iproute-3.10.0-73.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-03 23:39:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1253767, 1271580

Description Phil Sutter 2016-05-30 16:31:49 UTC
I discovered this missing feature in iproute while looking at conflicts during backporting for bug 1302119. With the resolution of bug 1200759, the kernel started supporting the IFLA_VF_RATE netlink attribute in addition to the older IFLA_VF_TX_RATE. Allowing users to use it via iproute would be nice to have (and would simplify backporting for bug 1302119).
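
For orientation, the two attributes carry differently shaped payloads. The sketch below packs both in Python. The field layouts are an assumption taken from struct ifla_vf_tx_rate and struct ifla_vf_rate in linux/if_link.h and should be checked against the kernel headers in use; the function names are made up for illustration and are not part of iproute.

```python
import struct

# Assumed layouts, mirroring linux/if_link.h (verify against your headers):
#   struct ifla_vf_tx_rate { __u32 vf; __u32 rate; };
#   struct ifla_vf_rate    { __u32 vf; __u32 min_tx_rate; __u32 max_tx_rate; };
VF_TX_RATE_FMT = "=II"   # native byte order, two unsigned 32-bit fields
VF_RATE_FMT = "=III"     # native byte order, three unsigned 32-bit fields


def pack_vf_tx_rate(vf, rate):
    """Payload of the older IFLA_VF_TX_RATE attribute: a single Tx rate in Mbps."""
    return struct.pack(VF_TX_RATE_FMT, vf, rate)


def pack_vf_rate(vf, min_tx_rate, max_tx_rate):
    """Payload of IFLA_VF_RATE: minimum and maximum Tx rate in Mbps."""
    return struct.pack(VF_RATE_FMT, vf, min_tx_rate, max_tx_rate)
```

The older payload has no room for a minimum rate, which is why a tool that only knows IFLA_VF_TX_RATE cannot express min_tx_rate at all.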

Comment 2 Phil Sutter 2016-06-21 21:08:00 UTC
This is the upstream commit to backport:

commit f89a2a05ffa94ac5bec9f50751f761215356092b
Author: Sucheta Chakraborty <sucheta.chakraborty>
Date:   Thu May 22 09:59:37 2014 -0400

    Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool
    
    o "min_tx_rate" option has been added for minimum Tx rate. Hence, for
      consistent naming, "max_tx_rate" option has been introduced for maximum
      Tx rate.
    
    o Change in v2: "rate" can be used along with "max_tx_rate".
      When both are specified, "max_tx_rate" should override.
    
    o Change in v3:
      * IFLA_VF_RATE: When IFLA_VF_RATE is used, and user has given only one of
        min_tx_rate or max_tx_rate, reading of previous rate limits is done in
        userspace instead of in kernel space before ndo_set_vf_rate.
    
      * IFLA_VF_TX_RATE: When IFLA_VF_TX_RATE is used, min_tx_rate is always read
        in kernel space. This takes care of below scenarios:
        (1) when old tool sends "rate" but kernel is new (expects min and max)
        (2) when new tool sends only "rate" but kernel is old (expects only "rate")
    
    o Change in v4 as suggested by Stephen Hemminger:
      * As per iproute policy, input and output formats should match. Changing display
        of max_tx_rate and min_tx_rate options accordingly.
        ./ip/ip link show p3p1
        8: p3p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
            link/ether 00:0e:1e:16:ce:40 brd ff:ff:ff:ff:ff:ff
            vf 0 MAC 2a:18:8f:4d:3d:d4, tx rate 700 (Mbps), max_tx_rate 700Mbps, min_tx_rate 200Mbps
            vf 1 MAC 72:dc:ba:f9:df:fd
    
    Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty>
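
The option handling described in the v2 and v3 notes of this commit message can be sketched as follows. This is a minimal Python illustration of the IFLA_VF_RATE path only, not the iproute code itself: resolve_vf_rate and its parameters are hypothetical names, and current stands for the (min, max) limits the tool would read back from the kernel.

```python
def resolve_vf_rate(current, rate=None, min_tx_rate=None, max_tx_rate=None):
    """Return the (min_tx_rate, max_tx_rate) pair to send via IFLA_VF_RATE.

    current: (min, max) limits currently set on the VF, in Mbps.
    rate, min_tx_rate, max_tx_rate: values given by the user, or None
    when the option was not given on the command line.
    """
    new_min, new_max = min_tx_rate, max_tx_rate

    # "rate" is the legacy spelling of the maximum. An explicit
    # max_tx_rate overrides it when both are specified.
    if new_max is None and rate is not None:
        new_max = rate

    # Whatever the user left out is filled in from the current limits.
    # Per the v3 note this happens in userspace, before the request
    # reaches ndo_set_vf_rate.
    cur_min, cur_max = current
    if new_min is None:
        new_min = cur_min
    if new_max is None:
        new_max = cur_max
    return new_min, new_max
```

For example, with current limits of (200, 700), passing only rate=500 yields (200, 500), while passing rate=500 together with max_tx_rate=600 yields (200, 600).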

Comment 3 Jakub Sitnicki 2016-07-18 15:49:27 UTC
Commit 8fe58d58941f ("iplink: Check address length via netlink") needed by BZ 1271580 and BZ 1253767 needs bits from commit f89a2a05ffa9 ("Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool") which pass dev_index to iplink_parse_vf().

There is also a required follow-up fix: commit 9a02651a87d0 ("ip: check for missing dev arg when doing VF rate").

Comment 6 Jaroslav Aster 2016-08-16 13:47:58 UTC
Hi Jakub,

I have tried to test it with the igb and ixgbe drivers, but it does not work. Could you recommend which driver I should use for the test?

# uname -r
3.10.0-489.el7.x86_64

# lsmod|grep igb
igbvf                  42222  0 
igb                   197328  0 
dca                    15130  2 igb,ioatdma
ptp                    19231  2 igb,tg3
i2c_algo_bit           13413  2 igb,mgag200
i2c_core               40756  7 drm,igb,i2c_i801,ipmi_ssif,drm_kms_helper,mgag200,i2c_algo_bit

# ip l show dev eno1
6: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether 58:20:b1:06:74:f4 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 0e:91:6b:2b:15:cf, spoof checking on, link-state auto
    vf 1 MAC 96:66:6e:6c:5f:8c, spoof checking on, link-state auto
    vf 2 MAC f2:2f:c2:75:21:6d, spoof checking on, link-state auto
    vf 3 MAC ca:14:4f:46:c9:7c, spoof checking on, link-state auto
    vf 4 MAC b2:da:28:51:3a:c3, spoof checking on, link-state auto
    vf 5 MAC fe:1f:ec:4b:1b:17, spoof checking on, link-state auto
    vf 6 MAC ae:0d:80:5d:84:30, spoof checking on, link-state auto

# ip link set dev eno1 vf 0 min_tx_rate 10
RTNETLINK answers: Operation not supported

Comment 7 Jaroslav Aster 2016-08-16 13:53:27 UTC
Unfortunately, some documentation is missing. The new min_tx_rate and max_tx_rate options are not mentioned in the ip link help; only the rate option is listed. Documentation should be consistent between the help output and the man page. Switching to the ASSIGNED state.

# ip link help
...
                          [ vf NUM [ mac LLADDR ]
                                   [ vlan VLANID [ qos VLAN-QOS ] ]
                                   [ rate TXRATE ]
                                   [ spoofchk { on | off} ]
                                   [ query_rss { on | off} ]
                                   [ state { auto | enable | disable} ] ]
                                   [ trust { on | off} ] ]
...


# man ip-link
...
               [ vf NUM [ mac LLADDR ]
                        [ vlan VLANID [ qos VLAN-QOS ] ]
                        [ rate TXRATE ]
                        [ max_tx_rate TXRATE ]
                        [ min_tx_rate TXRATE ]
                        [ spoofchk { on | off } ]
                        [ query_rss { on | off } ]
                        [ state { auto | enable | disable } ]
                        [ trust { on | off } ] ]
...

Comment 10 Marcelo Ricardo Leitner 2016-08-16 14:58:53 UTC
(In reply to Jaroslav Aster from comment #6)
> Hi Jakub,

Jakub is on PTO so we are covering for him.

> 
> I have tried to test it with the igb and ixgbe drivers, but it does not work. Could
> you recommend which driver I should use for the test?

Those should have worked, afaict from the kernel code.
mlx4 and qlcnic should also support it:
$ git grep ndo_set_vf_rate
broadcom/bnxt/bnxt.c:   .ndo_set_vf_rate        = bnxt_set_vf_bw,
emulex/benet/be_main.c: .ndo_set_vf_rate        = be_set_vf_tx_rate,
intel/i40e/i40e_main.c: .ndo_set_vf_rate        = i40e_ndo_set_vf_bw,
intel/igb/igb_main.c:   .ndo_set_vf_rate        = igb_ndo_set_vf_bw,
intel/ixgbe/ixgbe_main.c:       .ndo_set_vf_rate        = ixgbe_ndo_set_vf_bw,
mellanox/mlx4/en_netdev.c:      .ndo_set_vf_rate        = mlx4_en_set_vf_rate,
qlogic/qede/qede_main.c:        .ndo_set_vf_rate = qede_set_vf_rate,
qlogic/qlcnic/qlcnic_main.c:    .ndo_set_vf_rate        = qlcnic_sriov_set_vf_tx_rate,

For ixgbe, it was added via bz#1200759

Bill, can you please check why it didn't work for Jaroslav? Thanks!

Comment 12 Phil Sutter 2016-08-18 19:18:16 UTC
Hi Jaroslav,

(In reply to Jaroslav Aster from comment #6)
> I have tried to test it with the igb and ixgbe drivers, but it does not work. Could
> you recommend which driver I should use for the test?

On hardware driven by the igb driver, the functionality is only available if the MAC is of type e1000_82576. Furthermore, it seems that only the max_tx_rate setting is supported.

On ixgbe hardware, the link has to be up and the link speed must be 10000 (10Gbps). Even then, min_tx_rate is not supported, and max_tx_rate has to be larger than 10 (but still smaller than the link speed).

mlx4 and i40e also don't support min_tx_rate - or to be more precise, in every case only a min_tx_rate of 0 is supported.

bnxt and qlcnic seem to support both min and max tx_rate settings though.
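
The ixgbe conditions listed above can be restated as a small predicate, which may help when picking test values. This is a sketch of the description in this comment, not of the driver source: ixgbe_accepts is a hypothetical name, and treating a min_tx_rate of 0 as "not set" is an assumption carried over from the mlx4 and i40e remark.

```python
def ixgbe_accepts(link_up, link_speed, min_tx_rate, max_tx_rate):
    """Return True if the request would pass the ixgbe conditions as
    described in this comment. Rates and link speed are in Mbps."""
    # The link has to be up and running at 10000 Mbps (10Gbps).
    if not link_up or link_speed != 10000:
        return False
    # min_tx_rate is not supported; assume only 0 (unset) is accepted.
    if min_tx_rate != 0:
        return False
    # max_tx_rate must be larger than 10 and smaller than the link speed.
    return 10 < max_tx_rate < link_speed
```

Under these assumptions, a request like "min_tx_rate 10" from comment 6 would be refused regardless of link state, whereas "max_tx_rate 700" on a 10Gbps link that is up would pass.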

Maybe we should write a dummy driver at some point which successfully does nothing and accepts all netdev ops there are, so we can test iproute independently of actual hardware. Feel free to open a ticket if you like the idea, I don't think it's unreasonably hard to do.

Thanks, Phil

Comment 16 errata-xmlrpc 2016-11-03 23:39:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2162.html