Bug 1535978
Summary: | [RHEL-7.5/RDMA] opensm only honors the first item of mgroup_flags | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Honggang LI <honli> |
Component: | opensm | Assignee: | Honggang LI <honli> |
Status: | CLOSED ERRATA | QA Contact: | Mike Stowell <mstowell> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | | |
Version: | 7.5 | CC: | bhu, ddutile, honli, infiniband-qe, mstowell, rdma-dev-team |
Target Milestone: | rc | | |
Target Release: | 7.7 | | |
Hardware: | Unspecified | | |
OS: | All | | |
Whiteboard: | | | |
Fixed In Version: | opensm-3.3.21-1.el7 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2019-08-06 12:46:08 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1614004 | | |
Description
Honggang LI
2018-01-18 11:35:53 UTC
(In reply to Honggang LI from comment #0)

c#18 is very long, and requires the reader to parse it, and all readers to parse it to the same conclusion.

A clean summary/description of the problem should be stated here.

Also, you mention the 7.4 opensm version, then Mellanox, then an upstream commit ID. What point are you trying to make? What is the reader supposed to conclude from the different versions, given that these versions almost always differ at any point in time?

(In reply to Don Dutile from comment #2)
> (In reply to Honggang LI from comment #0)
> c#18 is very long, and requires the reader to parse, and all readers to
> parse to same conclusion.
>
> A clean summary/description of the problem should be stated here.

Summary: To set the MTU for an MC group, 'mtu=xxx' must be the first field of mgroup_flag. Otherwise, opensm will ignore it and set the MTU to the default value, 2048.

Good configuration. MTU will be set to 4096. 'mtu=5' is the first field of mgroup_flag.

[root@rdma03 opensm]# cat partitions.conf.default
Default=0x7fff,ipoib, mtu=5 rate=12:ALL=full;
                      ^^^^^

Bad configuration. MTU will be set to 2048. 'mtu=5' is the second field of mgroup_flag.

[root@rdma03 opensm]# cat partitions.conf.default
Default=0x7fff,ipoib, rate=12 mtu=5:ALL=full;
                      ^^^^^^^

> also, you mention 7.4 opensm version, then mellanox, then upstream commit-id.
> What point are you trying to make?

The issue can be reproduced with all of those versions of opensm.

(In reply to Honggang LI from comment #3)

Thanks for all that clarification! Now I understand your remark about the first field... I didn't get that before. So, is upstream fixed yet, or do you have to post a patch for it?

https://bugzilla.redhat.com/show_bug.cgi?id=1534869#c21

Copy and paste the explanation in here.
> [root@rdma-master ~]$ cat /etc/rdma/partitions-ib0.conf | grep mtu
> # mtu =
> Default=0x7fff, rate=6 mtu=5 scope=2, defmember=full:
> Default=0x7fff, ipoib, rate=6 mtu=5 scope=2:
                         ^^^^^^^^^^^^^^^^^^^^

Because of two issues, we failed to set the MTU to 4K.

1) The configuration file is wrong. There MUST BE a comma (,) between the mgroup_flag flags.

    Default=0x7fff, ipoib, rate=6 mtu=5 scope=2:

should be:

    Default=0x7fff, ipoib, rate=6, mtu=5, scope=2:
                                 ^      ^

I believe we had been misled by the example configuration file "/etc/rdma/partitions.conf" and the upstream doc source file "opensm-top-dir/doc/partition-config.txt". Neither document emphasizes that the fields of mgroup_flag must be split with a comma. We should update these two files.

2) The function "parse_name_token" is error-prone. It gives us a wrong 'flval' when a wrong configuration is passed into it. In fact, it should raise an error.

I instrumented the upstream opensm source code. Output with the wrong configuration file:

----------------------------
osm_prtn_config_parse_file open /etc/opensm/partitions.conf
osm_prtn_config_parse_file read line (1) (# Bad configuration, ib0's mtu will be 2044 )
osm_prtn_config_parse_file read line (2) (# Default=0x7fff,ipoib, rate=12 mtu=5:ALL=full; )
osm_prtn_config_parse_file read line (3) ( )
osm_prtn_config_parse_file read line (4) (# Good configuration, ib0's mtu will be 4092 )
osm_prtn_config_parse_file read line (5) (Default=0x7fff,ipoib, mtu=5 rate=12:ALL=full; )
===> parse_name_token return ret=(15) name=(Default), id=(0x7fff)
===> parse_name_token return ret=(6) flag=(ipoib), flval=((null))
===> parse_name_token return ret=(15) flag=(mtu), flval=(5 rate=12) <=====
                                                  ^^^^^^^^^^^^^^^^^
IT SHOULD RAISE AN ERROR HERE, AS A WRONG 'FLVAL' IS RETURNED. THAT IS WHY OPENSM ONLY HONORS THE FIRST FIELD OF MGROUP_FLAG.
===> parse_name_token return ret=(9) name=(ALL), flag=(full)
----------------------------

> ib0_2=0x0002, rate=7 mtu=5 scope=2, defmember=full:
> ib0_2=0x0002, ipoib, rate=7 mtu=5 scope=2:
> ib0_4=0x0004, rate=3 mtu=5 scope=2, defmember=full:
> ib0_4=0x0004, ipoib, rate=3 mtu=5 scope=2:
> ib0_6=0x0006, rate=12 mtu=5 scope=2, defmember=full:
> ib0_6=0x0006, ipoib, rate=12 mtu=5 scope=2:
>
>
> mtu=5 means MTU==4K.

024fe73e4481 opensm.8.in: Emphasize that the fields of mgroup_flag must be split with "comma"
1f82c22a1237 partition-config.txt: Emphasize that the fields of mgroup_flag must be split with "comma"
04d2a8be0305 osm_prtn_config.c: parse_group_flag log suspicious group flag value

These upstream patches fix this issue.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2100
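
For readers who want to check their own partition files for the missing-comma mistake discussed in this report, here is a minimal Python sketch. It is not opensm's code: the helper names `split_flags` and `suspicious_flags` are hypothetical, and the splitting rules are a simplification that assumes the definition part of a line ends at the first colon and that fields are separated by commas. It only illustrates why a value such as "5 rate=12" is a sign that a comma is missing.

```python
def split_flags(partition_line):
    """Return (flag, value) pairs from the definition part of a partition line.

    Hypothetical helper, not opensm's parser. Assumes the definition part
    ends at the first ':' and that fields are separated by commas.
    """
    definition = partition_line.split(":", 1)[0]
    fields = [field.strip() for field in definition.split(",")]
    pairs = []
    for field in fields[1:]:  # fields[0] is the "name=pkey" part
        if not field:
            continue
        flag, sep, value = field.partition("=")
        # A bare flag such as "ipoib" has no value at all.
        pairs.append((flag.strip(), value.strip() if sep else None))
    return pairs


def suspicious_flags(partition_line):
    """Return flags whose value contains whitespace or another '='.

    Such a value usually means two flags were written without a comma
    between them, e.g. "mtu=5 rate=12".
    """
    return [
        (flag, value)
        for flag, value in split_flags(partition_line)
        if value is not None
        and ("=" in value or any(ch.isspace() for ch in value))
    ]


if __name__ == "__main__":
    for line in (
        "Default=0x7fff,ipoib, mtu=5 rate=12:ALL=full;",
        "Default=0x7fff, ipoib, rate=6, mtu=5, scope=2:",
    ):
        print(line, "->", suspicious_flags(line))
```

With the first line, the sketch reports the flag `mtu` with the value `5 rate=12`, matching the 'flval' seen in the instrumented log; the comma-separated line yields no warnings.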