Bug 1886711 - [RFE] Enhance SOS report with tc and hw-offload information
Summary: [RFE] Enhance SOS report with tc and hw-offload information
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: sos
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: Pavel Moravec
QA Contact: Miroslav Hradílek
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-10-09 07:56 UTC by Adrián Moreno
Modified: 2021-11-10 07:39 UTC (History)

Fixed In Version: sos-4.1-3.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-09 19:36:07 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github sosreport sos pull 2051 0 None closed Ovs debug enhancements 2021-01-29 14:44:57 UTC
Github sosreport sos pull 2383 0 None closed [networking] add devlink and tc filter 2021-01-29 14:44:57 UTC
Github sosreport sos pull 2550 0 None open [networking] call tc filter show with mandatory ingress subcommand 2021-05-19 06:38:04 UTC
Red Hat Product Errata RHEA-2021:4388 0 None None None 2021-11-09 19:36:22 UTC

Description Adrián Moreno 2020-10-09 07:56:58 UTC
Currently, the sos report lacks some information that might be useful to troubleshoot hw-offloading problems.

The following logs should be added:
1) ovs hw-offloaded dp flows: BZ 1824854 might have already done that. But consider backporting it to the rhel 8.2 sos package

2) tc filter (including hw/sw stats) on all interfaces

3) devlink information (at least "param show" and "eswitch show")

Comment 1 Pavel Moravec 2020-10-13 07:31:39 UTC
(In reply to Adrián Moreno from comment #0)
> Currently, the sos report lacks some information that might be useful to
> troubleshoot hw-offloading problems.
> 
> The following logs should be added:
> 1) ovs hw-offloaded dp flows: BZ 1824854 might have already done that. But
> consider backporting it to the rhel 8.2 sos package

That is https://github.com/sosreport/sos/pull/2051 , right? Currently planned to RHEL8.4 due to rebase. We might add it to 8.3.z depending on severity/priority (as a classical z-stream bug), but why 8.2 and some EUS, please?


> 
> 2) tc filter (including hw/sw stats) on all interfaces

Something like adding line after:

https://github.com/sosreport/sos/blob/master/sos/report/plugins/networking.py#L205 ?

(or in pseudo-code:

for i in $(ls /sys/class/net/); do
    tc filter $i   # collect these outputs
done

) ?


> 
> 3) devlink information (at least "param show" and "eswitch show")

Could you be more specific (i.e. whole commands to be called, e.g. in a pseudocode like above)? When these commands should be called? (or also in networking plugin)?

Comment 2 Adrián Moreno 2020-10-15 15:52:09 UTC
Sorry, I wasn't very clear: my intention was to work on this myself (when I find some time), but if you are jumping on it, that's great!


(In reply to Pavel Moravec from comment #1)
> (In reply to Adrián Moreno from comment #0)
> > Currently, the sos report lacks some information that might be useful to
> > troubleshoot hw-offloading problems.
> > 
> > The following logs should be added:
> > 1) ovs hw-offloaded dp flows: BZ 1824854 might have already done that. But
> > consider backporting it to the rhel 8.2 sos package
> 
> That is https://github.com/sosreport/sos/pull/2051 , right? 
Yes, that PR includes the logs I'm referring to.

> Currently
> planned to RHEL8.4 due to rebase. We might add it to 8.3.z depending on
> severity/priority (as a classical z-stream bug), but why 8.2 and some EUS,
> please?
> 
The reason is mainly that we have enabled OvS tc hardware offloading on RHEL 8.2, and without the full offloaded datapath rules it's quite difficult to debug any issue related to hw offload.

> 
> > 
> > 2) tc filter (including hw/sw stats) on all interfaces
> 
> Something like adding line after:
> 
> https://github.com/sosreport/sos/blob/master/sos/report/plugins/networking.
> py#L205 ?
> 
> (or in pseudo-code:
> 
> for i in $(ls /sys/class/net/); do
>     tc filter $i   # collect these outputs
> done
> 
> ) ?
> 
Yes. However, I'd add a "-s" flag to the tc command to get the statistics as well

> 
> > 
> > 3) devlink information (at least "param show" and "eswitch show")
> 
> Could you be more specific (i.e. whole commands to be called, e.g. in a
> pseudocode like above)? When these commands should be called? (or also in
> networking plugin)?

Sure, I don't have a system with the right devices handy but it would be something like:

# Commands that show information of all available devices
$ devlink param show
$ devlink dev info

#Per-device commands
$ for dev in $(devlink dev); do \
    devlink dev eswitch show $dev; \
done

These commands show the chip/ASIC specific information for compatible switch devices, so I guess the network plugin sounds like the right place but I don't have enough knowledge of sos to have a strong opinion on this.

Comment 3 Marcelo Ricardo Leitner 2020-10-15 18:51:53 UTC
(In reply to Adrián Moreno from comment #2)
> > > 3) devlink information (at least "param show" and "eswitch show")
> > 
> > Could you be more specific (i.e. whole commands to be called, e.g. in a
> > pseudocode like above)? When these commands should be called? (or also in
> > networking plugin)?
> 
> Sure, I don't have a system with the right devices handy but it would be
> something like:
> 
> # Commands that show information of all available devices
> $ devlink param show
> $ devlink dev info
> 
> #Per-device commands
> $ for dev in $(devlink dev); do \
>     devlink dev eswitch show $dev; \
> done
> 
> These commands show the chip/ASIC specific information for compatible switch
> devices, so I guess the network plugin sounds like the right place but I
> don't have enough knowledge of sos to have a strong opinion on this.

+1 to network plugin. These are a companion to ethtool commands, let's say.
Sample outputs for devlink commands now at http://pastebin.test.redhat.com/910915

Comment 4 Pavel Moravec 2021-01-23 16:13:24 UTC
Preliminary patch:

diff --git a/sos/report/plugins/networking.py.orig b/sos/report/plugins/networking.py
index 5bdb697..81315f4 100644
--- a/sos/report/plugins/networking.py.orig
+++ b/sos/report/plugins/networking.py
@@ -102,8 +102,16 @@ class Networking(Plugin):
             "ip neigh show nud noarp",
             "biosdevname -d",
             "tc -s qdisc show",
+            "devlink dev param show",
+            "devlink dev info",
         ])
 
+        devlinks = self.collect_cmd_output("devlink dev")
+        if devlinks['status'] == 0:
+            devlinks_list = devlinks['output'].splitlines()
+            for devlink in devlinks_list:
+                self.add_cmd_output("devlink dev eswitch show %s" % devlink)
+
         # below commands require some kernel module(s) to be loaded
         # run them only if the modules are loaded, or if explicitly requested
         # via --allow-system-changes option
@@ -139,7 +147,8 @@ class Networking(Plugin):
                 "ethtool -l " + eth,
                 "ethtool --phy-statistics " + eth,
                 "ethtool --show-priv-flags " + eth,
-                "ethtool --show-eee " + eth
+                "ethtool --show-eee " + eth,
+                "tc -s filter show dev " + eth
             ], tags=eth)
 
             # skip EEPROM collection by default, as it might hang or
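
For anyone who wants to try the per-device loop from the patch above outside of sos, here is a minimal standalone Python sketch. It is not part of the patch: the helper name devlink_eswitch_cmds is hypothetical, and the sample text assumes that "devlink dev" prints one device handle per line (the PCI addresses are made up).

```python
def devlink_eswitch_cmds(devlink_dev_output):
    """Build one 'devlink dev eswitch show' command per listed device.

    devlink_dev_output is the raw text printed by 'devlink dev',
    assumed here to contain one device handle per line.
    """
    cmds = []
    for line in devlink_dev_output.splitlines():
        dev = line.strip()
        if not dev:
            continue  # ignore blank lines in the output
        cmds.append("devlink dev eswitch show %s" % dev)
    return cmds


# Hypothetical sample output; real handles depend on the hardware.
sample = "pci/0000:03:00.0\npci/0000:03:00.1\n"
for cmd in devlink_eswitch_cmds(sample):
    print(cmd)
```

Keeping the command construction in a pure function makes it easy to check against captured "devlink dev" output without the hardware at hand.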

Comment 5 Pavel Moravec 2021-01-24 11:12:23 UTC
Thanks, Marcelo, for the examples; they clarified some of my questions.

Upstream PR raised: https://github.com/sosreport/sos/pull/2383

As I am still somewhat uncertain about the syntax of particular commands, please review the PR to check that I got it right.

Preliminarily, this will be available in RHEL 8.5.

Comment 7 Pavel Moravec 2021-04-30 13:08:02 UTC
Hi Adrián,
we have identified this bugfix as important to verify, preferably in a real (non-mocked) environment. Could you please verify the bug against the sos-4.1-1.el8 package, or ask somebody from the knowledge domain to do so (i.e. as OtherQE)?

Thanks in advance.

Comment 8 Adrián Moreno 2021-05-05 15:01:02 UTC
Hi Pavel,

Sure, I'll see if I can get my hands on the right environment

Comment 10 Adrián Moreno 2021-05-18 16:32:33 UTC
Hi Pavel,

I've tested in a real environment and found that we're missing the qdisc name on the "tc filter show" command

I was surprised to see that without specifying any qdisc the command returns nothing:

[heat-admin@overcloud-computeovshwoffload-0 ~]$ sudo tc filter show dev lxbond
[heat-admin@overcloud-computeovshwoffload-0 ~]$ 

While:

[heat-admin@overcloud-computeovshwoffload-0 ~]$ sudo tc filter show dev lxbond ingress
filter block 43 protocol 802.1Q pref 3 flower chain 0
filter block 43 protocol 802.1Q pref 3 flower chain 0 handle 0x1
  vlan_id 100
  vlan_ethtype ip
  dst_mac 01:00:5e:00:00:12
  src_mac 52:54:00:1d:fe:d2
  eth_type ipv4
  ip_flags nofrag
  not_in_hw
        action order 1: skbedit  ptype host pipe
         index 3 ref 1 bind 1

        action order 2: mirred (Ingress Redirect to device br-link2) stolen
        index 7 ref 1 bind 1
        cookie 81ef8cc3de424beeef27009d5f38947e

filter block 43 protocol 802.1Q pref 3 flower chain 0 handle 0x2
[...]

It's confusing because "man tc" does show:

    tc [ OPTIONS ] filter show dev DEV                                                                          

However the "help" command shows:

[heat-admin@overcloud-computeovshwoffload-0 ~]$ sudo tc filter help
Usage: tc filter [ add | del | change | replace | show ] [ dev STRING ]
       tc filter [ add | del | change | replace | show ] [ block BLOCK_INDEX ]
       tc filter get dev STRING parent CLASSID protocol PROTO handle FILTERID pref PRIO FILTER_TYPE
       tc filter get block BLOCK_INDEX protocol PROTO handle FILTERID pref PRIO FILTER_TYPE
       [ pref PRIO ] protocol PROTO [ chain CHAIN_INDEX ]
       [ estimator INTERVAL TIME_CONSTANT ]
       [ root | ingress | egress | parent CLASSID ]
       [ handle FILTERID ] [ [ FILTER_TYPE ] [ help | OPTIONS ] ]

       tc filter show [ dev STRING ] [ root | ingress | egress | parent CLASSID ]
       tc filter show [ block BLOCK_INDEX ]
Where:
FILTER_TYPE := { rsvp | u32 | bpf | fw | route | etc. }
FILTERID := ... format depends on classifier, see there
OPTIONS := ... try tc filter add <desired FILTER_KIND> help
[heat-admin@overcloud-computeovshwoffload-0 ~]$ 

I'll send a patch to fix the man page. In the meantime, I think this is what we need:

diff --git a/sos/report/plugins/networking.py b/sos/report/plugins/networking.py
index acfa027f..09075363 100644
--- a/sos/report/plugins/networking.py
+++ b/sos/report/plugins/networking.py
@@ -156,7 +156,7 @@ class Networking(Plugin):
                 "ethtool --phy-statistics " + eth,
                 "ethtool --show-priv-flags " + eth,
                 "ethtool --show-eee " + eth,
-                "tc -s filter show dev " + eth
+                "tc -s filter show dev " + eth + " ingress",
             ], tags=eth)
 
             # skip EEPROM collection by default, as it might hang or

Comment 11 Pavel Moravec 2021-05-19 06:38:08 UTC
Thanks for spotting it, raising upstream PR:

https://github.com/sosreport/sos/pull/2550

Comment 12 Marcelo Ricardo Leitner 2021-05-19 12:33:47 UTC
We actually want both, because otherwise it will only show ingress filters.

[root@horizon ~]# ip link add veth1 type veth peer name veth2  
[root@horizon ~]# tc qdisc show dev veth1

[root@horizon ~]# tc qdisc add dev veth1 ingress
[root@horizon ~]# tc filter add dev veth1 ingress matchall action drop

                                              vvvvvvvvv
[root@horizon ~]# tc qdisc add dev veth1 root handle 1: htb
[root@horizon ~]# tc filter add dev veth1 parent 1: handle 42 matchall action drop
                                                           ^^

[root@horizon ~]# tc qdisc show dev veth1
qdisc htb 1: root refcnt 2 r2q 10 default 0 direct_packets_stat 0 direct_qlen 1000
qdisc ingress ffff: parent ffff:fff1 ----------------


[root@horizon ~]# tc filter show dev veth1
filter parent 1: protocol all pref 49152 matchall chain 0
       ^^^^^^^^^^
filter parent 1: protocol all pref 49152 matchall chain 0 handle 0x2a    <--- 0x2a = 42
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 2 ref 1 bind 1

[root@horizon ~]# tc filter show dev veth1 ingress
filter protocol all pref 49152 matchall chain 0
filter protocol all pref 49152 matchall chain 0 handle 0x1
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1

Andrea can explain better the semantics around ingress/egress.
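
As a rough illustration of collecting both variants per interface, here is a minimal standalone Python sketch. It is only an assumption of how this could look, not the actual sos change (see the upstream PR for that); the helper names tc_filter_cmds and list_interfaces are hypothetical.

```python
import os


def tc_filter_cmds(dev):
    """Return both 'tc filter show' variants for one interface.

    The first form lists filters attached to the root qdisc, the
    second one lists filters attached to the ingress qdisc.
    """
    return [
        "tc -s filter show dev " + dev,
        "tc -s filter show dev " + dev + " ingress",
    ]


def list_interfaces(sysfs_net="/sys/class/net"):
    """List entries under /sys/class/net, or nothing if unavailable.

    Note: entries that are not interfaces (e.g. bonding_masters)
    are not filtered out in this sketch.
    """
    try:
        return sorted(os.listdir(sysfs_net))
    except OSError:
        return []


for dev in list_interfaces():
    for cmd in tc_filter_cmds(dev):
        print(cmd)
```

For veth1 from the example above, this yields "tc -s filter show dev veth1" and "tc -s filter show dev veth1 ingress".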

Comment 13 Pavel Moravec 2021-05-19 14:19:39 UTC
Thanks for the prompt feedback. I have updated the PR accordingly, let me know if this version is correct :)

https://github.com/sosreport/sos/pull/2550/files

Comment 14 Marcelo Ricardo Leitner 2021-05-20 14:21:04 UTC
LGTM!
Btw, I wanted to add a 'Reviewed-by:' tag but wasn't sure how, so I hit 'approve' in GitHub too. Please let me know if that wasn't appropriate.
Thanks.

Comment 24 errata-xmlrpc 2021-11-09 19:36:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (sos bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:4388

