Bug 1869724 - sosreport running 'ethtool -e' is causing bnx2x NICs to pause
Summary: sosreport running 'ethtool -e' is causing bnx2x NICs to pause
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: sos
Version: 8.3
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 8.0
Assignee: Pavel Moravec
QA Contact: Miroslav Hradílek
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-08-18 13:57 UTC by suresh kumar
Modified: 2023-12-15 18:54 UTC (History)

Fixed In Version: sos-3.9.1-6.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-04 01:58:15 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | sosreport sos pull 2200 | 0 | None | closed | [networking] remove 'ethtool -e' option for bnx2x NICs | 2021-02-17 21:35:26 UTC
Red Hat Product Errata | RHEA-2020:4534 | 0 | None | None | None | 2020-11-04 01:58:38 UTC

Internal Links: 1918923

Description suresh kumar 2020-08-18 13:57:41 UTC
Description of problem:

[1]
A customer observed that their application stalls for ~4 seconds while sosreport is running.

The issue was traced to the 'ethtool -e' command. In the strace output, the ioctl that reads the EEPROM returns only after 3.444488 seconds.

+++
26621 1595999211.585459 socket(AF_INET, SOCK_DGRAM, IPPROTO_IP) = 3<UDP:[21070872]> <0.000016>
26621 1595999211.585748 ioctl(3<UDP:[21070872]>, SIOCETHTOOL, 0x7fffffffe530) = 0 <0.000026>
26621 1595999211.586002 mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff750a000 <0.000012>
26621 1595999211.587348 ioctl(3<UDP:[21070872]>, SIOCETHTOOL, 0x7fffffffe530 <unfinished ...>
26621 1595999215.032016 <... ioctl resumed> ) = 0 <3.444488>  <<<<<
+++
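
A minimal sketch of how such a trace could be scanned for slow calls, assuming the trace was captured with per-call durations at the end of each line (as `strace -T` prints them). The function name `slow_syscalls`, the 1-second threshold, and the `SAMPLE` list are illustrative choices, not part of sos or strace.

```python
import re

# A per-call duration looks like "<3.444488>" at the end of a line.
# Decorations such as "<UDP:[21070872]>" or "<unfinished ...>" do not match.
DURATION_RE = re.compile(r"<(\d+\.\d+)>")


def slow_syscalls(lines, threshold=1.0):
    """Return (pid, seconds, line) for every call that took >= threshold seconds."""
    slow = []
    for line in lines:
        matches = DURATION_RE.findall(line)
        if not matches:
            continue  # "<unfinished ...>" lines carry no duration
        seconds = float(matches[-1])  # the duration is the last <...> field
        if seconds >= threshold:
            pid = line.split()[0]
            slow.append((pid, seconds, line.strip()))
    return slow


# Three lines taken from the excerpt above (annotation arrows removed).
SAMPLE = [
    "26621 1595999211.585748 ioctl(3<UDP:[21070872]>, SIOCETHTOOL, 0x7fffffffe530) = 0 <0.000026>",
    "26621 1595999211.587348 ioctl(3<UDP:[21070872]>, SIOCETHTOOL, 0x7fffffffe530 <unfinished ...>",
    "26621 1595999215.032016 <... ioctl resumed> ) = 0 <3.444488>",
]

for pid, seconds, line in slow_syscalls(SAMPLE):
    print(f"pid {pid}: {seconds:.6f}s  {line}")
```

Only the resumed ioctl crosses the threshold here; the first SIOCETHTOOL call completes in microseconds.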

NIC version:
+++
driver: bnx2x
version: 1.713.36-0 storm 7.13.1.0
firmware-version: mbi 7.15.64 bc 7.14.62
+++



Version-Release number of selected component (if applicable):

sos version >= 3.7
sosreport has supported 'ethtool -e' from version 3.7 onwards.

+++
$ git show 8b989aeb
commit 8b989aebc9c152430fc57f918a8e90210a792a9f
Author: Patrick Talbert <ptalbert>
Date:   Thu Dec 6 13:14:38 2018 +0100

    [networking] Extend ethtool command set
    
    Update the list of ethtool commands to include:
    
    ethtool -e (EEPROM dump)
    ethtool -P (permanent MAC address)
    ethtool -l (channel/queue settings)
    ethtool --phy-statistics
    ethtool --show-priv-flags
    ethtool --show-eee
    
    All of the above are helpful in understanding the state of modern NICs.
    And -P is nice to have as otherwise there is no reliable way to see the
    permanent MAC of team ports.
...
+++



How reproducible:

Always. Run sosreport on a system with a bnx2x NIC.

Below is the test result from system dell-pem630-01:

+++
System Information
        Manufacturer: Dell Inc.
        Product Name: PowerEdge M630
        Version: Not Specified
        Serial Number: 1V8QT52
        UUID: 4c4c4544-0056-3810-8051-b1c04f543532


+++
# ethtool -i em1
driver: bnx2x                                      <------------------
version: 1.713.36-0 storm 7.13.1.0
firmware-version: FFV7.12.19 bc 7.12.5
expansion-rom-version: 
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
+++


The EEPROM dump takes a long time:
+++
# time ethtool -e em1
...
0x1fff80:		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x1fff90:		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x1fffa0:		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x1fffb0:		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x1fffc0:		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x1fffd0:		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x1fffe0:		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x1ffff0:		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

real	0m12.674s                       <-----------------
user	0m0.090s
sys	0m1.294s
+++
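
For repeating this measurement from a script, a small timing helper could look like the sketch below. `time_command` is a hypothetical name, and the demonstration times a no-op Python child process so that the snippet is safe to run anywhere. Substituting `["ethtool", "-e", "em1"]` would reproduce the measurement above and should only be done on a non-production host, since that is the command that pauses the NIC.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv with output discarded; return (exit_code, wall_clock_seconds)."""
    start = time.monotonic()
    proc = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # report the exit code instead of raising
    )
    return proc.returncode, time.monotonic() - start


# Harmless demonstration: time a Python child process that does nothing.
exit_code, seconds = time_command([sys.executable, "-c", "pass"])
print(f"exit={exit_code} real={seconds:.3f}s")
```

This measures wall-clock time only, which corresponds to the "real" figure reported by `time`.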



Actual results:
sosreport takes a long time to complete, and bnx2x NICs pause while it runs.



Expected results:
sosreport should not affect NIC operation.



Additional info:
[1]
In another instance (https://bugzilla.redhat.com/show_bug.cgi?id=1846708), we observed that sosreport breaks iDRAC connectivity. That issue was also traced to the "ethtool -e" command.


[2]
Dell has an advisory regarding this:
+++
https://www.dell.com/support/manuals/in/en/inbsdt1/red-hat-entps-lx-v7.0/rhel_7.7_rn/reading-eeprom-from-a-broadcom-device-via-ethtool-results-in-soft-lockup?guid=guid-986ca2a9-c9f9-4345-8762-02d286cc0d1f&lang=en-us
+++

[3]

We also have a case where abrtd triggers sosreport and the CPUs frequently hit soft lockups.


So it is better for sosreport to avoid running 'ethtool -e' on bnx2x NICs, as it is not production safe.

Comment 1 suresh kumar 2020-08-18 14:07:06 UTC
I have submitted an upstream patch that removes 'ethtool -e' for bnx2x NICs, and it has been accepted:
https://github.com/sosreport/sos/commit/34c77d6902ee1df403dc3836b4092d413fb95350

+++
$ git show 34c77d69
commit 34c77d6902ee1df403dc3836b4092d413fb95350
Author: suresh2514 <suresh2514>
Date:   Fri Aug 14 22:59:34 2020 +0530

    [networking] remove 'ethtool -e' option for bnx2x NICs
    
    Running EEPROM dump (ethtool -e) can result in bnx2x driver NICs to
    pause for few seconds and is not recommended in production environment.
    
    Resolves: #2188
    Resolves: #2200
    
    Signed-off-by: suresh2514 <suresh2514>
    Signed-off-by: Jake Hunsaker <jhunsake>

diff --git a/sos/report/plugins/networking.py b/sos/report/plugins/networking.py
index ba9c0fb1..397549a5 100644
--- a/sos/report/plugins/networking.py
+++ b/sos/report/plugins/networking.py
@@ -198,7 +198,6 @@ class Networking(Plugin):
                 "ethtool -a " + eth,
                 "ethtool -c " + eth,
                 "ethtool -g " + eth,
-                "ethtool -e " + eth,
                 "ethtool -P " + eth,
                 "ethtool -l " + eth,
                 "ethtool --phy-statistics " + eth,
@@ -206,6 +205,17 @@ class Networking(Plugin):
                 "ethtool --show-eee " + eth
             ], tags=eth)
 
+            # skip EEPROM collection for 'bnx2x' NICs as this command
+            # can pause the NIC and is not production safe.
+            bnx_output = {
+                "cmd": "ethtool -i %s" % eth,
+                "output": "bnx2x"
+            }
+            bnx_pred = SoSPredicate(self,
+                                    cmd_outputs=bnx_output,
+                                    required={'cmd_outputs': 'none'})
+            self.add_cmd_output("ethtool -e %s" % eth, pred=bnx_pred)
+
         # Collect information about bridges (some data already collected via
         # "ip .." commands)
         self.add_cmd_output([
+++
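
The decision the patch encodes can be illustrated outside sos with the standalone sketch below. It does not use `SoSPredicate` or any other sos API; `driver_from_ethtool_i` and `should_collect_eeprom` are hypothetical helper names. As a further assumption, it compares the parsed `driver:` field, whereas the patch hands the whole `ethtool -i` output to the predicate.

```python
def driver_from_ethtool_i(output):
    """Return the value of the 'driver:' field from `ethtool -i` output, or None."""
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "driver":
            return value.strip()
    return None


def should_collect_eeprom(ethtool_i_output, skip_drivers=("bnx2x",)):
    """True if running `ethtool -e` is considered safe for this interface."""
    return driver_from_ethtool_i(ethtool_i_output) not in skip_drivers


# First lines of the `ethtool -i em1` output quoted in the description.
BNX2X_INFO = """driver: bnx2x
version: 1.713.36-0 storm 7.13.1.0
firmware-version: FFV7.12.19 bc 7.12.5
"""

if should_collect_eeprom(BNX2X_INFO):
    print("would run: ethtool -e em1")
else:
    print("skipped command 'ethtool -e em1'")
```

Keeping the skip list as a parameter would make it easy to add other drivers later, although the patch itself only covers bnx2x.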



Test result for above patch:

+++
 Setting up archive ...
 Setting up plugins ...
...
[plugin:networking] skipped command 'ethtool -e em2':    <--------------------- bnx2x NIC
[plugin:networking] skipped command 'ethtool -e em1':    <--------------------- bnx2x NICs
 Running plugins. Please wait ...

  Starting 1/1   networking      [Running: networking]

  Finished running plugins

Creating compressed archive...

Your sosreport has been generated and saved in:
	/var/tmp/sosreport-dell-pem630-01-2020-08-14-ixpdmsw.tar.xz

 Size	1.24MiB
 Owner	root
 md5	a2c236193997733cc383ebdf2bac478f

Please send this file to your support representative.


real	0m3.718s   <-------  Without this patch,  it was taking 12s to complete sosreport.
user	0m2.028s
sys	0m0.896s
+++

Comment 2 Pavel Moravec 2020-08-18 14:30:36 UTC
I can add it to RHEL 8.3.0 but we are limited on QE capacity. If/Once a candidate package is available, could you verify it, please?

Comment 3 suresh kumar 2020-08-18 16:06:09 UTC
(In reply to Pavel Moravec from comment #2)
> I can add it to RHEL 8.3.0 but we are limited on QE capacity. If/Once a
> candidate package is available, could you verify it, please?

sure

regards

Comment 4 Pavel Moravec 2020-08-19 09:10:15 UTC
Hello,
could you please verify the fix against the build below? Thanks in advance.


A yum repository for the build of sos-3.9.1-6.el8 (task 30820540) is available at:

http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.9.1/6.el8/

You can install the rpms locally by putting this .repo file in your /etc/yum.repos.d/ directory:

http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.9.1/6.el8/sos-3.9.1-6.el8.repo

RPMs and build logs can be found in the following locations:
http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.9.1/6.el8/noarch/

The full list of available rpms is:
http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.9.1/6.el8/noarch/sos-3.9.1-6.el8.src.rpm
http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.9.1/6.el8/noarch/sos-3.9.1-6.el8.noarch.rpm
http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.9.1/6.el8/noarch/sos-audit-3.9.1-6.el8.noarch.rpm

The repository will be available for the next 60 days. Scratch build output will be deleted
earlier, based on the Brew scratch build retention policy.

Comment 18 errata-xmlrpc 2020-11-04 01:58:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (sos bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4534

