Bug 1917196

Summary: network glitch while running ethtool -e command in sosreport
Product: Red Hat Enterprise Linux 8
Reporter: Kyung Huh <khuh>
Component: sos
Assignee: Pavel Moravec <pmoravec>
Status: CLOSED ERRATA
QA Contact: Maros Kopec <makopec>
Severity: high
Priority: urgent
Version: 8.1
CC: agk, bmr, jeharris, mhradile, mtesar, ofamera, plambri, pmoravec, sbradley, vcojot
Target Milestone: rc
Keywords: ZStream
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: sos-4.0-6.el8
Doc Type: If docs needed, set a value
Last Closed: 2021-05-18 14:49:10 UTC
Type: Bug
Bug Blocks: 1928627, 1928628, 1928687

Description Kyung Huh 2021-01-18 01:23:24 UTC
Description of problem:
While sosreport was running the ethtool -e command, the network glitched:
the pacemaker IP resource monitoring timed out and the VIP resource then failed over to the other node.

Version-Release number of selected component (if applicable):
sos-3.7-8.el8_1.noarch
sos-3.9.1-6.el8.noarch

How reproducible:
Run sosreport and, in another terminal, run 'watch ip addr'.
It was observed that the ip addr command was blocked for about 17 seconds.

Steps to Reproduce:
1. Run sosreport
2. Open another terminal and run 'watch ip addr'

Actual results:
ip addr command was blocked for about 17 seconds

Expected results:
It should not block the ip addr command.

Additional info:
NIC driver: i40e
The output of the ethtool -e command is large (22M in the listing below)

# LANG=C ls -lah sos_commands/networking/ethtool_-e*
-rw-r--r-- 1 root root 22M Dec 14 10:27 ethtool_-e_eno1

This issue is similar to BZ 1869724, except that a different NIC driver is involved.
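
The blocking described above could be quantified by timing 'ip addr' repeatedly while sosreport runs in another terminal. This is only a minimal sketch and not part of the original report: the helper names measure_ms and report_if_slow and the 1000 ms threshold are assumptions made for illustration, and it relies on GNU date supporting %N.

```shell
#!/bin/bash
# measure_ms CMD [ARGS...]: run a command and print its wall-clock time in ms.
# Hypothetical helper; relies on GNU date's %N (nanoseconds).
measure_ms() {
    local start end
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# report_if_slow THRESHOLD_MS CMD [ARGS...]: print a line only when CMD
# takes at least THRESHOLD_MS milliseconds.
report_if_slow() {
    local threshold=$1 elapsed
    shift
    elapsed=$(measure_ms "$@")
    if [ "$elapsed" -ge "$threshold" ]; then
        echo "$(date +%T) '$*' took ${elapsed} ms"
    fi
}

# Example loop (commented out so that sourcing this file has no side effects):
# while sleep 1; do report_if_slow 1000 ip addr; done
```

Running the commented loop in a second terminal during 'sosreport -o networking' would print a line each time 'ip addr' stalls for a second or more, which gives a timestamped record rather than relying on watching the screen.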

Comment 2 Pavel Moravec 2021-01-21 10:39:24 UTC
The underlying fix must go into 8.4, as this is the third similar problem where "ethtool -e %DEV" breaks the stability of the underlying system. The sos report tool cannot afford to behave this way because of an issue in a tool it invokes.

Comment 9 Maros Kopec 2021-01-27 13:05:12 UTC
I tested it manually on RHEL-8.4 x86_64

OLD

# rpm -qa sos
sos-4.0-5.el8.noarch

# sos report --list-plugins | grep 'networking\.'
 networking.traceroute     off             collect a traceroute to www.example.com
 networking.namespace_pattern                 Specific namespaces pattern to be collected, namespaces pattern should be separated by whitespace as for example "eth* ens2"
 networking.namespaces     0               Number of namespaces to collect, 0 for unlimited. Incompatible with the namespace_pattern plugin option
 networking.ethtool_namespaces on              Define if ethtool commands should be collected for namespaces

Default sos report run
# sos report -o networking --batch
...
Your sosreport has been generated and saved in:
	/var/tmp/sosreport-localhost-2021-01-27-kihdqsq.tar.xz

# tar tf /var/tmp/sosreport-localhost-2021-01-27-kihdqsq.tar.xz | grep ethtool_-e
sosreport-localhost-2021-01-27-kihdqsq/sos_commands/networking/ethtool_-e_lo
sosreport-localhost-2021-01-27-kihdqsq/sos_commands/networking/ethtool_-e_eth0

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

NEW

# rpm -qa sos
sos-4.0-6.el8.noarch

We can see that the eepromdump option is now off by default
# sos report --list-plugins | grep 'networking\.'
 networking.traceroute     off             collect a traceroute to www.example.com
 networking.namespace_pattern                 Specific namespaces pattern to be collected, namespaces pattern should be separated by whitespace as for example "eth* ens2"
 networking.namespaces     0               Number of namespaces to collect, 0 for unlimited. Incompatible with the namespace_pattern plugin option
 networking.ethtool_namespaces on              Define if ethtool commands should be collected for namespaces
 networking.eepromdump     off             collect 'ethtool -e' for all devices

Default sos report run
# sos report -o networking --batch
...
Your sosreport has been generated and saved in:
  /var/tmp/sosreport-localhost-2021-01-27-oqqwddm.tar.xz

# tar tf /var/tmp/sosreport-localhost-2021-01-27-oqqwddm.tar.xz | grep ethtool_-e

To collect ethtool -e output, the eepromdump option must now be specified explicitly
# sos report -o networking --plugin-option networking.eepromdump --batch
...
[plugin:networking] WARNING (about to collect 'ethtool -e lo'): collecting an eeprom dump is known to cause certain NIC drivers (e.g. bnx2x/tg3) to interrupt device operation
[plugin:networking] WARNING (about to collect 'ethtool -e eth0'): collecting an eeprom dump is known to cause certain NIC drivers (e.g. bnx2x/tg3) to interrupt device operation
...
Your sosreport has been generated and saved in:
	/var/tmp/sosreport-localhost-2021-01-27-zpybjxm.tar.xz

# tar tf /var/tmp/sosreport-localhost-2021-01-27-zpybjxm.tar.xz | grep ethtool_-e
sosreport-localhost-2021-01-27-zpybjxm/sos_commands/networking/ethtool_-e_lo
sosreport-localhost-2021-01-27-zpybjxm/sos_commands/networking/ethtool_-e_eth0
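
The manual 'tar tf ... | grep ethtool_-e' checks above could be wrapped into a small helper for repeated verification. This is a sketch only; the function name count_eeprom_dumps is hypothetical and it assumes a tar that auto-detects the archive compression (as GNU tar does).

```shell
#!/bin/bash
# count_eeprom_dumps ARCHIVE: print how many ethtool_-e_* entries the sos
# archive contains, without extracting it.
# Hypothetical helper, not part of sos itself.
count_eeprom_dumps() {
    # grep -c prints 0 but exits non-zero when nothing matches; '|| true'
    # keeps the function usable under 'set -e'.
    tar tf "$1" | grep -c '/sos_commands/networking/ethtool_-e_' || true
}
```

With the default run on the NEW package the expectation would be a count of 0, and a non-zero count once networking.eepromdump is enabled.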

Comment 16 Pavel Moravec 2021-02-15 10:46:06 UTC
Generic functional reproducer / verification steps:

1)
sosreport -o networking -l

prints:

 networking.eepromdump     off             collect 'ethtool -e' for all devices


2)
sosreport -o networking --batch --build

generates a sos directory:

dir="/var/tmp/$(ls -t /var/tmp/ | grep sosreport -m1)"

that has _no_  ethtool_-e*  files:

(  find $dir | grep ethtool_-e   returns nothing)


3)
sosreport -o networking --batch --build -k networking.eepromdump=on

does call the "ethtool -e DEV" commands:

# dir="/var/tmp/$(ls -t /var/tmp/ | grep sosreport -m1)"
# find $dir | grep ethtool_-e
/var/tmp/sosreport-pmoravec-rhel8-2021-02-15-kfuktnj/sos_commands/networking/ethtool_-e_ens3
/var/tmp/sosreport-pmoravec-rhel8-2021-02-15-kfuktnj/sos_commands/networking/ethtool_-e_cni-podman0
/var/tmp/sosreport-pmoravec-rhel8-2021-02-15-kfuktnj/sos_commands/networking/ethtool_-e_lo
#
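
Steps 2) and 3) above could be scripted roughly as follows. This is a sketch under assumptions: the helper names newest_sos_dir and eeprom_dump_files are made up for illustration, and picking the newest directory via 'ls -t' assumes directory names without newlines, which holds for the sosreport-* names shown above.

```shell
#!/bin/bash
# newest_sos_dir [BASE]: print the most recently modified sosreport-*
# directory directly under BASE (default /var/tmp); prints nothing if
# there is none. Hypothetical helper.
newest_sos_dir() {
    local base=${1:-/var/tmp}
    # The trailing slash makes the glob match directories only;
    # -d lists the directories themselves, -t sorts newest first.
    ls -td "$base"/sosreport-*/ 2>/dev/null | head -n 1
}

# eeprom_dump_files DIR: list the 'ethtool -e' outputs collected in a
# --build directory.
eeprom_dump_files() {
    find "$1" -type f -name 'ethtool_-e_*'
}

# Example (commented out; requires a prior 'sosreport ... --build' run):
# dir=$(newest_sos_dir /var/tmp)
# eeprom_dump_files "$dir"
```

Compared with 'find $dir | grep ethtool_-e', matching on the file name with -name avoids accidental hits on other path components, and quoting "$dir" keeps the call safe for paths with spaces.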

Comment 19 Pavel Moravec 2021-02-15 11:56:54 UTC
FYI, a hotfix for RHEL 8.1 is available in the directory:

https://people.redhat.com/pmoravec/sos-3.7-9.el8_1/

(SRPM and RPMs for sos and also sos-audit are there). No official erratum is planned for 8.1.

Comment 21 errata-xmlrpc 2021-05-18 14:49:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (sos bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:1604