Bug 1640356

Summary: sosreport executes ethtool commands on incorrect vlan device
Product: Red Hat Enterprise Linux 7 Reporter: Jonathan Maxwell <jmaxwell>
Component: sos Assignee: Pavel Moravec <pmoravec>
Status: CLOSED ERRATA QA Contact: Miroslav Hradílek <mhradile>
Severity: high Docs Contact:
Priority: high    
Version: 7.5 CC: agk, bmr, cww, gavin, plambri, ptalbert, sbradley
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: sos-3.7-1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-08-06 13:15:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1594286, 1648022    
Attachments:
Description Flags
amended patch to exclude bonding_masters none

Description Jonathan Maxwell 2018-10-17 21:15:44 UTC
Description of problem:

Add a vlan device:

# ip link add link br0 name br0.100 type vlan id 100

14: br0.100@br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000

# ethtool -k br0.100
Features for br0.100:
rx-checksumming: off [fixed]
tx-checksumming: on
	tx-checksum-ipv4: off [fixed]
	tx-checksum-ip-generic: on
	tx-checksum-ipv6: off [fixed]
	tx-checksum-fcoe-crc: off
	tx-checksum-sctp: off

Generate a sosreport:

# sosreport

The sosreport will have a file called:

sos_command/networking/ethtool_-k_br0.100_br

It contains:

Cannot get device feature names: No such device

Or

ethtool: bad command line argument(s)
For more information run ethtool -h

sosreport should not append _<interface_name> to the device name.

Version-Release number of selected component (if applicable):

# rpm -qf `which sosreport`
sos-3.5-9.el7_5.noarch

How reproducible:

Always as above.

Actual results:

Sosreport uses the incorrect vlan device name, and consequently this valuable ethtool information is missing from sosreports.

Expected results:

Use the correct device name with ethtool commands.

Comment 2 Pavel Moravec 2018-10-18 08:26:16 UTC
Nice catch. The problem originates from where we get the link names (ip -o link):

# ip -o link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000\    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000\    link/ether 00:1a:4a:22:39:27 brd ff:ff:ff:ff:ff:ff
3: br0.100@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000\    link/ether 00:1a:4a:22:39:27 brd ff:ff:ff:ff:ff:ff
4: br0.10@test@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000\    link/ether 00:1a:4a:22:39:27 brd ff:ff:ff:ff:ff:ff
#

Here we take the 2nd column without the trailing @NONE (I have no idea right now why we do that).

We must identify the link names by some other means.

Comment 3 Pavel Moravec 2018-10-18 08:33:45 UTC
(Note in the example above that a vlan device can have '@' in its name, as (I think) can an eth device, so we can't say for sure what br0.100@eth0 or br0.10@test@eth0 is: an eth name, or a vlan name with the eth name appended?)

Any idea on how to list interface names uniquely and precisely is welcome.
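
The ambiguity can be illustrated with a short Python sketch. It is not the sos code: link_name_candidates is a hypothetical helper, and it assumes (as suggested above) that any '@' in the 2nd column may or may not be part of the device name. It only enumerates the readings that the text of `ip -o link` allows.

```python
def link_name_candidates(ip_o_link_line):
    """Return every device name the 2nd column of an `ip -o link` line could mean.

    Hypothetical helper for illustration only, not part of sos.
    """
    # The 2nd colon-separated field holds the name, e.g. " br0.10@test@eth0".
    field = ip_o_link_line.split(":", 2)[1].strip()
    parts = field.split("@")
    # Every prefix that ends at an '@' boundary is a possible device name.
    return ["@".join(parts[:i]) for i in range(1, len(parts) + 1)]


line = "4: br0.10@test@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN"
for name in link_name_candidates(line):
    print(name)
```

For the last line of the example output this yields three candidates (br0.10, br0.10@test and br0.10@test@eth0), and the text alone gives no way to choose between them.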

Comment 4 Patrick Talbert 2018-10-18 09:42:44 UTC
Created attachment 1495228 [details]
example patch to use sysfs to collect interface names

Hehe I looked at the networking module and its get_eth_interfaces() function and kinda cried a little bit.

For me, a much cleaner and fool-proof way to get a list of net devs is to just look at the /sys/class/net/ directory contents. You will not have to do any goofy substring matching or splits or anything like that...

$ ls -1 /sys/class/net
br0
enp0s31f6
lo
tun0
virbr0
virbr0-nic
virbr1
virbr1-nic
virbr4
virbr4-nic
virbr5
virbr5-nic
vnet0
vnet1
vnet2
wlp58s0


I am not so good at python but how about this?
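
For readers without access to the attachment, a minimal sketch of the sysfs idea in Python might look as follows. This is not the attached patch: list_netdevs and its sysfs_net parameter are hypothetical names, and the sketch only assumes that each entry under /sys/class/net corresponds to one network device.

```python
import os


def list_netdevs(sysfs_net="/sys/class/net"):
    """Return the sorted entry names found under sysfs_net.

    Hypothetical helper for illustration; returns an empty list when
    the directory cannot be read (e.g. on a system without sysfs).
    """
    try:
        return sorted(os.listdir(sysfs_net))
    except OSError:
        return []


if __name__ == "__main__":
    # Names come straight from the kernel, so no '@' parsing is needed.
    for dev in list_netdevs():
        print(dev)
```

Because the names are directory entries, a vlan device such as br0.100 is reported exactly as br0.100, without any appended parent name.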

Comment 5 Pavel Moravec 2018-10-18 18:37:31 UTC
(In reply to Patrick Talbert from comment #4)
> Created attachment 1495228 [details]
> example patch to use sysfs to collect interface names
> 
> Hehe I looked at the networking module and its get_eth_interfaces() function
> and kinda cried a little bit.
> 
> For me, a much cleaner and fool-proof way to get a list of net devs is to
> just look at the /sys/class/net/ directory contents. You will not have to do
> any goofy substring matching or splits or anything like that...
> 
> $ ls -1 /sys/class/net
> br0
> enp0s31f6
> lo
> tun0
> virbr0
> virbr0-nic
> virbr1
> virbr1-nic
> virbr4
> virbr4-nic
> virbr5
> virbr5-nic
> vnet0
> vnet1
> vnet2
> wlp58s0
> 
> 
> I am not so good at python but how about this?

That is easy to implement, but is there a similar way to get an equivalent list of interfaces for a given namespace?

I ask because we use the same get_eth_interfaces method to generate the list of commands such as:

ip netns exec $namespace ethtool -k $eth

(for each $eth in $namespace)


(I am a networking noob, so I might be asking a basic question here.)


Thanks in advance for any advice.

Comment 6 Patrick Talbert 2018-10-19 06:49:21 UTC
Created attachment 1495545 [details]
amended patch to exclude bonding_masters

Hey Pavel,

Yeah, the patch handles that as well:

# Devices that exist in a namespace use less ethtool
# parameters. Run this per namespace.
for namespace in self.get_ip_netns(ip_netns_file):
    ns_cmd_prefix = cmd_prefix + namespace + " "
    netns_netdev_list = self.call_ext_prog(ns_cmd_prefix +
                                           "ls -1 /sys/class/net/")
    for eth in netns_netdev_list['output'].splitlines():
        self.add_cmd_output([
            ns_cmd_prefix + "ethtool " + eth,
            ns_cmd_prefix + "ethtool -i " + eth,
            ns_cmd_prefix + "ethtool -k " + eth,
            ns_cmd_prefix + "ethtool -S " + eth
        ])



One thing I did realize is that if the bonding module is loaded, it creates a *file* called 'bonding_masters' under /sys/class/net/. So we want to exclude that from the list (in both the default and extra namespaces).

I am submitting a second patch here which skips add_cmd_output if it sees 'bonding_masters'.
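
The exclusion can be sketched in standalone Python as below. This is an illustration, not the attached patch: ethtool_cmds is a hypothetical function that only builds command strings (the real plugin passes them to add_cmd_output), and the example namespace name ns1 is made up.

```python
def ethtool_cmds(netdevs, cmd_prefix=""):
    """Build ethtool command strings for each device, skipping bonding_masters.

    Hypothetical helper for illustration. cmd_prefix can carry a
    namespace prefix such as "ip netns exec <namespace> ".
    """
    cmds = []
    for dev in netdevs:
        # bonding_masters is a plain file created by the bonding module,
        # not a network device, so ethtool must not be run against it.
        if dev == "bonding_masters":
            continue
        for opt in ("", "-i ", "-k ", "-S "):
            cmds.append(cmd_prefix + "ethtool " + opt + dev)
    return cmds


if __name__ == "__main__":
    for cmd in ethtool_cmds(["bond0", "bonding_masters"], "ip netns exec ns1 "):
        print(cmd)
```

Matching on the name keeps the check identical for the default namespace and for the `ls -1` output collected inside each extra namespace.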


What do you think? At least on my system, it creates a valid report with the expected ethtool output for both the default namespace and custom namespaces.

Patrick

Comment 7 Pavel Moravec 2018-10-20 08:45:44 UTC
Indeed the patch works well there also.

Thanks for the prompt feedback, and definitely also for the patch. After adding some comments, I proposed it upstream:

https://github.com/sosreport/sos/pull/1458


(devel_ack+ for 7.7; it is too late for 7.6 in any case)

Comment 8 Pavel Moravec 2019-03-18 19:17:30 UTC
POSTed to upstream

Comment 9 Pavel Moravec 2019-03-18 19:17:47 UTC
POSTed to upstream

Comment 13 errata-xmlrpc 2019-08-06 13:15:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2295