Bug 2220858

Summary: Setting ethtool pause causes the i40e interface to go offline
Product: Red Hat Enterprise Linux 9
Reporter: Gris Ge <fge>
Component: NetworkManager
Assignee: NetworkManager Development Team <nm-team>
Status: CLOSED MIGRATED
QA Contact: Vladimir Benes <vbenes>
Severity: medium
Priority: medium
Version: 9.2
CC: bgalvani, desktop-qa-list, ksundara, lrintel, rkhan, sfaye, sukulkar, till
Target Milestone: rc
Keywords: MigratedToJIRA, Triaged
Target Release: 9.4
Hardware: x86_64
OS: Linux
Last Closed: 2023-08-17 13:17:36 UTC
Type: Bug
Attachments:
Kernel and NetworkManager trace log (flags: none)

Description Gris Ge 2023-07-06 10:08:28 UTC
Created attachment 1974287
Kernel and NetworkManager trace log

Description of problem:

When this nmstate state is applied, the i40e interface loses link carrier:

interfaces:
- name: eno4
  type: ethernet
  state: up
  ethtool:
    pause:
      rx: false
      tx: false
      autoneg: false

These ethtool pause settings are already the current settings, and also the defaults, for this i40e NIC.


Version-Release number of selected component (if applicable):

kernel-5.14.0-283.el9.x86_64
NetworkManager-1.42.2-4.el9_2.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Find a server with an i40e NIC
2. Apply the nmstate YAML file above (change the interface name to match your environment):
   sudo nmstatectl apply ethtool_pause.yml
3. ip link show eno4
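
Step 3 can also be checked programmatically. The following is a minimal sketch, assuming the usual sysfs layout where `/sys/class/net/<ifname>/carrier` reads `1` when carrier is present; the helper names `read_carrier` and `wait_for_carrier` are hypothetical and not part of nmstate or NetworkManager.

```python
import time
from pathlib import Path


def read_carrier(ifname, sysfs_root="/sys/class/net"):
    """Return True/False for link carrier, or None if it cannot be read."""
    try:
        text = (Path(sysfs_root) / ifname / "carrier").read_text().strip()
    except OSError:
        # Interface missing, or administratively down (the read then fails).
        return None
    return text == "1"


def wait_for_carrier(ifname, timeout=10.0, interval=0.5, sysfs_root="/sys/class/net"):
    """Poll until carrier is reported or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if read_carrier(ifname, sysfs_root) is True:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Run after `nmstatectl apply`; the interface name is taken from this report.
    print("carrier:", wait_for_carrier("eno4"))
```

Polling with a timeout rather than reading once avoids reporting a failure during a short link flap.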

Actual results:

The i40e NIC loses link carrier and the link goes down.

Expected results:

The i40e NIC will not lose link carrier.

Additional info:

With `ip link set eno4 up` followed by `ethtool --pause eno4 autoneg off rx off tx off`, the link carrier is not lost.

[root@computesriovoffload-0 fge]# ethtool -i eno4
driver: i40e
version: 5.14.0-283.el9.x86_64
firmware-version: 8.40 0x8000af86 20.5.16
expansion-rom-version:
bus-info: 0000:19:00.3
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

19:00.3 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)

Comment 1 Gris Ge 2023-07-06 10:23:38 UTC
Additional info:

The `ethtool` CLI uses netlink to configure the ethtool pause settings:

sendto(3, [{nlmsg_len=60, nlmsg_type=ethtool, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=2, nlmsg_pid=0}, "\x16\x01\x00\x00\x10\x00\x01\x80\x09\x00\x02\x00\x65\x6e\x6f\x34\x00\x00\x00\x00\x05\x00\x02\x00\x00\x00\x03\x00\x05\x00\x03\x00"...], 60, 0, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, 12) = 60
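
To make the strace payload readable, here is a rough decoding sketch. It assumes a little-endian host (x86_64, as in this report) and uses constant values from the kernel's ethtool netlink UAPI header; the helper names are hypothetical. Only the bytes strace actually printed are decoded, so the truncated RX attribute and everything after it are skipped.

```python
import struct

# Values from linux/ethtool_netlink.h and linux/netlink.h.
ETHTOOL_MSG_PAUSE_SET = 22
NLA_F_NESTED = 0x8000
NLA_TYPE_MASK = 0x3FFF  # strips NLA_F_NESTED and NLA_F_NET_BYTEORDER
NLA_HDRLEN = 4

# Visible part of the payload from the strace line (strace truncated the rest).
PAYLOAD = (
    b"\x16\x01\x00\x00\x10\x00\x01\x80\x09\x00\x02\x00\x65\x6e\x6f\x34"
    b"\x00\x00\x00\x00\x05\x00\x02\x00\x00\x00\x03\x00\x05\x00\x03\x00"
)


def parse_attrs(buf):
    """Yield (type, nested, value) for each complete netlink attribute in buf."""
    off = 0
    while off + NLA_HDRLEN <= len(buf):
        nla_len, nla_type = struct.unpack_from("<HH", buf, off)
        if nla_len < NLA_HDRLEN or off + nla_len > len(buf):
            break  # truncated or malformed attribute: stop here
        value = buf[off + NLA_HDRLEN : off + nla_len]
        yield nla_type & NLA_TYPE_MASK, bool(nla_type & NLA_F_NESTED), value
        off += (nla_len + 3) & ~3  # attributes are padded to 4 bytes


def decode(payload):
    """Split the generic netlink header from the attributes that follow it."""
    cmd, version, _reserved = struct.unpack_from("<BBH", payload, 0)
    attrs = []
    for atype, nested, value in parse_attrs(payload[4:]):
        if nested:
            value = [(t, v) for t, _n, v in parse_attrs(value)]
        attrs.append((atype, value))
    return cmd, version, attrs


if __name__ == "__main__":
    cmd, version, attrs = decode(PAYLOAD)
    print(cmd == ETHTOOL_MSG_PAUSE_SET, version, attrs)
```

Read this way, the message is an `ETHTOOL_MSG_PAUSE_SET` request whose nested header attribute (type 1) carries the device name `eno4`, followed by the autoneg attribute (type 2) set to 0. The message length of 60 bytes is consistent with RX and TX attributes following in the truncated part.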


NetworkManager, by contrast, uses the older ioctl interface to configure ethtool pause:

static int
_ethtool_call_handle(SocketHandle *shandle, gpointer edata, gsize edata_size)
{
    char sbuf[50];

    return _ioctl_call("ethtool",
                       _ethtool_edata_to_string(edata, edata_size, sbuf, sizeof(sbuf)),
                       SIOCETHTOOL,
                       shandle->ifindex,
                       &shandle->fd,
                       shandle->ifname,
                       IOCTL_CALL_DATA_TYPE_IFRDATA,
                       edata,
                       edata_size,
                       NULL);
}
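
For comparison outside NetworkManager, the same ioctl path can be exercised with a short script. This is a sketch only, assuming Linux on a 64-bit host and the `struct ethtool_pauseparam` layout from `linux/ethtool.h` (four `__u32` fields: cmd, autoneg, rx_pause, tx_pause). The function names are hypothetical, and `set_pause` needs CAP_NET_ADMIN.

```python
import array
import fcntl
import socket
import struct

SIOCETHTOOL = 0x8946        # linux/sockios.h
ETHTOOL_GPAUSEPARAM = 0x12  # linux/ethtool.h
ETHTOOL_SPAUSEPARAM = 0x13


def pack_pauseparam(cmd, autoneg=0, rx=0, tx=0):
    """Pack struct ethtool_pauseparam: four native-endian __u32 fields."""
    return struct.pack("=IIII", cmd, int(autoneg), int(rx), int(tx))


def _ethtool_ioctl(ifname, edata):
    """Issue SIOCETHTOOL with ifr_data pointing at edata; return the result bytes."""
    buf = array.array("B", edata)
    addr, _length = buf.buffer_info()
    # struct ifreq: 16-byte interface name, then a union holding the data pointer.
    ifreq = struct.pack("16sP16x", ifname.encode(), addr)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        fcntl.ioctl(sock.fileno(), SIOCETHTOOL, ifreq)
    return buf.tobytes()


def get_pause(ifname):
    """Read pause settings (ETHTOOL_GPAUSEPARAM)."""
    raw = _ethtool_ioctl(ifname, pack_pauseparam(ETHTOOL_GPAUSEPARAM))
    _cmd, autoneg, rx, tx = struct.unpack("=IIII", raw)
    return {"autoneg": bool(autoneg), "rx": bool(rx), "tx": bool(tx)}


def set_pause(ifname, autoneg, rx, tx):
    """Write pause settings (ETHTOOL_SPAUSEPARAM); requires CAP_NET_ADMIN."""
    _ethtool_ioctl(ifname, pack_pauseparam(ETHTOOL_SPAUSEPARAM, autoneg, rx, tx))


if __name__ == "__main__":
    print(get_pause("eno4"))  # interface name taken from this report
```

Calling `set_pause("eno4", False, False, False)` on an interface that is already up would show whether the ioctl path alone is enough to drop the link.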

Another possible cause of this failure could be a race between bringing the link up and changing the ethtool pause settings:


Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> [1688636861.4458] ethtool[5]: ETHTOOL_GPAUSEPARAM, eno4: success
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> [1688636861.4458] ethtool[5]: get-pause: retrieved kernel pause settings
Jul 06 05:47:41 computesriovoffload-0 kernel: i40e 0000:19:00.3 eno4: Autoneg did not complete so changing settings may not result in an actual change.
Jul 06 05:47:41 computesriovoffload-0 kernel: i40e 0000:19:00.3 eno4: NIC Link is Down
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> [1688636861.6433] ethtool[5]: ETHTOOL_SPAUSEPARAM, eno4: success
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> [1688636861.6433] ethtool[5]: set-pause: set kernel puase settings
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <debug> [1688636861.6434] device[a96dcf1e04d86192] (eno4): ethtool: pause settings successfully set
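
The kernel lines in this excerpt carry only second-resolution timestamps, so they can be placed only by their order relative to the NetworkManager trace lines. The sketch below measures the window between the get and set trace lines, inside which the kernel messages were logged. The regular expression is an assumption about the journal line format seen here, and the function names are hypothetical.

```python
import re
from decimal import Decimal

# Matches e.g. "NetworkManager[29972]: <trace> [1688636861.4458] message"
NM_LINE = re.compile(r"NetworkManager\[\d+\]: <\w+>\s+\[(\d+\.\d+)\] (.*)")

# Two lines copied from the trace above.
JOURNAL = """\
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> [1688636861.4458] ethtool[5]: ETHTOOL_GPAUSEPARAM, eno4: success
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> [1688636861.6433] ethtool[5]: ETHTOOL_SPAUSEPARAM, eno4: success
"""


def nm_events(journal_text):
    """Return (timestamp, message) pairs for NetworkManager lines only."""
    events = []
    for line in journal_text.splitlines():
        match = NM_LINE.search(line)
        if match:
            events.append((Decimal(match.group(1)), match.group(2)))
    return events


def gap_between(events, first_marker, second_marker):
    """Seconds between the first events containing each marker."""
    t1 = next(t for t, msg in events if first_marker in msg)
    t2 = next(t for t, msg in events if second_marker in msg)
    return t2 - t1


if __name__ == "__main__":
    gap = gap_between(nm_events(JOURNAL), "ETHTOOL_GPAUSEPARAM", "ETHTOOL_SPAUSEPARAM")
    print(f"get -> set window: {gap} s")
```

For the two lines above the window is 0.1975 s, and the "NIC Link is Down" message appears within it.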

Comment 3 Beniamino Galvani 2023-07-06 12:01:36 UTC
> While the NetworkManager is using old ioctl way to configure ethtool pause

I don't think this should make a difference, in the end both the ioctl and the netlink message call the same driver hook (dev->ethtool_ops->set_pauseparam()) to change the settings.

Can you please also attach the output of `ethtool eno4` before changing the pause configuration?

Comment 4 Gris Ge 2023-07-07 08:03:35 UTC
This is the output when no NM connection has been created and the link is up:

[root@computesriovoffload-0 ~]# ethtool eno4
Settings for eno4:
	Supported ports: [ FIBRE ]
	Supported link modes:   10000baseT/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  10000baseT/Full
	Advertised pause frame use: No
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: 10000Mb/s
	Duplex: Full
	Auto-negotiation: off
	Port: Direct Attach Copper
	PHYAD: 0
	Transceiver: internal
	Supports Wake-on: g
	Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
	Link detected: yes



This is the output of `ethtool eno4` when the NM connection has the ethtool pause setting, which leads to link down:


Settings for eno4:
	Supported ports: [ FIBRE ]
	Supported link modes:   10000baseT/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  10000baseT/Full
	Advertised pause frame use: No
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: 10000Mb/s
	Duplex: Full
	Auto-negotiation: off
	Port: Direct Attach Copper
	PHYAD: 0
	Transceiver: internal
	Supports Wake-on: g
	Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
	Link detected: no
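
The two dumps can be compared field by field rather than by eye. A minimal sketch, assuming the `key: value` layout shown above, with lines that have no colon treated as continuations of the previous field; the function names are hypothetical, and the sample strings are abbreviated from the two outputs in this comment.

```python
def parse_ethtool(text):
    """Parse `ethtool <dev>` output into {field: value}."""
    fields = {}
    last = None
    for line in text.splitlines():
        if line.startswith("Settings for"):
            continue
        if ":" in line:
            key, _, value = line.partition(":")
            last = key.strip()
            fields[last] = value.strip()
        elif last is not None and line.strip():
            # Continuation line, e.g. the second line of "Current message level".
            fields[last] += " " + line.strip()
    return fields


def diff_fields(before, after):
    """Return {field: (before, after)} for every field whose value differs."""
    keys = sorted(set(before) | set(after))
    return {k: (before.get(k), after.get(k)) for k in keys if before.get(k) != after.get(k)}


# Abbreviated from the two outputs above.
BEFORE = "Settings for eno4:\n\tSpeed: 10000Mb/s\n\tAuto-negotiation: off\n\tLink detected: yes\n"
AFTER = "Settings for eno4:\n\tSpeed: 10000Mb/s\n\tAuto-negotiation: off\n\tLink detected: no\n"

if __name__ == "__main__":
    print(diff_fields(parse_ethtool(BEFORE), parse_ethtool(AFTER)))
```

On the outputs in this comment, the only field that differs is `Link detected`.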

Comment 5 Beniamino Galvani 2023-07-07 12:29:27 UTC
I tried again and now the link goes up but nmstate fails during the verification because the settings are not applied:

# nmstatectl set ethtool_pause.yml
INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: eno4.interface.ethtool.pause.rx desire 'false', current 'true'
INFO  nmstate::query_apply::net_state] Rollbacked to checkpoint /org/freedesktop/NetworkManager/Checkpoint/1075
NmstateError: VerificationError: Verification failure: eno4.interface.ethtool.pause.rx desire 'false', current 'true'

# ethtool -a eno4
Pause parameters for eno4:
Autonegotiate:  off
RX:             on
TX:             on

# ethtool -A eno4 rx off tx off

# ethtool -a eno4
Pause parameters for eno4:
Autonegotiate:  off
RX:             off
TX:             off

# dmesg | tail
[243697.885775] i40e 0000:19:00.3 eno4: Autoneg did not complete so changing settings may not result in an actual change.
[243698.013983] i40e 0000:19:00.3 eno4: NIC Link is Down
[243698.834701] i40e 0000:19:00.3 eno4: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None

# ip link show eno4
5: eno4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether e4:43:4b:5c:96:83 brd ff:ff:ff:ff:ff:ff
    altname enp25s0f3
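
The verification failure reported by nmstate (desired `false`, current `true`) can be reproduced independently by comparing `ethtool -a` output against the desired state. This is a rough sketch, not nmstate's own verification logic; the function names are hypothetical, and the sample text is the first `ethtool -a` output from the transcript above.

```python
def parse_pause(text):
    """Parse `ethtool -a <dev>` output into {"autoneg": bool, "rx": bool, "tx": bool}."""
    names = {"Autonegotiate": "autoneg", "RX": "rx", "TX": "tx"}
    state = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in names:
            state[names[key.strip()]] = value.strip() == "on"
    return state


def mismatches(desired, current):
    """Return {field: (desired, current)} for every field that does not match."""
    return {k: (want, current.get(k)) for k, want in desired.items() if current.get(k) != want}


# Desired state from the nmstate YAML in the description.
DESIRED = {"autoneg": False, "rx": False, "tx": False}

# First `ethtool -a eno4` output from the transcript above.
CURRENT_TEXT = """\
Pause parameters for eno4:
Autonegotiate:  off
RX:             on
TX:             on
"""

if __name__ == "__main__":
    print(mismatches(DESIRED, parse_pause(CURRENT_TEXT)))
```

An empty result would mean the kernel state matches the request. For the output above, both `rx` and `tx` mismatch, although nmstate's error names only `rx`.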

The observations are:
- the error applying the pause setting might be related to the "Autoneg did not complete so changing settings may not result in an actual change" message
- changing the pause autonegotiation is not supported and should be done by changing the link autonegotiation:
  # ethtool -A eno4 autoneg on
  netlink error: Operation not supported
  # dmesg
  i40e 0000:19:00.3 eno4: To change autoneg please use: ethtool -s <dev> autoneg <on|off>
- changing link autonegotiation is not supported by the phy:
  i40e 0000:19:00.3 eno4: Autoneg not supported on this phy