Bug 2220858
| Summary: | Setting ethtool pause will cause i40e interface goes offline | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Gris Ge <fge> | ||||
| Component: | NetworkManager | Assignee: | NetworkManager Development Team <nm-team> | ||||
| Status: | CLOSED MIGRATED | QA Contact: | Vladimir Benes <vbenes> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 9.2 | CC: | bgalvani, desktop-qa-list, ksundara, lrintel, rkhan, sfaye, sukulkar, till | ||||
| Target Milestone: | rc | Keywords: | MigratedToJIRA, Triaged | ||||
| Target Release: | 9.4 | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2023-08-17 13:17:36 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Additional info:
The `ethtool` CLI uses netlink to configure the ethtool pause settings:
sendto(3, [{nlmsg_len=60, nlmsg_type=ethtool, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=2, nlmsg_pid=0}, "\x16\x01\x00\x00\x10\x00\x01\x80\x09\x00\x02\x00\x65\x6e\x6f\x34\x00\x00\x00\x00\x05\x00\x02\x00\x00\x00\x03\x00\x05\x00\x03\x00"...], 60, 0, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, 12) = 60
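To make the strace line above easier to read, here is a minimal sketch that decodes the 32 payload bytes strace printed before truncating. It assumes little-endian byte order (the capture is from x86_64) and uses attribute numbers as I understand them from the kernel's `linux/ethtool_netlink.h`; the function names `parse_attrs` and `decode_pause_request` are made up for this illustration.

```python
import struct

# Values taken from the kernel UAPI headers (linux/ethtool_netlink.h, linux/netlink.h).
ETHTOOL_MSG_PAUSE_SET = 22
NLA_TYPE_MASK = 0x3FFF  # strips NLA_F_NESTED / NLA_F_NET_BYTEORDER flag bits
PAUSE_ATTRS = {1: "header", 2: "autoneg", 3: "rx", 4: "tx"}
HEADER_ATTRS = {1: "dev_index", 2: "dev_name", 3: "flags"}

# The payload bytes shown by strace; the remaining 12 bytes were elided ("...").
PAYLOAD = bytes.fromhex(
    "16010000" "10000180" "09000200" "656e6f34" "00000000"
    "05000200" "00000300" "05000300"
)


def parse_attrs(buf, names):
    """Walk netlink attributes: u16 length, u16 type, payload padded to 4 bytes."""
    out = {}
    off = 0
    while off + 4 <= len(buf):
        nla_len, nla_type = struct.unpack_from("<HH", buf, off)
        if nla_len < 4:
            break  # malformed attribute, stop parsing
        kind = nla_type & NLA_TYPE_MASK
        name = names.get(kind, f"attr{kind}")
        value = buf[off + 4 : off + nla_len]
        if len(value) < nla_len - 4:
            out[name] = None  # payload was cut off by the strace truncation
            break
        out[name] = value
        off += (nla_len + 3) & ~3  # advance to the next 4-byte boundary
    return out


def decode_pause_request(payload):
    """Decode a genetlink ethtool pause request payload (after the nlmsghdr)."""
    cmd, _version = struct.unpack_from("<BB", payload, 0)  # genlmsghdr is 4 bytes
    attrs = parse_attrs(payload[4:], PAUSE_ATTRS)
    header = parse_attrs(attrs["header"], HEADER_ATTRS)
    result = {"cmd": cmd, "dev_name": header["dev_name"].rstrip(b"\0").decode()}
    for key in ("autoneg", "rx", "tx"):
        if key in attrs:
            value = attrs[key]
            result[key] = None if value is None else bool(value[0])
    return result


if __name__ == "__main__":
    print(decode_pause_request(PAYLOAD))
```

Under these assumptions the request is an `ETHTOOL_MSG_PAUSE_SET` for `eno4` with autoneg off; the rx and tx values fall in the part of the buffer that strace did not print.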
While NetworkManager is using the old ioctl way to configure ethtool pause:
static int
_ethtool_call_handle(SocketHandle *shandle, gpointer edata, gsize edata_size)
{
    char sbuf[50];

    return _ioctl_call("ethtool",
                       _ethtool_edata_to_string(edata, edata_size, sbuf, sizeof(sbuf)),
                       SIOCETHTOOL,
                       shandle->ifindex,
                       &shandle->fd,
                       shandle->ifname,
                       IOCTL_CALL_DATA_TYPE_IFRDATA,
                       edata,
                       edata_size,
                       NULL);
}
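For anyone who wants to exercise the ioctl path without NetworkManager, the following is a minimal sketch of a `SIOCETHTOOL` call with `ETHTOOL_GPAUSEPARAM` / `ETHTOOL_SPAUSEPARAM`. It is not NetworkManager's code: the helper names are hypothetical, the constants are the values I know from `linux/sockios.h` and `linux/ethtool.h`, and the `struct ifreq` packing assumes a Linux system where Python's native `P` format matches the kernel's pointer size.

```python
import array
import fcntl
import socket
import struct

SIOCETHTOOL = 0x8946        # linux/sockios.h
ETHTOOL_GPAUSEPARAM = 0x12  # linux/ethtool.h
ETHTOOL_SPAUSEPARAM = 0x13


def pack_pauseparam(cmd, autoneg=False, rx=False, tx=False):
    """Build struct ethtool_pauseparam: cmd, autoneg, rx_pause, tx_pause (all u32)."""
    return struct.pack("=4I", cmd, int(autoneg), int(rx), int(tx))


def unpack_pauseparam(raw):
    """Turn a struct ethtool_pauseparam buffer back into booleans."""
    _cmd, autoneg, rx, tx = struct.unpack("=4I", raw)
    return {"autoneg": bool(autoneg), "rx": bool(rx), "tx": bool(tx)}


def ethtool_ioctl(ifname, edata):
    """Issue SIOCETHTOOL with ifr_data pointing at a mutable copy of edata."""
    buf = array.array("B", edata)
    addr, _count = buf.buffer_info()
    # struct ifreq starts with ifr_name[16], followed by the ifr_data pointer.
    ifreq = struct.pack("16sP", ifname.encode(), addr)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        fcntl.ioctl(sock.fileno(), SIOCETHTOOL, ifreq)
    return buf.tobytes()  # the kernel writes its reply into buf


def get_pause(ifname):
    """Read the current pause settings of ifname."""
    return unpack_pauseparam(ethtool_ioctl(ifname, pack_pauseparam(ETHTOOL_GPAUSEPARAM)))


def set_pause(ifname, autoneg, rx, tx):
    """Change pause settings; needs CAP_NET_ADMIN and may bounce the link."""
    ethtool_ioctl(ifname, pack_pauseparam(ETHTOOL_SPAUSEPARAM, autoneg, rx, tx))
```

Calling `get_pause("eno4")` and then `set_pause("eno4", False, False, False)` as root would roughly mirror the get-pause / set-pause sequence NetworkManager logs, which may help tell a driver problem apart from a NetworkManager one.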
Another possible cause of this failure is a race between bringing the link up and changing the ethtool pause settings:
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> [1688636861.4458] ethtool[5]: ETHTOOL_GPAUSEPARAM, eno4: success
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> [1688636861.4458] ethtool[5]: get-pause: retrieved kernel pause settings
Jul 06 05:47:41 computesriovoffload-0 kernel: i40e 0000:19:00.3 eno4: Autoneg did not complete so changing settings may not result in an actual change.
Jul 06 05:47:41 computesriovoffload-0 kernel: i40e 0000:19:00.3 eno4: NIC Link is Down
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> [1688636861.6433] ethtool[5]: ETHTOOL_SPAUSEPARAM, eno4: success
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> [1688636861.6433] ethtool[5]: set-pause: set kernel puase settings
Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <debug> [1688636861.6434] device[a96dcf1e04d86192] (eno4): ethtool: pause settings successfully set
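To put a number on the window in which the kernel reported "NIC Link is Down", here is a small sketch that extracts the NetworkManager trace timestamps from log lines like the ones above and measures the gap between the get and set calls. The regular expression is written against the exact line format shown in this log and is an assumption about other NetworkManager versions; `pause_window` is a hypothetical helper name.

```python
import re
from decimal import Decimal

# Matches e.g. "[1688636861.4458] ethtool[5]: ETHTOOL_GPAUSEPARAM, eno4: success"
NM_LINE = re.compile(
    r"\[(\d+\.\d+)\] ethtool\[\d+\]: (ETHTOOL_[GS]PAUSEPARAM), (\S+): success"
)


def pause_window(lines):
    """Seconds between the first ETHTOOL_GPAUSEPARAM and the following ETHTOOL_SPAUSEPARAM."""
    start = None
    for line in lines:
        match = NM_LINE.search(line)
        if not match:
            continue  # kernel messages and unrelated lines are skipped
        stamp, cmd = Decimal(match.group(1)), match.group(2)
        if cmd == "ETHTOOL_GPAUSEPARAM" and start is None:
            start = stamp
        elif cmd == "ETHTOOL_SPAUSEPARAM" and start is not None:
            return stamp - start
    return None


if __name__ == "__main__":
    log = [
        "Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> "
        "[1688636861.4458] ethtool[5]: ETHTOOL_GPAUSEPARAM, eno4: success",
        "Jul 06 05:47:41 computesriovoffload-0 kernel: i40e 0000:19:00.3 eno4: NIC Link is Down",
        "Jul 06 05:47:41 computesriovoffload-0 NetworkManager[29972]: <trace> "
        "[1688636861.6433] ethtool[5]: ETHTOOL_SPAUSEPARAM, eno4: success",
    ]
    print(pause_window(log))
```

For the lines in this log, the two calls are roughly 0.2 seconds apart, and both kernel messages are logged between them.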
> While the NetworkManager is using old ioctl way to configure ethtool pause
I don't think this should make a difference; in the end, both the ioctl and the netlink message call the same driver hook (dev->ethtool_ops->set_pauseparam()) to change the settings.
Can you please also attach the output of `ethtool eno4` before changing the pause configuration?
This is the output when no NM connection has been created and the link is up:
[root@computesriovoffload-0 ~]# ethtool eno4
Settings for eno4:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: 10000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 10000Mb/s
Duplex: Full
Auto-negotiation: off
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Supports Wake-on: g
Wake-on: g
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
This is the output of `ethtool eno4` when the NM connection has the ethtool pause setting, which leads to link down:
Settings for eno4:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: 10000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 10000Mb/s
Duplex: Full
Auto-negotiation: off
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Supports Wake-on: g
Wake-on: g
Current message level: 0x00000007 (7)
drv probe link
Link detected: no
I tried again, and now the link comes up, but nmstate fails during verification because the settings are not applied:
# nmstatectl set ethtool_pause.yml
INFO nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: eno4.interface.ethtool.pause.rx desire 'false', current 'true'
INFO nmstate::query_apply::net_state] Rollbacked to checkpoint /org/freedesktop/NetworkManager/Checkpoint/1075
NmstateError: VerificationError: Verification failure: eno4.interface.ethtool.pause.rx desire 'false', current 'true'
# ethtool -a eno4
Pause parameters for eno4:
Autonegotiate: off
RX: on
TX: on
# ethtool -A eno4 rx off tx off
# ethtool -a eno4
Pause parameters for eno4:
Autonegotiate: off
RX: off
TX: off
# dmesg | tail
[243697.885775] i40e 0000:19:00.3 eno4: Autoneg did not complete so changing settings may not result in an actual change.
[243698.013983] i40e 0000:19:00.3 eno4: NIC Link is Down
[243698.834701] i40e 0000:19:00.3 eno4: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
# ip link show eno4
5: eno4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether e4:43:4b:5c:96:83 brd ff:ff:ff:ff:ff:ff
altname enp25s0f3
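The check that nmstate performs during verification can be approximated by hand. The sketch below parses `ethtool -a` output like that shown in the transcript above and reports which pause fields differ from a desired state. It is an illustration only, not nmstate's implementation; `parse_pause_output` and `pause_diff` are hypothetical names.

```python
def parse_pause_output(text):
    """Parse `ethtool -a <dev>` output into {'autoneg': bool, 'rx': bool, 'tx': bool}."""
    keys = {"Autonegotiate": "autoneg", "RX": "rx", "TX": "tx"}
    state = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        key = keys.get(name.strip())
        value = value.strip()
        # Lines such as "Pause parameters for eno4:" have no known key and are skipped.
        if sep and key and value in ("on", "off"):
            state[key] = value == "on"
    return state


def pause_diff(desired, current):
    """Return {field: (desired, current)} for every field that does not match."""
    return {
        key: (want, current.get(key))
        for key, want in desired.items()
        if current.get(key) != want
    }


if __name__ == "__main__":
    before = """Pause parameters for eno4:
Autonegotiate: off
RX: on
TX: on
"""
    desired = {"autoneg": False, "rx": False, "tx": False}
    print(pause_diff(desired, parse_pause_output(before)))
```

With the first `ethtool -a` output from the transcript, the rx and tx fields differ from the desired state, which is consistent with the `pause.rx desire 'false', current 'true'` verification error.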
The observations are:
- the error applying the pause setting might be related to the "Autoneg did not complete so changing settings may not result in an actual change" message
- changing the pause autonegotiation is not supported and should be done by changing the link autonegotiation:
# ethtool -A eno4 autoneg on
netlink error: Operation not supported
# dmesg
i40e 0000:19:00.3 eno4: To change autoneg please use: ethtool -s <dev> autoneg <on|off>
- changing link autonegotiation is not supported by the phy:
i40e 0000:19:00.3 eno4: Autoneg not supported on this phy
Created attachment 1974287: Kernel and NetworkManager trace log

Description of problem:
When applying this nmstate state, the i40e interface loses link carrier:

interfaces:
- name: eno4
  type: ethernet
  state: up
  ethtool:
    pause:
      rx: false
      tx: false
      autoneg: false

The ethtool pause setting is actually the current and also the default setting for this i40e NIC.

Version-Release number of selected component (if applicable):
kernel-5.14.0-283.el9.x86_64
NetworkManager-1.42.2-4.el9_2.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Find a server with an i40e NIC.
2. Apply the above nmstate YAML file (change the interface name to match your environment):
   sudo nmstatectl apply ethtool_pause.yml
3. ip link show eno4

Actual results:
The i40e NIC loses link carrier and the link goes down.

Expected results:
The i40e NIC does not lose link carrier.

Additional info:
With `ip link set eno4 up` followed by `ethtool --pause eno4 autoneg off rx off tx off`, the link carrier is not lost.

[root@computesriovoffload-0 fge]# ethtool -i eno4
driver: i40e
version: 5.14.0-283.el9.x86_64
firmware-version: 8.40 0x8000af86 20.5.16
expansion-rom-version:
bus-info: 0000:19:00.3
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

19:00.3 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
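For repeated reproduction attempts on different interfaces, the desired state from the description can be generated rather than edited by hand. This is a minimal sketch with hypothetical helper names (`pause_state`, `has_carrier`). It emits JSON, which I believe `nmstatectl apply` accepts alongside YAML, and reads the carrier flag from sysfs; the sysfs path is the usual Linux location.

```python
import json
from pathlib import Path


def pause_state(ifname, rx=False, tx=False, autoneg=False):
    """Build the nmstate desired state from the bug description for ifname."""
    return {
        "interfaces": [
            {
                "name": ifname,
                "type": "ethernet",
                "state": "up",
                "ethtool": {"pause": {"rx": rx, "tx": tx, "autoneg": autoneg}},
            }
        ]
    }


def has_carrier(ifname, sysfs=Path("/sys/class/net")):
    """True if <sysfs>/<ifname>/carrier reads '1'; False if down or unreadable."""
    try:
        return (sysfs / ifname / "carrier").read_text().strip() == "1"
    except OSError:
        # Missing interface, or the interface is administratively down.
        return False


if __name__ == "__main__":
    # Write the state to a file that can then be passed to `nmstatectl apply`.
    Path("ethtool_pause.json").write_text(json.dumps(pause_state("eno4"), indent=2))
    print("carrier before apply:", has_carrier("eno4"))
```

After applying the generated file, calling `has_carrier("eno4")` again gives a quick pass/fail signal equivalent to checking for `LOWER_UP` in `ip link show eno4`.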