Bug 2089558

Summary: [E810]Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 36)
Product: Red Hat Enterprise Linux 9
Reporter: mhou <mhou>
Component: tuned
Assignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED NOTABUG
QA Contact: mhou <mhou>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 9.0
CC: jeder, jskarvad, jzerdik, lcapitulino, nilal, ppandit
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-06-10 16:03:06 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description mhou 2022-05-24 03:09:06 UTC
Description of problem:
When the realtime-virtual-host profile is configured in tuned, tuned reports the following error in its log: "Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 36)"

Version-Release number of selected component (if applicable):

kernel version:5.14.0-70.13.1.el9_0.x86_64

tuned: tuned-2.18.0-1.el9.noarch.rpm   
tuned-profiles-nfv-2.18.0-1.el9.noarch.rpm

# ethtool -i ens1f0
driver: ice
version: 5.14.0-70.13.1.el9_0.x86_64
firmware-version: 3.00 0x80008944 20.5.13
expansion-rom-version: 
bus-info: 0000:3b:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

How reproducible: 100%


Steps to Reproduce:
1. configure isolated cpu as below:
# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  48
  On-line CPU(s) list:   0-47
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        Intel
  Model name:            Intel(R) Xeon(R) Silver 4214R CPU @ 2.40GHz
    BIOS Model name:     Intel(R) Xeon(R) Silver 4214R CPU @ 2.40GHz
    CPU family:          6
    Model:               85
    Thread(s) per core:  2
    Core(s) per socket:  12
    Socket(s):           2
    Stepping:            7
    BogoMIPS:            4800.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   768 KiB (24 instances)
  L1i:                   768 KiB (24 instances)
  L2:                    24 MiB (24 instances)
  L3:                    33 MiB (2 instances)
NUMA:                    
  NUMA node(s):          2
  NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46
  NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47
Vulnerabilities:         
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
  Srbds:                 Not affected
  Tsx async abort:       Mitigation; TSX disabled

# cat /etc/tuned/realtime-virtual-host-variables.conf 
isolated_cores=2,4,6,8,10,12,14,16,18,20,22,26,28,30,32,34,36,38,40,42,44,46
isolate_managed_irq=Y

2. enable tuned profile
# tuned-adm profile realtime-virtual-host

3. reboot system and monitor tuned.log
2022-05-18 03:36:55,681 INFO     tuned.daemon.application: TuneD: 2.18.0, kernel: 5.14.0-70.13.1.rt21.83.el9_0.x86_64
2022-05-18 03:36:55,687 INFO     tuned.daemon.application: dynamic tuning is globally disabled
2022-05-18 03:36:55,737 INFO     tuned.daemon.daemon: using sleep interval of 1 second(s)
2022-05-18 03:36:55,760 INFO     tuned.profiles.loader: loading profile: realtime-virtual-host
2022-05-18 03:36:55,806 INFO     tuned.daemon.controller: starting controller
2022-05-18 03:36:55,807 INFO     tuned.daemon.daemon: starting tuning
2022-05-18 03:36:55,876 INFO     tuned.plugins.base: instance cpu: assigning devices cpu5, cpu32, cpu27, cpu2, cpu43, cpu3, cpu7, cpu34, cpu38, cpu9, cpu47, cpu44, cpu18, cpu40, cpu14, cpu13, cpu6, cpu41, cpu23, cpu39, cpu0, cpu30, cpu33, cpu4, cpu45, cpu20, cpu1, cpu16, cpu37, cpu31, cpu29, cpu10, cpu25, cpu17, cpu11, cpu28, cpu35, cpu46, cpu8, cpu19, cpu24, cpu36, cpu12, cpu42, cpu15, cpu21, cpu22, cpu26
2022-05-18 03:36:55,893 INFO     tuned.plugins.plugin_cpu: We are running on an x86 GenuineIntel platform
2022-05-18 03:36:55,915 INFO     tuned.plugins.base: instance net: assigning devices ens1f0, ens2f0, eno4, eno2, eno3, ens2f1, ens1f1, eno1
2022-05-18 03:36:56,115 INFO     tuned.plugins.plugin_rtentsk: opened SOF_TIMESTAMPING_OPT_TX_SWHW socket
2022-05-18 03:36:56,142 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu5'
2022-05-18 03:36:56,152 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu32'
2022-05-18 03:36:56,161 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu27'
2022-05-18 03:36:56,175 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu2'
2022-05-18 03:36:56,186 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu43'
2022-05-18 03:36:56,197 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu3'
2022-05-18 03:36:56,208 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu7'
2022-05-18 03:36:56,218 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu34'
2022-05-18 03:36:56,229 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu38'
2022-05-18 03:36:56,239 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu9'
2022-05-18 03:36:56,250 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu47'
2022-05-18 03:36:56,261 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu44'
2022-05-18 03:36:56,272 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu18'
2022-05-18 03:36:56,287 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu40'
2022-05-18 03:36:56,297 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu14'
2022-05-18 03:36:56,305 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu13'
2022-05-18 03:36:56,312 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu6'
2022-05-18 03:36:56,320 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu41'
2022-05-18 03:36:56,327 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu23'
2022-05-18 03:36:56,334 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu39'
2022-05-18 03:36:56,341 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu0'
2022-05-18 03:36:56,348 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu30'
2022-05-18 03:36:56,355 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu33'
2022-05-18 03:36:56,363 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu4'
2022-05-18 03:36:56,370 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu45'
2022-05-18 03:36:56,377 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu20'
2022-05-18 03:36:56,384 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu1'
2022-05-18 03:36:56,391 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu16'
2022-05-18 03:36:56,398 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu37'
2022-05-18 03:36:56,404 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu31'
2022-05-18 03:36:56,411 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu29'
2022-05-18 03:36:56,418 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu10'
2022-05-18 03:36:56,425 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu25'
2022-05-18 03:36:56,432 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu17'
2022-05-18 03:36:56,439 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu11'
2022-05-18 03:36:56,446 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu28'
2022-05-18 03:36:56,453 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu35'
2022-05-18 03:36:56,460 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu46'
2022-05-18 03:36:56,467 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu8'
2022-05-18 03:36:56,474 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu19'
2022-05-18 03:36:56,480 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu24'
2022-05-18 03:36:56,487 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu36'
2022-05-18 03:36:56,494 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu12'
2022-05-18 03:36:56,501 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu42'
2022-05-18 03:36:56,508 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu15'
2022-05-18 03:36:56,515 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu21'
2022-05-18 03:36:56,522 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu22'
2022-05-18 03:36:56,529 INFO     tuned.plugins.plugin_cpu: energy_perf_bias successfully set to 'performance' on cpu 'cpu26'
2022-05-18 03:36:56,529 INFO     tuned.plugins.plugin_cpu: setting new cpu latency 1
2022-05-18 03:36:56,530 ERROR    tuned.plugins.plugin_sysctl: Failed to read sysctl parameter 'net.core.busy_read', the parameter does not exist
2022-05-18 03:36:56,530 ERROR    tuned.plugins.plugin_sysctl: sysctl option net.core.busy_read will not be set, failed to read the original value.
2022-05-18 03:36:56,530 ERROR    tuned.plugins.plugin_sysctl: Failed to read sysctl parameter 'net.core.busy_poll', the parameter does not exist
2022-05-18 03:36:56,530 ERROR    tuned.plugins.plugin_sysctl: sysctl option net.core.busy_poll will not be set, failed to read the original value.
2022-05-18 03:36:56,530 ERROR    tuned.plugins.plugin_sysctl: Failed to read sysctl parameter 'kernel.numa_balancing', the parameter does not exist
2022-05-18 03:36:56,531 ERROR    tuned.plugins.plugin_sysctl: sysctl option kernel.numa_balancing will not be set, failed to read the original value.
2022-05-18 03:36:56,534 INFO     tuned.plugins.plugin_sysctl: reapplying system sysctl
2022-05-18 03:36:56,534 WARNING  tuned.plugins.plugin_vm: Option 'transparent_hugepages' is not supported on current hardware.
2022-05-18 03:36:56,535 INFO     tuned.plugins.plugin_bootloader: installing additional boot command line parameters to grub2
2022-05-18 03:36:56,627 WARNING  tuned.profiles.functions.function_check_net_queue_count: net-dev queue count is not correctly specified, setting it to HK CPUs 26

2022-05-18 03:36:57,230 ERROR    tuned.utils.commands: Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 36)
netlink error: Invalid argument
2022-05-18 03:36:57,237 ERROR    tuned.utils.commands: Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 36)
netlink error: Invalid argument
2022-05-18 03:36:57,243 ERROR    tuned.utils.commands: Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 36)
netlink error: Invalid argument
2022-05-18 03:36:57,816 ERROR    tuned.utils.commands: Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 36)
netlink error: Invalid argument
2022-05-18 03:36:57,877 INFO     tuned.plugins.plugin_script: calling script '/usr/lib/tuned/realtime/script.sh' with arguments '['start']'
2022-05-18 03:36:57,921 INFO     tuned.plugins.plugin_script: calling script '/usr/lib/tuned/realtime-virtual-host/script.sh' with arguments '['start']'
2022-05-18 03:36:58,403 INFO     tuned.daemon.daemon: static tuning from profile 'realtime-virtual-host' applied


Actual results:
got ERROR message about "Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 36)"

Expected results:
no more ERROR message about ethtool

Additional info:
A discussion of how ethtool sets up channels is in bug 2086137.

When using ethtool to configure multiple channels, you need to set both rx and tx, or set combined. I wonder whether tuned sets only tx or only rx when calling ethtool?

Comment 1 mhou 2022-05-24 03:12:54 UTC
+ Luiz for visibility

Do you know how realtime-virtual-host sets up the Ethernet channel?

Comment 2 Luiz Capitulino 2022-05-24 14:28:41 UTC
Prasad,

Would you please check this for us? You might want to check with Nitesh about ethtool usage in our profiles, I don't remember all details.

Comment 3 Nitesh Narayan Lal 2022-05-24 15:02:07 UTC
From what I recall, tuned should already be setting either both rx and tx together or the Combined count, depending on the mode the network device supports.
If that is not happening, then we should review tuned/plugins/plugin_net.py - _replace_channels_parameters().
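
As an illustration of the behaviour described above, here is a minimal sketch of picking the `ethtool -L` arguments from the mode a device reports. It is not the code in plugin_net.py; the function name `build_channel_args`, the dictionary layout, and the device `nic0` with its maximum of 64 are assumptions made up for the example.

```python
def build_channel_args(device, count, maximums):
    """Build an `ethtool -L` argument list for the mode the device supports.

    `maximums` maps 'rx', 'tx' and 'combined' to the pre-set maximum
    reported by `ethtool -l`, or to None where ethtool prints n/a.
    (Hypothetical helper, not part of tuned.)
    """
    # Prefer combined channels when the device exposes them.
    if maximums.get("combined"):
        return ["ethtool", "-L", device, "combined", str(count)]
    # Otherwise set rx and tx together, skipping any type reported as n/a.
    args = ["ethtool", "-L", device]
    for kind in ("rx", "tx"):
        if maximums.get(kind):
            args += [kind, str(count)]
    return args


# A device with separate RX/TX maximums and no combined channels.
print(build_channel_args("eno4", 2, {"rx": 4, "tx": 4, "combined": None}))
# A hypothetical device that only supports combined channels.
print(build_channel_args("nic0", 8, {"rx": None, "tx": None, "combined": 64}))
```

The function only builds the argument list, so the resulting command line can be printed before anything is executed.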

Having said that, from Bug 2086137 please note:

The error "tuned.utils.commands: Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 36)" 
and what Minxi is reporting by manually running the ethtool command:

# ethtool -L ens1f0 rx 48
netlink error: Invalid argument

are two different errors.

A simple test would be to print the command line that is being used by tuned at the time "requested channel count exceeds maximum (offset 36)" is thrown along with the network device name.
Minxi, can you do that?

Thanks

Comment 4 Nitesh Narayan Lal 2022-05-24 15:02:58 UTC
My bad, restoring the needinfo for Prasad.

Comment 5 Nitesh Narayan Lal 2022-05-24 17:31:18 UTC
Minxi, can you share the environment where you are reproducing the issue for us to have a look?
Thanks

Comment 6 mhou 2022-05-25 02:17:31 UTC
Hello Nitesh

I have already sent the root credentials to you by email. Please check it.

Comment 7 Nitesh Narayan Lal 2022-05-25 02:58:07 UTC
As mentioned in the previous comment the error is not tuned-specific.

[root@dell-per740-77 ~]# ethtool -l eno4
Channel parameters for eno4:
Pre-set maximums:
RX:		4
TX:		4
Other:		n/a
Combined:	n/a
Current hardware settings:
RX:		4
TX:		1
Other:		n/a
Combined:	n/a


[root@dell-per740-77 ~]# ethtool -L eno4 rx 26 tx 26
netlink error: requested channel count exceeds maximum (offset 36)
netlink error: Invalid argument


[root@dell-per740-77 ~]# ethtool -L eno4 rx 2 tx 2
[root@dell-per740-77 ~]# ethtool -l eno4
Channel parameters for eno4:
Pre-set maximums:
RX:		4
TX:		4
Other:		n/a
Combined:	n/a
Current hardware settings:
RX:		2
TX:		2
Other:		n/a
Combined:	n/a


The reason why ethtool and hence tuned is throwing an error is simply that with HK CPUs set to 26 we are trying to set the rx and tx count with a value greater than the maximum allowed limit for the network device. As shared above the same error can be reproduced with the ethtool command line as well.

With tuned, configuring the HK CPUs as 2, or any value up to the maximum of 4, makes the error go away in tuned.log.

Please note that tuned also supports per device queue count configuration if a value other than the HK CPUs needs to be configured on a per device basis. The queue count value should be defined based on the use case that you are trying to test.
I hope that helps.
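
The check described above can be sketched in a few lines: parse the `ethtool -l` output and compare the requested count with the pre-set maximums. This is only an illustration using the eno4 output from this comment; `parse_channels` and `exceeds_maximum` are hypothetical names, not tuned or ethtool APIs.

```python
SAMPLE = """\
Channel parameters for eno4:
Pre-set maximums:
RX:             4
TX:             4
Other:          n/a
Combined:       n/a
Current hardware settings:
RX:             4
TX:             1
Other:          n/a
Combined:       n/a
"""


def parse_channels(text):
    """Parse `ethtool -l` output into {'max': {...}, 'current': {...}}.

    Values printed as n/a are stored as None.
    """
    result = {"max": {}, "current": {}}
    section = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Pre-set maximums"):
            section = "max"
        elif line.startswith("Current hardware settings"):
            section = "current"
        elif section and ":" in line:
            key, value = (part.strip() for part in line.split(":", 1))
            result[section][key.lower()] = int(value) if value.isdigit() else None
    return result


def exceeds_maximum(channels, requested):
    """Return the channel types whose requested count is above the maximum."""
    return sorted(
        kind for kind, count in requested.items()
        if channels["max"].get(kind) is not None and count > channels["max"][kind]
    )


channels = parse_channels(SAMPLE)
print(exceeds_maximum(channels, {"rx": 26, "tx": 26}))  # ['rx', 'tx']
print(exceeds_maximum(channels, {"rx": 2, "tx": 2}))    # []
```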

Comment 9 mhou 2022-05-25 03:25:21 UTC
Hello Nitesh

No test interface should be configured with more channels than its Pre-set maximums. I think tuned should check two things:

1. Whether the target number of channels exceeds the Pre-set maximums.
2. Whether the NIC reports Combined as n/a. If so, tuned needs to adjust the channel configuration it sends to ethtool, e.g. not configure combined when Combined shows n/a.
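
The two checks proposed here could look roughly like the sketch below. It shows the suggestion only and does not describe what tuned does; `clamp_channels` is a hypothetical name, and the maximums are the eno4 values from comment 7.

```python
def clamp_channels(requested, maximums):
    """Clamp a requested queue count to each channel type's pre-set maximum.

    Channel types reported as n/a (None) are left out entirely, so nothing
    is sent to ethtool for them. (Hypothetical helper, not part of tuned.)
    """
    return {
        kind: min(requested, limit)
        for kind, limit in maximums.items()
        if limit is not None
    }


eno4_max = {"rx": 4, "tx": 4, "other": None, "combined": None}
print(clamp_channels(26, eno4_max))  # {'rx': 4, 'tx': 4}
print(clamp_channels(2, eno4_max))   # {'rx': 2, 'tx': 2}
```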

Comment 10 Nitesh Narayan Lal 2022-05-25 13:08:39 UTC
(In reply to mhou from comment #9)
> Hello Nitesh
> 
> No test interface should be configured with more channels than its
> Pre-set maximums.

Please note, to configure the queue counts through tuned you have two options:
1. Use HK CPUs as the reference for all the network devices (default)
2. Configure each network device individually based on your use-case

The default could be changed, but at the moment it is set to the HK CPU count,
which suits the low-latency use case where we generally have very few HK CPUs.
If you have a large number of HK CPUs then, instead of relying on the default,
you should configure it to what makes sense for your setup.

> I think tuned should check two things:
> 
> 1. Whether the target number of channels exceeds the Pre-set maximums.

What is the target number of channels? The one that you are trying to configure?
I think you are referring to that tuned should dynamically adjust the queue count
from x to x-y based on the maximum limit.
If so, then AFAIK the queue count for all network devices is set to the max
by default, in which case you don't need to configure anything.

Having said that, tuned could improve how the errors are captured and
reported. However, that is a general problem with tuned. In the past,
there were some efforts towards that, but I am not sure about their current state.

Jaroslav can comment more on that.


> 2. Whether the NIC reports Combined as n/a. If so, tuned needs to adjust
> the channel configuration it sends to ethtool, e.g. not configure combined
> when Combined shows n/a.

Well, it won't, even if it tries to, as the network device won't allow it.
Tuned dynamically replaces the parameter based on the supported mode.
Can you give an example showing that the above is not happening?

Comment 11 mhou 2022-05-25 15:40:57 UTC
Hello Nitesh

Sorry for the inconvenience, I need to clarify my opinion.
> I think you are referring to that tuned should dynamically adjust the queue count from x to x-y based on the maximum limit.
Yes, you are right. For example, I configure CPU isolation and keep the CPUs located on core 0 as housekeeping. With HT enabled, there are 4 housekeeping CPUs (assuming 2 NUMA nodes). In this scenario tuned should configure 4 channels each for TX and RX, and I would not expect an ERROR message like "requested channel count exceeds maximum".

But when the number of housekeeping CPUs is larger than the maximum number of queues the NIC supports, tuned's default behavior is to set the number of channels equal to the number of housekeeping CPUs. This of course results in an error from tuned.

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              56
On-line CPU(s) list: 0-55
Thread(s) per core:  1
Core(s) per socket:  28
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel
CPU family:          6
Model:               106
Model name:          Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
BIOS Model name:     Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
Stepping:            6
CPU MHz:             2000.000
BogoMIPS:            4000.00
Virtualization:      VT-x
L1d cache:           48K
L1i cache:           32K
L2 cache:            1280K
L3 cache:            43008K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect wbnoinvd dtherm ida arat pln pts avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid fsrm md_clear pconfig flush_l1d arch_capabilities

# cat /etc/tuned/realtime-virtual-host-variables.conf 
isolated_cores=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54
isolate_managed_irq=Y
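
The housekeeping (HK) CPU count that tuned falls back to can be worked out by hand as the online CPUs minus `isolated_cores`. The sketch below does that arithmetic for both hosts in this bug. The helper names are hypothetical, and the parser only handles plain comma and range lists, which is an assumption about the input. For the first host the result matches the "HK CPUs 26" line in tuned.log; the value for this host is computed here and was not taken from a log.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '0-3,8,10-11' into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def housekeeping_count(online, isolated):
    """Number of online CPUs that are not listed in isolated_cores."""
    return len(parse_cpu_list(online) - parse_cpu_list(isolated))


# First host (description): 48 CPUs, 22 isolated cores.
host1 = "2,4,6,8,10,12,14,16,18,20,22,26,28,30,32,34,36,38,40,42,44,46"
# Second host (this comment): 56 CPUs, 27 isolated cores.
host2 = "2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54"

print(housekeeping_count("0-47", host1))  # 26
print(housekeeping_count("0-55", host2))  # 29
```

Both counts are well above the 4 RX and 4 TX channels that the on-board NICs shown in this bug support.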

1. Check that this interface supports 4 TX and 4 RX channels.
# ethtool -l eno8403
Channel parameters for eno8403:
Pre-set maximums:
RX:		4    <--------This is the maximum number of RX that the interface can support
TX:		4    <--------This is the maximum number of TX that the interface can support
Other:		n/a
Combined:	n/a    <-------This is the maximum number of combined that the interface can support
Current hardware settings:
RX:		4
TX:		4
Other:		n/a
Combined:	n/a

2. use ethtool to configure TX/RX channel
# ethtool -L eno8403 rx 4 tx 4

3. enable realtime-virtual-host profile
2022-05-25 10:57:26,227 ERROR    tuned.utils.commands: Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 40)
netlink error: Invalid argument
2022-05-25 10:57:26,234 ERROR    tuned.utils.commands: Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 40)
netlink error: Invalid argument
2022-05-25 10:57:26,239 ERROR    tuned.utils.commands: Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 40)
netlink error: Invalid argument
2022-05-25 10:57:26,245 ERROR    tuned.utils.commands: Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 36)
netlink error: Invalid argument
2022-05-25 10:57:26,251 ERROR    tuned.utils.commands: Executing ethtool error: netlink error: requested channel count exceeds maximum (offset 40)
netlink error: Invalid argument
2022-05-25 10:57:26,257 ERROR    

At this point nothing changes on the NIC side, because the default channel setting is already the maximum the NIC supports. But more ERROR messages still appear in tuned.log.

> Well, it won't even if it tries to as the network device won't allow it.
> Tuned dynamically replaces the parameter based on the supported mode.
> Can you give an example showing that the above is not happening?

I agree with your point of view.

Comment 12 Nitesh Narayan Lal 2022-05-25 16:21:54 UTC
(In reply to mhou from comment #11)
> Hello Nitesh
> 
> Sorry for the inconvenience, I need to clarify my opinion.

Don't worry about it.
The important thing is that we should be in agreement.

> > I think you are referring to that tuned should dynamically adjust the queue count from x to x-y based on the maximum limit.
> Yes, you are right. For example, I configure CPU isolation and keep the CPUs
> located on core 0 as housekeeping. With HT enabled, there are 4 housekeeping
> CPUs (assuming 2 NUMA nodes). In this scenario tuned should configure 4
> channels each for TX and RX, and I would not expect an ERROR message like
> "requested channel count exceeds maximum".
> 
> But when the number of housekeeping CPUs is larger than the maximum number of
> queues the NIC supports, tuned's default behavior is to set the number of
> channels equal to the number of housekeeping CPUs. This of course results in
> an error from tuned.
> 
> # lscpu
> Architecture:        x86_64
> CPU op-mode(s):      32-bit, 64-bit
> Byte Order:          Little Endian
> CPU(s):              56


[...]

> 
> 3. enable realtime-virtual-host profile
> 2022-05-25 10:57:26,227 ERROR    tuned.utils.commands: Executing ethtool
> error: netlink error: requested channel count exceeds maximum (offset 40)
> netlink error: Invalid argument
> 2022-05-25 10:57:26,234 ERROR    tuned.utils.commands: Executing ethtool
> error: netlink error: requested channel count exceeds maximum (offset 40)
> netlink error: Invalid argument
> 2022-05-25 10:57:26,239 ERROR    tuned.utils.commands: Executing ethtool
> error: netlink error: requested channel count exceeds maximum (offset 40)
> netlink error: Invalid argument
> 2022-05-25 10:57:26,245 ERROR    tuned.utils.commands: Executing ethtool
> error: netlink error: requested channel count exceeds maximum (offset 36)
> netlink error: Invalid argument
> 2022-05-25 10:57:26,251 ERROR    tuned.utils.commands: Executing ethtool
> error: netlink error: requested channel count exceeds maximum (offset 40)
> netlink error: Invalid argument
> 2022-05-25 10:57:26,257 ERROR    
> 
> At this point nothing changes on the NIC side, because the default channel
> setting is already the maximum the NIC supports. But more ERROR messages
> still appear in tuned.log.
> 

Right, and the error handling and reporting is something that can be improved in
tuned.

However, as I mentioned, this is a broader problem with plugin_net: I think
that if we use any other plugin_net functionality in a way that is not
supported by the underlying tool (e.g. ethtool), tuned will still print
the error messages from that tool as is.
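
To make the idea of improved reporting concrete, here is a rough sketch of mapping raw ethtool stderr to a log level and a one-line message. It is an assumption about what such handling could look like, not anything that exists in tuned; the function name `classify_ethtool_stderr` and the chosen levels and wording are hypothetical.

```python
def classify_ethtool_stderr(device, stderr):
    """Map raw ethtool stderr to a (level, message) pair for logging.

    Hypothetical policy: an over-limit channel request is reported as a
    single WARNING naming the device, any other failure stays an ERROR.
    """
    text = stderr.strip()
    if "requested channel count exceeds maximum" in text:
        return ("WARNING",
                f"{device}: requested channel count exceeds the device maximum, "
                "channels left unchanged")
    if text:
        # Keep only the first line of the tool's own message.
        return ("ERROR", f"{device}: ethtool failed: {text.splitlines()[0]}")
    return ("INFO", f"{device}: channels updated")


raw = ("netlink error: requested channel count exceeds maximum (offset 36)\n"
       "netlink error: Invalid argument\n")
print(classify_ethtool_stderr("eno4", raw))
print(classify_ethtool_stderr("eno4", ""))
```

Including the device name addresses the difficulty noted earlier in this bug, where the log did not show which interface or command line triggered the error.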

Jaroslav can correct me here.

Also, I don't know if improving error handling & reporting on a per plugin basis is
something that Jaroslav and the tuned team have in their queue.

Thanks

Comment 13 Prasad Pandit 2022-06-01 08:10:20 UTC
Removing needinfo, Nitesh has replied with the analysis above.

Comment 14 mhou 2022-06-10 16:03:06 UTC
close this bug as NOT A BUG

Comment 15 Red Hat Bugzilla 2023-09-15 01:55:06 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days