Bug 1939705

Summary: Tuned fails to move IRQs to the housekeeping CPUs
Product: Red Hat Enterprise Linux 7
Reporter: Nitesh Narayan Lal <nilal>
Component: tuned
Assignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED DUPLICATE
QA Contact: rhel-cs-infra-services-qe <rhel-cs-infra-services-qe>
Severity: high
Docs Contact:
Priority: high
Version: 7.6
CC: fkrska, jeder, jorton, jskarvad, jzerdik, lcapitulino, marjones, mtosatti
Target Milestone: rc
Keywords: Triaged
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-04-26 21:08:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1730479, 1939252

Description Nitesh Narayan Lal 2021-03-16 20:56:28 UTC
Description of problem:
Tuned fails to move all the device IRQs from the isolated CPUs to the
housekeeping (HK) CPUs.

Version-Release number of selected component (if applicable):
tuned-2.10.0-6.el7_6.4.noarch

How reproducible:
Tried it a couple of times and was able to reproduce the issue.

Steps to Reproduce:
1. Bring up an RT host
2. Create VFs at runtime so that their corresponding IRQs are
   pinned to the isolated CPUs
3. List the IRQ affinities and verify that some of them are pinned to the
   isolated CPUs
4. Re-run the tuned-adm profile realtime-virtual-host command
5. List the IRQ affinities again to check whether all of the recently
   created IRQs have moved to the HK CPUs
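The checks in steps 3 and 5 can be sketched with a short script. This is an illustrative helper, not part of tuned; the isolated CPU set used in the demo is an assumed example, not taken from the reporter's host:

```python
# Sketch: flag IRQs whose SMP affinity overlaps the isolated CPUs,
# by parsing the kernel cpulist format used in
# /proc/irq/<N>/smp_affinity_list (e.g. "0-19,40-59").
from pathlib import Path

def parse_cpulist(s):
    """Parse a kernel cpulist like '0-19,40-59' into a set of CPU ids."""
    cpus = set()
    for part in s.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def irqs_on_isolated(isolated, irq_root="/proc/irq"):
    """Return {irq: offending_cpus} for IRQs affine to isolated CPUs."""
    bad = {}
    for d in Path(irq_root).glob("[0-9]*"):
        try:
            mask = parse_cpulist((d / "smp_affinity_list").read_text())
        except OSError:
            continue  # IRQ vanished, or affinity not readable
        overlap = mask & isolated
        if overlap:
            bad[int(d.name)] = overlap
    return bad

if __name__ == "__main__":
    isolated = parse_cpulist("2-19,42-59")  # assumed isolated set
    for irq, cpus in sorted(irqs_on_isolated(isolated).items()):
        print(f"IRQ {irq} still affine to isolated CPUs {sorted(cpus)}")
```

On a correctly tuned host the loop should print nothing; any output after re-running the profile (step 5) reproduces the bug.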

Actual results:
Device IRQs on isolated CPUs

Expected results:
All device IRQs moved to the HK CPUs

Additional info:

Comment 20 Nitesh Narayan Lal 2021-03-25 14:27:07 UTC
I was able to look into the issue reported in this BZ.

What happens here is that tuned, while setting the SMP affinity mask for an
IRQ, computes the intersection of the target CPU set (the CPUs we are trying
to move the IRQ to) with the IRQ's previously set SMP affinity mask. Tuned
then tries to move the IRQ only to the CPUs in that intersection.
If those CPUs are out of available vectors, the write fails and tuned
restores the original affinity mask.

Here is an example to further clarify the above explanation:

[root@uhn6qtlab1cvcm03 tracing]# cat /proc/irq/729/smp_affinity_list 
0-19,40-59

[root@uhn6qtlab1cvcm03 tracing]# echo 0,40 > /proc/irq/729/smp_affinity_list

[root@uhn6qtlab1cvcm03 tracing]# cat /proc/irq/729/smp_affinity_list 
0-19,40-59

[root@uhn6qtlab1cvcm03 tracing]# echo 20,60 > /proc/irq/729/smp_affinity_list

[root@uhn6qtlab1cvcm03 tracing]# cat /proc/irq/729/smp_affinity_list 
20,60


In the above example, 0,20,40,60 are the HK CPUs. However, tuned will only
try to set an affinity mask corresponding to 0,40, because the intersection
with the previous mask (0-19,40-59) excludes 20 and 60. Since 0,40 are
apparently out of available vectors, both tuned and the manual attempt to
write 0,40 fail.

Writing 20,60, on the other hand, succeeds, but tuned will never pick those
CPUs, as explained above.
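The behavior walked through above can be modeled with a short sketch. The function and variable names here are illustrative only, not tuned's actual code:

```python
# Illustrative model of the logic described above: intersect the target
# HK CPUs with the IRQ's previous affinity, try to write that mask, and
# restore the original mask if the kernel rejects it (e.g. because the
# chosen CPUs have no free interrupt vectors).

def set_irq_affinity(write_mask, prev_affinity, target_cpus):
    """write_mask(cpus) raises OSError when the kernel rejects the mask."""
    new_mask = target_cpus & prev_affinity  # the intersection step
    if not new_mask:
        return prev_affinity  # no overlap, nothing to move
    try:
        write_mask(new_mask)
        return new_mask
    except OSError:
        write_mask(prev_affinity)  # restore the original mask
        return prev_affinity

# Replaying the example above: HK CPUs are 0,20,40,60, but the previous
# mask 0-19,40-59 excludes 20 and 60, so only 0,40 are tried -- and
# those two are out of vectors, so the affinity stays unchanged.
exhausted = {0, 40}

def fake_write(cpus):
    if cpus <= exhausted:
        raise OSError(f"no free vectors on CPUs {sorted(cpus)}")

prev = set(range(0, 20)) | set(range(40, 60))
result = set_irq_affinity(fake_write, prev, {0, 20, 40, 60})
print("affinity unchanged:", result == prev)  # True
```

Had 20 and 60 been part of the previous mask, the intersection would have included CPUs with free vectors and the move would have succeeded, which is why the manual write of 20,60 works while tuned's attempt does not.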

Ideally, this issue should not occur once we fix the vector exhaustion by
reducing the net-dev queue count (Bug 1942508). For now, I am keeping this
bug open, as I will probably run some more tests later on.

Comment 21 Nitesh Narayan Lal 2021-04-26 21:08:54 UTC
Based on the findings in Comment 20, this issue is resolved by fixing
the vector exhaustion issue on the housekeeping CPUs.

Hence, closing this as a DUP of Bug 1942508.

*** This bug has been marked as a duplicate of bug 1942508 ***