Bug 433956

Summary: [MRG] 2.6.24.1-21 kernel uses
Product: Red Hat Enterprise MRG Reporter: IBM Bug Proxy <bugproxy>
Component: realtime-kernel Assignee: Red Hat Real Time Maintenance <rt-maint>
Status: CLOSED NOTABUG QA Contact:
Severity: medium Docs Contact:
Priority: low    
Version: beta   
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2008-03-11 14:29:44 UTC Type: ---

Description IBM Bug Proxy 2008-02-22 12:08:16 UTC
=Comment: #0=================================================
John G. Stultz <jstultz.com> - 2008-02-20 21:06 EDT
Problem description:

The 2.6.24.1-21 kernel uses IRQ- instead of IRQ_ for the process comm string.
This breaks the set_kthread_prio script, and results in the IRQ threads not
getting rt priority.

I believe the reverse happened in the SR2->SR3 timeframe. We should check with
Clark/Steven to see what the rationale is here.

Comment 1 IBM Bug Proxy 2008-02-26 01:08:25 UTC
------- Comment From dvhltc.com 2008-02-25 20:01 EDT-------
I checked our SR3 kernel, and it has IRQ- naming, not IRQ_.

[root@elm3b207 ~]# ps aux | grep IRQ
root       102  0.0  0.0      0     0 ?        S<   13:05   0:00 [IRQ-11]
root       329  0.0  0.0      0     0 ?        S<   13:05   0:00 [IRQ-8]
...

[root@elm3b207 ~]# uname -r
2.6.21.4-ibmrt1.23

Checking the set_kthread_prio script, I see that the regexes do indeed search
for ^IRQ_.  Checking the IRQ threads revealed that NONE of them were running
with realtime priority!

[root@elm3b207 ~]# ps -eLo rtprio,comm | grep IRQ
- IRQ-11
- IRQ-8
- IRQ-12
- IRQ-1
- IRQ-3
- IRQ-19
- IRQ-26
- IRQ-6
- IRQ-25
- IRQ-4

This is all on a fresh R1-SR3.dat deploy.
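
The check above can also be scripted. The following is a minimal sketch in
Python, assuming input shaped like the output of `ps -eLo rtprio,comm`; the
function name irq_threads_without_rtprio and the sample text are hypothetical
and are not part of set_kthread_prio or any shipped tooling.

```python
def irq_threads_without_rtprio(ps_output):
    """Return comm names of IRQ threads whose rtprio column is '-'.

    Expects text shaped like the output of `ps -eLo rtprio,comm`.
    """
    missing = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        rtprio, comm = fields[0], fields[1].strip()
        # Accept both naming conventions discussed in this bug: IRQ-N and IRQ_N.
        if comm.startswith(("IRQ-", "IRQ_")) and rtprio == "-":
            missing.append(comm)
    return missing


# Made-up sample input for illustration only, not captured from a real machine.
sample = """RTPRIO COMMAND
     - IRQ-11
     - IRQ-8
    95 IRQ-9
    99 migration/0
"""
print(irq_threads_without_rtprio(sample))
```

An empty list would mean every IRQ thread in the input has a realtime
priority assigned.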

Comment 2 IBM Bug Proxy 2008-02-26 01:40:25 UTC
------- Comment From dvhltc.com 2008-02-25 20:38 EDT-------
So from patch-2.6.21.4-rt10 we see:

+static int start_irq_thread(int irq, struct irq_desc *desc)
+{
+       if (desc->thread || !ok_to_create_irq_threads)
+               return 0;
+
+       desc->thread = kthread_create(do_irqd, desc, "IRQ-%d", irq);

and from util/set_kthread_prio we see:

} else if (cmd ~ /^IRQ_/ &&
"IRQ_default" in config) {
prio = config["IRQ_default"];
opts = confopts["IRQ_default"];

Note that the SR3-iFix1 kernel uses IRQ- and the SR3-iFix1 set_kthread_prio
script uses IRQ_.  From this, I can't imagine how SR3 ever had real-time
priority hardware interrupt stubs.  This is truly incredible given the lack of
failures we've seen during testing.
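
To make the mismatch concrete, here is a small Python sketch. It assumes only
the "IRQ-%d" format string from the kernel patch and the ^IRQ_ regex from the
script, both quoted above; the helper names kernel_thread_name and
script_matches are hypothetical. It illustrates the mismatch and does not
reproduce the awk script itself.

```python
import re

# Regex quoted from util/set_kthread_prio.
OLD_PATTERN = re.compile(r"^IRQ_")


def kernel_thread_name(irq):
    # Mirrors the "IRQ-%d" format string passed to kthread_create().
    return "IRQ-%d" % irq


def script_matches(comm):
    # True if the script's regex would select this thread name.
    return OLD_PATTERN.search(comm) is not None


for irq in (8, 9, 11):
    name = kernel_thread_name(irq)
    print(name, script_matches(name))
```

Every name produced with the hyphen format fails the underscore regex, so the
IRQ_default branch is never taken for these threads.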

Comment 3 IBM Bug Proxy 2008-02-26 01:56:25 UTC
------- Comment From dvhltc.com 2008-02-25 20:48 EDT-------
To confirm I didn't botch the code review and that the ABAT deploy wasn't
somehow at fault, I did an ABAT deploy of RHEL 5.1 on elm3b102 and then installed
SR3-iFix1 manually.

[root@elm3b102 ~]# uname -r
2.6.21.4-ibmrt1.23

[root@elm3b102 ~]# ps aux | grep IRQ
root       166  0.0  0.0      0     0 ?        S<   20:41   0:00 [IRQ-9]
root       507  0.0  0.0      0     0 ?        S<   20:41   0:00 [IRQ-8]
...

[root@elm3b102 ~]# ps -eLo rtprio,comm | grep IRQ
- IRQ-9
- IRQ-8
- IRQ-12
...

Applying the following patch:
39c39
<                       } else if (cmd ~ /^IRQ_/ &&
---
>                       } else if (cmd ~ /^IRQ[_-]/ &&

Rerunning the script, we have:
# ps -eLo rtprio,comm | grep IRQ
95 IRQ-9
95 IRQ-8
...

This fix allows both the old and new IRQ naming conventions to work.  I would
like to understand WHY this doesn't affect our testing more.  But I think we
will have to send an update to the customer immediately, perhaps not as an
iFix, perhaps just the one-line patch above with instructions for how to apply
it (maybe even a script that fixes the problem with a sed command).
Thoughts on delivery?

Bumping priority to P1; we need to understand why this isn't showing up as a
bigger problem and how we'll get the fix out to customers.
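
As one possible shape for a scripted fix, here is a minimal Python sketch of
the one-line substitution. It works on the script text as a string and does
not touch any file; the function name patch_script_text is hypothetical, and
the only strings assumed are the old and new regex fragments from the diff
above. This is a sketch of the idea, not the delivery mechanism.

```python
import re

# Old and new fragments taken from the one-line diff above.
OLD = "cmd ~ /^IRQ_/"
NEW = "cmd ~ /^IRQ[_-]/"


def patch_script_text(text):
    """Return (patched_text, number_of_replacements)."""
    count = text.count(OLD)
    return text.replace(OLD, NEW), count


line = "} else if (cmd ~ /^IRQ_/ &&"
patched, n = patch_script_text(line)
print(patched)

# Check that the widened character class accepts both naming conventions.
new_pattern = re.compile(r"^IRQ[_-]")
for comm in ("IRQ-9", "IRQ_9", "IRQ9"):
    print(comm, bool(new_pattern.search(comm)))
```

Because the patched text no longer contains the old fragment, running the
substitution a second time changes nothing, which would make such a script
safe to rerun.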

Comment 4 IBM Bug Proxy 2008-02-26 23:16:33 UTC
------- Comment From jstultz.com 2008-02-26 18:15 EDT-------
This is an IBM issue and doesn't affect MRG or RH. I'm rejecting it.

Comment 5 Clark Williams 2008-03-11 14:29:44 UTC
Closing on our side.