Bug 911649 - HP Smart Array hpsa module interrupts are imbalanced
Summary: HP Smart Array hpsa module interrupts are imbalanced
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: irqbalance
Version: 6.3
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Petr Holasek
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-02-15 14:28 UTC by Roland Friedwagner
Modified: 2016-10-04 04:08 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-19 09:59:11 UTC
Target Upstream Version:
Embargoed:


Attachments
Add hpsa to storage_modules list in classify.c (240 bytes, patch)
2013-02-15 14:28 UTC, Roland Friedwagner
spec file changes (1008 bytes, patch)
2013-02-15 14:31 UTC, Roland Friedwagner
irqbalance --debug log (60.00 KB, text/plain)
2013-09-19 09:45 UTC, Roland Friedwagner
Unbalanced hpsa /proc/interrupts (13.11 KB, text/plain)
2015-01-22 14:28 UTC, Edmund White
irqbalance debug output (32.96 KB, text/plain)
2015-01-22 14:29 UTC, Edmund White
Balanced HPSA /proc/interrupts (12.84 KB, text/plain)
2015-01-25 14:09 UTC, Edmund White

Description Roland Friedwagner 2013-02-15 14:28:34 UTC
Created attachment 697824
Add hpsa to storage_modules list in classify.c

Description of problem:

Hewlett Packard is moving from the cciss driver module to the hpsa driver
module for its Smart Array controllers.
The new module name (hpsa) is not in the storage_modules list in
classify.c, and all Smart Array controller interrupts are handled by CPU0.

Version-Release number of selected component (if applicable):

0.55-35

Actual results:

$ grep hpsa /proc/interrupts
  52:     253368          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa0
  53:      25294          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa0
  54:      25360          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa0
  55:      19647          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa0
  56:       8907          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa0
  57:       5888          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa0
  58:     210612          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa0
  59:      22698          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa0
  60:   78013564          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa1
  61:    2020926          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa1
  62:    1562522          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa1
  63:     760476          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa1
  64:     271462          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa1
  65:     195098          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa1
  66:    5489932          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa1
  67:     232921          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      hpsa1



Expected results:

IRQ counters increase on all CPUs.
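
One way to check this without reading the columns by eye is to sum the per-CPU counters for the hpsa lines. The following is only a minimal sketch: the function names are hypothetical, and the sample is a shortened four-CPU excerpt of the output above (the real listing has twelve CPU columns). On a live system the text would come from /proc/interrupts.

```python
def per_cpu_totals(interrupts_text, device_prefix):
    """Sum the per-CPU counters of every IRQ line whose device name
    (last field) starts with device_prefix."""
    totals = []
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Data lines look like "52:  253368  0 ...  IR-PCI-MSI-edge  hpsa0";
        # the "CPU0 CPU1 ..." header has no "NN:" label and is skipped.
        if len(fields) < 3 or not fields[0].endswith(":"):
            continue
        if not fields[-1].startswith(device_prefix):
            continue
        counts = []
        for field in fields[1:]:
            if not field.isdigit():
                break  # reached the interrupt-type column
            counts.append(int(field))
        if len(counts) > len(totals):
            totals.extend([0] * (len(counts) - len(totals)))
        for cpu, count in enumerate(counts):
            totals[cpu] += count
    return totals


def busy_cpus(totals):
    """Return the CPU numbers that handled at least one interrupt."""
    return [cpu for cpu, total in enumerate(totals) if total > 0]


# Shortened excerpt (assumption: only 4 CPU columns kept for brevity).
SAMPLE = """\
  52:     253368          0          0          0  IR-PCI-MSI-edge      hpsa0
  53:      25294          0          0          0  IR-PCI-MSI-edge      hpsa0
  60:   78013564          0          0          0  IR-PCI-MSI-edge      hpsa1
"""

if __name__ == "__main__":
    totals = per_cpu_totals(SAMPLE, "hpsa")
    print("per-CPU totals:", totals)
    print("CPUs handling hpsa interrupts:", busy_cpus(totals))
```

If the interrupts were balanced, busy_cpus() would list more than one CPU; for the excerpt above it lists only CPU 0.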


Additional info:

The patch simply adds "hpsa" to the storage_modules list.

Comment 1 Roland Friedwagner 2013-02-15 14:31:43 UTC
Created attachment 697825
spec file changes

Comment 3 RHEL Program Management 2013-02-20 06:48:28 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 4 Petr Holasek 2013-02-21 18:54:49 UTC
Hello,

The misclassification issue was fixed in the new irqbalance in RHEL 6.4. If the update doesn't fix your issue, feel free to reopen this bug.

thanks,
Petr

Comment 5 Roland Friedwagner 2013-02-25 16:25:55 UTC
Reopened because the new irqbalance (irqbalance-1.0.4-3.el6.x86_64)
in RHEL 6.4 (Bug 878708) does not balance any IRQ.

Comment 6 Petr Holasek 2013-09-18 11:56:15 UTC
I am not sure whether your problem can be considered a duplicate of bz878708. If you think so, let me know and I'll close this bz as a duplicate. Otherwise, please send me the output of the following commands after they have run for a few minutes:

# service irqbalance stop
# irqbalance --debug

Thank you,
Petr Holasek

Comment 7 Roland Friedwagner 2013-09-19 09:34:34 UTC
I am currently not authorized to look up bz878708.

But I've collected the debug output from irqbalance as requested
- uploaded as 2013-09-19_irqbalance-1.0.4-4.el6_4.x86_64_debug.log -
and found that the hpsa interrupts are handled by irqbalance now.

thx&regards
Roland

Comment 8 Roland Friedwagner 2013-09-19 09:45:20 UTC
Created attachment 799808
irqbalance --debug log

Comment 9 Petr Holasek 2013-09-19 09:59:11 UTC
Roland, thank you for your cooperation!

Based on your feedback, I am closing this BZ as CURRENTRELEASE.

regards,
Petr

Comment 10 Edmund White 2015-01-22 14:09:35 UTC
I continue to see unbalanced hpsa controller interrupts under irqbalance-1.0.4-10.el6 on EL6.6. Was this patch ever incorporated?

Comment 11 Petr Holasek 2015-01-22 14:15:15 UTC
Edmund,

could you please attach the output of /proc/interrupts and one minute of irqbalance debug output? (It can be collected with # service irqbalance stop && irqbalance --debug > irqbalance_debug)

Comment 12 Edmund White 2015-01-22 14:28:42 UTC
Created attachment 982840
Unbalanced hpsa /proc/interrupts

Comment 13 Edmund White 2015-01-22 14:29:31 UTC
Created attachment 982842
irqbalance debug output

Comment 14 Petr Holasek 2015-01-22 15:05:48 UTC
Thank you for the outputs. The hpsa interrupts were correctly classified as "storage", but they were balanced at the cache level instead of the core level. That shouldn't be the problem, though.

Could you please add the output of "# for i in $(seq 0 100); do grep . /proc/irq/$i/smp_affinity /dev/null 2>/dev/null; done" after the irqbalance daemon has been running for a while?

Comment 15 Edmund White 2015-01-23 15:23:47 UTC
The smp_affinity output is below:

# for i in $(seq 0 100); do grep . /proc/irq/$i/smp_affinity /dev/null  2>/dev/null; done
/proc/irq/0/smp_affinity:ffffffff,ffffffff
/proc/irq/1/smp_affinity:00000000,0003f03f
/proc/irq/2/smp_affinity:ffffffff,ffffffff
/proc/irq/3/smp_affinity:00000000,0003f03f
/proc/irq/4/smp_affinity:00000000,0003f03f
/proc/irq/5/smp_affinity:00000000,0003f03f
/proc/irq/6/smp_affinity:00000000,00ffffff
/proc/irq/7/smp_affinity:00000000,0003f03f
/proc/irq/8/smp_affinity:00000000,0003f03f
/proc/irq/9/smp_affinity:00000000,0003f03f
/proc/irq/10/smp_affinity:00000000,0003f03f
/proc/irq/11/smp_affinity:00000000,00ffffff
/proc/irq/12/smp_affinity:00000000,0003f03f
/proc/irq/13/smp_affinity:00000000,00ffffff
/proc/irq/14/smp_affinity:00000000,00ffffff
/proc/irq/15/smp_affinity:00000000,00ffffff
/proc/irq/16/smp_affinity:00000000,0003f03f
/proc/irq/17/smp_affinity:00000000,0003f03f
/proc/irq/20/smp_affinity:00000000,0003f03f
/proc/irq/21/smp_affinity:00000000,0003f03f
/proc/irq/72/smp_affinity:00000000,00000001
/proc/irq/73/smp_affinity:00000000,00000001
/proc/irq/74/smp_affinity:00000000,00000001
/proc/irq/75/smp_affinity:00000000,00000002
/proc/irq/76/smp_affinity:00000000,00000004
/proc/irq/77/smp_affinity:00000000,00000008
/proc/irq/78/smp_affinity:00000000,00000010
/proc/irq/79/smp_affinity:00000000,0003f03f
/proc/irq/80/smp_affinity:00000000,0003f03f
/proc/irq/81/smp_affinity:00000000,0003f03f
/proc/irq/82/smp_affinity:00000000,0003f03f
/proc/irq/83/smp_affinity:00000000,0003f03f
/proc/irq/84/smp_affinity:00000000,0003f03f
/proc/irq/85/smp_affinity:00000000,0003f03f
/proc/irq/86/smp_affinity:00000000,0003f03f
/proc/irq/87/smp_affinity:00000000,0003f03f
/proc/irq/88/smp_affinity:00000000,0003f03f
/proc/irq/89/smp_affinity:00000000,00000020
/proc/irq/90/smp_affinity:00000000,00001000
/proc/irq/91/smp_affinity:00000000,00002000
/proc/irq/92/smp_affinity:00000000,00008000
/proc/irq/93/smp_affinity:00000000,00010000
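
For readers who don't decode hex masks in their head: each smp_affinity value is a CPU bitmask printed as comma-separated 32-bit hex words, most significant word first, where bit N set means CPU N is allowed to service the IRQ. A minimal sketch of the decoding (the function name is hypothetical):

```python
def cpus_from_affinity(mask):
    """Return the CPU numbers set in a /proc/irq/<n>/smp_affinity mask."""
    # Dropping the commas turns the 32-bit words into one hex number.
    value = int(mask.replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


if __name__ == "__main__":
    # Masks taken from the listing above.
    for mask in ("00000000,0003f03f", "00000000,00000001", "00000000,00010000"):
        print(mask, "->", cpus_from_affinity(mask))
```

Applied to the listing, 0003f03f covers CPUs 0-5 and 12-17, while IRQs 72-74 are all pinned to CPU 0 and IRQs 75-78 and 89-93 each have a single-CPU mask.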

Comment 16 Petr Holasek 2015-01-23 15:53:38 UTC
Thanks for the output.

It seems that irqbalance is doing a good job here and the problem is in the hardware, namely the APIC. The kernel would also refuse to set an incorrect smp_affinity. In any case, you can file a bug against the hpsa driver.

Comment 17 Edmund White 2015-01-25 14:08:20 UTC
I was just upgrading another system and noticed that this was not an issue on 2.6.32-431.17.1.el6, but seems to have started with 2.6.32-504.3.3.el6. I can't tell if this is due to an HPSA driver change or the kernel revision.

Comment 18 Edmund White 2015-01-25 14:09:07 UTC
Created attachment 983964
Balanced HPSA /proc/interrupts

Comment 19 Petr Holasek 2015-01-26 08:37:35 UTC
Thank you. I'd recommend filing a bug against kernel/StorageDrivers and referring to this Bugzilla in the report, e.g. to provide the information about the correctly set smp_affinity.

Comment 20 Viktor Villafuerte 2015-01-29 01:10:00 UTC
Hi Peto :)

just to clarify this...

I've got a very similar problem, where updating caused the same problems as mentioned above. I was also forced to downgrade the kernel back to 2.6.32-431.17.1.el6.x86_64, which seems to have helped with the CPUs being run down.

However, I did the affinity checks as suggested above and the results still seem a bit odd to me. The only difference between the two runs is the version of the irqbalance package.



1)

irqbalance-1.0.4-10.el6.x86_64
2.6.32-431.17.1.el6.x86_64


for i in $(seq 0 100); do grep . /proc/irq/$i/smp_affinity /dev/null  2>/dev/null; done
/proc/irq/0/smp_affinity:ffffffff
/proc/irq/1/smp_affinity:000000ff
/proc/irq/2/smp_affinity:ffffffff
/proc/irq/3/smp_affinity:000000ff
/proc/irq/4/smp_affinity:000000ff
/proc/irq/5/smp_affinity:000000ff
/proc/irq/6/smp_affinity:000000ff
/proc/irq/7/smp_affinity:000000ff
/proc/irq/8/smp_affinity:000000ff
/proc/irq/9/smp_affinity:000000ff
/proc/irq/10/smp_affinity:000000ff
/proc/irq/11/smp_affinity:000000ff
/proc/irq/12/smp_affinity:000000ff
/proc/irq/13/smp_affinity:000000ff
/proc/irq/14/smp_affinity:000000ff
/proc/irq/15/smp_affinity:000000ff
/proc/irq/17/smp_affinity:000000ff
/proc/irq/20/smp_affinity:000000ff
/proc/irq/22/smp_affinity:000000ff
/proc/irq/23/smp_affinity:000000ff
/proc/irq/50/smp_affinity:000000ff
/proc/irq/51/smp_affinity:000000ff
/proc/irq/52/smp_affinity:000000ff
/proc/irq/53/smp_affinity:000000ff
/proc/irq/54/smp_affinity:000000ff
/proc/irq/55/smp_affinity:000000ff
/proc/irq/56/smp_affinity:000000ff
/proc/irq/57/smp_affinity:000000ff
/proc/irq/58/smp_affinity:00000008
/proc/irq/59/smp_affinity:00000008
/proc/irq/60/smp_affinity:00000008
/proc/irq/61/smp_affinity:00000008
/proc/irq/62/smp_affinity:00000008
/proc/irq/63/smp_affinity:00000008
/proc/irq/64/smp_affinity:00000008
/proc/irq/65/smp_affinity:00000008


2) 

irqbalance-1.0.4-9.el6_5.x86_64
2.6.32-431.17.1.el6.x86_64

for i in $(seq 0 100); do grep . /proc/irq/$i/smp_affinity /dev/null  2>/dev/null; done
/proc/irq/0/smp_affinity:ffffffff
/proc/irq/1/smp_affinity:000000ff
/proc/irq/2/smp_affinity:ffffffff
/proc/irq/3/smp_affinity:000000ff
/proc/irq/4/smp_affinity:000000ff
/proc/irq/5/smp_affinity:000000ff
/proc/irq/6/smp_affinity:000000ff
/proc/irq/7/smp_affinity:000000ff
/proc/irq/8/smp_affinity:000000ff
/proc/irq/9/smp_affinity:000000ff
/proc/irq/10/smp_affinity:000000ff
/proc/irq/11/smp_affinity:00000088
/proc/irq/12/smp_affinity:000000ff
/proc/irq/13/smp_affinity:000000ff
/proc/irq/14/smp_affinity:000000ff
/proc/irq/15/smp_affinity:000000ff
/proc/irq/17/smp_affinity:000000ff
/proc/irq/20/smp_affinity:00000022
/proc/irq/22/smp_affinity:00000088
/proc/irq/23/smp_affinity:00000044
/proc/irq/50/smp_affinity:00000088
/proc/irq/51/smp_affinity:00000088
/proc/irq/52/smp_affinity:00000022
/proc/irq/53/smp_affinity:00000022
/proc/irq/54/smp_affinity:00000044
/proc/irq/55/smp_affinity:00000022
/proc/irq/56/smp_affinity:00000088
/proc/irq/57/smp_affinity:00000022
/proc/irq/58/smp_affinity:00000010
/proc/irq/59/smp_affinity:00000020
/proc/irq/60/smp_affinity:00000040
/proc/irq/61/smp_affinity:00000080
/proc/irq/62/smp_affinity:00000001
/proc/irq/63/smp_affinity:00000002
/proc/irq/64/smp_affinity:00000004
/proc/irq/65/smp_affinity:00000008
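
The difference between the two listings is easiest to see by OR-ing the masks of a group of IRQs and listing the CPUs that result. The sketch below is only illustrative: the helper names are hypothetical, and IRQs 58-65 were picked simply because they have single-CPU masks in both listings.

```python
import re

# Matches lines such as "/proc/irq/58/smp_affinity:00000008".
LINE = re.compile(r"/proc/irq/(\d+)/smp_affinity:([0-9a-fA-F,]+)")


def parse_affinities(text):
    """Map IRQ number -> affinity mask (as an int) from grep-style output."""
    return {
        int(m.group(1)): int(m.group(2).replace(",", ""), 16)
        for m in LINE.finditer(text)
    }


def cpus_used(affinities, irqs):
    """CPUs that appear in the mask of at least one of the given IRQs."""
    combined = 0
    for irq in irqs:
        combined |= affinities.get(irq, 0)
    return [cpu for cpu in range(combined.bit_length()) if (combined >> cpu) & 1]


# IRQs 58-65 as reported with irqbalance-1.0.4-10.el6 (listing 1).
LISTING_1 = "\n".join(
    "/proc/irq/%d/smp_affinity:00000008" % irq for irq in range(58, 66)
)

# IRQs 58-65 as reported with irqbalance-1.0.4-9.el6_5 (listing 2).
LISTING_2 = """\
/proc/irq/58/smp_affinity:00000010
/proc/irq/59/smp_affinity:00000020
/proc/irq/60/smp_affinity:00000040
/proc/irq/61/smp_affinity:00000080
/proc/irq/62/smp_affinity:00000001
/proc/irq/63/smp_affinity:00000002
/proc/irq/64/smp_affinity:00000004
/proc/irq/65/smp_affinity:00000008
"""

if __name__ == "__main__":
    irqs = range(58, 66)
    print("1.0.4-10:", cpus_used(parse_affinities(LISTING_1), irqs))
    print("1.0.4-9: ", cpus_used(parse_affinities(LISTING_2), irqs))
```

With irqbalance-1.0.4-10.el6 all eight IRQs are pinned to CPU 3, whereas with irqbalance-1.0.4-9.el6_5 they are spread over CPUs 0-7, one IRQ per CPU.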

Comment 21 Edmund White 2015-01-29 01:20:29 UTC
I've detailed this in Bug #1185890, but it is not public yet.

Comment 22 Viktor Villafuerte 2015-01-29 02:51:20 UTC
No worries, I'll keep an eye on that one.

thanks

v

Comment 23 Petr Holasek 2015-01-29 11:16:58 UTC
Hi Viktor :)

The different smp_affinity values across versions are probably caused by bz1170351 (Broken irqbalance deepest cache backport), which is going to be fixed in RHEL 6.7.

Comment 24 Petr Holasek 2015-02-10 14:05:10 UTC
Edmund,

just one more thing to try: are the interrupts distributed among the processors when you run irqbalance with the --hintpolicy=exact option?

Comment 25 Edmund White 2015-02-25 16:24:37 UTC
Petr,

I'll try that, but the issue was narrowed down to bz1170351 in my other bug, bz1185890. Downgrading irqbalance to irqbalance-1.0.4-9.el6_5.x86_64 is a temporary resolution. The permanent fix is slated for release in 6.7.

Comment 26 Edmund White 2015-02-25 16:27:22 UTC
Running irqbalance with --hintpolicy=exact did not change the interrupt distribution.

