Bug 2097871
Summary: | irqbalance crashes with error "double free or corruption (!prev)" | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Andrew Schorr <ajschorr> |
Component: | irqbalance | Assignee: | ltao |
Status: | CLOSED ERRATA | QA Contact: | Jiri Dluhos <jdluhos> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | CentOS Stream | CC: | bstinson, jeder, jshortt, jwboyer, rvr |
Target Milestone: | rc | Keywords: | TestOnly, Triaged |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | irqbalance-1.9.0-1.el9 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2022-11-15 11:18:28 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 2098635 | ||
Bug Blocks: |
Description
Andrew Schorr
2022-06-16 19:12:22 UTC
From valgrind /usr/sbin/irqbalance --foreground $IRQBALANCE_ARGS: ==142014== Memcheck, a memory error detector ==142014== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. ==142014== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info ==142014== Command: /usr/sbin/irqbalance --foreground --policyscript=/usr/local/etc/irqbalance_policyscript.sh ==142014== ==142014== Invalid read of size 4 ==142014== at 0x10B3A4: compare_ints (classify.c:256) ==142014== by 0x48C5950: g_list_find_custom (glist.c:927) ==142014== by 0x10B749: get_irq_info (classify.c:812) ==142014== by 0x10FD79: parse_proc_interrupts (procinterrupts.c:302) ==142014== by 0x11133A: scan (irqbalance.c:316) ==142014== by 0x48CB5A0: g_timeout_dispatch (gmain.c:4889) ==142014== by 0x48CAD4E: UnknownInlinedFun (gmain.c:3337) ==142014== by 0x48CAD4E: g_main_context_dispatch (gmain.c:4055) ==142014== by 0x491F607: g_main_context_iterate.constprop.0 (gmain.c:4131) ==142014== by 0x48CA462: g_main_loop_run (gmain.c:4329) ==142014== by 0x10AF0B: main (irqbalance.c:706) ==142014== Address 0x57fa2e0 is 0 bytes inside a block of size 592 free'd ==142014== at 0x48470E4: free (vg_replace_malloc.c:872) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x10AED0: main (irqbalance.c:694) ==142014== Block was alloc'd at ==142014== at 0x4849464: calloc (vg_replace_malloc.c:1328) ==142014== by 0x10B99E: UnknownInlinedFun (classify.c:269) ==142014== by 0x10B99E: __add_banned_irq (classify.c:259) ==142014== by 0x10F6BE: add_new_irq (classify.c:615) ==142014== by 0x10F842: build_one_dev_entry (classify.c:654) ==142014== by 0x10FA75: build_dev_irqs (classify.c:743) ==142014== by 0x10ADD8: UnknownInlinedFun (classify.c:783) ==142014== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242) ==142014== by 0x10ADD8: main (irqbalance.c:664) ==142014== ==142014== 
Invalid read of size 4 ==142014== at 0x10CB2E: remove_no_existing_irq (classify.c:865) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x11133A: scan (irqbalance.c:316) ==142014== by 0x48CB5A0: g_timeout_dispatch (gmain.c:4889) ==142014== by 0x48CAD4E: UnknownInlinedFun (gmain.c:3337) ==142014== by 0x48CAD4E: g_main_context_dispatch (gmain.c:4055) ==142014== by 0x491F607: g_main_context_iterate.constprop.0 (gmain.c:4131) ==142014== by 0x48CA462: g_main_loop_run (gmain.c:4329) ==142014== by 0x10AF0B: main (irqbalance.c:706) ==142014== Address 0x57fa51c is 572 bytes inside a block of size 592 free'd ==142014== at 0x48470E4: free (vg_replace_malloc.c:872) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x10AED0: main (irqbalance.c:694) ==142014== Block was alloc'd at ==142014== at 0x4849464: calloc (vg_replace_malloc.c:1328) ==142014== by 0x10B99E: UnknownInlinedFun (classify.c:269) ==142014== by 0x10B99E: __add_banned_irq (classify.c:259) ==142014== by 0x10F6BE: add_new_irq (classify.c:615) ==142014== by 0x10F842: build_one_dev_entry (classify.c:654) ==142014== by 0x10FA75: build_dev_irqs (classify.c:743) ==142014== by 0x10ADD8: UnknownInlinedFun (classify.c:783) ==142014== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242) ==142014== by 0x10ADD8: main (irqbalance.c:664) ==142014== ==142014== Invalid read of size 4 ==142014== at 0x10B3A6: compare_ints (classify.c:256) ==142014== by 0x48C5950: g_list_find_custom (glist.c:927) ==142014== by 0x10CB68: UnknownInlinedFun (classify.c:871) ==142014== by 0x10CB68: remove_no_existing_irq (classify.c:861) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) 
==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x11133A: scan (irqbalance.c:316) ==142014== by 0x48CB5A0: g_timeout_dispatch (gmain.c:4889) ==142014== by 0x48CAD4E: UnknownInlinedFun (gmain.c:3337) ==142014== by 0x48CAD4E: g_main_context_dispatch (gmain.c:4055) ==142014== by 0x491F607: g_main_context_iterate.constprop.0 (gmain.c:4131) ==142014== by 0x48CA462: g_main_loop_run (gmain.c:4329) ==142014== by 0x10AF0B: main (irqbalance.c:706) ==142014== Address 0x57fa2e0 is 0 bytes inside a block of size 592 free'd ==142014== at 0x48470E4: free (vg_replace_malloc.c:872) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x10AED0: main (irqbalance.c:694) ==142014== Block was alloc'd at ==142014== at 0x4849464: calloc (vg_replace_malloc.c:1328) ==142014== by 0x10B99E: UnknownInlinedFun (classify.c:269) ==142014== by 0x10B99E: __add_banned_irq (classify.c:259) ==142014== by 0x10F6BE: add_new_irq (classify.c:615) ==142014== by 0x10F842: build_one_dev_entry (classify.c:654) ==142014== by 0x10FA75: build_dev_irqs (classify.c:743) ==142014== by 0x10ADD8: UnknownInlinedFun (classify.c:783) ==142014== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242) ==142014== by 0x10ADD8: main (irqbalance.c:664) ==142014== ==142014== Invalid read of size 4 ==142014== at 0x10B3A6: compare_ints (classify.c:256) ==142014== by 0x48C5950: g_list_find_custom (glist.c:927) ==142014== by 0x10CB95: UnknownInlinedFun (classify.c:875) ==142014== by 0x10CB95: remove_no_existing_irq (classify.c:861) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x11133A: scan (irqbalance.c:316) ==142014== by 0x48CB5A0: g_timeout_dispatch (gmain.c:4889) ==142014== by 0x48CAD4E: 
UnknownInlinedFun (gmain.c:3337) ==142014== by 0x48CAD4E: g_main_context_dispatch (gmain.c:4055) ==142014== by 0x491F607: g_main_context_iterate.constprop.0 (gmain.c:4131) ==142014== by 0x48CA462: g_main_loop_run (gmain.c:4329) ==142014== by 0x10AF0B: main (irqbalance.c:706) ==142014== Address 0x57fa2e0 is 0 bytes inside a block of size 592 free'd ==142014== at 0x48470E4: free (vg_replace_malloc.c:872) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x10AED0: main (irqbalance.c:694) ==142014== Block was alloc'd at ==142014== at 0x4849464: calloc (vg_replace_malloc.c:1328) ==142014== by 0x10B99E: UnknownInlinedFun (classify.c:269) ==142014== by 0x10B99E: __add_banned_irq (classify.c:259) ==142014== by 0x10F6BE: add_new_irq (classify.c:615) ==142014== by 0x10F842: build_one_dev_entry (classify.c:654) ==142014== by 0x10FA75: build_dev_irqs (classify.c:743) ==142014== by 0x10ADD8: UnknownInlinedFun (classify.c:783) ==142014== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242) ==142014== by 0x10ADD8: main (irqbalance.c:664) ==142014== ==142014== Invalid read of size 8 ==142014== at 0x10CBB1: UnknownInlinedFun (classify.c:879) ==142014== by 0x10CBB1: remove_no_existing_irq (classify.c:861) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x11133A: scan (irqbalance.c:316) ==142014== by 0x48CB5A0: g_timeout_dispatch (gmain.c:4889) ==142014== by 0x48CAD4E: UnknownInlinedFun (gmain.c:3337) ==142014== by 0x48CAD4E: g_main_context_dispatch (gmain.c:4055) ==142014== by 0x491F607: g_main_context_iterate.constprop.0 (gmain.c:4131) ==142014== by 0x48CA462: g_main_loop_run (gmain.c:4329) ==142014== by 0x10AF0B: main (irqbalance.c:706) ==142014== Address 0x57fa520 is 576 bytes inside 
a block of size 592 free'd ==142014== at 0x48470E4: free (vg_replace_malloc.c:872) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x10AED0: main (irqbalance.c:694) ==142014== Block was alloc'd at ==142014== at 0x4849464: calloc (vg_replace_malloc.c:1328) ==142014== by 0x10B99E: UnknownInlinedFun (classify.c:269) ==142014== by 0x10B99E: __add_banned_irq (classify.c:259) ==142014== by 0x10F6BE: add_new_irq (classify.c:615) ==142014== by 0x10F842: build_one_dev_entry (classify.c:654) ==142014== by 0x10FA75: build_dev_irqs (classify.c:743) ==142014== by 0x10ADD8: UnknownInlinedFun (classify.c:783) ==142014== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242) ==142014== by 0x10ADD8: main (irqbalance.c:664) ==142014== ==142014== Invalid free() / delete / delete[] / realloc() ==142014== at 0x48470E4: free (vg_replace_malloc.c:872) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x11133A: scan (irqbalance.c:316) ==142014== by 0x48CB5A0: g_timeout_dispatch (gmain.c:4889) ==142014== by 0x48CAD4E: UnknownInlinedFun (gmain.c:3337) ==142014== by 0x48CAD4E: g_main_context_dispatch (gmain.c:4055) ==142014== by 0x491F607: g_main_context_iterate.constprop.0 (gmain.c:4131) ==142014== by 0x48CA462: g_main_loop_run (gmain.c:4329) ==142014== by 0x10AF0B: main (irqbalance.c:706) ==142014== Address 0x57fa2e0 is 0 bytes inside a block of size 592 free'd ==142014== at 0x48470E4: free (vg_replace_malloc.c:872) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==142014== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==142014== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==142014== by 0x10AED0: main (irqbalance.c:694) ==142014== Block was alloc'd at ==142014== at 
0x4849464: calloc (vg_replace_malloc.c:1328)
==142014== by 0x10B99E: UnknownInlinedFun (classify.c:269)
==142014== by 0x10B99E: __add_banned_irq (classify.c:259)
==142014== by 0x10F6BE: add_new_irq (classify.c:615)
==142014== by 0x10F842: build_one_dev_entry (classify.c:654)
==142014== by 0x10FA75: build_dev_irqs (classify.c:743)
==142014== by 0x10ADD8: UnknownInlinedFun (classify.c:783)
==142014== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242)
==142014== by 0x10ADD8: main (irqbalance.c:664)
==142014==
==142014==
==142014== HEAP SUMMARY:
==142014== in use at exit: 21,769 bytes in 68 blocks
==142014== total heap usage: 15,118 allocs, 15,386 frees, 18,402,934 bytes allocated
==142014==
==142014== LEAK SUMMARY:
==142014== definitely lost: 0 bytes in 0 blocks
==142014== indirectly lost: 0 bytes in 0 blocks
==142014== possibly lost: 304 bytes in 1 blocks
==142014== still reachable: 21,465 bytes in 67 blocks
==142014== suppressed: 0 bytes in 0 blocks
==142014== Rerun with --leak-check=full to see details of leaked memory
==142014==
==142014== For lists of detected and suppressed errors, rerun with: -s
==142014== ERROR SUMMARY: 79368 errors from 7 contexts (suppressed: 0 from 0)

(I ctrl-c'ed it to get it to exit, as it did not crash when running under valgrind).

Without the --policyscript argument, I don't see any valgrind errors.

I created the stupidest of policy scripts:

sh-5.1$ cat /tmp/policy.sh
#!/bin/sh
echo ban=true
sh-5.1$

Then I ran:

valgrind /usr/sbin/irqbalance --foreground --policyscript=/tmp/policy.sh

And this is what I got:

==152519== Memcheck, a memory error detector
==152519== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==152519== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info ==152519== Command: /usr/sbin/irqbalance --foreground --policyscript=/tmp/policy.sh ==152519== ==152519== Invalid read of size 4 ==152519== at 0x10B3A4: compare_ints (classify.c:256) ==152519== by 0x48C5950: g_list_find_custom (glist.c:927) ==152519== by 0x10B749: get_irq_info (classify.c:812) ==152519== by 0x10FD79: parse_proc_interrupts (procinterrupts.c:302) ==152519== by 0x11133A: scan (irqbalance.c:316) ==152519== by 0x48CB5A0: g_timeout_dispatch (gmain.c:4889) ==152519== by 0x48CAD4E: UnknownInlinedFun (gmain.c:3337) ==152519== by 0x48CAD4E: g_main_context_dispatch (gmain.c:4055) ==152519== by 0x491F607: g_main_context_iterate.constprop.0 (gmain.c:4131) ==152519== by 0x48CA462: g_main_loop_run (gmain.c:4329) ==152519== by 0x10AF0B: main (irqbalance.c:706) ==152519== Address 0x504b620 is 0 bytes inside a block of size 592 free'd ==152519== at 0x48470E4: free (vg_replace_malloc.c:872) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==152519== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==152519== by 0x10AED0: main (irqbalance.c:694) ==152519== Block was alloc'd at ==152519== at 0x4849464: calloc (vg_replace_malloc.c:1328) ==152519== by 0x10B99E: UnknownInlinedFun (classify.c:269) ==152519== by 0x10B99E: __add_banned_irq (classify.c:259) ==152519== by 0x10F6BE: add_new_irq (classify.c:615) ==152519== by 0x10FA17: build_one_dev_entry (classify.c:682) ==152519== by 0x10FA75: build_dev_irqs (classify.c:743) ==152519== by 0x10ADD8: UnknownInlinedFun (classify.c:783) ==152519== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242) ==152519== by 0x10ADD8: main (irqbalance.c:664) ==152519== ==152519== Invalid read of size 4 ==152519== at 0x10CB2E: remove_no_existing_irq (classify.c:865) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==152519== 
by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==152519== by 0x11133A: scan (irqbalance.c:316) ==152519== by 0x48CB5A0: g_timeout_dispatch (gmain.c:4889) ==152519== by 0x48CAD4E: UnknownInlinedFun (gmain.c:3337) ==152519== by 0x48CAD4E: g_main_context_dispatch (gmain.c:4055) ==152519== by 0x491F607: g_main_context_iterate.constprop.0 (gmain.c:4131) ==152519== by 0x48CA462: g_main_loop_run (gmain.c:4329) ==152519== by 0x10AF0B: main (irqbalance.c:706) ==152519== Address 0x504b85c is 572 bytes inside a block of size 592 free'd ==152519== at 0x48470E4: free (vg_replace_malloc.c:872) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==152519== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==152519== by 0x10AED0: main (irqbalance.c:694) ==152519== Block was alloc'd at ==152519== at 0x4849464: calloc (vg_replace_malloc.c:1328) ==152519== by 0x10B99E: UnknownInlinedFun (classify.c:269) ==152519== by 0x10B99E: __add_banned_irq (classify.c:259) ==152519== by 0x10F6BE: add_new_irq (classify.c:615) ==152519== by 0x10FA17: build_one_dev_entry (classify.c:682) ==152519== by 0x10FA75: build_dev_irqs (classify.c:743) ==152519== by 0x10ADD8: UnknownInlinedFun (classify.c:783) ==152519== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242) ==152519== by 0x10ADD8: main (irqbalance.c:664) ==152519== ==152519== Invalid read of size 8 ==152519== at 0x10CBB1: UnknownInlinedFun (classify.c:879) ==152519== by 0x10CBB1: remove_no_existing_irq (classify.c:861) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==152519== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==152519== by 0x11133A: scan (irqbalance.c:316) ==152519== by 0x48CB5A0: g_timeout_dispatch (gmain.c:4889) ==152519== by 0x48CAD4E: UnknownInlinedFun (gmain.c:3337) ==152519== by 0x48CAD4E: g_main_context_dispatch (gmain.c:4055) ==152519== by 0x491F607: 
g_main_context_iterate.constprop.0 (gmain.c:4131) ==152519== by 0x48CA462: g_main_loop_run (gmain.c:4329) ==152519== by 0x10AF0B: main (irqbalance.c:706) ==152519== Address 0x504b860 is 576 bytes inside a block of size 592 free'd ==152519== at 0x48470E4: free (vg_replace_malloc.c:872) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==152519== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==152519== by 0x10AED0: main (irqbalance.c:694) ==152519== Block was alloc'd at ==152519== at 0x4849464: calloc (vg_replace_malloc.c:1328) ==152519== by 0x10B99E: UnknownInlinedFun (classify.c:269) ==152519== by 0x10B99E: __add_banned_irq (classify.c:259) ==152519== by 0x10F6BE: add_new_irq (classify.c:615) ==152519== by 0x10FA17: build_one_dev_entry (classify.c:682) ==152519== by 0x10FA75: build_dev_irqs (classify.c:743) ==152519== by 0x10ADD8: UnknownInlinedFun (classify.c:783) ==152519== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242) ==152519== by 0x10ADD8: main (irqbalance.c:664) ==152519== ==152519== Invalid free() / delete / delete[] / realloc() ==152519== at 0x48470E4: free (vg_replace_malloc.c:872) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==152519== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==152519== by 0x11133A: scan (irqbalance.c:316) ==152519== by 0x48CB5A0: g_timeout_dispatch (gmain.c:4889) ==152519== by 0x48CAD4E: UnknownInlinedFun (gmain.c:3337) ==152519== by 0x48CAD4E: g_main_context_dispatch (gmain.c:4055) ==152519== by 0x491F607: g_main_context_iterate.constprop.0 (gmain.c:4131) ==152519== by 0x48CA462: g_main_loop_run (gmain.c:4329) ==152519== by 0x10AF0B: main (irqbalance.c:706) ==152519== Address 0x504b620 is 0 bytes inside a block of size 592 free'd ==152519== at 0x48470E4: free (vg_replace_malloc.c:872) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==152519== by 0x10FF00: 
UnknownInlinedFun (classify.c:893) ==152519== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==152519== by 0x10AED0: main (irqbalance.c:694) ==152519== Block was alloc'd at ==152519== at 0x4849464: calloc (vg_replace_malloc.c:1328) ==152519== by 0x10B99E: UnknownInlinedFun (classify.c:269) ==152519== by 0x10B99E: __add_banned_irq (classify.c:259) ==152519== by 0x10F6BE: add_new_irq (classify.c:615) ==152519== by 0x10FA17: build_one_dev_entry (classify.c:682) ==152519== by 0x10FA75: build_dev_irqs (classify.c:743) ==152519== by 0x10ADD8: UnknownInlinedFun (classify.c:783) ==152519== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242) ==152519== by 0x10ADD8: main (irqbalance.c:664) ==152519== ^C==152519== Invalid free() / delete / delete[] / realloc() ==152519== at 0x48470E4: free (vg_replace_malloc.c:872) ==152519== by 0x10C73E: UnknownInlinedFun (classify.c:694) ==152519== by 0x10C73E: UnknownInlinedFun (classify.c:798) ==152519== by 0x10C73E: free_irq_db (classify.c:702) ==152519== by 0x10AF3F: UnknownInlinedFun (irqbalance.c:249) ==152519== by 0x10AF3F: main (irqbalance.c:711) ==152519== Address 0x504b620 is 0 bytes inside a block of size 592 free'd ==152519== at 0x48470E4: free (vg_replace_malloc.c:872) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:798) ==152519== by 0x10FF00: UnknownInlinedFun (classify.c:893) ==152519== by 0x10FF00: parse_proc_interrupts (procinterrupts.c:358) ==152519== by 0x10AED0: main (irqbalance.c:694) ==152519== Block was alloc'd at ==152519== at 0x4849464: calloc (vg_replace_malloc.c:1328) ==152519== by 0x10B99E: UnknownInlinedFun (classify.c:269) ==152519== by 0x10B99E: __add_banned_irq (classify.c:259) ==152519== by 0x10F6BE: add_new_irq (classify.c:615) ==152519== by 0x10FA17: build_one_dev_entry (classify.c:682) ==152519== by 0x10FA75: build_dev_irqs (classify.c:743) ==152519== by 0x10ADD8: UnknownInlinedFun (classify.c:783) ==152519== by 0x10ADD8: UnknownInlinedFun (irqbalance.c:242) ==152519== by 0x10ADD8: main 
(irqbalance.c:664)
==152519==
==152519==
==152519== HEAP SUMMARY:
==152519== in use at exit: 21,769 bytes in 68 blocks
==152519== total heap usage: 8,131 allocs, 8,644 frees, 11,912,097 bytes allocated
==152519==
==152519== LEAK SUMMARY:
==152519== definitely lost: 0 bytes in 0 blocks
==152519== indirectly lost: 0 bytes in 0 blocks
==152519== possibly lost: 304 bytes in 1 blocks
==152519== still reachable: 21,465 bytes in 67 blocks
==152519== suppressed: 0 bytes in 0 blocks
==152519== Rerun with --leak-check=full to see details of leaked memory
==152519==
==152519== For lists of detected and suppressed errors, rerun with: -s
==152519== ERROR SUMMARY: 62969 errors from 5 contexts (suppressed: 0 from 0)

As above, I had to Ctrl-C to kill it, since it didn't seem to crash. It seems clear that there's a bug related to policy scripts that emit ban=true.

Regards,
Andy

But that being said, I was unable to reproduce this in a qemu vm with 4 cpus and that same ban=true policy script, so maybe it depends on my hardware. I've got an AMD 7443P cpu.

(In reply to Andrew Schorr from comment #3)
> But that being said, I was unable to reproduce this in a qemu vm with 4 cpus
> and that same ban=true policy script, so maybe it depends on my hardware.
> I've got an AMD 7443P cpu.

Hi Andrew,

Thanks for reporting the issue. However, I couldn't reproduce it on my machine, either with irqbalance-1.8.0-5.el9.x86_64 or with upstream irqbalance, using the command line and policy script you provided:

[root@amd-ethanol-01 tmp]# valgrind /usr/sbin/irqbalance --foreground --policyscript=/tmp/policy.sh
==31315== Memcheck, a memory error detector
==31315== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==31315== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==31315== Command: /usr/sbin/irqbalance --foreground --policyscript=/tmp/policy.sh
==31315==
^C==31315==
==31315== HEAP SUMMARY:
==31315== in use at exit: 21,769 bytes in 68 blocks
==31315== total heap usage: 6,366 allocs, 6,298 frees, 13,048,565 bytes allocated
==31315==
==31315== LEAK SUMMARY:
==31315== definitely lost: 0 bytes in 0 blocks
==31315== indirectly lost: 0 bytes in 0 blocks
==31315== possibly lost: 304 bytes in 1 blocks
==31315== still reachable: 21,465 bytes in 67 blocks
==31315== suppressed: 0 bytes in 0 blocks
==31315== Rerun with --leak-check=full to see details of leaked memory
==31315==
==31315== For lists of detected and suppressed errors, rerun with: -s
==31315== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I don't know whether it is related to the CPU; the one I used for testing is an AMD EPYC 7601 32-Core Processor.

Here are my suggestions:

1) Try upstream irqbalance (https://github.com/Irqbalance/irqbalance.git) to see whether the problem is reproducible on your machine. If it is, you can open an issue there.
2) Since it has only happened on one machine, you could try other AMD physical machines, if you have any, to see whether it is reproducible there.

If you have further findings, we can continue our discussion here.

Thanks,
Tao Liu

Thanks for doing some investigation. I did notice that after a system reboot, irqbalance started successfully without crashing. I then stopped it and had to run it a few times under valgrind before I started getting the error. So I think it's somehow dependent on the state of the system. When I get a chance, I'll try to duplicate this with upstream irqbalance. As for trying another machine: at the moment, I have only one machine running CentOS Stream 9, and I was restarting irqbalance repeatedly to test some changes to my policy script. That's how I discovered the issue. I have no idea whether it would happen on other systems, but I do think it somehow depends on restarting irqbalance.
Thanks,
Andy

Current git master works fine. I ran git bisect and found the patch that fixes the problem.

The bug is fixed by this patch:

commit 066499ad5231a8a8d37f08a3af5dd6c38431ce6f
Author: liuchao173 <55137861+liuchao173.github.com>
Date: Fri May 7 20:48:32 2021 +0800

    remove no existing irq in banned_irqs

    when a banned irq doesn't exist, it won't be removed from banned_irqs

 classify.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

(In reply to Andrew Schorr from comment #6)
> Current git master works fine. I ran git bisect and found the patch that
> fixes the problem.
> The bug is fixed by this patch:
>
> commit 066499ad5231a8a8d37f08a3af5dd6c38431ce6f
> Author: liuchao173 <55137861+liuchao173.github.com>
> Date: Fri May 7 20:48:32 2021 +0800
>
> remove no existing irq in banned_irqs
>
> when a banned irq doesn't exist, it won't be removed from banned_irqs
>
> classify.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)

Hi Andrew,

Thanks for your great work! I created another bug for rebasing irqbalance to v1.9.0, which will have this patch integrated automatically, so this bug will be fixed once the rebase is finished.

Thanks,
Tao Liu

This bz is fixed automatically by the rebase to v1.9.0, because v1.9.0 already contains the patch mentioned in comment 6.

Hi Jiri,

I think ITM is also needed for this bz, to get the release+ flag...

Thanks,
Tao Liu

After some valgrinding, I would say that this bug is indeed fixed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (irqbalance bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8328
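
The reproduction described in this report was intermittent: the valgrind errors only showed up after irqbalance had been stopped and restarted a few times. A rough sketch of how that manual procedure could be scripted follows. It is not taken from the report: the function names `valgrind_error_count` and `repro_loop`, the 60-second run time, and the use of `timeout -s INT` in place of the manual Ctrl-C are all assumptions, and it presumes the irqbalance service has already been stopped and that the commands run as root.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): read a Memcheck log on stdin
# and print the number from its final "ERROR SUMMARY: N errors ..." line.
valgrind_error_count() {
    sed -n 's/.*ERROR SUMMARY: \([0-9][0-9]*\) error.*/\1/p' | tail -n 1
}

# Hypothetical loop (name and defaults are assumptions): run irqbalance under
# valgrind several times with the given policy script and report the error
# count of each run. Usage: repro_loop [runs] [policy-script]
repro_loop() {
    runs=${1:-5}
    script=${2:-/tmp/policy.sh}
    i=1
    while [ "$i" -le "$runs" ]; do
        log=$(mktemp)
        # timeout sends SIGINT after 60 seconds, standing in for the Ctrl-C
        # used in the report; valgrind writes its report to stderr.
        timeout -s INT 60 valgrind /usr/sbin/irqbalance --foreground \
            --policyscript="$script" 2>"$log"
        count=$(valgrind_error_count <"$log")
        echo "run $i: ${count:-unknown} valgrind errors"
        rm -f "$log"
        i=$((i + 1))
    done
}
```

For example, `repro_loop 5 /tmp/policy.sh` would perform five runs with the ban=true script from the description. Going by the logs above, a nonzero count would be expected on an affected build in at least some runs and 0 on a fixed one, though a handful of clean runs does not prove the bug is absent, since the failure appears to depend on system state.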