Bug 236817

Summary: irqbalance crash on the RT kernel.
Product: Red Hat Enterprise MRG Reporter: Gurhan Ozen <gozen>
Component: realtime-kernel Assignee: Neil Horman <nhorman>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 1.0 CC: jburke
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: in 5.1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-10-02 14:57:21 UTC Type: ---
Attachments:
Description                             Flags
dmesg output for the irqbalance issue   none
/proc/cpuinfo output                    none

Description Gurhan Ozen 2007-04-17 19:43:08 UTC
Description of problem:
  I noticed that the RT kernel fails to start the irqbalance daemon. The dmesg output
about the crash is too long, so I will attach it to this report.

Version-Release number of selected component (if applicable):
# uname -a
Linux dell-pe1950-02.rhts.boston.redhat.com 2.6.20-19.el5rt #1 SMP PREEMPT Mon
Apr 16 12:14:21 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux


How reproducible:
Every time

Steps to Reproduce:
1. Just boot into the RT kernel.
  
Actual results:


Expected results:


Additional info:

Comment 1 Gurhan Ozen 2007-04-17 19:43:08 UTC
Created attachment 152847
dmesg output for the irqbalance issue

Comment 2 Gurhan Ozen 2007-04-17 19:44:09 UTC
Created attachment 152848
/proc/cpuinfo output

Comment 3 Gurhan Ozen 2007-04-18 15:32:16 UTC
FWIW, I am not seeing the backtrace with the 2.6.20-21.el5rt kernel. However,
irqbalance still crashes:

kernel: irqbalance[3757]: segfault at 00005555557cc8f0 rip 0000555555555cdb rsp 00007fff96ae3f80 error 6
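
For readers decoding the segfault line above: the trailing number is the x86 page-fault error code, printed in hex. The sketch below is only an illustration, not a tool used in this report. The regular expression is an assumption based on the single log line quoted here, and the helper name `decode_segfault` is hypothetical. Bits 0 to 2 of the error code mean "page was present", "access was a write", and "fault came from user mode".

```python
import re

# Pattern assumed from the one log line quoted in this report (2.6-era x86_64 format).
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)

def decode_segfault(line):
    """Return the fields of a segfault log line plus the decoded error bits."""
    # The log line may be wrapped, so collapse all whitespace first.
    m = SEGFAULT_RE.search(" ".join(line.split()))
    if m is None:
        return None
    err = int(m.group("err"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),
        "rip": int(m.group("rip"), 16),
        "rsp": int(m.group("rsp"), 16),
        "page_present": bool(err & 1),  # bit 0: 0 = page not mapped
        "write": bool(err & 2),         # bit 1: 1 = write access
        "user_mode": bool(err & 4),     # bit 2: 1 = fault in user mode
    }

line = ("kernel: irqbalance[3757]: segfault at 00005555557cc8f0 rip 0000555555555cdb rsp\n"
        "00007fff96ae3f80 error 6")
info = decode_segfault(line)
print(info)
```

Read this way, `error 6` (binary 110) is a user-mode write to an address with no page mapped.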

Here is the strace output:

# strace irqbalance
execve("/usr/sbin/irqbalance", ["irqbalance"], [/* 22 vars */]) = 0
brk(0)                                  = 0x55555576a000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae9d0486000
uname({sys="Linux", node="dell-pe1950-03.rhts.boston.redhat.com", ...}) = 0
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=97593, ...}) = 0
mmap(NULL, 97593, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2ae9d0487000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\331"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1678480, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae9d049f000
mmap(NULL, 3461272, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x2ae9d04a0000
mprotect(0x2ae9d05e4000, 2097152, PROT_NONE) = 0
mmap(0x2ae9d07e4000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x144000) = 0x2ae9d07e4000
mmap(0x2ae9d07e9000, 16536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2ae9d07e9000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae9d07ee000
arch_prctl(ARCH_SET_FS, 0x2ae9d07ee6f0) = 0
mprotect(0x2ae9d07e4000, 16384, PROT_READ) = 0
mprotect(0x555555756000, 4096, PROT_READ) = 0
mprotect(0x3fdda19000, 4096, PROT_READ) = 0
munmap(0x2ae9d0487000, 97593)           = 0
brk(0)                                  = 0x55555576a000
brk(0x55555578b000)                     = 0x55555578b000
open("/proc/cpuinfo", O_RDONLY)         = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae9d0487000
read(3, "processor\t: 0\nvendor_id\t: Genuin"..., 1024) = 1024
read(3, "i mmx fxsr sse sse2 ss ht tm sys"..., 1024) = 1024
read(3, ": 2327.531\ncache size\t: 4096 KB\n"..., 1024) = 484
read(3, "", 1024)                       = 0
close(3)                                = 0
munmap(0x2ae9d0487000, 4096)            = 0
open("/proc/interrupts", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae9d0487000
read(3, "           CPU0       CPU1      "..., 1024) = 1024
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
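
The trace above ends right after the first read of /proc/interrupts. A minimal sketch of how that can be pulled out of an strace log mechanically: it tracks successful `open()` and `close()` calls and reports which files were still open when the first signal arrived. The function name and the regular expressions are assumptions written for this old-style trace format (plain `open`, not `openat`), and the embedded sample is an excerpt of the trace in this comment. It says nothing about why the parser crashed, only where the process was.

```python
import re

# Patterns assumed for the strace format shown in this report.
OPEN_RE = re.compile(r'^open\("(?P<path>[^"]+)",.*\)\s+= (?P<fd>\d+)$')
CLOSE_RE = re.compile(r'^close\((?P<fd>\d+)\)\s+= 0$')
SIGNAL_RE = re.compile(r'^--- (?P<sig>SIG\w+)')

def files_open_at_signal(trace_lines):
    """Return (signal_name, {fd: path}) at the first signal in an strace log."""
    open_fds = {}
    for line in trace_lines:
        line = line.strip()
        m = OPEN_RE.match(line)
        if m:  # successful open: remember which path the fd refers to
            open_fds[int(m.group("fd"))] = m.group("path")
            continue
        m = CLOSE_RE.match(line)
        if m:  # successful close: the fd is no longer in use
            open_fds.pop(int(m.group("fd")), None)
            continue
        m = SIGNAL_RE.match(line)
        if m:  # first signal delivery: report the state at this point
            return m.group("sig"), dict(open_fds)
    return None, dict(open_fds)

# Excerpt of the trace from this comment.
trace = '''\
open("/proc/cpuinfo", O_RDONLY)         = 3
close(3)                                = 0
open("/proc/interrupts", O_RDONLY)      = 3
read(3, "           CPU0       CPU1      "..., 1024) = 1024
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
'''.splitlines()

sig, fds = files_open_at_signal(trace)
print(sig, fds)
```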

Comment 4 Tim Burke 2007-04-19 02:13:18 UTC
Hi Gozen - we are intending to use the new irqbalance which Neil Horman has
cooked up for RHEL5.1.  Does that version crash?


Comment 5 Neil Horman 2007-04-19 15:44:50 UTC
The dmesg output in comment 1 looks unrelated to irqbalance (it's a kernel
problem, perhaps triggered by some irqbalance behavior). It probably warrants a
separate bugzilla.

As for the irqbalance crash, it looks like an old problem that has since been
fixed. If you run the latest version of irqbalance (irqbalance-0.55-3.el5, which I
built yesterday), I expect this problem will clear up. If not, please capture a
core dump of the process and attach it here. I'll happily dig into it. Thanks!
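
One possible way to capture the core dump requested above, sketched under assumptions: this is a generic Linux wrapper, not a procedure from this report, and the helper names are hypothetical. It raises the child's soft core-size limit to the hard limit before exec and reports how the process ended. Whether and where a core file is actually written still depends on the hard limit and on the system's `core_pattern` setting.

```python
import resource
import signal
import subprocess

def allow_core_dumps():
    """Raise this process's soft core-size limit to its hard limit."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode < 0:
        # subprocess reports death by signal N as return code -N
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode

def run_with_core(cmd):
    """Run cmd with core dumps allowed as far as the hard limit permits."""
    # preexec_fn runs in the child just before exec, so only the
    # child's resource limit is changed, not the caller's.
    proc = subprocess.run(cmd, preexec_fn=allow_core_dumps)
    return describe_exit(proc.returncode)

# Hypothetical usage on the affected machine:
# print(run_with_core(["irqbalance"]))
```

If the run is reported as killed by SIGSEGV and a core file appears, that file is what should be attached.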

Comment 6 Gurhan Ozen 2007-04-19 16:20:04 UTC
Neil,
Thanks a lot. Yes, the problem in comment #1 doesn't happen with the latest RT kernel.

As for the newer irqbalance package, it does work! Though I have to admit the
version numbers kind of threw me off, since what I originally had was
irqbalance-1.13-9.el5.
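
The version confusion is easy to reproduce. The sketch below compares the two package names from this thread using only their leading numeric fields. It is a deliberate simplification and not rpm's real comparison algorithm, which also handles epochs and alphanumeric segments. The helper names are hypothetical, and this report does not say how the packaging handled the ordering.

```python
def split_nvr(nvr):
    """Split 'name-version-release' into its three parts."""
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release

def numeric_key(text):
    """Leading numeric dot-separated fields, e.g. '3.el5' -> (3,)."""
    key = []
    for part in text.split("."):
        if not part.isdigit():
            break  # stop at the first non-numeric field such as 'el5'
        key.append(int(part))
    return tuple(key)

def sort_key(nvr):
    """Simplified ordering key: numeric version fields, then release fields."""
    _name, version, release = split_nvr(nvr)
    return numeric_key(version), numeric_key(release)

old = "irqbalance-1.13-9.el5"  # originally installed package
new = "irqbalance-0.55-3.el5"  # newer build that fixes the crash

print(sort_key(old))
print(sort_key(new))
print(sort_key(new) < sort_key(old))
```

Under this naive ordering the fixed package sorts below the installed one, which matches the surprise described above.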



Comment 7 Neil Horman 2007-04-19 17:01:28 UTC
Yeah, we have a version rev slated for irqbalance in 5.1. It's out of the
ordinary, but it just wasn't ready in time for GA.