Description of problem:
Kernel 2.4.9-e.12 crashes when Amanda-client-2.4.2p2-4 tries to back up big files (several gigabytes in size).

Version-Release number of selected component (if applicable):
Amanda-Client computer:
  RedHat AS2.1
  Kernel: 2.4.9-e.12smp
  Amanda-client-2.4.2p2-4
Tape server computer:
  RedHat 8.0
  Kernel: 2.4.18-24.8.0smp
  Amanda-server-2.4.2p2-9

How reproducible:
I would rather not try to reproduce it, because it crashed a production Oracle server.

Steps to Reproduce:
1.
2.
3.

Actual results:
Server crash

Expected results:
Tape backup

Additional info:
The comp-root-tar method was used to back up the files.
Also, the Amanda-Client computer is:
- Oracle database server
- Netdump client for the netdump-server service that runs on the "Tape backup" computer.

Here is a "screen shot" from the dead Oracle server:

EIP:    0010:[<f8cd3131>]    Not tainted
EFLAGS: 00000086
EIP is at freeze_cpu [netconsole] 0x21
eax: f8cd49d5   ebx: f7ffc000   ecx: c02f50a4   edx: 00000003
esi: f7ffc000   edi: f7ffc000   ebp: c0105400   esp: f7ffdf6c
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, stackpage=f7ffd000)
Stack: c01139ef 00000000 c0105400 c02455ba c0105400 f7ffc000 00000003 f7ffc000
       f7ffc000 c0105400 00000000 f7ff0018 f7ff0018 fffffffa c010542e 00000010
       00000246 c0105492 0702080b 00000000 00000000 00000000 00000202 0000000d
Call Trace:
[<c01139ef>] smp_call_function_interrupt [kernel] 0x2f
[<c0105400>] default_idle [kernel] 0x0
[<c02455ba>] call_call_function_interrupt [kernel] 0x5
[<c0105400>] default_idle [kernel] 0x0
[<c0105400>] default_idle [kernel] 0x0
[<c010542e>] default_idle [kernel] 0x2e
[<c0105492>] cpu_idle [kernel] 0x32
[<c011c8b8>] printk [kernel] 0xd8
[<c0263c2f>] .rodata.str1.1 [kernel] 0xd2a

Code: eb fd 8d b6 00 00 00 00 8d bc 27 00 00 00 00 55 57 56 53 81
console shuts up ...
console shuts up ...
This is clearly a kernel bug, since nothing any userspace app (including amanda) does should ever cause a kernel crash. I'm reassigning it to the kernel folks.
You really need to try to reproduce this before we can make any progress debugging the issue. The trace basically shows only that the system is in the netconsole code on one of the CPUs. Can you attach a serial console and retry running Amanda?

Larry Woodman
I use netdump-0.6.6-1 with an Ethernet connection to 'netdump-server'. Is that OK, or should I set up the same thing over a serial port?

Thanks,
Boris
Hi Larry,

Here is another screen shot of the "Screen of death":

CPU:    1
EIP:    0010:[<f8cd3131>]    Not tainted
EFLAGS: 00000086
EIP is at freeze_cpu [netconsole] 0x21
eax: f8cd49d5   ebx: f7ffe000   ecx: c02f50a4   edx: 00000001
esi: f7ffe000   edi: f7ffe000   ebp: c0105400   esp: f7ffff6c
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, stackpage=f7fff000)
Stack: c01139ef 00000000 c0105400 c02455ba c0105400 f7ffe000 00000001 f7ffe000
       f7ffe000 c0105400 00000000 f7ff0018 f7ff0018 fffffffa c010542e 00000010
       c0384c36 00001356 105492 0102080b 00000000 00000000 00000000 0 (130841611)/
Call Trace:
[<c01139ef>] smp_call_function_interrupt [kernel] 0x2f
[<c0105400>] default_idle [kernel] 0x0
[<c02455ba>] call_call_function_interrupt [kernel] 0x5
[<c0105400>] default_idle [kernel] 0x0
[<c0105400>] default_idle [kernel] 0x0
[<c010542e>] default_idle [kernel] 0x2e
[<c0105492>] cpu_idle [kernel] 0x32
[<c011c5e6>] __call_console_drivers [kernel] 0x46
[<c011c75b>] call_console_drivers [kernel] 0xeb

Code: eb fd 8d b6 00 00 00 00 8d bc 27 00 00 00 00 55 57 56 53 81
console shuts up ...

Best regards,
Boris
Hello,

Kernel 2.4.9-e14.1smp seems to be free of the bug. Amanda has been working fine for the last several weeks.

Thanks,
Boris
Problem was fixed long ago. Larry Woodman