Red Hat Bugzilla – Bug 243986
Divide by zero in _stp_gettimeofday_ns causes panic
Last modified: 2008-04-14 12:08:39 EDT
Description of problem:
I left a systemtap script running for a number of hours. Amongst other things,
it calls gettimeofday_ms() at least once per second. The kernel panicked with the
following backtrace:
#0 [f6870d74] crash_kexec at c0442c1a
#1 [f6870db8] die at c04054ae
#2 [f6870de8] do_divide_error at c0405a9b
#3 [f6870e98] error_code (via divide_error) at c0404a6f
EAX: 00000116 EBX: 00000000 ECX: 00000116 EDX: 00000000 EBP: 00000116
DS: 007b ESI: 00000000 ES: 007b EDI: 430c2df0
CS: 0060 EIP: f913ef92 ERR: ffffffff EFLAGS: 00010046
#4 [f6870ecc] _stp_gettimeofday_ns at f913ef92
#5 [f6870ef0] function_gettimeofday_ms at f913f06a
#6 [f6870f00] probe_1512 at f9149c0e
#7 [f6870f18] enter_kprobe_probe at f9143fb4
It looks like there's only a single divide in that function, and it's a divide
by CPU frequency. I'm running RHEL 5 GA on an Intel dual-core i686 processor.
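To illustrate the suspected failure mode, here is a minimal user-space sketch of a cycles-to-nanoseconds conversion that refuses to divide when the frequency reads as zero. This is not the systemtap runtime's code: the function name cycles_to_ns, the kHz unit, and the bool return convention are all assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (NOT the actual _stp_gettimeofday_ns code):
 * convert a cycle count to nanoseconds for a CPU frequency given in kHz.
 * Returns false, leaving *ns_out untouched, when the frequency is zero,
 * so the caller can fall back to another time source instead of trapping.
 */
static bool cycles_to_ns(uint64_t cycles, uint32_t freq_khz, uint64_t *ns_out)
{
    if (freq_khz == 0)
        return false;   /* dividing here would raise a divide error */

    /*
     * ns = cycles * 1,000,000 / freq_khz.
     * Split into quotient and remainder so the multiplication cannot
     * overflow 64 bits unless the result itself does not fit.
     */
    uint64_t whole = cycles / freq_khz;
    uint64_t rest  = cycles % freq_khz;

    *ns_out = whole * UINT64_C(1000000) + rest * UINT64_C(1000000) / freq_khz;
    return true;
}
```

With a guard of this kind, a caller that briefly observes a zero frequency (for example, while frequency scaling is updating the value; that cause is an assumption here) would get a failed conversion rather than a do_divide_error panic.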
Version-Release number of selected component (if applicable):
Linux mbooth.redhat.laptop 2.6.18-8.1.3.el5 #1 SMP Mon Apr 16 15:54:12 EDT 2007
i686 i686 i386 GNU/Linux
I meant to add to this that the target machine is a laptop, and CPU frequency
scaling is enabled. I'll keep the vmcore around for a short while.
If you still have this script, or remember what it contained, please
send a copy over. I can't seem to reproduce this on this end. There
have been several changes to this area of the systemtap runtime code,
so it's also worth retesting.
Created attachment 297905 [details]
Syscall usage stap script
I'm fairly sure this was the script which caused the bug. Unfortunately I've
cleared out the vmcore now.
The timekeeping-related code was reworked around 2007-10. More current code
(including the 0.6.* variants in brew) does not appear to trigger this bug.
If able, try the script on one of these later versions, and consider closing
the bug CURRENTRELEASE.
Please reopen this bug if you see its ugly exoskeleton again.