Bug 340161 - system crash on kjournald with div64_64
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 7
Hardware: i386 Linux
Priority: low
Severity: medium
Target Milestone: ---
Target Release: ---
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Depends On:
Blocks:
 
Reported: 2007-10-19 13:35 EDT by Robert Hoekstra
Modified: 2007-12-07 18:35 EST (History)
CC List: 0 users

See Also:
Fixed In Version: 2.6.23.8-34.fc7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2007-12-07 18:35:57 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments
crash 1, crashed on md1_raid1, not kjournald (241.58 KB, image/jpeg)
2007-10-19 13:35 EDT, Robert Hoekstra
This crash is on kjournald, not sure if it matters (321.74 KB, image/jpeg)
2007-10-19 13:36 EDT, Robert Hoekstra
crash on kjournald, kernel 2.6.23.1-10.fc7 (103.49 KB, image/jpeg)
2007-11-07 14:16 EST, Robert Hoekstra
The dmesg of the machine (18.12 KB, application/octet-stream)
2007-11-07 14:17 EST, Robert Hoekstra

Description Robert Hoekstra 2007-10-19 13:35:25 EDT
Description of problem:
After a couple of hours (or, when lucky, days) the system hangs with a dump of a
process on screen. It doesn't show a 'kernel panic' and does not restart after 5
seconds (though I have that set in sysctl.conf).

This has been occurring for the past month, at shortening intervals.

Version-Release number of selected component (if applicable):
I am running an up-to-date Fedora 7 system with kernel-2.6.22-9.91.fc7. I have
seen this with a number of earlier kernels, though, so this version is not
specifically the cause.

How reproducible:
Unknown. My last change to my system is adding a Promise SATA card to my P3
system, with a SATA disk attached.

Steps to Reproduce:
1. Boot up.
2. Wait (sorry, no way to actually reproduce).
  
Actual results:
system hang

Expected results:
... no system hang ?

Additional info:
I will attach a couple of pictures I took of my system when it crashed. You will
notice that the crash is sometimes in the md1_raid1 process and sometimes in
kjournald.

I have run a memory test to verify that my memory is functioning properly.
Comment 1 Robert Hoekstra 2007-10-19 13:35:25 EDT
Created attachment 232821
crash 1, crashed on md1_raid1, not kjournald
Comment 2 Robert Hoekstra 2007-10-19 13:36:26 EDT
Created attachment 232831
This crash is on kjournald, not sure if it matters
Comment 3 Robert Hoekstra 2007-10-19 13:38:01 EDT
I forgot:

my lspci:
---------------------------------------
00:00.0 Host bridge: VIA Technologies, Inc. VT8605 [ProSavage PM133] (rev 81)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8605 [PM133 AGP]
00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 1b)
00:04.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:04.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 0e)
00:04.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 0e)
00:04.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 20)
00:0e.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
00:0f.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
00:10.0 Mass storage controller: Promise Technology, Inc. PDC40775 (SATA 300 TX2plus) (rev 02)
00:11.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400/G450 (rev 85)
---------------------------------------

for what it matters, my cpuinfo:
---------------------------------------
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 6
cpu MHz         : 803.448
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 mtrr pge mca cmov pat pse36 mmx fxsr sse up
bogomips        : 1608.22
clflush size    : 32
---------------------------------------
Comment 4 Robert Hoekstra 2007-11-07 14:16:59 EST
Created attachment 250651
crash on kjournald, kernel 2.6.23.1-10.fc7

The newest kernel, 2.6.23.1-10.fc7, also crashes. Here's a new crash dump.
Comment 5 Robert Hoekstra 2007-11-07 14:17:47 EST
Created attachment 250661
The dmesg of the machine
Comment 6 Chuck Ebbert 2007-11-07 18:05:44 EST
You can work around this problem in the kernel by adding this to /etc/sysctl.conf:

kernel.sched_features = 21

This will disable precise CPU time accounting, a feature that is already removed
in 2.6.24.
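
For readers wondering why the value is 21: kernel.sched_features is a bitmask. The sketch below decodes it under the assumption that the feature bits have the values listed in the code, which are recalled from the 2.6.23 scheduler source and are not stated anywhere in this report. Under the same assumption, a mask of 29 with the PRECISE_CPU_LOAD bit (8) cleared gives 21. The helper names are hypothetical and exist only for illustration.

```python
# Assumed bit values for kernel.sched_features in 2.6.23.
# These are NOT given in this bug report; verify against your kernel source.
SCHED_FEATURES = {
    "FAIR_SLEEPERS": 1,
    "SLEEPER_AVG": 2,
    "SLEEPER_LOAD_AVG": 4,
    "PRECISE_CPU_LOAD": 8,
    "START_DEBIT": 16,
    "SKIP_INITIAL": 32,
}


def decode_features(mask):
    """Return the names of the feature bits that are set in mask."""
    return [name for name, bit in SCHED_FEATURES.items() if mask & bit]


def clear_feature(mask, name):
    """Return mask with the named feature bit cleared."""
    return mask & ~SCHED_FEATURES[name]


if __name__ == "__main__":
    # The workaround value from this comment.
    print(decode_features(21))
    # Clearing the precise-accounting bit from an assumed mask of 29.
    print(clear_feature(29, "PRECISE_CPU_LOAD"))
```

If the bit layout on a given kernel differs, the value 21 would mean something else, so check the running kernel's source before reusing the number.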
Comment 7 Chuck Ebbert 2007-11-09 14:51:58 EST
The feature is now disabled by default. A user could re-enable it, but that's not very likely.
Comment 8 Robert Hoekstra 2007-11-09 14:56:42 EST
Thank you for the response. I have applied the suggested change. Unfortunately,
only time will tell whether this is the solution, but I am hopeful. I will
report back within two or three weeks with my results and then close the bug,
if that's okay.
Comment 9 Chuck Ebbert 2007-11-13 17:17:31 EST
In 2.6.23.1-28
Comment 10 Jan Gutter 2007-11-23 03:17:43 EST
Although I'm not a Red Hat or Fedora user, I've been bitten by the selfsame bug.
This patch definitely fixes the problem.

Just a small query: with the feature disabled by default, the likely/unlikely
hint in sched.c is now the wrong way around. Would it be prudent to amend it,
or is that too invasive for "stable"?

--- ./kernel/sched.c~   2007-11-19 10:37:44.000000000 +0200
+++ ./kernel/sched.c    2007-11-19 10:37:44.000000000 +0200
@@ -1988,7 +1988,7 @@
        int i, scale;

        this_rq->nr_load_updates++;
-       if (unlikely(!(sysctl_sched_features & SCHED_FEAT_PRECISE_CPU_LOAD)))
+       if (likely(!(sysctl_sched_features & SCHED_FEAT_PRECISE_CPU_LOAD)))
                goto do_avg;

        /* Update delta_fair/delta_exec fields first */
