Bug 164091 - (bad pmd) mmap bug detected in kernel shortly after crond run
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 4
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Dave Jones
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-07-24 17:56 UTC by gdelx001
Modified: 2015-01-04 22:21 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-11-05 08:06:31 UTC
Type: ---
Embargoed:



Description gdelx001 2005-07-24 17:56:24 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc4 Firefox/1.0.6

Description of problem:
The machine locked up; fortunately, after a reset, the log showed the problem along with stack dumps.  The log reports an mmap bug a few seconds after crond runs; there may or may not be a causal relationship.  The machine was "idle" when this happened, with a (GNOME) desktop running.


Version-Release number of selected component (if applicable):
kernel-2.6.12-1.1398_FC4smp

How reproducible:
Didn't try


Additional info:

System was installed with FC3, anaconda upgraded to FC4.

Jul 24 04:22:01 xxxxxx crond(pam_unix)[21793]: session opened for user root by (uid=0)
Jul 24 04:22:08 xxxxxx kernel: mm/memory.c:105: bad pmd ffff810060eea0f0(00007fffff8cefd6).
Jul 24 04:22:08 xxxxxx kernel: sh[23619]: segfault at 0000003340d13b1a rip 0000003340d13b1a rsp 00007fffff8cc438 error 14
Jul 24 04:22:08 xxxxxx kernel: ----------- [cut here ] --------- [please bite here ] ---------
Jul 24 04:22:08 xxxxxx kernel: Kernel BUG at "mm/mmap.c":2026
Jul 24 04:22:08 xxxxxx kernel: invalid operand: 0000 [1] SMP
Jul 24 04:22:08 xxxxxx kernel: CPU 0
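
To check whether the gap between the crond session and the "bad pmd" message is consistent across occurrences, the log can be scanned with a small script. The following is only a rough sketch: the function names (`parse_line`, `cron_to_bad_pmd_delays`) are made up for illustration, the year is assumed because syslog lines do not carry one, and the two sample lines are copied from the excerpt above.

```python
import re
from datetime import datetime

# Matches classic syslog lines: "Mon DD HH:MM:SS host tag: message".
LINE_RE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) (\S+) ([^:]+): (.*)$")


def parse_line(line, year=2005):
    """Return (timestamp, tag, message), or None if the line does not match.

    The year is an assumption; syslog timestamps omit it.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    stamp, _host, tag, msg = m.groups()
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return when, tag, msg


def cron_to_bad_pmd_delays(lines, year=2005):
    """Seconds between the latest crond session start and each 'bad pmd' line."""
    last_cron = None
    delays = []
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue
        when, tag, msg = parsed
        if tag.startswith("crond") and "session opened" in msg:
            last_cron = when
        elif tag == "kernel" and "bad pmd" in msg and last_cron is not None:
            delays.append((when - last_cron).total_seconds())
    return delays


# Two lines taken from the log excerpt in this report.
SAMPLE = [
    "Jul 24 04:22:01 xxxxxx crond(pam_unix)[21793]: session opened for user root by (uid=0)",
    "Jul 24 04:22:08 xxxxxx kernel: mm/memory.c:105: bad pmd ffff810060eea0f0(00007fffff8cefd6).",
]

print(cron_to_bad_pmd_delays(SAMPLE))  # [7.0]
```

A short, repeatable delay would only suggest a correlation with whatever the cron job runs; it would not by itself show that crond is the cause.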

Comment 1 Dave Jones 2005-07-28 07:36:49 UTC
We saw 'bad pmd' messages on x86-64 around 2.6.11, but these seemed to have
disappeared with 2.6.12.  Can you run memtest86 on this box for a few hours just
to check the most obvious thing?

If it passes the memtest, this bug is sadly still with us.

Comment 2 gdelx001 2005-08-22 02:21:25 UTC
Sorry about the delay.  Took me a while, ran memtest86 through a few passes: no
errors.

It took about a month of continuous runtime, but the same exact problem (with
the same kernel) happened again, in exactly the same sequence (4:22am crond and
this time 30 seconds later the bad pmd showed up, machine locked).


Comment 3 Dave Jones 2005-08-26 23:31:31 UTC
Please try the latest updates-testing kernel, which contains a workaround for a
hardware bug in the CPU.


Comment 4 Dave Jones 2005-09-30 07:21:05 UTC
Mass update to all FC4 bugs:

An update has been released (2.6.13-1.1526_FC4) which rebases to a new upstream
kernel (2.6.13.2). As there were ~3500 changes upstream between this and the
previous kernel, it's possible your bug has been fixed already.

Please retest with this update, and update this bug if necessary.

Thanks.


Comment 5 gdelx001 2005-10-30 17:47:34 UTC
I stuck with 2.6.12-1.1435_FC4smp (updates-testing) and it has not hit this
problem in over 2 months of continuous runtime.  Barring regressions in the newer
kernels (including *1526_FC4smp), it appears we're okay against this bug.
Please change the status of this bug to the appropriate resolved state.



