Bug 155857 (badpmd)

Summary: kernel 2.6.11-1.14_FC3 oopsing after "kernel: mm/memory.c:97: bad pmd" on x86_64/smp
Product: [Fedora] Fedora
Reporter: Axel Thimm <axel.thimm>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 3
CC: developer, iglesias, j, menscher, mrsam, paul+rhbugz, pfrields, prigault, thh
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2005-11-08 04:58:02 UTC
Attachments:
  NMI, sleeping function called from invalid context (flags: none)

Description Axel Thimm 2005-04-24 21:01:24 UTC
Description of problem:
After upgrading to kernel-smp-2.6.11-1.14_FC3 a dual-opteron tyan 2882 system
displays several log errors like

Apr 21 13:53:58 n26 kernel: mm/memory.c:97: bad pmd ffff8100126a4808(00000035f8800a88).
Apr 21 13:53:58 n26 kernel: mm/memory.c:97: bad pmd ffff8100126a4810(0000000000000001).
Apr 21 13:53:58 n26 kernel: mm/memory.c:97: bad pmd ffff8100126a4818(00007ffffffffa32).

and finally starts to continually oops:

Apr 21 13:56:17 n26 kernel: mm/memory.c:97: bad pmd
ffff81000e1a8b10(34365f3638780000).
Apr 21 13:56:17 n26 kernel: Unable to handle kernel paging request at
ffff810250800000 RIP: 
Apr 21 13:56:17 n26 kernel: <ffffffff80177024>{free_pages_and_swap_cache+68}
Apr 21 13:56:17 n26 kernel: PGD 8063 PUD 0 
Apr 21 13:56:17 n26 kernel: Oops: 0000 [1] SMP 
Apr 21 13:56:17 n26 kernel: CPU 0 
Apr 21 13:56:17 n26 kernel: Modules linked in: nfs lockd md5 ipv6 parport_pc lp
parport autofs4 sunrpc pcmcia yenta_socket rsrc_nonstatic pcmcia_
core video button battery ac ohci_hcd e100 mii tg3 floppy dm_snapshot dm_zero
dm_mirror ext3 jbd dm_mod sata_sil libata sd_mod scsi_mod
Apr 21 13:56:17 n26 kernel: Pid: 20886, comm: id Not tainted 2.6.11-1.14_FC3smp
Apr 21 13:56:17 n26 kernel: RIP: 0010:[<ffffffff80177024>]
<ffffffff80177024>{free_pages_and_swap_cache+68}
Apr 21 13:56:17 n26 kernel: RSP: 0000:ffff810015ef1ce8  EFLAGS: 00010202
Apr 21 13:56:17 n26 kernel: RAX: 0000000001000000 RBX: ffff810250800000 RCX:
ffff81000156a450
Apr 21 13:56:17 n26 kernel: RDX: ffff8100014064f0 RSI: 0000000000000001 RDI:
0000000000000068
Apr 21 13:56:17 n26 kernel: RBP: 0000000000000004 R08: ffff81007f107240 R09:
000000000000000f
Apr 21 13:56:17 n26 kernel: R10: 0000000000000001 R11: ffffffff8011caf0 R12:
ffff810001e083a8
Apr 21 13:56:17 n26 kernel: R13: 0000000000000005 R14: 0000000000000005 R15:
ffff810001e083a0
Apr 21 13:56:17 n26 kernel: FS:  00002aaaaafbd0a0(0000)
GS:ffffffff804e8980(0000) knlGS:00000000557bf0a0
Apr 21 13:56:17 n26 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 21 13:56:17 n26 kernel: CR2: ffff810250800000 CR3: 0000000019252000 CR4:
00000000000006e0
Apr 21 13:56:17 n26 kernel: Process id (pid: 20886, threadinfo ffff810015ef0000,
task ffff81003fb9a030)
Apr 21 13:56:17 n26 kernel: Stack: 0000800000000000 ffff810001e08280
ffff81007c8049c0 ffff81007c804940 
Apr 21 13:56:17 n26 kernel:        ffff81007c8049b8 0000000000000001
ffff810015ef1ef8 ffffffff80172ac3 
Apr 21 13:56:17 n26 kernel:        0000000000000000 ffff810001e08280 
Apr 21 13:56:17 n26 kernel: Call Trace:<ffffffff80172ac3>{exit_mmap+307}
<ffffffff80135ec4>{mmput+52} 
Apr 21 13:56:17 n26 kernel:        <ffffffff8013adb3>{do_exit+355}
<ffffffff801436d5>{__dequeue_signal+485} 
Apr 21 13:56:17 n26 kernel:        <ffffffff8013b8ff>{do_group_exit+239}
<ffffffff801457da>{get_signal_to_deliver+1514} 
Apr 21 13:56:17 n26 kernel:        <ffffffff8010d963>{do_signal+163}
<ffffffff80201451>{__up_write+49} 
Apr 21 13:56:17 n26 kernel:        <ffffffff8010ebb2>{retint_signal+62} 
Apr 21 13:56:17 n26 kernel: 
Apr 21 13:56:17 n26 kernel: Code: 8b 03 a9 00 00 01 00 74 1b f0 0f ba 2b 00 19
c0 85 c0 75 10 
Apr 21 13:56:17 n26 kernel: RIP <ffffffff80177024>{free_pages_and_swap_cache+68}
RSP <ffff810015ef1ce8>
Apr 21 13:56:17 n26 kernel: CR2: ffff810250800000
Apr 21 13:56:17 n26 kernel:  <1>Unable to handle kernel NULL pointer dereference
at 0000000000000048 RIP: 
Apr 21 13:56:17 n26 kernel: <ffffffff80136026>{mm_release+86}
Apr 21 13:56:17 n26 kernel: PGD 0 
Apr 21 13:56:17 n26 kernel: Oops: 0000 [2] SMP 
Apr 21 13:56:17 n26 kernel: CPU 0 
Apr 21 13:56:17 n26 kernel: Modules linked in: nfs lockd md5 ipv6 parport_pc lp
parport autofs4 sunrpc pcmcia yenta_socket rsrc_nonstatic pcmcia_
core video button battery ac ohci_hcd e100 mii tg3 floppy dm_snapshot dm_zero
dm_mirror ext3 jbd dm_mod sata_sil libata sd_mod scsi_mod
Apr 21 13:56:17 n26 kernel: Pid: 20886, comm: id Not tainted 2.6.11-1.14_FC3smp
Apr 21 13:56:17 n26 kernel: RIP: 0010:[<ffffffff80136026>]
<ffffffff80136026>{mm_release+86}
Apr 21 13:56:17 n26 kernel: RSP: 0000:ffff810015ef1a78  EFLAGS: 00010206
Apr 21 13:56:17 n26 kernel: RAX: ffff81003fb9a030 RBX: ffff81003fb9a030 RCX:
ffff81003fb9a030
Apr 21 13:56:17 n26 kernel: RDX: ffff81003fb9a000 RSI: 0000000000000000 RDI:
00002aaaaafbd130
Apr 21 13:56:17 n26 kernel: RBP: 0000000000000000 R08: ffffffff80529a00 R09:
0000000000000008
Apr 21 13:56:17 n26 kernel: R10: 0000000000000000 R11: ffffffff8011caf0 R12:
0000000000000000
Apr 21 13:56:17 n26 kernel: R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
Apr 21 13:56:17 n26 kernel: FS:  00002aaaaafbd0a0(0000)
GS:ffffffff804e8980(0000) knlGS:00000000557bf0a0
Apr 21 13:56:17 n26 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 21 13:56:17 n26 kernel: CR2: 0000000000000048 CR3: 0000000019252000 CR4:
00000000000006e0
Apr 21 13:56:17 n26 kernel: Process id (pid: 20886, threadinfo ffff810015ef0000,
task ffff81003fb9a030)
Apr 21 13:56:17 n26 kernel: Stack: 0000000000000000 0000000000000000
ffff81003fb9a030 ffff81003fb9a030 
Apr 21 13:56:17 n26 kernel:        0000000000000009 ffffffff8013a93a
0000000000000000 0000000000000001 
Apr 21 13:56:17 n26 kernel:        ffff81003fb9a6b0 ffff81003fb9a030 
Apr 21 13:56:17 n26 kernel: Call Trace:<ffffffff8013a93a>{exit_mm+42}
<ffffffff8013adb3>{do_exit+355} 
Apr 21 13:56:17 n26 kernel:        <ffffffff8010fe68>{oops_end+40}
<ffffffff80122b52>{do_page_fault+2050} 
Apr 21 13:56:17 n26 kernel:        <ffffffff80138d30>{vprintk+528}
<ffffffff8010f041>{error_exit+0} 
Apr 21 13:56:17 n26 kernel:        <ffffffff8011caf0>{flat_send_IPI_mask+0}
<ffffffff80177024>{free_pages_and_swap_cache+68} 
Apr 21 13:56:17 n26 kernel:       
<ffffffff8017705d>{free_pages_and_swap_cache+125} <ffffffff80172ac3>{exit_mmap+307} 
Apr 21 13:56:17 n26 kernel:        <ffffffff80135ec4>{mmput+52}
<ffffffff8013adb3>{do_exit+355} 
Apr 21 13:56:17 n26 kernel:        <ffffffff801436d5>{__dequeue_signal+485}
<ffffffff8013b8ff>{do_group_exit+239} 
Apr 21 13:56:17 n26 kernel:       
<ffffffff801457da>{get_signal_to_deliver+1514} <ffffffff8010d963>{do_signal+163} 
Apr 21 13:56:17 n26 kernel:        <ffffffff80201451>{__up_write+49}
<ffffffff8010ebb2>{retint_signal+62} 
Apr 21 13:56:17 n26 kernel:        
Apr 21 13:56:17 n26 kernel: 
Apr 21 13:56:17 n26 kernel: Code: 41 8b 45 48 ff c8 7e 63 48 c7 83 e8 01 00 00
00 00 00 00 65 
Apr 21 13:56:17 n26 kernel: RIP <ffffffff80136026>{mm_release+86} RSP
<ffff810015ef1a78>
Apr 21 13:56:17 n26 kernel: CR2: 0000000000000048
Apr 21 13:56:17 n26 kernel:  <1>Unable to handle kernel NULL pointer dereference
at 0000000000000048 RIP: 
[...]
Apr 21 14:24:07 n26 kernel: Oops: 0000 [3482] SMP 
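
The "Oops: 0000 [N]" lines above carry two numbers: the x86 page-fault error code and a running count of oopses since boot. The following is a minimal sketch, not a kernel tool, of how those lines could be read; the helper names are made up for illustration, and only the low three bits of the error code are interpreted.

```python
import re

# Low three bits of the x86 page-fault error code printed after "Oops:".
# Each entry is (bit mask, meaning if set, meaning if clear).
FLAGS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
]

# Matches e.g. "Oops: 0000 [3482] SMP" anywhere in a syslog line.
OOPS = re.compile(r"Oops: ([0-9a-f]{4}) \[(\d+)\]")


def describe_error_code(code):
    """Return one human-readable phrase per interpreted bit."""
    return [if_set if code & bit else if_clear for bit, if_set, if_clear in FLAGS]


def parse_oops(line):
    """Return the error code, oops counter and meaning, or None if no match."""
    match = OOPS.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return {
        "code": code,
        "count": int(match.group(2)),
        "meaning": describe_error_code(code),
    }


if __name__ == "__main__":
    print(parse_oops("Apr 21 14:24:07 n26 kernel: Oops: 0000 [3482] SMP "))
```

Read this way, error code 0000 is a kernel-mode read of a page that is not present, which matches the "Unable to handle kernel paging request" and "NULL pointer dereference" messages, and the counter shows 3482 oopses logged by 14:24:07.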
 

Version-Release number of selected component (if applicable):
kernel-smp-2.6.11-1.14_FC3

How reproducible:
Difficult; it looks like the machine needs to be under CPU and NFS load. It's
100% reproducible if I start rebuilding ATrpms over NFS (using one processor only).

Steps to Reproduce:
1. Install the kernel
2. Stress the system over NFS?
  
Actual results:


Expected results:


Additional info:
The issue seems to be known on lkml, including to the Fedora kernel maintainers
there, although it was hoped that it had been fixed in the Fedora kernels. Filing
it here so we have a reference point.

The system runs w/o X, and w/o any extra kernel modules, tainting etc.

Let me know what other input I can provide, or whether I should try another kernel.

Comment 1 Dave Jones 2005-04-25 01:19:10 UTC
Argh. This one is a mystery, and I've spent quite a few days chasing it.
The good news is that it is fixed in 2.6.12, but a rebase to that for FC3 is
some way off, so it'd be good to get to the bottom of this before then.


Comment 2 Paul Jakma 2005-05-25 22:41:53 UTC
I see it too:

May 25 23:31:40 sheen kernel: mm/memory.c:97: bad pmd ffff81004ab41750(000000000000000f).
May 25 23:31:40 sheen kernel: mm/memory.c:97: bad pmd ffff81004ab41758(00007ffffffff773).
May 25 23:31:40 sheen kernel: mm/memory.c:97: bad pmd ffff81004ab41770(365f363878000000).
May 25 23:31:40 sheen kernel: mm/memory.c:97: bad pmd ffff81004ab41778(0000000000000034).
<etc>
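
Each "bad pmd" line reports the address of a page-table entry and, in parentheses, the 64-bit value found there. One way to inspect those values is to view them as bytes in memory order (little-endian on x86_64). The sketch below is an illustration only; the function names are invented here and nothing in it comes from the kernel or from this thread.

```python
import re

# Matches "bad pmd <entry address>(<entry value>)". The \s+ tolerates the
# line wrap that may sit between "bad pmd" and the address in pasted logs.
BAD_PMD = re.compile(r"bad pmd\s+([0-9a-f]{16})\(([0-9a-f]{16})\)")


def decode_value(hex_value):
    """Show the 8 bytes of a 64-bit value in little-endian (memory) order.

    Printable ASCII bytes are shown as characters, everything else as '.'.
    """
    raw = int(hex_value, 16).to_bytes(8, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


def decode_log(text):
    """Yield (entry_address, value, ascii_view) for each bad pmd report."""
    for addr, value in BAD_PMD.findall(text):
        yield addr, value, decode_value(value)


sample = """\
May 25 23:31:40 sheen kernel: mm/memory.c:97: bad pmd
ffff81004ab41770(365f363878000000).
May 25 23:31:40 sheen kernel: mm/memory.c:97: bad pmd
ffff81004ab41778(0000000000000034).
"""

if __name__ == "__main__":
    for addr, value, view in decode_log(sample):
        print(addr, value, repr(view))
```

Viewed like this, the two adjacent entries above read "...x86_6" followed by "4.......", and the value 34365f3638780000 from the original report reads "..x86_64". That may hint at ordinary string data sitting where page-table entries were expected, but this is only an observation from decoding the bytes; nobody in the thread confirms it.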

Also got:

May 24 23:37:41 sheen kernel: swap_free: Bad swap offset entry 5f363878000000
May 24 23:37:41 sheen kernel: swap_free: Bad swap file entry d800000000000034

Finally, though this probably isn't related (I enabled some BIOS MCE reporting
option after a reboot), I now get lots of:

May 25 02:46:50 sheen kernel: Machine check events logged

There doesn't seem to be anything logged about what exactly these events are.

No oops though. Tyan S2885 machine with dual 2.2GHz Opteron CPUs, DDR-333, node
interleave disabled in BIOS (so kernel sees it as a NUMA machine).



Comment 3 Axel Thimm 2005-05-27 16:44:56 UTC
With the latest errata kernel, 2.6.11-1.27_FC3smp, the logs have changed.
Perhaps that gives some information on what's happening?

May 27 18:31:32 n26 kernel: check-files:11239: mm/memory.c:98: bad pmd ffff81003068c7d8(00000035f8800a88).
May 27 18:31:32 n26 kernel: check-files:11239: mm/memory.c:98: bad pmd ffff81003068c7e0(0000000000000003).
[...]
May 27 18:33:49 n26 kernel: check-files:16154: mm/memory.c:98: bad pmd ffff810028ea47d8(00000035f8800a88).
May 27 18:33:49 n26 kernel: check-files:16154: <7>Losing some ticks... checking if CPU frequency changed.
May 27 18:33:49 n26 kernel: mm/memory.c:98: bad pmd ffff810028ea47e0(0000000000000003).
May 27 18:33:49 n26 kernel: check-files:16154: mm/memory.c:98: bad pmd ffff810028ea47e8(00007ffffffffa25).

check-files is a script in /usr/lib/rpm/check-files used by rpmbuild. Its
contents look quite harmless (see below). Note that when the check-files script
is run there is no NFS activity (%{buildroot} is on a local file system).

#!/bin/sh
#
# Gets file list on standard input and RPM_BUILD_ROOT as first parameter
# and searches for omitted files (not counting directories).
# Returns it's output on standard output.
#
# filon.pl

RPM_BUILD_ROOT=$1

if [ ! -d "$RPM_BUILD_ROOT" ] ; then
        cat > /dev/null
        exit 1
fi

[ "$TMPDIR" ] || TMPDIR=/tmp
FILES_DISK=`mktemp $TMPDIR/rpmXXXXXX`
FILES_RPM=`mktemp $TMPDIR/rpmXXXXXX`

find $RPM_BUILD_ROOT -type f | LC_ALL=C sort > $FILES_DISK
LC_ALL=C sort > $FILES_RPM

for f in `diff -d "$FILES_DISK" "$FILES_RPM" | grep "^< " | cut -c3-`; do
        echo $f | sed -e "s#^$RPM_BUILD_ROOT#   #g"
done

rm -f $FILES_DISK
rm -f $FILES_RPM
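
The 2.6.11-1.27_FC3smp log lines quoted at the top of this comment prefix each report with the process name and pid. As a rough sketch of how a longer log could be summarised per process, assuming the "name:pid:" prefix format seen in those lines (the function name is made up for illustration):

```python
import re
from collections import Counter

# Matches reports tagged with "name:pid:" as printed by the .27 kernel, e.g.
# "kernel: check-files:11239: mm/memory.c:98: bad pmd". Untagged or
# interleaved lines are deliberately not counted.
TAGGED = re.compile(r"kernel: (\S+):(\d+): mm/memory\.c:\d+: bad pmd")


def count_by_process(text):
    """Count tagged bad pmd reports per (process name, pid)."""
    return Counter((name, int(pid)) for name, pid in TAGGED.findall(text))


sample = """\
May 27 18:31:32 n26 kernel: check-files:11239: mm/memory.c:98: bad pmd
ffff81003068c7d8(00000035f8800a88).
May 27 18:31:32 n26 kernel: check-files:11239: mm/memory.c:98: bad pmd
ffff81003068c7e0(0000000000000003).
May 27 18:33:49 n26 kernel: check-files:16154: mm/memory.c:98: bad pmd
ffff810028ea47d8(00000035f8800a88).
May 27 18:33:49 n26 kernel: mm/memory.c:98: bad pmd
ffff810028ea47e0(0000000000000003).
"""

if __name__ == "__main__":
    for (name, pid), hits in count_by_process(sample).items():
        print(name, pid, hits)
```

The untagged line in the sample (its prefix was lost to an interleaved message) is skipped rather than guessed at, so the counts are a lower bound.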



Comment 4 Damian Menscher 2005-05-29 17:34:35 UTC
I've now seen this on two different machines:
  dual AMD Opteron(tm) Processor 246 with 4GB ram
  dual AMD Opteron(tm) Processor 244 with 8GB ram

Something that might be of interest is that it seems to appear along with the error:

Usage: ld.so [OPTION]... EXECUTABLE-FILE [ARGS-FOR-PROGRAM...]

On one machine, I typed "reboot" yesterday, and it just gave that error (no
reboot).  Typing reboot again a few hours later worked as expected.  Another
machine gave that error four times when running the weekly makewhatis cronjob
last night.

We haven't had any kernel crashes yet, but having commands randomly fail is bad
enough that I'm willing to try experimental kernels on one of our machines.  If
there are debugging kernels you want me to try, just let me know.  (I see
there's a -29 build out, but someone already reproduced the error there.)  Also
tell me if there's other info I can provide.

Comment 5 Sam Varshavchik 2005-05-29 17:41:23 UTC
Has anyone tried rolling back to the most recent 2.6.10 kernel errata for FC2?

Is there any reason why 2.6.10-1.771_FC2smp won't work on FC3?



Comment 6 Mike Iglesias 2005-05-31 22:27:31 UTC
I'm seeing this with kernel 2.6.11-1.27_FC3smp.  I get the 

Usage: ld.so [OPTION]... EXECUTABLE-FILE [ARGS-FOR-PROGRAM...]

messages when trying to run configure for some software.  Rebooting fixes it
sometimes, but most of the time the reboot fails when fsck is run on the /
partition.  This usually requires a power cycle to clear, but the bad pmd errors
come back fairly quickly.  Having to go to the system to clear this up is a pain
for systems on the other side of campus.

The systems use Tyan 2882 motherboards with dual Opterons.



Comment 7 Pete Stieber 2005-06-02 16:43:07 UTC
I was having this problem too with a Tyan S2885, but a patched kernel (x86_64 
version of 2.6.11-1.31_FC3smp) provided by Dave Jones 
(http://people.redhat.com/davej/kernels/Fedora/FC3/) seems to have fixed the 
problem. You may want to try it.

Comment 8 Mike Iglesias 2005-06-02 21:23:20 UTC
I read about that kernel on the fedora-list and I have it running on my systems
as well.  So far so good...


Comment 9 Dave Jones 2005-06-04 05:03:23 UTC
*** Bug 159560 has been marked as a duplicate of this bug. ***

Comment 10 Dave Jones 2005-06-04 05:12:48 UTC
There's still 1-2 reports of this bug happening with the .31 kernel, so it seems
we're really not making progress at nailing it down in the .11.x kernel.

I've begun work on a 2.6.12rc backport to FC3, which is at
http://people.redhat.com/davej/kernels/test/
It's had no testing at all just yet, so be very careful with it, and if anyone
is brave enough to try it, I'd be *very* interested to hear from you if this bug
reoccurs with it.

Thanks.




Comment 11 Axel Thimm 2005-06-04 06:49:47 UTC
I tried to be brave, but there are some selinux dependencies left:

# rpm -ihv kernel-smp-2.6.11-1.1369_FC3.x86_64.rpm 
error: Failed dependencies:
        selinux-policy-targeted < 1.23.16-1 conflicts with
kernel-smp-2.6.11-1.1369_FC3.x86_64

Should we try it with --nodeps anyway, or is it doomed to break due to selinux?


Comment 12 Dave Jones 2005-06-04 22:04:46 UTC
I think you'll get loads of avc errors if you --nodeps. I'm not sure if you can
just take the FC4 policy or not. I'll check with Dan on Monday, and get
something worked out.


Comment 13 Philippe Rigault 2005-06-05 13:46:56 UTC
>How reproducible: 
>difficult, it looks like the machine needs to be under CPU and NFS load. It's 
>100% reproducible if I start rebuilding ATrpms over NFS (using one processor 
>only). 
 
FWIW, NFS is not in the picture in my case 
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=159560 
I am using RAID5 partitions on a 3ware Escalade 8506-4. 
It happened during compilation (koffice, with gcc-3.4.3-22) using one CPU. 

Comment 14 Mike Iglesias 2005-06-09 07:02:37 UTC
Any more progress on this?  I'm using the 2.6.11-1.33 kernel now, and while
trying to put the web100 changes into the kernel I get the usage message for
"ld.so" (like in comment #4) occasionally during the make.  Restarting the make
gets it further, then it happens again.  I've had to restart the make 5 times so
far.  I don't know if this is related to the "bad pmd" issue; I'm not getting
the messages in /var/log/messages any more.

Comment 15 Dave Jones 2005-07-15 17:37:34 UTC
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem.   Please update to this new kernel, and
report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.

Thank you.
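
For anyone checking several machines, a small helper can tell whether a kernel release string is at least the update named above (2.6.12-1.1372_FC3). This is a simplified sketch: it assumes release strings shaped like the ones quoted in this bug (as printed by `uname -r`), compares only the numeric version and build fields rather than using full RPM version comparison, and ignores the Fedora release number and the "smp" suffix. The function names are made up for illustration.

```python
import re

# Matches Fedora Core release strings such as "2.6.11-1.14_FC3smp":
# kernel version, build number, Fedora Core release, optional flavour suffix.
RELEASE = re.compile(r"^(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)_FC(\d+)(\w*)$")


def parse_release(release):
    """Split a release string into comparable (version, build) int tuples."""
    match = RELEASE.match(release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    version = tuple(int(part) for part in match.group(1).split("."))
    build = tuple(int(part) for part in match.group(2).split("."))
    return version, build


# The update announced in this comment.
FIXED = parse_release("2.6.12-1.1372_FC3")


def has_update(release):
    """True if the release is the announced update or newer."""
    return parse_release(release) >= FIXED


if __name__ == "__main__":
    for rel in ("2.6.11-1.14_FC3smp", "2.6.11-1.27_FC3smp", "2.6.12-1.1373_FC3smp"):
        print(rel, has_update(rel))
```

On a live system the string could come from `platform.release()`; the parser raises `ValueError` for anything that does not look like a Fedora Core release string.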

Comment 16 Dave Jones 2005-08-03 22:56:20 UTC
Can anyone still seeing this problem on the latest kernel please check that they
have the latest BIOS update installed? There is an AMD erratum in some of their
CPUs that can only be worked around with a BIOS update, which hopefully all
vendors should have picked up by now.

As this bug has been quiet for a few weeks, I'm going to close this soon unless
someone reports that they're still seeing it with the update.

Thanks.

Comment 17 Paul Jakma 2005-08-16 15:06:56 UTC
I upgraded to 2.6.12-1.1373_FC3smp, (from 2.6.11-1.35) and didn't get any 'bad
pmd' messages (just under 2 hours of running). I also updated the Tyan 2885 BIOS
to v2.05, and the machine check errors appear to have gone.

Will observe a while longer, but it looks good so far.


Comment 18 Paul Jakma 2005-08-17 06:51:36 UTC
Some machine check errors since yesterday (no information on what these are
about), but otherwise fine. The bad pmd problem appears to be fixed.



Comment 19 Paul Jakma 2005-10-06 16:44:25 UTC
The machine is still unstable. I was getting a lot of "Northbridge Chipkill ECC
error" and "L2 cache ECC error" MCEs. After carefully reading through the AMD
errata, I decided to disable chipkill and ECC scrubbing in the BIOS (at least
one erratum relates to chipkill; it is old, but I've no idea whether Tyan applied
the workaround in the v2.05 BIOS update).

However, with only ECC MCE reporting enabled, I still get MCEs: "Northbridge ECC
error" at a rate of a couple per day, and the machine tends to lock up hard about
every other week. I captured the forced panic via the NMI watchdog using netconsole,
and it seems it might actually be an NMI handler bug (or a bug in code interrupted
by the NMI handler) locking my machine up rather than an actual hardware lockup, as
I get the following message:

>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43 

See the attached file for details.



Comment 20 Paul Jakma 2005-10-06 16:46:36 UTC
Created attachment 119677 [details]
NMI, sleeping function called from invalid context

Comment 21 Dave Jones 2005-10-06 19:53:16 UTC
The machine check problem is likely unrelated to the issue first reported in
this bug. Have you tried running memtest86 on that box?


Comment 22 Paul Jakma 2005-10-06 22:48:50 UTC
Not recently, no. Memtest86 isn't particularly good at finding RAM problems, and
I can't afford to be without this machine for several days. :(

However, if it's due to RAM that'd be weird as it affects a bunch of about 16
different reported addresses in 2 distinct sets of ranges (is there a way to
figure out what ADDR reported by mcelog belongs to what DIMM? Eg, what ranges of
physical addresses are assigned to what banks on which CPUs? What does the 'CPU
X Y' line in mcelog mean? CPU number then bank?).

And the RAM is over-spec'd, dmidecode thinks it's "400MHz" (DDR-400?) however
for whatever reason (CPU model/speed) the BIOS sets them up for 166MHz (DDR-333?).

Anyway, it does /seem/ like bad RAM or CPU, but I just keep wondering: "Am I
hitting another Opteron errata"? :(

(Pair of model 5 stepping 8 Opteron, C0 revision btw).


Comment 23 Dave Jones 2005-11-08 04:58:02 UTC
The original bug was fixed by the errata workaround.
If you see any other problems, please file a separate bug.

Thanks.