Bug 1054537 - [abrt] kernel BUG at mm/mlock.c:512!
Summary: [abrt] kernel BUG at mm/mlock.c:512!
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL: https://retrace.fedoraproject.org/faf...
Whiteboard: abrt_hash:5fa9d6a9e8d781f6c9c0047963b...
Depends On:
Blocks:
Reported: 2014-01-17 01:44 UTC by Karl Auerbach
Modified: 2014-06-16 15:19 UTC
CC List: 5 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-06-16 15:19:14 UTC
Type: ---
Embargoed:


Attachments
File: dmesg (63.08 KB, text/plain)
2014-01-17 01:45 UTC, Karl Auerbach
Simple program to cause this kernel issue (2.84 KB, text/plain)
2014-01-17 06:52 UTC, Karl Auerbach

Description Karl Auerbach 2014-01-17 01:44:58 UTC
Description of problem:
This started with the most recent Fedora kernel update, 3.12.7-200.fc19.x86_64.  The code that triggers this had been running fine for years until this latest update.

This code uses RX and TX ring buffers to receive packets and mmaps those rings.  The kernel problem occurs during shutdown when the buffers are unmapped.

I can provide a program that triggers this.

Additional info:
reporter:       libreport-2.1.10
kernel BUG at mm/mlock.c:512!
invalid opcode: 0000 [#1] SMP 
Modules linked in: nfsv3 nfs_acl nfs lockd sunrpc fscache nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack bnep bluetooth rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw snd_hda_codec_hdmi snd_hda_codec_via snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm sp5100_tco snd_page_alloc snd_timer snd soundcore r8169 kvm mii edac_core ppdev k10temp parport_pc serio_raw i2c_piix4 parport asus_atk0110 microcode edac_mce_amd wmi shpchp acpi_cpufreq uinput radeon ata_generic pata_acpi i2c_algo_bit pata_atiixp drm_kms_helper ttm drm i2c_core
CPU: 3 PID: 1774 Comm: a.out Not tainted 3.12.7-200.fc19.x86_64 #1
Hardware name: System manufacturer System Product Name/M4A88T-M, BIOS 2403    12/23/2010
task: ffff8800a70c98c0 ti: ffff880118f2e000 task.ti: ffff880118f2e000
RIP: 0010:[<ffffffff8116fcb5>]  [<ffffffff8116fcb5>] munlock_vma_pages_range+0x2f5/0x310
RSP: 0018:ffff880118f2fe00  EFLAGS: 00010206
RAX: 00000000000001ff RBX: ffff8800b0199170 RCX: 0000000000000037
RDX: 00000007fdcd1248 RSI: ffffea0004620d00 RDI: ffffea0004620d00
RBP: ffff880118f2fed8 R08: 6800000000000000 R09: a800118834000000
R10: 57ffd877d0620d00 R11: 0000000000000206 R12: ffffea0004620d00
R13: 00007fdcd3188000 R14: ffff8800b0199170 R15: 00007fdcd1248000
FS:  00007fdcd3188740(0000) GS:ffff88011fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003e94ef0710 CR3: 000000011870c000 CR4: 00000000000007e0
Stack:
 0000005200000000 00007fdcd3187fff 00007fdcd3188000 00000000100020fb
 000001ffb0199170 00007fdcd3188000 0000000000000000 0000000000000000
 00007fdcd1248000 ffff880118f2fe68 ffffffff8116f9a9 0000000000000000
Call Trace:
 [<ffffffff8116f9a9>] ? __mlock_vma_pages_range+0x89/0xa0
 [<ffffffff81170183>] ? __mm_populate+0x133/0x150
 [<ffffffff81172bef>] do_munmap+0x18f/0x3b0
 [<ffffffff81172e51>] vm_munmap+0x41/0x60
 [<ffffffff81173dd2>] SyS_munmap+0x22/0x30
 [<ffffffff81675ee9>] system_call_fastpath+0x16/0x1b
Code: e2 01 fe ff 84 c0 48 8b 95 28 ff ff ff 0f 85 68 ff ff ff e9 54 ff ff ff 66 0f 1f 44 00 00 e8 18 4b 4f 00 0f 1f 00 e8 0a 4b 4f 00 <0f> 0b 4c 89 e7 e8 21 22 fd ff 90 e9 61 fd ff ff 66 66 2e 0f 1f 
RIP  [<ffffffff8116fcb5>] munlock_vma_pages_range+0x2f5/0x310
 RSP <ffff880118f2fe00>

Comment 1 Karl Auerbach 2014-01-17 01:45:03 UTC
Created attachment 851372
File: dmesg

Comment 2 Karl Auerbach 2014-01-17 06:52:06 UTC
Created attachment 851413
Simple program to cause this kernel issue

This is a C program that must be run with root privilege.  It runs fine on older kernels but causes a kernel problem on the most recent one.

The code creates a raw socket with transmit and receive ring buffers.

The rings are then made visible via mmap().

When munmap() is called things go awry.

The code in this example was quickly lifted from a larger body of code that has been running for several years.

Comment 3 Josh Boyer 2014-01-17 13:19:16 UTC
Could you try this with a kernel from the rawhide nodebug repo and see if it recreates on that?  If so, it would be best to report it directly to upstream.

Comment 4 Karl Auerbach 2014-01-17 19:26:10 UTC
I just tried the kernel from rawhide nodebug repo:

Linux v-f19-x64.cavebear.com 3.13.0-0.rc8.git2.2.fc21.x86_64 #1 SMP Wed Jan 15 16:18:47 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

And it had the same issue:

[ 3364.076609] kernel BUG at mm/mlock.c:512!
[ 3364.076664] invalid opcode: 0000 [#1] SMP
etc etc

Comment 5 Karl Auerbach 2014-01-17 20:11:20 UTC
Under the prior kernel things still go bad, but with a slightly different reported location:

Linux v-f19-x64.cavebear.com 3.12.6-200.fc19.x86_64 #1 SMP Mon Dec 23 16:33:38 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

[  132.747009] kernel BUG at include/linux/page-flags.h:413!
[  132.747085] invalid opcode: 0000 [#1] SMP 
[  132.747152] Modules linked in: nfsv3 nfs_acl nfs lockd sunrpc fscache nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack bnep bluetooth rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw snd_hda_codec_hdmi snd_hda_codec_via snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc kvm snd_timer sp5100_tco r8169 i2c_piix4 microcode snd ppdev parport_pc mii edac_core soundcore wmi parport shpchp edac_mce_amd k10temp serio_raw asus_atk0110 acpi_cpufreq uinput radeon i2c_algo_bit
[  132.748420]  drm_kms_helper ata_generic ttm pata_acpi drm pata_atiixp i2c_core
[  132.748537] CPU: 3 PID: 1793 Comm: a.out Not tainted 3.12.6-200.fc19.x86_64 #1

etc etc

Comment 6 Karl Auerbach 2014-01-17 22:04:42 UTC
I'm starting to search for the kernel version at which this problem started.  So far I know that things were OK with 3.9.5-301.fc19.x86_64 (the release version for F19).

So stuff went awry sometime after 3.9.5-301.fc19.x86_64 and before 3.12.5-200.fc19.x86_64

Is there an archive of the Fedora kernel RPMs so that I can pull the various versions that came out and give 'em a try?  If so could someone give me a pointer?

Comment 7 Josh Boyer 2014-01-18 00:11:08 UTC
http://koji.fedoraproject.org/koji/packageinfo?packageID=8 has all the builds we've done.

Comment 8 Karl Auerbach 2014-01-18 01:09:44 UTC
I've now narrowed it down to a specific Fedora kernel release.

The problem first occurs with:

kernel-3.12.5-200.fc19.x86_64
http://koji.fedoraproject.org/koji/buildinfo?buildID=485557

There is not a problem with its predecessor, 3.11.10-200.fc19.x86_64

By-the-way, I don't think I mentioned:

1.  That things seem to go wrong in a call to munmap()

2. That after the kernel event the machine is often still somewhat active, but it hangs during a restart - the reset button or a power cycle is needed.

I've kinda reached the limit of my expertise (or lack of same) here.

Comment 9 Karl Auerbach 2014-01-30 01:01:53 UTC
I have done some additional testing to isolate when this bug crept in.

I built kernels directly from the sources at kernel.org and used the default .config values.

Kernel 3.11.10 was OK.
Kernel 3.12.1 was not.
There were no intervening kernels.

So this jumped in during the hop from 3.11.10 to 3.12.1

I perused the change log and nothing jumped out at me, but then again, I am way out of my depth here.

Comment 10 Justin M. Forbes 2014-03-10 14:50:53 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.13.5-100.fc19.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.

Comment 11 Karl Auerbach 2014-03-12 01:31:23 UTC
This bug still exists in 3.13.5-100.fc19.

I have heard that a patch has been submitted upstream; I don't know whether this has been accepted and propagated:

https://bugzilla.kernel.org/show_bug.cgi?id=70021

Comment 12 Justin M. Forbes 2014-05-21 19:30:50 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.14.4-100.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.

Comment 13 Josh Boyer 2014-06-16 15:19:14 UTC
Should be fixed with 3.14 (commit 9050d7eba40b3d79551668f54e68fd6f51945ef3)
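
For anyone checking a machine against this report, a small sketch like the one below compares the running kernel's version with the range seen in this thread: first bad at 3.12 (comments 8 and 9), still present in 3.13.5 (comment 11), expected fixed with 3.14 (this comment). The function names are hypothetical, and the range is only what this thread reports; a distribution kernel with the fix backported would be misclassified.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* 1 if major.minor lies in the range this thread reports as affected:
 * from 3.12 up to, but not including, 3.14. */
int in_reported_range(int major, int minor)
{
    return major == 3 && minor >= 12 && minor < 14;
}

/* 1 if the running kernel is in that range, 0 if not, -1 if the
 * release string could not be read or parsed. */
int running_kernel_in_reported_range(void)
{
    struct utsname u;
    int major = 0, minor = 0;

    if (uname(&u) != 0)
        return -1;
    /* u.release looks like "3.12.7-200.fc19.x86_64"; only major.minor is used. */
    if (sscanf(u.release, "%d.%d", &major, &minor) != 2)
        return -1;
    return in_reported_range(major, minor);
}
```

This looks only at the version string, so it is a hint rather than a test for the bug itself.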

