Bug 625173

Summary: [RHEL6][Kernel] FATAL: Error inserting ipv6, Cannot allocate memory, causes panic
Product: Red Hat Enterprise Linux 6
Reporter: Jeff Burke <jburke>
Component: kernel
Assignee: Neil Horman <nhorman>
Status: CLOSED ERRATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Docs Contact:
Priority: low
Version: 6.0
CC: arozansk, jolsa, lwang, nhorman, pbunyan, tgraf
Target Milestone: rc
Keywords: RHELNAK
Target Release: ---   
Hardware: All   
OS: Linux   
URL: http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=16198367
Fixed In Version: kernel-2.6.32-85.el6
Doc Type: Bug Fix
Last Closed: 2011-05-23 20:49:25 UTC
Attachments:
patch to balance addrconf/addrlabel pernet registration (flags: none)

Description Jeff Burke 2010-08-18 18:51:25 UTC
Description of problem:
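When booting the i386 debug kernel, loading the ipv6 module fails because memory cannot be allocated:
FATAL: Error inserting ipv6 (/lib/modules/2.6.32-66.el6.i686.debug/kernel/net/ipv6/ipv6.ko): Cannot allocate memory
The kernel then panics while the module is still in the middle of loading (see the trace under "Actual results").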

Version-Release number of selected component (if applicable):
2.6.32-66.el6.i686.debug

How reproducible:
100% when using the debug kernel on a system with only 1 GB of RAM
  
Actual results:

FATAL: Error inserting ipv6 (/lib/modules/2.6.32-66.el6.i686.debug/kernel/net/ipv6/ipv6.ko): Cannot allocate memory
Bringing up loopback interface:  [  OK  ]
Bringing up interface eth0:  
Determining IP information for eth0...[-- MARK -- Wed Aug 18 03:45:00 2010]
 done.
[  OK  ]
BUG: unable to handle kernel paging request at fba9fed8
IP: [<c060d2f9>] list_del+0x9/0x90
*pdpt = 0000000000e0e001 *pde = 0000000030505067 *pte = 0000000000000000 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.5/0000:01:00.0/net/eth0/ifindex
Modules linked in: ipv6(+) dm_mirror dm_region_hash dm_log wmi serio_raw sg iTCO_wdt iTCO_vendor_support tg3 snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i7core_edac edac_core ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t ahci nouveau ttm drm_kms_helper drm i2c_algo_bit video output i2c_core dm_mod [last unloaded: microcode]

Pid: 1725, comm: modprobe Tainted: G        W  ----------------  (2.6.32-66.el6.i686.debug #1) HP Z600 Workstation
EIP: 0060:[<c060d2f9>] EFLAGS: 00010296 CPU: 6
EIP is at list_del+0x9/0x90
EAX: f82e6c88 EBX: f82e6c88 ECX: 00000006 EDX: fba9fed8
ESI: f82e6c88 EDI: 00000000 EBP: f06d1f30 ESP: f06d1f18
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process modprobe (pid: 1725, ti=f06d0000 task=f06886e0 task.ti=f06d0000)
Stack:
 22222222 22222222 00000000 c0a611a0 00000000 f06d1f40 f06d1f40 c078d9dc
<0> f82e6c88 f82e9580 f06d1f4c c078db29 fffffff4 f06d1f5c f7e5f24b f06d1f78
<0> fffffffc f06d1f88 c040303d f82e9580 c0a2b844 fffffffc f82e9580 00000000
Call Trace:
 [<c078d9dc>] ? unregister_pernet_operations+0xc/0x50
 [<c078db29>] ? unregister_pernet_subsys+0x19/0x30
 [<f7e5f24b>] ? inet6_init+0x24b/0x296 [ipv6]
 [<c040303d>] ? do_one_initcall+0x2d/0x1d0
 [<f7e5f000>] ? inet6_init+0x0/0x296 [ipv6]
 [<c049b303>] ? sys_init_module+0xb3/0x220
 [<c048be8c>] ? trace_hardirqs_on_caller+0x12c/0x170
 [<c0409ceb>] ? sysenter_do_call+0x12/0x38
Code: 6d 8c 01 8b 55 8c 89 55 88 e9 8a fe ff ff 8b 5d 80 e9 ab fe ff ff 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83 ec 18 8b 50 04 <8b> 12 39 d0 75 20 8b 10 8b 4a 04 39 c8 75 43 8b 48 04 89 4a 04 
EIP: [<c060d2f9>] list_del+0x9/0x90 SS:ESP 0068:f06d1f18
CR2: 00000000fba9fed8
---[ end trace 233f9f3d53fb82d6 ]---
Kernel panic - not syncing: Fatal exception
Pid: 1725, comm: modprobe Tainted: G      D W  ----------------  2.6.32-66.el6.i686.debug #1
Call Trace:
 [<c0833608>] ? printk+0x18/0x20
 [<c0833540>] panic+0x43/0xf3
 [<c0837b99>] oops_end+0xb9/0xd0
 [<c0430fde>] no_context+0xbe/0x190
 [<c0431140>] __bad_area_nosemaphore+0x90/0x140
 [<c0488b3b>] ? trace_hardirqs_off+0xb/0x10
 [<c0838f30>] ? do_page_fault+0x0/0x4a0
 [<c0431202>] bad_area_nosemaphore+0x12/0x20
 [<c08392e0>] do_page_fault+0x3b0/0x4a0
 [<c048bbb2>] ? mark_held_locks+0x62/0x90
 [<c0838f30>] ? do_page_fault+0x0/0x4a0
 [<c0836f30>] error_code+0x78/0x80
 [<c060d2f9>] ? list_del+0x9/0x90
 [<c078d9dc>] unregister_pernet_operations+0xc/0x50
 [<c078db29>] unregister_pernet_subsys+0x19/0x30
 [<f7e5f24b>] inet6_init+0x24b/0x296 [ipv6]
 [<c040303d>] do_one_initcall+0x2d/0x1d0
 [<f7e5f000>] ? inet6_init+0x0/0x296 [ipv6]
 [<c049b303>] sys_init_module+0xb3/0x220
 [<c048be8c>] ? trace_hardirqs_on_caller+0x12c/0x170
 [<c0409ceb>] sysenter_do_call+0x12/0x38

Expected results:


Additional info:

Comment 2 RHEL Program Management 2010-08-18 19:18:10 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 3 RHEL Program Management 2010-08-18 21:23:01 UTC
Thank you for your bug report. This issue was evaluated for inclusion
in the current release of Red Hat Enterprise Linux. Unfortunately, we
are unable to address this request in the current release. Because we
are in the final stage of Red Hat Enterprise Linux 6 development, only
significant, release-blocking issues involving serious regressions and
data corruption can be considered.

If you believe this issue meets the release blocking criteria as
defined and communicated to you by your Red Hat Support representative,
please ask your representative to file this issue as a blocker for the
current release. Otherwise, ask that it be evaluated for inclusion in
the next minor release of Red Hat Enterprise Linux.

Comment 4 Neil Horman 2010-09-21 17:22:24 UTC
I think this is actually a problem in how we manage the pernet list.  I'm reserving one of the HP Z600 systems to investigate.

Comment 6 Neil Horman 2010-09-22 20:27:36 UTC
Note to self:
This might be solved inadvertently upstream by 72ad937abd0a43b7cf2c557ba1f2ec75e608c516 and some supporting patches.  I'll try a backport in the AM

Comment 7 Neil Horman 2010-09-24 17:43:07 UTC
Created attachment 449477
patch to balance addrconf/addrlabel pernet registration

So, good news and bad news.

The good news is that I fixed the problem with the oops: we have an unbalanced registration that leaves a list in an inconsistent state if the code doesn't load properly.

The bad news is that the kernel still doesn't boot, but we're in a constrained memory position anyway.  There's not much we can do, save for putting the ipv6 module and the kernel as a whole on a diet, but we should probably track that in a separate bz.

This needs to go upstream as well, so I'll send it there and then post internally.
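
For readers unfamiliar with this class of bug, here is a minimal userspace sketch of what "balanced" registration means on an init error path. It is not the attached patch and does not use the kernel API: struct subsys, register_subsys(), unregister_subsys(), init_all() and DEMO_ENOMEM are hypothetical names made up for illustration. The point it tries to show is that each failure label undoes only the registrations that actually succeeded, in reverse order, so nothing is ever deleted from a list it was never added to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ENOMEM 12 /* stand-in for the kernel's ENOMEM (assumption) */

/* Minimal doubly linked list, loosely modelled on the kernel's list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical registry standing in for the pernet list. */
static struct list_node registry = { &registry, &registry };

struct subsys {
    struct list_node node;
    bool fail_init; /* hypothetical knob: simulate an allocation failure */
};

/* Link the subsystem into the registry only if its setup succeeds. */
static int register_subsys(struct subsys *s)
{
    if (s->fail_init)
        return -DEMO_ENOMEM;
    s->node.next = registry.next;
    s->node.prev = &registry;
    registry.next->prev = &s->node;
    registry.next = &s->node;
    return 0;
}

/* Only valid for a subsystem that is currently linked into the registry. */
static void unregister_subsys(struct subsys *s)
{
    s->node.prev->next = s->node.next;
    s->node.next->prev = s->node.prev;
    s->node.next = s->node.prev = NULL;
}

/* Count the entries currently linked into the registry. */
static size_t registered_count(void)
{
    size_t n = 0;
    for (struct list_node *p = registry.next; p != &registry; p = p->next)
        n++;
    return n;
}

/* Balanced init: every label unwinds exactly what was registered before it. */
static int init_all(struct subsys *a, struct subsys *b, struct subsys *c)
{
    int err;

    err = register_subsys(a);
    if (err)
        goto out;
    err = register_subsys(b);
    if (err)
        goto undo_a;
    err = register_subsys(c);
    if (err)
        goto undo_b;
    return 0;

undo_b:
    unregister_subsys(b);
undo_a:
    unregister_subsys(a);
out:
    return err;
}
```

If an error label were instead to call unregister_subsys() on an entry whose registration had failed or never ran, the unlink would dereference stale or NULL pointers. In this sketch that is the analogue of the list_del fault in the trace above.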

Comment 8 Neil Horman 2010-09-27 17:17:09 UTC
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2782687

Test build of backport

Comment 9 RHEL Program Management 2010-11-19 17:40:29 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 10 Aristeu Rozanski 2010-12-13 15:14:25 UTC
Patch(es) available on kernel-2.6.32-89.el6

Comment 13 Mike Gahagan 2011-05-02 18:35:26 UTC
I have not seen this happen in the last few runs of Kernel Tier 1 and other tests, so this can be marked as verified.

Comment 14 errata-xmlrpc 2011-05-23 20:49:25 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html