Bug 648571

Summary: WARNING: at lib/list_debug.c:30 __list_add+0x68/0x81()
Product: [Fedora] Fedora Reporter: Mihai Harpau <mishu>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: 14 CC: dougsland, elfyn.mcbratney, eloranta, gansalmon, igeorgex, itamar, jburke, jonathan, kernel-maint, madhu.chinakonda, markku.kolkka, nsoranzo, orion
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-10-11 17:48:50 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Mihai Harpau 2010-11-01 16:45:32 UTC
Description of problem:

WARNING: at lib/list_debug.c:30 __list_add+0x68/0x81()
Hardware name: Latitude E5400                  
list_add corruption. prev->next should be next (ffffffffa01287f8), but was ffffffffa0137488. (prev=ffffffff81a7ae20).
Modules linked in: microcode(+) sdhci_pci sdhci firewire_ohci mmc_core firewire_core crc_itu_t yenta_socket i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
Pid: 669, comm: modprobe Tainted: G        W   2.6.35.6-48.fc14.x86_64 #1
Call Trace:
 [<ffffffff8104d7c1>] warn_slowpath_common+0x85/0x9d
 [<ffffffff8104d87c>] warn_slowpath_fmt+0x46/0x48
 [<ffffffff812263a3>] __list_add+0x68/0x81
 [<ffffffff81217dcd>] module_bug_finalize+0xb9/0xca
 [<ffffffff81028856>] module_finalize+0x156/0x165
 [<ffffffff8107bfff>] load_module+0x1170/0x1b74
 [<ffffffff81079981>] ? setup_modinfo_srcversion+0x0/0x29
 [<ffffffff811e437a>] ? selinux_capable+0x37/0x40
 [<ffffffff8107ca53>] sys_init_module+0x50/0x1e4
 [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b

Version-Release number of selected component (if applicable):
F14 RC1 up-to-date
kernel-2.6.35.6-48.fc14.x86_64 

Comment 1 Mihai Harpau 2010-11-01 16:47:36 UTC
I see also this:

WARNING: at lib/list_debug.c:26 __list_add+0x3f/0x81()
Hardware name: Latitude E5400                  
list_add corruption. next->prev should be prev (ffffffff81a7ae20), but was ffffffffa0137488. (next=ffffffffa01287f8).
Modules linked in: sdhci_pci sdhci firewire_ohci mmc_core firewire_core crc_itu_t yenta_socket i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
Pid: 669, comm: modprobe Not tainted 2.6.35.6-48.fc14.x86_64 #1
Call Trace:
 [<ffffffff8104d7c1>] warn_slowpath_common+0x85/0x9d
 [<ffffffff8104d87c>] warn_slowpath_fmt+0x46/0x48
 [<ffffffff8122637a>] __list_add+0x3f/0x81
 [<ffffffff81217dcd>] module_bug_finalize+0xb9/0xca
 [<ffffffff81028856>] module_finalize+0x156/0x165
 [<ffffffff8107bfff>] load_module+0x1170/0x1b74
 [<ffffffff81079981>] ? setup_modinfo_srcversion+0x0/0x29
 [<ffffffff811e437a>] ? selinux_capable+0x37/0x40
 [<ffffffff8107ca53>] sys_init_module+0x50/0x1e4
 [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
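
For readers unfamiliar with these messages: the two warnings above (list_debug.c:26 and :30) come from sanity checks that run before a node is linked between two neighbours in a doubly linked list. The following is a minimal userspace sketch of that kind of check, not the kernel source. The names init_list_head, checked_list_add, checked_list_add_head and demo_stale_neighbour are hypothetical, and the return codes stand in for the kernel's WARNING output.

```c
#include <assert.h>
#include <stdio.h>

/* Minimal userspace stand-in for a kernel-style list node (assumption). */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Hypothetical checked insert of new_node between prev and next.
 * Returns 0 on success, 1 if next->prev does not point back at prev,
 * 2 if prev->next does not point at next. On a failed check this
 * sketch refuses the insert and only reports the mismatch.
 */
static int checked_list_add(struct list_head *new_node,
                            struct list_head *prev,
                            struct list_head *next)
{
    if (next->prev != prev) {
        fprintf(stderr,
                "list_add corruption. next->prev should be prev (%p), "
                "but was %p. (next=%p).\n",
                (void *)prev, (void *)next->prev, (void *)next);
        return 1;
    }
    if (prev->next != next) {
        fprintf(stderr,
                "list_add corruption. prev->next should be next (%p), "
                "but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
        return 2;
    }
    /* Both neighbours agree, so the new node can be linked in. */
    next->prev = new_node;
    new_node->next = next;
    new_node->prev = prev;
    prev->next = new_node;
    return 0;
}

/* Insert new_node directly after head. */
static int checked_list_add_head(struct list_head *new_node,
                                 struct list_head *head)
{
    return checked_list_add(new_node, head, head->next);
}

/*
 * Simulates one way a mismatch can arise: an entry already on the list
 * has its prev pointer overwritten, so the next insert at the head
 * sees a neighbour that no longer points back at the head.
 */
static int demo_stale_neighbour(void)
{
    struct list_head head, a, b, stale;

    init_list_head(&head);
    assert(checked_list_add_head(&a, &head) == 0);
    a.prev = &stale;                         /* simulated corruption */
    return checked_list_add_head(&b, &head); /* trips the first check */
}
```

In this sketch demo_stale_neighbour() trips the first check, which has the same shape as the "next->prev should be prev" message above: the list head (prev) is intact, but the entry after it points back somewhere else.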

Comment 2 Orion Poplawski 2010-11-15 18:48:49 UTC
Same here:

[    9.420308] WARNING: at lib/list_debug.c:26 __list_add+0x3f/0x81()
[    9.420310] Hardware name: Latitude E5400                  
[    9.420313] list_add corruption. next->prev should be prev (ffffffff81a7ae20), but was ffffffffa01d5488. (next=ffffffffa01c31a8).
[    9.420315] Modules linked in: snd iTCO_wdt iTCO_vendor_support tg3 soundcore i2c_i801 snd_page_alloc dell_wmi dcdbas wmi joydev usb_storage sdhci_pci sdhci firewire_ohci mmc_core firewire_core yenta_socket crc_itu_t i915 drm_kms_helper drm i2c_algo_bit i2c_core video output
[    9.420338] Pid: 637, comm: modprobe Not tainted 2.6.35.6-48.fc14.x86_64 #1
[    9.420340] Call Trace:
[    9.420345]  [<ffffffff8104d7c1>] warn_slowpath_common+0x85/0x9d
[    9.420350]  [<ffffffff8104d87c>] warn_slowpath_fmt+0x46/0x48
[    9.420354]  [<ffffffff8122637a>] __list_add+0x3f/0x81
[    9.420359]  [<ffffffff81217dcd>] module_bug_finalize+0xb9/0xca
[    9.420363]  [<ffffffff81028856>] module_finalize+0x156/0x165
[    9.420371]  [<ffffffff8107bfff>] load_module+0x1170/0x1b74
[    9.420374]  [<ffffffff81079981>] ? setup_modinfo_srcversion+0x0/0x29
[    9.420380]  [<ffffffff811e437a>] ? selinux_capable+0x37/0x40
[    9.420383]  [<ffffffff8107ca53>] sys_init_module+0x50/0x1e4
[    9.420387]  [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b

Comment 3 Jeff Burke 2011-02-08 19:28:59 UTC
Can you please give a little bit of detail about what the system was doing when this message printed? Was it just booting the kernel, or were you adding or removing a module?

Comment 4 Mihai Harpau 2011-03-05 21:13:04 UTC
Hi, sorry for the late response. The messages are printed at boot time only.

Comment 5 elfyn.mcbratney 2011-03-06 17:38:47 UTC
I just hit this bug about half an hour ago as well; similar stack trace (can post if necessary), same warning and offsets. It happened ~10 seconds into boot without me doing anything to trigger it.

After a little digging through my LKML archives I found a note from Thomas Gleixner:

  Subject: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
  Message-ID: <alpine.LFD.2.00.1010032141410.14550>

that seems related. There's a patch at the end of the thread. I'd try to find a link to the post on lkml.org, but I'm having difficulty reaching that site. With any luck this small nugget will be of some help...

Comment 6 JM 2011-04-12 15:14:07 UTC
Same here with kernel

kernel-2.6.35.12-88.fc14.x86_64

Comment 8 Josh Boyer 2011-08-30 00:18:21 UTC
*** Bug 651528 has been marked as a duplicate of this bug. ***

Comment 9 Josh Boyer 2011-08-30 17:27:15 UTC
*** Bug 693085 has been marked as a duplicate of this bug. ***

Comment 10 Josh Boyer 2011-09-12 13:35:07 UTC
*** Bug 737345 has been marked as a duplicate of this bug. ***

Comment 11 Josh Boyer 2011-09-13 20:49:59 UTC
I've added a backport of the commit highlighted in comment #7.  Since this issue is intermittent, it's somewhat hard to say if the issue is fully resolved but the kernel seems to fare well in my local testing.

The next F14 kernel build should have this included.

Comment 12 Dave Jones 2011-10-11 17:48:50 UTC
That build is now out. This issue should be fixed.