
Bug 651869

Summary: probe-remove loop of i7core_edac module causes oops
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.5
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: rc
Reporter: Jan Tluka <jtluka>
Assignee: Mauro Carvalho Chehab <mchehab>
QA Contact: Jan Tluka <jtluka>
CC: lwang
Doc Type: Bug Fix
Last Closed: 2011-01-13 22:00:25 UTC

Description Jan Tluka 2010-11-10 14:25:32 UTC
Description of problem:

This is a follow-up to https://bugzilla.redhat.com/show_bug.cgi?id=468877#c118

While running a probe/remove loop of the i7core_edac kernel module, I hit the following oops.

Unable to handle kernel paging request at 000004500000010f RIP: 
 [<ffffffff80161e10>] pci_get_subsys+0x7c/0xeb
PGD 0 
Oops: 0000 [1] SMP 
last sysfs file: /devices/pci0000:00/0000:00:01.0/0000:01:00.1/irq
CPU 19 
Modules linked in: i7core_edac iptable_filter ip_tables x_tables autofs4 hidp
rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand acpi_cpufreq freq_table
mperf ipv6 xfrm_nalgo crypto_api loop dm_multipath scsi_dh video backlight sbs
power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug
ac parport_pc lp parport joydev sr_mod tpm_tis cdrom igb tpm i2c_i801 sg 8021q
tpm_bios i2c_core shpchp serio_raw pcspkr dca edac_mc dm_raid45 dm_message
dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci
libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 19573, comm: modprobe Not tainted 2.6.18-229.el5 #1
RIP: 0010:[<ffffffff80161e10>]  [<ffffffff80161e10>] pci_get_subsys+0x7c/0xeb
RSP: 0018:ffff8101240c7ca8  EFLAGS: 00010217
RAX: 000000000000303d RBX: 00000000ffffffff RCX: 00000000c0000100
RDX: 00000450000000cb RSI: ffff81013af917a0 RDI: 00000450000000cb
RBP: ffff81013b084800 R08: ffff8101240c6000 R09: 000000000000002c
R10: ffff81014709d490 R11: 000000000000027f R12: 00000000ffffffff
R13: 0000000000002da8 R14: 0000000000008086 R15: 000000000ad1e370
FS:  00002b60c1f0e6e0(0000) GS:ffff81033fd7a1c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000004500000010f CR3: 000000033f6bb000 CR4: 00000000000006e0
Process modprobe (pid: 19573, threadinfo ffff8101240c6000, task
ffff81013af917a0)
Stack:  00000000000000ff ffffffff88199f80 ffffffff88199f80 00000000000000ff
 ffff8101240c7d90 ffffffff88197906 0000000800000011 0000000000000008
 ffffffff8819a040 ffffffff88199f80 00000000000000ff ffffffff88199f00
Call Trace:
 [<ffffffff88197906>] :i7core_edac:i7core_get_onedevice+0x2e/0x23a
 [<ffffffff88197bb4>] :i7core_edac:i7core_probe+0xa2/0x856
 [<ffffffff8008c850>] __wake_up_common+0x3e/0x68
 [<ffffffff8002e281>] __wake_up+0x38/0x4f
 [<ffffffff801614e3>] pci_device_probe+0x104/0x184
 [<ffffffff801cb823>] driver_probe_device+0x52/0xaa
 [<ffffffff801cb952>] __driver_attach+0x65/0xb6
 [<ffffffff801cb8ed>] __driver_attach+0x0/0xb6
 [<ffffffff801cb12a>] bus_for_each_dev+0x43/0x6e
 [<ffffffff801cad66>] bus_add_driver+0x76/0x110
 [<ffffffff801617ff>] __pci_register_driver+0x51/0xa6
 [<ffffffff881b0064>] :i7core_edac:i7core_init+0x64/0x85
 [<ffffffff800a8d1e>] sys_init_module+0xaf/0x1f2
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0


Code: 0f b7 42 44 44 39 f0 75 2b 41 83 fd ff 74 09 0f b7 47 46 44 
RIP  [<ffffffff80161e10>] pci_get_subsys+0x7c/0xeb
 RSP <ffff8101240c7ca8>
CR2: 000004500000010f
 <0>Kernel panic - not syncing: Fatal exception

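The faulting address can be cross-checked against the register dump. Assuming the first bytes of the Code: line (0f b7 42 44) decode to movzwl 0x44(%rdx),%eax, which is a reading of the dump and not something stated in this report, CR2 should equal RDX plus 0x44. A minimal shell sketch of that arithmetic:

```shell
# Register values copied from the oops above.
rdx=0x00000450000000cb   # RDX at the time of the fault
cr2=0x000004500000010f   # CR2, the faulting address

# Assumption: the faulting instruction reads a 16-bit field at offset 0x44
# from the pointer in RDX (decoded by hand from "0f b7 42 44").
offset=0x44

computed=$(printf '0x%016x' $(( $rdx + $offset )))
expected=$(printf '0x%016x' $(( $cr2 )))
echo "computed=$computed cr2=$expected"
```

If the two values match, the fault is consistent with pci_get_subsys dereferencing an invalid device pointer held in RDX. This is an interpretation of the dump, not a confirmed root cause.
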
Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. while true; do rmmod i7core_edac; modprobe i7core_edac; done
  
Actual results:
Oops

Expected results:
No oops

Additional info:
There's bug 603124 tracking this issue in RHEL6.

Comment 2 Mauro Carvalho Chehab 2010-11-23 17:50:06 UTC
Patches backported. Test kernels are available at:
    http://people.redhat.com/~mchehab/.rhel5_i7core_edac/

Comment 3 RHEL Program Management 2010-11-23 18:09:43 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 33 Jarod Wilson 2010-12-02 15:15:48 UTC
The patches are included in kernel-2.6.18-235.el5.
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 35 Jan Tluka 2010-12-07 14:16:07 UTC
Verified on -235.el5 kernel. The kernel survived 1000 loops of rmmod/modprobe.
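
A bounded version of the rmmod/modprobe loop can be sketched as below. This is not the script used for the verification; the function name probe_remove_loop and the overridable RMMOD and MODPROBE variables are assumptions made for illustration.

```shell
# Hypothetical helper: the commands are overridable so the loop logic
# can be exercised without touching real kernel modules.
RMMOD=${RMMOD:-rmmod}
MODPROBE=${MODPROBE:-modprobe}

probe_remove_loop() {
    module=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # Stop at the first failure and report the iteration number.
        "$RMMOD" "$module" || { echo "rmmod failed at iteration $i" >&2; return 1; }
        "$MODPROBE" "$module" || { echo "modprobe failed at iteration $i" >&2; return 1; }
        i=$((i + 1))
    done
    echo "completed $count loops for $module"
}
```

Run as root with the module already loaded, for example: probe_remove_loop i7core_edac 1000. On an affected kernel the machine is expected to panic before the final message is printed.
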

Comment 37 errata-xmlrpc 2011-01-13 22:00:25 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html