Bug 467714 - Kernel BUG at include/linux/module.h:397
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.7
Hardware: i386 Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Neil Horman
QA Contact: Martin Jenner
Keywords: Regression
Depends On:
Blocks: RHEL4u8_relnotes
Reported: 2008-10-20 09:44 EDT by Marcus Alves Grando
Modified: 2010-10-23 01:18 EDT (History)

Doc Type: Bug Fix
Doc Text:
The ibmphp module is not safe to unload. Previously, the mechanism that prevented the ibmphp module from unloading was insufficient, and eventually triggered a bug halt. With this update, the method to prevent this module from unloading has been improved, preventing the bug halt. However, attempting to unload the module may produce a warning in the message log, indicating that the module is not safe to unload. This warning can be safely ignored.
Last Closed: 2009-05-18 15:22:17 EDT


Attachments
panic picture (376.91 KB, image/jpeg), 2008-10-20 09:44 EDT, Marcus Alves Grando
patch to mark ibmphp as unsafe to remove (690 bytes, patch), 2008-12-10 13:49 EST, Neil Horman


External Trackers
IBM Linux Technology Center: ID 50609 (priority, status, and summary not set; never updated)

Description Marcus Alves Grando 2008-10-20 09:44:46 EDT
Created attachment 320867 [details]
panic picture

I tried to upgrade the kernel on an IBM xSeries to 78.0.5, but it doesn't work: the kernel panics every time. I've attached a picture of the panic.

I suspect something in the ibmphp or ibmasm module.

The last kernel that works fine is: 2.6.9-67.0.7.ELsmp

Loaded modules in 2.6.9-67.0.7.ELsmp:
hangcheck_timer         7897  0 
iptable_filter          6977  0 
ip_tables              22721  1 iptable_filter
dm_mirror              31557  0 
dm_round_robin          7361  1 
dm_multipath           22984  2 dm_round_robin
dm_mod                 67177  29 dm_mirror,dm_multipath
button                 10705  0 
battery                12997  0 
ac                      8901  0 
joydev                 14465  0 
ohci_hcd               24273  0 
ibmasm                 28493  0 
ibmphp                 70573  4294967295 
e1000                 122705  0 
e100                   36677  0 
mii                     9281  1 e100
floppy                 58193  0 
sg                     38369  0 
ext3                  119497  6 
jbd                    59865  1 ext3
raid1                  19777  3 
qla2300               129857  0 
aic7xxx               146425  8 
qla2xxx               171877  34 qla2300
scsi_transport_fc      12353  1 qla2xxx
sd_mod                 20545  25 
scsi_mod              120269  5 sg,aic7xxx,qla2xxx,scsi_transport_fc,sd_mod

# lspci 
00:00.0 Host bridge: IBM Winnipeg PCI-X Host Bridge (rev 03)
00:01.0 VGA compatible controller: S3 Inc. Savage 4 (rev 06)
00:02.0 Bridge: IBM Remote Supervisor Adapter (RSA)
00:03.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
00:04.0 SCSI storage controller: Adaptec AIC-7892P U160/m (rev 02)
00:06.0 Class 0808: IBM: Unknown device 0246
00:0f.0 ISA bridge: Broadcom OSB4 South Bridge (rev 50)
00:0f.1 IDE interface: Broadcom OSB4 IDE Controller
00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 04)
01:00.0 Host bridge: IBM Winnipeg PCI-X Host Bridge (rev 03)
01:01.0 Ethernet controller: Intel Corporation 82544EI Gigabit Ethernet Controller (Copper) (rev 02)
01:03.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
0a:00.0 Host bridge: IBM Winnipeg PCI-X Host Bridge (rev 03)
0a:01.0 Fibre Channel: QLogic Corp. ISP2312-based 2Gb Fibre Channel to PCI-X HBA (rev 02)

Regards
Comment 1 Marcus Alves Grando 2008-10-28 16:59:01 EDT
Guys, any news about this problem? I can't upgrade my server until this is fixed.

Thanks
Comment 2 Marcus Alves Grando 2008-11-21 18:49:47 EST
ping
Comment 3 Jeremy Agee 2008-12-02 11:25:07 EST
Encountered the same error message on an xSeries system. Adding the following to /etc/modprobe.conf allowed the system to boot:

alias ibmphp off
Comment 4 Issue Tracker 2008-12-04 17:07:36 EST
The error occurs on this line of sys_init_module():

        /* Drop initial reference. */
        module_put(mod);

The "initial reference" is created in the module_unload_init() routine
while the module is being loaded and mapped, and it should be dropped
here. The BUG is hit because there is no reference to drop. From the
module_put() in-line function:

        BUG_ON(module_refcount(module) == 0)

The module being loaded is ibmphp.

I cannot think of a way this situation could occur. Perhaps IBM has seen
this before?
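
The check can be illustrated with a small user-space sketch. This is not kernel code: `fake_module`, `fake_module_get`, `fake_module_put`, and `simulate_load` are hypothetical names, and the error return stands in for the point where BUG_ON() would halt the system. It only shows that any put issued before the final one, here an assumed extra put from an init routine, leaves nothing for "Drop initial reference" to drop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a module's reference count. */
struct fake_module {
    unsigned int refcount;
};

/* Take a reference. */
static void fake_module_get(struct fake_module *m)
{
    assert(m != NULL);
    m->refcount++;
}

/*
 * Drop a reference. Returns 0 on success, or -1 in the situation where
 * the kernel check BUG_ON(module_refcount(module) == 0) would fire.
 */
static int fake_module_put(struct fake_module *m)
{
    assert(m != NULL);
    if (m->refcount == 0)
        return -1;
    m->refcount--;
    return 0;
}

/*
 * Replay a module load: the loader takes the initial reference, the
 * module's init routine drops `puts_in_init` references (an assumption
 * made for illustration), and the loader then drops the initial
 * reference. Returns 0 if every put is legal, -1 if one would hit the
 * check.
 */
static int simulate_load(unsigned int puts_in_init)
{
    struct fake_module m = { 0 };
    unsigned int i;

    fake_module_get(&m);                 /* initial reference */
    for (i = 0; i < puts_in_init; i++)
        if (fake_module_put(&m) != 0)
            return -1;                   /* check fired inside init */
    return fake_module_put(&m);          /* "Drop initial reference." */
}
```

With no puts in the init routine the final put succeeds; with one, the count is already zero when the loader drops the initial reference, so the final put is the one that fails, which matches the location of the BUG reported above.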


Product changed from 'Red Hat Enterprise Linux 4.6' to 'Red Hat Enterprise
Linux'
Category set to: Kernel::Modules
Internal Status set to 'Waiting on Support'
Version set to: '4.6'

This event sent from IssueTracker by streeter 
 issue 232278
Comment 5 Issue Tracker 2008-12-04 17:07:38 EST
Well, the reason why it is happening now is that the BUG_ON() test was
added in the 68.28.EL kernel for BZ 280431. It has probably been wrong all
along, but was never caught before.



This event sent from IssueTracker by streeter 
 issue 232278
Comment 6 Guy Streeter 2008-12-04 17:11:23 EST
I think this qualifies as a regression in 4.7. Even though the real problem probably existed in 4.6, it was benign then but now causes a system crash.
Comment 7 RHEL Product and Program Management 2008-12-04 17:43:53 EST
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.
Comment 9 mark wisner 2008-12-10 08:14:20 EST
added bugproxy@us.ibm.com to the cc list for reverse mirroring
Comment 10 Neil Horman 2008-12-10 13:49:30 EST
Created attachment 326535 [details]
patch to mark ibmphp as unsafe to remove

The BUG_ON that I added in my previous patch doesn't need to be removed.  In fact it worked perfectly here, uncovering a very poor use of module_put in the ibmphp driver.  From ibmphp_init:

/* lock ourselves into memory with a module 
 * count of -1 so that no one can unload us. */
        module_put(THIS_MODULE);

The driver is purposely underflowing the module refcount of this driver to prevent it from being unloaded.  That is both poor practice, and an incorrect solution, as a subsequent module_get would return the count to zero, allowing for a possible unload.  If the module is unsafe to unload, there is a call to inform the kernel of exactly that.  I've attached a patch to correct the problem.  Please test it out and confirm that the issue is resolved.  Thanks!
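
Why the underflow is a fragile lock can be sketched in user space. Everything below is a hypothetical model, not the attached patch and not the kernel's data structures: `fake_module`, its `unsafe` flag, and the helper names are assumptions made for illustration. The lsmod output in the description lists ibmphp with a use count of 4294967295, which is consistent with a 32-bit unsigned counter that was decremented below zero.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's struct module. */
struct fake_module {
    unsigned int refcount;  /* unsigned, so 0 - 1 wraps to UINT_MAX */
    bool unsafe;            /* explicit "never unload" marker */
};

/* Unchecked get/put, modelling the behaviour before the BUG_ON(). */
static void fake_get(struct fake_module *m) { m->refcount++; }
static void fake_put(struct fake_module *m) { m->refcount--; }

/* Unload is refused while references remain or the marker is set. */
static bool can_unload(const struct fake_module *m)
{
    assert(m != NULL);
    return m->refcount == 0 && !m->unsafe;
}

/* The underflow trick: the count ends at UINT_MAX after load. */
static struct fake_module load_with_underflow(void)
{
    struct fake_module m = { .refcount = 1, .unsafe = false };

    fake_put(&m);   /* init routine drops a reference it never took */
    fake_put(&m);   /* loader drops the initial reference: wraps */
    return m;
}

/* The explicit alternative: a balanced count plus a marker. */
static struct fake_module load_marked_unsafe(void)
{
    struct fake_module m = { .refcount = 1, .unsafe = false };

    m.unsafe = true; /* init routine marks itself, count untouched */
    fake_put(&m);    /* loader drops the initial reference: count is 0 */
    return m;
}
```

In this model, a single `fake_get()` on the underflowed module brings the count back to zero and `can_unload()` starts returning true, whereas the marked module stays pinned no matter how many balanced get/put pairs follow.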
Comment 11 Neil Horman 2008-12-17 07:28:18 EST
ping, what's the word here?  It would be nice to get this handled by 4.8 close, given that it's marked as a high priority bug.
Comment 12 Guy Streeter 2008-12-17 10:26:08 EST
I don't understand why this is set to needinfo for the QA contact.
Comment 13 Guy Streeter 2008-12-17 10:26:42 EST
I will build a test kernel.
Comment 14 John Jarvis 2008-12-17 14:12:49 EST
Is this happening on a specific IBM server model or is this happening on all IBM xSeries boxes?  Please provide the hardware information for IBM to repro.
Comment 15 Marcus Alves Grando 2008-12-17 15:09:19 EST
(In reply to comment #14)
> Is this happening on a specific IBM server model or is this happening on all
> IBM xSeries boxes?  Please provide the hardware information for IBM to repro.

It's an IBM xSeries 360 (4x XEON 1.4GHz / 4Gb MEM)

Regards
Comment 16 Guy Streeter 2008-12-17 15:11:41 EST
John,
 The problem is not specific to a model. It involves loading the ibmphp module. We believe Neil has identified the problem and we are testing his fix.
Comment 20 IBM Bug Proxy 2009-01-07 10:41:34 EST
The more recent IBM models don't use this module, but there are likely a fair number of systems in the field that do need this module.
Comment 21 Neil Horman 2009-01-09 13:55:15 EST
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
The ibmphp module is unsafe to unload.  The mechanism by which this module is prevented from unloading was, in previous releases, insufficient, and eventually triggered a bug halt.  The new, more correct method of preventing this module from unloading prevents the aforementioned bug halt, but produces a warning message that was previously unrecorded in the message log, indicating that the module is marked as being unsafe to unload.  This warning message can be safely ignored.
Comment 23 Vivek Goyal 2009-01-16 09:43:06 EST
Committed in 78.30.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Comment 32 IBM Bug Proxy 2009-03-27 11:41:04 EDT
------- Comment From lnx1138@linux.vnet.ibm.com 2009-03-27 11:34 EDT-------
Hello, has anyone verified that this is fixed in the latest 4.8 snapshot so we can close it, please? Thanks.
Comment 34 Ryan Lerch 2009-03-29 20:00:05 EDT
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-The ibmphp module is unsafe to unload.  The mechanism by which this module is prevented from unloading was, in previous releases, insufficient, and eventually triggered a bug halt.  The new, more correct method of preventing this module from unloading prevents the aforementioned bug halt, but produces a warning message that was previously unrecorded in the message log, indicating that the module is marked as being unsafe to unload.  This warning message can be safely ignored.
+The ibmphp module is not safe to unload. Previously, the mechanism that prevented the ibmphp module from unloading was insufficient, and eventually triggered a bug halt. With this update, the method to prevent this module from unloading has been improved, preventing the bug halt. However, attempting to unload the module may produce a warning in the message log, indicating that the module is not safe to unload. This warning can be safely ignored.
Comment 35 IBM Bug Proxy 2009-03-30 08:11:12 EDT
------- Comment From mbeeraka@in.ibm.com 2009-03-30 08:09 EDT-------
(In reply to comment #17)
> (In reply to comment #16)
> > Hello, anyone verified this is fixed in latest 4.8 snapshot so we can close
> > please? Thanks.
> >
>
> Alright, we will verify this bug & update the bug report soon.
>

Verified this bug by upgrading the kernel from RHEL4.6 (2.6.9-67.ELsmp) to RHEL4.8-snap1 (2.6.9-84.ELsmp); the problem could not be reproduced.
Comment 38 errata-xmlrpc 2009-05-18 15:22:17 EDT
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html
