RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 769652 - scsi_alloc_sdev can leak memory
Summary: scsi_alloc_sdev can leak memory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.2
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Jeff Moyer
QA Contact: Gris Ge
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-12-21 15:47 UTC by Jeff Moyer
Modified: 2012-06-20 08:12 UTC (History)
3 users (show)

Fixed In Version: kernel-2.6.32-235.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-20 08:12:00 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2012:0862 0 normal SHIPPED_LIVE Moderate: Red Hat Enterprise Linux 6 kernel security, bug fix and enhancement update 2012-06-20 12:55:00 UTC

Description Jeff Moyer 2011-12-21 15:47:17 UTC
Description of problem:

Upstream commit f7c9c6bb14f3104608a3a83cadea10a6943d2804
Author: Anton Blanchard <anton>
Date:   Thu Nov 3 08:56:22 2011 +1100

    [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev

    When looking at memory consumption issues I noticed quite a
    lot of memory in the kmalloc-2048 bucket:

      OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
      6561   6471  98%    2.30K    243       27     15552K kmalloc-2048

    Over 15MB. slub debug shows that cfq is responsible for almost
    all of it:

    # sort -nr /sys/kernel/slab/kmalloc-2048/alloc_calls
    6402 .cfq_init_queue+0xec/0x460 age=43423/43564/43655 pid=1 cpus=4,11,13

    In scsi_alloc_sdev we do scsi_alloc_queue but if slave_alloc
    fails we don't free it with scsi_free_queue.

    The patch below fixes the issue:

      OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
       135     72  53%    2.30K      5       27       320K kmalloc-2048

    # cat /sys/kernel/slab/kmalloc-2048/alloc_calls
    3 .cfq_init_queue+0xec/0x460 age=3811/3876/3925 pid=1 cpus=4,11,13

    Signed-off-by: Anton Blanchard <anton>
    Cc: <stable>             #2.6.38+
    Signed-off-by: James Bottomley <JBottomley>


Also required is the follow-on fix:

Upstream commit 4e6c82b3614a18740ef63109d58743a359266daf
Author: James Bottomley <James.Bottomley>
Date:   Mon Nov 7 08:51:24 2011 -0600

    [SCSI] fix WARNING: at drivers/scsi/scsi_lib.c:1704

    On Mon, 2011-11-07 at 17:24 +1100, Stephen Rothwell wrote:
    > Hi all,
    >
    > Starting some time last week I am getting the following during boot on
    > our PPC970 blade:
    >
    > calling  .ipr_init+0x0/0x68 @ 1
    > ipr: IBM Power RAID SCSI Device Driver version: 2.5.2 (April 27, 2011)
    > ipr 0000:01:01.0: Found IOA with IRQ: 26
    > ipr 0000:01:01.0: Starting IOA initialization sequence.
    > ipr 0000:01:01.0: Adapter firmware version: 06160039
    > ipr 0000:01:01.0: IOA initialized.
    > scsi0 : IBM 572E Storage Adapter
    > ------------[ cut here ]------------
    > WARNING: at drivers/scsi/scsi_lib.c:1704
    > Modules linked in:
    > NIP: c00000000053b3d4 LR: c00000000053e5b0 CTR: c000000000541d70
    > REGS: c0000000783c2f60 TRAP: 0700   Not tainted  (3.1.0-autokern1)
    > MSR: 8000000000029032 <EE,ME,CE,IR,DR>  CR: 24002024  XER: 20000002
    > TASK = c0000000783b8000[1] 'swapper' THREAD: c0000000783c0000 CPU: 0
    > GPR00: 0000000000000001 c0000000783c31e0 c000000000cf38b0 c00000000239a9d0
    > GPR04: c000000000cbe8f8 0000000000000000 c0000000783c3040 0000000000000000
    > GPR08: c000000075daf488 c000000078a3b7ff c000000000bcacc8 0000000000000000
    > GPR12: 0000000044002028 c000000007ffb000 0000000002e40000 000000000099b800
    > GPR16: 0000000000000000 c000000000bba5fc c000000000a61db8 0000000000000000
    > GPR20: 0000000001b77200 0000000000000000 c000000078990000 0000000000000001
    > GPR24: c000000002396828 0000000000000000 0000000000000000 c000000078a3b938
    > GPR28: fffffffffffffffa c0000000008ad2c0 c000000000c7faa8 c00000000239a9d0
    > NIP [c00000000053b3d4] .scsi_free_queue+0x24/0x90
    > LR [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0
    > Call Trace:
    > [c0000000783c31e0] [c000000000c7faa8] wireless_seq_fops+0x278d0/0x2eb88 (unreliable)
    > [c0000000783c3270] [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0
    > [c0000000783c3330] [c00000000053eba0] .scsi_probe_and_add_lun+0x390/0xb40
    > [c0000000783c34a0] [c00000000053f7ec] .__scsi_scan_target+0x16c/0x650
    > [c0000000783c35f0] [c00000000053fd90] .scsi_scan_channel+0xc0/0x100
    > [c0000000783c36a0] [c00000000053fefc] .scsi_scan_host_selected+0x12c/0x1c0
    > [c0000000783c3750] [c00000000083dcb4] .ipr_probe+0x2c0/0x390
    > [c0000000783c3830] [c0000000003f50b4] .local_pci_probe+0x34/0x50
    > [c0000000783c38a0] [c0000000003f5f78] .pci_device_probe+0x148/0x150
    > [c0000000783c3950] [c0000000004e1e8c] .driver_probe_device+0xdc/0x210
    > [c0000000783c39f0] [c0000000004e20cc] .__driver_attach+0x10c/0x110
    > [c0000000783c3a80] [c0000000004e1228] .bus_for_each_dev+0x98/0xf0
    > [c0000000783c3b30] [c0000000004e1bf8] .driver_attach+0x28/0x40
    > [c0000000783c3bb0] [c0000000004e07d8] .bus_add_driver+0x218/0x340
    > [c0000000783c3c60] [c0000000004e2a2c] .driver_register+0x9c/0x1b0
    > [c0000000783c3d00] [c0000000003f62d4] .__pci_register_driver+0x64/0x140
    > [c0000000783c3da0] [c000000000b99f88] .ipr_init+0x4c/0x68
    > [c0000000783c3e20] [c00000000000ad24] .do_one_initcall+0x1a4/0x1e0
    > [c0000000783c3ee0] [c000000000b512d0] .kernel_init+0x14c/0x1fc
    > [c0000000783c3f90] [c000000000022468] .kernel_thread+0x54/0x70
    > Instruction dump:
    > ebe1fff8 7c0803a6 4e800020 7c0802a6 fba1ffe8 fbe1fff8 7c7f1b78 f8010010
    > f821ff71 e8030398 3120ffff 7c090110 <0b000000> e86303b0 482de065 60000000
    > ---[ end trace 759bed76a85e8dec ]---
    > scsi 0:0:1:0: Direct-Access     IBM-ESXS MAY2036RC        T106 PQ: 0 ANSI: 5
    > ------------[ cut here ]------------
    >
    > I get lots more of these.  The obvious commit to point the finger at
    > is 3308511c93e6 ("[SCSI] Make scsi_free_queue() kill pending SCSI
    > commands") but the root cause may be something different.

    Caused by

    commit f7c9c6bb14f3104608a3a83cadea10a6943d2804
    Author: Anton Blanchard <anton>
    Date:   Thu Nov 3 08:56:22 2011 +1100

        [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev

    Doesn't completely do the teardown.  The true fix is to do a proper
    teardown instead of hand rolling it

    Reported-by: Stephen Rothwell <sfr.org.au>
    Tested-by: Stephen Rothwell <sfr.org.au>
    Cc: stable       #2.6.38+
    Signed-off-by: James Bottomley <JBottomley>


Version-Release number of selected component (if applicable):


How reproducible:
Storage dependent.  I was not able to reproduce it.

Comment 1 RHEL Program Management 2011-12-21 15:59:57 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 4 Aristeu Rozanski 2012-02-21 19:49:42 UTC
Patch(es) available on kernel-2.6.32-235.el6

Comment 7 Gris Ge 2012-05-24 07:54:07 UTC
Got no kmemleak in ppc64 or ipr driver in RHEL 6.2 GA kernel -220.

The folder of "/sys/kernel/slab" in comment 0  cannot be found in debug kernel, no idea how to enable it.

Sanity Only.

Please feel free to comment if you have sufficient test case.

Comment 9 errata-xmlrpc 2012-06-20 08:12:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0862.html


Note You need to log in before you can comment on or make changes to this bug.