Bug 811946

Summary: dm-thinp: memory leak
Product: Red Hat Enterprise Linux 6
Component: kernel
Version: 6.4
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: urgent
Reporter: Heinz Mauelshagen <heinzm>
Assignee: Mike Snitzer <msnitzer>
QA Contact: Cluster QE <mspqa-list>
CC: agk, dwysocha, heinzm, jbrassow, msnitzer, prajnoha, prockai, thornber, zkabelac
Target Milestone: rc
Target Release: ---
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-07-30 21:13:06 UTC
Bug Blocks: 840683    
Attachments:
  reinstate mempool_free in cell_release_singleton (flags: none)
  dm thin: fix memory leak of singleton cell (flags: none)

Description Heinz Mauelshagen 2012-04-12 11:30:20 UTC
Description of problem:

After a run of the thinp-test-suite, which involves many thinp/snapshot configuration changes, a large amount of memory is not freed.

Version-Release number of selected component (if applicable):


How reproducible:

Always, by running the thinp-test-suite.


Steps to Reproduce:
1. Start the test system and set up config.rb
2. Run the test suite

Actual results:
Almost 3GB leaked.

Expected results:
Memory freed.

Additional info:
Test system started with ~500MB used.
Extracted from top after test-suite run:
Mem:   6109712k total,  3480524k used,  2629188k free,    13896k buffer

slabtop:
8822550 8819429  99%    0.12K 294085       30   1176340K size-128
I.e., kfree()s are missing for the corresponding kmalloc()s.

Comment 1 Zdenek Kabelac 2012-04-12 11:50:47 UTC
I'd suggest trying kmemleak (with the Linux debug kernel) - it gets you to the leaking code path pretty quickly.

Comment 2 Mike Snitzer 2012-04-12 12:23:05 UTC
The slab that has all the leaked memory is "size-128".

I was going to try kmemleak today... but Heinz, you're welcome to take ownership of tracking this down (and thanks for opening this BZ).

Comment 3 Mike Snitzer 2012-04-12 12:25:45 UTC
Ah, comment#0 already pointed to slab-128...

A telltale sign of a leak is that the memory is unreclaimable (from /proc/meminfo):

Slab:            2052224 kB
SReclaimable:     121952 kB
SUnreclaim:      1930272 kB

Comment 4 Mike Snitzer 2012-04-12 15:18:57 UTC
I am using a modified Linux 3.3 kernel with 3.4's DM changes, plus a refactoring that splits the bio-prison code out into drivers/md/dm-bio-prison.c -- all bio-prison symbols are exported via dm-mod.

I ran: ./run_tests -n /test_two_pools_can_create_thins/

which resulted in numerous kmemleak reports like the following:

unreferenced object 0xffff8802b1567f40 (size 128):
  comm "kworker/u:86", pid 651, jiffies 4295253235 (age 302.975s)
  hex dump (first 32 bytes):
    00 01 10 00 00 00 ad de 00 02 20 00 00 00 ad de  .......... .....
    00 80 3b f0 02 88 ff ff 01 00 00 00 03 88 ff ff  ..;.............
  backtrace:
    [<ffffffff814a5f4e>] kmemleak_alloc+0x5e/0xc0
    [<ffffffff811459cf>] __kmalloc+0x13f/0x260
    [<ffffffff810fbe95>] mempool_kmalloc+0x15/0x20
    [<ffffffff810fc040>] mempool_alloc+0x60/0x170
    [<ffffffffa02f99be>] dm_bio_detain+0x10e/0x210 [dm_mod]
    [<ffffffffa06c85b0>] process_bio+0x60/0x170 [dm_thin_pool]
    [<ffffffffa06c8788>] process_deferred_bios+0xc8/0x230 [dm_thin_pool]
    [<ffffffffa06c8942>] do_worker+0x52/0x60 [dm_thin_pool]
    [<ffffffff81057092>] process_one_work+0x132/0x450
    [<ffffffff81058e9b>] worker_thread+0x17b/0x3c0
    [<ffffffff8105dd9e>] kthread+0x9e/0xb0
    [<ffffffff814c66a4>] kernel_thread_helper+0x4/0x10
    [<ffffffffffffffff>] 0xffffffffffffffff

dm_bio_detain+0x10e is drivers/md/dm-bio-prison.c:145, which is dm_bio_detain's:

cell2 = mempool_alloc(prison->cell_pool, GFP_NOIO);

prison->cell_pool was created in dm_prison_create() with:

prison->cell_pool = mempool_create_kmalloc_pool(nr_cells, sizeof(struct cell));

sizeof(struct cell) is 72 bytes:

crash> struct -o cell
struct cell {
   [0] struct hlist_node list;
  [16] struct bio_prison *prison;
  [24] struct cell_key key;
  [48] struct bio *holder;
  [56] struct bio_list bios;
}
SIZE: 72

I'm not yet seeing _why_ there is a leak... but I'll continue looking at the bio-prison code in this area.
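
To make the suspected imbalance concrete, here is a minimal C sketch, assuming the release path is the culprit. This is NOT the actual dm-bio-prison.c source: the struct layout matches the crash output above, but the function names and bodies are placeholders. Every detained bio pulls a cell from prison->cell_pool, so every release path has to hand it back with mempool_free() -- which, per the patch title in comment 5, cell_release_singleton was not doing.

#include <linux/types.h>
#include <linux/list.h>
#include <linux/mempool.h>
#include <linux/bio.h>

struct cell_key {			/* simplified placeholder key */
	u64 dev;
	u64 block;
};

struct bio_prison {
	mempool_t *cell_pool;		/* mempool_create_kmalloc_pool(nr_cells, sizeof(struct cell)) */
	/* hash table, spinlock, ... */
};

struct cell {				/* matches the 72-byte layout from crash above */
	struct hlist_node list;
	struct bio_prison *prison;
	struct cell_key key;
	struct bio *holder;
	struct bio_list bios;
};

/* detain path: every detained bio gets a cell from the mempool */
static struct cell *detain(struct bio_prison *prison, struct bio *inmate)
{
	struct cell *cell = mempool_alloc(prison->cell_pool, GFP_NOIO);

	/* ...fill in key/holder, add the cell to the prison hash... */
	return cell;
}

/* release path for a cell holding a single bio */
static void cell_release_singleton(struct cell *cell, struct bio *bio)
{
	/* ...unhash the cell and hand back the holder bio... */

	/*
	 * Without this, the singleton cell is simply dropped and one
	 * size-128 allocation leaks per detained bio -- the
	 * mempool_free() the patch in comment 5 reinstates.
	 */
	mempool_free(cell, cell->prison->cell_pool);
}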

Comment 5 Alasdair Kergon 2012-04-12 16:19:39 UTC
Created attachment 577128 [details]
reinstate mempool_free in cell_release_singleton

Patch to try attached.

Comment 6 Alasdair Kergon 2012-04-12 16:34:47 UTC
Also, can we consider using mempool_create_slab_pool, to keep memory like this segregated rather than allocated via kmalloc?
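
For illustration, a rough sketch of what that conversion could look like, assuming a dedicated cache for the cells (the names here are hypothetical; the actual patch went to dm-devel, see comment 10). With its own slab cache the cell allocations no longer land in the generic size-128 slab, so a leak shows up on its own slabtop line:

#include <linux/slab.h>
#include <linux/mempool.h>

/* hypothetical names; 'struct cell' as in the sketch under comment 4 */
static struct kmem_cache *_cell_cache;

static mempool_t *create_cell_pool(unsigned nr_cells)
{
	mempool_t *pool;

	_cell_cache = KMEM_CACHE(cell, 0);	/* dedicated cache instead of kmalloc */
	if (!_cell_cache)
		return NULL;

	/* replaces mempool_create_kmalloc_pool(nr_cells, sizeof(struct cell)) */
	pool = mempool_create_slab_pool(nr_cells, _cell_cache);
	if (!pool)
		kmem_cache_destroy(_cell_cache);

	return pool;
}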

Comment 7 RHEL Program Management 2012-04-12 16:40:14 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 8 Mike Snitzer 2012-04-12 19:57:24 UTC
(In reply to comment #5)
> Created attachment 577128 [details]
> reinstate mempool_free in cell_release_singleton
> 
> Patch to try attached.

The 'prison' variable is undefined there.  I took a different approach; will attach.

Comment 9 Mike Snitzer 2012-04-12 20:01:54 UTC
Created attachment 577160 [details]
dm thin: fix memory leak of singleton cell

Verified to eliminate the singleton-cell leak against upstream Linux 3.4 -- this leak is _not_ a concern for RHEL6.3.

Comment 10 Mike Snitzer 2012-04-12 20:04:26 UTC
(In reply to comment #6)
> Also, can we consider mempool_create_slab_pool to keep memory like this
> segregated instead of kmalloc?

Yes, I have a patch I'll post to dm-devel.

If you'd like this conversion applied to RHEL6.3 it can easily be ported.

Comment 11 Mike Snitzer 2012-04-12 21:35:52 UTC
kmemleak found what it thinks is another leak (which appears relevant to RHEL6.3's thinp):

unreferenced object 0xffff88032ae2fb40 (size 32):
  comm "dmsetup", pid 5477, jiffies 4296842262 (age 6021.209s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 b0 57 c3 12 03 88 ff ff  .........W......
    03 00 00 00 00 00 00 00 6d 2d 32 00 00 00 00 00  ........m-2.....
  backtrace:
    [<ffffffff814a5f4e>] kmemleak_alloc+0x5e/0xc0
    [<ffffffff81144983>] kmem_cache_alloc_trace+0xd3/0x200
    [<ffffffffa06ba125>] insert_shadow+0x35/0x90 [dm_persistent_data]
    [<ffffffffa06ba20e>] dm_tm_new_block+0x8e/0xa0 [dm_persistent_data]
    [<ffffffffa06b8758>] sm_ll_extend+0xc8/0x120 [dm_persistent_data]
    [<ffffffffa06b9dd3>] dm_sm_metadata_create+0x103/0x1e0 [dm_persistent_data]
    [<ffffffffa06ba593>] dm_tm_create_internal+0xe3/0x200 [dm_persistent_data]
    [<ffffffffa06ba6db>] dm_tm_create_with_sm+0x2b/0x30 [dm_persistent_data]
    [<ffffffffa0357aeb>] init_pmd.clone.0+0x3b/0x370 [dm_thin_pool]
    [<ffffffffa0357f4e>] dm_pool_metadata_open+0x12e/0x2ec [dm_thin_pool]
    [<ffffffffa03540a9>] pool_create+0x39/0x3e0 [dm_thin_pool]
    [<ffffffffa0356056>] pool_ctr+0x296/0x460 [dm_thin_pool]
    [<ffffffffa02fedde>] dm_table_add_target+0x13e/0x3b0 [dm_mod]
    [<ffffffffa03014f9>] table_load+0xc9/0x340 [dm_mod]
    [<ffffffffa0302874>] ctl_ioctl+0x1b4/0x270 [dm_mod]
    [<ffffffffa0302943>] dm_ctl_ioctl+0x13/0x20 [dm_mod]

There were only 3 of these leaked, so it is considerably rarer than the leak from comment#4.

drivers/md/persistent-data/dm-transaction-manager.c:insert_shadow calls:
si = kmalloc(sizeof(*si), GFP_NOIO);

crash> struct -o shadow_info
struct shadow_info {
   [0] struct hlist_node hlist;
  [16] dm_block_t where;
}
SIZE: 24
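
For context, a simplified sketch of the allocation pattern in that trace, assuming the missing kfree() belongs on the path that wipes the shadow table during commit/teardown. The struct matches the crash output; the surrounding helpers are placeholders, not the RHEL source:

#include <linux/slab.h>
#include <linux/list.h>
#include <linux/types.h>

typedef u64 dm_block_t;

struct shadow_info {			/* 24 bytes, lands in the 32-byte slab */
	struct hlist_node hlist;
	dm_block_t where;
};

/* one shadow_info is kmalloc'd per block shadowed within a transaction */
static int insert_shadow(struct hlist_head *bucket, dm_block_t b)
{
	struct shadow_info *si = kmalloc(sizeof(*si), GFP_NOIO);

	if (!si)
		return -ENOMEM;
	si->where = b;
	hlist_add_head(&si->hlist, bucket);
	return 0;
}

/*
 * Every commit/destroy path must walk the buckets and kfree each entry,
 * otherwise the 32-byte objects pile up exactly as kmemleak reports.
 * (Pre-3.9 hlist iterator signature, matching the kernels discussed here.)
 */
static void wipe_shadows(struct hlist_head *bucket)
{
	struct shadow_info *si;
	struct hlist_node *pos, *tmp;

	hlist_for_each_entry_safe(si, pos, tmp, bucket, hlist) {
		hlist_del(&si->hlist);
		kfree(si);
	}
}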

Comment 12 Mike Snitzer 2012-04-12 22:27:09 UTC
The leak from comment#11 is induced by thinp-test-suite's 'test_mass_create_apply_remove_dtrandom_device' test.

Comment 13 Mike Snitzer 2012-04-12 22:43:58 UTC
(In reply to comment #12)
> The leak from comment#11 is induced by thinp-test-suite's
> 'test_mass_create_apply_remove_dtrandom_device' test.

But unfortunately, it isn't a reliable reproducer.

Comment 14 Alasdair Kergon 2012-04-12 22:50:48 UTC
I suggest moving this to a separate bugzilla to simplify tracking.

Comment 15 Mike Snitzer 2012-04-13 00:08:11 UTC
(In reply to comment #14)
> I suggest moving this to a separate bugzilla to simplify tracking.

Well, this BZ is suspect as is.  The original leak report is not relevant for RHEL6.3 -- the upstream Linux 3.4 changes aren't in RHEL6.3.

So the only thing this bug would be for (if the leak in comment#11 is moved to a new BZ) is the cleanup to use slab caches in dm-thin.c.

IMHO, that cleanup isn't even really needed given how late we are in the RHEL6.3 cycle (and the fact that the cleanup doesn't actually fix anything).

Comment 16 Mike Snitzer 2012-04-13 15:04:17 UTC
Here is another stack trace for the leak associated with insert_shadow's allocation of 'struct shadow_info' (first reported in comment#11):

unreferenced object 0xffff88032d7b5880 (size 32):
  comm "dmsetup", pid 6324, jiffies 4296338651 (age 916.002s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 d8 67 4f fd 02 88 ff ff  .........gO.....
    02 00 00 00 00 00 00 00 6d 2d 30 00 ff ff ff ff  ........m-0.....
  backtrace:
    [<ffffffff814a5f4e>] kmemleak_alloc+0x5e/0xc0
    [<ffffffff81144983>] kmem_cache_alloc_trace+0xd3/0x200
    [<ffffffffa05d8125>] insert_shadow+0x35/0x90 [dm_persistent_data]
    [<ffffffffa05d820e>] dm_tm_new_block+0x8e/0xa0 [dm_persistent_data]
    [<ffffffffa05daeab>] new_block+0x1b/0x20 [dm_persistent_data]
    [<ffffffffa05d92e0>] dm_btree_empty+0x30/0xd0 [dm_persistent_data]
    [<ffffffffa05d60de>] sm_ll_new_metadata+0x8e/0xa0 [dm_persistent_data]
    [<ffffffffa05d7db0>] dm_sm_metadata_create+0xe0/0x1e0 [dm_persistent_data]
    [<ffffffffa05d8593>] dm_tm_create_internal+0xe3/0x200 [dm_persistent_data]
    [<ffffffffa05d86db>] dm_tm_create_with_sm+0x2b/0x30 [dm_persistent_data]
    [<ffffffffa054caeb>] init_pmd.clone.0+0x3b/0x370 [dm_thin_pool]
    [<ffffffffa054cf4e>] dm_pool_metadata_open+0x12e/0x2ec [dm_thin_pool]
    [<ffffffffa05490a9>] pool_create+0x39/0x3e0 [dm_thin_pool]
    [<ffffffffa054b056>] pool_ctr+0x296/0x460 [dm_thin_pool]
    [<ffffffffa02dddde>] dm_table_add_target+0x13e/0x3b0 [dm_mod]
    [<ffffffffa02e04f9>] table_load+0xc9/0x340 [dm_mod]

test_discard_random_sectors(DiscardTests): pool status metadata(90/1048576) data(0/262144)
Fri Apr 13 10:53:30 -0400 2012 wiping dev
entering discard loop
Apr 13 10:55:39 rhel-storage-01 kernel: kmemleak: 6 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

Bottom line: running the thinp-test-suite will cause kmemleak to record a leak associated with insert_shadow... but there isn't one specific test that reliably causes it.

Comment 17 Mike Snitzer 2012-04-13 15:13:35 UTC
Any leak fixes will be introduced when the upstream Linux 3.4 thinp code is introduced in RHEL6.4.  Pushing to 6.4.

Comment 19 Mike Snitzer 2012-06-20 18:11:50 UTC
The leak in comment#4 is fixed in this 6.4 patch via bug #828955:
[RHEL6.4 PATCH 23/34] dm thin: fix stacked bi_next usage

The leak in comment#11 has been fixed with this patch:
http://www.redhat.com/archives/dm-devel/2012-June/msg00110.html

Leaving this BZ open for now; it could be used to commit the fix for the leak from comment#11... or closed as a dup of a new BZ that will be opened to commit the 2nd push of DM fixes for RHEL6.4.

Comment 20 Mike Snitzer 2012-07-30 21:13:06 UTC
The fix for the leak in comment#11 was posted to rhkernel-list using bug #828955.

*** This bug has been marked as a duplicate of bug 828955 ***