Bug 1015024

Summary: LVM RAID sometimes fails on kmem_cache sanity check
Product: Fedora
Component: kernel
Version: rawhide
Status: CLOSED RAWHIDE
Reporter: Zdenek Kabelac <zkabelac>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
CC: agk, bmarzins, bmr, dwysocha, gansalmon, heinzm, itamar, jbrassow, jonathan, kernel-maint, lvm-team, madhu.chinakonda, marcelo.barbosa, msnitzer, prajnoha, prockai, zkabelac
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-11-18 16:32:17 UTC
Bug Blocks: 1031086

Description Zdenek Kabelac 2013-10-03 10:28:26 UTC
Description of problem:

Conversion of an lvm2 RAID LV appears to trigger kmem_cache_sanity_check,
which then causes the dm table reload to fail:

 DEBUG: ioctl/libdm-iface.c:1750   dm table   (253:11) OF   [16384] (*1)
 DEBUG: libdm-deptree.c:2511   Suppressed @PREFIX@vg-LV1_rmeta_0 (253:11) identical table reload.
 DEBUG: libdm-deptree.c:2476   Loading @PREFIX@vg-LV1 table (253:19)
 DEBUG: libdm-deptree.c:2420   Adding target to (253:19): 0 1536 raid raid5_ls 3 128 region_size 512 4 253:11 253:12 253:20 253:21 253:15 253:16 253:17 253:18
 DEBUG: ioctl/libdm-iface.c:1750   dm table   (253:19) OF   [16384] (*1)
 DEBUG: ioctl/libdm-iface.c:1750   dm reload   (253:19) NF   [16384] (*1)
 DEBUG: ioctl/libdm-iface.c:1768   device-mapper: reload ioctl on  failed: Input/output error
 DEBUG: libdm-deptree.c:2572   <backtrace>
 DEBUG: activate/dev_manager.c:2680   <backtrace>
 DEBUG: activate/dev_manager.c:2728   <backtrace>
 DEBUG: activate/activate.c:1078   <backtrace>
 DEBUG: activate/activate.c:1676   <backtrace>
 DEBUG: locking/locking.c:394   <backtrace>
 DEBUG: locking/locking.c:464   <backtrace>
 DEBUG: metadata/raid_manip.c:1751   Failed to suspend @PREFIX@vg/LV1 before committing changes
 DEBUG: lvconvert.c:2638   <backtrace>


#lvconvert-raid.sh:238+ lvconvert --replace @TESTDIR@/dev/mapper/@PREFIX@pv2 @PREFIX@vg/LV1



mdX: bitmap initialized from disk: read 1 pages, set 0 of 1 bits
md: recovery of RAID array mdX
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
md: using 128k window, over a total of 256k.
@TESTDIR@/dev/mapper/@PREFIX@vg-LV1_rmeta_4 not set up by udev: Falling back to direct node creation.
  @TESTDIR@/dev/mapper/@PREFIX@vg-LV1_rimage_4 not set up by udev: Falling back to direct node creation.
md: mdX: recovery done.
md/raid:mdX: device dm-18 operational as raid disk 3
md/raid:mdX: device dm-16 operational as raid disk 2
md/raid:mdX: device dm-12 operational as raid disk 0
kmem_cache_sanity_check (raid5-ffff88004ae96010): Cache name already exists.
CPU: 0 PID: 14234 Comm: lvm Not tainted 3.11.1-300.fc20.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
 00007ffffffff000 ffff8800757979f0 ffffffff81643cbb ffff880076f3df00
 ffff880075797a60 ffffffff8115d45d 0000000000000000 0000000000000000
 00000000000005d0 0000000000000000 ffff8800371a5cd0 ffff880075797fd8
Call Trace:
 [<ffffffff81643cbb>] dump_stack+0x45/0x56
 [<ffffffff8115d45d>] kmem_cache_create_memcg+0x12d/0x380
 [<ffffffff8115d6db>] kmem_cache_create+0x2b/0x30
 [<ffffffffa012aa05>] setup_conf+0x5c5/0x790 [raid456]
 [<ffffffff8113ebcd>] ? mempool_create_node+0xdd/0x140
 [<ffffffff8118e527>] ? kmem_cache_alloc_trace+0x1d7/0x230
 [<ffffffff8113e8b0>] ? mempool_alloc_slab+0x20/0x20
 [<ffffffffa012b748>] run+0x858/0xa50 [raid456]
 [<ffffffff811dbff6>] ? bioset_create+0x216/0x2e0
 [<ffffffff8113d365>] ? filemap_write_and_wait+0x55/0x60
 [<ffffffff814daecc>] md_run+0x3fc/0x980
 [<ffffffff811dc76d>] ? bio_put+0x7d/0xa0
 [<ffffffff814d2bb8>] ? sync_page_io+0xc8/0x140
 [<ffffffffa014051c>] raid_ctr+0xecc/0x135d [dm_raid]
 [<ffffffff814e6527>] dm_table_add_target+0x167/0x470
 [<ffffffff814e952b>] table_load+0x10b/0x320
 [<ffffffff814e9420>] ? list_devices+0x180/0x180
 [<ffffffff814ea315>] ctl_ioctl+0x255/0x500
 [<ffffffff814ea5d3>] dm_ctl_ioctl+0x13/0x20
 [<ffffffff811b9bdd>] do_vfs_ioctl+0x2dd/0x4b0
 [<ffffffff811aa3a1>] ? __sb_end_write+0x31/0x60
 [<ffffffff811a7f92>] ? vfs_write+0x172/0x1e0
 [<ffffffff811b9e31>] SyS_ioctl+0x81/0xa0
 [<ffffffff81652e59>] system_call_fastpath+0x16/0x1b
md/raid:mdX: couldn't allocate 4338kB for buffers
md: pers->run() failed ...
device-mapper: table: 253:19: raid: Fail to run raid array
device-mapper: ioctl: error adding target to table
device-mapper: reload ioctl on  failed: Input/output error
  Failed to suspend @PREFIX@vg/LV1 before committing changes
  Node @TESTDIR@/dev/mapper/@PREFIX@vg-LV1_rmeta_1 was not removed by udev. Falling back to direct node removal.
  Node @TESTDIR@/dev/mapper/@PREFIX@vg-LV1_rimage_1 was not removed by udev. Falling back to direct node removal.

Version-Release number of selected component (if applicable):
3.11.1-300.fc20.x86_64

How reproducible:
The lvconvert-raid test from the internal lvm2 test suite occasionally triggers this issue; it appears to be timing-dependent.

Comment 1 Jonathan Earl Brassow 2013-10-22 02:00:50 UTC
Bug fixed by the following upstream commit:

commit 3e374919b314f20e2a04f641ebc1093d758f66a4
Author: Christoph Lameter <cl>
Date:   Sat Sep 21 21:56:34 2013 +0000

    slab_common: Do not check for duplicate slab names
    
    SLUB can alias multiple slab kmem_create_requests to one slab cache to save
    memory and increase the cache hotness. As a result the name of the slab can be
    stale. Only check the name for duplicates if we are in debug mode where we do
    not merge multiple caches.
    
    This fixes the following problem reported by Jonathan Brassow:
    
      The problem with kmem_cache* is this:
    
      *) Assume CONFIG_SLUB is set
      1) kmem_cache_create(name="foo-a")
      - creates new kmem_cache structure
      2) kmem_cache_create(name="foo-b")
      - If identical cache characteristics, it will be merged with the previously
        created cache associated with "foo-a".  The cache's refcount will be
        incremented and an alias will be created via sysfs_slab_alias().
      3) kmem_cache_destroy(<ptr>)
      - Attempting to destroy cache associated with "foo-a", but instead the
        refcount is simply decremented.  I don't even think the sysfs aliases are
        ever removed...
      4) kmem_cache_create(name="foo-a")
      - This FAILS because kmem_cache_sanity_check collides with the existing
        name ("foo-a") associated with the non-removed cache.
    
      This is a problem for RAID (specifically dm-raid) because the name used
      for the kmem_cache_create is ("raid%d-%p", level, mddev).  If the cache
      persists for long enough, the memory address of an old mddev will be
      reused for a new mddev - causing an identical formulation of the cache
      name.  Even though kmem_cache_destroy had long ago been used to delete
      the old cache, the merging of caches has caused the name and cache of that
      old instance to be preserved and causes a collision (and thus failure) in
      kmem_cache_create().  I see this regularly in my testing.
    
    Reported-by: Jonathan Brassow <jbrassow>
    Signed-off-by: Christoph Lameter <cl>
    Signed-off-by: Pekka Enberg <penberg>
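
To make the collision concrete, here is a minimal userspace sketch (not kernel code; struct fake_mddev and raid_cache_name are illustrative stand-ins, and glibc is not guaranteed to reuse the freed address) of how the "raid%d-%p" name formulation described above can produce a duplicate cache name once an old mddev's memory is handed out again:

    /*
     * Illustrative userspace sketch only -- not the kernel implementation.
     * It mimics the "raid%d-%p" formulation from the commit message: once
     * the old mddev's address is reused for a new mddev, the generated name
     * is identical, and because the old (merged/aliased) cache was never
     * really removed, kmem_cache_sanity_check would reject the new
     * kmem_cache_create() call with "Cache name already exists".
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct fake_mddev { char pad[512]; };        /* stand-in for struct mddev */

    static void raid_cache_name(char *buf, size_t len, int level, void *mddev)
    {
            /* same formulation the commit message quotes for dm-raid */
            snprintf(buf, len, "raid%d-%p", level, mddev);
    }

    int main(void)
    {
            char first[64], second[64];

            struct fake_mddev *md = malloc(sizeof(*md));
            raid_cache_name(first, sizeof(first), 5, md);
            free(md);                            /* old RAID array torn down */

            /* The allocator may return the same address for the next
             * same-sized allocation, just as the kernel can reuse an old
             * mddev's memory for a new array. */
            struct fake_mddev *md2 = malloc(sizeof(*md2));
            raid_cache_name(second, sizeof(second), 5, md2);

            printf("old cache name: %s\nnew cache name: %s\n", first, second);
            if (strcmp(first, second) == 0)
                    printf("identical names -> the second kmem_cache_create() "
                           "would fail the sanity check\n");

            free(md2);
            return 0;
    }

The key point, per steps 2-4 of the list above, is that the first cache's name stays registered because the "destroy" only dropped a refcount on a merged cache, so the reused address makes the new create collide with it.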

Comment 3 Josh Boyer 2013-10-22 12:51:34 UTC
The bug is reported against rawhide.  Rawhide already has this commit.

Does this need to be sent to upstream stable?