Bug 1225370 - thin-pool after few days of uptime fails on allocation order 5
Summary: thin-pool after few days of uptime fails on allocation order 5
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Joe Thornber
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-05-27 08:39 UTC by Zdenek Kabelac
Modified: 2017-05-23 14:17 UTC (History)
CC List: 17 users

Fixed In Version:
Clone Of:
Clones: 1226347, 1244318
Environment:
Last Closed: 2017-05-23 13:32:36 UTC
Type: Bug
Embargoed:


Attachments
Full test trace with visible memory fault kernel report (823.09 KB, text/plain)
2015-05-27 12:43 UTC, Zdenek Kabelac


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1454828 1 None None None 2021-09-06 14:58:36 UTC

Internal Links: 1454828

Description Zdenek Kabelac 2015-05-27 08:39:09 UTC
Description of problem:

When kernel memory becomes fragmented after a couple of days of heavy use, the thin-pool target starts to fail with this allocation error:

lvm: page allocation failure: order:5, mode:0x40d0
CPU: 0 PID: 15987 Comm: lvm Tainted: G        W       4.0.0-0.rc7.git2.1.fc23.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
 0000000000000000 0000000080956201 ffff8800007bf8c8 ffffffff81781898
 0000000000000000 00000000000040d0 ffff8800007bf958 ffffffff811a5bbe
 ffff88007a5eab08 0000000000000005 0000000000000040 ffff88002723d970
Call Trace:
 [<ffffffff81781898>] dump_stack+0x45/0x57
 [<ffffffff811a5bbe>] warn_alloc_failed+0xfe/0x170
 [<ffffffff811a9523>] ? __alloc_pages_direct_compact+0x43/0x100
 [<ffffffff811a9b17>] __alloc_pages_nodemask+0x537/0xa10
 [<ffffffff811f2c21>] alloc_pages_current+0x91/0x110
 [<ffffffff811a5d5b>] alloc_kmem_pages+0x3b/0xf0
 [<ffffffffa012d8ff>] ? dm_bm_unlock+0x2f/0x60 [dm_persistent_data]
 [<ffffffff811c3f6e>] kmalloc_order_trace+0x2e/0xd0
 [<ffffffffa0146766>] pool_ctr+0x486/0x9d0 [dm_thin_pool]
 [<ffffffff815fa65b>] dm_table_add_target+0x15b/0x3b0
 [<ffffffff815fa1b7>] ? dm_table_create+0x87/0x140
 [<ffffffff815fde4b>] table_load+0x14b/0x370
 [<ffffffff815fdd00>] ? retrieve_status+0x1c0/0x1c0
 [<ffffffff815feaf2>] ctl_ioctl+0x232/0x520
 [<ffffffff815fedf3>] dm_ctl_ioctl+0x13/0x20
 [<ffffffff81231d86>] do_vfs_ioctl+0x2c6/0x4d0
 [<ffffffff8114085c>] ? __audit_syscall_entry+0xac/0x100
 [<ffffffff810225d5>] ? do_audit_syscall_entry+0x55/0x80
 [<ffffffff81232011>] SyS_ioctl+0x81/0xa0
 [<ffffffff81788188>] ? int_check_syscall_exit_work+0x34/0x3d
 [<ffffffff81787f49>] system_call_fastpath+0x12/0x17
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:  30
active_anon:28189 inactive_anon:24979 isolated_anon:0
 active_file:119411 inactive_file:79073 isolated_file:0
 unevictable:4714 dirty:441 writeback:0 unstable:0
 free:26120 slab_reclaimable:119444 slab_unreclaimable:46909
 mapped:12817 shmem:32535 pagetables:1146 bounce:0
 free_cma:0
Node 0 DMA free:8060kB min:364kB low:452kB high:544kB active_anon:1144kB inactive_anon:1212kB active_file:208kB inactive_file:164kB unevictable:44kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15900kB mlocked:40kB dirty:0kB writeback:0kB mapped:0kB shmem:1752kB slab_reclaimable:2268kB slab_unreclaimable:1028kB kernel_stack:208kB pagetables:8kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:8 all_unreclaimable? no
lowmem_reserve[]: 0 1893 1893 1893
Node 0 DMA32 free:96420kB min:44688kB low:55860kB high:67032kB active_anon:111612kB inactive_anon:98704kB active_file:477436kB inactive_file:316128kB unevictable:18812kB isolated(anon):0kB isolated(file):0kB present:1988596kB managed:1941380kB mlocked:18800kB dirty:1764kB writeback:0kB mapped:51268kB shmem:128388kB slab_reclaimable:475508kB slab_unreclaimable:186608kB kernel_stack:2416kB pagetables:4576kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:640 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 116*4kB (UEM) 39*8kB (UEM) 7*16kB (U) 9*32kB (UM) 8*64kB (UE) 10*128kB (UEM) 10*256kB (UM) 1*512kB (U) 0*1024kB 1*2048kB (R) 0*4096kB = 8088kB
Node 0 DMA32: 19355*4kB (UEM) 840*8kB (UEM) 346*16kB (UEM) 9*32kB (UM) 3*64kB (M) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 1*2048kB (R) 1*4096kB (R) = 96428kB
Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
234220 total pagecache pages
1507 pages in swap cache
Swap cache stats: add 11781, delete 10274, find 266363/269302
Free swap  = 1031824kB
Total swap = 1048572kB
501145 pages RAM
0 pages HighMem/MovableOnly
11825 pages reserved
0 pages hwpoisoned
device-mapper: table: 253:14: thin-pool: Error allocating memory for pool
device-mapper: ioctl: error adding target to table
device-mapper: reload ioctl on (253:14) failed: Cannot allocate memory

--

## DEBUG: libdm-deptree.c:2646   Loading @PREFIX@vg-LV1 table (253:14)
## DEBUG: libdm-deptree.c:2590   Adding target to (253:14): 0 65536 thin-pool 253:11 253:13 128 0 0
## DEBUG: ioctl/libdm-iface.c:1802   dm table   (253:14) OF   [16384] (*1)
## DEBUG: ioctl/libdm-iface.c:1802   dm reload   (253:14) NF   [16384] (*1)
## DEBUG: ioctl/libdm-iface.c:1834   device-mapper: reload ioctl on (253:14) failed: Cannot allocate memory

lvm2-> Failed to activate pool logical volume @PREFIX@vg/LV1.




Version-Release number of selected component (if applicable):
kernel 4.0
lvm2 2.02.120

How reproducible:
Using the lvm2 test suite for days without rebooting the machine.
In particular, test/shell/lvconvert-thin.sh often fails at these lines:

## Line: 140 	 lvcreate -L32 -n $lv1 $vg
## Line: 141 	 lvcreate -L16 -n $lv2 $vg
## Line: 142 	 lvconvert --yes --thinpool $vg/$lv1


Comment 1 Zdenek Kabelac 2015-05-27 12:43:34 UTC
Created attachment 1030538 [details]
Full test trace with visible memory fault kernel report

Comment 2 Joe Thornber 2015-07-02 14:05:23 UTC
Order 5 is only 128k of memory.  Probably allocated by one of the slabs.  At what point do you suggest I stop using kmalloc and switch to vmalloc()?  32k? 4k?

Or are you suggesting the fragmentation is due to thinp?
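
For context, the order-to-size arithmetic (assuming the standard 4 KiB page size on x86_64, as in the trace above): an order-5 request asks the buddy allocator for 2^5 = 32 physically contiguous pages, i.e. 32 * 4 KiB = 128 KiB. A fragmented system can have plenty of free memory overall yet no single contiguous 128 KiB run left, which is the failure mode visible in the Mem-Info dump.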

Comment 3 Zdenek Kabelac 2015-07-02 15:06:59 UTC
Mikulas - do you have some advice here?

Comment 4 Mikuláš Patočka 2015-07-02 15:44:30 UTC
We could try kmalloc and if it fails, use vmalloc. The code that tries kmalloc and falls back to vmalloc is already present in dm-ioctl.c in function copy_params. So, we should extract that piece of code into a separate function and call that function also from dm-thin.c.

I will write a patch that does that.
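
As a minimal sketch of that pattern (hypothetical helper names, not the actual dm-thin patch; later kernels provide kvmalloc()/kvfree() for exactly this):

#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>

/*
 * Try a physically contiguous kmalloc() first, quietly and without
 * retrying hard, then fall back to vmalloc(), which only needs
 * virtually contiguous memory and so is immune to fragmentation.
 */
static void *pool_alloc_fallback(size_t size)
{
	void *p = kmalloc(size, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);

	if (!p)
		p = vmalloc(size);
	return p;
}

/* The matching free has to check which allocator provided the buffer. */
static void pool_free_fallback(void *p)
{
	if (is_vmalloc_addr(p))
		vfree(p);
	else
		kfree(p);
}

The __GFP_NOWARN/__GFP_NORETRY flags keep kmalloc() from thrashing reclaim and compaction when contiguous pages are scarce, so the vmalloc() fallback engages quickly instead of stalling the ioctl.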

Comment 7 Josh Boyer 2015-07-09 11:50:30 UTC
Do you want the Fedora kernel to carry any of these patches before landing in upstream, or should we just wait for them to be merged?

The kernel tested in the original comment is "old" now, so we'd be looking at adding whatever in rawhide first if something were to be added.

Comment 8 Mike Snitzer 2015-07-09 13:46:14 UTC
(In reply to Josh Boyer from comment #7)
> Do you want the Fedora kernel to carry any of these patches before landing
> in upstream, or should we just wait for them to be merged?
> 
> The kernel tested in the original comment is "old" now, so we'd be looking
> at adding whatever in rawhide first if something were to be added.

I'll be sending a set of 4.2-rc fixes to Linus next week.  AFAIK this issue isn't so common as to warrant a rush _now_.

But this is the fix that is staged and destined for upstream (and stable@):
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=a822c83e47d97cdef38c4352e1ef62d9f46cfe98

You're welcome to pick it up and carry in Fedora rawhide now if you like.

The patches that Mikulas listed in comment#6 have been reworked some and will hopefully land in Linux 4.3 (dm-thinp will be adapted accordingly at that time).

Comment 9 Mike Snitzer 2017-05-23 13:35:16 UTC
Fix has been upstream (and in Fedora) since July 2015.

