Bug 1354686 - vgsplit can lead to OOM
Summary: vgsplit can lead to OOM
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Heinz Mauelshagen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-07-11 23:16 UTC by Corey Marthaler
Modified: 2016-11-04 04:15 UTC
CC: 7 users

Fixed In Version: lvm2-2.02.161-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-04 04:15:47 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1445 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2016-11-03 13:46:41 UTC

Description Corey Marthaler 2016-07-11 23:16:11 UTC
Description of problem:

SCENARIO - [split_two_device_raid_from_vg]
Split out a two device raid LV from VG
create a linear and a raid in the same vg (different pvs)
lvcreate --alloc anywhere --type raid5 -i 2 -n raid -L 100M seven /dev/sdb1 /dev/sdd1
deactivating seven/raid
host-082: vgsplit -n raid seven ten
vgsplit attempt failed


[root@host-082 ~]# lvs -a -o +devices
  LV              VG    Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                                           
  linear          seven -wi-a----- 100.00m                                                     /dev/sdc1(0)                                      
  raid            seven rwa---r--- 104.00m                                                     raid_rimage_0(0),raid_rimage_1(0),raid_rimage_2(0)
  [raid_rimage_0] seven Iwa---r---  52.00m                                                     /dev/sdb1(1)                                      
  [raid_rimage_1] seven Iwa---r---  52.00m                                                     /dev/sdd1(1)                                      
  [raid_rimage_2] seven Iwa---r---  52.00m                                                     /dev/sdb1(15)                                     
  [raid_rmeta_0]  seven ewa---r---   4.00m                                                     /dev/sdb1(0)                                      
  [raid_rmeta_1]  seven ewa---r---   4.00m                                                     /dev/sdd1(0)                                      
  [raid_rmeta_2]  seven ewa---r---   4.00m                                                     /dev/sdb1(14)                                     

Jul 11 17:47:30 host-082 qarshd[25629]: Running cmdline: vgsplit -n raid seven ten
Jul 11 18:10:23 host-082 kernel: dhclient invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
Jul 11 18:10:23 host-082 kernel: dhclient cpuset=/ mems_allowed=0
Jul 11 18:10:23 host-082 kernel: CPU: 0 PID: 648 Comm: dhclient Not tainted 3.10.0-419.el7.x86_64 #1
Jul 11 18:10:23 host-082 kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
Jul 11 18:10:23 host-082 kernel: ffff880039e82a00 00000000e73fc544 ffff88003d333760 ffffffff81657044
Jul 11 18:10:23 host-082 kernel: ffff88003d3337f0 ffffffff81652778 ffffffff8129d76b ffff88003cdff360
Jul 11 18:10:23 host-082 kernel: ffff88003cdff378 ffffffff00000202 fffeefff00000000 0000000000000001
Jul 11 18:10:23 host-082 kernel: Call Trace:
Jul 11 18:10:23 host-082 kernel: [<ffffffff81657044>] dump_stack+0x19/0x1b
Jul 11 18:10:23 host-082 kernel: [<ffffffff81652778>] dump_header+0x8e/0x225
Jul 11 18:10:23 host-082 kernel: [<ffffffff8129d76b>] ? cred_has_capability+0x6b/0x120
Jul 11 18:10:23 host-082 kernel: [<ffffffff81132f03>] ? delayacct_end+0x33/0xb0
Jul 11 18:10:23 host-082 kernel: [<ffffffff81179b0e>] oom_kill_process+0x24e/0x3c0
Jul 11 18:10:23 host-082 kernel: [<ffffffff8117a346>] out_of_memory+0x4b6/0x4f0
Jul 11 18:10:23 host-082 kernel: [<ffffffff81180738>] __alloc_pages_nodemask+0xaa8/0xba0
Jul 11 18:10:23 host-082 kernel: [<ffffffff811c670a>] alloc_pages_vma+0x9a/0x150
Jul 11 18:10:23 host-082 kernel: [<ffffffff811b76bb>] read_swap_cache_async+0xeb/0x160
Jul 11 18:10:23 host-082 kernel: [<ffffffff811b77d8>] swapin_readahead+0xa8/0x110
Jul 11 18:10:23 host-082 kernel: [<ffffffff811a4e49>] handle_mm_fault+0xab9/0xfc0
Jul 11 18:10:23 host-082 kernel: [<ffffffff81190780>] ? shmem_add_to_page_cache.isra.19+0xe0/0x160
Jul 11 18:10:23 host-082 kernel: [<ffffffff81662ab8>] __do_page_fault+0x148/0x440
Jul 11 18:10:23 host-082 kernel: [<ffffffff81662de5>] do_page_fault+0x35/0x90
Jul 11 18:10:23 host-082 kernel: [<ffffffff8165f088>] page_fault+0x28/0x30
Jul 11 18:10:23 host-082 kernel: [<ffffffff81312ae9>] ? copy_user_generic_unrolled+0x89/0xc0
Jul 11 18:10:23 host-082 kernel: [<ffffffff812039d1>] ? set_fd_set+0x21/0x30
Jul 11 18:10:23 host-082 kernel: [<ffffffff8120488f>] core_sys_select+0x20f/0x300
Jul 11 18:10:23 host-082 kernel: [<ffffffff811a4a3c>] ? handle_mm_fault+0x6ac/0xfc0
Jul 11 18:10:23 host-082 kernel: [<ffffffff8105fa7f>] ? kvm_clock_get_cycles+0x1f/0x30
Jul 11 18:10:23 host-082 kernel: [<ffffffff8105fa7f>] ? kvm_clock_get_cycles+0x1f/0x30
Jul 11 18:10:23 host-082 kernel: [<ffffffff810e01ec>] ? ktime_get_ts64+0x4c/0xf0
Jul 11 18:10:23 host-082 kernel: [<ffffffff81204a3a>] SyS_select+0xba/0x110
Jul 11 18:10:23 host-082 kernel: [<ffffffff81667609>] system_call_fastpath+0x16/0x1b
Jul 11 18:10:23 host-082 kernel: Mem-Info:
Jul 11 18:10:23 host-082 kernel: Node 0 DMA per-cpu:
Jul 11 18:10:23 host-082 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Jul 11 18:10:23 host-082 kernel: Node 0 DMA32 per-cpu:
Jul 11 18:10:23 host-082 kernel: CPU    0: hi:  186, btch:  31 usd: 189
Jul 11 18:10:23 host-082 kernel: active_anon:109147 inactive_anon:109232 isolated_anon:0#012 active_file:0 inactive_file:12 isolated_file:0#012 unevictable:3154 dirty:0 writeback:0 unstable:0#012 free:12162 slab_reclaimable:3904 slab_unreclaimable:7788#012 mapped:1850 shmem:102 pagetables:2422 bounce:0#012 free_cma:0
Jul 11 18:10:23 host-082 kernel: Node 0 DMA free:4600kB min:704kB low:880kB high:1056kB active_anon:4916kB inactive_anon:5136kB active_file:0kB inactive_file:0kB unevictable:264kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15892kB mlocked:264kB dirty:0kB writeback:0kB mapped:192kB shmem:40kB slab_reclaimable:104kB slab_unreclaimable:416kB kernel_stack:32kB pagetables:152kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jul 11 18:10:23 host-082 kernel: lowmem_reserve[]: 0 975 975 975
Jul 11 18:10:23 host-082 kernel: Node 0 DMA32 free:44048kB min:44348kB low:55432kB high:66520kB active_anon:431672kB inactive_anon:431792kB active_file:0kB inactive_file:48kB unevictable:12352kB isolated(anon):0kB isolated(file):0kB present:1032180kB managed:1000920kB mlocked:12352kB dirty:0kB writeback:0kB mapped:7208kB shmem:368kB slab_reclaimable:15512kB slab_unreclaimable:30736kB kernel_stack:2832kB pagetables:9536kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1218332 all_unreclaimable? yes
Jul 11 18:10:23 host-082 kernel: lowmem_reserve[]: 0 0 0 0
Jul 11 18:10:23 host-082 kernel: Node 0 DMA: 36*4kB (UM) 29*8kB (UM) 14*16kB (UM) 13*32kB (UM) 10*64kB (UM) 7*128kB (UM) 4*256kB (UM) 2*512kB (UM) 0*1024kB 0*2048kB 0*4096kB = 4600kB
Jul 11 18:10:23 host-082 kernel: Node 0 DMA32: 892*4kB (UE) 838*8kB (UEM) 491*16kB (UE) 256*32kB (UE) 147*64kB (UEM) 61*128kB (UEM) 2*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 44048kB
Jul 11 18:10:23 host-082 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jul 11 18:10:23 host-082 kernel: 2236 total pagecache pages
Jul 11 18:10:23 host-082 kernel: 387 pages in swap cache
Jul 11 18:10:23 host-082 kernel: Swap cache stats: add 212788, delete 212401, find 69756/70014
Jul 11 18:10:23 host-082 kernel: Free swap  = 0kB
Jul 11 18:10:23 host-082 kernel: Total swap = 839676kB
Jul 11 18:10:23 host-082 kernel: 262041 pages RAM
Jul 11 18:10:23 host-082 kernel: 0 pages HighMem/MovableOnly
Jul 11 18:10:23 host-082 kernel: 7838 pages reserved


Version-Release number of selected component (if applicable):
3.10.0-419.el7.x86_64

lvm2-2.02.160-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
lvm2-libs-2.02.160-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
lvm2-cluster-2.02.160-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
device-mapper-1.02.130-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
device-mapper-libs-1.02.130-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
device-mapper-event-1.02.130-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
device-mapper-event-libs-1.02.130-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
device-mapper-persistent-data-0.6.2-0.1.rc8.el7    BUILT: Wed May  4 02:56:34 CDT 2016
cmirror-2.02.160-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
sanlock-3.3.0-1.el7    BUILT: Wed Feb 24 09:52:30 CST 2016
sanlock-lib-3.3.0-1.el7    BUILT: Wed Feb 24 09:52:30 CST 2016
lvm2-lockd-2.02.160-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016

Comment 2 Alasdair Kergon 2016-07-12 00:29:26 UTC
Incorrect use of dm_list_iterate_safe(), which only permits the current element to be removed from the list, not other elements.

E.g., first flag all the LVs that need to move, then move them in a single pass.

Comment 3 Heinz Mauelshagen 2016-07-12 14:17:46 UTC
Fixed via conditional update of the temporary list pointer in case it references a split-off LV.

Comment 5 Corey Marthaler 2016-07-21 23:09:39 UTC
With the exception of bug 1358961, the vgsplit cases now pass again.

Marking verified with the latest rpms.

3.10.0-472.el7.x86_64

lvm2-2.02.161-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
lvm2-libs-2.02.161-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
lvm2-cluster-2.02.161-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
device-mapper-1.02.131-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
device-mapper-libs-1.02.131-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
device-mapper-event-1.02.131-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
device-mapper-event-libs-1.02.131-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
device-mapper-persistent-data-0.6.2-1.el7    BUILT: Mon Jul 11 04:32:34 CDT 2016
cmirror-2.02.161-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
sanlock-3.4.0-1.el7    BUILT: Fri Jun 10 11:41:03 CDT 2016
sanlock-lib-3.4.0-1.el7    BUILT: Fri Jun 10 11:41:03 CDT 2016
lvm2-lockd-2.02.161-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016

Comment 7 errata-xmlrpc 2016-11-04 04:15:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1445.html

