Bug 1110041 - Warn about possible OOM issues when caching volumes and free memory is low
Summary: Warn about possible OOM issues when caching volumes and free memory is low
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Zdenek Kabelac
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Duplicates: 1114113 (view as bug list)
Depends On:
Blocks: 1114113 1119326 1186924 1313485
 
Reported: 2014-06-16 21:52 UTC by Corey Marthaler
Modified: 2023-03-08 07:26 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1114113 (view as bug list)
Environment:
Last Closed: 2021-03-01 07:32:34 UTC
Target Upstream Version:
Embargoed:



Description Corey Marthaler 2014-06-16 21:52:49 UTC
Description of problem:

# Creating 88th cache volume:
Create origin (slow) volume
lvcreate -L 20M -n origin_88 cache_sanity /dev/sdc1

Create cache data and cache metadata (fast) volumes
lvcreate -L 10M -n pool_88 cache_sanity /dev/sde1
lvcreate -L 8M -n pool_88_meta cache_sanity /dev/sde1

Create cache pool volume by combining the cache data and cache metadata (fast) volumes
lvconvert --type cache-pool --cachemode writethrough --poolmetadata cache_sanity/pool_88_meta cache_sanity/pool_88
Create cached volume by combining the cache pool (fast) and origin (slow) volumes
lvconvert --type cache --cachepool cache_sanity/pool_88 cache_sanity/origin_88
couldn't create combined cached volume
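
For reference, the test iterates this sequence many times (this was the 88th volume). Below is a minimal illustrative loop, not the actual qarshd test; the VG name cache_sanity and the PV paths are taken from the transcript above:

#!/usr/bin/env python3
# Illustrative reproducer sketch (not the actual test harness): repeat the
# origin/pool/convert sequence from the transcript above until something fails.
import subprocess

def run(cmd):
    print("Running cmdline:", " ".join(cmd))
    subprocess.run(cmd, check=True)

for i in range(1, 101):
    # Create origin (slow) volume
    run(["lvcreate", "-L", "20M", "-n", f"origin_{i}", "cache_sanity", "/dev/sdc1"])
    # Create cache data and cache metadata (fast) volumes
    run(["lvcreate", "-L", "10M", "-n", f"pool_{i}", "cache_sanity", "/dev/sde1"])
    run(["lvcreate", "-L", "8M", "-n", f"pool_{i}_meta", "cache_sanity", "/dev/sde1"])
    # Combine data + metadata into a cache pool, then attach it to the origin
    run(["lvconvert", "--type", "cache-pool", "--cachemode", "writethrough",
         "--poolmetadata", f"cache_sanity/pool_{i}_meta", f"cache_sanity/pool_{i}"])
    run(["lvconvert", "--type", "cache",
         "--cachepool", f"cache_sanity/pool_{i}", f"cache_sanity/origin_{i}"])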


qarshd[17154]: Running cmdline: lvconvert --type cache --cachepool cache_sanity/pool_88 cache_sanity/origin_88
kernel: bio: create slab <bio-0> at 0
kernel: dmeventd invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=-1000
kernel: dmeventd cpuset=/ mems_allowed=0
kernel: CPU: 0 PID: 4264 Comm: dmeventd Tainted: G            -------------- T 3.10.0-123.el7.x86_64 #1
kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
kernel: ffff88002b114440 00000000b902d34a ffff88002a5c16e0 ffffffff815e19ba
kernel: ffff88002a5c1770 ffffffff815dd02d ffffffff810b68f8 ffff88002a4620c0
kernel: ffffffff00000202 fbfeffff00000000 0000000000000001 ffffffff81102e03
kernel: Call Trace:
kernel: [<ffffffff815e19ba>] dump_stack+0x19/0x1b
kernel: [<ffffffff815dd02d>] dump_header+0x8e/0x214
kernel: [<ffffffff810b68f8>] ? ktime_get_ts+0x48/0xe0
kernel: [<ffffffff81102e03>] ? proc_do_uts_string+0xe3/0x130
kernel: [<ffffffff8114520e>] oom_kill_process+0x24e/0x3b0
kernel: [<ffffffff81145a36>] out_of_memory+0x4b6/0x4f0
kernel: [<ffffffff8114b579>] __alloc_pages_nodemask+0xa09/0xb10
kernel: [<ffffffff8118bc3a>] alloc_pages_vma+0x9a/0x140
kernel: [<ffffffff8117e7cb>] read_swap_cache_async+0xeb/0x160
kernel: [<ffffffff8117e8e8>] swapin_readahead+0xa8/0x110
kernel: [<ffffffff8116cb7f>] handle_mm_fault+0x94f/0xd90
kernel: [<ffffffff811c3b00>] ? poll_select_copy_remaining+0x150/0x150
kernel: [<ffffffff815ed186>] __do_page_fault+0x156/0x540
kernel: [<ffffffff81103024>] ? __delayacct_blkio_end+0x34/0x60
kernel: [<ffffffff811410d0>] ? sleep_on_page+0x20/0x20
kernel: [<ffffffff815e525e>] ? __wait_on_bit+0x7e/0x90
kernel: [<ffffffff81143ad8>] ? wait_on_page_bit_killable+0x88/0xb0
kernel: [<ffffffff815ed58a>] do_page_fault+0x1a/0x70
kernel: [<ffffffff815e97c8>] page_fault+0x28/0x30
kernel: [<ffffffff812c50d9>] ? copy_user_generic_unrolled+0x89/0xc0
kernel: [<ffffffff811c39a1>] ? set_fd_set+0x21/0x30
kernel: [<ffffffff811c486f>] core_sys_select+0x20f/0x300
kernel: [<ffffffff815ed234>] ? __do_page_fault+0x204/0x540
kernel: [<ffffffff8104609f>] ? kvm_clock_get_cycles+0x1f/0x30
kernel: [<ffffffff810b68f8>] ? ktime_get_ts+0x48/0xe0
kernel: [<ffffffff8104609f>] ? kvm_clock_get_cycles+0x1f/0x30
kernel: [<ffffffff810b68f8>] ? ktime_get_ts+0x48/0xe0
kernel: [<ffffffff811c4a1a>] SyS_select+0xba/0x110
kernel: [<ffffffff815f2119>] system_call_fastpath+0x16/0x1b
kernel: Mem-Info:
kernel: Node 0 DMA per-cpu:
kernel: CPU    0: hi:    0, btch:   1 usd:   0
kernel: Node 0 DMA32 per-cpu:
kernel: CPU    0: hi:  186, btch:  31 usd:   0
kernel: active_anon:1 inactive_anon:4 isolated_anon:0
 active_file:0 inactive_file:0 isolated_file:0
 unevictable:19106 dirty:0 writeback:0 unstable:0
 free:12227 slab_reclaimable:4653 slab_unreclaimable:25226
 mapped:1688 shmem:6 pagetables:1724 bounce:0
 free_cma:0
kernel: Node 0 DMA free:4572kB min:712kB low:888kB high:1068kB active_anon:4kB inactive_anon:16kB active_file:0kB inactive_file:0kB unevictable:676kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15892kB mlocked:676kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:1304kB kernel_stack:248kB pagetables:104kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:200 all_unreclaimable? yes
kernel: lowmem_reserve[]: 0 966 966 966
kernel: Node 0 DMA32 free:44336kB min:44340kB low:55424kB high:66508kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:75748kB isolated(anon):0kB isolated(file):0kB present:1032180kB managed:989604kB mlocked:75748kB dirty:0kB writeback:0kB mapped:6752kB shmem:24kB slab_reclaimable:18292kB slab_unreclaimable:99600kB kernel_stack:10920kB pagetables:6792kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
kernel: lowmem_reserve[]: 0 0 0 0
kernel: Node 0 DMA: 1*4kB (U) 1*8kB (M) 1*16kB (U) 0*32kB 1*64kB (M) 1*128kB (U) 3*256kB (UM) 3*512kB (UM) 0*1024kB 1*2048kB (R) 0*4096kB = 4572kB
kernel: Node 0 DMA32: 208*4kB (UER) 324*8kB (UER) 146*16kB (UEMR) 56*32kB (UE) 11*64kB (UEM) 2*128kB (UR) 0*256kB 4*512kB (UMR) 25*1024kB (MR) 4*2048kB (MR) 0*4096kB = 44352kB
kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
kernel: 1704 total pagecache pages
kernel: 18 pages in swap cache
kernel: Swap cache stats: add 528016, delete 527998, find 172727/195254
kernel: Free swap  = 636148kB
kernel: Total swap = 839676kB
kernel: 262140 pages RAM
kernel: 7647 pages reserved
kernel: 272149 pages shared
kernel: 239042 pages non-shared
kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
kernel: [  437]     0   437    10291      167      25       84             0 systemd-journal
kernel: [  440]     0   440    45138      158      23     1402             0 lvmetad
kernel: [  458]     0   458    11157      200      24      458         -1000 systemd-udevd
kernel: [  515]     0   515    29168      167      26      138         -1000 auditd
kernel: [  545]    70   545     7538      190      19       85             0 avahi-daemon
kernel: [  546]   997   546     1084      103       6       56             0 lsmd
kernel: [  547]     0   547    53061      193      59      428             0 abrtd
kernel: [  550]     0   550    52445      164      54      327             0 abrt-watch-log
kernel: [  553]    70   553     7507       26      18       60             0 avahi-daemon
kernel: [  554]     0   554   113516      210      71      541             0 NetworkManager
[...]
kernel: Out of memory: Kill process 17155 (lvconvert) score 40 or sacrifice child
kernel: Killed process 17155 (lvconvert) total-vm:214744kB, anon-rss:65596kB, file-rss:5984kB



Version-Release number of selected component (if applicable):
3.10.0-123.el7.x86_64
lvm2-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
lvm2-libs-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
lvm2-cluster-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-libs-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-event-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-event-libs-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-persistent-data-0.3.2-1.el7    BUILT: Thu Apr  3 09:58:51 CDT 2014
cmirror-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014

Comment 2 Corey Marthaler 2014-06-16 23:32:22 UTC
With a larger-memory system, I didn't see any OOM issues.

Here's how /proc/slabinfo grew, however, from the 20th create to the 100th create with nothing else running on the system.

[root@harding-03 ~]# cat /proc/meminfo
MemTotal:       65758100 kB
MemFree:        49877140 kB
MemAvailable:   61384544 kB


# 20th create
kmalloc-8192         334    424   8192    4    8 : tunables    0    0    0 : slabdata    106    106      0
kmalloc-4096        3941   3952   4096    8    8 : tunables    0    0    0 : slabdata    494    494      0
kmalloc-2048        2426   2560   2048   16    8 : tunables    0    0    0 : slabdata    160    160      0
kmalloc-1024        6055   6368   1024   32    8 : tunables    0    0    0 : slabdata    199    199      0
kmalloc-512         9056   9056    512   32    4 : tunables    0    0    0 : slabdata    283    283      0
kmalloc-256         8793   9408    256   32    2 : tunables    0    0    0 : slabdata    294    294      0
kmalloc-192        22701  23100    192   42    2 : tunables    0    0    0 : slabdata    550    550      0
kmalloc-128       369696 369696    128   32    1 : tunables    0    0    0 : slabdata  11553  11553      0
kmalloc-96         24122  31920     96   42    1 : tunables    0    0    0 : slabdata    760    760      0
kmalloc-64        458821 459328     64   64    1 : tunables    0    0    0 : slabdata   7177   7177      0
kmalloc-32        200406 200960     32  128    1 : tunables    0    0    0 : slabdata   1570   1570      0
kmalloc-16        129735 140544     16  256    1 : tunables    0    0    0 : slabdata    549    549      0
kmalloc-8         306769 310272      8  512    1 : tunables    0    0    0 : slabdata    606    606      0


# 100th create
kmalloc-8192         362    428   8192    4    8 : tunables    0    0    0 : slabdata    107    107      0
kmalloc-4096       12352  12360   4096    8    8 : tunables    0    0    0 : slabdata   1545   1545      0
kmalloc-2048        2461   2608   2048   16    8 : tunables    0    0    0 : slabdata    163    163      0
kmalloc-1024        8417   8480   1024   32    8 : tunables    0    0    0 : slabdata    265    265      0
kmalloc-512         9211   9408    512   32    4 : tunables    0    0    0 : slabdata    294    294      0
kmalloc-256        15664  15840    256   32    2 : tunables    0    0    0 : slabdata    495    495      0
kmalloc-192        23271  23856    192   42    2 : tunables    0    0    0 : slabdata    568    568      0
kmalloc-128       377664 377664    128   32    1 : tunables    0    0    0 : slabdata  11802  11802      0
kmalloc-96         30555  32844     96   42    1 : tunables    0    0    0 : slabdata    782    782      0
kmalloc-64        548531 549504     64   64    1 : tunables    0    0    0 : slabdata   8586   8586      0
kmalloc-32        198868 200320     32  128    1 : tunables    0    0    0 : slabdata   1565   1565      0
kmalloc-16        129915 140544     16  256    1 : tunables    0    0    0 : slabdata    549    549      0
kmalloc-8         293016 302592      8  512    1 : tunables    0    0    0 : slabdata    591    591      0
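
A quick way to compare two saved snapshots like the ones above is sketched below; this is just an illustrative helper (the snapshot file names are hypothetical), not part of the test suite:

#!/usr/bin/env python3
# Illustrative helper: diff the active-object counts of the kmalloc-* caches
# between two saved /proc/slabinfo snapshots.
import sys

def kmalloc_counts(path):
    counts = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            # Data rows look like: <name> <active_objs> <num_objs> <objsize> ...
            if fields and fields[0].startswith("kmalloc-"):
                counts[fields[0]] = int(fields[1])
    return counts

before = kmalloc_counts(sys.argv[1])    # e.g. slabinfo.20
after = kmalloc_counts(sys.argv[2])     # e.g. slabinfo.100
for name in sorted(set(before) | set(after)):
    b, a = before.get(name, 0), after.get(name, 0)
    print(f"{name:>14}  {b:>8} -> {a:>8}  ({a - b:+d})")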

Comment 3 Zdenek Kabelac 2014-11-25 16:54:52 UTC
Mike - are there any numbers we could provide to estimate the 'max' number of usable cached volumes in a system with a given RAM size?

Approximately how much memory is consumed by a running cache?

Comment 5 Jonathan Earl Brassow 2015-02-04 13:27:14 UTC
There is a kernel component to this (bug 1189059).  This bug is to find a clean way to handle low memory situations.

Comment 8 Joe Thornber 2015-06-29 15:54:12 UTC
The question isn't really about the number of cache volumes, but the number of cache blocks (ssd space used / cache block size).

The new smq policy uses around 22 bytes per cache block; the old mq one used 88.

Comment 9 Jonathan Earl Brassow 2015-06-29 22:35:59 UTC
OK, so it is somewhat safe to say that we have improved the cache by ~4x WRT the amount of memory used.  It won't solve the problem though, so we need to find some way of explaining the limits to people - even if that is in terms of 'total_cache_blocks' vs. the number of LVs or the size of the cache LV.

2^5 bytes easily covers the space per cache block.  1GiB of memory should give ~2^25 (~32 million) blocks.  The default chunk size is 2^16 bytes.  So that means ~2^41 bytes (2TiB) of managed cache space per GiB of memory.

So, this used to be ~256GiB of managed cache space per GiB of memory.  That's a significant improvement.
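
For anyone who wants to redo this estimate with different numbers, here is a back-of-envelope sketch of the same arithmetic (my own illustration; the ~22 bytes/block figure comes from comment 8 and is rounded up to 2^5 as above):

#!/usr/bin/env python3
# Back-of-envelope check of the estimate above: blocks trackable with 1GiB of
# RAM = 1GiB / bytes-per-block, and the cache space those blocks cover =
# blocks * chunk size.
GiB = 2 ** 30
CHUNK_SIZE = 2 ** 16                      # default dm-cache chunk size, 64KiB
BYTES_PER_BLOCK = 2 ** 5                  # smq: ~22B/block, rounded up to 32B

blocks = GiB // BYTES_PER_BLOCK           # ~2^25, i.e. ~32 million blocks
managed = blocks * CHUNK_SIZE             # ~2^41 bytes
print(f"blocks trackable with 1GiB of RAM: {blocks} (~2^25)")
print(f"managed cache space: {managed / 2**40:.1f} TiB per GiB of RAM")
# Plugging in the old mq policy's larger per-block cost instead gives the much
# smaller per-GiB figure mentioned above.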

Comment 10 Jonathan Earl Brassow 2015-07-01 15:08:44 UTC
We have addressed memory issues WRT the cache via the SMQ policy in the kernel.  Bug 1229756 will make LVM use SMQ as the default.  This should help quite a bit on memory usage.

However, it doesn't completely address the issue.  LVM may need to check the available memory and caution the user about OOM possibilities when a cached LV is activated.  This bug will remain open to address those issues in a later release.

Comment 13 Jonathan Earl Brassow 2016-01-22 16:23:23 UTC
If we are unable to code a sufficient solution, this bug could be handled by simply making the information in comment 9 available to users.

Comment 14 Jonathan Earl Brassow 2016-01-22 18:46:48 UTC
*** Bug 1114113 has been marked as a duplicate of this bug. ***

Comment 15 Roman Bednář 2016-02-10 13:24:05 UTC
Adding QA ACK for 7.3.

Testing will depend on implemented solution.

Comment 16 Jonathan Earl Brassow 2016-06-15 15:59:48 UTC
We currently have no solution for this bug.  One proposal that could work is to assume a certain size for each cache volume and do some computations/restrictions based on that.  This is somewhat of a hack, and upstream is looking for something better.  That could mean that no solution is forthcoming in the 7.3 timeframe.

Comment 17 Zdenek Kabelac 2016-06-30 20:02:44 UTC
We don't yet have agreement on the actual implementation.

The proposed plain 'free memory check' is quite racy, even though it would likely catch the majority of misuse without too much effort - on the other hand, it might also break some existing environments that currently work.

There has not yet been any deeper discussion of a better solution.
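
For illustration only (this is not lvm2 code), a rough sketch of what such a plain free-memory check could look like, and why it is racy:

#!/usr/bin/env python3
# Rough illustration of the proposed plain 'free memory check': estimate the
# policy metadata for the cache being activated and compare it with
# MemAvailable.  The check is inherently racy - available memory can change
# between this read and the actual activation - which is the concern above.
SMQ_BYTES_PER_BLOCK = 22        # approximate per-block cost, from comment 8

def mem_available_kib():
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1])        # value is reported in kB
    raise RuntimeError("MemAvailable not found")

def warn_if_low(cache_size_bytes, chunk_size_bytes=2 ** 16, margin=2.0):
    # margin is an arbitrary safety factor for this sketch
    blocks = cache_size_bytes // chunk_size_bytes
    needed_kib = blocks * SMQ_BYTES_PER_BLOCK // 1024
    avail_kib = mem_available_kib()
    if needed_kib * margin > avail_kib:
        print(f"WARNING: cache metadata may need ~{needed_kib} KiB but only "
              f"{avail_kib} KiB is available; activation may trigger the OOM killer.")

# Example: a hypothetical 100GiB cache pool with the default 64KiB chunk size.
warn_if_low(100 * 2 ** 30)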

Comment 21 RHEL Program Management 2021-03-01 07:32:34 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

