Bug 1110041
| Summary: | Warn about possible OOM issues when caching volumes and free memory is low | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2 | Assignee: | Zdenek Kabelac <zkabelac> |
| lvm2 sub component: | Cache Logical Volumes | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED WONTFIX | Type: | Bug |
| Severity: | high | Priority: | high |
| Version: | 7.0 | Keywords: | Triaged |
| Target Milestone: | rc | Hardware: | x86_64 |
| OS: | Linux | Last Closed: | 2021-03-01 07:32:34 UTC |
| CC: | agk, heinzm, jbrassow, jkachuck, msnitzer, pasik, prajnoha, rbednar, thornber, zkabelac | | |
| Clones: | 1114113 | Bug Blocks: | 1114113, 1119326, 1186924, 1313485 |
Description (Corey Marthaler, 2014-06-16 21:52:49 UTC)
With a larger mem system, I didn't see any oom issues. Here's how /proc/slabinfo grew, however, from the 20th create to the 100th create, with nothing else running on the system.

```
[root@harding-03 ~]# cat /proc/meminfo
MemTotal:       65758100 kB
MemFree:        49877140 kB
MemAvailable:   61384544 kB
```

```
# 20th create
kmalloc-8192     334    424  8192   4  8 : tunables 0 0 0 : slabdata   106   106 0
kmalloc-4096    3941   3952  4096   8  8 : tunables 0 0 0 : slabdata   494   494 0
kmalloc-2048    2426   2560  2048  16  8 : tunables 0 0 0 : slabdata   160   160 0
kmalloc-1024    6055   6368  1024  32  8 : tunables 0 0 0 : slabdata   199   199 0
kmalloc-512     9056   9056   512  32  4 : tunables 0 0 0 : slabdata   283   283 0
kmalloc-256     8793   9408   256  32  2 : tunables 0 0 0 : slabdata   294   294 0
kmalloc-192    22701  23100   192  42  2 : tunables 0 0 0 : slabdata   550   550 0
kmalloc-128   369696 369696   128  32  1 : tunables 0 0 0 : slabdata 11553 11553 0
kmalloc-96     24122  31920    96  42  1 : tunables 0 0 0 : slabdata   760   760 0
kmalloc-64    458821 459328    64  64  1 : tunables 0 0 0 : slabdata  7177  7177 0
kmalloc-32    200406 200960    32 128  1 : tunables 0 0 0 : slabdata  1570  1570 0
kmalloc-16    129735 140544    16 256  1 : tunables 0 0 0 : slabdata   549   549 0
kmalloc-8     306769 310272     8 512  1 : tunables 0 0 0 : slabdata   606   606 0
```

```
# 100th create
kmalloc-8192     362    428  8192   4  8 : tunables 0 0 0 : slabdata   107   107 0
kmalloc-4096   12352  12360  4096   8  8 : tunables 0 0 0 : slabdata  1545  1545 0
kmalloc-2048    2461   2608  2048  16  8 : tunables 0 0 0 : slabdata   163   163 0
kmalloc-1024    8417   8480  1024  32  8 : tunables 0 0 0 : slabdata   265   265 0
kmalloc-512     9211   9408   512  32  4 : tunables 0 0 0 : slabdata   294   294 0
kmalloc-256    15664  15840   256  32  2 : tunables 0 0 0 : slabdata   495   495 0
kmalloc-192    23271  23856   192  42  2 : tunables 0 0 0 : slabdata   568   568 0
kmalloc-128   377664 377664   128  32  1 : tunables 0 0 0 : slabdata 11802 11802 0
kmalloc-96     30555  32844    96  42  1 : tunables 0 0 0 : slabdata   782   782 0
kmalloc-64    548531 549504    64  64  1 : tunables 0 0 0 : slabdata  8586  8586 0
kmalloc-32    198868 200320    32 128  1 : tunables 0 0 0 : slabdata  1565  1565 0
kmalloc-16    129915 140544    16 256  1 : tunables 0 0 0 : slabdata   549   549 0
kmalloc-8     293016 302592     8 512  1 : tunables 0 0 0 : slabdata   591   591 0
```

Mike - are there any numbers we could provide to estimate the 'max' number of usable cached volumes on a system with a given RAM size? How much memory is approximately consumed by a running cache?

There is a kernel component to this (bug 1189059). This bug is to find a clean way to handle low memory situations.

The question isn't really about the number of cache volumes, but the number of cache blocks (SSD space used / cache block size). The new smq policy uses around 22 bytes per cache block; the old mq one used 88.

Ok, so it is somewhat safe to say that we have improved the cache by ~4x WRT the amount of memory used. It won't solve the problem though, so we need to find some way of explaining the limits to people, even if that is in terms of 'total_cache_blocks' rather than the number of LVs or the size of the cache LV. 2^5 bytes easily covers the space per cache block, so 1 GiB of memory should give ~2^25 (~32 million) blocks. The default chunk size is 2^16 bytes, so that means ~2^41 bytes (2 TiB) of managed cache space per GiB of memory. This used to be ~256 GiB of managed cache space per GiB of memory, so that's a significant improvement.

We have addressed memory issues WRT the cache via the SMQ policy in the kernel, and bug 1229756 will make LVM use SMQ as the default. This should help quite a bit with memory usage. However, it doesn't completely address the issue.
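As a rough illustration of that arithmetic, here is a minimal sketch that reproduces the estimate. It is not lvm2 or kernel code: the per-block costs (22 bytes for smq, 88 for mq) and the default chunk size (2^16 bytes) come from the discussion above, while the function names are purely illustrative.

```python
# Reproduce the memory arithmetic from the comments above.
BYTES_PER_BLOCK = {"smq": 22, "mq": 88}   # approx. metadata cost per cache block
DEFAULT_CHUNK_SIZE = 2 ** 16              # 64 KiB default cache chunk size
GiB = 2 ** 30
TiB = 2 ** 40

def max_cache_blocks(ram_bytes, policy):
    """Number of cache blocks whose metadata fits in a given RAM budget."""
    return ram_bytes // BYTES_PER_BLOCK[policy]

def managed_cache_bytes(ram_bytes, policy, chunk_size=DEFAULT_CHUNK_SIZE):
    """How much fast-device space those blocks can address."""
    return max_cache_blocks(ram_bytes, policy) * chunk_size

for policy in ("mq", "smq"):
    managed = managed_cache_bytes(GiB, policy)
    print(f"{policy}: ~{managed / TiB:.2f} TiB of managed cache per GiB of RAM")

# Rounding smq's 22 bytes up to 2^5 = 32, as the comment above does,
# gives 2^25 blocks * 2^16 bytes = 2^41 bytes, i.e. the quoted ~2 TiB.
```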
LVM may need to check the available memory and caution the user about possible OOM if the cache is activated. This bug will remain open to address those issues in a later release. If we are unable to code a sufficient solution, this bug could be handled by simply making the information in comment 9 available to users.

*** Bug 1114113 has been marked as a duplicate of this bug. ***

Adding QA ACK for 7.3. Testing will depend on the implemented solution.

We currently have no solution for this bug. One proposal that could work is to assume a certain size for each cache volume and apply some computations/restrictions based on that. This is somewhat of a hack, and upstream is looking for something better. That could mean that no solution is forthcoming in the 7.3 timeframe.

We do not yet have agreement on the actual implementation. The proposed plain 'free memory check' (sketched below) is quite racy; even though it would likely catch the majority of misuse with little effort, it might also break some existing working environments. There has not yet been any deeper discussion of a better solution.

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, the bug can be reopened.
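For reference, a minimal sketch of what the proposed free-memory check might look like. This is hypothetical, not lvm2 code: the helper names, the 32-byte-per-block estimate, and the safety margin are illustrative assumptions; only the MemAvailable field of /proc/meminfo and the per-block arithmetic come from the discussion above. It also demonstrates the race the comment mentions, since memory can disappear between the check and activation.

```python
# Hypothetical sketch of the proposed (and admittedly racy) free-memory
# check: warn before activating a cache LV if the estimated metadata
# overhead approaches what the kernel reports as available memory.
# NOT lvm2 code; names, per-block cost, and margin are illustrative.

def mem_available_bytes():
    """Parse MemAvailable (reported in kB) from /proc/meminfo."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024
    raise RuntimeError("MemAvailable not found in /proc/meminfo")

def warn_if_cache_may_oom(cache_size_bytes, chunk_size=2 ** 16,
                          bytes_per_block=32, margin=2.0):
    """Warn when estimated cache metadata overhead nears free memory.

    One metadata entry per cache block, as in the arithmetic above.
    'margin' is an arbitrary safety factor; the check is inherently
    racy because other processes may consume memory between the check
    and the actual activation.
    """
    overhead = (cache_size_bytes // chunk_size) * bytes_per_block
    if overhead * margin > mem_available_bytes():
        print(f"WARNING: caching {cache_size_bytes >> 30} GiB needs "
              f"~{overhead >> 20} MiB of kernel memory; activation "
              f"may risk OOM on this system.")

warn_if_cache_may_oom(512 * 2 ** 30)  # e.g. a 512 GiB cache device
```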