Bug 2003033
Summary: | 'free' command reports misleading "used" value | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Alexander Hass <alexander.hass> |
Component: | procps-ng | Assignee: | Jan Rybar <jrybar> |
Status: | CLOSED ERRATA | QA Contact: | Karel Volný <kvolny> |
Severity: | low | Docs Contact: | Šárka Jana <sjanderk> |
Priority: | unspecified | ||
Version: | 9.0 | CC: | aarnold, helge.deller, kvolny, peter.pitterling, sjanderk |
Target Milestone: | rc | Keywords: | Triaged |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | procps-ng-3.3.17-6.el9 | Doc Type: | Bug Fix |
Doc Text: |
.`free` command uses a new calculation method for used memory
Previously, the `free` utility calculated used memory by subtracting free space, cache space, and buffer space from the total memory. Consequently, the value of used memory could differ from the output of other tools, because the `free` utility did not count shared memory. With this update, the `free` command uses a new calculation method that gives a clearer picture of free memory and takes the unreclaimable cache into account. Used memory is now any memory that is not available, and it also includes `tmpfs` objects that are in the virtual memory.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2022-11-15 11:12:39 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Alexander Hass
2021-09-10 10:24:17 UTC
Hello,

I believe shared memory is not taken into account for the right reasons, as it basically is used memory (holding tmpfs); hence it is a subset of 'used', which is calculated as "mem_used = kb_main_total - kb_main_free - kb_main_cached - kb_main_buffers;".

I can see that MemAvailable was introduced in 2014 because kb_main_cached does not represent what it used to (see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=34e431b0ae398fc54ea69ff85ec700722c9da773), so it may really be a good topic for discussion upstream.

It is IMO worth recalling the statement in the 'procfs' manpage:

"""
MemAvailable %lu (since Linux 3.14)
An estimate of how much memory is available for starting new applications, without swapping.
"""

... which may semantically have a different meaning than "free" memory (or the complement of 'used' memory). This is a good question for kernelists (again?).

However, 'free' already shows MemAvailable in the output ('available'). It is calculated the same way as in the kernel (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/proc/meminfo.c?id=34e431b0ae398fc54ea69ff85ec700722c9da773#n64). It's worth asking upstream whether to just show meminfo's value instead of calculating it on our own.

To summarize this:
- counting 'shared' into 'used' is IMO right.
- 'used' can be tweaked to separate irreclaimable 'cached', maybe just 'MemTotal'-'MemAvailable'?
- 'free' already shows MemAvailable

But in all cases, a reminder is needed that 'free' and even the kernel show ROUGH ESTIMATES of free/used memory! If the user/customer lists 'free' and '/proc/meminfo' next to each other, it is worth saying that even those tools impact the memory and "the kilobytes" may never fully correspond to each other. The values change every millisecond.

Hello Jan,

(In reply to Jan Rybar from comment #1)
> [...]
>
> To summarize this:
> - counting 'shared' into 'used' is IMO right.
> - 'used' can be tweaked to separate irreclaimable 'cached', maybe just 'MemTotal'-'MemAvailable'?
> - 'free' already shows MemAvailable
>
> But in all cases, a reminder is needed that 'free' and even the kernel show
> ROUGH ESTIMATES of free/used memory! If the user/customer lists 'free' and
> '/proc/meminfo' next to each other, it is worth saying that even those tools
> impact the memory and "the kilobytes" may never fully correspond to each
> other. The values change every millisecond.

thank you for your analysis. Imho there is no need for perfectly precise accounting here, but disregarding "shared" (or, probably more precisely, the amount of "Shmem" from /proc/meminfo) when calculating the "used" value within the procps tools can distort the outcome considerably. Let's think of some database buffers or other applications consuming a notable amount of (anonymous) shared memory, where "cached" is not just mostly reclaimable page cache. Such memory should be counted as "used" memory, so that the procps tools do not present it as reclaimable or unused memory.

Kind regards
Alexander

Hello Jan,

may I ask for an update, since RHEL 9 is approaching quickly and I cannot see an improved behaviour so far?

Thank you
Alexander

tl;dr
1) AFAIK 'shared' is not left out, it's part of 'used'
2) used=total-memavail has not been accepted by upstream so far, we're not doing it either

long version:
I still fail to see the reporter's point here. I haven't found a hint of why Shmem should not be accounted in mem_used.

as per [1] /proc/sysinfo.c:789:
```
kb_main_cached = kb_page_cache + kb_slab_reclaimable;
kb_swap_used = kb_swap_total - kb_swap_free;
...
mem_used = kb_main_total - kb_main_free - kb_main_cached - kb_main_buffers;
```
I cannot see where kb_main_shared should be left out of anything. It is inherently a subset of mem_used.
I tried to crawl through the kernel code and found that:

kb_main_buffers =
-> ./mm/page_alloc.c:5778:void si_meminfo(struct sysinfo *val)
-> ./mm/page_alloc.c:5783: val->bufferram = nr_blockdev_pages();

where ./block/bdev.c:515:long nr_blockdev_pages(void) does not look for anything associated with shared memory, if I haven't missed anything. It searches for mappings of block devices.

After reading the ML thread [2], I can see the main discussion was about the true characteristic of cache vs tmpfs. Yes, patching sysinfo.c to count "used = total - MemAvail" would be a good-looking simplification; however, upstream does not seem to be inclined to that change, and a downstream deviation from upstream of this magnitude might cause even more questions. Upstream's reluctance to make this change in the codebase to this day seems to support the bias between both sides and the lack of information across kernel/userspace to make a solid decision.

I'm inclined to CLOSED.

[1] https://gitlab.com/procps-ng/procps/-/blob/master/proc/sysinfo.c#L781
[2] https://www.freelists.org/post/procps/free-regression-due-to-a-different-calculation-of-Used-memory

Hello Jan,

thank you for your assessment. Unfortunately I might not be able to follow your code analysis and would therefore like to elaborate a little more on the symptom of my enquiry - 'free' command does not account shared memory as used memory - with the following example:

> free -m
total used free shared buff/cache available
Mem: 12384284 1963180 8837044 1553499 1584059 8825800
Swap: 16383 0 16383

According to the 'free' command's output, only 1,92 TB (1963180 MB) of memory are in use. This is, in my opinion and also according to the "available" value, not correct, since about 1,48 TB of shared memory are not counted as "used" memory.
The "used memory" value seems to be calculated as follows (as of proc/sysinfo.c - https://gitlab.com/procps-ng/procps/-/commit/6cb75efef85f735b72e6c96f197f358f511f8ed9):

kb_main_used = kb_main_total - kb_main_free - kb_main_cached - kb_main_buffers;

Filled with values from this example system (values taken from the /proc/meminfo output below):

2023719192 = 12681507364(MemTotal) - 9053660316(MemFree) - 1603494528(Cached) - 633328(Buffers)

This results in about 1976288 MB (1,93TB) of used main memory, which approximately matches the output of the 'free' command above, given that the system is under load.

By subtracting 'kb_main_cached', not only page cache or similarly reclaimable memory is left out of used memory; the significant 'Shmem' part, which is part of 'Cached', is neglected as well. In our case there is no file-backed shared memory; all of it is anonymous, swap-backed shared memory, if this might be relevant. Therefore this _is_ actually used memory, which cannot be easily reclaimed or seen as available memory, and it should therefore be accounted as "used" memory.

So even if the ML discussion might have slid into some tmpfs discussion, the 'free' command still miscalculates the "used" value by ignoring 'Cached', and thereby suggests to the customer that much more memory might be available, since it is not "used".

I hope this helps to explain my concerns better. Since a MemAvailable-based approach was not accepted, a quick fix might be just adjusting the "used" formula as follows by adding 'Shmem':

kb_main_used = kb_main_total - kb_main_free - kb_main_cached - kb_main_buffers + Shmem;

Of course I can understand that changes might again confuse users - but the current implementation is imho just wrong and misleading and therefore needs to be corrected in some way to represent 'Shmem' again.
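The arithmetic in this comment can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the function names `used_legacy` and `used_with_shmem` are hypothetical, the values are the kB figures from the /proc/meminfo output quoted in this report, and, like the reporter's calculation, it uses 'Cached' alone for kb_main_cached (the reclaimable slab part is not included in the quoted excerpt).

```python
# Values in kB, copied from the /proc/meminfo output quoted in this report.
MEM_TOTAL = 12681507364
MEM_FREE = 9053660316
CACHED = 1603494528
BUFFERS = 633328
SHMEM = 1590783444


def used_legacy(total, free, cached, buffers):
    """'used' as computed in the comment: total - free - cached - buffers."""
    return total - free - cached - buffers


def used_with_shmem(total, free, cached, buffers, shmem):
    """The proposed quick fix: add Shmem back, since it is part of Cached."""
    return used_legacy(total, free, cached, buffers) + shmem


legacy = used_legacy(MEM_TOTAL, MEM_FREE, CACHED, BUFFERS)
adjusted = used_with_shmem(MEM_TOTAL, MEM_FREE, CACHED, BUFFERS, SHMEM)

# Print both results in kB and in MB (integer division, as a rough figure).
print(f"legacy used:     {legacy} kB ({legacy // 1024} MB)")
print(f"used with Shmem: {adjusted} kB ({adjusted // 1024} MB)")
```

The difference between the two results is exactly the Shmem value, which is the amount the reporter argues is missing from "used".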
Thank you
Alexander

> cat /proc/meminfo
MemTotal: 12681507364 kB -> 12384284 MB -> 12094 GB -> 11,81 TB
MemFree: 9053660316 kB -> 8841465 MB -> 8634 GB -> 8,43 TB
MemAvailable: 9042150272 kB -> 8830224 MB -> 8623 GB -> 8,42 TB
Buffers: 633328 kB -> 618 MB
Cached: 1603494528 kB -> 1565912 MB -> 1529 GB -> 1,49 TB
SwapCached: 0 kB
Active: 1994483344 kB
Inactive: 1592605080 kB
Active(anon): 1983178584 kB
Inactive(anon): 1590577416 kB
Active(file): 11304760 kB
Inactive(file): 2027664 kB
Unevictable: 32468 kB
Mlocked: 32468 kB
SwapTotal: 16777212 kB
SwapFree: 16777212 kB
Dirty: 1300 kB
Writeback: 0 kB
AnonPages: 1982992232 kB
Mapped: 1591612564 kB
Shmem: 1590783444 kB -> 1553499 MB -> 1517 GB -> 1,48 TB
Slab: 19059804 kB
[..]

So it was decided on the upstream mailing list that the calculation will change to `Used = Total - MemAvail`. This shall be backported.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (procps-ng bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8276 |
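As an illustration of the `Used = Total - MemAvail` rule that was agreed on, the following Python sketch parses text in /proc/meminfo format and derives "used" from it. It is a minimal sketch, not the procps-ng implementation: `parse_meminfo` and `used_from_available` are hypothetical helper names, and the sample text only repeats a few of the values quoted in this report.

```python
# A few lines in /proc/meminfo format, with values quoted in this report.
SAMPLE = """\
MemTotal:       12681507364 kB
MemFree:         9053660316 kB
MemAvailable:    9042150272 kB
Buffers:             633328 kB
Cached:          1603494528 kB
Shmem:           1590783444 kB
"""


def parse_meminfo(text):
    """Map each 'Key: value [kB]' line to its integer value."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


def used_from_available(info):
    """'used' is whatever is not available: MemTotal - MemAvailable."""
    return info["MemTotal"] - info["MemAvailable"]


info = parse_meminfo(SAMPLE)
used_kb = used_from_available(info)
print(f"used: {used_kb} kB")
```

On a Linux system the same helpers could be fed the contents of /proc/meminfo instead of `SAMPLE`. With the values from this report, the result is larger than the Shmem figure alone, so shared memory no longer disappears from "used".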