Bug 431490
Summary: [BETA RHEL5.2] i386 wrong hugepages info shown after allocate and deallocate

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Mike Gahagan <mgahagan> |
| Component: | kernel | Assignee: | Eric Paris <eparis> |
| Status: | CLOSED WONTFIX | QA Contact: | Martin Jenner <mjenner> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 5.2 | CC: | anderson, dzickus, jburke |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | i386 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2008-07-08 13:54:19 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 173617 | | |
| Bug Blocks: | | | |
| Attachments: | test case (attachment 309864) | | |
Comment 2
Eric Paris
2008-03-04 23:58:45 UTC

1359448

The above brew task should have a patch I think will close this race. hugetlb_report_meminfo() and hugetlb_report_node_meminfo() both read nr_huge_pages and the free-pool counts without holding the hugetlb_lock. It's possible for these to get out of sync during the snprintf operation that builds the output buffer. This appears to be purely cosmetic, as all of the accounting seems to be done under the lock. The patch adds the lock around the /proc output buffer building (a rough sketch of the change follows the attachment note below).

Ok, after trying this again with both the -92 kernel and the test kernel on ibm-hermes-n1, I'm not able to reproduce the bug anymore with either kernel. If this is just /proc accounting and we aren't leaking hugepages or anything like that, I'm ok with closing this (or taking the fix, for that matter).

I just ask Mike to try running a slightly different test:

  * 2 threads setting the number of hugepages up and down
  * 1 thread per core (16 cores on his test machine) reading /proc/meminfo

The most interesting thing is the threads reading /proc, since I think 2 threads changing the sysctls will probably be about enough to saturate the system.

I hacked up the test to have 2 processes set nr_huge_pages and 15 to read the values and report any time free hugepages > total hugepages. I'll run it overnight with the test kernel.

I let the modified test case run overnight (2 processes set nr_huge_pages and 15 read the values from /proc/meminfo). I have not seen any accounting discrepancies. The -92 kernel typically showed free hugepages > total hugepages after approximately 5 minutes of run time. I'd say our race is very likely fixed, but I'll be glad to test it more if anyone wants to see more results.

I'll go ahead and propose it for 5.3 and set the qe ack.

Created attachment 309864 [details]: test case — the multi-threaded test case minus the RHTS-specific stuff.
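The Brew patch itself is not attached to this bug. Going only by the description above (take hugetlb_lock around the code that builds the /proc buffer), the change would have looked roughly like this sketch of a 2.6.18-era hugetlb_report_meminfo(); the exact fields and the real patch may differ:

```c
/*
 * Sketch only, not the actual Brew patch: hold hugetlb_lock while the
 * counters are copied into the /proc buffer, so nr_huge_pages and
 * free_huge_pages cannot drift apart mid-sprintf.
 */
int hugetlb_report_meminfo(char *buf)
{
	int len;

	spin_lock(&hugetlb_lock);
	len = sprintf(buf,
			"HugePages_Total: %5lu\n"
			"HugePages_Free:  %5lu\n"
			"Hugepagesize:    %5lu kB\n",
			nr_huge_pages,
			free_huge_pages,
			HPAGE_SIZE / 1024);
	spin_unlock(&hugetlb_lock);

	return len;
}
```

As the later comment notes, upstream rejected exactly this approach because every hugepage allocation and free would contend with /proc readers on the same lock.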
Patch sent to lkml for laughs: http://marc.info/?l=linux-kernel&m=121389959715727&w=2

Upstream told me to go fly a kite. Locking here could allow a normal user to significantly degrade the system's use of hugetlb pages, since every process that wanted to free or take a hugetlb page would have to wait on the lock held by the /proc reader. /proc is inherently racy, and they will not take a fix for this.

I'd suggest changing the test case to look for free - total > 2, or for two invocations in a row both showing incorrect numbers (a sketch of such a check is appended below).

Closing as WONTFIX.
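The attached test case is not reproduced in this report. As a rough illustration only, a minimal version of the described test with the suggested relaxed check might look like the following: 2 writer threads bounce nr_hugepages up and down while reader threads scan /proc/meminfo, complaining only when free exceeds total by more than 2 pages or when two consecutive snapshots both look wrong. The thread counts follow the comments above; the parsing, pool sizes, and helper names are hypothetical.

```c
/*
 * Hypothetical reproducer sketch for the /proc/meminfo hugepage race,
 * with the relaxed tolerance suggested in the WONTFIX comment.
 * Build with: gcc -O2 -o hugerace hugerace.c -lpthread (run as root).
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define WRITERS 2
#define READERS 15

static volatile int stop;

/* Pull HugePages_Total and HugePages_Free out of /proc/meminfo. */
static int read_counts(long *total, long *free_)
{
    char line[128];
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1;
    *total = *free_ = -1;
    while (fgets(line, sizeof(line), f)) {
        sscanf(line, "HugePages_Total: %ld", total);
        sscanf(line, "HugePages_Free: %ld", free_);
    }
    fclose(f);
    return (*total >= 0 && *free_ >= 0) ? 0 : -1;
}

static void *reader(void *arg)
{
    long total, free_;
    int bad_streak = 0;

    while (!stop) {
        if (read_counts(&total, &free_))
            continue;
        if (free_ - total > 2)
            printf("FAIL: free %ld exceeds total %ld by more than 2\n",
                   free_, total);
        else if (free_ > total && ++bad_streak >= 2)
            printf("FAIL: free %ld > total %ld twice in a row\n",
                   free_, total);
        else if (free_ <= total)
            bad_streak = 0;   /* transient /proc race; ignore */
    }
    return NULL;
}

/* Bounce the hugepage pool between empty and 64 pages. */
static void *writer(void *arg)
{
    int i = 0;

    while (!stop) {
        FILE *f = fopen("/proc/sys/vm/nr_hugepages", "w");
        if (f) {
            fputs((i ^= 1) ? "64" : "0", f);
            fclose(f);
        }
    }
    return NULL;
}

int main(void)
{
    pthread_t threads[WRITERS + READERS];
    int i;

    for (i = 0; i < WRITERS; i++)
        pthread_create(&threads[i], NULL, writer, NULL);
    for (; i < WRITERS + READERS; i++)
        pthread_create(&threads[i], NULL, reader, NULL);

    sleep(600);   /* the original bug showed within ~5 minutes */
    stop = 1;
    for (i = 0; i < WRITERS + READERS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}
```

On an unpatched kernel this should occasionally print the streak or skew failures; on a kernel with the locking fix, or under the relaxed tolerance upstream considers acceptable, it should stay quiet.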