Bug 806903
Summary: | readdir fails to read entire contents of /proc when pid exceeds 32768 using 32-bit application | |
---|---|---|---
Product: | Red Hat Enterprise Linux 5 | Reporter: | John Tavares <john_tavares>
Component: | glibc | Assignee: | Jeff Law <law>
Status: | CLOSED ERRATA | QA Contact: | qe-baseos-tools-bugs
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 5.5 | CC: | fweimer, mfranc
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2012-04-02 17:43:39 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
John Tavares
2012-03-26 13:23:46 UTC
John, it would significantly help if you indicated which 32-bit application you are using to read the contents of /proc. I've tried a few on a Red Hat Enterprise Linux 5.5 VM and have not managed to trigger a failure yet.

Created attachment 574466
gzipped tar file containing 32-bit test program and source code for it

Resubmitting the test code and executable. They show the issue once the prerequisite conditions described in the how-to-reproduce steps have been met.
I use my own 32-bit binary to read /proc. I had attached a zip file with the source and the binary I built to make this easy. The problem only shows up if you raise the kernel.pid_max setting to, say, 65536 (sysctl -w kernel.pid_max=65536) and create enough processes that processes with a pid greater than 32768 start to appear. Once this happens, the problem shows itself. Note that you must be on an x64 system for all of this.

From strace, the problem is related to getdents:

    getdents(3, /* d_reclen == 0, problem here *//* 1 entries */, 32768) = 6484
    getdents(3, /* 0 entries */, 32768) = 0

    $ uname -r
    2.6.18-194.el5
    $ uname -m
    x86_64
    $ cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 5.5 (Tikanga)
    $ sysctl kernel.pid_max
    kernel.pid_max = 65536
    $ ps -eaf | tail -3
    qa_inst 65279 8149 0 17:53 pts/36 00:00:00 ps -eaf
    qa_inst 65280 8149 0 17:53 pts/36 00:00:00 tail -3

Here is the tail of the sample program's output:

    Found 31170
    Found 32162
    Error - could not read all contents of /proc: Invalid argument

Thanks John. I've been able to reproduce the problem. As you hinted at in the initial report, right now this appears to be a kernel problem. I'm still doing some analysis, but the signs are pointing in that direction.

You are welcome. The problem appears to be isolated to /proc. Before I found a system on which I could reproduce it, I had tried to simulate it by making a copy of /proc under /tmp (/tmp/proc/...) and creating directories to make it look like there were processes with pids > 32768, and I could not reproduce it that way.

I have also been trying to see whether this is a generic issue or specific to certain architectures. So far, I have only been able to find a similar system running on s390x, and the problem does not appear to be there. I am trying to find a similar system on IA64 and PowerPC, but I have yet to do so. So as of now, I have only seen this on x64.
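The source of the attached test program is not reproduced in this report. As a minimal sketch only, assuming the program walks /proc with readdir() and reports the errno left behind when the walk ends early, the check could look roughly like the following. The function names `is_pid_name` and `count_pid_entries` are hypothetical and are not taken from the attachment.

```c
#include <ctype.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return 1 if name is all decimal digits,
 * as the per-process directories in /proc are. */
int is_pid_name(const char *name)
{
    if (*name == '\0')
        return 0;
    for (; *name != '\0'; name++) {
        if (!isdigit((unsigned char)*name))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: walk a directory with readdir() and count the
 * pid-like entries. Returns 0 if the whole directory was read, or -1 if
 * opendir() or readdir() failed, with errno describing the failure. */
int count_pid_entries(const char *path, long *count)
{
    DIR *dir;
    struct dirent *entry;
    int saved_errno;

    *count = 0;
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    for (;;) {
        /* readdir() returns NULL both at end-of-directory and on error;
         * clearing errno first is the only way to tell them apart. */
        errno = 0;
        entry = readdir(dir);
        if (entry == NULL)
            break;
        if (is_pid_name(entry->d_name))
            (*count)++;
    }

    saved_errno = errno;   /* closedir() may overwrite errno */
    closedir(dir);

    if (saved_errno != 0) {
        fprintf(stderr, "Error - could not read all contents of %s: %s\n",
                path, strerror(saved_errno));
        errno = saved_errno;
        return -1;
    }
    return 0;
}
```

Built as a 32-bit binary (for example with gcc -m32) and called as count_pid_entries("/proc", &n) on an affected kernel with pids above 32768, a check of this shape would be expected to take the error branch. That expectation is inferred from the output quoted above, not verified here.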
This was fixed in Red Hat Enterprise Linux 5.6, which was released with kernel 2.6.18-238.el5.

Simple bisection shows that 2.6.18-219.el5 fails while 2.6.18-221.el5 works. Looking at the ChangeLogs, this change stands out as potentially fixing the problem, perhaps as a side effect of the RFE:

- [fs] proc: add file position and flags info in /proc (Jerome Marchand) [498081]

Regardless of precisely which change in the 220/221 kernel fixed the bug, the errata for the kernel update is here: http://rhn.redhat.com/errata/RHSA-2011-0017.html