Bug 218410
Summary: | non-main task's waitpid exited status lost when tracing | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Andrew Cagney <cagney> |
Component: | frysk | Assignee: | Andrew Cagney <cagney> |
Status: | CLOSED ERRATA | QA Contact: | Len DiMaggio <ldimaggi> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 5.0 | CC: | kasal, mcvet, mjw, npremji, pmuldoon, rmoseley, roland, scox, timoore |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | RHEA-2007-0592 | Doc Type: | Bug Fix |
Last Closed: | 2007-11-07 18:05:47 UTC | Type: | --- |
Bug Blocks: | 173278 |
Description
Andrew Cagney
2006-12-05 05:46:28 UTC
(In reply to comment #2)
> appears to show only WNOHANG calls.
> that is racy. after SIGCHLD, some short period may pass before wait succeeds.
> your guarantee is that a blocking wait will block a very short time, not that a
> WNOHANG wait will succeed immediately.

Que? Was POSIX documentation explaining SIGCHLD and its quirks with waitpid ever located? Is the assumption that SIGCHLD is always posted after the wait status has been recorded - i.e., SIGIO behavior - wrong? Does:

-> SIGCHLD remain pending while waitpid events are pending, allowing one waitpid read per signal to work?
-> SIGCHLD get withdrawn when all waitpid events have been consumed, allowing more efficient draining of waitpid events?

Testing shows that at least the second isn't true and the first, given that SIGCHLD is not a counting signal, likely isn't either.

Rewriting frysk's event loop to use a blocking waitpid call will prevent the occasional hangs seen when monitoring a process. New code is currently being tested upstream. Tests are included in frysk's testsuite.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Fixes committed upstream. Note that two tests, testCloneThenKillAttached and testDeleteAttached, have been enabled in the testsuite and are now expected to pass.

Index: frysk-core/frysk/proc/ChangeLog
2007-04-09  Andrew Cagney  <cagney>

	* TestProcTasksObserver.java (testCloneThenKillAttached)
	(testDeleteAttached): Remove brokenIfUtraceXXX due to 3486.
	* Manager.java (usePoll): Set to false, enable WaitEventLoop.

Index: frysk-imports/frysk/sys/ChangeLog
2007-04-09  Andrew Cagney  <cagney>

	* cni/Wait.cxx (log): Add "logger" parameter, update calls.
	(waitForEvent): Delete.
	(waitAll): Use "log".  Replace loop calling waitForEvent with
	multiple waitpid calls.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2007-0592.html
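
For readers following the thread, here is a minimal sketch of the general pattern the comments converge on: block in waitpid for the first status, then keep calling waitpid with WNOHANG until nothing more is ready, so that correctness does not depend on how many SIGCHLD signals were delivered. This is not frysk's code (frysk's implementation lives in cni/Wait.cxx and is not shown in this report); it is an illustration in Python, and the helper names spawn_children and drain_wait_events are hypothetical.

```python
import os


def spawn_children(exit_codes):
    """Fork one child per exit code; each child exits immediately.

    Returns a dict mapping child pid -> the exit code it was told to use.
    (Hypothetical helper, for illustration only.)
    """
    pids = {}
    for code in exit_codes:
        pid = os.fork()
        if pid == 0:
            os._exit(code)  # child: exit at once, skipping interpreter cleanup
        pids[pid] = code
    return pids


def drain_wait_events(expected_pids):
    """Collect exit statuses without relying on one SIGCHLD per child.

    Each round starts with a blocking waitpid, then switches to WNOHANG
    to drain whatever else is already pending. (Hypothetical helper.)
    """
    remaining = set(expected_pids)
    statuses = {}
    while remaining:
        flags = 0  # first call of each round blocks until a status arrives
        while True:
            try:
                pid, status = os.waitpid(-1, flags)
            except ChildProcessError:
                return statuses  # no children left to wait for
            if pid == 0:
                break  # WNOHANG found nothing ready; block again if needed
            if os.WIFEXITED(status):
                statuses[pid] = os.WEXITSTATUS(status)
            remaining.discard(pid)
            flags = os.WNOHANG  # drain further pending statuses without blocking
    return statuses


if __name__ == "__main__":
    children = spawn_children([0, 3, 7])
    print(drain_wait_events(children))
```

The design point is that a WNOHANG call returning "nothing ready" is treated as a reason to go back to a blocking wait, not as proof that no more events will arrive, which is the race described in the quoted comment.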