Bug 518480
| Field | Value |
|---|---|
| Summary | Oprofile seems to not daemonize properly on ia64 |
| Product | Red Hat Enterprise Linux 5 |
| Component | oprofile |
| Version | 5.4 |
| Hardware | ia64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Target Milestone | rc |
| Reporter | Petr Muller <pmuller> |
| Assignee | William Cohen <wcohen> |
| QA Contact | BaseOS QE <qe-baseos-auto> |
| CC | ohudlick |
| Fixed In Version | 0.9.4-14.el5 |
| Doc Type | Bug Fix |
| Last Closed | 2010-03-30 08:51:58 UTC |

Description
Petr Muller
2009-08-20 15:09:53 UTC
I attempted to reproduce the problem following the steps above, but was unable to replicate the behavior. This machine is subscribed to RHN and runs RHEL 5.3, with the exception of a newer kernel and the listed oprofile rpm.

I wasn't able to reproduce the problem on this ia64 machine. I am setting up a Red Hat test machine with a clean version of RHEL 5.4. What kernel version was used for the testing (uname -a)?

I tried a fresh install of RHEL5.4-Server-20090819.0 on a Red Hat test system ia64 machine and still was unable to recreate the hang using the steps listed. The machine had the following rpms:

    kernel-2.6.18-164.el5
    oprofile-0.9.4-11.el5

What was the original shell script that produced the problem on ia64? Was there something in there that intercepted signals?

I can reproduce the issue without any problem on RHEL5.3 and RHEL5.4 testing boxes by simply running

    # opcontrol --deinit; opcontrol --init; opcontrol --start-daemon | tee log

as stated above. I did the original investigation on a RHEL5.4 box, so it is weird that you cannot reproduce it. The versions: oprofile-0.9.4-11.el5.ia64 on both, with kernels:

    2.6.18-128.el5 (rhel5.3)
    2.6.18-162.el5xen (some rhel5.4 candidate)

The original script showing the behavior was the runtest.sh of the /tools/oprofile/Sanity/opcontrol-options RHTS test, at line 189, doing

    opcontrol --start-daemon --no-vmlinux --verbose 2>&1 | tee $TMPOUTPUT

There is nothing signal-interfering that I know about.

Sample output of the reproducing line:

    # opcontrol --deinit; opcontrol --init; opcontrol --start-daemon --verbose | tee log
    Stopping profiling.
    Killing daemon.
    Unloading oprofile module
    Parameters used:
    SESSION_DIR /var/lib/oprofile
    LOCK_FILE /var/lib/oprofile/lock
    SAMPLES_DIR /var/lib/oprofile/samples
    CURRENT_SAMPLES_DIR /var/lib/oprofile/samples/current
    CPUTYPE ia64/itanium2
    BUF_SIZE 500
    BUF_WATERSHED 250
    CPU_BUF_SIZE 1000
    SEPARATE_LIB 0
    SEPARATE_KERNEL 0
    SEPARATE_THREAD 0
    SEPARATE_CPU 0
    CALLGRAPH 0
    VMLINUX none
    KERNEL_RANGE
    XENIMAGE none
    XEN_RANGE
    executing oprofiled --session-dir=/var/lib/oprofile --separate-lib=0 --separate-kernel=0 --separate-thread=0 --separate-cpu=0 --events=CPU_CYCLES:18:0:150000:0:1:1, --no-vmlinux --verbose=all
    Events: CPU_CYCLES:18:0:150000:0:1:1,
    Using 2.6+ OProfile kernel interface.
    Running perfmon child on CPU0.
    Events: CPU_CYCLES:18:0:150000:0:1:1,
    Using 2.6+ OProfile kernel interface.
    Waiting on CPU0
    Perfmon child up on CPU0
    Daemon started.
    (... sitting here until ctrl-c or something ...)

Created attachment 359410
Disconnect children running perfmon from stdin/stdout
I originally misunderstood the desired behavior. I compared the ia64 behavior with the x86_64 behavior and found how the "--start-daemon" option was supposed to behave.
When ia64 oprofiled starts up, it creates child processes to run perfmon. These child processes still have file descriptors open for stdin, stdout, and stderr. Because they keep the write end of the pipe to tee open, tee never sees end-of-file and the pipeline never finishes. The attached patch closes those file descriptors to allow the tee operation to continue. This patch is not in its final state, but it shows what is going wrong on ia64 and the basic approach to fixing it.
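
The effect of an inherited descriptor on a pipe reader can be reproduced outside oprofile. The following is a minimal Python sketch, not the attached patch: the function name `reader_sees_eof`, its `child_closes_fd` parameter, and the one-second timeout are all made up for illustration. A forked child stands in for a perfmon child, and the parent's read end of the pipe stands in for tee.

```python
import os
import select
import signal
import time


def reader_sees_eof(child_closes_fd: bool, timeout: float = 1.0) -> bool:
    """Return True if the pipe reader gets EOF while a forked child is still alive.

    Hypothetical helper for illustration only; it is not part of oprofile.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: stands in for a long-lived perfmon child process.
        os.close(read_fd)
        if child_closes_fd:
            # The approach described above: drop the inherited descriptor.
            os.close(write_fd)
        time.sleep(30)
        os._exit(0)

    # Parent: stands in for the launcher, which is done with its own
    # copy of the write end once the daemon has been started.
    os.close(write_fd)
    try:
        # The reader only becomes readable with EOF once *every* copy of
        # the write end has been closed, including the child's copy.
        ready, _, _ = select.select([read_fd], [], [], timeout)
        return bool(ready) and os.read(read_fd, 1) == b""
    finally:
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
        os.close(read_fd)


if __name__ == "__main__":
    print("child keeps fd open  -> reader sees EOF:", reader_sees_eof(False))
    print("child closes its fd  -> reader sees EOF:", reader_sees_eof(True))
```

When the child keeps its copy of the write end, the reader times out without seeing end-of-file, which matches the "sitting here until ctrl-c" symptom. When the child closes it, the reader sees end-of-file even though the child is still running.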
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0283.html