Bug 678949

Summary: oprofiled did not start on Nehalem-EX platform
Product: [Retired] Red Hat Hardware Certification Program
Component: Test Suite (tests)
Version: 1.2
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Keywords: Reopened
Reporter: chen yuwen <yuchen>
Assignee: Greg Nichols <gnichols>
QA Contact: Guangze Bai <gbai>
CC: czhang, emcnabb, gbai, rlandry, wcohen, ykun, yshao
Doc Type: Bug Fix
Doc Text:
In v7 1.2, oprofiled could not be started on some systems. This issue has been fixed in v7 1.3, and oprofiled can now be started on these systems.
Last Closed: 2011-05-09 16:14:51 UTC
Attachments:
revised profiler test, resetting NMI watchdog (flags: none)
revised profiler test, resetting NMI watchdog, ignoring spurious opreport stderr output (flags: none)

Description chen yuwen 2011-02-21 03:32:32 UTC
Description of problem:
oprofiled did not start on the Nehalem-EX platform, so the profiler test reports FAIL.


Test Parameters: DEBUG=off RUNMODE=normal OUTPUTFILE=/var/log/v7/runs/1/profiler/output.log DEVICE= TESTSERVER=unknown 
using linux image /usr/lib/debug/lib/modules/2.6.32-114.0.1.el6.x86_64/vmlinux
Using Linux image /usr/lib/debug/lib/modules/2.6.32-114.0.1.el6.x86_64/vmlinux

Subtest Reset:

==== START: Errors during reset may be ignored. ====
Warning: 
"opcontrol --shutdown" has output on stderr
Verified data has beed removed
^^^^ END: Errors during reset may be ignored. ^^^^

PASS

Subtest Start Daemon:
starting opcontrold
Using default event: CPU_CLK_UNHALTED:100000:0:1:1
Error: counter 0 not available nmi_watchdog using this resource ? Try:
opcontrol --deinit
echo 0 > /proc/sys/kernel/nmi_watchdog
Error: oprofiled did not start
FAIL


Version-Release number of selected component (if applicable):
RHEL6.1-20110210.1-x86_64
v7-1.3-10

How reproducible:
always

Steps to Reproduce:
1. Provision RHEL6.1-20110210.1-x86_64 on a Nehalem-EX platform.
2. Install v7 and its dependencies.
3. Run the profiler test.
  
Actual results:
FAIL

Expected results:
PASS

Additional info:

Comment 2 chen yuwen 2011-02-22 09:14:26 UTC
# vim /etc/modprobe.d/modprobe.conf
options oprofile timer=1

With this option set, the profiler test reports PASS.

Comment 4 Greg Nichols 2011-03-07 18:00:31 UTC
(In reply to comment #2)
> # vim /etc/modprobe.d/modprobe.conf
> options oprofile timer=1
> 
> Then profiler testing PASS.


What was the state of the system at the time of the failure? Did the file /etc/modprobe.d/modprobe.conf exist? If so, what did it contain?

Thanks!

Comment 5 chen yuwen 2011-03-08 02:25:16 UTC
The file did not exist before.
I created the file according to Comment #1, and it worked.

Comment 6 chen yuwen 2011-03-10 07:27:26 UTC

*** This bug has been marked as a duplicate of bug 683176 ***

Comment 7 Rob Landry 2011-03-10 14:59:12 UTC
Reopening. The cause is bz 683176; however, this is the v7 side of that bug. v7 needs to disable the nmi_watchdog before running the oprofile service, then run the test portion, and restore the nmi_watchdog state on exit. This will avoid the resource conflict introduced by that change.
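
The save, disable, and restore sequence described here could be sketched in Python as follows. This is only an illustration and not the code in the attached patch: the name nmi_watchdog_disabled is hypothetical, and the only path taken from this report is /proc/sys/kernel/nmi_watchdog (from the opcontrol error message above).

```python
import contextlib
from pathlib import Path

# Path taken from the opcontrol error message in this report.
NMI_WATCHDOG = Path("/proc/sys/kernel/nmi_watchdog")


@contextlib.contextmanager
def nmi_watchdog_disabled(path=NMI_WATCHDOG):
    """Hypothetical helper: turn the NMI watchdog off, then restore it.

    The previous setting is read first so that it can be written back
    when the block exits, even if the test inside raises an exception.
    """
    path = Path(path)
    saved = path.read_text().strip()   # remember the current value, e.g. "1"
    path.write_text("0\n")             # release the counter for oprofile
    try:
        yield saved
    finally:
        path.write_text(saved + "\n")  # restore the original state
```

On a real system the block would need root privileges, and the daemon start, test, and shutdown steps would run inside the "with nmi_watchdog_disabled():" body. The path is a parameter so that the logic can be exercised against an ordinary file.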

Comment 8 Greg Nichols 2011-03-12 13:20:04 UTC
Created attachment 483893 [details]
revised profiler test, resetting NMI watchdog

This revision also changes test flow and logic with respect to forced timer configuration.

Comment 9 Greg Nichols 2011-03-12 13:22:28 UTC
Test runs of the above patch still fail on the RHEL 6.1 system I'm running. In the "report" subtest, opreport produces an error:

"opreport" has output on stderr
Overflow stats not available

Comment 10 Rob Landry 2011-03-15 14:00:02 UTC
Will, when nmi_watchdog is disabled to release the required timer, is it expected behavior for opreport to output "Overflow stats not available"?

Greg, is there any regular output or just this stderr message?

Comment 11 William Cohen 2011-03-15 14:34:00 UTC
Looking through the code, this message appears to be produced when there is no /var/lib/oprofile/samples/current/stats/ directory. The output goes only to stderr.


This has been removed in later versions of oprofile:

http://oprofile.git.sourceforge.net/git/gitweb.cgi?p=oprofile/oprofile;a=commit;h=3cb5ede4de23f32ae57f2f7f50a5642edc33faa6

Looks like you could ignore this message.
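
One way a test could ignore this message while still failing on any other stderr output is sketched below. This is an assumption about the approach, not the contents of the revised test: the names IGNORABLE_STDERR and significant_stderr are hypothetical, and the only message known from this report to be harmless is "Overflow stats not available".

```python
# Messages treated as harmless; only this one is confirmed in this report.
IGNORABLE_STDERR = ("Overflow stats not available",)


def significant_stderr(stderr_text, ignorable=IGNORABLE_STDERR):
    """Return the stderr lines that are neither blank nor known to be harmless."""
    return [
        line
        for line in stderr_text.splitlines()
        if line.strip() and not any(msg in line for msg in ignorable)
    ]
```

A subtest could then treat the opreport run as failed only when significant_stderr() returns a non-empty list.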

Comment 12 Greg Nichols 2011-03-15 20:30:27 UTC
Created attachment 485599 [details]
revised profiler test, resetting NMI watchdog, ignoring spurious opreport stderr output.

Comment 16 Caspar Zhang 2011-05-01 09:49:04 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
In v7 1.2, oprofiled could not be started on some systems. This issue has been fixed in v7 1.3, and oprofiled can now be started on these systems.

Comment 17 errata-xmlrpc 2011-05-09 16:14:51 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0497.html