Bug 678949 - oprofiled did not start on Nehalem-EX platform
Summary: oprofiled did not start on Nehalem-EX platform
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Hardware Certification Program
Classification: Retired
Component: Test Suite (tests)
Version: 1.2
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
: ---
Assignee: Greg Nichols
QA Contact: Guangze Bai
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-02-21 03:32 UTC by chen yuwen
Modified: 2015-02-08 21:36 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
In v7 1.2, oprofiled could not be started on some systems. This issue has been fixed in v7 1.3, and oprofiled can now be started on these systems.
Clone Of:
Environment:
Last Closed: 2011-05-09 16:14:51 UTC


Attachments
revised profiler test, resetting NMI watchdog (10.39 KB, application/octet-stream)
2011-03-12 13:20 UTC, Greg Nichols
revised profiler test, resetting NMI watchdog, ignoring spurious opreport stderr output. (10.53 KB, application/octet-stream)
2011-03-15 20:30 UTC, Greg Nichols


Links
System: Red Hat Product Errata
ID: RHBA-2011:0497
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: v7 bug fix and enhancement update
Last Updated: 2011-05-09 16:11:16 UTC

Description chen yuwen 2011-02-21 03:32:32 UTC
Description of problem:
oprofiled did not start on the Nehalem-EX platform.
As a result, the profiler test fails.


Test Parameters: DEBUG=off RUNMODE=normal OUTPUTFILE=/var/log/v7/runs/1/profiler/output.log DEVICE= TESTSERVER=unknown 
using linux image /usr/lib/debug/lib/modules/2.6.32-114.0.1.el6.x86_64/vmlinux
Using Linux image /usr/lib/debug/lib/modules/2.6.32-114.0.1.el6.x86_64/vmlinux

Subtest Reset:

==== START: Errors during reset may be ignored. ====
Warning: 
"opcontrol --shutdown" has output on stderr
Verified data has beed removed
^^^^ END: Errors during reset may be ignored. ^^^^

PASS

Subtest Start Daemon:
starting opcontrold
Using default event: CPU_CLK_UNHALTED:100000:0:1:1
Error: counter 0 not available nmi_watchdog using this resource ? Try:
opcontrol --deinit
echo 0 > /proc/sys/kernel/nmi_watchdog
Error: oprofiled did not start
FAIL


Version-Release number of selected component (if applicable):
RHEL6.1-20110210.1-x86_64
v7-1.3-10

How reproducible:
always

Steps to Reproduce:
1. Provision RHEL6.1-20110210.1-x86_64 on a Nehalem-EX platform.
2. Install v7 and its dependencies.
3. Run the profiler test.
  
Actual results:
FAIL

Expected results:
PASS

Additional info:

Comment 2 chen yuwen 2011-02-22 09:14:26 UTC
# vim /etc/modprobe.d/modprobe.conf
options oprofile timer=1

With this line added, the profiler test passes.

Comment 4 Greg Nichols 2011-03-07 18:00:31 UTC
(In reply to comment #2)
> # vim /etc/modprobe.d/modprobe.conf
> options oprofile timer=1
> 
> With this line added, the profiler test passes.


What was the state of the system when the failure occurred? Did the file /etc/modprobe.d/modprobe.conf exist? If so, what did it contain?

Thanks!

Comment 5 chen yuwen 2011-03-08 02:25:16 UTC
The file did not exist before.
I created the file according to Comment #1, and it worked.
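
The workaround from comment 2 can be scripted. The following is a minimal Python sketch, not part of v7: the helper name ensure_modprobe_option is hypothetical, and the only values taken from this report are the file path and the option line. It appends the line only if it is not already there, and creates the file if it is missing.

```python
import os


def ensure_modprobe_option(path, line="options oprofile timer=1"):
    """Append `line` to `path` unless an identical line is already present.

    Hypothetical helper for illustration. Creates the file if it does not
    exist. Returns True if the file was modified, False otherwise.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as f:
            content = f.read()

    # Compare against whole lines so a partial match is not mistaken for a hit.
    if line in (existing.strip() for existing in content.splitlines()):
        return False

    with open(path, "a") as f:
        # Start on a fresh line if the file lacks a trailing newline.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(line + "\n")
    return True
```

On the affected system this would be called as root with /etc/modprobe.d/modprobe.conf as the path. Module options are read when the module is loaded, so the oprofile module has to be reloaded before the setting applies.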

Comment 6 chen yuwen 2011-03-10 07:27:26 UTC

*** This bug has been marked as a duplicate of bug 683176 ***

Comment 7 Rob Landry 2011-03-10 14:59:12 UTC
Reopening. The cause is bz 683176; however, this is the v7 side of that bug. v7 needs to disable the nmi_watchdog before starting the oprofile service, then run the test portion, and restore the nmi_watchdog state on exit. This will avoid the resource conflict introduced by that change.
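
The save, disable, and restore sequence described above could be sketched as a Python context manager. This is an illustration under stated assumptions, not the code in the attached patch: the name nmi_watchdog_disabled is hypothetical, and the only detail taken from this report is the control file /proc/sys/kernel/nmi_watchdog named in the opcontrol error message.

```python
import contextlib

# Control file named in the opcontrol error message in this report.
NMI_WATCHDOG = "/proc/sys/kernel/nmi_watchdog"


@contextlib.contextmanager
def nmi_watchdog_disabled(path=NMI_WATCHDOG):
    """Write 0 to the watchdog control file and restore the old value on exit.

    Hypothetical helper for illustration; `path` is a parameter so the
    logic can be exercised against an ordinary file.
    """
    with open(path) as f:
        saved = f.read().strip()

    with open(path, "w") as f:
        f.write("0\n")
    try:
        # Hand the saved value to the caller in case it wants to log it.
        yield saved
    finally:
        # Runs even if the test body raises, so the watchdog state is restored.
        with open(path, "w") as f:
            f.write(saved + "\n")
```

The profiler subtests would then run inside a `with nmi_watchdog_disabled():` block, which needs root privileges on a real system. Restoring in a `finally` clause means a failing subtest does not leave the watchdog disabled.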

Comment 8 Greg Nichols 2011-03-12 13:20:04 UTC
Created attachment 483893
revised profiler test, resetting NMI watchdog

This revision also changes test flow and logic with respect to forced timer configuration.

Comment 9 Greg Nichols 2011-03-12 13:22:28 UTC
Test runs of the above patch still fail on the RHEL 6.1 system I'm running. In the "report" subtest, opreport produces an error:

"opreport" has output on stderr
Overflow stats not available

Comment 10 Rob Landry 2011-03-15 14:00:02 UTC
Will, is it expected behavior that opreport outputs "Overflow stats not available" when nmi_watchdog has been disabled to release the required timer?

Greg, is there any regular output or just this stderr message?

Comment 11 William Cohen 2011-03-15 14:34:00 UTC
Looking through the code, this message appears to be produced when the /var/lib/oprofile/samples/current/stats/ directory does not exist. The output goes only to stderr.


This has been removed in later versions of oprofile:

http://oprofile.git.sourceforge.net/git/gitweb.cgi?p=oprofile/oprofile;a=commit;h=3cb5ede4de23f32ae57f2f7f50a5642edc33faa6

It looks like you can ignore this message.

Comment 12 Greg Nichols 2011-03-15 20:30:27 UTC
Created attachment 485599
revised profiler test, resetting NMI watchdog, ignoring spurious opreport stderr output.
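
One way to ignore a known-benign stderr message while still flagging anything else is sketched below in Python. This is an assumption about the general approach, not the contents of attachment 485599: the names IGNORABLE_STDERR and significant_stderr are hypothetical, and the only message treated as benign is the one quoted in comment 9.

```python
# Messages treated as harmless; only the one from comment 9 is listed.
IGNORABLE_STDERR = ("Overflow stats not available",)


def significant_stderr(stderr_text, ignorable=IGNORABLE_STDERR):
    """Return the stderr lines that are neither blank nor known to be benign.

    Hypothetical helper for illustration. An empty result means the
    command produced no stderr output worth reporting.
    """
    return [
        line
        for line in stderr_text.splitlines()
        if line.strip() and line.strip() not in ignorable
    ]
```

A subtest could then warn or fail only when the returned list is non-empty. Matching whole lines rather than substrings keeps the filter from hiding a longer, unrelated message that merely contains the same words.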

Comment 16 Caspar Zhang 2011-05-01 09:49:04 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
In v7 1.2, oprofiled could not be started on some systems. This issue has been fixed in v7 1.3, and oprofiled can now be started on these systems.

Comment 17 errata-xmlrpc 2011-05-09 16:14:51 UTC
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0497.html

