Red Hat Bugzilla – Bug 438342
Oprofile does not work in R2 or MRG
Last modified: 2008-06-03 02:08:32 EDT
Vernon Mauery <email@example.com> - 2008-03-18 17:14 EDT
When trying to run oprofile, I see this:
[root@elm3b198 ~]# opcontrol --reset
Signalling daemon... done
[root@elm3b198 ~]# opcontrol --shutdown
[root@elm3b198 ~]# opcontrol
[root@elm3b198 ~]# opcontrol --start
Using default event: CPU_CLK_UNHALTED:100000:0:1:1
/usr/bin/opcontrol: line 1031: /dev/oprofile/0/enabled: No such file or directory
/usr/bin/opcontrol: line 1031: /dev/oprofile/0/event: No such file or directory
/usr/bin/opcontrol: line 1031: /dev/oprofile/0/count: No such file or directory
/usr/bin/opcontrol: line 1031: /dev/oprofile/0/kernel: No such file or directory
/usr/bin/opcontrol: line 1031: /dev/oprofile/0/user: No such file or directory
/usr/bin/opcontrol: line 1031: /dev/oprofile/0/unit_mask: No such file or directory
Using 2.6+ OProfile kernel interface.
Reading module info.
Using log file /var/lib/oprofile/samples/oprofiled.log
[root@elm3b198 ~]# opcontrol --stop
[root@elm3b198 ~]# opreport --long-filenames
opreport error: No sample file found: try running opcontrol --dump
or specify a session containing sample files
This bug exhibits similar symptoms to bug #40996, but the patches that were
applied to the kernel to fix that bug are already in the R2/MRG kernels.
I have seen this running 126.96.36.199-29.el5rt and Alan Stevens has seen it running
Vernon Mauery <firstname.lastname@example.org> - 2008-03-18 17:22 EDT
Since this blocks further investigation of Bug #35584 - RH244819-Multiple
streams degrades network performance, I am marking it as blocking.
Ankita Garg <email@example.com> - 2008-03-19 06:20 EDT
Vernon, could you check whether oprofile works for you with the version of the
oprofile package that acme provided some time back:
or can be obtained from here:
Ankita Garg <firstname.lastname@example.org> - 2008-03-19 06:21 EDT
On HS21 & LS41, oprofile is working fine with the latest R2 kernel without any
userspace package changes. I have not tried with the MRG kernel though.
Vernon Mauery <email@example.com> - 2008-03-19 16:06 EDT
I tried using acme's build of oprofile. It didn't seem to fix the problem. I
also tried downloading the source and building it myself. I always saw the same
This is on an LS21. I don't recall what machine type Alan was working on.
Alan P. Stevens <firstname.lastname@example.org> - 2008-03-20 05:41 EDT
(In reply to comment #4)
> This is on an LS21. I don't recall what machine type Alan was working on.
Vernon, I'm also on an LS21.
NOTE: Since the Austin Performance Tools tprof is also broken on MRG kernels
(because of the variable tick interval), I now have no working profilers for
MRG / V2...
Ankita Garg <email@example.com> - 2008-03-20 05:53 EDT
I tried the oprofile rpms from comment #2 on an LS21 on top of R2. With this
version of the rpm, oprofile worked fine for me. I could not try it on MRG yet,
but since R2 is based on MRG, I wonder what could be missing.
------- Comment From firstname.lastname@example.org 2008-03-25 16:00 EDT-------
In all my attempts to get oprofile working with various -rt kernels and various
versions of oprofile, I must have messed something up.
I just tried again using a fresh install of MRG plus acme's build of oprofile
and that combination seems to work fine for me. I am going to close this bug out.
*** This bug has been marked as a duplicate of 40996 ***
------- Comment From email@example.com 2008-03-25 17:06 EDT-------
Reopening as we need to track this into MRG.
------- Comment From firstname.lastname@example.org 2008-03-31 08:00 EDT-------
RH has put an updated oprofile rpm in their repos. I installed this rpm
(oprofile-0.9.3-16.el5), but I continued to see the same error as before!! I
then installed http://userweb.kernel.org/~acme/oprofile-0.9.3-6.acme.x86_64.rpm
on the same machine and saw that the errors disappear.
So http://userweb.kernel.org/~acme/oprofile-0.9.3-6.acme.x86_64.rpm works, but
oprofile-0.9.3-16.el5 doesn't seem to.
Ankita, could you please verify this? In case I did something wrong...
The difference between the two rpms is that the one that works has a patch that
checks /dev/oprofile/ for the various counter directories ([0-9]+). The
patched code generates a bit mask based on the counter directories that are
available and only puts the events in the available counters.
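
The counter check described above could be sketched roughly as follows. This is
an illustration only, not the attached patch: the function name
available_counter_mask and its directory argument are hypothetical, and the only
assumption taken from this report is that each usable counter appears as a
numeric directory under /dev/oprofile/.

```shell
#!/bin/sh
# Sketch only (not the actual opcontrol patch).
# Build a bit mask with bit N set when the directory <dir>/N exists.
# The default directory /dev/oprofile comes from the error output in this bug;
# the function name and argument are hypothetical.
available_counter_mask() (
    dir=${1:-/dev/oprofile}
    mask=0
    for path in "$dir"/*; do
        # Only directories can be counters; skip plain files.
        [ -d "$path" ] || continue
        name=${path##*/}
        # Keep purely numeric names only (the [0-9]+ pattern from the comment).
        case $name in
            ''|*[!0-9]*) continue ;;
        esac
        mask=$(( mask | (1 << name) ))
    done
    echo "$mask"
)
```

With counter directories 1, 2 and 3 present and 0 missing (the situation in the
original error output), the function prints 14, i.e. binary 1110. A caller would
then assign events only to counters whose bit is set.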
Would you boot our latest kernel with nmi_watchdog=0 and see if the version of
oprofile delivered with MRG works?
Created attachment 299937 [details]
Patch to make oprofile only use counters that are available
The attached patch was proposed in November 2006. It didn't get any comments
and wasn't pushed into upstream oprofile. Changes were made in the kernel,
so that this was no longer an apparent problem in the kernel after 2.6.19 (e.g.
unable to replicate problem on 188.8.131.52-50.f8 kernel with watchdog timer). The
RT kernel appears to be doing things differently.
After discussing this with Clark and trying the workaround successfully, I was
asked to summarize our results here.
- In RHEL-5 oprofile works because the oprofile-kernel piece disables the
nmi_watchdog allowing it to access all 4 performance counters (IOW either
nmi_watchdog or oprofile can run but not both)
- Upstream 2.6.19 and later, I added code that allows both oprofile and
nmi_watchdog to run together (at the cost of one perf counter).
- Fedora follows upstream and upstream has nmi_watchdog disabled by default, so
the oprofile-userspace patch was never needed (unless you enabled nmi_watchdog,
then it would be necessary)
- kernel-rt follows RHEL-5 and enables the nmi_watchdog by default thus causing
this bugzilla (and the need for the oprofile-userspace patch).
- testing with nmi_watchdog=1 on the boot commandline failed to show this
failure because nmi_watchdog=1 uses IO_APIC for the watchdog which does _not_
use perfcounters. Booting with nmi_watchdog=2 uses the LOCAL_APIC and does use
perfcounters.
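
To check which of the cases in the summary above applies to a given boot, the
kernel command line can be inspected. The helper below is a hypothetical sketch
(the name nmi_watchdog_mode and its output labels are made up for illustration).
It only looks for a plain nmi_watchdog=N argument and does not detect the
kernel's default when the option is absent.

```shell
#!/bin/sh
# Sketch only: classify the nmi_watchdog= boot option found in a kernel
# command line string. Function name and output labels are hypothetical.
nmi_watchdog_mode() (
    set -f                      # no glob expansion while splitting words
    mode=unspecified
    for arg in $1; do
        case $arg in
            nmi_watchdog=*) mode=${arg#nmi_watchdog=} ;;
        esac
    done
    case $mode in
        0) echo "disabled" ;;       # watchdog off, all counters free
        1) echo "io_apic" ;;        # IO_APIC watchdog, no perf counter used
        2) echo "local_apic" ;;     # LOCAL_APIC watchdog, takes a perf counter
        *) echo "unspecified" ;;    # option absent or not a plain 0/1/2
    esac
)
```

On a running system this would be called as
nmi_watchdog_mode "$(cat /proc/cmdline)". A result of "local_apic" corresponds
to the configuration in which the unpatched opcontrol fails.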
The temporary workaround to avoid using the userspace patch would be to boot the
kernel-rt normally and run the following commands to use oprofile:
# echo 0 > /proc/sys/kernel/nmi_watchdog    # disable nmi_watchdog
# *oprofile stuff*
# opcontrol --deinit                        # unload oprofile driver module
# echo 1 > /proc/sys/kernel/nmi_watchdog    # re-enable nmi_watchdog
Booting with nmi_watchdog=0 is not recommended if you would like to turn the
nmi_watchdog on later, because 'echo 1 > /proc/sys/kernel/nmi_watchdog' has a bug
that prevents turning on the nmi_watchdog for the first time.
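
The workaround could be wrapped in a small helper along these lines. This is a
sketch under stated assumptions, not a tested tool: the function name
with_nmi_watchdog_off and the NMI_WATCHDOG_FILE and DEINIT_CMD variables are
hypothetical, and it restores whatever value the watchdog file held beforehand
rather than always writing 1.

```shell
#!/bin/sh
# Sketch only: run a command with nmi_watchdog disabled, then unload the
# oprofile driver and restore the previous watchdog setting.
# Both variables are hypothetical knobs so the helper can be dry-run
# against a scratch file; the defaults come from the commands in this bug.
NMI_WATCHDOG_FILE=${NMI_WATCHDOG_FILE:-/proc/sys/kernel/nmi_watchdog}
DEINIT_CMD=${DEINIT_CMD:-"opcontrol --deinit"}

with_nmi_watchdog_off() (
    # Remember the current setting so it can be restored afterwards.
    saved=$(cat "$NMI_WATCHDOG_FILE") || exit 1
    echo 0 > "$NMI_WATCHDOG_FILE" || exit 1
    # Run the caller's profiling command(s).
    "$@"
    status=$?
    # Unload the oprofile driver before the watchdog reclaims its counter.
    $DEINIT_CMD
    echo "$saved" > "$NMI_WATCHDOG_FILE"
    exit "$status"
)
```

As root, a profiling session would then be run as, for example,
with_nmi_watchdog_off sh -c 'opcontrol --start; ./workload; opcontrol --stop',
where ./workload stands in for whatever is being profiled.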
I think that covers everything.
------- Comment From email@example.com 2008-04-02 18:23 EDT-------
I have verified that the 'echo  > /proc/sys/kernel/nmi_watchdog' workaround
as described above works with the oprofile-0.9.3-16.el5 package.
We'll need to wait for a user-space update to opcontrol for a permanent fix.
With a workaround in place, I'm going to drop the severity from urgent to medium.
We've pushed the RHEL5.2 candidate rpm into the MRG beta repository and will
carry that until MRG RT rebases to RHEL5.2 (from RHEL5.1).
------- Comment From firstname.lastname@example.org 2008-05-27 12:52 EDT-------
(In reply to comment #26)
> ------- Comment From email@example.com 2008-05-02 15:28 EST-------
> We've pushed the RHEL5.2 candidate rpm into the MRG beta repository and will
> carry that until MRG RT rebases to RHEL5.2 (from RHEL5.1).
I don't see the rpm under
http://ftp.redhat.com/pub/redhat/linux/beta/MRG/RHEL-5/. Isn't that the place to
I believe that the 5.2 packages got pulled from the beta repository when RHEL5.2
GA'ed. Do we need to put it back?
------- Comment From firstname.lastname@example.org 2008-05-28 05:48 EDT-------
(In reply to comment #28)
> ------- Comment From email@example.com 2008-05-27 18:16 EST-------
> I believe that the 5.2 packages got pulled from the beta repository when RHEL5.2
> GA'ed. Do we need to put it back?
Yes, if MRG is going to be supported on RHEL5.1. I thought that was the case.
The oprofile packages are back in the repository; closing.
------- Comment From firstname.lastname@example.org 2008-06-03 02:03 EDT-------
Closing on our side as well.