Bug 822076

Summary: perf utility with argument top produces segmentation fault after some seconds.
Product: [Fedora] Fedora
Reporter: Andreas Wirooks <andreas.wirooks>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Docs Contact:
Priority: unspecified
Version: 17
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, rtguille
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2012-07-05 18:48:30 UTC
Type: Bug
Attachments:
Description: This is the mentioned coredump.
Flags: none

Description Andreas Wirooks 2012-05-16 09:30:02 UTC
Description of problem:
The perf utility, run with the argument top, produces a segmentation fault after a few seconds. The stat and record subcommands do not seem to be affected.

Version-Release number of selected component (if applicable):
perf-3.3.4-5.fc17.x86_64

How reproducible:
Every time with the argument top.

Steps to Reproduce:
1. Enter "perf top" in a shell as root.
2. Wait 5 to 35 seconds.
  
Actual results:
perf: Segmentation fault

Expected results:
perf top keeps running until q is pressed.
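
The steps and results above could be checked automatically with something like the following sketch. It is not part of the original report. The helper name run_and_classify and the 60-second window are illustrative assumptions, and it assumes perf dies from the signal itself rather than catching it and exiting normally.

```python
import signal
import subprocess


def run_and_classify(cmd, seconds):
    """Run cmd for at most `seconds` and report how it ended.

    Returns "segfault" if the process was killed by SIGSEGV,
    "still-running" if it survived the whole window (it is then terminated),
    and "exited" for any other kind of exit.
    """
    proc = subprocess.Popen(cmd)
    try:
        returncode = proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.terminate()
        proc.wait()
        return "still-running"
    # On POSIX, a return code of -N means the child was killed by signal N.
    if returncode == -signal.SIGSEGV:
        return "segfault"
    return "exited"


if __name__ == "__main__":
    # Assumption: started as root from a real terminal, as in step 1.
    # 60 seconds comfortably covers the reported 5 to 35 second window.
    print(run_and_classify(["perf", "top"], seconds=60))
```

On an affected system this would be expected to print "segfault"; on a fixed one, "still-running". Terminating perf top from outside may leave the terminal in an odd state, so running "reset" afterwards can help.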

Additional info:
I was able to create a coredump, if that helps. If it is helpful, I will attach the dump (26 MB, bzipped with -9). The system is an up-to-date, distro-synced Fedora 17, and I first saw this error 7 weeks ago, when I first installed Fedora 17 Alpha.

Comment 1 Andreas Wirooks 2012-05-16 12:19:42 UTC
Created attachment 584951 [details]
This is the mentioned coredump.

The file was 4 MB too big to attach. Where should coredumps be placed? I had to get a rapidshare account for this. I will delete the account completely after the fix. Is there an alternative?

Comment 2 Josh Boyer 2012-05-16 17:32:07 UTC
I ran 'perf top' for 30 minutes and it didn't segfault.  The backtrace you've pointed to died in the select system call, which is slightly odd.

This happens every time you run 'perf top'?  Are you having other applications crash often, or is there anything in dmesg that would indicate a kernel issue?

Comment 3 Andreas Wirooks 2012-05-21 07:06:46 UTC
Thank you very much for your answer. I think I have identified the cause. It's qemu-kvm.

When I shut down all virtual machines, "perf top" runs without crashing. I am a student currently doing an internship, and my job is to do performance analysis of virtualized multicore machines. I am working with "perf kvm" and trying to do performance analysis inside the guest. Normally I use the -cpu host parameter, but I also tried a simple VM with no special parameters. The effect is the same: I run "perf top" for minutes without a crash, then I ssh into the machine a second time and use virsh to start a VM, and "perf top" immediately crashes.

This is the dmesg output from running "perf top" and starting the VM with the "-cpu host" option, shutting down the VM, starting "perf top" again, and starting the simple VM with no tweaks. Both times "perf top" crashes:
[  189.794691] device vnet0 entered promiscuous mode
[  189.803636] br0: port 2(vnet0) entered forwarding state
[  189.803650] br0: port 2(vnet0) entered forwarding state
[  192.661340] kvm: 1188: cpu0 unhandled rdmsr: 0xc0010112
[  192.661431] kvm: 1188: cpu0 unhandled rdmsr: 0xc0010048
[  192.812551] kvm: 1188: cpu0 unhandled rdmsr: 0xc0010001
[  192.824427] kvm: 1188: cpu1 unhandled rdmsr: 0xc0010048
[  192.836334] kvm: 1188: cpu2 unhandled rdmsr: 0xc0010048
[  192.848098] kvm: 1188: cpu3 unhandled rdmsr: 0xc0010048
[  192.859853] kvm: 1188: cpu4 unhandled rdmsr: 0xc0010048
[  192.871633] kvm: 1188: cpu5 unhandled rdmsr: 0xc0010048
[  192.883470] kvm: 1188: cpu6 unhandled rdmsr: 0xc0010048
[  192.895211] kvm: 1188: cpu7 unhandled rdmsr: 0xc0010048
[  199.866027] vnet0: no IPv6 routers present
[  265.794427] br0: port 2(vnet0) entered disabled state
[  265.799381] br0: port 2(vnet0) entered disabled state
[  265.799598] device vnet0 left promiscuous mode
[  265.799671] br0: port 2(vnet0) entered disabled state
[  689.407728] device vnet0 entered promiscuous mode
[  689.424636] br0: port 2(vnet0) entered forwarding state
[  689.424650] br0: port 2(vnet0) entered forwarding state
[  692.163058] kvm: 1421: cpu0 unhandled rdmsr: 0xc0010001
[  700.210018] vnet0: no IPv6 routers present
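
To compare the two VM starts in a log like the one above, the "unhandled rdmsr" lines can be grouped by process. The following is only a sketch and was not part of the original comment. The function name is made up for illustration, and the number after "kvm:" is assumed to be the qemu-kvm process ID.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "[  192.661340] kvm: 1188: cpu0 unhandled rdmsr: 0xc0010112"
RDMSR_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+kvm:\s+(?P<pid>\d+):\s+cpu(?P<cpu>\d+)\s+"
    r"unhandled rdmsr:\s+(?P<msr>0x[0-9a-fA-F]+)"
)


def unhandled_rdmsr_by_process(dmesg_text):
    """Map each process number to the sorted MSR addresses it failed to read."""
    found = defaultdict(set)
    for line in dmesg_text.splitlines():
        match = RDMSR_LINE.search(line)
        if match:
            found[int(match.group("pid"))].add(match.group("msr"))
    return {pid: sorted(msrs) for pid, msrs in found.items()}


# A few lines copied from the dmesg output in this comment.
SAMPLE = """\
[  189.794691] device vnet0 entered promiscuous mode
[  192.661340] kvm: 1188: cpu0 unhandled rdmsr: 0xc0010112
[  192.661431] kvm: 1188: cpu0 unhandled rdmsr: 0xc0010048
[  192.812551] kvm: 1188: cpu0 unhandled rdmsr: 0xc0010001
[  192.824427] kvm: 1188: cpu1 unhandled rdmsr: 0xc0010048
[  692.163058] kvm: 1421: cpu0 unhandled rdmsr: 0xc0010001
"""

if __name__ == "__main__":
    for pid, msrs in unhandled_rdmsr_by_process(SAMPLE).items():
        print(pid, msrs)
```

For the sample lines this groups three distinct MSR addresses under 1188 (the "-cpu host" start) and one under 1421 (the simple VM).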

Comment 4 Andreas Wirooks 2012-05-21 07:13:52 UTC
I forgot to mention one thing. The Fedora system was a really clean, distro-synced installation, so the "perf top" issue has happened with the official qemu-kvm package since the first day. Now I use a self-compiled git version of qemu-kvm (I thought it would help to read hardware performance counters inside a VM, but it doesn't), and the "perf top" crashes are still there.

Comment 5 Andreas Wirooks 2012-06-01 09:17:52 UTC
Fedora 17 is stable now. Everything is updated and distro-synced, even KVM, and the crash is still there and reproducible. perf top runs perfectly until KVM is started; then it immediately crashes.

Comment 6 Reartes Guillermo 2012-06-02 20:22:58 UTC
I found that perf top segfaulted after one second (in KDE Konsole).

Then I logged in via ssh and it worked OK; I then went back to Konsole and it worked OK there too.

I rebooted the system and will try again.

perf.x86_64   3.3.7-1.fc17

Comment 7 Andreas Wirooks 2012-06-05 14:24:58 UTC
I updated to kernel/perf 3.4 from updates-testing, and the bug is now gone. I know 3.4 is from updates-testing and not the regular repository, so as long as 3.3 is still the regular kernel, I wouldn't call this bug fixed. But the end is near ;-)

Comment 8 Josh Boyer 2012-07-05 18:48:30 UTC
F17 is on the 3.4 kernel now.  Closing out.