| Summary: | Use of kernel perf support by PAPI causes crash | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | William Cohen <wcohen> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 15 | CC: | fche, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2012-02-06 17:33:04 UTC | Type: | --- |
This occurs on an AMD Family 10h machine (not sure if the bug is processor-specific). It is a dual-socket, 8-core machine. Below is the processor 0 information.

    processor : 0
    vendor_id : AuthenticAMD
    cpu family : 16
    model : 2
    model name : Quad-Core AMD Opteron(tm) Processor 2350
    stepping : 3
    cpu MHz : 1000.000
    cache size : 512 KB
    physical id : 0
    siblings : 4
    core id : 0
    cpu cores : 4
    apicid : 0
    initial apicid : 0
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs npt lbrv svm_lock
    bogomips : 3999.68
    TLB size : 1024 4K pages
    clflush size : 64
    cache_alignment : 64
    address sizes : 48 bits physical, 48 bits virtual
    power management: ts ttp tm stc 100mhzsteps hwpstate

Does it still happen in the 2.6.42 update? (This is based on upstream 3.2.)

The kernel-2.6.42.3-1.fc15.x86_64.rpm does not suffer from this problem. Everything seems to work fine with it.

Excellent!
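The comments above report kernel-2.6.42.3-1.fc15 as the first build that no longer shows the problem. As a rough sketch, a kernel release string can be compared against that build with a version sort. The function names and the `FIXED_RELEASE` variable are illustrative, and the comparison assumes a `sort` that supports `-V` (GNU coreutils does). It only compares version strings and does not detect the bug itself.

```shell
#!/bin/sh
# Sketch only: compare a kernel release string with the first build that
# the comments in this report describe as working. Names are hypothetical.
FIXED_RELEASE="2.6.42.3-1.fc15"

# Return 0 if the given release sorts strictly before FIXED_RELEASE.
kernel_predates_fix() {
    # Drop a trailing architecture suffix such as the one printed by uname -r.
    release="${1%.x86_64}"
    if [ "$release" = "$FIXED_RELEASE" ]; then
        return 1
    fi
    # Version-sort both strings; the first line is the older release.
    oldest=$(printf '%s\n%s\n' "$release" "$FIXED_RELEASE" | sort -V | head -n 1)
    [ "$oldest" = "$release" ]
}

# Print a one-line verdict for the given release string.
check_kernel() {
    if kernel_predates_fix "$1"; then
        echo "$1: older than $FIXED_RELEASE, may be affected"
    else
        echo "$1: $FIXED_RELEASE or newer"
    fi
}
```

To check the running kernel, call `check_kernel "$(uname -r)"`.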
Description of problem:
When running the PAPI tests ("make fulltest") on a Fedora 15 x86-64 box, the tests cause the machine to crash, and it needs to be rebooted.

Version-Release number of selected component (if applicable):
kernel-2.6.41.10-3.fc15.x86_64
papi-4.1.3-2.fc15.x86_64

How reproducible:
All the time

Steps to Reproduce:
1. yum install "papi*"
2. yumdownloader --source papi; rpm -Uvh papi*src.rpm; cd rpmbuild/SPECS; rpmbuild -ba papi.spec
3. cd ~/rpmbuild/BUILD/papi-4.1.3/src
4. while true; do make fulltest; done
5. (tests will run a while, but then the machine will crash)

Actual results:

    [ 404.654873] ------------[ cut here ]------------
    [ 404.655804] WARNING: at arch/x86/kernel/cpu/perf_event.c:1255 x86_pmu_stop+0)
    [ 404.655804] Hardware name: MCP55
    [ 404.655804] Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl tun ebt]
    [ 404.655804] Pid: 2790, comm: multiplex2 Not tainted 2.6.41.10-3.fc15.x86_64 1
    [ 404.655804] Call Trace:
    [ 404.655804] [<ffffffff8106b7bf>] warn_slowpath_common+0x7f/0xc0
    [ 404.655804] [<ffffffff8106b81a>] warn_slowpath_null+0x1a/0x20
    [ 404.655804] [<ffffffff81024075>] x86_pmu_stop+0xc5/0xe0
    [ 404.655804] [<ffffffff81026eb5>] x86_pmu_enable+0x95/0x270
    [ 404.655804] [<ffffffff8110cea6>] __perf_install_in_context+0x166/0x1b0
    [ 404.655804] [<ffffffff81109420>] ? perf_adjust_period+0x1c0/0x1c0
    [ 404.655804] [<ffffffff81109468>] remote_function+0x48/0x60
    [ 404.655804] [<ffffffff810a5077>] smp_call_function_single+0x147/0x160
    [ 404.801032] [<ffffffff8118ff82>] ? mnt_clone_write+0x12/0x30
    [ 404.801032] [<ffffffff81108274>] task_function_call+0x44/0x50
    [ 404.801032] [<ffffffff8110cd40>] ? perf_event_sched_in+0xa0/0xa0
    [ 404.801032] [<ffffffff8110addd>] perf_install_in_context+0x5d/0xa0
    [ 404.801032] [<ffffffff81110715>] sys_perf_event_open+0x695/0x950
    [ 404.801032] [<ffffffff815b7f82>] system_call_fastpath+0x16/0x1b
    [ 404.801032] ---[ end trace ce673c479678ad8b ]---

Expected results:
The tests run without crashing the machine.

Additional info:
Thread by Vince Weaver describing the same problem: https://lkml.org/lkml/2011/12/16/463
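When re-running the reproduction loop, it can help to check the kernel log for the same warning signature as in the trace above. The sketch below counts matching lines in log text read from standard input. The function name `count_pmu_warnings` is hypothetical. The pattern accepts any source line number after `perf_event.c:`, on the assumption that it may differ between kernel builds.

```shell
#!/bin/sh
# Sketch only: count kernel log lines that match the x86_pmu_stop warning
# seen in this report. Reads log text from standard input.
count_pmu_warnings() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status clean.
    grep -c 'WARNING: at arch/x86/kernel/cpu/perf_event\.c:[0-9][0-9]* x86_pmu_stop' || true
}
```

For example, `dmesg | count_pmu_warnings` prints the number of such warnings in the current kernel ring buffer. If the machine has already crashed, the same function can be fed a saved serial console or netconsole log instead.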