Description of problem:
The papi 4.2.1 in Fedora 16 core dumps.

Version-Release number of selected component (if applicable):
papi-4.2.1-1

How reproducible:
Always

Steps to Reproduce:
1. build papi-4.2.1-1.fc18 for f16
2. install it
3. run papi_avail or papi_native_avail

Actual results:
----------
$ papi_avail
*** glibc detected *** papi_avail: free(): invalid next size (normal): 0x0000000006045490 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3680a7dda6]
/lib64/libc.so.6[0x3680a7f08e]
/lib64/libc.so.6(fclose+0x155)[0x3680a6db45]
/usr/lib64/libpapi.so(_mx_init_substrate+0xa8)[0x368162a5a8]
/usr/lib64/libpapi.so(_papi_hwi_init_global+0x38)[0x368161a6b8]
/usr/lib64/libpapi.so(PAPI_library_init+0xad)[0x368161768d]
papi_avail[0x40156f]
/lib64/libc.so.6(__libc_start_main+0xed)[0x3680a2169d]
papi_avail[0x40213d]
======= Memory map: ========
...
----------

Additional info:
It doesn't core dump in RHEL6.2
Not building the lm_sensors component appears to fix the problem:

---with-components="coretemp example lmsensors lustre mx net"
+--with-components="coretemp example lustre mx net"
I just tried to replicate this on an AMD family 10 machine running Fedora 15 and an Intel Pentium 4 machine running F16. Neither triggered the problem. This problem could be very machine specific.

What processor and motherboard are you getting this failure on?

Could you supply the papi_avail output before the failure?

Could you install the papi-debuginfo rpm and glibc-debuginfo (debuginfo-install glibc) and see if you can get a backtrace that provides the line numbers?

Alternatively, run papi_avail in gdb, set a breakpoint in exit, run, and print the backtrace:

gdb /usr/bin/papi_avail
break exit
run
where
papi_avail output from a papi-4.2.1-1.x86_64 rpm built in mock without the lmsensors component (the only specfile change is the one mentioned in the second comment): ---------- # papi_avail Available events and hardware information. -------------------------------------------------------------------------------- PAPI Version : 4.2.1.0 Vendor string and code : GenuineIntel (1) Model string and code : Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz (30) CPU Revision : 5.000000 CPUID Info : Family: 6 Model: 30 Stepping: 5 CPU Megahertz : 1200.000000 CPU Clock Megahertz : 1200 Hdw Threads per core : 2 Cores per Socket : 4 NUMA Nodes : 1 CPUs per Node : 8 Total CPUs : 8 Number Hardware Counters : 7 Max Multiplex Counters : 64 -------------------------------------------------------------------------------- Name Code Avail Deriv Description (Note) PAPI_L1_DCM 0x80000000 Yes No Level 1 data cache misses PAPI_L1_ICM 0x80000001 Yes No Level 1 instruction cache misses PAPI_L2_DCM 0x80000002 Yes Yes Level 2 data cache misses PAPI_L2_ICM 0x80000003 Yes No Level 2 instruction cache misses PAPI_L3_DCM 0x80000004 No No Level 3 data cache misses PAPI_L3_ICM 0x80000005 No No Level 3 instruction cache misses PAPI_L1_TCM 0x80000006 Yes Yes Level 1 cache misses PAPI_L2_TCM 0x80000007 Yes No Level 2 cache misses PAPI_L3_TCM 0x80000008 Yes No Level 3 cache misses PAPI_CA_SNP 0x80000009 No No Requests for a snoop PAPI_CA_SHR 0x8000000a No No Requests for exclusive access to shared cache line PAPI_CA_CLN 0x8000000b No No Requests for exclusive access to clean cache line PAPI_CA_INV 0x8000000c No No Requests for cache line invalidation PAPI_CA_ITV 0x8000000d No No Requests for cache line intervention PAPI_L3_LDM 0x8000000e Yes No Level 3 load misses PAPI_L3_STM 0x8000000f No No Level 3 store misses PAPI_BRU_IDL 0x80000010 No No Cycles branch units are idle PAPI_FXU_IDL 0x80000011 No No Cycles integer units are idle PAPI_FPU_IDL 0x80000012 No No Cycles floating point units are idle PAPI_LSU_IDL 
0x80000013 No No Cycles load/store units are idle PAPI_TLB_DM 0x80000014 Yes No Data translation lookaside buffer misses PAPI_TLB_IM 0x80000015 Yes No Instruction translation lookaside buffer misses PAPI_TLB_TL 0x80000016 Yes Yes Total translation lookaside buffer misses PAPI_L1_LDM 0x80000017 Yes No Level 1 load misses PAPI_L1_STM 0x80000018 Yes No Level 1 store misses PAPI_L2_LDM 0x80000019 Yes No Level 2 load misses PAPI_L2_STM 0x8000001a Yes No Level 2 store misses PAPI_BTAC_M 0x8000001b No No Branch target address cache misses PAPI_PRF_DM 0x8000001c No No Data prefetch cache misses PAPI_L3_DCH 0x8000001d No No Level 3 data cache hits PAPI_TLB_SD 0x8000001e No No Translation lookaside buffer shootdowns PAPI_CSR_FAL 0x8000001f No No Failed store conditional instructions PAPI_CSR_SUC 0x80000020 No No Successful store conditional instructions PAPI_CSR_TOT 0x80000021 No No Total store conditional instructions PAPI_MEM_SCY 0x80000022 No No Cycles Stalled Waiting for memory accesses PAPI_MEM_RCY 0x80000023 No No Cycles Stalled Waiting for memory Reads PAPI_MEM_WCY 0x80000024 No No Cycles Stalled Waiting for memory writes PAPI_STL_ICY 0x80000025 No No Cycles with no instruction issue PAPI_FUL_ICY 0x80000026 No No Cycles with maximum instruction issue PAPI_STL_CCY 0x80000027 No No Cycles with no instructions completed PAPI_FUL_CCY 0x80000028 No No Cycles with maximum instructions completed PAPI_HW_INT 0x80000029 No No Hardware interrupts PAPI_BR_UCN 0x8000002a Yes No Unconditional branch instructions PAPI_BR_CN 0x8000002b Yes No Conditional branch instructions PAPI_BR_TKN 0x8000002c Yes No Conditional branch instructions taken PAPI_BR_NTK 0x8000002d Yes Yes Conditional branch instructions not taken PAPI_BR_MSP 0x8000002e Yes No Conditional branch instructions mispredicted PAPI_BR_PRC 0x8000002f Yes Yes Conditional branch instructions correctly predicted PAPI_FMA_INS 0x80000030 No No FMA instructions completed PAPI_TOT_IIS 0x80000031 Yes No Instructions issued 
PAPI_TOT_INS 0x80000032 Yes No Instructions completed PAPI_INT_INS 0x80000033 No No Integer instructions PAPI_FP_INS 0x80000034 Yes No Floating point instructions PAPI_LD_INS 0x80000035 Yes No Load instructions PAPI_SR_INS 0x80000036 Yes No Store instructions PAPI_BR_INS 0x80000037 Yes No Branch instructions PAPI_VEC_INS 0x80000038 No No Vector/SIMD instructions (could include integer) PAPI_RES_STL 0x80000039 Yes No Cycles stalled on any resource PAPI_FP_STAL 0x8000003a No No Cycles the FP unit(s) are stalled PAPI_TOT_CYC 0x8000003b Yes No Total cycles PAPI_LST_INS 0x8000003c Yes Yes Load/store instructions completed PAPI_SYC_INS 0x8000003d No No Synchronization instructions completed PAPI_L1_DCH 0x8000003e Yes Yes Level 1 data cache hits PAPI_L2_DCH 0x8000003f Yes Yes Level 2 data cache hits PAPI_L1_DCA 0x80000040 Yes No Level 1 data cache accesses PAPI_L2_DCA 0x80000041 Yes No Level 2 data cache accesses PAPI_L3_DCA 0x80000042 Yes Yes Level 3 data cache accesses PAPI_L1_DCR 0x80000043 Yes No Level 1 data cache reads PAPI_L2_DCR 0x80000044 Yes No Level 2 data cache reads PAPI_L3_DCR 0x80000045 Yes No Level 3 data cache reads PAPI_L1_DCW 0x80000046 Yes No Level 1 data cache writes PAPI_L2_DCW 0x80000047 Yes No Level 2 data cache writes PAPI_L3_DCW 0x80000048 Yes No Level 3 data cache writes PAPI_L1_ICH 0x80000049 Yes No Level 1 instruction cache hits PAPI_L2_ICH 0x8000004a Yes No Level 2 instruction cache hits PAPI_L3_ICH 0x8000004b No No Level 3 instruction cache hits PAPI_L1_ICA 0x8000004c Yes No Level 1 instruction cache accesses PAPI_L2_ICA 0x8000004d Yes No Level 2 instruction cache accesses PAPI_L3_ICA 0x8000004e Yes No Level 3 instruction cache accesses PAPI_L1_ICR 0x8000004f Yes No Level 1 instruction cache reads PAPI_L2_ICR 0x80000050 Yes No Level 2 instruction cache reads PAPI_L3_ICR 0x80000051 Yes No Level 3 instruction cache reads PAPI_L1_ICW 0x80000052 No No Level 1 instruction cache writes PAPI_L2_ICW 0x80000053 No No Level 2 instruction cache writes 
PAPI_L3_ICW 0x80000054 No No Level 3 instruction cache writes PAPI_L1_TCH 0x80000055 No No Level 1 total cache hits PAPI_L2_TCH 0x80000056 Yes Yes Level 2 total cache hits PAPI_L3_TCH 0x80000057 No No Level 3 total cache hits PAPI_L1_TCA 0x80000058 Yes Yes Level 1 total cache accesses PAPI_L2_TCA 0x80000059 Yes No Level 2 total cache accesses PAPI_L3_TCA 0x8000005a Yes No Level 3 total cache accesses PAPI_L1_TCR 0x8000005b Yes Yes Level 1 total cache reads PAPI_L2_TCR 0x8000005c Yes Yes Level 2 total cache reads PAPI_L3_TCR 0x8000005d Yes Yes Level 3 total cache reads PAPI_L1_TCW 0x8000005e No No Level 1 total cache writes PAPI_L2_TCW 0x8000005f Yes No Level 2 total cache writes PAPI_L3_TCW 0x80000060 Yes No Level 3 total cache writes PAPI_FML_INS 0x80000061 No No Floating point multiply instructions PAPI_FAD_INS 0x80000062 No No Floating point add instructions PAPI_FDV_INS 0x80000063 No No Floating point divide instructions PAPI_FSQ_INS 0x80000064 No No Floating point square root instructions PAPI_FNV_INS 0x80000065 No No Floating point inverse instructions PAPI_FP_OPS 0x80000066 Yes Yes Floating point operations PAPI_SP_OPS 0x80000067 Yes Yes Floating point operations; optimized to count scaled single precision vector operations PAPI_DP_OPS 0x80000068 Yes Yes Floating point operations; optimized to count scaled double precision vector operations PAPI_VEC_SP 0x80000069 Yes No Single precision vector/SIMD instructions PAPI_VEC_DP 0x8000006a Yes No Double precision vector/SIMD instructions ------------------------------------------------------------------------- Of 107 possible events, 63 are available, of which 17 are derived. avail.c PASSED
papi-4.2.1-1 built in mock (no specfile changes). Additional RPMs installed: * debuginfo-install lm_sensors-libs-3.3.1-1.fc16.x86_64 ---------- # gdb /usr/bin/papi_avail GNU gdb (GDB) Fedora (7.3.50.20110722-10.fc16) Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /usr/bin/papi_avail...Reading symbols from /usr/lib/debug/usr/bin/papi_avail.debug...done. done. (gdb) break exit Breakpoint 1 at 0x4013b0 (gdb) run Starting program: /usr/bin/papi_avail Detaching after fork from child process 26950. *** glibc detected *** /usr/bin/papi_avail: free(): invalid next size (normal): 0x000000000460d490 *** ======= Backtrace: ========= /lib64/libc.so.6[0x3680a7dda6] /lib64/libc.so.6[0x3680a7f08e] /lib64/libc.so.6(fclose+0x155)[0x3680a6db45] /usr/lib64/libpapi.so(_mx_init_substrate+0xa8)[0x7ffff7daf5a8] /usr/lib64/libpapi.so(_papi_hwi_init_global+0x38)[0x7ffff7d9f6b8] /usr/lib64/libpapi.so(PAPI_library_init+0xad)[0x7ffff7d9c68d] /usr/bin/papi_avail[0x40156f] /lib64/libc.so.6(__libc_start_main+0xed)[0x3680a2169d] /usr/bin/papi_avail[0x40213d] ======= Memory map: ======== 00400000-00407000 r-xp 00000000 08:03 2624639 /usr/bin/papi_avail 00606000-00607000 r--p 00006000 08:03 2624639 /usr/bin/papi_avail 00607000-00608000 rw-p 00007000 08:03 2624639 /usr/bin/papi_avail 00608000-04678000 rw-p 00000000 00:00 0 [heap] 3680600000-3680622000 r-xp 00000000 08:03 2097266 /lib64/ld-2.14.90.so 3680821000-3680822000 r--p 00021000 08:03 2097266 /lib64/ld-2.14.90.so 3680822000-3680823000 rw-p 00022000 08:03 2097266 /lib64/ld-2.14.90.so 3680823000-3680824000 rw-p 00000000 00:00 0 
3680a00000-3680bad000 r-xp 00000000 08:03 2097277 /lib64/libc-2.14.90.so 3680bad000-3680dad000 ---p 001ad000 08:03 2097277 /lib64/libc-2.14.90.so 3680dad000-3680db1000 r--p 001ad000 08:03 2097277 /lib64/libc-2.14.90.so 3680db1000-3680db3000 rw-p 001b1000 08:03 2097277 /lib64/libc-2.14.90.so 3680db3000-3680db8000 rw-p 00000000 00:00 0 3680e00000-3680e83000 r-xp 00000000 08:03 2097462 /lib64/libm-2.14.90.so 3680e83000-3681082000 ---p 00083000 08:03 2097462 /lib64/libm-2.14.90.so 3681082000-3681083000 r--p 00082000 08:03 2097462 /lib64/libm-2.14.90.so 3681083000-3681084000 rw-p 00083000 08:03 2097462 /lib64/libm-2.14.90.so 3681a00000-3681a15000 r-xp 00000000 08:03 2097469 /lib64/libgcc_s-4.6.2-20111027.so.1 3681a15000-3681c14000 ---p 00015000 08:03 2097469 /lib64/libgcc_s-4.6.2-20111027.so.1 3681c14000-3681c15000 rw-p 00014000 08:03 2097469 /lib64/libgcc_s-4.6.2-20111027.so.1 3683a00000-3683a0e000 r-xp 00000000 08:03 2626130 /usr/lib64/libsensors.so.4.3.1 3683a0e000-3683c0d000 ---p 0000e000 08:03 2626130 /usr/lib64/libsensors.so.4.3.1 3683c0d000-3683c0e000 rw-p 0000d000 08:03 2626130 /usr/lib64/libsensors.so.4.3.1 7ffff7ab5000-7ffff7ae1000 rw-p 00000000 00:00 0 7ffff7ae1000-7ffff7b49000 r-xp 00000000 08:03 2624924 /usr/lib64/libpfm.so.4.2.0 7ffff7b49000-7ffff7d49000 ---p 00068000 08:03 2624924 /usr/lib64/libpfm.so.4.2.0 7ffff7d49000-7ffff7d83000 rw-p 00068000 08:03 2624924 /usr/lib64/libpfm.so.4.2.0 7ffff7d83000-7ffff7d85000 rw-p 00000000 00:00 0 7ffff7d85000-7ffff7dcd000 r-xp 00000000 08:03 2624905 /usr/lib64/libpapi.so.4.2.1.0 7ffff7dcd000-7ffff7fcc000 ---p 00048000 08:03 2624905 /usr/lib64/libpapi.so.4.2.1.0 7ffff7fcc000-7ffff7fce000 r--p 00047000 08:03 2624905 /usr/lib64/libpapi.so.4.2.1.0 7ffff7fce000-7ffff7fd3000 rw-p 00049000 08:03 2624905 /usr/lib64/libpapi.so.4.2.1.0 7ffff7fd3000-7ffff7fd8000 rw-p 00000000 00:00 0 7ffff7ffc000-7ffff7ffe000 rw-p 00000000 00:00 0 7ffff7ffe000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso] 7ffffffde000-7ffffffff000 rw-p 00000000 
00:00 0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

Program received signal SIGABRT, Aborted.
0x0000003680a36285 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) where
#0  0x0000003680a36285 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003680a37b9b in __GI_abort () at abort.c:91
#2  0x0000003680a77a7e in __libc_message (do_abort=2, fmt=0x3680b76678 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x0000003680a7dda6 in malloc_printerr (action=3, str=0x3680b76840 "free(): invalid next size (normal)", ptr=<optimized out>) at malloc.c:5021
#4  0x0000003680a7f08e in _int_free (av=0x3680db1700, p=0x460d480, have_lock=0) at malloc.c:3942
#5  0x0000003680a6db45 in _IO_new_fclose (fp=0x460d490) at iofclose.c:88
#6  0x00007ffff7daf5a8 in _mx_init_substrate () at components/mx/linux-mx.c:234
#7  0x00007ffff7d9f6b8 in _papi_hwi_init_global () at papi_internal.c:1420
#8  0x00007ffff7d9c68d in PAPI_library_init (version=<optimized out>) at papi.c:601
#9  0x000000000040156f in main (argc=<optimized out>, argv=0x7fffffffe3e8) at avail.c:163
----------
(In reply to comment #2)
>
> What processor and mother board are you getting this failure on?
>

From the dmidecode output:

Base Board Information
	Manufacturer: Foxconn
	Product Name: H55MX-S Series
	Version: 1.1

BIOS Information
	Vendor: American Megatrends Inc.
	Version: 080015
	Release Date: 08/09/2010

Processor Information
	Socket Designation: CPU 1
	Type: Central Processor
	Family: Core i7
	Manufacturer: Intel
	Signature: Type 0, Family 6, Model 30, Stepping 5
Ran it through valgrind. It looks like there might be an access past the end of the allocated array at linux-lmsensors.c:116.

$ valgrind papi_avail
==22559== Memcheck, a memory error detector
==22559== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==22559== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==22559== Command: papi_avail
==22559== 
==22559== Invalid write of size 4
==22559==    at 0x5263AD9: createNativeEvents (linux-lmsensors.c:116)
==22559==    by 0x5263C0D: LM_SENSORS_init_substrate (linux-lmsensors.c:191)
==22559==    by 0x5253AA7: _papi_hwi_init_global (papi_internal.c:1420)
==22559==    by 0x525177C: PAPI_library_init (papi.c:601)
==22559==    by 0x40180A: main (avail.c:163)
==22559==  Address 0x5dd1ad8 is 216 bytes inside a block of size 568 free'd
==22559==    at 0x500EC4A: free (vg_replace_malloc.c:427)
==22559==    by 0x305E8662DF: __fopen_internal (iofopen.c:98)
==22559==    by 0x305F004FEE: sensors_get_label (in /usr/lib64/libsensors.so.4.2.0)
==22559==    by 0x52639EF: createNativeEvents (linux-lmsensors.c:80)
==22559==    by 0x5263C0D: LM_SENSORS_init_substrate (linux-lmsensors.c:191)
==22559==    by 0x5253AA7: _papi_hwi_init_global (papi_internal.c:1420)
==22559==    by 0x525177C: PAPI_library_init (papi.c:601)
==22559==    by 0x40180A: main (avail.c:163)
==22559== 
Available events and hardware information.
-------------------------------------------------------------------------------- PAPI Version : 4.2.1.0 Vendor string and code : GenuineIntel (1) Model string and code : Intel(R) Core(TM) i7 CPU M 620 @ 2.67GHz (37) CPU Revision : 2.000000 CPUID Info : Family: 6 Model: 37 Stepping: 2 CPU Megahertz : 1199.000000 CPU Clock Megahertz : 1199 Hdw Threads per core : 2 Cores per Socket : 2 NUMA Nodes : 1 CPUs per Node : 4 Total CPUs : 4 Number Hardware Counters : 7 Max Multiplex Counters : 64 -------------------------------------------------------------------------------- Name Code Avail Deriv Description (Note) PAPI_L1_DCM 0x80000000 Yes No Level 1 data cache misses PAPI_L1_ICM 0x80000001 Yes No Level 1 instruction cache misses PAPI_L2_DCM 0x80000002 Yes Yes Level 2 data cache misses PAPI_L2_ICM 0x80000003 Yes No Level 2 instruction cache misses PAPI_L3_DCM 0x80000004 No No Level 3 data cache misses PAPI_L3_ICM 0x80000005 No No Level 3 instruction cache misses PAPI_L1_TCM 0x80000006 Yes Yes Level 1 cache misses PAPI_L2_TCM 0x80000007 Yes No Level 2 cache misses PAPI_L3_TCM 0x80000008 Yes No Level 3 cache misses PAPI_CA_SNP 0x80000009 No No Requests for a snoop PAPI_CA_SHR 0x8000000a No No Requests for exclusive access to shared cache line PAPI_CA_CLN 0x8000000b No No Requests for exclusive access to clean cache line PAPI_CA_INV 0x8000000c No No Requests for cache line invalidation PAPI_CA_ITV 0x8000000d No No Requests for cache line intervention PAPI_L3_LDM 0x8000000e Yes No Level 3 load misses PAPI_L3_STM 0x8000000f No No Level 3 store misses PAPI_BRU_IDL 0x80000010 No No Cycles branch units are idle PAPI_FXU_IDL 0x80000011 No No Cycles integer units are idle PAPI_FPU_IDL 0x80000012 No No Cycles floating point units are idle PAPI_LSU_IDL 0x80000013 No No Cycles load/store units are idle PAPI_TLB_DM 0x80000014 Yes No Data translation lookaside buffer misses PAPI_TLB_IM 0x80000015 Yes No Instruction translation lookaside buffer misses PAPI_TLB_TL 0x80000016 Yes Yes Total 
translation lookaside buffer misses PAPI_L1_LDM 0x80000017 Yes No Level 1 load misses PAPI_L1_STM 0x80000018 Yes No Level 1 store misses PAPI_L2_LDM 0x80000019 Yes No Level 2 load misses PAPI_L2_STM 0x8000001a Yes No Level 2 store misses PAPI_BTAC_M 0x8000001b No No Branch target address cache misses PAPI_PRF_DM 0x8000001c No No Data prefetch cache misses PAPI_L3_DCH 0x8000001d No No Level 3 data cache hits PAPI_TLB_SD 0x8000001e No No Translation lookaside buffer shootdowns PAPI_CSR_FAL 0x8000001f No No Failed store conditional instructions PAPI_CSR_SUC 0x80000020 No No Successful store conditional instructions PAPI_CSR_TOT 0x80000021 No No Total store conditional instructions PAPI_MEM_SCY 0x80000022 No No Cycles Stalled Waiting for memory accesses PAPI_MEM_RCY 0x80000023 No No Cycles Stalled Waiting for memory Reads PAPI_MEM_WCY 0x80000024 No No Cycles Stalled Waiting for memory writes PAPI_STL_ICY 0x80000025 No No Cycles with no instruction issue PAPI_FUL_ICY 0x80000026 No No Cycles with maximum instruction issue PAPI_STL_CCY 0x80000027 No No Cycles with no instructions completed PAPI_FUL_CCY 0x80000028 No No Cycles with maximum instructions completed PAPI_HW_INT 0x80000029 No No Hardware interrupts PAPI_BR_UCN 0x8000002a Yes No Unconditional branch instructions PAPI_BR_CN 0x8000002b Yes No Conditional branch instructions PAPI_BR_TKN 0x8000002c Yes No Conditional branch instructions taken PAPI_BR_NTK 0x8000002d Yes Yes Conditional branch instructions not taken PAPI_BR_MSP 0x8000002e Yes No Conditional branch instructions mispredicted PAPI_BR_PRC 0x8000002f Yes Yes Conditional branch instructions correctly predicted PAPI_FMA_INS 0x80000030 No No FMA instructions completed PAPI_TOT_IIS 0x80000031 Yes No Instructions issued PAPI_TOT_INS 0x80000032 Yes No Instructions completed PAPI_INT_INS 0x80000033 No No Integer instructions PAPI_FP_INS 0x80000034 Yes No Floating point instructions PAPI_LD_INS 0x80000035 Yes No Load instructions PAPI_SR_INS 0x80000036 Yes No 
Store instructions PAPI_BR_INS 0x80000037 Yes No Branch instructions PAPI_VEC_INS 0x80000038 No No Vector/SIMD instructions (could include integer) PAPI_RES_STL 0x80000039 Yes No Cycles stalled on any resource PAPI_FP_STAL 0x8000003a No No Cycles the FP unit(s) are stalled PAPI_TOT_CYC 0x8000003b Yes No Total cycles PAPI_LST_INS 0x8000003c Yes Yes Load/store instructions completed PAPI_SYC_INS 0x8000003d No No Synchronization instructions completed PAPI_L1_DCH 0x8000003e No No Level 1 data cache hits PAPI_L2_DCH 0x8000003f Yes Yes Level 2 data cache hits PAPI_L1_DCA 0x80000040 No No Level 1 data cache accesses PAPI_L2_DCA 0x80000041 Yes No Level 2 data cache accesses PAPI_L3_DCA 0x80000042 Yes Yes Level 3 data cache accesses PAPI_L1_DCR 0x80000043 No No Level 1 data cache reads PAPI_L2_DCR 0x80000044 Yes No Level 2 data cache reads PAPI_L3_DCR 0x80000045 Yes No Level 3 data cache reads PAPI_L1_DCW 0x80000046 No No Level 1 data cache writes PAPI_L2_DCW 0x80000047 Yes No Level 2 data cache writes PAPI_L3_DCW 0x80000048 Yes No Level 3 data cache writes PAPI_L1_ICH 0x80000049 Yes No Level 1 instruction cache hits PAPI_L2_ICH 0x8000004a Yes No Level 2 instruction cache hits PAPI_L3_ICH 0x8000004b No No Level 3 instruction cache hits PAPI_L1_ICA 0x8000004c Yes No Level 1 instruction cache accesses PAPI_L2_ICA 0x8000004d Yes No Level 2 instruction cache accesses PAPI_L3_ICA 0x8000004e Yes No Level 3 instruction cache accesses PAPI_L1_ICR 0x8000004f Yes No Level 1 instruction cache reads PAPI_L2_ICR 0x80000050 Yes No Level 2 instruction cache reads PAPI_L3_ICR 0x80000051 Yes No Level 3 instruction cache reads PAPI_L1_ICW 0x80000052 No No Level 1 instruction cache writes PAPI_L2_ICW 0x80000053 No No Level 2 instruction cache writes PAPI_L3_ICW 0x80000054 No No Level 3 instruction cache writes PAPI_L1_TCH 0x80000055 No No Level 1 total cache hits PAPI_L2_TCH 0x80000056 Yes Yes Level 2 total cache hits PAPI_L3_TCH 0x80000057 No No Level 3 total cache hits PAPI_L1_TCA 
0x80000058 No No Level 1 total cache accesses PAPI_L2_TCA 0x80000059 Yes No Level 2 total cache accesses PAPI_L3_TCA 0x8000005a Yes No Level 3 total cache accesses PAPI_L1_TCR 0x8000005b No No Level 1 total cache reads PAPI_L2_TCR 0x8000005c Yes Yes Level 2 total cache reads PAPI_L3_TCR 0x8000005d Yes Yes Level 3 total cache reads PAPI_L1_TCW 0x8000005e No No Level 1 total cache writes PAPI_L2_TCW 0x8000005f Yes No Level 2 total cache writes PAPI_L3_TCW 0x80000060 Yes No Level 3 total cache writes PAPI_FML_INS 0x80000061 No No Floating point multiply instructions PAPI_FAD_INS 0x80000062 No No Floating point add instructions PAPI_FDV_INS 0x80000063 No No Floating point divide instructions PAPI_FSQ_INS 0x80000064 No No Floating point square root instructions PAPI_FNV_INS 0x80000065 No No Floating point inverse instructions PAPI_FP_OPS 0x80000066 Yes Yes Floating point operations PAPI_SP_OPS 0x80000067 Yes Yes Floating point operations; optimized to count scaled single precision vector operations PAPI_DP_OPS 0x80000068 Yes Yes Floating point operations; optimized to count scaled double precision vector operations PAPI_VEC_SP 0x80000069 Yes No Single precision vector/SIMD instructions PAPI_VEC_DP 0x8000006a Yes No Double precision vector/SIMD instructions ------------------------------------------------------------------------- Of 107 possible events, 57 are available, of which 14 are derived. 
avail.c PASSED
==22559== 
==22559== HEAP SUMMARY:
==22559==     in use at exit: 253,706 bytes in 893 blocks
==22559==   total heap usage: 7,217 allocs, 6,324 frees, 6,530,071 bytes allocated
==22559== 
==22559== LEAK SUMMARY:
==22559==    definitely lost: 38,720 bytes in 39 blocks
==22559==    indirectly lost: 0 bytes in 0 blocks
==22559==      possibly lost: 0 bytes in 0 blocks
==22559==    still reachable: 214,986 bytes in 854 blocks
==22559==         suppressed: 0 bytes in 0 blocks
==22559== Rerun with --leak-check=full to see details of leaked memory
==22559== 
==22559== For counts of detected and suppressed errors, rerun with: -v
==22559== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 6)
In an effort to better understand what is going on, I have made a scratch build of valgrind 3.7.0 (with cpuid support, if needed) for Fedora 16. Running papi_avail under it should show whether there are problems with the memory allocation in the lmsensors code. The Fedora 16 scratch build of valgrind is at:

http://koji.fedoraproject.org/koji/taskinfo?taskID=3858127

After installing valgrind-3.7.0-1.fc16.cpuid.x86_64.rpm from

http://koji.fedoraproject.org/koji/taskinfo?taskID=3858128

or valgrind-3.7.0-1.fc16.cpuid.i686.rpm from

http://koji.fedoraproject.org/koji/taskinfo?taskID=3858129

you should be able to get some information about memory allocation problems with:

valgrind papi_avail
William,

Here goes the output:

$ rpm -q valgrind
valgrind-3.7.0-1.fc16.cpuid.x86_64

$ valgrind /usr/bin/papi_avail
==31273== Memcheck, a memory error detector
==31273== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==31273== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==31273== Command: /usr/bin/papi_avail
==31273== 
==31273== Invalid write of size 4
==31273==    at 0x367DC29345: createNativeEvents (linux-lmsensors.c:116)
==31273==    by 0x367DC2945F: LM_SENSORS_init_substrate (linux-lmsensors.c:191)
==31273==    by 0x367DC1A6B7: _papi_hwi_init_global (papi_internal.c:1420)
==31273==    by 0x367DC1768C: PAPI_library_init (papi.c:601)
==31273==    by 0x40156E: main (avail.c:163)
==31273==  Address 0x5b01728 is 216 bytes inside a block of size 568 free'd
==31273==    at 0x520D8AE: free (vg_replace_malloc.c:427)
==31273==    by 0x3680A6E5CC: __fopen_internal (iofopen.c:98)
==31273==    by 0x3683A0388F: sensors_get_label (access.c:188)
==31273==    by 0x367DC29271: createNativeEvents (linux-lmsensors.c:80)
==31273==    by 0x367DC2945F: LM_SENSORS_init_substrate (linux-lmsensors.c:191)
==31273==    by 0x367DC1A6B7: _papi_hwi_init_global (papi_internal.c:1420)
==31273==    by 0x367DC1768C: PAPI_library_init (papi.c:601)
==31273==    by 0x40156E: main (avail.c:163)
==31273== 
Available events and hardware information.
-------------------------------------------------------------------------------- PAPI Version : 4.2.1.0 Vendor string and code : GenuineIntel (1) Model string and code : Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz (30) CPU Revision : 5.000000 CPUID Info : Family: 6 Model: 30 Stepping: 5 CPU Megahertz : 1200.000000 CPU Clock Megahertz : 1200 Hdw Threads per core : 2 Cores per Socket : 4 NUMA Nodes : 1 CPUs per Node : 8 Total CPUs : 8 Number Hardware Counters : 7 Max Multiplex Counters : 64 -------------------------------------------------------------------------------- Name Code Avail Deriv Description (Note) PAPI_L1_DCM 0x80000000 Yes No Level 1 data cache misses PAPI_L1_ICM 0x80000001 Yes No Level 1 instruction cache misses . . . PAPI_VEC_SP 0x80000069 Yes No Single precision vector/SIMD instructions PAPI_VEC_DP 0x8000006a Yes No Double precision vector/SIMD instructions ------------------------------------------------------------------------- Of 107 possible events, 57 are available, of which 14 are derived. avail.c PASSED ==31273== ==31273== HEAP SUMMARY: ==31273== in use at exit: 252,620 bytes in 980 blocks ==31273== total heap usage: 7,856 allocs, 6,876 frees, 6,814,488 bytes allocated ==31273== ==31273== LEAK SUMMARY: ==31273== definitely lost: 38,720 bytes in 39 blocks ==31273== indirectly lost: 0 bytes in 0 blocks ==31273== possibly lost: 0 bytes in 0 blocks ==31273== still reachable: 213,900 bytes in 941 blocks ==31273== suppressed: 0 bytes in 0 blocks ==31273== Rerun with --leak-check=full to see details of leaked memory ==31273== ==31273== For counts of detected and suppressed errors, rerun with: -v ==31273== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 2 from 2)
$ rpm -q lm_sensors
lm_sensors-3.3.1-1.fc16.x86_6

After adding a couple of printfs, I get a Radeon video driver event and then a buffer overrun:

$ valgrind papi_avail
==4079== Memcheck, a memory error detector
==4079== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==4079== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==4079== Command: papi_avail
==4079== 
LM_SENSORS_init_substrate: Number of LM_SENSORS events = 1
createNativeEvents: 000: radeon-pci-0100.temp1.temp1_input
==4079== Invalid write of size 4
==4079==    at 0x5265388: ??? (in /usr/lib64/libpapi.so.4.2.1.0)
==4079==    by 0x52654DE: LM_SENSORS_init_substrate (in /usr/lib64/libpapi.so.4.2.1.0)
==4079==    by 0x52566B7: _papi_hwi_init_global (in /usr/lib64/libpapi.so.4.2.1.0)
==4079==    by 0x525368C: PAPI_library_init (in /usr/lib64/libpapi.so.4.2.1.0)
==4079==    by 0x40156E: main (avail.c:163)
...

The problem appears to be in this piece of code in createNativeEvents(). With a single event, id has already been incremented past the last table entry when the final assignment runs:

...
            /* increment the table index counter */
            id++;     /* <-- PROBLEM */
        }

        lm_sensors_native_table[id].count = count + 1;   /* Crashes here */
...
The buffer overrun is also detected by valgrind in RHEL6.2/SL6.2 (but the papi apps don't crash):

* debuginfo-install papi
* modprobe coretemp # to add events detected by lm_sensors
* valgrind papi_avail

--------
# valgrind papi_avail
==1758== Memcheck, a memory error detector
==1758== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==1758== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info
==1758== Command: papi_avail
==1758== 
==1758== Invalid write of size 4
==1758==    at 0x3F2E628AD9: createNativeEvents (linux-lmsensors.c:116)
==1758==    by 0x3F2E628C0D: LM_SENSORS_init_substrate (linux-lmsensors.c:191)
==1758==    by 0x3F2E618AA7: _papi_hwi_init_global (papi_internal.c:1420)
==1758==    by 0x3F2E61677C: PAPI_library_init (papi.c:601)
==1758==    by 0x40180A: main (avail.c:163)
==1758==  Address 0x5746328 is 216 bytes inside a block of size 568 free'd
==1758==    at 0x520C95D: free (vg_replace_malloc.c:366)
==1758==    by 0x3D8746584C: fclose@@GLIBC_2.2.5 (iofclose.c:88)
==1758==    by 0x3F2EA05022: sensors_get_label (access.c:190)
==1758==    by 0x3F2E6289EF: createNativeEvents (linux-lmsensors.c:80)
==1758==    by 0x3F2E628C0D: LM_SENSORS_init_substrate (linux-lmsensors.c:191)
==1758==    by 0x3F2E618AA7: _papi_hwi_init_global (papi_internal.c:1420)
==1758==    by 0x3F2E61677C: PAPI_library_init (papi.c:601)
==1758==    by 0x40180A: main (avail.c:163)
==1758== 
....
-----
commit 0526b12537d187bee8dac734026578d5f2b9035e
Author: Vince Weaver <vweaver1.edu>
Date:   Fri Mar 9 14:41:14 2012 -0500

    Fix buffer overrun in lmsensors component
The patch has been applied and a new RPM built, papi-4.2.1-2. Could you verify that this fixes the problem?

http://koji.fedoraproject.org/koji/taskinfo?taskID=3875511
(In reply to comment #12)
> The patch has been applied and a new RPM built, papi-4.2.1-2. Could you verify
> this fixes the problem.
>
> http://koji.fedoraproject.org/koji/taskinfo?taskID=3875511

It no longer crashes in F16 x86_64, and valgrind no longer detects invalid writes in F16 x86_64 or in SL6.2 x86_64.

/jpo
According to comment 13, this is fixed with the patch from upstream.