Bug 1158802

Summary: are_valid_hwcaps() assertion fail makes valgrind unusable on (qemu emulated) Haswell x86_64
Product: Red Hat Enterprise Linux 6 Reporter: Lubomir Rintel <lrintel>
Component: valgrind Assignee: Mark Wielaard <mjw>
Status: CLOSED ERRATA QA Contact: Miloš Prchlík <mprchlik>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.7 CC: jakub, mbenitez, mcermak, mfranc, mprchlik
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: valgrind-3.8.1-4.el6 Doc Type: Bug Fix
Doc Text:
Valgrind assumed that a processor that supported the Advanced Vector Extensions 2 (AVX2) instruction set also always supported the Leading Zeros Count (LZCNT) instruction. This is not always true under QEMU, which can support AVX2 instructions, but not LZCNT. Consequently, Valgrind failed to run under QEMU when AVX2 instructions were enabled. Valgrind has been fixed to be able to run when the AVX2 instruction set is supported but the LZCNT instruction is not, and Valgrind now runs under QEMU as expected.
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-07-22 06:23:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Lubomir Rintel 2014-10-30 09:31:45 UTC
Description of problem:

[lkundrak@rhel6-1 ~]$ valgrind ls
==2933== Memcheck, a memory error detector
==2933== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==2933== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==2933== Command: ls
==2933== 

vex: priv/main_main.c:319 (LibVEX_Translate): Assertion `are_valid_hwcaps(VexArchAMD64, vta->archinfo_host.hwcaps)' failed.
vex storage: T total 0 bytes allocated
vex storage: P total 0 bytes allocated

valgrind: the 'impossible' happened:
   LibVEX called failure_exit().
==2933==    at 0x38031DA7: report_and_quit (m_libcassert.c:235)
==2933==    by 0x38031E0E: panic (m_libcassert.c:319)
==2933==    by 0x38031E68: vgPlain_core_panic_at (m_libcassert.c:324)
==2933==    by 0x38031E7A: vgPlain_core_panic (m_libcassert.c:329)
==2933==    by 0x3804D162: failure_exit (m_translate.c:708)
==2933==    by 0x380D4248: vex_assert_fail (main_util.c:219)
==2933==    by 0x380D2619: LibVEX_Translate (main_main.c:319)
==2933==    by 0x3804AACE: vgPlain_translate (m_translate.c:1559)
==2933==    by 0x38079D9F: vgPlain_scheduler (scheduler.c:991)
==2933==    by 0x380A5A29: run_a_thread_NORETURN (syswrap-linux.c:103)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==2933==    at 0x3EAF200B00: ??? (in /lib64/ld-2.12.so)


Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using.  Thanks.

[lkundrak@rhel6-1 ~]$ 

Version-Release number of selected component (if applicable):

valgrind-3.8.1-3.7.el6.x86_64

How reproducible:

Always.

Additional info:

I'm running this virtualized on Fedora 20 with libvirt. The qemu command line is:

qemu     26149  9.9 10.7 4672352 1737044 ?     Sl   10:16   1:25 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name rhel6-1 -S -machine pc-i440fx-1.6,accel=kvm,usb=off -cpu Haswell -m 2048 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 202b07f7-3244-4170-b8f0-9b2beb7e1ce7 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel6-1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/var/lib/libvirt/images/rhel6-1.qcow,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:d8:dd:69,bus=pci.0,addr=0x3 -netdev tap,fd=28,id=hostnet1,vhost=on,vhostfd=32 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:d6:da:3f,bus=pci.0,addr=0x9 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice port=5901,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device 
usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -chardev spicevmc,id=charredir2,name=usbredir -device usb-redir,chardev=charredir2,id=redir2 -chardev spicevmc,id=charredir3,name=usbredir -device usb-redir,chardev=charredir3,id=redir3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8


Not sure what other information is relevant. Please ask.

[lkundrak@rhel6-1 ~]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             4
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 60
Stepping:              1
CPU MHz:               2793.530
BogoMIPS:              5587.06
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
NUMA node0 CPU(s):     0-3
[lkundrak@rhel6-1 ~]$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel Core Processor (Haswell)
stepping	: 1
microcode	: 1
cpu MHz		: 2793.530
cache size	: 4096 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good unfair_spinlock pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm xsaveopt fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm
bogomips	: 5587.06
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel Core Processor (Haswell)
stepping	: 1
microcode	: 1
cpu MHz		: 2793.530
cache size	: 4096 KB
physical id	: 1
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good unfair_spinlock pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm xsaveopt fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm
bogomips	: 5587.06
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel Core Processor (Haswell)
stepping	: 1
microcode	: 1
cpu MHz		: 2793.530
cache size	: 4096 KB
physical id	: 2
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good unfair_spinlock pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm xsaveopt fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm
bogomips	: 5587.06
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel Core Processor (Haswell)
stepping	: 1
microcode	: 1
cpu MHz		: 2793.530
cache size	: 4096 KB
physical id	: 3
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good unfair_spinlock pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm xsaveopt fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm
bogomips	: 5587.06
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

[lkundrak@rhel6-1 ~]$

Comment 2 Mark Wielaard 2014-11-03 10:16:25 UTC
I'll try to replicate this issue. In the meantime, could you provide the output of "rpm -q valgrind" and "valgrind -v /bin/true"?

Thanks,

Mark

Comment 3 Mark Wielaard 2014-11-03 15:45:27 UTC
Replicated against valgrind-3.8.1-3.7.el6.x86_64:

$ valgrind -v /bin/true
==4146== Memcheck, a memory error detector
==4146== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==4146== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==4146== Command: /bin/true
==4146== 
--4146-- Valgrind options:
--4146--    -v
--4146-- Contents of /proc/version:
--4146--   Linux version 2.6.32-504.el6.x86_64 (mockbuild@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC) ) #1 SMP Tue Sep 16 01:56:35 EDT 2014
--4146-- Arch and hwcaps: AMD64, INVALID
[...]

As a quick workaround, try valgrind from DTS-3, part of rhel-server-rhscl-6-rpms.
devtoolset-3-valgrind-3.9.0-8.3.el6.x86_64 works as expected on the same setup:

==4182== Memcheck, a memory error detector
==4182== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4182== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==4182== Command: /bin/true
==4182== 
--4182-- Valgrind options:
--4182--    -v
--4182-- Contents of /proc/version:
--4182--   Linux version 2.6.32-504.el6.x86_64 (mockbuild@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC) ) #1 SMP Tue Sep 16 01:56:35 EDT 2014
--4182-- Arch and hwcaps: AMD64, amd64-cx16-rdtscp-sse3-avx-avx2-bmi
[...]

Comment 4 Mark Wielaard 2014-11-03 19:26:44 UTC
The issue is that the valgrind hwcaps check expects AVX2-capable CPUs (such as Haswell) to always have lzcnt available. That is true on real CPUs, but not always under qemu.

Valgrind actually handles that combination fine, so the simplest fix is just to add this case to the hwcaps sanity check:

--- VEX/priv/main_main.c.orig	2014-11-03 20:15:32.647070331 +0100
+++ VEX/priv/main_main.c	2014-11-03 20:12:04.772687750 +0100
@@ -1147,6 +1147,10 @@
            | VEX_HWCAPS_AMD64_BMI:
          return "amd64-sse3-cx16-lzcnt-avx-bmi";
       case VEX_HWCAPS_AMD64_SSE3 | VEX_HWCAPS_AMD64_CX16
+           | VEX_HWCAPS_AMD64_AVX
+           | VEX_HWCAPS_AMD64_BMI | VEX_HWCAPS_AMD64_AVX2:
+         return "amd64-sse3-cx16-avx2-bmi";
+      case VEX_HWCAPS_AMD64_SSE3 | VEX_HWCAPS_AMD64_CX16
            | VEX_HWCAPS_AMD64_LZCNT | VEX_HWCAPS_AMD64_AVX
            | VEX_HWCAPS_AMD64_BMI | VEX_HWCAPS_AMD64_AVX2:
          return "amd64-sse3-cx16-lzcnt-avx2-bmi";

Upstream made this check much saner (which is why this issue isn't seen with valgrind 3.9.0 or later); see VEX svn r2701. But that revision introduces some other changes too.
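
To illustrate why a missing case makes the assertion fail, here is a minimal sketch of a whitelist-style check. It assumes a combination counts as valid only if the lookup function has a name for it. The CAP_* constants, their values, and both function names are hypothetical stand-ins for this sketch, not the real VEX definitions; only the three name strings come from the patch above.

```c
#include <stddef.h>

/* Hypothetical capability bits. The names echo the VEX_HWCAPS_AMD64_*
   macros in the patch, but the values are made up for this sketch. */
enum {
    CAP_SSE3  = 1 << 0,
    CAP_CX16  = 1 << 1,
    CAP_LZCNT = 1 << 2,
    CAP_AVX   = 1 << 3,
    CAP_BMI   = 1 << 4,
    CAP_AVX2  = 1 << 5
};

/* Whitelist lookup: return a name for a known combination, else NULL.
   `patched` selects whether the AVX2-without-LZCNT case is accepted,
   modelling the behaviour before and after the patch. */
const char *show_hwcaps_sketch(unsigned caps, int patched)
{
    switch (caps) {
    case CAP_SSE3 | CAP_CX16 | CAP_LZCNT | CAP_AVX | CAP_BMI:
        return "amd64-sse3-cx16-lzcnt-avx-bmi";
    case CAP_SSE3 | CAP_CX16 | CAP_AVX | CAP_BMI | CAP_AVX2:
        /* The combination seen under qemu -cpu Haswell in this report. */
        return patched ? "amd64-sse3-cx16-avx2-bmi" : NULL;
    case CAP_SSE3 | CAP_CX16 | CAP_LZCNT | CAP_AVX | CAP_BMI | CAP_AVX2:
        return "amd64-sse3-cx16-lzcnt-avx2-bmi";
    default:
        return NULL;
    }
}

/* Assumed validity rule: a combination is valid if and only if it has a name. */
int hwcaps_are_valid_sketch(unsigned caps, int patched)
{
    return show_hwcaps_sketch(caps, patched) != NULL;
}
```

With these definitions, hwcaps_are_valid_sketch(CAP_SSE3 | CAP_CX16 | CAP_AVX | CAP_BMI | CAP_AVX2, 0) is false and the same call with patched set to 1 is true. Any combination that is not listed falls through to the default case, which is the drawback of enumerating combinations by hand.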

If there is a way for qemu to pass through the lzcnt cpuid flag (support is indicated via the CPUID.80000001H:ECX.ABM[Bit 5] flag), that would be another workaround.
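
To check whether a guest is affected, something like the following sketch could be used. It tests bit 5 of an ECX value obtained from CPUID leaf 0x80000001, and separately looks for whole-word tokens in a /proc/cpuinfo "flags" line. It assumes Linux reports this bit as "abm" in /proc/cpuinfo; all function names here are hypothetical helpers, not part of valgrind or qemu.

```c
#include <stdint.h>
#include <string.h>

/* Bit 5 of ECX from CPUID leaf 0x80000001 (ABM) indicates LZCNT support.
   The caller is expected to supply the ECX value. */
int ecx_has_lzcnt(uint32_t ecx)
{
    return (int)((ecx >> 5) & 1u);
}

/* Return 1 if `flag` occurs as a whole whitespace-separated token in
   `flags`, so that "bmi" does not match "bmi1" or "bmi2". */
int has_cpu_flag(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    if (n == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += 1; /* keep scanning after a partial match */
    }
    return 0;
}

/* The combination from this report: AVX2 advertised, LZCNT ("abm") not. */
int avx2_without_lzcnt(const char *flags)
{
    return has_cpu_flag(flags, "avx2") && !has_cpu_flag(flags, "abm");
}
```

Passing one of the flags lines from the /proc/cpuinfo output in the description to avx2_without_lzcnt() returns 1, since that line lists avx2 but has no abm entry.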

Comment 8 Miloš Prchlík 2015-02-24 15:04:05 UTC
Verified for build valgrind-3.8.1-7.el6.

Comment 10 errata-xmlrpc 2015-07-22 06:23:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1298.html