Bug 750739

Summary: Work around valgrind choking on our use of memalign()
Product: Red Hat Enterprise Linux 6 Reporter: Markus Armbruster <armbru>
Component: qemu-kvm Assignee: Markus Armbruster <armbru>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.2 CC: acathrow, areis, bsarathy, chayang, juzhang, minovotn, mkenneth, sluo, tburke, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.229.el6 Doc Type: Bug Fix
Doc Text:
No Documentation Needed
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-06-20 11:35:46 UTC Type: ---

Description Markus Armbruster 2011-11-02 09:29:43 UTC
Description of problem:
Valgrind aborts when memalign is called with an alignment larger than 1 MiB.  We use 2 MiB.
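
For illustration, the call pattern that triggers the abort can be sketched as below. This is a minimal standalone sketch, not the qemu-kvm code: alloc_aligned() is a hypothetical helper, and the 2 MiB alignment and 384 MiB size are simply the values that appear in this report. Run natively, the allocation is expected to succeed; according to this report, the same request makes Valgrind abort.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* for posix_memalign() */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Values from this report: 2 MiB alignment, 384 MiB of guest RAM (-m 384). */
#define ALIGN_2MIB ((size_t)2 << 20)
#define RAM_SIZE   ((size_t)384 << 20)

/* Hypothetical helper: returns memory aligned to `align`, or NULL on failure. */
static void *alloc_aligned(size_t align, size_t size)
{
    void *ptr = NULL;

    /* posix_memalign() requires a power of two that is a multiple of sizeof(void *). */
    assert(align % sizeof(void *) == 0 && (align & (align - 1)) == 0);
    if (posix_memalign(&ptr, align, size) != 0) {
        return NULL;
    }
    return ptr;
}
```

Calling alloc_aligned(ALIGN_2MIB, RAM_SIZE) mirrors the request seen in the trace; the returned pointer is released with free().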

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. valgrind rhel6-qemu-kvm --nodefaults --no-kvm -vnc :0 -S -m 384 -monitor stdio
  
Actual results:
Dies like this:

==5275== Memcheck, a memory error detector
==5275== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==5275== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==5275== Command: rhel6-qemu-kvm --nodefaults --no-kvm -vnc :0 -S -m 384 -monitor stdio
==5275== 
VG_(arena_memalign)(0x388B0658, 2097152, 402653184)
bad alignment value 2097152
(it is too small, too big, or not a power of two)
valgrind: the 'impossible' happened:
   VG_(arena_memalign)
==5275==    at 0x3804AD06: report_and_quit (m_libcassert.c:209)
==5275==    by 0x3804AD63: panic (m_libcassert.c:293)
==5275==    by 0x3804AF18: vgPlain_core_panic_at (m_libcassert.c:298)
==5275==    by 0x3804AF2A: vgPlain_core_panic (m_libcassert.c:303)
==5275==    by 0x38056B36: vgPlain_arena_memalign (m_mallocfree.c:1599)
==5275==    by 0x3802051C: vgMemCheck_new_block (mc_malloc_wrappers.c:201)
==5275==    by 0x380207CF: vgMemCheck_memalign (mc_malloc_wrappers.c:268)
==5275==    by 0x38084299: vgPlain_scheduler (scheduler.c:1409)
==5275==    by 0x38093684: run_a_thread_NORETURN (syswrap-linux.c:94)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==5275==    at 0x4A049B8: memalign (vg_replace_malloc.c:581)
==5275==    by 0x4A04A67: posix_memalign (vg_replace_malloc.c:709)
==5275==    by 0x478F63: qemu_memalign (osdep.c:108)
==5275==    by 0x4D89F1: qemu_ram_alloc_from_ptr (exec.c:2730)
==5275==    by 0x4D8BC9: qemu_ram_alloc (exec.c:2777)
==5275==    by 0x44B57F: pc_init_pci (pc.c:1115)
==5275==    by 0x44C717: pc_init_rhel620 (pc.c:1611)
==5275==    by 0x40EE0E: main (vl.c:6262)


Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using.  Thanks.
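
The numbers in the VG_(arena_memalign) line above can be decoded with a little arithmetic. The sketch below assumes the second and third arguments are the requested alignment and size in bytes, which is consistent with the 2 MiB alignment named in the description and with -m 384 on the command line. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MIB ((size_t)1 << 20)

/* True if x is a nonzero power of two. */
static bool is_power_of_two(size_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Converts a byte count to whole MiB, rounding down. */
static size_t bytes_to_mib(size_t bytes)
{
    return bytes / MIB;
}

/* Self-check against the numbers printed in the trace above. */
static void check_trace_values(void)
{
    assert(is_power_of_two(2097152));       /* alignment is a power of two */
    assert(bytes_to_mib(2097152) == 2);     /* alignment: 2 MiB */
    assert(bytes_to_mib(402653184) == 384); /* size: 384 MiB */
}
```

Since 2097152 is a power of two, the "not a power of two" reason in the message does not apply here, which fits the description's statement that the alignment is larger than Valgrind accepts.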

Expected results:
valgrind runs normally.

Additional info:
Upstream commit c2a8238a works around valgrind's alignment restriction.
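
One possible shape for such a workaround is sketched below: fall back to page alignment when the process appears to be running under Valgrind. This is an illustration only and not the text of commit c2a8238a. The function names, the PREFERRED_ALIGN macro, and the LD_PRELOAD heuristic for detecting Valgrind are all assumptions made for this sketch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* for posix_memalign() */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PREFERRED_ALIGN ((size_t)2 << 20) /* 2 MiB, as in this report */

/* Assumed heuristic: Valgrind preloads libraries whose names contain "vgpreload". */
static bool probably_on_valgrind(void)
{
    const char *preload = getenv("LD_PRELOAD");
    return preload != NULL && strstr(preload, "vgpreload") != NULL;
}

/* Hypothetical policy: use the large alignment only for large blocks
 * and only when not running under Valgrind; otherwise use the page size. */
static size_t choose_alignment(size_t size, bool on_valgrind)
{
    size_t align = PREFERRED_ALIGN;

    if (size < align || on_valgrind) {
        long page = sysconf(_SC_PAGESIZE);
        assert(page > 0);
        align = (size_t)page;
    }
    return align;
}

/* Hypothetical allocator: returns aligned memory, or NULL on failure. */
static void *alloc_guest_ram(size_t size)
{
    void *ptr = NULL;
    size_t align = choose_alignment(size, probably_on_valgrind());

    return posix_memalign(&ptr, align, size) == 0 ? ptr : NULL;
}
```

Keeping the policy in choose_alignment() separate from the detection makes the fallback easy to exercise without actually running under Valgrind.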

Comment 2 Markus Armbruster 2011-12-21 09:52:20 UTC
In quick, superficial testing, valgrind complains about the KVM ioctls.  This is expected.  More experiments are needed to find out whether it's still useful.
With KVM disabled, I see many complaints related to TCG instead.  More than with upstream, I think.  No idea whether these are false positives or real defects.  Probably not worth investigating, as TCG isn't supported.

Comment 5 Michal Novotny 2012-05-03 17:59:35 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
No Documentation Needed

Comment 6 Sibiao Luo 2012-05-16 09:45:31 UTC
I reproduced and verified this issue as follows.

Steps to Reproduce:
1. Download valgrind-3.7.0.tar.bz2 from the valgrind download page: www.valgrind.org.
2. Install valgrind.
3. Run # valgrind /usr/libexec/qemu-kvm --nodefaults --no-kvm -vnc :0 -S -m 384 -monitor stdio

host info:
# uname -r && rpm -q qemu-kvm
2.6.32-270.el6.x86_64
qemu-kvm-0.12.1.2-2.226.el6.x86_64

Actual results when reproducing:
After step 3, valgrind dies like this:
# valgrind /usr/libexec/qemu-kvm --nodefaults --no-kvm -vnc :0 -S -m 384 -monitor stdio
==61515== Memcheck, a memory error detector
==61515== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==61515== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==61515== Command: /usr/libexec/qemu-kvm --nodefaults --no-kvm -vnc :0 -S -m 384 -monitor stdio
==61515== 
VG_(arena_memalign)(0x388E37D8, 2097152, 402653184)
bad alignment value 2097152
(it is too small, too big, or not a power of two)
valgrind: the 'impossible' happened:
   VG_(arena_memalign)
==61515==    at 0x3802EB67: report_and_quit (m_libcassert.c:210)
==61515==    by 0x3802EBCE: panic (m_libcassert.c:294)
==61515==    by 0x3802EC28: vgPlain_core_panic_at (m_libcassert.c:299)
==61515==    by 0x3802EC3A: vgPlain_core_panic (m_libcassert.c:304)
==61515==    by 0x3803C2E0: vgPlain_arena_memalign (m_mallocfree.c:1870)
==61515==    by 0x38003254: vgMemCheck_new_block (mc_malloc_wrappers.c:248)
==61515==    by 0x3800352F: vgMemCheck_memalign (mc_malloc_wrappers.c:315)
==61515==    by 0x3807206A: do_client_request (scheduler.c:1469)
==61515==    by 0x38073A70: vgPlain_scheduler (scheduler.c:1144)
==61515==    by 0x3809D159: run_a_thread_NORETURN (syswrap-linux.c:98)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==61515==    at 0x4A05988: memalign (vg_replace_malloc.c:694)
==61515==    by 0x4A059E1: posix_memalign (vg_replace_malloc.c:835)
==61515==    by 0x1E34E3: qemu_memalign (osdep.c:108)
==61515==    by 0x24E5BC: qemu_ram_alloc_from_ptr (exec.c:2737)
==61515==    by 0x1B0692: pc_init1.clone.0 (pc.c:1113)
==61515==    by 0x169506: main (vl.c:6255)


Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using.  Thanks.



Verified this issue using the same steps as for reproducing it.
host info:
# uname -r && rpm -q qemu-kvm
2.6.32-270.el6.x86_64
qemu-kvm-0.12.1.2-2.292.el6.x86_64

Actual results when verifying:
After step 3, valgrind runs normally.
# valgrind /usr/libexec/qemu-kvm --nodefaults --no-kvm -vnc :0 -S -m 384 -monitor stdio
==61475== Memcheck, a memory error detector
==61475== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==61475== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==61475== Command: /usr/libexec/qemu-kvm --nodefaults --no-kvm -vnc :0 -S -m 384 -monitor stdio
==61475== 
==61475== Warning: set address range perms: large range [0x18aaa000, 0x30aaa000) (undefined)
QEMU 0.12.1 monitor - type 'help' for more information
(qemu) info status 
VM status: paused (prelaunch)
(qemu) c
(qemu) info status 
VM status: running
(qemu) q
==61475== 
==61475== HEAP SUMMARY:
==61475==     in use at exit: 508,591,616 bytes in 910 blocks
==61475==   total heap usage: 3,064 allocs, 2,154 frees, 509,321,207 bytes allocated
==61475== 
==61475== LEAK SUMMARY:
==61475==    definitely lost: 23 bytes in 3 blocks
==61475==    indirectly lost: 0 bytes in 0 blocks
==61475==      possibly lost: 0 bytes in 0 blocks
==61475==    still reachable: 508,591,593 bytes in 907 blocks
==61475==         suppressed: 0 bytes in 0 blocks
==61475== Rerun with --leak-check=full to see details of leaked memory
==61475== 
==61475== For counts of detected and suppressed errors, rerun with: -v
==61475== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 30 from 7)


Based on the above, this issue has been fixed correctly.

Comment 7 Chao Yang 2012-05-16 12:06:19 UTC
Moving to VERIFIED based on Comment #6

Comment 8 errata-xmlrpc 2012-06-20 11:35:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0746.html