Bug 1305962

Summary: Valgrind reports uninitialised byte(s) in futex
Product: Red Hat Enterprise Linux 7
Component: valgrind
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Reporter: Slawomir Czarko <slawomir>
Assignee: Mark Wielaard <mjw>
QA Contact: Miloš Prchlík <mprchlik>
CC: jakub, jwakely, mbenitez, mcermak, mjw, mpolacek, mprchlik, ohudlick
Target Milestone: rc
Target Release: ---
Fixed In Version: valgrind-3.11.0-20.el7
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-11-04 02:55:50 UTC
Attachments:
  Failing program (flags: none)
  Failing program with annotations (flags: none)
  Preprocessed test.cpp (flags: none)
  Output of valgrind (flags: none)

Description Slawomir Czarko 2016-02-09 16:55:15 UTC
Created attachment 1122463: Failing program

Description of problem:

When valgrind runs the program compiled from test.cpp, it reports an error:

Syscall param futex(dummy) contains uninitialised byte(s)


Version-Release number of selected component (if applicable):
4.8.5-4.el7.x86_64 (gcc)

How reproducible:
Always

Steps to Reproduce:
1. g++ -std=c++11 -pthread -g -O0 -o mutex.bin test.cpp
2. valgrind --tool=memcheck mutex.bin

Actual results:
Valgrind reports the futex(dummy) error shown above.

Expected results:
No error reported


Comment 1 Slawomir Czarko 2016-02-09 16:57:40 UTC
Created attachment 1122464: Failing program with annotations

Changing any of the lines marked #if 1 to #if 0 makes valgrind stop reporting the error.

Comment 2 Slawomir Czarko 2016-02-09 16:58:22 UTC
Created attachment 1122465: Preprocessed test.cpp

Comment 3 Slawomir Czarko 2016-02-09 16:59:10 UTC
Created attachment 1122467: Output of valgrind

Comment 5 Jonathan Wakely 2016-02-09 17:05:30 UTC
probably https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68921

Comment 7 Jonathan Wakely 2016-02-09 17:07:00 UTC
(In reply to Jonathan Wakely from comment #5)
> probably https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68921

Oh, but that code isn't in gcc 4.8.5, so it can't be, sorry.

Comment 8 Mark Wielaard 2016-02-09 17:22:34 UTC
Valgrind backtrace with debuginfo installed, showing line numbers:

==5930== Syscall param futex(dummy) contains uninitialised byte(s)
==5930==    at 0x5663F8C: __lll_timedlock_wait (lowlevellock.S:190)
==5930==    by 0x5660A97: _L_timedlock_67 (pthread_mutex_timedlock.c:236)
==5930==    by 0x566032B: pthread_mutex_timedlock (pthread_mutex_timedlock.c:83)
==5930==    by 0x401108: __gthread_mutex_timedlock(pthread_mutex_t*, timespec const*) (gthr-default.h:768)
==5930==    by 0x4029A7: bool std::timed_mutex::_M_try_lock_until<std::chrono::duration<long, std::ratio<1l, 1000000000l> > >(std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > > const&) (mutex:286)
==5930==    by 0x4023DE: bool std::timed_mutex::_M_try_lock_for<long, std::ratio<1l, 1000l> >(std::chrono::duration<long, std::ratio<1l, 1000l> > const&) (mutex:267)
==5930==    by 0x401FBE: bool std::timed_mutex::try_lock_for<long, std::ratio<1l, 1000l> >(std::chrono::duration<long, std::ratio<1l, 1000l> > const&) (mutex:240)
==5930==    by 0x401AFC: std::unique_lock<std::timed_mutex>::unique_lock<long, std::ratio<1l, 1000l> >(std::timed_mutex&, std::chrono::duration<long, std::ratio<1l, 1000l> > const&) (mutex:470)
==5930==    by 0x40128B: worker() (test.cpp:30)
==5930==    by 0x403504: void std::_Bind_simple<void (*())()>::_M_invoke<>(std::_Index_tuple<>) (functional:1732)
==5930==    by 0x40345E: std::_Bind_simple<void (*())()>::operator()() (functional:1720)
==5930==    by 0x4033F7: std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run() (thread:115)
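
The attached test.cpp is not reproduced in this report. Judging only from the frames above (worker() constructing a std::unique_lock<std::timed_mutex> with a millisecond timeout inside a std::thread), a program of roughly the following shape should reach pthread_mutex_timedlock on a contended mutex. This is a sketch under those assumptions, not the reporter's code; the names g_mutex, worker_try_lock and contended_attempt and the durations are made up for illustration.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the mutex used in the attached test.cpp.
static std::timed_mutex g_mutex;

// Mirrors the frame "unique_lock(timed_mutex&, duration const&)":
// tries to take the mutex for at most `timeout`.
// Returns true if the lock was acquired.
bool worker_try_lock(std::chrono::milliseconds timeout)
{
    std::unique_lock<std::timed_mutex> lock(g_mutex, timeout);
    return lock.owns_lock();
}

// The calling thread holds the mutex for `hold` while a worker thread
// tries to acquire it with `timeout`, so the timed lock is contended.
// Returns whether the worker got the mutex.
bool contended_attempt(std::chrono::milliseconds hold,
                       std::chrono::milliseconds timeout)
{
    bool acquired = false;
    std::unique_lock<std::timed_mutex> owner(g_mutex);
    std::thread t([&] { acquired = worker_try_lock(timeout); });
    std::this_thread::sleep_for(hold);
    owner.unlock();
    t.join();  // join orders the worker's write before the read below
    return acquired;
}
```

Calling contended_attempt from main, building with the g++ command from the steps to reproduce, and running the result under memcheck is one way to check whether a given valgrind build still prints the futex(dummy) message.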

Comment 9 Mark Wielaard 2016-02-09 17:26:20 UTC
The check that valgrind performs in PRE(sys_futex) (coregrind/m_syswrap/syswrap-linux.c):

   case VKI_FUTEX_WAIT_BITSET:
      /* Check that the address at least begins in client-accessible area. */
      if (!VG_(am_is_valid_for_client)( ARG1, 1, VKI_PROT_READ )) {
            SET_STATUS_Failure( VKI_EFAULT );
            return;
      }
      if (*(vki_u32 *)ARG1 != ARG3) {
         PRE_REG_READ5(long, "futex",
                       vki_u32 *, futex, int, op, int, val,
                       struct timespec *, utime, int, dummy);
      } else {
         PRE_REG_READ6(long, "futex",
                       vki_u32 *, futex, int, op, int, val,
                       struct timespec *, utime, int, dummy, int, val3);
      }
      break;

Comment 10 Mark Wielaard 2016-02-09 20:23:51 UTC
It looks like argument 5 (uaddr2, aka dummy) really is not used for this operation. And for some reason, in this particular case futex is called with FUTEX_WAIT_BITSET while that argument register really contains uninitialized data.
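
To illustrate the kernel side of that observation, here is a minimal sketch of a raw futex call that passes a meaningless value as the fifth argument. The function name wait_bitset_with_junk_arg5 and the 0xdeadbeef value are invented for this example. The value is a defined constant, so this sketch does not itself trigger the memcheck warning; it only shows that the call behaves normally whatever argument 5 holds.

```cpp
#include <cerrno>
#include <cstdint>
#include <ctime>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// Calls futex(FUTEX_WAIT_BITSET) on `word`, expecting it to contain
// `expected`, with a junk fifth argument (uaddr2).
// Returns 0 on success, otherwise the errno set by the call.
int wait_bitset_with_junk_arg5(uint32_t *word, uint32_t expected)
{
    // FUTEX_WAIT_BITSET takes an absolute timeout; without
    // FUTEX_CLOCK_REALTIME it is measured against CLOCK_MONOTONIC.
    struct timespec deadline;
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_nsec += 10 * 1000 * 1000;          // now + 10 ms
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    // Stand-in for whatever happens to be in the register.
    void *junk = reinterpret_cast<void *>(0xdeadbeefUL);

    long rc = syscall(SYS_futex, word,
                      FUTEX_WAIT_BITSET | FUTEX_PRIVATE_FLAG,
                      expected, &deadline, junk,
                      FUTEX_BITSET_MATCH_ANY);
    return rc == -1 ? errno : 0;
}
```

If the word does not hold the expected value, the call fails immediately with EAGAIN; if it does and nobody wakes the waiter, it fails with ETIMEDOUT once the deadline passes. Neither outcome depends on the fifth argument.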

The following makes the issue go away:

diff --git a/coregrind/m_syswrap/syswrap-linux.c b/coregrind/m_syswrap/syswrap-linux.c
index f796969..b57436c 100644
--- a/coregrind/m_syswrap/syswrap-linux.c
+++ b/coregrind/m_syswrap/syswrap-linux.c
@@ -1154,13 +1154,16 @@ PRE(sys_futex)
             return;
       }
       if (*(vki_u32 *)ARG1 != ARG3) {
-         PRE_REG_READ5(long, "futex",
+         PRE_REG_READ4(long, "futex",
                        vki_u32 *, futex, int, op, int, val,
-                       struct timespec *, utime, int, dummy);
+                       struct timespec *, utime);
       } else {
-         PRE_REG_READ6(long, "futex",
+        /* Note argument 5 is unused, but argument 6 is used.
+           So we cannot just PRE_REG_READ6. Read argument 6 separately.  */
+         PRE_REG_READ4(long, "futex",
                        vki_u32 *, futex, int, op, int, val,
-                       struct timespec *, utime, int, dummy, int, val3);
+                       struct timespec *, utime);
+         PRA6("futex",int,val3);
       }
       break;
    case VKI_FUTEX_WAKE_BITSET:

I will have to discuss this upstream to see why the original code was written as it was, and whether this change seems sane.
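
The effect of the diff on which syscall arguments get a definedness check can be summarised with a toy model. This is not valgrind code: the function names are hypothetical, and the model only encodes the argument lists visible in the PRE_REG_READ5/PRE_REG_READ6 calls before the patch and the PRE_REG_READ4 plus PRA6 calls after it.

```cpp
#include <algorithm>
#include <vector>

// Arguments checked by the original FUTEX_WAIT_BITSET case.
// `word_equals_val` models the test *(vki_u32 *)ARG1 == ARG3.
std::vector<int> checked_args_before(bool word_equals_val)
{
    if (!word_equals_val)
        return {1, 2, 3, 4, 5};        // PRE_REG_READ5
    return {1, 2, 3, 4, 5, 6};         // PRE_REG_READ6
}

// Arguments checked after the patch: 1 to 4 always, 6 only when the
// futex word matches val, and never 5.
std::vector<int> checked_args_after(bool word_equals_val)
{
    std::vector<int> args{1, 2, 3, 4}; // PRE_REG_READ4
    if (word_equals_val)
        args.push_back(6);             // PRA6
    return args;
}

// True if an undefined value in argument `arg` would be reported,
// i.e. if that argument is among the checked ones.
bool reports_uninitialised(const std::vector<int> &checked, int arg)
{
    return std::find(checked.begin(), checked.end(), arg) != checked.end();
}
```

In this model an undefined argument 5 is flagged by both branches before the patch and by neither after it, while an undefined argument 6 (val3) is still flagged when the futex word matches val.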

Comment 11 Mark Wielaard 2016-02-17 21:37:18 UTC
The patch is now upstream (valgrind svn r15793) and in Fedora (valgrind-3.11.0-13.fc24).

Comment 12 Mark Wielaard 2016-02-18 11:17:15 UTC
Note that the fix as committed in valgrind svn r15793 caused a regression in helgrind and drd. A small extra fix is necessary; see upstream valgrind svn r15795.

Comment 14 Miloš Prchlík 2016-06-08 12:09:41 UTC
Verified for build valgrind-3.11.0-22.el7.

Comment 16 errata-xmlrpc 2016-11-04 02:55:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2297.html