Bug 828341

Summary: unhandled instruction bytes: 0xF 0x18 0x9 0x8B i686 PREFETCH instruction
Product: Red Hat Enterprise Linux 6
Reporter: Karel Volný <kvolny>
Component: valgrind
Assignee: Mark Wielaard <mjw>
Status: CLOSED ERRATA
QA Contact: Miloš Prchlík <mprchlik>
Severity: medium
Docs Contact:
Priority: medium
Version: 6.3
CC: fche, mbenitez, mfranc, mjw, mnewsome, mprchlik
Target Milestone: rc   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: valgrind-3.8.1-3.3.el6
Doc Type: Bug Fix
Doc Text:
Cause: Valgrind did not recognize that certain older i386 AMD Athlon processors support the mmxext (integer SSE) instruction set, a subset of the SSE1 instruction set. In particular, it did not recognize that these processors support the PREFETCH instruction.
Consequence: Running an application that used an mmxext instruction (in particular PREFETCH) under Valgrind on such a processor caused Valgrind to report an error about an unsupported instruction.
Fix: Valgrind now recognizes which processors implement the mmxext (integer SSE) instruction set.
Result: Programs that use these instructions now run under Valgrind on such processors.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-10-14 06:36:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1056252    

Description Karel Volný 2012-06-04 16:14:00 UTC
Filed from caserun https://tcms.engineering.redhat.com/run/40402/#caserun_1177439

Version-Release number of selected component (if applicable):
RHEL6.3-20120531.0

Steps to Reproduce: 
run the test /CoreOS/mysql/Security/CVE-2010-3679-BINLOG-use-unassigned-memory


Actual results: 
see https://beaker.engineering.redhat.com/tasks/executed?task=/CoreOS/mysql/Security/CVE-2010-3679-BINLOG-use-unassigned-memory&job_id=241806

valgrind output:
==9390== Memcheck, a memory error detector
==9390== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==9390== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info
==9390== Command: /usr/libexec/mysqld
==9390== 
120601 15:12:35  InnoDB: Initializing buffer pool, size = 8.0M
120601 15:12:36  InnoDB: Completed initialization of buffer pool
vex x86->IR: unhandled instruction bytes: 0xF 0x18 0x9 0x8B
==9390== valgrind: Unrecognised instruction at address 0x83aa31a.
==9390== Your program just tried to execute an instruction that Valgrind
==9390== did not recognise.  There are two possible reasons for this.
==9390== 1. Your program has a bug and erroneously jumped to a non-code
==9390==    location.  If you are running Memcheck and you just saw a
==9390==    warning about a bad jump, it's probably your program's fault.
==9390== 2. The instruction is legitimate but Valgrind doesn't handle it,
==9390==    i.e. it's Valgrind's fault.  If you think this is the case or
==9390==    you are not sure, please let us know and we'll try to fix it.
==9390== Either way, Valgrind will now raise a SIGILL signal which will
==9390== probably kill your program.
19:12:38 UTC - mysqld got signal 4 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=8384512
read_buffer_size=131072
max_used_connections=0
max_threads=151
thread_count=0
connection_count=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 337736 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x30000
/usr/libexec/mysqld(my_print_stacktrace+0x2e) [0x84c058e]
/usr/libexec/mysqld(handle_fatal_signal+0x484) [0x82e7d14]
/lib/libpthread.so.0() [0x96f8f8]
/usr/libexec/mysqld(dict_index_copy_rec_order_prefix+0x1a) [0x83aa31a]
/usr/libexec/mysqld(btr_pcur_store_position+0xbd) [0x844b84d]
/usr/libexec/mysqld(dict_check_tablespaces_and_store_max_id+0x3ba) [0x83b7c5a]
/usr/libexec/mysqld(innobase_start_or_create_for_mysql+0x13a8) [0x8427578]
/usr/libexec/mysqld() [0x8384542]
/usr/libexec/mysqld(ha_initialize_handlerton(st_plugin_int*)+0x40) [0x82da240]
/usr/libexec/mysqld() [0x8369cbf]
/usr/libexec/mysqld(plugin_init(int*, char**, int)+0x773) [0x836c203]
/usr/libexec/mysqld() [0x81eee4b]
/usr/libexec/mysqld(main+0x1a4) [0x81f1a44]
/lib/libc.so.6(__libc_start_main+0xe6) [0x7b0ce6]
/usr/libexec/mysqld() [0x81277c1]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
==9390== 
==9390== HEAP SUMMARY:
==9390==     in use at exit: 28,020,084 bytes in 31,833 blocks
==9390==   total heap usage: 31,941 allocs, 108 frees, 28,156,151 bytes allocated
==9390== 
==9390== LEAK SUMMARY:
==9390==    definitely lost: 96 bytes in 2 blocks
==9390==    indirectly lost: 0 bytes in 0 blocks
==9390==      possibly lost: 3,116 bytes in 7 blocks
==9390==    still reachable: 28,016,872 bytes in 31,824 blocks
==9390==         suppressed: 0 bytes in 0 blocks
==9390== Rerun with --leak-check=full to see details of leaked memory
==9390== 
==9390== For counts of detected and suppressed errors, rerun with: -v
==9390== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 33 from 10)

Expected results:
(no such errors)

Comment 1 RHEL Program Management 2012-12-14 07:04:08 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 2 Mark Wielaard 2013-03-13 18:33:57 UTC
RHEL 6.4 contains an updated valgrind version 3.8.1 (RHEL 6.3 had version 3.6.0). Does this still happen with the version of valgrind in RHEL 6.4?

Comment 3 Mark Wielaard 2013-06-17 21:46:08 UTC
Replicated on RHEL6.4 with valgrind-3.8.1-3.2.el6.i686

[root@athlon4 ~]# valgrind /usr/libexec/mysqld
==12739== Memcheck, a memory error detector
==12739== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==12739== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==12739== Command: /usr/libexec/mysqld
==12739== 
130617 17:43:25  InnoDB: Initializing buffer pool, size = 8.0M
130617 17:43:26  InnoDB: Completed initialization of buffer pool
vex x86->IR: unhandled instruction bytes: 0xF 0x18 0x9 0x8B
==12739== valgrind: Unrecognised instruction at address 0x83adeba.
==12739==    at 0x83ADEBA: dict_index_copy_rec_order_prefix (in /usr/libexec/mysqld)
==12739==    by 0x844F6BC: btr_pcur_store_position (in /usr/libexec/mysqld)
==12739==    by 0x83BB829: dict_check_tablespaces_and_store_max_id (in /usr/libexec/mysqld)
==12739==    by 0x842B457: innobase_start_or_create_for_mysql (in /usr/libexec/mysqld)
==12739==    by 0x8387ED1: ??? (in /usr/libexec/mysqld)
==12739==    by 0x82DD7FF: ha_initialize_handlerton(st_plugin_int*) (in /usr/libexec/mysqld)
==12739==    by 0x836D60E: ??? (in /usr/libexec/mysqld)
==12739==    by 0x836FB52: plugin_init(int*, char**, int) (in /usr/libexec/mysqld)
==12739==    by 0x81F027A: ??? (in /usr/libexec/mysqld)
==12739==    by 0x81F2E73: main (in /usr/libexec/mysqld)
==12739== Your program just tried to execute an instruction that Valgrind
==12739== did not recognise.  There are two possible reasons for this.
==12739== 1. Your program has a bug and erroneously jumped to a non-code
==12739==    location.  If you are running Memcheck and you just saw a
==12739==    warning about a bad jump, it's probably your program's fault.
==12739== 2. The instruction is legitimate but Valgrind doesn't handle it,
==12739==    i.e. it's Valgrind's fault.  If you think this is the case or
==12739==    you are not sure, please let us know and we'll try to fix it.
==12739== Either way, Valgrind will now raise a SIGILL signal which will
==12739== probably kill your program.
21:43:28 UTC - mysqld got signal 4 ;

Comment 4 Mark Wielaard 2013-06-17 21:59:20 UTC
(gdb) target remote | /usr/lib/valgrind/../../bin/vgdb --pid=12751
Remote debugging using | /usr/lib/valgrind/../../bin/vgdb --pid=12751
relaying data between gdb and process 12751
Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.12.so.debug...done.
done.
Loaded symbols for /lib/ld-linux.so.2
[Switching to Thread 12751]
0x003c6850 in _start () from /lib/ld-linux.so.2
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
dict_index_copy_rec_order_prefix (index=0x41bd718, 
    rec=0x596c08d "SYS_FOREIGN", n_fields=0xbeaa7a80, buf=0xbeaa7aa4, 
    buf_size=0xbeaa7aa8) at dict/dict0dict.c:3662
3662		UNIV_PREFETCH_R(rec);
(gdb) disassemble 
Dump of assembler code for function dict_index_copy_rec_order_prefix:
   0x083adea0 <+0>:	push   %ebp
   0x083adea1 <+1>:	mov    %esp,%ebp
   0x083adea3 <+3>:	push   %esi
   0x083adea4 <+4>:	push   %ebx
   0x083adea5 <+5>:	call   0x812d8b4 <__i686.get_pc_thunk.bx>
   0x083adeaa <+10>:	add    $0x36d752,%ebx
   0x083adeb0 <+16>:	lea    -0x20(%esp),%esp
   0x083adeb4 <+20>:	mov    0x8(%ebp),%edx
   0x083adeb7 <+23>:	mov    0xc(%ebp),%ecx
=> 0x083adeba <+26>:	prefetcht0 (%ecx)
   0x083adebd <+29>:	mov    0xc(%edx),%eax
   0x083adec0 <+32>:	test   $0x4,%al
   0x083adec2 <+34>:	jne    0x83adf0f <dict_index_copy_rec_order_prefix+111>
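
For reference, the byte sequence in the Valgrind message (0xF 0x18 0x9 0x8B) can be matched against the disassembly above by hand. The sketch below is a hypothetical helper, not Valgrind or VEX code. It decodes only the simplest register-indirect form of the 0F 18 /r prefetch group and assumes 32-bit addressing. The fourth byte, 0x8B, is not part of the prefetch: the instruction is three bytes long (+26 to +29), and 0x8B is consistent with the opcode of the mov that follows at +29.

```python
# Hypothetical helper, not Valgrind or VEX code: decode the register-indirect
# forms of the x86 "0F 18 /r" prefetch-hint group (32-bit addressing assumed).

PREFETCH_HINTS = {0: "prefetchnta", 1: "prefetcht0", 2: "prefetcht1", 3: "prefetcht2"}
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_prefetch(code):
    """Return (mnemonic, operand, length) for a 0F 18 /r prefetch, else None.

    Only mod == 0 with a plain base register is handled; SIB (rm == 4) and
    disp32 (rm == 5) encodings are left out to keep the sketch short.
    """
    if len(code) < 3 or code[0] != 0x0F or code[1] != 0x18:
        return None
    modrm = code[2]
    mod = modrm >> 6           # top two bits: addressing mode
    reg = (modrm >> 3) & 0b111  # middle three bits: selects the prefetch hint
    rm = modrm & 0b111          # low three bits: base register
    if reg not in PREFETCH_HINTS or mod != 0 or rm in (4, 5):
        return None
    return PREFETCH_HINTS[reg], "(%" + REGS32[rm] + ")", 3


# Bytes as reported by "vex x86->IR: unhandled instruction bytes".
print(decode_prefetch(bytes([0x0F, 0x18, 0x09, 0x8B])))
```

With ModRM 0x09 the fields are mod=0, reg=1, rm=1, which gives prefetcht0 (%ecx), the instruction gdb marks with "=>".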

Comment 5 mbenitez 2013-06-26 21:04:10 UTC
According to a conversation with Mark, this could be fixed easily and would be nice to have.
Proposing for 6.6 since valgrind is not in the 6.5 ACL.

Comment 6 Karel Volný 2013-07-02 13:30:48 UTC
Oops, sorry for not responding to the needinfo in a timely manner ... thanks for getting things sorted out without me, nice work.

Comment 7 Mark Wielaard 2013-08-15 13:47:06 UTC
Poking at this again I seem unable to replicate it.
Have to dig a little to see if this might be cpu specific.
Last time I was apparently using an actual athlon for the reproducer.

Comment 8 Mark Wielaard 2013-08-16 11:37:12 UTC
(In reply to Mark Wielaard from comment #7)
> Poking at this again I seem unable to replicate it.
> Have to dig a little to see if this might be cpu specific.
> Last time I was apparently using an actual athlon for the reproducer.

Right, on an actual athlon it does fail:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 6
model		: 2
model name	: AMD Athlon(tm) Processor
stepping	: 1
cpu MHz		: 700.035
cache size	: 512 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow up
bogomips	: 1400.07
clflush size	: 32
cache_alignment	: 32
address sizes	: 36 bits physical, 32 bits virtual
power management:

I am guessing valgrind checks the flags and assumes PREFETCH isn't supported on this.
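
To check quickly whether another machine falls into the same category, the flags line can be classified as in the sketch below. This is only an illustration: simd_level() is a hypothetical helper, it works on the text of /proc/cpuinfo, and it says nothing about how Valgrind itself detects CPU features.

```python
# Sketch only: classify a /proc/cpuinfo "flags" line by integer-SIMD level.
# simd_level() is a hypothetical helper, not part of Valgrind.

# Flags line copied from the Athlon above.
ATHLON_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 mtrr pge mca cmov pat pse36 "
    "mmx fxsr syscall mmxext 3dnowext 3dnow up"
)


def simd_level(flags_line):
    """Return 'sse', 'mmxext', 'mmx' or 'none' for a flags line."""
    flags = set(flags_line.split())  # exact tokens, so "mmx" != "mmxext"
    if "sse" in flags:
        return "sse"
    if "mmxext" in flags:
        return "mmxext"
    if "mmx" in flags:
        return "mmx"
    return "none"


print(simd_level(ATHLON_FLAGS))
```

A result of "mmxext" means the CPU advertises the integer SSE subset without full SSE, which is the case that fails here.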

Comment 9 Mark Wielaard 2013-08-16 14:06:46 UTC
So the problem is that this is an AMD 3DNow mmxext instruction.
https://en.wikipedia.org/wiki/3DNow!#3DNow.21_extensions
"The 19 new MMX instructions are a subset of Intel's SSE1 instruction set."

Valgrind supports either no SSE at all (the x86 baseline) or SSE1 (or higher) in full. I'll see whether I can add support for just this subset.
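
The difference between the two-level model and a model with an intermediate level can be sketched as follows. All names here (Caps, REQUIRED, detect_two_level, detect_three_level, can_translate) are hypothetical and chosen for illustration. This is not the VEX hwcaps code, and treating the levels as a strict linear order is a simplification.

```python
# Illustrative model only; none of these names exist in Valgrind or VEX.
from enum import IntEnum


class Caps(IntEnum):
    BASELINE = 0  # no SSE at all
    MMXEXT = 1    # integer SSE subset (Athlon)
    SSE1 = 2      # full SSE1


# Minimum capability level assumed for a few example instructions.
REQUIRED = {
    "mov": Caps.BASELINE,
    "prefetcht0": Caps.MMXEXT,
    "addps": Caps.SSE1,
}


def detect_two_level(flags):
    """All-or-nothing model: anything without full SSE is the baseline."""
    return Caps.SSE1 if "sse" in flags else Caps.BASELINE


def detect_three_level(flags):
    """Model with an intermediate level for CPUs that report mmxext."""
    if "sse" in flags:
        return Caps.SSE1
    if "mmxext" in flags:
        return Caps.MMXEXT
    return Caps.BASELINE


def can_translate(host, mnemonic):
    """True if a host at this level is allowed to use the instruction."""
    return host >= REQUIRED[mnemonic]


athlon = {"mmx", "mmxext", "3dnow", "3dnowext"}
print(can_translate(detect_two_level(athlon), "prefetcht0"))    # False
print(can_translate(detect_three_level(athlon), "prefetcht0"))  # True
print(can_translate(detect_three_level(athlon), "addps"))       # False
```

In the two-level model the Athlon is treated as baseline and PREFETCHT0 is rejected; with the intermediate level it is accepted, while a full SSE1 instruction such as ADDPS is still rejected.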

Comment 10 Mark Wielaard 2013-08-28 20:02:43 UTC
Patch accepted upstream:
https://bugs.kde.org/show_bug.cgi?id=323713
Support mmxext (integer sse) subset on i386 (athlon)
VEX: r2745
valgrind: r13515

Comment 12 Mark Wielaard 2013-09-10 08:20:12 UTC
The fix is now part of the Fedora package valgrind-3.8.1-27.fc20

Comment 15 Miloš Prchlík 2014-07-09 22:20:24 UTC
Verified for build valgrind-3.8.1-3.5.el6.

Comment 16 errata-xmlrpc 2014-10-14 06:36:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1464.html