Bug 1495320 - Valgrind on armv7hl reports illegal instruction within libcrypto.so
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: valgrind
Version: rawhide
Hardware: arm
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Mark Wielaard
QA Contact: Fedora Extras Quality Assurance
Reported: 2017-09-25 19:47 UTC by Pablo Greco
Modified: 2017-09-26 06:53 UTC (History)

Last Closed: 2017-09-26 06:34:53 UTC
Type: Bug

Description Pablo Greco 2017-09-25 19:47:25 UTC
Description of problem:
Valgrind reports "Unrecognised instruction" for any program that uses libcrypto.so


Version-Release number of selected component (if applicable):
valgrind-3.13.0-7.fc28.armv7hl
openssl-libs-1.1.0f-9.fc27.armv7hl

How reproducible:
Always

Steps to Reproduce:
1. Run "valgrind ssh"

Actual results:
disInstr(arm): unhandled instruction: 0xEC510F1E
                 cond=14(0xE) 27:20=197(0xC5) 4:4=1 3:0=14(0xE)
==1246== valgrind: Unrecognised instruction at address 0x48fafa8.
==1246==    at 0x48FAFA8: ??? (in /usr/lib/libcrypto.so.1.1.0f)
==1246== Your program just tried to execute an instruction that Valgrind
==1246== did not recognise.  There are two possible reasons for this.
==1246== 1. Your program has a bug and erroneously jumped to a non-code
==1246==    location.  If you are running Memcheck and you just saw a
==1246==    warning about a bad jump, it's probably your program's fault.
==1246== 2. The instruction is legitimate but Valgrind doesn't handle it,
==1246==    i.e. it's Valgrind's fault.  If you think this is the case or
==1246==    you are not sure, please let us know and we'll try to fix it.
==1246== Either way, Valgrind will now raise a SIGILL signal which will
==1246== probably kill your program.


Expected results:
No "unhandled instruction" report; the program should simply run.

Additional info:

Tested on a fully updated Fedora rawhide on a bananapi-m1 (A20):
Linux bpi-fedora 4.14.0-0.rc1.git4.1.fc28.armv7hl #1 SMP Fri Sep 22 23:35:46 UTC 2017 armv7l armv7l armv7l GNU/Linux
Also tested on CentOS 7.4 with an older openssl version, with the same results.

Comment 1 Mark Wielaard 2017-09-25 20:27:53 UTC
Could you run the same test as valgrind --vgdb-error=0 ssh
and then, in another terminal, start gdb ssh and enter:
(gdb) target remote | vgdb
(gdb) continue
It should stop when the SIGILL is reported. Then run
(gdb) disassemble
so we can see exactly which instruction it was.

See also http://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver

Comment 2 Pablo Greco 2017-09-25 21:48:47 UTC
(In reply to Mark Wielaard from comment #1)
> Would you be able to run the same with valgrind --vgdb-error=0 ssh
> And then in another terminal gdb ssh
> (gdb) target remote | vgdb
> (gdb) continue
> It should then stop when reporting the SIGILL
> Then (gdb) disassemble
> so we can see exactly which instruction it was?
> 
> See also
> http://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver

# gdb ssh
GNU gdb (GDB) Fedora 8.0-25.fc28
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "armv7hl-redhat-linux-gnueabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ssh...Reading symbols from /root/ssh...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Missing separate debuginfos, use: dnf debuginfo-install openssh-clients-7.5p1-5.fc27.armv7hl
(gdb) target remote | vgdb
Remote debugging using | vgdb
relaying data between gdb and process 1425
warning: remote target does not support file transfer, attempting to access files from local filesystem.
Reading symbols from /lib/ld-linux-armhf.so.3...(no debugging symbols found)...done.
0x04000c00 in _start () from /lib/ld-linux-armhf.so.3
(gdb) continue
Continuing.
Cannot parse expression `.L1170 4@r4'.
warning: Probes-based dynamic linker interface failed.
Reverting to original interface.


Program received signal SIGILL, Illegal instruction.
0x048fafa8 in _armv7_tick () from /lib/libcrypto.so.1.1
(gdb) disas
Dump of assembler code for function _armv7_tick:
=> 0x048fafa8 <+0>:	mrrc	15, 1, r0, r1, cr14
   0x048fafac <+4>:	bx	lr
End of assembler dump.
(gdb)
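
The disassembly agrees with the raw word in Valgrind's report. As a minimal sketch, the bit fields that disInstr printed for 0xEC510F1E (cond, 27:20, 3:0), together with the remaining operand fields, can be extracted as below. The struct and function names are hypothetical, and the field layout assumed here is the A32 MRRC encoding, where bits 27:20 equal 0xC5.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: fields of an A32 MRRC instruction word. */
struct mrrc_fields {
    unsigned cond;    /* bits 31:28, condition code (14 = always) */
    unsigned op;      /* bits 27:20, 0xC5 for MRRC */
    unsigned rt2;     /* bits 19:16, second destination register */
    unsigned rt;      /* bits 15:12, first destination register */
    unsigned coproc;  /* bits 11:8, coprocessor number */
    unsigned opc1;    /* bits 7:4, coprocessor opcode */
    unsigned crm;     /* bits 3:0, coprocessor register */
};

/* Split a 32-bit instruction word into the fields above. */
struct mrrc_fields decode_mrrc(uint32_t insn)
{
    struct mrrc_fields f;
    f.cond   = (insn >> 28) & 0xF;
    f.op     = (insn >> 20) & 0xFF;
    f.rt2    = (insn >> 16) & 0xF;
    f.rt     = (insn >> 12) & 0xF;
    f.coproc = (insn >> 8) & 0xF;
    f.opc1   = (insn >> 4) & 0xF;
    f.crm    = insn & 0xF;
    return f;
}

/* Print the word in the operand order gdb uses, or note that it is not MRRC. */
void print_mrrc(uint32_t insn)
{
    struct mrrc_fields f = decode_mrrc(insn);
    if (f.op != 0xC5) {
        printf("0x%08X is not an MRRC instruction\n", (unsigned)insn);
        return;
    }
    printf("mrrc %u, %u, r%u, r%u, cr%u\n",
           f.coproc, f.opc1, f.rt, f.rt2, f.crm);
}
```

Calling print_mrrc(0xEC510F1E) prints "mrrc 15, 1, r0, r1, cr14", the same operands gdb shows for _armv7_tick.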

Comment 3 Mark Wielaard 2017-09-25 22:03:47 UTC
Thanks, this looks like upstream bug: https://bugs.kde.org/show_bug.cgi?id=331178

If so, libcrypto may be deliberately provoking a SIGILL to determine whether the instruction is supported. Does the program run under valgrind without the extra messages if you pass --sigill-diagnostics=no?

Comment 4 Pablo Greco 2017-09-25 22:10:58 UTC
Yes, just normal valgrind messages.

# valgrind --sigill-diagnostics=no ssh
==1625== Memcheck, a memory error detector
==1625== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1625== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==1625== Command: ssh
==1625== 
==1625== Warning: invalid file descriptor 1024 in syscall close()
==1625== Warning: invalid file descriptor 1025 in syscall close()
==1625== Warning: invalid file descriptor 1026 in syscall close()
==1625== Warning: invalid file descriptor 1027 in syscall close()
==1625==    Use --log-fd=<number> to select an alternative log fd.
==1625== Warning: invalid file descriptor 1028 in syscall close()
==1625== Warning: invalid file descriptor 1029 in syscall close()
==1625== Warning: invalid file descriptor 1030 in syscall close()
usage: ssh [-1246AaCfGgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
           [-D [bind_address:]port] [-E log_file] [-e escape_char]
           [-F configfile] [-I pkcs11] [-i identity_file]
           [-J [user@]host[:port]] [-L address] [-l login_name] [-m mac_spec]
           [-O ctl_cmd] [-o option] [-p port] [-Q query_option] [-R address]
           [-S ctl_path] [-W host:port] [-w local_tun[:remote_tun]]
           [user@]hostname [command]
==1625== 
==1625== HEAP SUMMARY:
==1625==     in use at exit: 2,404 bytes in 37 blocks
==1625==   total heap usage: 148 allocs, 111 frees, 58,254 bytes allocated
==1625== 
==1625== LEAK SUMMARY:
==1625==    definitely lost: 112 bytes in 1 blocks
==1625==    indirectly lost: 2,220 bytes in 27 blocks
==1625==      possibly lost: 0 bytes in 0 blocks
==1625==    still reachable: 72 bytes in 9 blocks
==1625==         suppressed: 0 bytes in 0 blocks
==1625== Rerun with --leak-check=full to see details of leaked memory
==1625== 
==1625== For counts of detected and suppressed errors, rerun with: -v
==1625== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Comment 5 Mark Wielaard 2017-09-26 06:34:53 UTC
In that case this isn't really a bug: the program deliberately executes an instruction that may not be supported and handles the resulting SIGILL itself. If you don't want to see the message, run valgrind with -q or --sigill-diagnostics=no.
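
For readers unfamiliar with the pattern, here is a minimal sketch of a SIGILL-based capability probe of the kind described above, assuming POSIX signals. It is not libcrypto's actual code: instruction_supported, probe_ok, and probe_faults are hypothetical names, and raise(SIGILL) stands in for the real instruction so the sketch runs on any architecture.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf ill_jmp;

/* SIGILL handler: abandon the faulting instruction and jump back to the probe. */
static void ill_handler(int sig)
{
    siglongjmp(ill_jmp, sig);
}

/*
 * Run probe() with a temporary SIGILL handler installed.
 * Returns 1 if probe() completed, 0 if it raised SIGILL
 * (or if the handler could not be installed).
 */
int instruction_supported(void (*probe)(void))
{
    struct sigaction ill_act, ill_old;
    volatile int supported = 0;

    memset(&ill_act, 0, sizeof(ill_act));
    ill_act.sa_handler = ill_handler;
    sigemptyset(&ill_act.sa_mask);
    if (sigaction(SIGILL, &ill_act, &ill_old) != 0)
        return 0;

    /* savemask = 1: siglongjmp restores the signal mask, unblocking SIGILL. */
    if (sigsetjmp(ill_jmp, 1) == 0) {
        probe();
        supported = 1;
    }

    /* Put the previous SIGILL disposition back. */
    sigaction(SIGILL, &ill_old, NULL);
    return supported;
}

/* Hypothetical stand-in for an instruction the CPU accepts. */
void probe_ok(void)
{
}

/* Hypothetical stand-in for an instruction the CPU rejects. */
void probe_faults(void)
{
    raise(SIGILL);
}
```

With these stand-ins, instruction_supported(probe_ok) returns 1 and instruction_supported(probe_faults) returns 0. Under Valgrind the probe still works, because the SIGILL is delivered to the program's handler; the diagnostic text is only Valgrind reporting the event, which is why -q or --sigill-diagnostics=no silences it without changing behaviour.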

