Bug 1306844 - valgrind on s390x fails to handle "popcnt" instruction (0xb9e1)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: valgrind
Version: 7.1
Hardware: s390x
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Mark Wielaard
QA Contact: Miloš Prchlík
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-02-11 21:20 UTC by Dave Malcolm
Modified: 2016-11-04 02:55 UTC (History)

Fixed In Version: valgrind-3.11.0-20.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-04 02:55:56 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch (to gcc) to hack out usage of __builtin_popcountl (1.80 KB, patch)
2016-02-11 21:43 UTC, Dave Malcolm


Links
System ID Private Priority Status Summary Last Updated
KDE Software Compilation 359289 0 None None None 2016-02-11 21:53:03 UTC
Red Hat Product Errata RHEA-2016:2297 0 normal SHIPPED_LIVE valgrind bug fix and enhancement update 2016-11-03 13:38:42 UTC

Description Dave Malcolm 2016-02-11 21:20:02 UTC
Description of problem:
I'm seeing fatal "unimplemented insn" errors when attempting to run gcc under valgrind on s390x EL7:

vex s390->IR: unimplemented insn: B9E1 0033
==360== valgrind: Unrecognised instruction at address 0x419a028.
==360==    at 0x419A028: bitmap_count_bits(bitmap_head const*) (bitmap.c:660)
(snip)

Connecting to valgrind from gdb shows it failing consistently here:

Dump of assembler code for function bitmap_count_bits(bitmap_head const*):
   0x000000000419a008 <+0>:	stg	%r11,88(%r15)
   0x000000000419a00e <+6>:	ltg	%r11,8(%r2)
   0x000000000419a014 <+12>:	lghi	%r2,0
   0x000000000419a018 <+16>:	je	0x419a08a <bitmap_count_bits(bitmap_head const*)+130>
   0x000000000419a01c <+20>:	lg	%r3,24(%r11)
   0x000000000419a022 <+26>:	lg	%r1,32(%r11)
   0x000000000419a028 <+32>:	popcnt	%r3,%r3
=> 0x000000000419a02c <+36>:	popcnt	%r1,%r1
   0x000000000419a030 <+40>:	sllg	%r5,%r3,32
   0x000000000419a036 <+46>:	sllg	%r4,%r1,32
   0x000000000419a03c <+52>:	agr	%r3,%r5
   0x000000000419a040 <+56>:	agr	%r1,%r4
   0x000000000419a044 <+60>:	sllg	%r5,%r3,16
   0x000000000419a04a <+66>:	sllg	%r4,%r1,16
   0x000000000419a050 <+72>:	agr	%r3,%r5
   0x000000000419a054 <+76>:	agr	%r1,%r4
   0x000000000419a058 <+80>:	sllg	%r5,%r3,8
   0x000000000419a05e <+86>:	sllg	%r4,%r1,8
   0x000000000419a064 <+92>:	agr	%r3,%r5
   0x000000000419a068 <+96>:	agr	%r1,%r4
   0x000000000419a06c <+100>:	srlg	%r3,%r3,56
   0x000000000419a072 <+106>:	srlg	%r1,%r1,56
   0x000000000419a078 <+112>:	agr	%r1,%r3
   0x000000000419a07c <+116>:	agr	%r2,%r1
   0x000000000419a080 <+120>:	ltg	%r11,0(%r11)
   0x000000000419a086 <+126>:	jne	0x419a01c <bitmap_count_bits(bitmap_head const*)+20>
   0x000000000419a08a <+130>:	lg	%r11,88(%r15)
   0x000000000419a090 <+136>:	br	%r14
End of assembler dump.


Version-Release number of selected component (if applicable):
valgrind-3.10.0-16.el7.s390x
gcc-4.8.3-9.el7.s390x


How reproducible:
100%

Steps to Reproduce:
A more minimal reproducer:
$ cat popcnt.c

int main (int argc, const char **argv)
{
  return __builtin_popcountl ((long)argc);
}

$ gcc popcnt.c
$ valgrind ./a.out


Actual results:

==507== Memcheck, a memory error detector
==507== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==507== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==507== Command: ./a.out
==507== 
vex s390->IR: unimplemented insn: B9E1 0011
==507== valgrind: Unrecognised instruction at address 0x80000564.
==507==    at 0x80000564: main (in /home/dmalcolm/gcc-bugfixing/src/a.out)
==507== Your program just tried to execute an instruction that Valgrind
==507== did not recognise.  There are two possible reasons for this.
==507== 1. Your program has a bug and erroneously jumped to a non-code
==507==    location.  If you are running Memcheck and you just saw a
==507==    warning about a bad jump, it's probably your program's fault.
==507== 2. The instruction is legitimate but Valgrind doesn't handle it,
==507==    i.e. it's Valgrind's fault.  If you think this is the case or
==507==    you are not sure, please let us know and we'll try to fix it.
==507== Either way, Valgrind will now raise a SIGILL signal which will
==507== probably kill your program.
==507== 
==507== Process terminating with default action of signal 4 (SIGILL)
==507==  Illegal opcode at address 0x80000564
==507==    at 0x80000564: main (in /home/dmalcolm/gcc-bugfixing/src/a.out)
==507== 
==507== HEAP SUMMARY:
==507==     in use at exit: 0 bytes in 0 blocks
==507==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==507== 
==507== All heap blocks were freed -- no leaks are possible
==507== 
==507== For counts of detected and suppressed errors, rerun with: -v
==507== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Illegal instruction (core dumped)

Expected results:

Successful execution of the test binary.

Additional info:

Seen when trying to run gcc under valgrind, which gave me this error:
  vex s390->IR: unimplemented insn: B9E1 0033
(Is this a different addressing mode, or just different registers?)

FWIW https://patchwork.ozlabs.org/patch/187671/ has a patch for qemu that implements popcnt on s390x, and this confirms 0xb9e1 as the opcode, i.e. the first 2 bytes of the instruction.
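
On the question of B9E1 0033 versus B9E1 0011: the objdump output in comment 3 pairs the bytes "b9 e1 00 11" with "popcnt %r1,%r1" and "b9 04 00 bf" with "lgr %r11,%r15", which fits the RRE instruction layout (16-bit opcode, 8 unused bits, then two 4-bit register numbers). Under that reading the two error lines differ only in their registers. A minimal sketch of the decoding follows; the struct and function names are hypothetical and are not taken from valgrind or binutils.

```c
#include <stdint.h>

/* Fields of an RRE-format instruction as printed by valgrind
   ("B9E1 0033"): a 16-bit opcode halfword, then a halfword whose
   low byte holds the two register numbers. Names are illustrative. */
struct rre_insn {
    uint16_t opcode; /* e.g. 0xB9E1 for popcnt */
    unsigned r1;     /* first operand register  */
    unsigned r2;     /* second operand register */
};

/* Split the two halfwords of an RRE instruction into its fields. */
struct rre_insn decode_rre(uint16_t hw0, uint16_t hw1)
{
    struct rre_insn insn;
    insn.opcode = hw0;
    insn.r1 = (hw1 >> 4) & 0xF; /* high nibble of the last byte */
    insn.r2 = hw1 & 0xF;        /* low nibble of the last byte  */
    return insn;
}
```

With this sketch, decode_rre(0xB9E1, 0x0033) gives r1 = 3 and r2 = 3, matching the "popcnt %r3,%r3" shown by gdb, and decode_rre(0xB9E1, 0x0011) gives r1 = 1 and r2 = 1, matching the reproducer.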

Comment 1 Dave Malcolm 2016-02-11 21:24:11 UTC
(In reply to Dave Malcolm from comment #0)
[...]
> Steps to Reproduce:
> A more minimal reproducer:
> $ cat popcnt.c
> 
> int main (int argc, const char **argv)
> {
>   return __builtin_popcountl ((long)argc);
> }

FWIW the cast to long is redundant; the reproducer triggers the bug equally well without it.

[...]

Comment 3 Dave Malcolm 2016-02-11 21:25:21 UTC
objdump -d a.out shows:
[...snip...]
0000000080000540 <main>:
    80000540:	eb bf f0 58 00 24 	stmg	%r11,%r15,88(%r15)
    80000546:	e3 f0 ff 50 ff 71 	lay	%r15,-176(%r15)
    8000054c:	b9 04 00 bf       	lgr	%r11,%r15
    80000550:	b9 04 00 12       	lgr	%r1,%r2
    80000554:	e3 30 b0 a0 00 24 	stg	%r3,160(%r11)
    8000055a:	50 10 b0 ac       	st	%r1,172(%r11)
    8000055e:	e3 10 b0 ac 00 14 	lgf	%r1,172(%r11)
    80000564:	b9 e1 00 11       	popcnt	%r1,%r1
    80000568:	eb 21 00 20 00 0d 	sllg	%r2,%r1,32
    8000056e:	b9 08 00 12       	agr	%r1,%r2
    80000572:	eb 21 00 10 00 0d 	sllg	%r2,%r1,16
    80000578:	b9 08 00 12       	agr	%r1,%r2
    8000057c:	eb 21 00 08 00 0d 	sllg	%r2,%r1,8
    80000582:	b9 08 00 12       	agr	%r1,%r2
    80000586:	eb 11 00 38 00 0c 	srlg	%r1,%r1,56
    8000058c:	b9 14 00 11       	lgfr	%r1,%r1
    80000590:	b9 04 00 21       	lgr	%r2,%r1
    80000594:	e3 40 b1 20 00 04 	lg	%r4,288(%r11)
    8000059a:	eb bf b1 08 00 04 	lmg	%r11,%r15,264(%r11)
    800005a0:	07 f4             	br	%r4
    800005a2:	07 07             	nopr	%r7
    800005a4:	07 07             	nopr	%r7
    800005a6:	07 07             	nopr	%r7
[...snip...]
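
The sllg/agr/srlg sequence after popcnt in this listing makes sense if popcnt leaves a separate bit count in each byte of the result register, which is my understanding of the instruction on this hardware generation; the shifts and adds then sum those eight counts into the top byte. A rough C model of both steps, assuming that byte-wise behaviour (function names are hypothetical):

```c
#include <stdint.h>

/* Model of the assumed byte-wise popcnt result: byte i of the return
   value holds the number of 1 bits in byte i of x. */
uint64_t popcnt_per_byte(uint64_t x)
{
    uint64_t result = 0;
    for (int byte = 0; byte < 8; byte++) {
        uint64_t b = (x >> (8 * byte)) & 0xFF;
        uint64_t count = 0;
        while (b != 0) {
            count += b & 1;
            b >>= 1;
        }
        result |= count << (8 * byte);
    }
    return result;
}

/* Mirrors the instructions that follow popcnt in the listing. Each
   per-byte count is at most 8, so the running sums (at most 64) never
   carry from one byte into the next. */
uint64_t sum_byte_counts(uint64_t r1)
{
    r1 += r1 << 32;  /* sllg %r2,%r1,32 ; agr %r1,%r2 */
    r1 += r1 << 16;  /* sllg %r2,%r1,16 ; agr %r1,%r2 */
    r1 += r1 << 8;   /* sllg %r2,%r1,8  ; agr %r1,%r2 */
    return r1 >> 56; /* srlg %r1,%r1,56 : total is in the top byte */
}

/* Full 64-bit population count built from the two steps above. */
unsigned popcount64(uint64_t x)
{
    return (unsigned)sum_byte_counts(popcnt_per_byte(x));
}
```

This is only a model of what the compiled code appears to compute, not the VEX implementation referenced later in this bug.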

Comment 4 Dave Malcolm 2016-02-11 21:43:25 UTC
Created attachment 1123300
Patch (to gcc) to hack out usage of __builtin_popcountl

Note to self: here's the crude workaround patch I applied to my code (gcc) to avoid using __builtin_popcountl, which let valgrind run far enough to see the bug I was actually trying to track down.
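
For anyone hitting the same problem with an unpatched valgrind, a workaround of this kind can be as simple as a loop-based count that gives the compiler no reason to emit popcnt. The function below is a hypothetical illustration, not the contents of the attached patch:

```c
/* Hypothetical stand-in for __builtin_popcountl that uses only
   ordinary arithmetic. Each iteration clears the lowest set bit,
   so the loop runs once per 1 bit in x. */
int portable_popcountl(unsigned long x)
{
    int count = 0;
    while (x != 0) {
        x &= x - 1; /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

This trades speed for portability, so it is only suited to a temporary debugging build.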

Comment 5 Mark Wielaard 2016-02-11 21:53:03 UTC
Confirmed. Replicated with latest valgrind and filed upstream: https://bugs.kde.org/show_bug.cgi?id=359289

Comment 6 Mark Wielaard 2016-02-17 21:35:43 UTC
The patch is upstream (VEX svn r3210) and in Fedora (valgrind-3.11.0-13.fc24).

Comment 8 Miloš Prchlík 2016-06-08 12:41:30 UTC
Verified for build valgrind-3.11.0-22.el7.

Comment 10 errata-xmlrpc 2016-11-04 02:55:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2297.html

