Bug 1752241 - octave test fails with illegal instruction on s390x
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: openblas
Version: 8.0
Hardware: s390x
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: 8.2
Assignee: Nikola Forró
QA Contact: RHEL CS Apps Subsystem QE
URL:
Whiteboard:
Depends On:
Blocks: ZedoraTracker 1711971
Reported: 2019-09-15 02:30 UTC by Orion Poplawski
Modified: 2023-04-11 16:52 UTC (History)

Fixed In Version: openblas-0.3.3-4.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-28 15:55:31 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
fix izamax (628 bytes, patch)
2019-09-16 17:52 UTC, Dan Horák


Links
System                       ID              Private  Priority  Status  Summary  Last Updated
IBM Linux Technology Center  181498          0        None      None    None     2019-09-16 15:02:51 UTC
Red Hat Issue Tracker        RHELPLAN-29923  0        None      None    None     2023-04-11 16:52:50 UTC
Red Hat Product Errata       RHBA-2020:1664  0        None      None    None     2020-04-28 15:55:35 UTC

Description Orion Poplawski 2019-09-15 02:30:12 UTC
Description of problem:

BUILDSTDERR:   sparse/eigs.m ..................................................fatal: caught signal Illegal instruction -- stopping myself...
BUILDSTDERR: /bin/sh: line 1:   320 Illegal instruction     (core dumped) /bin/sh ../run-octave --norc --silent --no-history -p /builddir/build/BUILD/octave-5.1.0/test/mex /builddir/build/BUILD/octave-5.1.0/test/fntests.m /builddir/build/BUILD/octave-5.1.0/test

Version-Release number of selected component (if applicable):
octave-5.1.0-2

Comment 1 Orion Poplawski 2019-09-15 02:31:11 UTC
Tried to get a backtrace with libSegFault.so to no avail.

Comment 2 Dan Horák 2019-09-15 06:51:24 UTC
I'll try to take a look. Is it octave from the epel8 branch?

Comment 3 Orion Poplawski 2019-09-16 03:10:45 UTC
Yes.  Thanks.

Comment 4 Dan Horák 2019-09-16 12:55:14 UTC
OK, reproduced locally, below is the traceback.

(gdb) where
#0  0x000003ffa852e1a8 in izamax_k () from /lib64/libopenblas.so.0
#1  0x000003ffa83a7d46 in izamax_ () from /lib64/libopenblas.so.0
#2  0x000003ffa8a3e564 in zlatrs_ () from /lib64/libopenblas.so.0
#3  0x000003ffa8a80f2e in ztrcon_ () from /lib64/libopenblas.so.0
#4  0x000003ffab157ef4 in ComplexMatrix::utsolve (this=this@entry=0x3ffc53f1278, mattype=..., b=..., info=@0x3ffc53f0e54: 0, rcon=@0x3ffc53f0e58: 0, sing_handler=0x0, calc_cond=true, 
    transt=blas_no_trans) at liboctave/array/CMatrix.cc:1566
#5  0x000003ffab15b8b2 in ComplexMatrix::solve (this=this@entry=0x3ffc53f1278, mattype=..., b=..., info=@0x3ffc53f0e54: 0, rcon=@0x3ffc53f0e58: 0, sing_handler=0x0, singular_fallback=true, 
    transt=blas_no_trans) at liboctave/array/CMatrix.cc:1977
#6  0x000003ffab47e478 in lusolve<ComplexMatrix, ComplexMatrix> (L=..., U=..., m=..., m@entry=<error reading variable: value has been optimized out>) at ./liboctave/array/dim-vector.h:285
#7  0x000003ffab48f75c in EigsComplexNonSymmetricMatrixShift<ComplexMatrix> (m=..., sigma=..., k_arg=k_arg@entry=10, p_arg=<optimized out>, info=@0x3ffc53f1734: 0, eig_vec=..., 
    eig_val=..., _b=..., permB=..., cresid=..., os=..., tol=<optimized out>, tol@entry=2.2204460492503131e-16, rvec=false, cholB=false, disp=0, maxit=7) at /usr/include/c++/8/complex:1307
#8  0x000003ff66c8e726 in F__eigs__ (interp=..., args=..., nargout=<optimized out>) at libinterp/dldfcn/__eigs__.cc:457
#9  0x000003ffac6beb5a in octave_builtin::call (this=0x2aa73c688a0, tw=..., nargout=<optimized out>, args=...) at libinterp/octave-value/ov-builtin.cc:71

(gdb) disas
Dump of assembler code for function izamax_k:
...
   0x000003ffa852e182 <+978>:	vrepg	%v5,%v7,1
   0x000003ffa852e188 <+984>:	wfcdb	%v26,%v6
   0x000003ffa852e18e <+990>:	jne	0x3ffa852e1a8 <izamax_k+1016>
   0x000003ffa852e192 <+994>:	vsteg	%v6,160(%r15),0
   0x000003ffa852e198 <+1000>:	vmnlg	%v1,%v5,%v7
   0x000003ffa852e19e <+1006>:	vlgvg	%r5,%v1,0
   0x000003ffa852e1a4 <+1012>:	j	0x3ffa852e1a6 <izamax_k+1014>
=> 0x000003ffa852e1a8 <+1016>:	wfchdb	%v16,%v26,%v6
   0x000003ffa852e1ae <+1022>:	vsel	%v1,%v5,%v7,%v16
   0x000003ffa852e1b4 <+1028>:	vsel	%v0,%v26,%v6,%v16
   0x000003ffa852e1ba <+1034>:	vlgvg	%r5,%v1,0
   0x000003ffa852e1c0 <+1040>:	std	%f0,160(%r15)
   0x000003ffa852e1c4 <+1044>:	cgrjh	%r2,%r11,0x3ffa852e1ce <izamax_k+1054>
   0x000003ffa852e1ca <+1050>:	j	0x3ffa852dede <izamax_k+302>
   0x000003ffa852e1ce <+1054>:	sllg	%r4,%r11,1
   0x000003ffa852e1d4 <+1060>:	ld	%f4,160(%r15)
   0x000003ffa852e1d8 <+1064>:	j	0x3ffa852de8c <izamax_k+220>
   0x000003ffa852e1dc <+1068>:	lghi	%r2,1
   0x000003ffa852e1e0 <+1072>:	j	0x3ffa852de4e <izamax_k+158>
   0x000003ffa852e1e4 <+1076>:	brasl	%r14,0x3ffa837b5d8 <__stack_chk_fail@plt>
End of assembler dump.

Could be a z14 instruction slipping into z13 code, or a similar issue. I haven't checked the openblas build for rhel8/epel8 yet.

Comment 5 Dan Horák 2019-09-16 13:49:20 UTC
0x000003ffa852e1a4 <+1012>:	j	0x3ffa852e1a6 <izamax_k+1014>

looks suspicious: it jumps into the middle of an instruction, while it should jump much further, to right after the "std" instruction
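
The observation above can be checked mechanically against the disassembly. The following is a minimal Python sketch, not part of the original analysis: the addresses and mnemonics are copied from the `disas` output in comment 4, while the helper names `is_instruction_start` and `instruction_after` are made up for illustration.

```python
# Instruction start addresses and mnemonics copied from the `disas` output
# in comment 4 (only the region around the faulting address).
LISTING = [
    (0x3FFA852E19E, "vlgvg"),
    (0x3FFA852E1A4, "j"),
    (0x3FFA852E1A8, "wfchdb"),
    (0x3FFA852E1AE, "vsel"),
    (0x3FFA852E1B4, "vsel"),
    (0x3FFA852E1BA, "vlgvg"),
    (0x3FFA852E1C0, "std"),
    (0x3FFA852E1C4, "cgrjh"),
]

# Operand of the `j` at <+1012>, as printed by gdb.
JUMP_TARGET = 0x3FFA852E1A6


def is_instruction_start(addr, listing=LISTING):
    """True if addr is the first byte of an instruction in the listing."""
    return any(start == addr for start, _ in listing)


def instruction_after(mnemonic, listing=LISTING):
    """Start address of the instruction that follows the first `mnemonic`."""
    for i, (_, name) in enumerate(listing[:-1]):
        if name == mnemonic:
            return listing[i + 1][0]
    raise LookupError(f"no instruction follows {mnemonic!r} in the listing")


if __name__ == "__main__":
    print("target on an instruction boundary:", is_instruction_start(JUMP_TARGET))
    print("address right after std:", hex(instruction_after("std")))
```

Under these assumptions the jump target is not an instruction boundary in the listing, whereas the address right after "std" is one, which matches the suspicion that the branch operand was assembled incorrectly.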

Comment 6 Dan Horák 2019-09-16 14:06:42 UTC
https://github.com/xianyi/OpenBLAS/blob/v0.3.3/kernel/zarch/izamax.c#L188 is the source code in question

Comment 7 Dan Horák 2019-09-16 14:53:44 UTC
Reassigned to RHEL, trying to figure out if it's an openblas issue or a toolchain issue.

Comment 8 Dan Horák 2019-09-16 17:52:03 UTC
With fixed openblas I've got

Summary:

  PASS                            15407
  FAIL                                5
  REGRESSION                          1
  XFAIL (reported bug)               28
  SKIP (missing feature)            124
  SKIP (run-time condition)          34

Comment 9 Dan Horák 2019-09-16 17:52:55 UTC
Created attachment 1615586
fix izamax

Comment 10 Dan Horák 2019-09-16 19:33:23 UTC
I strongly recommend explicitly setting TARGET=Z13 during the build, so the rpms won't get a different default when the builders move to another machine.

Comment 11 Nikola Forró 2019-09-24 15:05:18 UTC
Thanks Dan.

> I strongly recommend explicitly setting TARGET=Z13 during the build, so the rpms won't get a different default when the builders move to another machine.

Do you think I should also disable DYNAMIC_ARCH as is the case with other non-x86_64 arches?

Comment 12 Dan Horák 2019-10-01 09:50:48 UTC
AFAIK using DYNAMIC_ARCH is OK, because it builds all variants and selects the right one during runtime. What we should fix is the builds that don't support DYNAMIC_ARCH and don't set TARGET explicitly (like s390x).
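
The rule described in this comment could be sketched as a small helper in packaging tooling. This is only an illustration in Python: the function `openblas_make_args` and its parameters are hypothetical and do not come from the openblas spec file; only the make variables `DYNAMIC_ARCH` and `TARGET` and the value `Z13` are taken from this thread.

```python
def openblas_make_args(arch, dynamic_arch_ok, baseline_target=None):
    """Pick make variables for one architecture (hypothetical helper).

    arch             -- architecture name, used only in the error message
    dynamic_arch_ok  -- whether the build supports DYNAMIC_ARCH on this arch
    baseline_target  -- explicit TARGET to pin when DYNAMIC_ARCH is unavailable
    """
    if dynamic_arch_ok:
        # All variants are built and the right one is selected at runtime.
        return ["DYNAMIC_ARCH=1"]
    if baseline_target is None:
        # Refuse to fall back to whatever the build machine happens to be.
        raise ValueError(f"{arch}: no DYNAMIC_ARCH support and no explicit TARGET")
    return [f"TARGET={baseline_target}"]


if __name__ == "__main__":
    # s390x cannot use DYNAMIC_ARCH here, so the baseline is pinned explicitly.
    print(openblas_make_args("s390x", False, "Z13"))
```

The point of raising an error instead of returning an empty list is that a missing TARGET would otherwise silently depend on the build host, which is the situation the comment warns about.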

Comment 15 IBM Bug Proxy 2019-11-21 12:10:23 UTC
------- Comment From Andreas.Krebbel.com 2019-11-21 07:04 EDT-------
"vlgvg  %[index],%%v1,0  \n\t"
"j 3    \n\t"
"2:     \n\t"

"j 3" is wrong. It must be either "j 3f" or "j 3b". This problem has been fixed in OpenBLAS in February this year.

Comment 16 Nikola Forró 2019-11-21 12:39:33 UTC
> "j 3" is wrong. It must be either "j 3f" or "j 3b". This problem has been fixed in OpenBLAS in February this year.

Yes, the patch changes "j 3" to "j 3f".
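
A check for this class of mistake can be sketched in a few lines of Python. This is an illustrative linter, not something used in the actual fix: it flags jumps whose operand is a bare number that matches a numeric local label defined in the same assembly block, meaning the "f" or "b" suffix is missing. The set of jump mnemonics is a deliberately small assumption, and the "3:" label in the sample input is assumed to exist because the patch targets "3f".

```python
import re

# Assumption: a small, incomplete set of s390x jump mnemonics for this sketch.
JUMP_MNEMONICS = {"j", "je", "jne", "jz", "jnz", "jh", "jl"}

LABEL_DEF = re.compile(r"^\s*(\d+):")
INSTRUCTION = re.compile(r"^\s*([a-z]+)\s+(\S+)\s*$")


def bare_numeric_jumps(asm_lines):
    """Return (index, text) for jumps to a numeric label without f/b suffix."""
    # Collect numeric local labels such as "2:" or "3:".
    labels = set()
    for line in asm_lines:
        match = LABEL_DEF.match(line)
        if match:
            labels.add(match.group(1))

    hits = []
    for index, line in enumerate(asm_lines):
        match = INSTRUCTION.match(line)
        if not match:
            continue
        mnemonic, operand = match.groups()
        if mnemonic in JUMP_MNEMONICS and operand in labels:
            hits.append((index, line.strip()))
    return hits


# Lines from comment 15 with the C string quoting removed; "3:" is assumed.
BROKEN = ["vlgvg  %[index],%%v1,0", "j 3", "2:", "3:"]
FIXED = ["vlgvg  %[index],%%v1,0", "j 3f", "2:", "3:"]

if __name__ == "__main__":
    print(bare_numeric_jumps(BROKEN))
    print(bare_numeric_jumps(FIXED))
```

With the broken input only the "j 3" line is reported, and the corrected "j 3f" form passes cleanly.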

Comment 17 IBM Bug Proxy 2019-11-21 13:00:21 UTC
------- Comment From Andreas.Krebbel.com 2019-11-21 07:53 EDT-------
(In reply to comment #12)
> > "j 3" is wrong. It must be either "j 3f" or "j 3b". This problem has been fixed in OpenBLAS in February this year.
>
> Yes, the patch changes "j 3" to "j 3f".

Oh right. I missed that.

In upstream OpenBLAS there are a bunch of patches to add z14 support. These also fix a couple of issues with the z13 support. There should be no testsuite failures anymore with the upstream level. We will check what needs to be backported and open a separate Bugzilla for this.

Comment 18 Dan Horák 2019-11-21 13:31:48 UTC
IIRC s390x is the only (RHEL) arch in openblas that can't build with DYNAMIC_ARCH (aka runtime CPU level detection). Without that we can only build the z13 variant for RHEL-8 as it's the minimum supported arch.

Comment 19 IBM Bug Proxy 2019-11-21 14:10:24 UTC
------- Comment From arnez.com 2019-11-21 09:06 EDT-------
> IIRC s390x is the only (RHEL) arch in openblas that can't build with
> DYNAMIC_ARCH (aka runtime CPU level detection).
Right.  There was a proposed project as part of the OpenMainframeProject's internship program to fix that:
https://github.com/openmainframeproject-internship/resources/blob/master/proposed_projects/OpenBLAS.mdp
But it hasn't been picked up by anyone yet.

Comment 20 IBM Bug Proxy 2019-11-21 14:20:33 UTC
------- Comment From arnez.com 2019-11-21 09:14 EDT-------
Oops, please replace "OpenBLAS.mdp" by "OpenBLAS.md" in the URL above.

Comment 22 errata-xmlrpc 2020-04-28 15:55:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1664

