Bug 1186162 - illegal instruction fault in arithchk
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: mp
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Assignee: Paulo Andrade
QA Contact: Fedora Extras Quality Assurance
Blocks: ZedoraTracker
Reported: 2015-01-27 08:46 UTC by Dan Horák
Modified: 2015-01-30 16:17 UTC (History)

Doc Type: Bug Fix
Last Closed: 2015-01-30 16:17:16 UTC
Type: Bug

Description Dan Horák 2015-01-27 08:46:15 UTC
When the arithchk tool is run during the build, it fails with an illegal instruction fault. The likely cause is the 0x27f constant (= 639), which makes no sense on s390(x).


(gdb) run
Starting program: /home/sharkcz/mp/mp-35060ba2a59f2b0f0fd622ed9df678f142f846ed/build/bin/arithchk 

Program received signal SIGILL, Illegal instruction.
0x0000000080000e2c in fpinit_ASL () at /home/sharkcz/mp/mp-35060ba2a59f2b0f0fd622ed9df678f142f846ed/src/asl/solvers/fpinit.c:135
135		_FPU_SETCW(__fpu_control);
(gdb) where
#0  0x0000000080000e2c in fpinit_ASL () at /home/sharkcz/mp/mp-35060ba2a59f2b0f0fd622ed9df678f142f846ed/src/asl/solvers/fpinit.c:135
#1  0x0000000080000a7e in main () at /home/sharkcz/mp/mp-35060ba2a59f2b0f0fd622ed9df678f142f846ed/src/asl/solvers/arithchk.c:193
(gdb) disas
Dump of assembler code for function fpinit_ASL:
   0x0000000080000e10 <+0>:	larl	%r1,0x80003050
   0x0000000080000e16 <+6>:	lhi	%r3,639
   0x0000000080000e1a <+10>:	lhi	%r2,639
   0x0000000080000e1e <+14>:	lg	%r1,0(%r1)
   0x0000000080000e24 <+20>:	st	%r3,0(%r1)
   0x0000000080000e28 <+24>:	sfpc	%r2,%r0
=> 0x0000000080000e2c <+28>:	br	%r14
End of assembler dump.
(gdb) list
130		__fpu_control = _FPU_IEEE - _FPU_EXTENDED + _FPU_DOUBLE;
131	#else
132		__fpu_control = 0x27f;
133	#endif
134	#endif /* ASL_FPINIT_KEEP_TRAPBITS */
135		_FPU_SETCW(__fpu_control);
136	#endif
137		}
138	#endif /*} NO_fpu_control */
139	#endif /*} __linux__ */



Version-Release number of selected component (if applicable):
mp-1.3.0-2.fc22

Comment 1 Dan Horák 2015-01-27 08:48:44 UTC
The logic in fpinit_ASL() (src/asl/solvers/fpinit.c) seems to be x86-focused.
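
For reference, the two branches at lines 130 and 132 of the listing in the description produce the same value on x86. A minimal sketch of that arithmetic, assuming the values that glibc's x86 <fpu_control.h> gives these macros. They are restated here under hypothetical X87_* names so the sketch compiles on any platform:

```c
#include <stdint.h>

/* ASSUMPTION: values of _FPU_IEEE, _FPU_EXTENDED and _FPU_DOUBLE as
   defined by glibc for x86, restated under hypothetical local names. */
enum {
    X87_IEEE     = 0x037f, /* exceptions masked, extended precision */
    X87_EXTENDED = 0x0300, /* precision-control field: extended */
    X87_DOUBLE   = 0x0200  /* precision-control field: double */
};

/* Same expression as line 130 of the fpinit.c listing. */
static inline uint16_t x87_double_precision_cw(void)
{
    return (uint16_t)(X87_IEEE - X87_EXTENDED + X87_DOUBLE);
}

/* Extract the x87 precision-control field (bits 8-9, counted from the
   least significant bit) from a control word. */
static inline unsigned x87_precision_control(uint16_t cw)
{
    return (cw >> 8) & 0x3u;
}
```

x87_double_precision_cw() evaluates to 0x27f, the constant hard-coded at line 132, and its precision-control field is 2 (double). Both are x87 notions, which fits the observation that the logic is x86-focused.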

Comment 2 Paulo Andrade 2015-01-27 12:11:44 UTC
Problem reported upstream. I tried to fix the build
with a patch for arm, loosely following the existing
code there for other architectures, but it may have
been developed on x86 only for too long.

https://github.com/ampl/mp/issues/25

Comment 3 Paulo Andrade 2015-01-27 12:16:00 UTC
On second thought, I think it may be another
issue. The SFPC instruction should receive a
single argument.

Comment 4 Paulo Andrade 2015-01-27 12:25:33 UTC
This is from the manual I use as an s390 reference.
[RRE] is the instruction format, not an argument.

SET FPC
SFPC R                         [RRE]

    'B384'      //////// R1 ////////
.---------------.-------.---.------.
0              16      24         31

The contents of bit positions 32-63 of the
general register designated by R1 are placed in
the FPC (floating-point-control) register.

All of bits 32-63 corresponding to unassigned bit
positions in the FPC must be zero; otherwise, a
specification exception is recognized. Bits 0-31 of
the general register are ignored.

Condition Code: The code remains unchanged.


IEEE Exception Conditions: None.

Program Exceptions:
  * Data with DXC 2, BFP instruction
  * Operation (if the BFP facility is not installed)
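
The rule quoted above (any set bit in an unassigned FPC position raises a specification exception) can be sketched as a plain mask test. The mask below is an ASSUMPTION, not taken from this report: it treats only the mask bits 0-4, the flag bits 8-12, an exception-code byte at bits 16-23 and a two-bit rounding mode at bits 30-31 as assigned, with bit 0 as the most significant bit. The assigned set varies with the installed facilities, so it should be checked against the manual for the target machine. The names are hypothetical.

```c
#include <stdint.h>

/* ASSUMPTION: assigned FPC bit positions (bit 0 = most significant):
   masks 0-4, flags 8-12, exception code 16-23, rounding mode 30-31. */
#define FPC_ASSIGNED_ASSUMED UINT32_C(0xF8F8FF03)

/* Return the bits of `value` that fall on unassigned positions.
   A nonzero result means SFPC would be rejected under the quoted rule. */
static inline uint32_t fpc_unassigned_bits(uint32_t value, uint32_t assigned)
{
    return value & ~assigned;
}
```

Under that assumed mask, fpc_unassigned_bits(0x27f, FPC_ASSIGNED_ASSUMED) is 0x7c, so the x86 constant would be rejected, while a value of 0 passes.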

Comment 5 Paulo Andrade 2015-01-27 12:46:35 UTC
Translating 0x27f (639) to the FPC bits, it
looks like some conversion is needed. The
value maps to:

0x27f == 0b1001111111

0  IEEE-invalid-operation mask		1
1  IEEE-division-by-zero mask		1
2  IEEE-overflow mask			1
3  IEEE-underflow mask			1
4  IEEE-inexact mask			1
5  (reserved)				1
6  (reserved)				1
7  (reserved)				0
8  IEEE-invalid operation flag		0
9  IEEE-division by zero flag		0
...

so, the illegal instruction should be due to
bits 5 and 6 set.

Comment 6 Paulo Andrade 2015-01-27 12:47:16 UTC
Oops, bit 9 above is 1...
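
A note on bit numbering: the table in comment 5 counts from the least significant bit, whereas the manual excerpt in comment 4 calls the low half of the 64-bit register "bits 32-63", which suggests it numbers bits from the most significant end. A small sketch, assuming that same convention applies to the 32-bit FPC (bit 0 leftmost), lists which positions a value sets. The function name is hypothetical.

```c
#include <stdint.h>

/* Fill out[] with the positions of the set bits in `value`, numbered
   from the most significant bit (bit 0) to the least significant
   (bit 31). Returns how many positions were written. */
static inline int msb_numbered_set_bits(uint32_t value, int out[32])
{
    int n = 0;
    for (int bit = 0; bit < 32; bit++) {
        if (value & (UINT32_C(1) << (31 - bit)))
            out[n++] = bit;
    }
    return n;
}
```

For 0x27f this yields bits 22 and 25 through 31, rather than 0 through 6 and 9. Which of those positions are unassigned depends on the FPC layout given in the manual.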

Comment 7 Paulo Andrade 2015-01-28 12:26:27 UTC
I tried

$ s390-koji build --scratch rawhide  /home/pcpa/fedora/mp/mp-1.3.0-3.fc22.src.rpm

I did need to comment out "BuildRequires: python-sphinx-latex", but
it still failed. Can you please build it in an environment
with a shell to see if you can get more details of the failure?

http://s390.koji.fedoraproject.org/kojifiles/work/tasks/6258/1716258/build.log

...
21/29 Test #21: aslexpr-test .....................   Passed    0.00 sec
      Start 22: aslbuilder-test
22/29 Test #22: aslbuilder-test ..................***Failed    0.01 sec
      Start 23: aslproblem-test
23/29 Test #23: aslproblem-test ..................   Passed    0.02 sec
...

BTW, now it is using this patch:
http://pkgs.fedoraproject.org/cgit/mp.git/commit/?id=069dff618dea604426b370448bf316ce1331375c

Example, on x86_64, of what should help:

---8<---
<mock-chroot>[root@localhost mp-35060ba2a59f2b0f0fd622ed9df678f142f846ed]# file build/bin/aslbuilder-test
build/bin/aslbuilder-test: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=3c3f96ac4804358526a10f0ad2d763e678e500ec, not stripped
<mock-chroot>[root@localhost mp-35060ba2a59f2b0f0fd622ed9df678f142f846ed]# build/bin/aslbuilder-test
[==========] Running 22 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from NLTest
[ RUN      ] NLTest.ReadFullHeader
[       OK ] NLTest.ReadFullHeader (0 ms)
[----------] 1 test from NLTest (0 ms total)

[----------] 21 tests from ASLBuilderTest
[ RUN      ] ASLBuilderTest.Ctor
[       OK ] ASLBuilderTest.Ctor (0 ms)
[ RUN      ] ASLBuilderTest.InitASLTrivial
[       OK ] ASLBuilderTest.InitASLTrivial (0 ms)
[ RUN      ] ASLBuilderTest.InitASLFull
[       OK ] ASLBuilderTest.InitASLFull (0 ms)
[ RUN      ] ASLBuilderTest.ASLBuilderAdjFcn
[       OK ] ASLBuilderTest.ASLBuilderAdjFcn (1 ms)
[ RUN      ] ASLBuilderTest.ASLBuilderInvalidProblemDim
[       OK ] ASLBuilderTest.ASLBuilderInvalidProblemDim (0 ms)
[ RUN      ] ASLBuilderTest.ASLBuilderX0Len
[       OK ] ASLBuilderTest.ASLBuilderX0Len (0 ms)
[ RUN      ] ASLBuilderTest.ASLBuilderLinear
[       OK ] ASLBuilderTest.ASLBuilderLinear (1 ms)
[ RUN      ] ASLBuilderTest.ASLBuilderTrivialProblem
[       OK ] ASLBuilderTest.ASLBuilderTrivialProblem (0 ms)
[ RUN      ] ASLBuilderTest.ASLBuilderDisallowCLPByDefault
[       OK ] ASLBuilderTest.ASLBuilderDisallowCLPByDefault (0 ms)
[ RUN      ] ASLBuilderTest.ASLBuilderAllowCLP
[       OK ] ASLBuilderTest.ASLBuilderAllowCLP (1 ms)
[ RUN      ] ASLBuilderTest.AddObj
[       OK ] ASLBuilderTest.AddObj (0 ms)
[ RUN      ] ASLBuilderTest.AddCon
[       OK ] ASLBuilderTest.AddCon (0 ms)
[ RUN      ] ASLBuilderTest.AddLogicalCon
[       OK ] ASLBuilderTest.AddLogicalCon (0 ms)
[ RUN      ] ASLBuilderTest.RegisterFunction
[       OK ] ASLBuilderTest.RegisterFunction (0 ms)
[ RUN      ] ASLBuilderTest.AddFunction
[       OK ] ASLBuilderTest.AddFunction (0 ms)
[ RUN      ] ASLBuilderTest.AddFunctionMatchNumArgs
[       OK ] ASLBuilderTest.AddFunctionMatchNumArgs (0 ms)
[ RUN      ] ASLBuilderTest.SetMissingFunction
[       OK ] ASLBuilderTest.SetMissingFunction (0 ms)
[ RUN      ] ASLBuilderTest.SizeOverflow
[       OK ] ASLBuilderTest.SizeOverflow (0 ms)
[ RUN      ] ASLBuilderTest.ColumnSizeHandler
[       OK ] ASLBuilderTest.ColumnSizeHandler (0 ms)
[ RUN      ] ASLBuilderTest.BuildColumnwiseMatrix
[       OK ] ASLBuilderTest.BuildColumnwiseMatrix (0 ms)
[ RUN      ] ASLBuilderTest.ASLHandler
[       OK ] ASLBuilderTest.ASLHandler (0 ms)
[----------] 21 tests from ASLBuilderTest (3 ms total)

[----------] Global test environment tear-down
[==========] 22 tests from 2 test cases ran. (3 ms total)
[  PASSED  ] 22 tests.
---8<---

Comment 8 Paulo Andrade 2015-01-28 12:55:15 UTC
Thanks to Dan, here are the results:

[ RUN      ] ASLBuilderTest.InitASLFull
/root/rpmbuild/BUILD/mp-35060ba2a59f2b0f0fd622ed9df678f142f846ed/test/asl/aslbuilder-test.cc:187: Failure
Value of: actual.i.xscanf_
  Actual: 0x800b64c0
Expected: expected.i.xscanf_
Which is: 0x800b6100
/root/rpmbuild/BUILD/mp-35060ba2a59f2b0f0fd622ed9df678f142f846ed/test/asl/aslbuilder-test.cc:277: Failure
Value of: actual.i.binary_nl_
  Actual: 2
Expected: expected.i.binary_nl_
Which is: 0
/root/rpmbuild/BUILD/mp-35060ba2a59f2b0f0fd622ed9df678f142f846ed/test/asl/aslbuilder-test.cc:384: Failure
Value of: actual.i.iadjfcn
  Actual: 0x800c5390
Expected: expected.i.iadjfcn
Which is: NULL
/root/rpmbuild/BUILD/mp-35060ba2a59f2b0f0fd622ed9df678f142f846ed/test/asl/aslbuilder-test.cc:385: Failure
Value of: actual.i.dadjfcn
  Actual: 0x800c5390
Expected: expected.i.dadjfcn
Which is: NULL
[  FAILED  ] ASLBuilderTest.InitASLFull (1 ms)
[ RUN      ] ASLBuilderTest.ASLBuilderAdjFcn

Comment 9 Paulo Andrade 2015-01-30 16:17:16 UTC
The package is only in rawhide; patches were made and the build passes now.

https://github.com/ampl/mp/issues/25

