Bug 1367571

Summary: gmp-ecm FTBFS on aarch64
Product: [Fedora] Fedora
Reporter: Yaakov Selkowitz <yselkowi>
Component: gmp-ecm
Assignee: Jerry James <loganjerry>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: medium
Version: 24
CC: loganjerry, rjones
Hardware: aarch64
OS: Linux
Fixed In Version: gmp-ecm-7.0.3-2.fc26 gmp-ecm-7.0.4-1.fc25 gmp-ecm-7.0.4-1.fc24
Last Closed: 2016-10-19 17:27:36 UTC
Type: Bug
Bug Blocks: 922257

Description Yaakov Selkowitz 2016-08-16 19:58:54 UTC
The testsuite of gmp-ecm 7.x fails on aarch64 in test.pm1:

GMP-ECM 7.0.3 [configured with GMP 6.1.1, --enable-openmp] [P-1]
Input number is (2^1009-1)/3454817 (298 digits)
Using B1=5000, B2=9972-1389888, polynomial x^1, x0=4284271689
Step 1 took 7ms
./test.pm1: line 166: 28518 Done                    echo "(2^1009-1)/3454817"
     28519 Segmentation fault      | $PM1 -no-ntt 5e3 1e4-1e6
############### ERROR ###############
Expected return code 6 but got 139
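
The return code 139 is consistent with the "Segmentation fault" line in the log: a POSIX shell reports a process killed by signal N as 128 + N, and SIGSEGV is signal 11 on Linux. A minimal sketch of that decoding follows. The helper name describe_exit_status is hypothetical and is not part of the gmp-ecm test suite.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status ($?) as a plain exit or a fatal signal."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"


print(describe_exit_status(139))  # the status the test actually got
print(describe_exit_status(6))    # the status the test expected
```

On Linux the first call names SIGSEGV, matching the shell's own message; the second is reported as an ordinary exit code.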

Where the expected result appears to be:

********** Factor found in step 2: 198582684439
Found prime factor of 12 digits: 198582684439
Composite cofactor ((2^1009-1)/3454817)/198582684439 has 286 digits
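
The expected output can be cross-checked with plain integer arithmetic, independent of gmp-ecm. A minimal sketch, using only the numbers quoted in the log lines above; the variable names are illustrative.

```python
# Numbers taken from the test log above.
mersenne = 2**1009 - 1
n = mersenne // 3454817            # input number passed to ecm
factor = 198582684439              # factor reported in step 2
cofactor, remainder = divmod(n, factor)

# Both remainders should be zero if the log is self-consistent.
print(mersenne % 3454817)
print(remainder)

# Digit counts of the input, the factor, and the cofactor.
print(len(str(n)), len(str(factor)), len(str(cofactor)))
```

The digit counts printed by the last line can be compared with the "298 digits", "12 digits" and "286 digits" figures in the log.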

Comment 1 Jerry James 2016-08-16 22:18:48 UTC
Are there any aarch64 machines I could do mock builds on, to debug this issue?

Comment 2 Richard W.M. Jones 2016-08-22 09:47:36 UTC
I cannot install the debuginfo packages because mirrormanager is
broken somehow.

However here is a stack trace and other info:

Core was generated by `/home/rjones/d/fedora/gmp-ecm/master/ecm-7.0.3/.libs/lt-ecm -pm1 -no-ntt 5e3 1e'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000003ff9a6b4e48 in __gmpn_add_n () from /lib64/libgmp.so.10
[Current thread is 1 (Thread 0x3ff9735f180 (LWP 4615))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.23.1-8.fc24.aarch64 gmp-6.1.1-1.fc24.aarch64 libgcc-6.1.1-3.fc24.aarch64 libgomp-6.1.1-3.fc24.aarch64
(gdb) t a a bt

Thread 8 (Thread 0x3ff98b5f180 (LWP 4612)):
#0  0x000003ff9a497ac0 in syscall () from /lib64/libc.so.6
#1  0x000003ff9a7b8934 in ?? () from /lib64/libgomp.so.1
#2  0x000003ff9a7b5ddc in ?? () from /lib64/libgomp.so.1
#3  0x000003ff9a55708c in start_thread () from /lib64/libpthread.so.0
#4  0x000003ff9a49cce0 in thread_start () from /lib64/libc.so.6

Thread 7 (Thread 0x3ff9a35f180 (LWP 4609)):
#0  0x000003ff9a497ac0 in syscall () from /lib64/libc.so.6
#1  0x000003ff9a7b8934 in ?? () from /lib64/libgomp.so.1
#2  0x000003ff9a7b5ddc in ?? () from /lib64/libgomp.so.1
#3  0x000003ff9a55708c in start_thread () from /lib64/libpthread.so.0
#4  0x000003ff9a49cce0 in thread_start () from /lib64/libc.so.6

Thread 6 (Thread 0x3ff9935f180 (LWP 4611)):
#0  0x000003ff9a497ac0 in syscall () from /lib64/libc.so.6
#1  0x000003ff9a7b8934 in ?? () from /lib64/libgomp.so.1
#2  0x000003ff9a7b5ddc in ?? () from /lib64/libgomp.so.1
#3  0x000003ff9a55708c in start_thread () from /lib64/libpthread.so.0
#4  0x000003ff9a49cce0 in thread_start () from /lib64/libc.so.6

Thread 5 (Thread 0x3ff97b5f180 (LWP 4614)):
#0  0x000003ff9a497ac0 in syscall () from /lib64/libc.so.6
#1  0x000003ff9a7b8934 in ?? () from /lib64/libgomp.so.1
#2  0x000003ff9a7b5ddc in ?? () from /lib64/libgomp.so.1
#3  0x000003ff9a55708c in start_thread () from /lib64/libpthread.so.0
#4  0x000003ff9a49cce0 in thread_start () from /lib64/libc.so.6

Thread 4 (Thread 0x3ff9835f180 (LWP 4613)):
#0  0x000003ff9a497ac0 in syscall () from /lib64/libc.so.6
#1  0x000003ff9a7b8934 in ?? () from /lib64/libgomp.so.1
#2  0x000003ff9a7b5ddc in ?? () from /lib64/libgomp.so.1
#3  0x000003ff9a55708c in start_thread () from /lib64/libpthread.so.0
#4  0x000003ff9a49cce0 in thread_start () from /lib64/libc.so.6

Thread 3 (Thread 0x3ff99b5f180 (LWP 4610)):
#0  0x000003ff9a497ac0 in syscall () from /lib64/libc.so.6
#1  0x000003ff9a7b8934 in ?? () from /lib64/libgomp.so.1
#2  0x000003ff9a7b5ddc in ?? () from /lib64/libgomp.so.1
#3  0x000003ff9a55708c in start_thread () from /lib64/libpthread.so.0
#4  0x000003ff9a49cce0 in thread_start () from /lib64/libc.so.6

Thread 2 (Thread 0x3ff9a855f40 (LWP 4595)):
#0  0x000003ff9a497ac0 in syscall () from /lib64/libc.so.6
#1  0x000003ff9a7b8934 in ?? () from /lib64/libgomp.so.1
#2  0x000003ff9a7b6e3c in ?? () from /lib64/libgomp.so.1
#3  0x000003ff9a736dc8 in pm1_sequence_g (g_mpz=0x363af20509876f00, 
    g_mpz@entry=0x2aaff74db10, g_ntt=g_ntt@entry=0x0, b_1=0x3ffc3291d18, 
    b_1@entry=0x1, P=1155, M_param=M_param@entry=839, l_param=1080, 
    m_1=m_1@entry=0x3ffc3291d58, k_2=0, modulus_param=<optimized out>, 
    modulus_param@entry=0x3ffc3291de0, ntt_context=<optimized out>, 
    ntt_context@entry=0x0) at pm1fs2.c:1976
#4  0x000003ff9a73b3ec in __ecm_pm1fs2 (f=0x363af20509876f00, f@entry=0x1, 
    X=0x1, X@entry=0x3ffc3291d18, modulus=modulus@entry=0x3ffc3291de0, 
    params=0x3ffc3291d38) at pm1fs2.c:2717
#5  0x000003ff9a71dfe8 in pm1 (f=0x1, f@entry=0x3ffc32922c8, p=0x2aaff742540, 
    N=N@entry=0x3ffc32923e0, go=<optimized out>, B1done=<optimized out>, 
    B1=B1@entry=5000, B2min_parm=0x44, B2_parm=<optimized out>, 
    k=<optimized out>, verbose=1, repr=0, use_ntt=0, os=<optimized out>, 
    es=0x3ff9a5413b8 <_IO_2_1_stderr_>, chkfilename=0x0, TreeFilename=0x0, 
    maxmem=<optimized out>, rng=0x1303, stop_asap=0x0) at pm1.c:587
#6  0x000003ff9a72edf8 in ecm_factor (f=0x3ffc32922c8, n=0x3ffc32923e0, 
    B1=5000, p0=0x3ffc3292440) at factor.c:172
#7  0x000002aadd4b3e54 in main (argc=<optimized out>, argv=<optimized out>)
    at main.c:1458

Thread 1 (Thread 0x3ff9735f180 (LWP 4615)):
#0  0x000003ff9a6b4e48 in __gmpn_add_n () from /lib64/libgmp.so.10
#1  0x000003ff9a6d228c in __gmpn_toom_eval_dgr3_pm1 () from /lib64/libgmp.so.10
#2  0x000003ff9a6c9344 in __gmpn_toom42_mul () from /lib64/libgmp.so.10
#3  0x000003ff9a6b850c in __gmpn_mul () from /lib64/libgmp.so.10
#4  0x000003ff9a6ac170 in __gmpz_mul () from /lib64/libgmp.so.10
#5  0x000003ff9a724974 in mpres_pow_mul (modulus=0x3ff9735e880, 
    c=0x3ff9735e8f8, b=0x3ff9735e840, a=0x3ff9735e8f8) at mpmod.c:1100
#6  __ecm_mpres_pow (R=0x3ffc3291d58, BASE=<optimized out>, EXP=0x3ff9735e820, 
    modulus=0x3ff9735e880) at mpmod.c:1216
#7  0x000003ff9a735bdc in pm1_sequence_g._omp_fn.5 () at pm1fs2.c:2066
#8  0x000003ff9a7b5dd4 in ?? () from /lib64/libgomp.so.1
#9  0x000003ff9a55708c in start_thread () from /lib64/libpthread.so.0
#10 0x000003ff9a49cce0 in thread_start () from /lib64/libc.so.6

(gdb) frame 5
#5  0x000003ff9a724974 in mpres_pow_mul (modulus=0x3ff9735e880, 
    c=0x3ff9735e8f8, b=0x3ff9735e840, a=0x3ff9735e8f8) at mpmod.c:1100
1100	      mpz_mul (modulus->temp1, b, c);
(gdb) frame 6
#6  __ecm_mpres_pow (R=0x3ffc3291d58, BASE=<optimized out>, EXP=0x3ff9735e820, 
    modulus=0x3ff9735e880) at mpmod.c:1216
1216	        mpres_pow_mul (modulus->temp2, (w == 1) ? BASE : B[w/2],

Comment 3 Jerry James 2016-08-23 02:56:51 UTC
Thanks, Richard.  Is there any chance you could run that again, but under valgrind instead of gdb?  I wonder if we're trying to operate on gmp objects that have already been freed.

Comment 4 Richard W.M. Jones 2016-08-23 11:35:02 UTC
$ echo "(2^1009-1)/3454817" | LD_LIBRARY_PATH=. valgrind ./lt-ecm -pm1 -no-ntt 5e3 1e4-1e6 
==1505== Memcheck, a memory error detector
==1505== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1505== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==1505== Command: ./lt-ecm -pm1 -no-ntt 5e3 1e4-1e6
==1505== 
GMP-ECM 7.0.3 [configured with GMP 6.1.1, --enable-openmp] [P-1]
Input number is (2^1009-1)/3454817 (298 digits)
Using B1=5000, B2=9972-1389888, polynomial x^1, x0=1943729787
Step 1 took 317ms
==1505== Thread 8:
==1505== Invalid read of size 4
==1505==    at 0x49AC068: __gmpz_mul (in /usr/lib64/libgmp.so.10.3.1)
==1505==    by 0x4914973: mpres_pow_mul (mpmod.c:1100)
==1505==    by 0x4914973: __ecm_mpres_pow (mpmod.c:1216)
==1505==    by 0x4925BDB: pm1_sequence_g._omp_fn.5 (pm1fs2.c:2066)
==1505==    by 0x48D5DD3: ??? (in /usr/lib64/libgomp.so.1.0.0)
==1505==    by 0x4B1708B: start_thread (in /usr/lib64/libpthread-2.23.so)
==1505==    by 0x4C2CCDF: thread_start (in /usr/lib64/libc-2.23.so)
==1505==  Address 0x509ab04 is 4 bytes after a block of size 16 alloc'd
==1505==    at 0x4873D4C: malloc (vg_replace_malloc.c:299)
==1505==    by 0x4914533: __ecm_mpres_pow (mpmod.c:1159)
==1505==    by 0x4925BDB: pm1_sequence_g._omp_fn.5 (pm1fs2.c:2066)
==1505==    by 0x48D5DD3: ??? (in /usr/lib64/libgomp.so.1.0.0)
==1505==    by 0x4B1708B: start_thread (in /usr/lib64/libpthread-2.23.so)
==1505==    by 0x4C2CCDF: thread_start (in /usr/lib64/libc-2.23.so)
==1505== 
Step 2 took 2077ms
********** Factor found in step 2: 198582684439
Found prime factor of 12 digits: 198582684439
Composite cofactor ((2^1009-1)/3454817)/198582684439 has 286 digits
==1505== 
==1505== HEAP SUMMARY:
==1505==     in use at exit: 5,360 bytes in 11 blocks
==1505==   total heap usage: 19,763 allocs, 19,752 frees, 3,506,632 bytes allocated
==1505== 
==1505== LEAK SUMMARY:
==1505==    definitely lost: 0 bytes in 0 blocks
==1505==    indirectly lost: 0 bytes in 0 blocks
==1505==      possibly lost: 2,016 bytes in 7 blocks
==1505==    still reachable: 3,344 bytes in 4 blocks
==1505==         suppressed: 0 bytes in 0 blocks
==1505== Rerun with --leak-check=full to see details of leaked memory
==1505== 
==1505== For counts of detected and suppressed errors, rerun with: -v
==1505== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
rjones@mustang:~/d/fedora/gmp-ecm/master/ecm-7.0.3/.libs$
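
One possible reading of the Memcheck report, offered as an assumption rather than a confirmed diagnosis: in GMP, an mpz_t is a struct of two ints (_mp_alloc, _mp_size) followed by a limb pointer, which occupies 16 bytes on a 64-bit platform such as aarch64. A 16-byte block would then hold exactly one mpz_t, and a 4-byte read located 4 bytes past its end would land on the _mp_size field of a second array element that was never allocated. The sketch below models that layout with ctypes to show the arithmetic. The names MpzStruct and locate are hypothetical, and the actual fix is not shown in this report.

```python
import ctypes


class MpzStruct(ctypes.Structure):
    """Model of GMP's __mpz_struct on an LP64 platform (assumed layout)."""
    _fields_ = [
        ("_mp_alloc", ctypes.c_int32),   # number of limbs allocated
        ("_mp_size", ctypes.c_int32),    # signed number of limbs in use
        ("_mp_d", ctypes.c_uint64),      # stand-in for the 64-bit limb pointer
    ]


def locate(block_size: int, bytes_after_block: int):
    """Map an address past a malloc'd block to (array index, field name)."""
    elem = ctypes.sizeof(MpzStruct)
    offset = block_size + bytes_after_block   # offset from the block start
    index, within = divmod(offset, elem)
    for name, _ctype in MpzStruct._fields_:
        field = getattr(MpzStruct, name)
        if field.offset <= within < field.offset + field.size:
            return index, name
    return index, None


print(ctypes.sizeof(MpzStruct))   # element size in bytes
print(locate(16, 4))              # values from "4 bytes after a block of size 16"
```

If this reading is right, the invalid access is an index one past the end of a one-element mpz_t array allocated at mpmod.c:1159, which would fit the B[w/2] operand visible in gdb frame 6 of comment 2.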

Comment 5 Jerry James 2016-08-24 02:50:01 UTC
Aha, that's a great clue!  Thanks, Richard.  I appreciate the help.

Comment 8 Richard W.M. Jones 2016-08-30 07:50:39 UTC
The patch fixes the problem for me locally, and for that reason
I pushed it to Rawhide.  The primary arch build is:
http://koji.fedoraproject.org/koji/taskinfo?taskID=15435760

Hopefully it will also build OK on arm.koji.

Comment 9 Yaakov Selkowitz 2016-08-30 18:20:19 UTC
(In reply to Richard W.M. Jones from comment #8)
> The patch fixes the problem for me locally, and for that reason
> I pushed it to Rawhide.  The primary arch build is:
> http://koji.fedoraproject.org/koji/taskinfo?taskID=15435760
> 
> Hopefully it will also build OK on arm.koji.

It did: http://arm.koji.fedoraproject.org/koji/buildinfo?buildID=398799

Can we get this into at least F25 as well?

Comment 10 Richard W.M. Jones 2016-08-30 20:08:39 UTC
For sure.  I also did F24 since that contains the same new release.
http://koji.fedoraproject.org/koji/taskinfo?taskID=15439500
http://koji.fedoraproject.org/koji/taskinfo?taskID=15439502

Comment 11 Fedora Update System 2016-08-30 20:31:47 UTC
gmp-ecm-7.0.3-2.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-10fe5e7412

Comment 12 Fedora Update System 2016-08-30 20:32:07 UTC
gmp-ecm-7.0.3-2.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-14ada97cdd

Comment 13 Jerry James 2016-08-31 02:33:39 UTC
Thanks for doing those builds, Richard.  Much appreciated.

Comment 14 Fedora Update System 2016-08-31 16:23:29 UTC
gmp-ecm-7.0.3-2.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-10fe5e7412

Comment 15 Fedora Update System 2016-10-13 04:52:21 UTC
gmp-ecm-7.0.4-1.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-7c407e0ffc

Comment 16 Fedora Update System 2016-10-13 05:52:45 UTC
gmp-ecm-7.0.4-1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-b25a63cc73

Comment 17 Fedora Update System 2016-10-19 17:27:36 UTC
gmp-ecm-7.0.4-1.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 18 Fedora Update System 2016-10-22 07:55:13 UTC
gmp-ecm-7.0.4-1.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.