Bug 1104802

Summary: gp segfault
Product: [Fedora] Fedora Reporter: Jerry James <loganjerry>
Component: pari Assignee: Paul Howarth <paul>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: rawhide CC: han, paul, paulo.cesar.pereira.de.andrade, tremble
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
URL: http://pari.math.u-bordeaux.fr/cgi-bin/bugreport.cgi?bug=1589
Whiteboard:
Fixed In Version: pari-2.7.1-4.fc21 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-07-07 14:32:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jerry James 2014-06-04 17:45:21 UTC
Description of problem:
On Fedora 20 (pari 2.5.5):

gp > ellpow([0,1,0,2,-15],[2,1],5)
%2 = [37247908142/10128208321, 7601802384416381/1019292757217119]

On Rawhide (pari 2.7.1):

? ellpow([0,1,0,2,-15],[2,1],5)
  ***   at top-level: ellpow([0,1,0,2,-15]
  ***                 ^--------------------
  *** ellpow: bug in PARI/GP (Segmentation Fault), please report.
  ***   Break loop: type 'break' to go back to GP prompt
break> 

The same result is obtained if the new "ellmul" name is used in place of the deprecated "ellpow" name in 2.7.1.  Valgrind says:

==25692== Syscall param sendmsg(msg.msg_name) points to uninitialised byte(s)
==25692==    at 0x5244E60: __sendmsg_nocancel (syscall-template.S:81)
==25692==    by 0x3439216246: readline (readline.c:346)
==25692==    by 0x40F8F4: gprl_input (gp_rl.c:837)
==25692==    by 0x3419FAB044: input_loop (es.c:355)
==25692==    by 0x410F12: get_line_from_readline (gp_rl.c:872)
==25692==    by 0x40B9F4: gp_read_line (gp.c:1555)
==25692==    by 0x40D4E2: gp_main_loop (gp.c:1622)
==25692==    by 0x4089A6: main (gp.c:2252)
==25692==  Address 0xffefffae2 is on thread 1's stack
==25692== 
==25692== Invalid read of size 8
==25692==    at 0x3419E5FEA8: ellmul_Z (elliptic.c:109)
==25692==    by 0x3419E7169A: ellmul (elliptic.c:1545)
==25692==    by 0x3419FB81A7: closure_eval (eval.c:1209)
==25692==    by 0x3419FB9CA4: closure_evalres (eval.c:543)
==25692==    by 0x40D596: gp_main_loop (gp.c:1637)
==25692==    by 0x4089A6: main (gp.c:2252)
==25692==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
==25692== 

Version-Release number of selected component (if applicable):
pari-gp-2.7.1-2.fc21.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Run gp
2. Enter "ellpow([0,1,0,2,-15],[2,1],5)"

Actual results:
Failure due to a segfault.

Expected results:
The answer.

Additional info:

Comment 1 Jerry James 2014-06-04 21:22:46 UTC
The problem function is this one:

static int
ell_over_Fq(GEN E)
{
  long t = ell_get_type(E);
  return t==t_ELL_Fp || t==t_ELL_Fq;
}

The value of t is indeed 0x10 when the crash occurs, so somehow t is treated as a pointer and dereferenced.

Comment 2 Jerry James 2014-06-13 21:19:45 UTC
I reduced this to the following nonsense code, which exhibits the same behavior:

#include <stdio.h>
#include <stdlib.h>

inline static long ell_get_type(long *e) { return ((long **)e)[14][1]; }

static int
ell_over_Fq(long *E)
{
  long t = ell_get_type(E);
  return t==3 || t==4;
}

long *
ellmul(long *e)
{
  if (ell_over_Fq(e))
    {
      if ((((unsigned long)(e[0])) >> (__WORDSIZE - 7)) == 5U)
	return &e[1];
      else
	return e;
    }
  return e;
}

int
main ()
{
  long gen_0[2] = { 0x0200000000000002, 2L };
  long e[2] = { 0x2200000000000002, (long) gen_0 };

  ellmul(e);
  puts ("Successful completion");
  return EXIT_SUCCESS;
}

But this is bogus, because e, to which ell_get_type() is applied, has only 2 elements, not 14.  The real problem is that we're walking off the end of an array, which manifests as the strange error where the long int t appears to be treated as a pointer.  That may be a gcc bug, but there is a pari bug here, too: for the input given above, the real pari ell_get_type() function is called on an array of length 6.

Note that if 14 is changed to 1 in this example, the program completes successfully.
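
For illustration only, here is a minimal sketch of the kind of length check that would turn the out-of-bounds read into a reportable error. It is not the upstream fix. The header layout (type tag in the top 7 bits of word 0, length in words in the low 56 bits) is an assumption modeled on the reduced test case, and the names TYPE_SHIFT, LEN_MASK, ELL_TYPE_SLOT, obj_len, ell_get_type_checked and demo are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header layout for this sketch (64-bit words): the top 7 bits of
 * word 0 hold a type tag, the low 56 bits hold the object's length in
 * words, header included.  These macros are illustrative, not pari's. */
#define TYPE_SHIFT    57
#define LEN_MASK      ((UINT64_C(1) << 56) - 1)
#define ELL_TYPE_SLOT 14  /* index read by ell_get_type() in the test case */

/* Length in words, taken from the header word. */
static size_t
obj_len(const uint64_t *x)
{
  return (size_t)(x[0] & LEN_MASK);
}

/* Hypothetical guarded accessor.  Stores the type in *t and returns 0 on
 * success; returns -1, leaving *t untouched, if e is too short to have a
 * slot 14 or if that slot does not point at an object of at least 2 words. */
static int
ell_get_type_checked(const uint64_t *e, long *t)
{
  const uint64_t *slot;

  if (obj_len(e) <= ELL_TYPE_SLOT)
    return -1;
  slot = (const uint64_t *)(uintptr_t)e[ELL_TYPE_SLOT];
  if (slot == NULL || obj_len(slot) < 2)
    return -1;
  *t = (long)slot[1];
  return 0;
}

/* Usage: the 2-word object from the reduced test case is rejected
 * instead of being indexed at slot 14. */
static void
demo(void)
{
  uint64_t gen_0[2] = { UINT64_C(0x0200000000000002), 2 };
  uint64_t e[2] = { UINT64_C(0x2200000000000002), 0 };
  long t = -1;

  e[1] = (uint64_t)(uintptr_t)gen_0;
  assert(ell_get_type_checked(e, &t) == -1);
  assert(t == -1);
}
```

With a check like this, a caller could raise a type error for the 2-element (or, in the real case, 6-element) array rather than reading past its end.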

Comment 3 Paul Howarth 2014-06-24 15:35:00 UTC
Upstream says:

 "The correct syntax for both versions is
  E = ellinit([0,1,0,2,-15]);
  ellpow(E,[2,1],5)

  Of course, it should report an error instead of crashing."

Comment 4 Paul Howarth 2014-07-07 14:32:48 UTC
Upstream fix that will be in pari 2.7.2 is included in pari-2.7.1-4.fc21, resolving this issue.

Code that relies on the original syntax must be changed to the preferred form from Comment #3 if it needs to work with pari 2.8 onwards, which will treat the original syntax as an error.