Bug 201739

Summary: Macaulay2: ppc build hangs, never finishes
Product: [Fedora] Fedora
Reporter: Rex Dieter <rdieter>
Component: Macaulay2
Assignee: Rex Dieter <rdieter>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: medium
Version: rawhide
CC: dwmw2, extras-qa
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-01-07 01:08:09 UTC Type: ---
Bug Depends On:    
Bug Blocks: 179260, 213321    
Attachments:
Description: fix endless loop
Flags: none

Description Rex Dieter 2006-08-08 17:35:39 UTC
Summary says it all... the i386 and x86_64 builds finish fine, but the ppc build never does.

13896: Macaulay2-0.9.20-0.2.20060808svn.fc6  (finished)
Target:	fedora-development-extras
Submitter:	rdieter math unl edu
Source:	Macaulay2-0_9_20-0_2_20060808svn_fc6
Started:	Tue Aug 8 11:54:54 2006
Ended:	Tue Aug 8 13:41:10 2006 (ran for 106 minutes)
Logs:
http://buildsys.fedoraproject.org/logs/fedora-development-extras/13896-Macaulay2-0.9.20-0.2.20060808svn.fc6/
Result:	"

13896 (Macaulay2): Build on target fedora-development-extras was killed by
rdieter.edu.
i386:   hammer3.fedora.redhat.com  Status: done/done      Build Time: 19 minutes
x86_64: hammer2.fedora.redhat.com  Status: done/done      Build Time: 20 minutes
ppc:    ppc2.fedora.redhat.com     Status: done/building  (106 minutes and going)

Comment 1 David Woodhouse 2006-12-28 11:14:30 UTC
I assume this bug is just a placeholder while you debug the issue? What process
is stuck and what is it doing?

The Logs: link above seems not to work.

Comment 2 David Woodhouse 2006-12-28 12:16:17 UTC
As I'm sure you'll have worked out for yourself in the last four months, it's in
an endless loop re-executing itself...

execve("../bin/M2", ["../bin/M2", "-q", "--silent", "--stop", "-e",
"errorDepth=0", "./makesyms.m2", "-e", "exit 0"], [/* 67 vars */]) = 0

Did you make any progress investigating _why_ it's doing that?

Comment 3 David Woodhouse 2006-12-28 12:18:14 UTC
(gdb) break execve
Function "execve" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (execve) pending.
(gdb) run
Starting program:
/home/dwmw2/working/extras/Macaulay2/devel/Macaulay2-0.9.95/Macaulay2/bin/M2 
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 268388672 (LWP 5864)]
Breakpoint 2 at 0xf519fa8: file ../sysdeps/unix/sysv/linux/execve.c, line 32.
Pending breakpoint "execve" resolved
[Switching to Thread 268388672 (LWP 5864)]

Breakpoint 2, __execve (
    file=0xfc56f788
"/home/dwmw2/working/extras/Macaulay2/devel/Macaulay2-0.9.95/Macaulay2/bin/M2",
argv=0xfc56f5e4, envp=0xfc56f5ec)
    at ../sysdeps/unix/sysv/linux/execve.c:60
60        return INLINE_SYSCALL (execve, 3, file, argv, envp);
(gdb) bt
#0  __execve (
    file=0xfc56f788
"/home/dwmw2/working/extras/Macaulay2/devel/Macaulay2-0.9.95/Macaulay2/bin/M2",
argv=0xfc56f5e4, envp=0xfc56f5ec)
    at ../sysdeps/unix/sysv/linux/execve.c:60
#1  0x0f51a674 in *__GI_execvp (
    file=0xfc56f788
"/home/dwmw2/working/extras/Macaulay2/devel/Macaulay2-0.9.95/Macaulay2/bin/M2",
argv=0xfc56f5e4) at execvp.c:75
#2  0x1000a5e4 in ?? ()
#3  0x1000b7c0 in __gmp_default_allocate ()
#4  0x0f48dd4c in generic_start_main (
    main=0x1000b7b0 <__gmp_default_allocate+1440>, argc=1, ubp_av=0xfc56f5e4, 
    auxvec=0xfc56f6c8, init=<value optimized out>, fini=<value optimized out>, 
    rtld_fini=<value optimized out>, stack_end=<value optimized out>)
    at ../csu/libc-start.c:231
#5  0x0f48df74 in __libc_start_main (argc=1, ubp_av=0xfc56f5e4, 
    ubp_ev=<value optimized out>, auxvec=0xfc56f6c8, 
    rtld_fini=0xffceba0 <_dl_fini>, stinfo=0x102b0108, 
    stack_on_entry=0xfc56f5d0)
    at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:127
#6  0x00000000 in ?? ()


Comment 4 David Woodhouse 2006-12-28 12:31:40 UTC
That's more useful if we don't let it strip itself during the build (which will
also be screwing your -debuginfo packages)...

(gdb) bt
#0  __execve (
    file=0xf81bf782
"/home/dwmw2/working/extras/Macaulay2/devel/Macaulay2-0.9.95/Macaulay2/bin/M2.tmp",
argv=0xf81bf5e4, envp=0xf81bf5ec)
    at ../sysdeps/unix/sysv/linux/execve.c:60
#1  0x0f51a674 in *__GI_execvp (
    file=0xf81bf782
"/home/dwmw2/working/extras/Macaulay2/devel/Macaulay2-0.9.95/Macaulay2/bin/M2.tmp",
argv=0xf81bf5e4) at execvp.c:75
#2  0x1000a5e4 in Macaulay2_main (argc=1, argv=0xf81bf5e4) at M2lib.c:371
#3  0x1000b7c0 in main (argc=-132384894, argv=0xf81bf5e4) at main.c:7


Comment 5 David Woodhouse 2006-12-28 13:14:47 UTC
Aha. It's the code which sets the ADDR_NO_RANDOMIZE bit in the personality and
re-execs itself. That doesn't always work as expected though -- when running
32-bit processes on ppc64 and x86_64, those extra bits get lost when we set the
personality to 32-bit. So it just sets that bit and re-executes itself over and
over again. A simple fix is to add '--no-personality' to the command line when
it re-execs, to avoid the loop.

I'm not sure if we actually _need_ to turn randomisation off -- if we do, that's
probably a separate bug and should be fixed rather than worked around with a hack
like this.
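
To make the loop described above concrete, here is a minimal model of it in C. It is only a sketch, not the code from M2lib.c: the function names are hypothetical, the exec is simulated rather than performed, and SKETCH_ADDR_NO_RANDOMIZE simply repeats the value that ADDR_NO_RANDOMIZE has in the Linux <sys/personality.h> header so that the sketch compiles anywhere.

```c
#include <stdbool.h>

/* Same value as ADDR_NO_RANDOMIZE in Linux <sys/personality.h>;
 * repeated under a hypothetical name so the sketch is portable. */
#define SKETCH_ADDR_NO_RANDOMIZE 0x0040000UL

/* Hypothetical startup check: re-exec whenever the flag is not yet set. */
bool wants_reexec(unsigned long persona)
{
    return (persona & SKETCH_ADDR_NO_RANDOMIZE) == 0;
}

/* Models what the new image sees after one exec. If the kernel keeps the
 * flag, the personality is unchanged; otherwise the flag is lost. */
unsigned long persona_after_exec(unsigned long persona, bool kernel_keeps_flag)
{
    return kernel_keeps_flag ? persona
                             : (persona & ~SKETCH_ADDR_NO_RANDOMIZE);
}

/* Counts how many times the program re-executes itself before the startup
 * check passes, giving up after max_execs simulated execs. */
int count_reexecs(bool kernel_keeps_flag, int max_execs)
{
    unsigned long persona = 0;
    int execs = 0;

    while (wants_reexec(persona) && execs < max_execs) {
        persona |= SKETCH_ADDR_NO_RANDOMIZE;  /* stands in for personality() */
        persona = persona_after_exec(persona, kernel_keeps_flag); /* execvp() */
        execs++;
    }
    return execs;
}
```

When the flag survives the exec, count_reexecs() stops after a single re-exec. When it is dropped, the check never passes and only the max_execs cap ends the loop, which matches the hang seen on the ppc builder.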

Comment 6 David Woodhouse 2006-12-28 13:18:50 UTC
Created attachment 144462 [details]
fix endless loop

This adds '--no-personality' to the command line when re-executing, so we don't
get stuck in an endless loop when running 32-bit M2 on 64-bit machines.
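
The attachment itself is not reproduced in this report, so the following is only a sketch of the approach described above, under stated assumptions. The helper names are hypothetical, and only the '--no-personality' option comes from the comment; how M2 actually parses it is not shown here.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Marker option named in the comment above; kept in a writable array so it
 * can be stored in a char *argv[] without a const cast. */
static char no_personality_flag[] = "--no-personality";

/* True if argv already carries the marker, i.e. this is the re-exec'd copy
 * and the personality step must be skipped. */
bool has_no_personality(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], no_personality_flag) == 0)
            return true;
    }
    return false;
}

/* Builds a NULL-terminated copy of argv with the marker appended.
 * The caller frees the returned array (not the strings it points to).
 * Returns NULL if allocation fails. */
char **argv_with_no_personality(int argc, char **argv)
{
    char **out = malloc(((size_t)argc + 2) * sizeof *out);
    if (out == NULL)
        return NULL;

    for (int i = 0; i < argc; i++)
        out[i] = argv[i];
    out[argc] = no_personality_flag;
    out[argc + 1] = NULL;
    return out;
}
```

At startup the program would set the personality and call execvp() with the extended argument vector only when has_no_personality() returns false. The re-exec'd copy then sees the marker and carries on, whether or not the kernel kept the flag.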

I have to say I'm not massively impressed that the maintainer didn't manage to
fix this fairly simple and obvious bug in over four months. Fedora Extras is
not somewhere to just dump packages and forget them; we need active
maintenance.

Comment 7 David Woodhouse 2006-12-28 13:24:55 UTC
The lack of inheritance of personality is arguably a kernel bug; bug #220892

Comment 8 Rex Dieter 2006-12-28 20:17:56 UTC
David, thanks so much for the adept detective-work.

Comment 9 David Woodhouse 2006-12-30 14:21:01 UTC
Should I add this patch to FC-5, FC-6 and devel packages or will you do it?

We should _also_ fix the kernel, but M2 shouldn't fail like this even if the
kernel doesn't let it inherit the personality flags over exec.

Comment 10 Rex Dieter 2006-12-30 15:12:36 UTC
David, I won't have a chance to get to this for 1-2 weeks, so you certainly have
my blessing to patch in the meantime.

Comment 11 Rex Dieter 2006-12-31 03:08:14 UTC
Never mind, I *should* be able to find the requisite few moments within the next
few days to incorporate the patch.

Comment 12 Rex Dieter 2007-01-06 22:51:27 UTC
%changelog
* Sat Jan 06 2007 Rex Dieter <rdieter[AT]fedoraproject.org> 0.9.95-3
- re-enable ppc build (#201739)

Cross your fingers, patched build queued/building now:
http://buildsys.fedoraproject.org/build-status/job.psp?uid=25172

Comment 13 Rex Dieter 2007-01-06 23:02:30 UTC
Re: comment #7
It just occurs to me that if personality isn't inherited, several other
personality-modifying/disabling apps will see breakage (mostly the Extras lisps,
as far as I'm aware), including gcl, sbcl, and cmucl.

Comment 14 Rex Dieter 2007-01-07 01:08:09 UTC
Builds completed, closing.