Bug 338971 - rpmbuild seg faults
Summary: rpmbuild seg faults
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: rpm
Version: 6
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Panu Matilainen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: bzcl34nup
Depends On:
Blocks:
 
Reported: 2007-10-18 23:19 UTC by Gary Thomas
Modified: 2008-04-04 21:36 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-04-04 21:36:49 UTC
Type: ---
Embargoed:



Description Gary Thomas 2007-10-18 23:19:30 UTC
Description of problem:

Running rpmbuild gets a segmentation fault (during packaging)

Version-Release number of selected component (if applicable):

Anything later than FC5

How reproducible:

I've tried it on virgin & updated FC6 and FC7 systems with same result.
Building the identical RPM (rebuild using a working SRPM) works fine
on FC5.

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

This is an internal RPM package which I've been working with and
updating for years.  It (and previous revisions) has always built
with no trouble on older systems (FC5 and earlier).  Now that I've
updated to FC7, it fails consistently.

Note: the SRPM is quite large and takes hours to rebuild.  It's also
not the prettiest thing (I've had some problems getting it to play
nice with 'buildroot' and hence it *will* destroy the contents of
/opt/amltd)

The RPM package is at ftp://ftp.mlbassoc.com/private/amltd_tools-ppc-2-3.src.rpm

Comment 1 Gary Thomas 2007-10-19 10:36:59 UTC
Note: the build fails if you select just one of the 'uclibc' or 'glibc' packages.
It succeeds (no seg fault) if you select the eabi package.

Comment 2 Panu Matilainen 2007-10-20 08:01:18 UTC
Please get a backtrace of the crash (http://fedoraproject.org/wiki/StackTraces):
- install rpm-debuginfo
- run the build under gdb:
   $ gdb rpmbuild
   (gdb) run -bb amltd_tools.spec    (or whatever arguments you use for build)
..and when it crashes
   (gdb) bt
..and copy-paste the entire trace here.

Meanwhile, I have my suspicions: disabling the internal dependency generator
*might* help the build complete. Even if so, there's a bug that needs
fixing...

Comment 3 Gary Thomas 2007-10-20 12:54:19 UTC
Using: rpm-debuginfo-4.4.2.1-1.fc7.i386.rpm
       elfutils-debuginfo-0.129-1.fc7.i386.rpm

   ...
Processing files: amltd_tools-ppc-linux-uclibc-2-3

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208367392 (LWP 21236)]
0x0076f734 in elf32_offscn (elf=0x9d30ec8, offset=20336) at elf32_offscn.c:89
89              if (runp->data[i].shdr.ELFW(e,LIBELFBITS)->sh_offset == offset)
(gdb) bt
#0  0x0076f734 in elf32_offscn (elf=0x9d30ec8, offset=20336) at elf32_offscn.c:89
#1  0x0076f8a7 in gelf_offscn (elf=0x9d30f70, offset=707926488598446080) at gelf_offscn.c:74
#2  0x008ea7c4 in rpmfcSYMLINK (fc=0x9c39438) at rpmfc.c:701
#3  0x008eaac5 in rpmfcApply (fc=0x9c39438) at rpmfc.c:1238
#4  0x008ec317 in rpmfcGenerateDepends (spec=0x94b9048, pkg=0x94d2ac0) at rpmfc.c:1734
#5  0x008ddafa in processBinaryFiles (spec=0x94b9048, installSpecialDoc=4, test=0) at files.c:2509
#6  0x008d7811 in buildSpec (ts=0x94b8e08, spec=0x94b9048, what=<value optimized out>, test=0) at build.c:333
#7  0x0804a5fb in ?? ()
#8  0x0804a94b in ?? ()
#9  0x0804b743 in ?? ()
#10 0x005bdf70 in __libc_start_main () from /lib/libc.so.6
#11 0x08049b11 in ?? ()


Comment 4 Gary Thomas 2007-10-20 13:41:51 UTC
Note: this package builds many cross-compiled ELF libraries (it is building GLIBC
or uClibC at this point) and I surmise that it's the scanning of these on the x86
based build/development system that's causing grief.

Comment 5 Panu Matilainen 2007-10-22 05:26:43 UTC
Thanks, that proves my suspicions about the devel-symlink autodependency patch
being the problem (so this is only an issue in Fedora, not rpm upstream). Adding
"%define _use_internal_dependency_generator 0" ought to get you a package that
builds until the actual problem is fixed.

BTW what's the exact setting where this happens (wrt comment #1), like this or...?
%define _build_powerpc_linux_glibc  0
%define _build_powerpc_eabi         0
%define _build_powerpc_linux_uclibc 1


Comment 6 Gary Thomas 2007-10-22 14:09:50 UTC
Turning off the internal dependency generator does work.  I look forward to
the real solution (as opposed to the workaround, which I appreciate).

As for what it takes to make it fail, at least one of glibc or uclibc needs
to be built.

Note: the build for uclibc fails on an AMD/64 box I have, but runs fine on my
other boxes (AMD/32 as well as Intel/32).  I'll be investigating this separately.

Comment 7 Gary Thomas 2007-10-24 11:59:46 UTC
Sadly, I spoke too soon.  While disabling the internal dependency generator does
allow the package to be built, the resulting RPM cannot be installed (even on
the system where it was built!!)

[root@saturn FC7]# rpm -ivh amltd_tools-ppc-linux-2-3.i386.rpm 
error: Failed dependencies:
        nscd < 2.3.3-52 conflicts with glibc-2.6-4.i686


Comment 8 Mark Salter 2007-12-05 15:27:55 UTC
Hmm, this looks like a problem I ran into yesterday. In that case, I was
packaging a big-endian library on a little-endian host. The segfault I saw
happened at the same spot in libelf because
runp->data[i].shdr.ELFW(e,LIBELFBITS) was NULL. RPM is trying to find the soname
of the library by digging around in the elf file. When the endianness of the elf
is different from the host, libelf appears to defer filling in all of the
section header info (probably to avoid unnecessary copying and swab'ing of the
fields). It's not clear to me if this is an elfutils problem or if rpm is using
libelf incorrectly.
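
The byte-order mismatch described above can be checked from the first few bytes of a file, without libelf. The following is only a minimal sketch, not code from rpm or elfutils: the function and enum names are made up for illustration. It relies on the ELF header layout, where byte 5 of the identification block (EI_DATA) is 1 for little-endian and 2 for big-endian files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-order values as stored in e_ident[EI_DATA] (index 5) of an ELF header.
 * The enum and function names below are illustrative, not part of any library. */
enum elf_order { ELF_ORDER_INVALID = 0, ELF_ORDER_LSB = 1, ELF_ORDER_MSB = 2 };

/* Classify a buffer holding the first bytes of a file. */
enum elf_order elf_order_of(const unsigned char *ident, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    assert(ident != NULL);
    if (len < 6 || memcmp(ident, magic, sizeof magic) != 0)
        return ELF_ORDER_INVALID;          /* too short, or not an ELF file */
    if (ident[5] == 1)
        return ELF_ORDER_LSB;              /* ELFDATA2LSB */
    if (ident[5] == 2)
        return ELF_ORDER_MSB;              /* ELFDATA2MSB */
    return ELF_ORDER_INVALID;
}

/* Detect the byte order of the machine running this code. */
enum elf_order host_order(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);             /* lowest-addressed byte of the probe */
    return first == 1 ? ELF_ORDER_LSB : ELF_ORDER_MSB;
}

/* 1 if the file's byte order differs from the host's, 0 if it matches,
 * -1 if the buffer does not look like an ELF header. */
int elf_is_foreign_endian(const unsigned char *ident, size_t len)
{
    const enum elf_order order = elf_order_of(ident, len);

    if (order == ELF_ORDER_INVALID)
        return -1;
    return order != host_order();
}
```

Running such a check over the cross-compiled libraries in the build root would show which files take the foreign-endian path on an x86 build host. It does not settle whether the fault lies in elfutils or in rpm.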


Comment 9 Mark Salter 2007-12-06 14:34:45 UTC
Here's a patch that works around the problem for me. It avoids using gelf_offscn
which seems to be the real culprit. But it isn't entirely clear to me if
gelf_offscn was wrong or if rpm was using it incorrectly.

diff -rup rpm-4.4.2.orig/build/rpmfc.c rpm-4.4.2/build/rpmfc.c
--- rpm-4.4.2.orig/build/rpmfc.c	2007-10-14 15:36:22.000000000 -0400
+++ rpm-4.4.2/build/rpmfc.c	2007-12-05 19:43:21.000000000 -0500
@@ -693,13 +693,16 @@ static int rpmfcSYMLINK(rpmfc fc)
       GElf_Shdr shdr_mem;
       Elf_Data * data = NULL;
-      Elf_Scn * scn;
+      Elf_Scn * scn = NULL;
-      GElf_Shdr *shdr;
+      GElf_Shdr *shdr = NULL;
 
       if (phdr == NULL || phdr->p_type != PT_DYNAMIC)
           continue;
 
-      scn = gelf_offscn(elf, phdr->p_offset);
-      shdr = gelf_getshdr(scn, &shdr_mem);
+      while ((scn = elf_nextscn(elf, scn)) != NULL) {
+	  shdr = gelf_getshdr(scn, &shdr_mem);
+	  if (shdr->sh_offset == phdr->p_offset)
+	      break;
+      }
 
-      if (shdr != NULL && shdr->sh_type == SHT_DYNAMIC)
+      if (scn != NULL && shdr != NULL && shdr->sh_type == SHT_DYNAMIC)
           data = elf_getdata (scn, NULL);
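
One thing to note about the loop in this patch: it dereferences shdr right after gelf_getshdr(), which returns NULL when the header cannot be read. The sketch below is a toy model of the same linear scan with that guard added. It is an assumption-laden illustration, not libelf code: struct toy_shdr, struct toy_scn and toy_find_section_at() are hypothetical stand-ins so that the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a section header and a section handle.
 * These are NOT libelf types; they only model the fields the scan needs. */
struct toy_shdr {
    uint32_t sh_type;
    uint64_t sh_offset;
};

struct toy_scn {
    const struct toy_shdr *shdr;   /* NULL models a header that could not be read */
};

#define TOY_SHT_DYNAMIC 6u         /* same numeric value as SHT_DYNAMIC in the ELF spec */

/* Return the index of the first section whose header is readable and whose
 * file offset equals `offset`, or -1 when no such section exists. */
long toy_find_section_at(const struct toy_scn *scns, size_t count, uint64_t offset)
{
    assert(scns != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        const struct toy_shdr *shdr = scns[i].shdr;

        if (shdr == NULL)
            continue;              /* skip unreadable headers instead of dereferencing */
        if (shdr->sh_offset == offset)
            return (long)i;
    }
    return -1;
}
```

In terms of the patch, the equivalent guard would be a NULL test on shdr before the sh_offset comparison inside the while loop. Whether that case can occur for the files in this report is not established here.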


Comment 10 Panu Matilainen 2008-01-04 08:07:32 UTC
I put the above fix into rawhide now (rpm-4.4.2.2-12.fc9). Gary, can you check
whether it fixes your problem too?


Comment 11 Gary Thomas 2008-01-07 19:25:02 UTC
Yes, this now works.  Thanks

Comment 12 Panu Matilainen 2008-01-08 07:35:14 UTC
Ok, thanks for testing. I'll add this patch to F7/F8 rpm on next update.

Comment 13 Bug Zapper 2008-04-04 07:38:46 UTC
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers

Comment 14 Gary Thomas 2008-04-04 11:02:09 UTC
I've moved on to Fedora 7+ - close at will

Comment 15 John Poelstra 2008-04-04 21:36:49 UTC
Thanks for your update

