Description of problem:
Running rpmbuild gets a segmentation fault (during packaging).

Version-Release number of selected component (if applicable):
Anything later than FC5

How reproducible:
I've tried it on virgin & updated FC6 and FC7 systems with the same result. Building the identical RPM (rebuild using a working SRPM) works fine on FC5.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
This is an internal RPM package which I've been working with and updating for years. It (and previous revisions) has always built with no trouble on older systems (FC5 and earlier). Now that I've updated to FC7, it fails consistently.

Note: the SRPM is quite large and takes hours to rebuild. It's also not the prettiest thing (I've had some problems getting it to play nice with 'buildroot' and hence it *will* destroy the contents of /opt/amltd).

The RPM package is at ftp://ftp.mlbassoc.com/private/amltd_tools-ppc-2-3.src.rpm
Note: the build fails if you select just one of the 'uclibc' or 'glibc' packages. It succeeds (no seg fault) if you select the eabi package.
Please get a backtrace of the crash (http://fedoraproject.org/wiki/StackTraces):
- install rpm-debuginfo
- run the build under gdb:
  $ gdb rpmbuild
  (gdb) run -bb amltd_tools.spec
  (or whatever arguments you use for build)
  ..and when it crashes
  (gdb) bt
  ..and copy-paste the entire trace here.

Meanwhile, I have my suspicions: disabling the internal dependency generator *might* help the build complete. Even if so, there's a bug that needs fixing...
Using:
  rpm-debuginfo-4.4.2.1-1.fc7.i386.rpm
  elfutils-debuginfo-0.129-1.fc7.i386.rpm

...
Processing files: amltd_tools-ppc-linux-uclibc-2-3

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208367392 (LWP 21236)]
0x0076f734 in elf32_offscn (elf=0x9d30ec8, offset=20336) at elf32_offscn.c:89
89 if (runp->data[i].shdr.ELFW(e,LIBELFBITS)->sh_offset == offset)
(gdb) bt
#0 0x0076f734 in elf32_offscn (elf=0x9d30ec8, offset=20336) at elf32_offscn.c:89
#1 0x0076f8a7 in gelf_offscn (elf=0x9d30f70, offset=707926488598446080) at gelf_offscn.c:74
#2 0x008ea7c4 in rpmfcSYMLINK (fc=0x9c39438) at rpmfc.c:701
#3 0x008eaac5 in rpmfcApply (fc=0x9c39438) at rpmfc.c:1238
#4 0x008ec317 in rpmfcGenerateDepends (spec=0x94b9048, pkg=0x94d2ac0) at rpmfc.c:1734
#5 0x008ddafa in processBinaryFiles (spec=0x94b9048, installSpecialDoc=4, test=0) at files.c:2509
#6 0x008d7811 in buildSpec (ts=0x94b8e08, spec=0x94b9048, what=<value optimized out>, test=0) at build.c:333
#7 0x0804a5fb in ?? ()
#8 0x0804a94b in ?? ()
#9 0x0804b743 in ?? ()
#10 0x005bdf70 in __libc_start_main () from /lib/libc.so.6
#11 0x08049b11 in ?? ()
Note: this package builds many cross-compiled ELF libraries (it is building glibc or uClibc at this point), and I surmise that scanning these on the x86-based build/development system is what's causing grief.
Thanks, that proves my suspicions about the devel-symlink autodependency patch being the problem (so this is only an issue in Fedora, not rpm upstream). Adding "%define _use_internal_dependency_generator 0" ought to get you a package that builds until the actual problem is fixed.

BTW what's the exact setting where this happens (wrt comment #1), like this or...?
  %define _build_powerpc_linux_glibc 0
  %define _build_powerpc_eabi 0
  %define _build_powerpc_linux_uclibc 1
Turning off the internal dependency generator does work. I look forward to the real solution (as opposed to the workaround, which I appreciate).

As for what it takes to make it fail, at least one of glibc or uclibc needs to be built.

Note: the build for uclibc fails on an AMD/64 box I have, but runs fine on my other boxes (AMD/32 as well as Intel/32). I'll be investigating this separately.
Sadly, I spoke too soon. While disabling the internal dependency generator does allow the package to be built, the resulting RPM cannot be installed (even on the system where it was built!!)

[root@saturn FC7]# rpm -ivh amltd_tools-ppc-linux-2-3.i386.rpm
error: Failed dependencies:
        nscd < 2.3.3-52 conflicts with glibc-2.6-4.i686
Hmm, this looks like a problem I ran into yesterday. In that case, I was packaging a big-endian library on a little-endian host. The segfault I saw happened at the same spot in libelf because runp->data[i].shdr.ELFW(e,LIBELFBITS) was NULL.

RPM is trying to find the soname of the library by digging around in the ELF file. When the endianness of the ELF file is different from the host's, libelf appears to defer filling in all of the section header info (probably to avoid unnecessary copying and swab'ing of the fields). It's not clear to me if this is an elfutils problem or if rpm is using libelf incorrectly.
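To make the "foreign endianness" condition above concrete, here is a minimal sketch of how one could tell whether an ELF file's byte order differs from the build host's by looking only at its identification bytes. It is an illustration, not code from rpm or elfutils: the SKETCH_ constants and both function names are hypothetical, and the numeric values are taken from the ELF specification's e_ident layout so that the example needs no <elf.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values from the ELF specification's e_ident layout. The SKETCH_ names
 * are hypothetical; real code would use EI_DATA etc. from <elf.h>. */
enum {
    SKETCH_EI_NIDENT   = 16, /* size of the identification block */
    SKETCH_EI_DATA     = 5,  /* index of the data-encoding byte */
    SKETCH_ELFDATA2LSB = 1,  /* two's complement, little-endian */
    SKETCH_ELFDATA2MSB = 2   /* two's complement, big-endian */
};

/* Return 1 if this host stores integers big-endian, 0 otherwise. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    return *(const unsigned char *)&probe == 0x01;
}

/*
 * Return 1 if the ELF byte order differs from the host's, 0 if it matches,
 * and -1 if the buffer is not a recognizable ELF identification block.
 */
static int elf_is_foreign_endian(const unsigned char *ident, size_t len)
{
    assert(ident != NULL);

    if (len < SKETCH_EI_NIDENT ||
        ident[0] != 0x7f || ident[1] != 'E' ||
        ident[2] != 'L' || ident[3] != 'F')
        return -1;

    switch (ident[SKETCH_EI_DATA]) {
    case SKETCH_ELFDATA2LSB:
        return host_is_big_endian() ? 1 : 0;
    case SKETCH_ELFDATA2MSB:
        return host_is_big_endian() ? 0 : 1;
    default:
        return -1; /* unknown or invalid data encoding */
    }
}
```

Passing the first 16 bytes of a PowerPC (big-endian) library to elf_is_foreign_endian() on an x86 host would be expected to return 1, which is the situation described in this comment.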
Here's a patch that works around the problem for me. It avoids using gelf_offscn, which seems to be the real culprit. But it isn't entirely clear to me if gelf_offscn was wrong or if rpm was using it incorrectly.

diff -rup rpm-4.4.2.orig/build/rpmfc.c rpm-4.4.2/build/rpmfc.c
--- rpm-4.4.2.orig/build/rpmfc.c 2007-10-14 15:36:22.000000000 -0400
+++ rpm-4.4.2/build/rpmfc.c 2007-12-05 19:43:21.000000000 -0500
@@ -693,13 +693,16 @@ static int rpmfcSYMLINK(rpmfc fc)
  GElf_Shdr shdr_mem;
  Elf_Data * data = NULL;
- Elf_Scn * scn;
+ Elf_Scn * scn = NULL;
- GElf_Shdr *shdr;
+ GElf_Shdr *shdr = NULL;
  if (phdr == NULL || phdr->p_type != PT_DYNAMIC)
      continue;
- scn = gelf_offscn(elf, phdr->p_offset);
- shdr = gelf_getshdr(scn, &shdr_mem);
+ while ((scn = elf_nextscn(elf, scn)) != NULL) {
+     shdr = gelf_getshdr(scn, &shdr_mem);
+     if (shdr->sh_offset == phdr->p_offset)
+         break;
+ }
- if (shdr != NULL && shdr->sh_type == SHT_DYNAMIC)
+ if (scn != NULL && shdr != NULL && shdr->sh_type == SHT_DYNAMIC)
      data = elf_getdata (scn, NULL);
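The lookup the patch substitutes for gelf_offscn can be modelled without libelf. The sketch below is a simplified stand-in, not rpm code: struct sketch_shdr, SKETCH_SHT_DYNAMIC and find_dynamic_at_offset are hypothetical names, and a plain array takes the place of iterating with elf_nextscn. It shows the same logic: scan the sections in order, stop at the first one whose file offset equals the PT_DYNAMIC segment's p_offset, and accept it only if it is a dynamic section. Returning NULL when nothing matches plays the role of the added "scn != NULL" check, since after an unsuccessful scan the header variable still holds the last section visited.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a section header; only the two fields the
 * patch inspects are modelled. */
struct sketch_shdr {
    uint64_t sh_offset; /* file offset of the section */
    uint32_t sh_type;   /* section type */
};

#define SKETCH_SHT_DYNAMIC 6u /* value of SHT_DYNAMIC in the ELF specification */

/*
 * Linear scan: return the first section located at p_offset if it is a
 * dynamic section, or NULL if no section starts there or the one that
 * does has a different type.
 */
static const struct sketch_shdr *
find_dynamic_at_offset(const struct sketch_shdr *sections, size_t count,
                       uint64_t p_offset)
{
    assert(sections != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (sections[i].sh_offset == p_offset) {
            /* Same as the patch: stop at the first offset match,
             * then check the type of that section only. */
            return sections[i].sh_type == SKETCH_SHT_DYNAMIC
                       ? &sections[i]
                       : NULL;
        }
    }
    return NULL; /* no section starts at p_offset */
}
```

The trade-off is a walk over every section header instead of a direct lookup by offset, which should be negligible next to the rest of dependency generation.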
I put the above fix into rawhide now (rpm-4.4.2.2-12.fc9). Gary, can you check whether it fixes your problem too?
Yes, this now works. Thanks
Ok, thanks for testing. I'll add this patch to F7/F8 rpm on next update.
Fedora apologizes that these issues have not been resolved yet. We're sorry it's taken so long for your bug to be properly triaged and acted on. We appreciate the time you took to report this issue and want to make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6, please note that Fedora no longer maintains these releases. We strongly encourage you to upgrade to a current Fedora release. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6 thirty days from now, it will be closed 'WONTFIX'. If you can reproduce this bug in the latest Fedora version, please change to the respective version. If you are unable to do this, please add a comment to this bug requesting the change.

Thanks for your help, and we apologize again that we haven't handled these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
to ensure this doesn't happen again.

And if you'd like to join the bug triage team to help make things better, check out http://fedoraproject.org/wiki/BugZappers
I've moved on to Fedora 7+ - close at will
Thanks for your update