Bug 210130

Summary: [RHEL4] multithreaded program crashes in do_lookup_x
Product: Red Hat Enterprise Linux 4
Reporter: Serguei Kolos <serguei.kolos>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 4.4
CC: drepper, fweimer, jan.iven, matthias.schroder
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Fixed In Version: RHBA-2007-0807
Doc Type: Bug Fix
Last Closed: 2007-11-15 16:36:44 UTC
Bug Blocks: 211133    
Attachments:
GDB session log (flags: none)

Description Serguei Kolos 2006-10-10 10:47:51 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.7) Gecko/20060915 Red Hat/1.5.0.7-0.1.el4 Firefox/1.5.0.7 pango-text

Description of problem:
My program crashes with a segmentation fault in the do_lookup_x function. A snapshot of the GDB session is attached. It looks as if the stack of do_lookup_x gets corrupted somehow. Below is the code with my comments:

file do-lookup.h:
static int
__attribute_noinline__
do_lookup_x (const char *undef_name, unsigned long int hash,
	     const ElfW(Sym) *ref, struct sym_val *result,
	     struct r_scope_elem *scope, size_t i,
	     const struct r_found_version *const version, int flags,
	     struct link_map *skip, int type_class)
{
  struct link_map **list = scope->r_list; // the scope->r_list variable is always correct - see GDB session log
  size_t n = scope->r_nlist;  // the scope->r_nlist variable is always correct - see GDB session log
  struct link_map *map;

  do
    {
      const ElfW(Sym) *symtab;
      const char *strtab;
      const ElfW(Half) *verstab;
      Elf_Symndx symidx;
      const ElfW(Sym) *sym;
      int num_versions = 0;
      const ElfW(Sym) *versioned_sym = NULL;

// The program crashes on the next line. The interesting point is that here:
// 1. list != scope->r_list
// 2. n != scope->r_nlist
      map = list[i]->l_real;


Version-Release number of selected component (if applicable):
glibc-2.3.4-2.25 glibc-debuginfo-2.3.4-2.25 glibc-2.3.4-2.25.src.rpm

How reproducible:
Sometimes


Steps to Reproduce:
Unfortunately, the program that crashes is rather complex, and I have not managed to reproduce the problem with a simple one, since it is not clear which conditions lead to this error.
My suspicion is that the issue occurs because I call dlopen and dlsym at the beginning of my main function, and this somehow clashes with symbol resolution being done in another thread by the system loader.
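
For illustration only, the suspected pattern might be sketched roughly as below. This is not taken from the crashing program and has not been verified to reproduce the bug. All names (run_dlopen_race, lookup_worker) are hypothetical, "libm.so.6" and "cos" are just an example library and symbol, and dlsym on the main-program handle is used as a stand-in for lazy binding on the assumption that both make the loader look a symbol up.

```c
#include <dlfcn.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* All names here are hypothetical; "libm.so.6" and "cos" are just a
   convenient library and symbol found on most glibc systems. */

static atomic_int keep_running;

/* Worker: keeps resolving a symbol through the loader while the main
   thread loads and unloads a library.  dlsym() on the main-program
   handle stands in for the lazy binding mentioned in the report. */
static void *lookup_worker(void *self_handle)
{
    while (atomic_load(&keep_running))
        (void)dlsym(self_handle, "getpid");
    return NULL;
}

/* Runs `iterations` dlopen/dlsym/dlclose cycles concurrently with the
   worker.  Returns the number of cycles in which the library loaded,
   or -1 if the setup failed. */
int run_dlopen_race(int iterations)
{
    pthread_t tid;
    int loaded = 0;
    void *self = dlopen(NULL, RTLD_LAZY);   /* handle for the main program */

    if (self == NULL)
        return -1;
    atomic_store(&keep_running, 1);
    if (pthread_create(&tid, NULL, lookup_worker, self) != 0)
        return -1;

    for (int i = 0; i < iterations; i++) {
        void *h = dlopen("libm.so.6", RTLD_LAZY);
        if (h != NULL) {
            (void)dlsym(h, "cos");
            dlclose(h);
            loaded++;
        }
    }

    atomic_store(&keep_running, 0);
    pthread_join(tid, NULL);
    return loaded;
}
```

Calling run_dlopen_race from main with a large iteration count would exercise the two code paths at the same time. Depending on the toolchain, linking may need -ldl and -lpthread.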

Actual Results:


Expected Results:


Additional info:

Comment 1 Serguei Kolos 2006-10-10 10:51:58 UTC
Created attachment 138127 [details]
GDB session log

Comment 2 Jakub Jelinek 2006-10-10 11:11:58 UTC
That sounds like what we fixed yesterday in CVS HEAD glibc:
2006-10-09  Ulrich Drepper  <drepper>
            Jakub Jelinek  <jakub>

        Implement reference counting of scope records.
        * elf/dl-close.c (_dl_close): Remove all scopes from removed objects
        from the list in objects which remain.  Always allocate new scope
        record.
        * elf/dl-open.c (dl_open_worker): When growing array for scopes,
        don't resize, allocate a new one.
        * elf/dl-runtime.c: Update reference counters before using a scope
        array.
        * elf/dl-sym.c: Likewise.
        * elf/dl-libc.c: Adjust for l_scope name change.
        * elf/dl-load.c: Likewise.
        * elf/dl-object.c: Likewise.
        * elf/rtld.c: Likewise.
        * include/link.h: Include <rtld-lowlevel.h>.  Define struct
        r_scoperec.  Replace r_scope with pointer to r_scoperec structure.
        Add l_scoperec_lock.
        * sysdeps/generic/ldsodefs.h: Include <rtld-lowlevel.h>.
        * sysdeps/generic/rtld-lowlevel.h: New file.

        * include/atomic.h: Rename atomic_and to atomic_and_val and
        atomic_or to atomic_or_val.  Define new macros atomic_and and
        atomic_or which do not return values.
        * sysdeps/x86_64/bits/atomic.h: Define atomic_and and atomic_or.
        Various cleanups.
        * sysdeps/i386/i486/bits/atomic.h: Likewise.

nptl/
2006-10-09  Ulrich Drepper  <drepper>

        * sysdeps/unix/sysv/linux/rtld-lowlevel.h: New file.

This should show up in Fedora Development after Fedora Core 6 is released
and when it gets sufficiently tested there, we should consider backporting
that to RHEL5/FC6/RHEL4.
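
The general idea in the ChangeLog above (never resize a scope array in place, allocate a new record instead, and count references before using one) can be sketched as follows. This is a simplified illustration and not the glibc code: struct scope_rec, struct scoped_object and the scope_* functions are hypothetical names, and a plain pthread mutex is assumed here in place of whatever rtld-lowlevel.h actually provides.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a scope record: an array that
   is never modified after publication, plus a reference count. */
struct scope_rec {
    unsigned refs;       /* one for being current, plus one per reader */
    size_t n;            /* number of entries in list */
    const void *list[];  /* flexible array member */
};

struct scoped_object {
    pthread_mutex_t lock;   /* protects cur and every refs field */
    struct scope_rec *cur;  /* currently published record, or NULL */
};

/* Reader: pin the current record so it cannot be freed while in use. */
struct scope_rec *scope_acquire(struct scoped_object *o)
{
    pthread_mutex_lock(&o->lock);
    struct scope_rec *r = o->cur;
    if (r != NULL)
        r->refs++;
    pthread_mutex_unlock(&o->lock);
    return r;
}

/* Reader: drop the pin; the last user frees a retired record. */
void scope_release(struct scoped_object *o, struct scope_rec *r)
{
    pthread_mutex_lock(&o->lock);
    if (--r->refs == 0)
        free(r);
    pthread_mutex_unlock(&o->lock);
}

/* Writer: publish a new, larger record instead of resizing the old one.
   Returns 0 on success, -1 if allocation failed. */
int scope_append(struct scoped_object *o, const void *extra)
{
    int rc = -1;

    pthread_mutex_lock(&o->lock);
    struct scope_rec *old = o->cur;
    size_t n = old != NULL ? old->n : 0;
    struct scope_rec *r = malloc(sizeof *r + (n + 1) * sizeof r->list[0]);
    if (r != NULL) {
        if (n > 0)
            memcpy(r->list, old->list, n * sizeof r->list[0]);
        r->list[n] = extra;
        r->n = n + 1;
        r->refs = 1;                 /* reference held by o->cur */
        o->cur = r;
        if (old != NULL && --old->refs == 0)
            free(old);               /* no reader was using it */
        rc = 0;
    }
    pthread_mutex_unlock(&o->lock);
    return rc;
}
```

A reader that called scope_acquire before a concurrent scope_append keeps a consistent, still-allocated array until it calls scope_release. A cached list pointer and count taken without such pinning would have no such guarantee.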

Comment 3 RHEL Program Management 2006-10-10 11:16:26 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 8 RHEL Program Management 2007-03-10 00:56:25 UTC
This bugzilla had previously been approved for engineering
consideration but Red Hat Product Management is currently reevaluating
this issue for inclusion in RHEL4.6.

Comment 10 RHEL Program Management 2007-05-09 09:22:58 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 12 Jakub Jelinek 2007-08-02 21:20:35 UTC
I have added to glibc-2.3.4-2.38 a workaround patch (the same as RHEL5 GA was
using).  The proper fix is IMNSHO too invasive for RHEL4 (and we have very
difficult issues with LinuxThreads and LinuxThreads ld.so there).

Comment 16 errata-xmlrpc 2007-11-15 16:36:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0807.html