Bug 220675 - xen/nosegneg is sensitive to LD_LIBRARY_PATH
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: Rik van Riel
QA Contact: Brian Brock
Keywords: Regression
Depends On:
Blocks: 215201
Reported: 2006-12-22 17:46 EST by Jan Kratochvil
Modified: 2007-11-30 17:07 EST
Fixed In Version: RC
Doc Type: Bug Fix
Last Closed: 2007-02-07 21:02:01 EST

Attachments
LD_DEBUG=all LD_LIBRARY_PATH=/lib /bin/echo (28.32 KB, text/plain)
2006-12-22 17:46 EST, Jan Kratochvil
"Documentation/kernel-parameters.txt" vdso sideeffects update. (499 bytes, patch)
2007-01-12 15:46 EST, Jan Kratochvil

Description Jan Kratochvil 2006-12-22 17:46:54 EST
Description of problem:
On i686 Xen, when the environment variable `LD_LIBRARY_PATH=/lib' is set, the
system uses the default `/lib/libc-2.5.so' library instead of
`/lib/i686/nosegneg/libc-2.5.so', with the associated performance degradation.

Version-Release number of selected component (if applicable):
glibc-2.5-7.i686

How reproducible:
Always.

Steps to Reproduce:
1. LD_LIBRARY_PATH=/lib /bin/sleep 1h & cat /proc/$!/maps|grep libc-

Actual results:
[blah] /lib/libc-2.5.so

Expected results:
[blah] /lib/i686/nosegneg/libc-2.5.so
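
The check in the steps above can also be done programmatically. The following is only a sketch: libc_flavor is a hypothetical helper, not part of glibc or any existing tool, and it assumes that each line of /proc/PID/maps ends with the path of the mapped file.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: classify one line of /proc/PID/maps.
   Returns  1 if the line maps a libc from a nosegneg directory,
            0 if it maps a libc from any other directory,
           -1 if the line does not map a libc at all.  */
static int
libc_flavor (const char *maps_line)
{
  assert (maps_line != NULL);

  /* The pathname is the last field and starts at the first '/'.  */
  const char *path = strchr (maps_line, '/');
  if (path == NULL || strstr (path, "/libc-") == NULL)
    return -1;

  return strstr (path, "/nosegneg/") != NULL ? 1 : 0;
}
```

A caller could read /proc/PID/maps line by line with fgets and pass each line to libc_flavor; a return value of 0 for any line corresponds to the "Actual results" shown above.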

Additional info:
The kernel logs as many as 200000 messages per second from various processes, such as:
kernel: 4gb seg fixup, process xinetd (pid 2002), cs:ip 73:00174849
This is probably the reason Xen is sluggish on this system.
Comment 1 Jan Kratochvil 2006-12-22 17:46:54 EST
Created attachment 144322
LD_DEBUG=all LD_LIBRARY_PATH=/lib /bin/echo
Comment 2 Jakub Jelinek 2006-12-27 13:53:16 EST
This smells like a kernel bug:
objdump -s -j .note /tmp/vdso

/tmp/vdso:     file format elf32-i386

Contents of section .note:
 0460 06000000 04000000 00000000 4c696e75  ............Linu
 0470 78000000 12060200 04000000 12000000  x...............
 0480 02000000 474e5500 01000000 01000000  ....GNU.........
 0490 016e6f73 65676e65 67000000           .nosegneg...

The mask is 0x00000001, so only bit 0 is set.  The mask alone is what matters when
ld.so.cache comes into play, but when LD_LIBRARY_PATH is used, ld.so needs the
strings that follow the mask.  Each entry is a uint8_t bit number followed by a
zero-terminated name, and ld.so excludes strings whose bit is not set in the mask:
#if defined NEED_DL_SYSINFO || defined NEED_DL_SYSINFO_DSO
  if (dsocaps != NULL)
    {
      const ElfW(Word) mask = ((const ElfW(Word) *) dsocaps)[-1];
      GLRO(dl_hwcap) |= (uint64_t) mask << _DL_FIRST_EXTRA;
      size_t len;
      for (const char *p = dsocaps; p < dsocaps + dsocapslen; p += len + 1)
        {
          uint_fast8_t bit = *p++;
          len = strlen (p);

          /* Skip entries that are not enabled in the mask word.  */
          if (__builtin_expect (mask & ((ElfW(Word)) 1 << bit), 1))
            {
              temp[m].str = p;
              temp[m].len = len;
              ++m;
            }
          else
            --cnt;
        }
    }
#endif

But the kernel says that nosegneg is bit 1, while only bit 0 is set in the mask,
so ld.so drops the nosegneg string when it builds the LD_LIBRARY_PATH search list.
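
To make the mismatch concrete, here is a minimal sketch that applies the same mask test to the note descriptor bytes from the objdump output above. It is not glibc code: kernelcap_desc, read_le32 and kernelcap_enabled are hypothetical names introduced for illustration, and the descriptor is assumed to be little-endian, as on i386.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Descriptor bytes of the GNU type-2 note from the dump above:
   ncaps, mask, then one (bit number, name) entry.  */
static const unsigned char kernelcap_desc[] = {
  0x01, 0x00, 0x00, 0x00,       /* ncaps = 1 (little-endian) */
  0x01, 0x00, 0x00, 0x00,       /* mask = 0x00000001 */
  0x01, 'n', 'o', 's', 'e', 'g', 'n', 'e', 'g', 0x00  /* bit 1, "nosegneg" */
};

/* Hypothetical helper: read a little-endian 32-bit word.  */
static uint32_t
read_le32 (const unsigned char *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Hypothetical helper: return 1 if NAME is listed in the descriptor with
   a bit number that is set in the mask word, 0 otherwise.  */
static int
kernelcap_enabled (const unsigned char *desc, size_t desclen, const char *name)
{
  assert (desc != NULL && name != NULL);
  if (desclen < 8)
    return 0;

  uint32_t mask = read_le32 (desc + 4);
  size_t want = strlen (name);
  size_t pos = 8;

  while (pos < desclen)
    {
      unsigned int bit = desc[pos++];
      const unsigned char *str = desc + pos;
      const unsigned char *nul = memchr (str, 0, desclen - pos);
      if (nul == NULL)
        break;                  /* Truncated entry: stop parsing.  */
      size_t len = (size_t) (nul - str);

      /* Same rule as ld.so: an entry counts only if its bit is in the mask.  */
      if (bit < 32 && ((mask >> bit) & 1u) != 0
          && len == want && memcmp (str, name, len) == 0)
        return 1;
      pos += len + 1;
    }
  return 0;
}
```

With the bytes as dumped, kernelcap_enabled (kernelcap_desc, sizeof kernelcap_desc, "nosegneg") returns 0, because bit 1 is not set in mask 0x00000001. Changing the entry's bit byte from 1 to 0 makes it return 1.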

vsyscall-note-xen.S has:
/*
 * This supplies .note.* sections to go into the PT_NOTE inside the vDSO text.
 * Here we can supply some information useful to userland.
 * First we get the vanilla i386 note that supplies the kernel version info.
 */

#include "vsyscall-note.S"

/*
 * Now we add a special note telling glibc's dynamic linker a fake hardware
 * flavor that it will use to choose the search path for libraries in the
 * same way it uses real hardware capabilities like "mmx".
 * We supply "nosegneg" as the fake capability, to indicate that we
 * do not like negative offsets in instructions using segment overrides,
 * since we implement those inefficiently.  This makes it possible to
 * install libraries optimized to avoid those access patterns in someplace
 * like /lib/i686/tls/nosegneg.  Note that an /etc/ld.so.conf.d/file
 * corresponding to the bits here is needed to make ldconfig work right.
 * It should contain:
 *      hwcap 0 nosegneg
 * to match the mapping of bit to name that we give here.
 */
#define NOTE_KERNELCAP_BEGIN(ncaps, mask) \
        ASM_ELF_NOTE_BEGIN(".note.kernelcap", "a", "GNU", 2) \
        .long ncaps, mask
#define NOTE_KERNELCAP(bit, name) \
        .byte bit; .asciz name
#define NOTE_KERNELCAP_END ASM_ELF_NOTE_END

NOTE_KERNELCAP_BEGIN(1, 1)
NOTE_KERNELCAP(1, "nosegneg")  /* Change 1 back to 0 when glibc is fixed! */
NOTE_KERNELCAP_END

There was indeed a glibc bug, but it was fixed more than a year ago:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/generic/Attic/dl-sysdep.c.diff?r1=1.114&r2=1.116&cvsroot=glibc
(broken between April and September 2005, fixed since then).
So, the kernel should change.
Comment 3 RHEL Product and Program Management 2006-12-27 14:07:05 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 5 Jakub Jelinek 2006-12-27 16:17:05 EST
I think one option is:
-NOTE_KERNELCAP(1, "nosegneg")  /* Change 1 back to 0 when glibc is fixed! */
+NOTE_KERNELCAP(0, "nosegneg")
(i.e. reverting the hack), the other, if we really want to stay compatible
even with glibcs from the short period of time when it was broken, would be
changing /etc/ld.so.conf.d/kernel* to 1 nosegneg and
-NOTE_KERNELCAP_BEGIN(1, 1)
+NOTE_KERNELCAP_BEGIN(1, 2)
(i.e. keep using bit 1 rather than bit 0 and adjust the mask for that as well
as ldconfig's configuration).
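
As a quick sanity check of the two options above, this sketch evaluates ld.so's mask test for the shipped combination and for both proposed ones. The names cap_bit_enabled and check_proposed_fixes are hypothetical, and the sketch covers only the mask/bit relationship, not the matching hwcap line in /etc/ld.so.conf.d.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the test ld.so applies to each note entry,
   i.e. whether the entry's bit number is set in the mask word.  */
static int
cap_bit_enabled (uint32_t mask, unsigned int bit)
{
  return bit < 32 && ((mask >> bit) & 1u) != 0;
}

/* Hypothetical self-check covering the three (mask, bit) combinations
   discussed in this report.  */
static void
check_proposed_fixes (void)
{
  /* Shipped: NOTE_KERNELCAP_BEGIN(1, 1) with NOTE_KERNELCAP(1, ...).  */
  assert (!cap_bit_enabled (1, 1));

  /* First option: revert the hack, NOTE_KERNELCAP(0, ...), mask 1.  */
  assert (cap_bit_enabled (1, 0));

  /* Second option: keep bit 1, NOTE_KERNELCAP_BEGIN(1, 2).  */
  assert (cap_bit_enabled (2, 1));
}
```

Both options pass the mask test. They differ only in which bit number ldconfig's configuration has to name for nosegneg.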
Comment 6 Roland McGrath 2006-12-27 16:18:53 EST
nothing is lost by doing it that way, in case anyone is paranoid about bug compat
Comment 7 Rik van Riel 2006-12-27 16:25:02 EST
I am pretty sure I reverted that bit in the kernel back to zero a while ago.  I
have absolutely no idea why it went back to 1 :(((
Comment 8 Jay Turner 2007-01-03 14:49:18 EST
QE ack for RHEL5.
Comment 9 Jay Turner 2007-01-10 10:50:02 EST
Built into 2.6.18-1.3002.el5.
Comment 10 Jan Kratochvil 2007-01-12 15:46:34 EST
Created attachment 145488
"Documentation/kernel-parameters.txt" vdso sideeffects update.

Confirming kernel-xen-2.6.18-1.3002.el5.i686 fixes the problem.

Please consider updating the "Documentation/kernel-parameters.txt"
documentation, as the problem was still present with a non-default
"/proc/sys/kernel/vdso" setting.
Comment 11 RHEL Product and Program Management 2007-02-07 21:02:06 EST
A package has been built which should help the problem described in 
this bug report. This report is therefore being closed with a resolution 
of CURRENTRELEASE. You may reopen this bug report if the solution does 
not work for you.