Description of problem:
Running on i686 XEN with the environment variable `LD_LIBRARY_PATH=/lib', the system uses the default `/lib/libc-2.5.so' library instead of `/lib/i686/nosegneg/libc-2.5.so', with the associated performance degradation.

Version-Release number of selected component (if applicable):
glibc-2.5-7.i686

How reproducible:
Always.

Steps to Reproduce:
1. LD_LIBRARY_PATH=/lib /bin/sleep 1h & cat /proc/$!/maps|grep libc-

Actual results:
[blah] /lib/libc-2.5.so

Expected results:
[blah] /lib/i686/nosegneg/libc-2.5.so

Additional info:
The kernel reports as many as 200000 messages per second from various processes, such as:
kernel: 4gb seg fixup, process xinetd (pid 2002), cs:ip 73:00174849
This is probably the reason XEN is sluggish on this system.
Created attachment 144322 [details] LD_DEBUG=all LD_LIBRARY_PATH=/lib /bin/echo
This smells like a kernel bug:

objdump -s -j .note /tmp/vdso

/tmp/vdso:     file format elf32-i386

Contents of section .note:
 0460 06000000 04000000 00000000 4c696e75  ............Linu
 0470 78000000 12060200 04000000 12000000  x...............
 0480 02000000 474e5500 01000000 01000000  ....GNU.........
 0490 016e6f73 65676e65 67000000           .nosegneg...

Mask is 0x00000001, so bit 0 is set there. The mask is what is used when ld.so.cache comes into play, but when LD_LIBRARY_PATH is used, ld.so needs the strings located after it. The format is always a uint8_t bit number followed by a zero-terminated name, and ld.so excludes strings whose bit is not set in the mask:

#if defined NEED_DL_SYSINFO || defined NEED_DL_SYSINFO_DSO
  if (dsocaps != NULL)
    {
      const ElfW(Word) mask = ((const ElfW(Word) *) dsocaps)[-1];
      GLRO(dl_hwcap) |= (uint64_t) mask << _DL_FIRST_EXTRA;
      size_t len;
      for (const char *p = dsocaps; p < dsocaps + dsocapslen; p += len + 1)
        {
          uint_fast8_t bit = *p++;
          len = strlen (p);

          /* Skip entries that are not enabled in the mask word.  */
          if (__builtin_expect (mask & ((ElfW(Word)) 1 << bit), 1))
            {
              temp[m].str = p;
              temp[m].len = len;
              ++m;
            }
          else
            --cnt;
        }
    }
#endif

However, the kernel says that nosegneg is bit 1, while only bit 0 is set in the mask. vsyscall-note-xen.S has:

/*
 * This supplies .note.* sections to go into the PT_NOTE inside the vDSO text.
 * Here we can supply some information useful to userland.
 * First we get the vanilla i386 note that supplies the kernel version info.
 */

#include "vsyscall-note.S"

/*
 * Now we add a special note telling glibc's dynamic linker a fake hardware
 * flavor that it will use to choose the search path for libraries in the
 * same way it uses real hardware capabilities like "mmx".
 * We supply "nosegneg" as the fake capability, to indicate that we
 * do not like negative offsets in instructions using segment overrides,
 * since we implement those inefficiently.  This makes it possible to
 * install libraries optimized to avoid those access patterns in someplace
 * like /lib/i686/tls/nosegneg.  Note that an /etc/ld.so.conf.d/file
 * corresponding to the bits here is needed to make ldconfig work right.
 * It should contain:
 *	hwcap	0  nosegneg
 * to match the mapping of bit to name that we give here.
 */
#define NOTE_KERNELCAP_BEGIN(ncaps, mask) \
	ASM_ELF_NOTE_BEGIN(".note.kernelcap", "a", "GNU", 2) \
	.long ncaps, mask
#define NOTE_KERNELCAP(bit, name) \
	.byte bit; .asciz name
#define NOTE_KERNELCAP_END ASM_ELF_NOTE_END

NOTE_KERNELCAP_BEGIN(1, 1)
NOTE_KERNELCAP(1, "nosegneg")  /* Change 1 back to 0 when glibc is fixed! */
NOTE_KERNELCAP_END

There indeed was a glibc bug, but it was fixed more than a year ago:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/generic/Attic/dl-sysdep.c.diff?r1=1.114&r2=1.116&cvsroot=glibc
(broken between April and September 2005, fixed since then). So, the kernel should change.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
I think one option is:

-NOTE_KERNELCAP(1, "nosegneg")  /* Change 1 back to 0 when glibc is fixed! */
+NOTE_KERNELCAP(0, "nosegneg")

(i.e. reverting the hack). The other, if we really want to stay compatible even with glibcs from the short period of time when it was broken, would be changing /etc/ld.so.conf.d/kernel* to 1 nosegneg and

-NOTE_KERNELCAP_BEGIN(1, 1)
+NOTE_KERNELCAP_BEGIN(1, 2)

(i.e. keep using bit 1 rather than bit 0 and adjust the mask for that as well as ldconfig's configuration).
nothing is lost by doing it that way, in case anyone is paranoid about bug compat
I am pretty sure I reverted that bit in the kernel back to zero a while ago. I have absolutely no idea why it went back to 1 :(((
QE ack for RHEL5.
Built into 2.6.18-1.3002.el5.
Created attachment 145488 [details]
"Documentation/kernel-parameters.txt" vdso sideeffects update.

Confirming that kernel-xen-2.6.18-1.3002.el5.i686 fixes the problem. Please consider updating the "Documentation/kernel-parameters.txt" documentation, as the problem was still present with a non-default "/proc/sys/kernel/vdso".
A package has been built which should help the problem described in this bug report. This report is therefore being closed with a resolution of CURRENTRELEASE. You may reopen this bug report if the solution does not work for you.