Bug 708406 - [RFE] Kernel needs to enable nosegneg for Xen guests
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Carlos O'Donell
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-05-27 14:57 UTC by Chris Lalancette
Modified: 2016-11-24 12:27 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-02-17 13:46:17 UTC
Type: ---
Embargoed:



Description Chris Lalancette 2011-05-27 14:57:51 UTC
Description of problem:
When installing the kernel on an i686 Xen dom0 or domU, the Fedora kernel should probably put down a hwcap file to enable nosegneg.  This improves performance on Xen machines by not taking a hypervisor trap on every TLS access.  Note, however, that on bare metal, nosegneg is slower than segneg, so you do *not* want to install it there.

The RHEL-6 kernel spec file (in BuildKernel) has a section like this:

    if grep '^CONFIG_XEN=y$' .config >/dev/null; then
      echo > ldconfig-kernel.conf "\
# This directive teaches ldconfig to search in nosegneg subdirectories
# and cache the DSOs there with extra bit 0 set in their hwcap match
# fields.  In Xen guest kernels, the vDSO tells the dynamic linker to
# search in nosegneg subdirectories and to match this extra hwcap bit
# in the ld.so.cache file.
hwcap 1 nosegneg"
    fi

While that does the right thing for CONFIG_XEN, the fact of the matter is that CONFIG_XEN is on in all configurations (pvops lets us use the same binary kernel for bare-metal and Xen operation).  Therefore, the above code probably needs to be combined with a conditional somewhere to only install the file if it is on a Xen machine.

jforbes, this additionally impacts the Fedora AMI build scripts; if something like the above does go into the Fedora kernel, then you can drop the similar thing from the AMI build script.

Comment 1 Chris Lalancette 2011-05-27 16:38:20 UTC
(In reply to comment #0)
> While that does the right thing for CONFIG_XEN, the fact of the matter is that
> CONFIG_XEN is on in all configurations (pvops lets us use the same binary
> kernel for bare-metal and Xen operation).  Therefore, the above code probably
> needs to be combined with a conditional somewhere to only install the file if
> it is on a Xen machine.

Never mind this part.  Paolo pointed out that the kernel is smart here and only honors this for Xen guests in 32-bit mode.  So the above snippet of spec file can be put into the Fedora kernel spec file directly.

Chris Lalancette
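
For illustration, here is a minimal sketch of what "putting down the hwcap file" amounts to outside of a spec file. The function name write_nosegneg_conf is hypothetical, and the destination directory /etc/ld.so.conf.d is an assumption taken from the kernel comment quoted later in this bug; the file name ldconfig-kernel.conf and the directive are the ones from the RHEL-6 snippet above.

```shell
#!/bin/sh
# Sketch only: write the nosegneg hwcap directive where ldconfig can find it.
# write_nosegneg_conf is a hypothetical helper, not part of any package.

write_nosegneg_conf() {
    # Destination directory; the default is an assumption, override for testing.
    conf_dir=${1:-/etc/ld.so.conf.d}
    conf_file="$conf_dir/ldconfig-kernel.conf"

    mkdir -p "$conf_dir" || return 1

    # A single directive: associate hwcap bit 1 with the name "nosegneg".
    printf '%s\n' 'hwcap 1 nosegneg' > "$conf_file" || return 1

    # Report where the file was written.
    printf '%s\n' "$conf_file"
}
```

Calling write_nosegneg_conf with a scratch directory as its argument makes it possible to inspect the result without touching the real system configuration; ldconfig would still need to be re-run for the directive to reach ld.so.cache.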

Comment 2 Dave Jones 2011-05-27 21:24:32 UTC
Given that it would be unconditional, why can't we just ship an ldconfig-kernel.conf in one of the userspace RPMs?

Comment 3 Chris Lalancette 2011-05-31 13:37:16 UTC
(In reply to comment #2)
> given that it would be an unconditional thing, why can't we just ship a
> ldconfig-kernel.conf in one of the userspace rpm's ?

Part of the reason that this wasn't done in the past is that there are no xen-specific userspace RPMs that are *required* to be installed inside a Xen guest.

That being said, we could add this file to the glibc RPM (which is the other place where it would make sense).  Jakub, what do you think about this plan?

Comment 4 Jeff Law 2012-02-17 18:56:07 UTC
I'm coming into this discussion late and without an in-depth understanding of the issue.  

From what I've been able to gather, you want to install ldconfig-kernel.conf unconditionally and want glibc to own the file?

Is any performance or capability lost if this file is installed and we're not using Xen?  That is, is installing this file always the right thing to do?  What reason, if any, has been given for not installing this file in the past?

Comment 5 Dave Jones 2012-02-17 22:26:30 UTC
From comment #0:

"Note, however, that on bare-metal, nosegneg is slower than segneg, so
you do *not* want to install it there."

So I guess the %post scriptlet needs to check whether it's being run in a virtual environment or not?  Chris?

Comment 6 Andrew Jones 2012-02-20 12:57:36 UTC
The vdso only gets this hwcap set for 32-bit xen. See arch/x86/xen/setup.c:fiddle_vdso() in the kernel source. Also, take a look at arch/x86/vdso/vdso32/note.S. Its comment, quoted below, explains how the mechanism works:

#ifdef CONFIG_XEN
/*
 * Add a special note telling glibc's dynamic linker a fake hardware
 * flavor that it will use to choose the search path for libraries in the
 * same way it uses real hardware capabilities like "mmx".
 * We supply "nosegneg" as the fake capability, to indicate that we
 * do not like negative offsets in instructions using segment overrides,
 * since we implement those inefficiently.  This makes it possible to
 * install libraries optimized to avoid those access patterns in someplace
 * like /lib/i686/tls/nosegneg.  Note that an /etc/ld.so.conf.d/file
 * corresponding to the bits here is needed to make ldconfig work right.
 * It should contain:
 *      hwcap 1 nosegneg
 * to match the mapping of bit to name that we give here.
 *
 * At runtime, the fake hardware feature will be considered to be present
 * if its bit is set in the mask word.  So, we start with the mask 0, and
 * at boot time we set VDSO_NOTE_NONEGSEG_BIT if running under Xen.
 */ 
    
#include "../../xen/vdso.h"     /* Defines VDSO_NOTE_NONEGSEG_BIT.  */

ELFNOTE_START(GNU, 2, "a")
        .long 1                 /* ncaps */
VDSO32_NOTE_MASK:               /* Symbol used by arch/x86/xen/setup.c */
        .long 0                 /* mask */
        .byte VDSO_NOTE_NONEGSEG_BIT; .asciz "nosegneg" /* bit, name */
ELFNOTE_END
#endif

Comment 7 Jeff Law 2012-02-28 06:11:42 UTC
So is there a reasonable way to determine if glibc is being installed in a xen guest?  I don't mind dropping in the bits to make things more efficient for xen if there's a good way to know when it's the right thing to do.

Comment 8 Andrew Jones 2012-02-28 08:43:37 UTC
The /etc/ld.so.conf.d file can be put on *all* systems unconditionally. The 'hwcap 1 nosegneg' is just a mask that says we should take a look at that vdso bit. The kernel will only set that bit when running as a 32-bit xen guest, so in all other cases the ld.so file does nothing. This is what I was trying to show in comment 6. I see in Chris' initial description of this bug that he was concerned about the perf on non-xen machines. He must have missed the part about the vdso bit being the real on/off switch. I see in comment 1 that he already tried to straighten things out though. BTW, Chris is no longer with Red Hat, which is why I'm jumping in for him on this bug.

So the ld.so file with the 'hwcap 1 nosegneg' can be installed unconditionally - almost. If Fedora ever compiled its kernel with CONFIG_XEN=n, or if it ever dropped support for 32-bit kernels, then this ld.so file would become pointless. Thus glibc should check CONFIG_XEN=y before installing the ld.so file, just as the snippet in this bug's description shows that the RHEL kernel does. For the cleanest solution, it should check CONFIG_X86_32=y as well.
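
To make the "real on/off switch" concrete, the following is a small sketch of the gating logic only, not glibc's actual implementation. It assumes the vDSO note's mask word is available as a plain integer; hwcap_bit_set is a hypothetical helper name. The config directive merely names bit 1, and the feature counts as present only when that bit is set in the mask.

```shell
#!/bin/sh
# Sketch only: model of "the fake hardware feature is present if its bit
# is set in the mask word". hwcap_bit_set is a hypothetical helper.

# hwcap_bit_set MASK BIT
# Succeeds (exit status 0) when bit number BIT is set in integer MASK.
hwcap_bit_set() {
    mask=$1
    bit=$2
    [ $(( (mask >> bit) & 1 )) -eq 1 ]
}
```

With the mask left at its initial value of 0 (bare metal, or anything other than a 32-bit Xen guest), hwcap_bit_set 0 1 fails and the nosegneg directories are ignored. Once the kernel sets bit 1 at boot, the mask is 2 and hwcap_bit_set 2 1 succeeds.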

Comment 9 Jeff Law 2012-03-05 19:48:27 UTC
OK.  It wasn't clear from reading c#6 that this wouldn't impact bare metal adversely.

So is there any way to check the kernel configuration parameters when we install the glibc rpm?  I realize that check won't be perfect (user could install a different kernel later).

Comment 10 Andrew Jones 2012-03-06 07:12:33 UTC
(In reply to comment #9)
> So is there any way to check the kernel configuration parameters when we
> install the glibc rpm?  I realize that check won't be perfect (user could
> install a different kernel later).

I'm not sure. Fedora doesn't (and I'm not sure who does) compile with IKCONFIG on, which means we can't count on /proc/config.gz. We can expect /boot/config-$(uname -r) if `uname -r` is a kernel installed by a Fedora rpm, but not if it's a kernel installed by other means.
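
As a rough sketch of the /boot/config lookup described above: the helper name kernel_config_enabled and its exit-status convention are assumptions made for illustration. It reads the config file of the running kernel by default and reports "unknown" when no such file exists, which is the case for kernels not installed from a Fedora RPM.

```shell
#!/bin/sh
# Sketch only: check whether a kernel config option is built in (=y).
# kernel_config_enabled is a hypothetical helper.

# kernel_config_enabled OPTION [CONFIG_FILE]
# Exit status: 0 = OPTION=y found, 1 = not found, 2 = config file unreadable.
kernel_config_enabled() {
    option=$1
    config=${2:-/boot/config-$(uname -r)}

    # Without a readable config file we cannot tell either way.
    [ -r "$config" ] || return 2

    grep -q "^${option}=y\$" "$config"
}
```

For example, kernel_config_enabled CONFIG_XEN returns 2 rather than 1 on a hand-built kernel with no /boot/config file, so a caller can distinguish "Xen support is off" from "no information".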

Would it make more sense for this file to be in the kernel package? Like it is for RHEL?

Comment 11 Andrew Jones 2012-03-06 08:15:00 UTC
Hmm, actually maybe glibc is the best place for this file because we *don't* control how the kernel is installed on Fedora. We can get the equivalent of CONFIG_X86_32=? easily enough, I suppose. I'm sure glibc has ways to check that already - whatever the 'arch' command does? For CONFIG_XEN=y we can check for the existence of /sys/hypervisor/type, and if it's there, then read it and see if it is 'xen'. This isn't quite right because now we're checking for CONFIG_XEN_SYS_HYPERVISOR=y (which requires CONFIG_XEN=y), meaning it's not as precise. However, XEN_SYS_HYPERVISOR defaults to 'y' and only depends on XEN=y and SYSFS=y, so it's pretty safe to rely on it, and it's the least hacky way I can think of to find out if we're running on Xen.

Considering that the purpose of this is to make a best effort at improving perf for 32-bit paravirt xen guests, I think it's sufficient. Note that with these conditions, we'll end up installing this file on 32-bit fullvirt xen guests as well (and they don't need it), but as I said before, it won't hurt them. I can't think of any non-hacky way to detect pv vs. fv. We'd have the same issue if we were to look at kernel config options anyway, as there's no way to know whether the kernel will be used as a pv or fv xen guest.
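
The two checks described in this comment could be sketched as follows. The function names running_on_xen and is_32bit_x86 are hypothetical, and using uname -m as the equivalent of the 'arch' command is an assumption. Both take an optional argument so they can be exercised without a real Xen guest.

```shell
#!/bin/sh
# Sketch only: best-effort detection of "32-bit x86 guest on Xen".
# Both helper names are hypothetical.

# running_on_xen [TYPE_FILE]
# Succeeds when the hypervisor type file exists and contains "xen".
running_on_xen() {
    type_file=${1:-/sys/hypervisor/type}
    [ -r "$type_file" ] || return 1
    hv_type=$(cat "$type_file")
    [ "$hv_type" = "xen" ]
}

# is_32bit_x86 [MACHINE]
# Succeeds for i386/i486/i586/i686 machine names.
is_32bit_x86() {
    machine=${1:-$(uname -m)}
    case $machine in
        i[3456]86) return 0 ;;
        *)         return 1 ;;
    esac
}
```

A scriptlet would then install the file only when "running_on_xen && is_32bit_x86" succeeds. As noted above, this cannot tell paravirt from fullvirt guests.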

Comment 12 Andrew Jones 2012-03-15 12:38:08 UTC
(In reply to comment #11)
> Note, with these
> conditions, we'll end up installing this file on 32-bit fullvirt xen guests as
> well (and they don't need it), but as I said before, it won't hurt them. I
> can't think of any non-hacky way to detect pv vs. fv. We'd have the same issue
> if we were to look at kernel config options anyway, as there's not anyway to
> know if the kernel will be used as pv or fv xen guest.

I have an update to this note. Recently I needed to come up with a pv vs. hvm check, so I poked a bit through the xen guest init code and determined that the least hacky way (still a bit hacky) is to check /proc/interrupts for a legacy interrupt, such as 0 - timer. hvm guests have these interrupts and pv guests don't. So if /sys/hypervisor/type is xen and we don't have legacy interrupts (! grep -q '^  0:' /proc/interrupts), then that means we're xenpv. And, if 'arch' reports 32-bit, then we're in the environment where we should install the file.
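
A sketch of the legacy-interrupt test, written so that it compares the first field of each line instead of matching a fixed number of leading spaces. The helper name has_legacy_timer_irq is hypothetical, and the assumption that IRQ 0 appears as a first field of exactly "0:" is based on the usual /proc/interrupts layout.

```shell
#!/bin/sh
# Sketch only: detect the legacy timer interrupt (IRQ 0) in an
# interrupts table. has_legacy_timer_irq is a hypothetical helper.

# has_legacy_timer_irq [INTERRUPTS_FILE]
# Succeeds when some line has "0:" as its first whitespace-separated field.
has_legacy_timer_irq() {
    irq_file=${1:-/proc/interrupts}
    awk '$1 == "0:" { found = 1 } END { exit !found }' "$irq_file"
}
```

Under the heuristic described above, a guest where the hypervisor type is xen and has_legacy_timer_irq fails would be treated as paravirtualized.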

Comment 13 David Martinez 2012-06-08 14:53:34 UTC
So, trying to summarize here - 
In order to set nosegneg for Xen hvm instances, the glibc install script should try to detect whether it is being installed in a 32-bit Xen guest.  While it is technically a kernel option, the kernel could come from anywhere so if a non-official-channel kernel is used the nosegneg option could be missed, or could wrongly be applied to a non Xen guest.  So we have glibc take care of it in an install script.

Install script would add something like this:
------------------
# Determine if running on Xen-guest by checking for
# absence of legacy interrupt
if ! grep -q -e '^  0:' /proc/interrupts
then
    # Check for 32-bit architecture
    if [ "$(arch)" = "i686" ]  # other 32-bit possibilities here? i486 etc?
    then
        echo > ldconfig-kernel.conf "\
# This directive teaches ldconfig to search in nosegneg subdirectories
# and cache the DSOs there with extra bit 0 set in their hwcap match
# fields.  In Xen guest kernels, the vDSO tells the dynamic linker to
# search in nosegneg subdirectories and to match this extra hwcap bit
# in the ld.so.cache file.
hwcap 1 nosegneg"
    else
        : # remove nosegneg?
    fi
fi
--------------------
That check is clever but I wonder if there is a way to make it less fragile.  I.e. we're counting on the two spaces before '0:' etc.

Does this still need the NEEDINFO tag?

Comment 14 Andrew Jones 2012-06-08 16:10:36 UTC
We want nosegneg for PV, not for HVM. After some changes to the /proc/interrupts grepping technique in that other bug, we finally landed on `grep -q 'xen-percpu-virq  *timer0' /proc/interrupts` to detect PV. So something like this should work:

if { [ "$(arch)" = "i586" ] || [ "$(arch)" = "i686" ]; } && grep -q 'xen-percpu-virq  *timer0' /proc/interrupts
then
    echo > ldconfig-kernel.conf "\
# This directive teaches ldconfig to search in nosegneg subdirectories
# and cache the DSOs there with extra bit 0 set in their hwcap match
# fields.  In Xen guest kernels, the vDSO tells the dynamic linker to
# search in nosegneg subdirectories and to match this extra hwcap bit
# in the ld.so.cache file.
hwcap 1 nosegneg"
else
    : # remove nosegneg?
fi

Comment 15 Fedora Admin XMLRPC Client 2013-01-28 20:08:39 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 16 Matthew Miller 2013-05-13 16:23:45 UTC
Could we revisit this for F19? We're still carrying kludge in the EC2 kickstart and I'd like to get rid of that.

Comment 17 Carlos O'Donell 2013-05-13 18:27:24 UTC
I've taken over glibc from Jeff Law who was only temporarily looking after the package.

Previous summary restated in my own words:

Install ldconfig-kernel.conf into /etc/ld.so.conf.d/ if glibc detects, during post-install, that we are running in a paravirtualized Xen guest.

We install this file to indicate to the dynamic loader, via the ld.so.cache, that when bit 1 of hwcap is set it should search a platform directory named "nosegneg" for optimized libraries, and otherwise not search there.

The Xen guest OS will make sure that hwcap bit 1 is set when it is appropriate to search the "nosegneg" directories for those libraries.

Question:

Is there actually anything to fix?

The dynamic loader startup code (elf/dl-hwcaps.c) will use the vDSO note to provide the appropriate HWCAP override and setup search paths appropriately.

Are we installing ldconfig-kernel.conf for kernels without a vDSO? Is there such a thing? Does a 32-bit binary running under a 64-bit kernel get a 32-bit vDSO?

Comment 18 Matthew Miller 2013-05-13 20:37:29 UTC
I'll leave the vDSO question to someone who knows what they are talking about.

But, postinstall isn't really a good place to make the decision; an image may be created under something other than Xen but then ultimately run under Xen -- for example, Amazon EC2.

Comment 19 Carlos O'Donell 2013-05-14 02:31:18 UTC
(In reply to comment #18)
> But, postinstall isn't really a good place to make the decision; an image
> may be created under something other than Xen but then ultimately run under
> Xen -- for example, Amazon EC2.

Matthew,

OK, I've done all the background reading on the design of this and my last question is moot. We still need both the ldconfig-kernel.conf and the vDSO note.

Your example of creating an image under another system is a good example.

You appear to be arguing for unconditionally installing ldconfig-kernel.conf?

After reviewing everything, I don't see any reason why we wouldn't want to install ldconfig-kernel.conf e.g. "hwcap 1 nosegneg", since it makes PV Xen guests faster.

However, the glibc "nosegneg" variant is no longer being built.

I notice that there appears to be a glibc-xen package in the glibc.spec file that is no longer built. That package previously built glibc with "-mno-tls-direct-seg-refs" for the i386, i486 and i586 arches and put the result into an `i686/nosegneg' directory. We no longer support i386 as a valid arch since upstream no longer supports i386 either (it lacks the primitives required for the NPTL threading implementation). The i686 build of glibc, which is the default for x86, isn't built with "-mno-tls-direct-seg-refs", and I haven't verified that this actually makes a difference.

Jakub,

What is the history of the glibc-xen package?

Comment 20 Jakub Jelinek 2013-05-14 05:47:01 UTC
If you look say into RHEL 6 glibc.spec, you'll see that on i686 the nosegneg libraries are built and depending on the macros are either packaged directly in glibc package, or optionally could be packaged into a package of its own.

What happened with that support afterwards, I have no idea, but supposedly it might have been just dropped because KVM is now used instead of Xen, so the ugly hacks for Xen are unlikely to be needed.

But from quick skimming of F19 glibc, it looks like it is still there:
rpm -ql glibc.i686 | grep nosegneg
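
On a system without the rpm database at hand, a similar check can be made against the filesystem. This is only a sketch: the helper name list_nosegneg_dirs is hypothetical, and the default search roots /lib and /usr/lib are an assumption based on the /lib/i686/tls/nosegneg path mentioned in the kernel comment quoted in comment 6.

```shell
#!/bin/sh
# Sketch only: list directories named "nosegneg" under the given roots.
# list_nosegneg_dirs is a hypothetical helper.

# list_nosegneg_dirs [ROOT...]
# Prints one matching directory per line; prints nothing if none exist.
list_nosegneg_dirs() {
    # Default search roots are an assumption; pass others as arguments.
    [ $# -gt 0 ] || set -- /lib /usr/lib
    find "$@" -type d -name nosegneg 2>/dev/null
}
```

Empty output would suggest that no nosegneg library variant is installed, in which case the hwcap directive has nothing to select even in a Xen PV guest.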

Comment 22 Fedora End Of Life 2015-01-09 16:40:53 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 23 Fedora End Of Life 2015-02-17 13:46:17 UTC
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

