Bug 708350

Summary: nosegneg not used in 32-bit Xen guests
Product: Red Hat Enterprise Linux 6
Component: kernel
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Reporter: Paolo Bonzini <pbonzini>
Assignee: Red Hat Kernel Manager <kernel-mgr>
QA Contact: Virtualization Bugs <virt-bugs>
CC: clalance, drjones, jzheng, leiwang, pcao, qwan, xen-maint
Fixed In Version: kernel-2.6.32-189.el6
Doc Type: Bug Fix
Last Closed: 2011-12-06 13:08:20 UTC

Description Paolo Bonzini 2011-05-27 11:36:37 UTC
Description of problem:
The "nosegneg" hwcap is assigned to bit 1 in the 32-bit vDSO:

$ git grep -i no[ns]eg[ns]eg
vdso/vdso32/note.S:	.byte VDSO_NOTE_NONEGSEG_BIT; .asciz "nosegneg"	/* bit, name */
xen/vdso.h:#define VDSO_NOTE_NONEGSEG_BIT	1

However, for historical reasons the spec associates it with bit 0.  Because the two bits do not match, the nosegneg libraries are never selected, which causes a pretty heavy performance hit when running 32-bit RHEL6 Xen PV guests on 32-bit hosts.

Version-Release number of selected component (if applicable):
kernel-2.6.32-131.el6

How reproducible:
100%

Steps to Reproduce:
1. Run "cat /proc/self/maps" in a 32-bit PV Xen guest, on a 32-bit RHEL5 host.
  
Actual results:
Output refers to /lib/libc-2.12.so.

Expected results:
Output refers to /lib/i686/nosegneg/libc-2.12.so.

Additional info:
There are correctness problems when the "normal" libc is used, but they are probably not important for RHEL6 because they were fixed in 5.2 and RHEL6 guests are unlikely to work before update 5.  However, the performance difference can be seen on artificial testcases too:

/* The testcase repeatedly accesses thread-local variables, either
   indirectly (errno, which is actually *__errno_location) or directly
   (the current locale, which is read in nl_langinfo).  When the nosegneg
   library is not used, every access causes a general protection trap to
   the hypervisor.  The test should be roughly as fast on RHEL6 domU as on
   RHEL5 dom{0,U} (1-2x differences are expected), but it is 50-100x
   slower on RHEL6 domU because the nosegneg library is not used.  */

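#include <signal.h>
#include <stdio.h>
#include <errno.h>
#include <langinfo.h>
#include <unistd.h>	/* alarm() */

/* Remove this #define to build the variant that is slower with nosegneg.
   Setting it to 0 changes nothing, because the checks below use #ifdef.  */
#define NOSEGNEG_FASTER 1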

#ifdef NOSEGNEG_FASTER
volatile int x;
#else
volatile __thread int x;
#endif

void sigalrm(int sig)
{
	x = 1;
}

int main()
{
	signal(SIGALRM, sigalrm);
	long long i = 0;
	volatile int y;
	volatile char *z;
	alarm(3);
#ifdef NOSEGNEG_FASTER
	while (!x) { y = errno; z = nl_langinfo(CODESET); i++; }
#else
	while (!x) { z = nl_langinfo(CODESET); i++; }
#endif
	printf ("%lld\n", i/3);
}

Note that you can construct similarly artificial testcases that are _slower_ with nosegneg.  If you remove the #define in the example above, "x" uses a direct access and thus it is faster if nl_langinfo also uses a direct access; using nosegneg introduces the expensive alternation between direct and indirect.  However, since TLS is not very common outside libc, these cases are less relevant.

Comment 1 RHEL Program Management 2011-05-27 11:49:33 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 2 Chris Lalancette 2011-05-27 14:50:42 UTC
So while this is indeed broken, and needs to be fixed, the problem goes deeper than that (or I am mistaken).

The code in the RHEL-6 kernel.spec file that handles this looks like this:

%ifarch %{vdso_arches}
    make -s ARCH=$Arch INSTALL_MOD_PATH=$RPM_BUILD_ROOT vdso_install KERNELRELEASE=$KernelVer
    if grep '^CONFIG_XEN=y$' .config >/dev/null; then
      echo > ldconfig-kernel.conf "\
# This directive teaches ldconfig to search in nosegneg subdirectories
# and cache the DSOs there with extra bit 0 set in their hwcap match
# fields.  In Xen guest kernels, the vDSO tells the dynamic linker to
# search in nosegneg subdirectories and to match this extra hwcap bit
# in the ld.so.cache file.
hwcap 0 nosegneg"
    fi
...

However, won't this always do the wrong thing on bare-metal?  That is, since we are using pvops in the RHEL-6 kernel, CONFIG_XEN is always y, and therefore the kernel installation will always put down a file with the hwcap 0 nosegneg (indeed, looking on my *x86_64* RHEL-6 box, it has an entry like this).  We haven't noticed a problem so far because it was trying to set a bogus hwcap, but if we fix this to be "hwcap 1 nosegneg", we are going to take a performance hit (by not using segneg) on *all* boxes, no?

Nasty.  I'm not quite sure what to do here.

Chris Lalancette

Comment 3 Paolo Bonzini 2011-05-27 16:28:00 UTC
Actually no, the mask is left at 0 in the kernel, and the nosegneg bit is set to 1 only when running on Xen (fiddle_vdso in arch/x86/xen/setup.c).

Comment 4 Aristeu Rozanski 2011-06-14 14:58:34 UTC
Patch(es) available on kernel-2.6.32-157.el6

Comment 7 Pengzhen Cao 2011-07-29 07:12:01 UTC
I cannot reproduce this issue. Could you tell me what is wrong with my process?

Host version: kernel-xen-2.6.18-276, xen-3.0.3-132, i386
Guest: RHEL 6.1 i386 PV guest; tried several kernels: 2.6.32-71, 2.6.32-130...
The kernel mentioned in the bug, 2.6.32-131, has been deleted, and I do not know whether this is a regression only in 2.6.32-131.

1. Create the guest, then inside guest
[root@virtlab-66-84-155 ~]# cat /proc/self/maps
00313000-00314000 r-xp 00000000 00:00 0 [vdso]
009d9000-009f7000 r-xp 00000000 fd:00 63560 /lib/ld-2.12.so
009f7000-009f8000 r--p 0001d000 fd:00 63560 /lib/ld-2.12.so
009f8000-009f9000 rw-p 0001e000 fd:00 63560 /lib/ld-2.12.so
009ff000-00b89000 r-xp 00000000 fd:00 63561 /lib/libc-2.12.so
00b89000-00b8b000 r--p 0018a000 fd:00 63561 /lib/libc-2.12.so
00b8b000-00b8c000 rw-p 0018c000 fd:00 63561 /lib/libc-2.12.so
00b8c000-00b8f000 rw-p 00000000 00:00 0
08048000-08053000 r-xp 00000000 fd:00 2853 /bin/cat
08053000-08054000 rw-p 0000a000 fd:00 2853 /bin/cat
0907c000-0909d000 rw-p 00000000 00:00 0 [heap]
b7512000-b7712000 r--p 00000000 fd:00 140869 /usr/lib/locale/locale-archive
b7712000-b7713000 rw-p 00000000 00:00 0
b7728000-b7729000 rw-p 00000000 00:00 0
bf995000-bf9aa000 rw-p 00000000 00:00 0 [stack]

2. Copy and compile the C code from the description.
With "#define NOSEGNEG_FASTER 1"
[root@virtlab-66-84-155 ~]# gcc -o repro repro.c
[root@virtlab-66-84-155 ~]# time ./repro
903646

real 0m3.035s
user 0m2.992s
sys 0m0.004s

3. With "#define NOSEGNEG_FASTER 0", compile again
[root@virtlab-66-84-155 ~]# gcc -o nodefine_repro repro.c
[root@virtlab-66-84-155 ~]# time ./nodefine_repro
910538

real 0m3.024s
user 0m2.992s
sys 0m0.002s
[root@virtlab-66-84-155 ~]# uname -a
Linux virtlab-66-84-155.englab.nay.redhat.com 2.6.32-130.el6.i686 #1 SMP Tue Apr 5 19:56:32 EDT 2011 i686 i686 i386 GNU/Linux

Comment 8 Paolo Bonzini 2011-07-29 07:49:11 UTC
The bug is already that /proc/self/maps doesn't refer to nosegneg.

Comment 9 Pengzhen Cao 2011-07-29 08:01:39 UTC
(In reply to comment #8)
> The bug is already that /proc/self/maps doesn't refer to nosegneg.

OK, so the expected result is that /proc/self/maps refers to "/lib/i686/nosegneg/libc-2.12.so"?

But how do I use the C code in the description to test the performance issue?

Comment 10 Paolo Bonzini 2011-07-29 08:47:40 UTC
> OK, so the expected result is that /proc/self/maps refers to
> "/lib/i686/nosegneg/libc-2.12.so"?

Yes.  Actually, if you have older unfixed kernels installed in the virtual machine, you may have hit a problem we already found in updating 6.1 to 6.2.  In fact, I'm moving back to ASSIGNED due to this problem.

> But how do I use the C code in the description to test the performance issue?

You should be able to compare the speed of the C code on a VM with /lib/libc-2.12.so and one with /lib/i686/nosegneg/libc-2.12.so.

Comment 11 Aristeu Rozanski 2011-08-15 21:20:51 UTC
Patch(es) available on kernel-2.6.32-189.el6

Comment 14 Jinxin Zheng 2011-09-02 12:07:40 UTC
Host: RHEL5 32bit kernel-xen-2.6.18-283.
Guest: RHEL6 32bit PV domU.

Reproduced on 2.6.32-176.el6.i686:
$ cat /proc/self/maps |grep libc
009ff000-00b89000 r-xp 00000000 fd:00 63561      /lib/libc-2.12.so
00b89000-00b8b000 r--p 0018a000 fd:00 63561      /lib/libc-2.12.so
00b8b000-00b8c000 rw-p 0018c000 fd:00 63561      /lib/libc-2.12.so

Verified on 2.6.32-194.el6.i686:
$ cat /proc/self/maps |grep libc
00110000-0029f000 r-xp 00000000 fd:00 2468       /lib/i686/nosegneg/libc-2.12.so
0029f000-002a1000 r--p 0018f000 fd:00 2468       /lib/i686/nosegneg/libc-2.12.so
002a1000-002a2000 rw-p 00191000 fd:00 2468       /lib/i686/nosegneg/libc-2.12.so


Artificial test from comment 0:
$ gcc artificial-testcase.c -o test-nosegneg
$ ./test-nosegneg

dom0:
45375185

guest kernel-176:
1623129
(much slower than dom0.)

guest kernel-194:
48778869
(as fast as in dom0.)

Comment 15 errata-xmlrpc 2011-12-06 13:08:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1530.html