Bug 448588

Summary: RFE: improve gettimeofday performance on hypervisors
Product: Red Hat Enterprise Linux 5
Reporter: Alok Kataria <akataria>
Component: kernel
Assignee: Chris Lalancette <clalance>
Status: CLOSED ERRATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Docs Contact:
Priority: high
Version: 5.2
CC: ahecox, ahe, bdevouge, bfox, charles.cooke, clalance, cward, dhecht, duck, dzickus, emcnabb, fluo, herrold, jcooper, jon.shanks, jpenix, jtluka, k.georgiou, pbonzini, prarit, qcai, riek, riel, rpacheco, sghosh, srihan, tao, xen-maint, zxvdr.au
Target Milestone: rc
Keywords: FutureFeature, Triaged, ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
: 523280 (view as bug list)
Environment:
Last Closed: 2009-09-02 08:18:45 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 483701, 485920, 512913, 523280    
Attachments:
Description                                                              Flags
x86: Implement support to synchronize RDTSC with LFENCE on Intel CPUs    none
x86: implement support to synchronize RDTSC through MFENCE on AMD CPUs   none
x86: introduce rdtsc_barrier()                                           none

Description Alok Kataria 2008-05-27 18:19:14 UTC
Description of problem:

There have been a series of patches committed to the mainline kernel
recently that address a performance issue for gettimeofday when running
on hypervisors that enable hardware assisted virtualization.  The
non-ideal performance occurs because a CPUID instruction is used to
serialize the pipeline before RDTSC, and when using hardware
virtualization, CPUID always exits to the hypervisor.

The code in question also exists in the RHEL 5.2 64-bit kernel (see
get_cycles_sync in include/asm-x86_64/timex.h).

The fix is to use MFENCE/LFENCE instead of CPUID.  Here are links to
relevant patches by Andi Kleen which are now in git:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=de4218634e3df6d73a3e6cdfdf3a17fa3bc7e013
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=707fa8ed923b1b6a3d7af0d386b0b3abad28ed19
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fde1b3fa947c2512e3715962ebb1d3a6a9b9bb7d
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6d63de8dbcda98511206897562ecfcdacf18f523
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f06e4ec1c15691b0cfd2397ae32214fa36c90d71

Would you be able to make a similar change to the RHEL 5.2 kernel to
address this issue?

Thanks,
Alok

Comment 1 C. Cooke 2008-07-31 10:14:11 UTC
(In reply to comment #0)

We're also seeing this problem - a little test script (doing 90,000,000
gettimeofday() calls) takes two minutes on our RH VMs... and thirty seconds on
my desktop. 

Applying that kernel patch to the RH kernel isn't trivial - there have been a
*lot* of changes since 2.6.18 was cut. I'm giving it a go here, but I'm not
really a kernel coder.

Comment 2 Jon Shanks 2008-08-27 10:55:10 UTC
Currently this appears to be an issue with 64-bit architectures and with how the syscalls are implemented in the Xen kernel inside the VM, all of which exit the VM to the hypervisor. Is this related to not taking advantage of segmentation protection?
 
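#include <stdio.h>
#include <sys/time.h>

int main(int argc, char **argv) {
    struct timeval tv;
    int i;

    /* gettimeofday() takes a timeval to fill in and an (unused) timezone */
    for (i = 0; i < 90000000; i++) {
        gettimeofday(&tv, NULL);
    }

    return 0;
}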
 
 
32-bit VM:
time ./time
 
real 1m42.460s
user 0m8.565s
sys  1m33.834s
 
64-bit VM:
time ./time_64
 
real 6m3.259s
user 0m26.750s
sys  5m36.501s  
 
I'm not 100% sure why a gettimeofday call needs to exit the VM completely. Is it not possible to take advantage of the generic 'vsyscalls' implementation which is in the 2.6.x branches?

Comment 3 Prarit Bhargava 2008-09-03 11:03:33 UTC
*** Bug 460983 has been marked as a duplicate of this bug. ***

Comment 4 Chris Lalancette 2008-09-03 15:33:33 UTC
Well, I think there are actually two things at play in this BZ.  The original request is to shift from using CPUID to serialize gettimeofday to using MFENCE/LFENCE for serializing.  This should be faster under *full* virtualization, so a backport to RHEL-5 might be desirable.

Comment #3, however, talks about something different.  In particular, it's talking about the Xen kernel, which does indeed have the vsyscall stuff off in 64-bit.  I'm not entirely sure why; I don't think segmentation protection should have anything to do with it.  In a quick test, I turned it on, and your little benchmark there went from 2m17s on this box to about 30s.  I'm going to make a new BZ about that issue, since it doesn't really belong here.

Chris Lalancette

Comment 7 Alok Kataria 2008-10-22 16:57:36 UTC
(In reply to comment #4)
> Well, I think there are actually two things at play in this BZ.  The original
> request is to shift from using CPUID to serialize gettimeofday to using
> MFENCE/LFENCE for serializing.  This should be faster under *full*
> virtualization, so a backport to RHEL-5 might be desirable.
> 

<ping>
Has anybody been working on these patches? Do we have an ETA for which release can have this fix?

> Comment #3, however, talks about something different.  In particular, it's
> talking about the Xen kernel, which does indeed have the vsyscall stuff off in
> 64-bit.  I'm not entirely sure why; I don't think segmentation protection
> should have anything to do with it.  In a quick test, I turned it on, and your
> little benchmark there went from 2m17s on this box to about 30s.  I'm going to
> make a new BZ about that issue, since it doesn't really belong here.

Can you please cc me on this BZ?

Thanks,
Alok

Comment 8 Bill Burns 2008-10-28 15:09:18 UTC
There is no ETA for these, but the earliest we can evaluate it for would be RHEL 5.4 at this point.

Comment 9 Alok Kataria 2008-11-21 05:55:53 UTC
[Changed the component category to "kernel": this is a generic kernel problem, though the performance impact is greater for kernels running under hypervisors.]

I have cooked up some patches which use mfence/lfence instead of cpuid.
Please have a look and let me know if you have any comments. I will upload them shortly.

Comment 10 Alok Kataria 2008-11-21 05:58:33 UTC
Created attachment 324275 [details]
x86: Implement support to synchronize RDTSC with LFENCE on Intel CPUs

Comment 11 Alok Kataria 2008-11-21 05:59:41 UTC
Created attachment 324276 [details]
x86: implement support to synchronize RDTSC through MFENCE on AMD CPUs

Comment 12 Alok Kataria 2008-11-21 06:00:49 UTC
Created attachment 324277 [details]
x86: introduce rdtsc_barrier()

Comment 13 Andrew Hecox 2008-11-21 23:20:20 UTC
*** Bug 468459 has been marked as a duplicate of this bug. ***

Comment 20 Chris Lalancette 2009-01-16 15:08:04 UTC
I took a quick look at these patches, and they look entirely reasonable.  The only question I have concerns the set_bit(X86_FEATURE_{L,M}FENCE_RDTSC, &c->x86_capability); don't we have to protect that by first checking if sse2 is enabled?  Upstream that's done with the "cpu_has_xmm2" check, but since RHEL-5 doesn't have that, we'd have to do something a little more primitive.  Or am I missing something?

Chris Lalancette

Comment 21 Alok Kataria 2009-01-16 19:36:23 UTC
(In reply to comment #20)
> I took a quick look at these patches, and they look entirely reasonable.  The
> only question I have concerns the set_bit(X86_FEATURE_{L,M}FENCE_RDTSC,
> &c->x86_capability); don't we have to protect that by first checking if sse2 is
> enabled?

These barrier changes are made only for 64-bit code. All 64-bit machines have SSE2 enabled; at least that's what include/asm-x86_64/cpufeature.h says:

#define cpu_has_xmm2           1

So I don't think we need the xmm2 check for RHEL5, since the 32-bit and 64-bit code is still separate.

Thanks,
Alok

> Upstream that's done with the "cpu_has_xmm2" check, but since RHEL-5
> doesn't have that, we'd have to do something a little more primitive.  Or am I
> missing something?
> 
> Chris Lalancette

Comment 22 Chris Lalancette 2009-01-19 07:52:28 UTC
Ah, of course, silly me.  Upstream has the combined 32/64 bit code, which is why it needs the protection.  OK, great, thanks a lot!

Chris Lalancette

Comment 23 Chris Lalancette 2009-01-23 10:24:32 UTC
I've uploaded a test kernel that contains this fix (along with several others)
to this location:

http://people.redhat.com/clalance/virttest

Could the original reporter try out the test kernels there, and report back if
it fixes the problem?

Thanks,
Chris Lalancette

Comment 24 Alok Kataria 2009-01-27 02:07:03 UTC
(In reply to comment #23)
> I've uploaded a test kernel that contains this fix (along with several others)
> to this location:
> 
> http://people.redhat.com/clalance/virttest
> 
> Could the original reporter try out the test kernels there, and report back if
> it fixes the problem?
> 

Yep this kernel does fix the performance problems for me, thanks for picking up the patches.

Alok

Comment 25 Chris Lalancette 2009-02-02 07:42:50 UTC
Great, thanks for the testing!

Chris Lalancette

Comment 26 RHEL Program Management 2009-02-16 15:04:52 UTC
Updating PM score.

Comment 30 Evan McNabb 2009-06-29 16:52:42 UTC
The fix for this is included in the latest RHEL5.4 beta kernels, available at:

http://people.redhat.com/dzickus/el5

Alok (or anyone else hitting this bug), can you please test this kernel when possible? Thanks!

Comment 31 Chris Ward 2009-07-03 18:03:10 UTC
~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value.

Questions can be posted to this bug or your customer or partner representative.

Comment 32 Alok Kataria 2009-07-21 22:19:19 UTC
RHEL5.4 looks okay WRT these patches too. Thanks.

Comment 40 errata-xmlrpc 2009-09-02 08:18:45 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html