Bug 181780

Summary: Gettimeofday() timer related slowdown and scaling issue
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Andrew Bond <andrew.bond>
Assignee: Jim Paradis <jparadis>
QA Contact: Brian Brock <bbrock>
CC: jbaron, jturner, k.georgiou, peterm, tao
Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Last Closed: 2006-08-10 22:18:35 UTC
Bug Blocks: 181409
Attachments (description, flags):
program to measure gettimeofday performance and scaling (flags: none)
Chart showing bmark test data from U1, U2, and custom U2. (flags: none)
gettimeof day fixes (flags: none)
upstream reference patch (flags: none)
Graph of RHEL4 U3 versus 2.6.15.4 (flags: none)
patch to resolve "nopmtimer" not calling the virtual gettimeofday syscall (flags: none)
Partner help from Andy to plot the new RHEL4 U4 kernel against other data (flags: none)

Description Andrew Bond 2006-02-16 15:30:43 UTC
Description of problem:
There is a noticeable slowdown in gettimeofday() performance in update 2
compared to update 1.  It appears to be partly related to the switch from the
PIT timer in U1 to the PM timer in U2.  However, there is also a
synchronization issue that prevents the gettimeofday() system call in U2 from
scaling across processors.

The kernel boot-line directives for selecting other timers don't appear to work
on x86_64; the PM timer is always used no matter which timer is selected at
boot time.

I created a U2 kernel that booted with the PIT timer and the performance of a
single thread increased to what U1 was able to do.  However, there is no scaling
across multiple processors.

I am attaching a pdf chart showing the output of a bmark.c test that I am also
attaching.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Run the bmark gettimeofday test on both U1 and U2.
Actual results:


Expected results:


Additional info:

Comment 1 Andrew Bond 2006-02-16 15:33:28 UTC
Created attachment 124758 [details]
program to measure gettimeofday performance and scaling

This is the program I used to generate the gettimeofday() perf data.
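
The attached bmark.c is not reproduced in this report. As a rough illustration
of the kind of measurement described, the sketch below starts a given number of
threads, has each one call gettimeofday() a fixed number of times, and reports
the aggregate calls per second. The function name gtod_calls_per_sec and the
fixed-iteration design are assumptions made for this sketch, not details taken
from the attachment.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/time.h>

/* Hypothetical sketch; not the attached bmark.c. */
struct worker_arg {
    long iterations;    /* gettimeofday() calls made by each thread */
};

/* Thread body: call gettimeofday() a fixed number of times. */
static void *worker(void *p)
{
    const struct worker_arg *arg = p;
    struct timeval tv;
    long i;

    for (i = 0; i < arg->iterations; i++)
        gettimeofday(&tv, NULL);
    return NULL;
}

/* Wall-clock time in seconds, itself read with gettimeofday(). */
static double now_seconds(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/*
 * Run nthreads workers, each making `iterations` calls, and return the
 * aggregate number of calls per second. Returns -1.0 on error.
 */
double gtod_calls_per_sec(int nthreads, long iterations)
{
    pthread_t *tids;
    struct worker_arg arg;
    double start, elapsed;
    int i, started = 0;

    if (nthreads < 1 || iterations < 1)
        return -1.0;
    tids = malloc(sizeof(*tids) * (size_t)nthreads);
    if (tids == NULL)
        return -1.0;

    arg.iterations = iterations;
    start = now_seconds();
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, worker, &arg) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    elapsed = now_seconds() - start;
    free(tids);

    if (started != nthreads || elapsed <= 0.0)
        return -1.0;
    return (double)nthreads * (double)iterations / elapsed;
}
```

Calling gtod_calls_per_sec() with 1, 2, 4, ... threads and the same iteration
count gives one point per thread count; a flat aggregate rate as threads are
added corresponds to the non-scaling behavior reported above. The measured
interval includes thread start-up, so the iteration count should be large
enough to make that overhead negligible.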

Comment 2 Andrew Bond 2006-02-16 15:36:27 UTC
Created attachment 124759 [details]
Chart showing bmark test data from U1, U2, and custom U2.

The U2-PIT test was run with a kernel that had the PM timer config file option
turned off.

Comment 3 Jason Baron 2006-02-16 21:07:34 UTC
Is this an AMD system or Intel?

Comment 4 Andrew Bond 2006-02-16 21:20:47 UTC
AMD.

Comment 5 Jason Baron 2006-02-16 23:25:28 UTC
Created attachment 124794 [details]
gettimeof day fixes

I think that with this patch, and with 'nohpet' and 'nopmtimer' passed on the
command line, the TSC will be used. You can verify the gettimeofday timer via
'dmesg | grep timer.c'.

I think the straight line is just going to be the default unless the
command-line arguments are passed for AMD systems. It would be interesting to
see how upstream benchmarks too. Thanks.

Comment 6 Jason Baron 2006-02-17 15:01:27 UTC
Created attachment 124817 [details]
upstream reference patch

Comment 7 Jason Baron 2006-02-17 15:14:44 UTC
Also, passing 'nopmtimer' and 'nohpet' on the command line with the shipping
kernel should get the better scaling behavior.

Comment 8 Brian Maly 2006-02-17 15:16:59 UTC
What issue do the patches in comment #5 and comment #6 address?

This was a regression introduced when the PM timer changes went into U2. It
looks like the regression is probably in time.c, given that it exists even when
using the PIT. Since the PM timer patches brought us in line with upstream, the
lack of scalability almost certainly exists there as well (although it would
certainly be useful to know this for sure).


Comment 9 Jason Baron 2006-02-17 15:23:05 UTC
The patch in comment #5 addresses the fact that 'unsynchronized_tsc' will
almost always return true because the clustermap data structure is not properly
initialized. This means that Intel chips never use the TSC for gtod, which they
really should. I agree that the 'flatline' in the chart is now going to be the
default for most x86_64 systems. However, I suspect the 'PIT' line in the chart
is incorrect, and is likely one of the other timers.

Comment 10 Brian Maly 2006-02-17 15:44:55 UTC
The patch in comment #5 may be a good thing to investigate for Intel-related
time scalability issues, but this BZ affects AMD, not Intel. AMD doesn't use
the TSC for timekeeping because it is deemed unreliable. This is why the PM
timer patch went into U2.

It would be nice to see a graph of PIT, HPET and PM timer scalability for U2
and U3 to see whether this is truly a generic regression or a regression that
only affects one method of timekeeping.

Comment 11 Andrew Bond 2006-02-17 17:21:14 UTC
How can I force hpet timer usage in the stock kernel?  I'm using RHEL4 U3 .29

nohpet and nopmtimer forced use of PIT/TSC according to time.c
nopmtimer only also forced use of PIT/TSC according to time.c
no options used PM timer as expected.
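
When comparing runs like the three above, it helps to record which timer
options the running kernel was actually booted with. A minimal sketch, assuming
the boot command line is already available as a string (on Linux it can be read
from /proc/cmdline); the helper name has_boot_flag is hypothetical:

```c
#include <string.h>

/* Return 1 if c ends a token (end of string or whitespace), else 0. */
static int is_token_break(char c)
{
    return c == '\0' || c == ' ' || c == '\t' || c == '\n';
}

/*
 * Return 1 if `flag` appears as a whole whitespace-delimited token in
 * the kernel command line `cmdline`, else 0. A match inside a longer
 * word does not count.
 */
int has_boot_flag(const char *cmdline, const char *flag)
{
    size_t flen = strlen(flag);
    const char *p = cmdline;

    if (flen == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == cmdline) || is_token_break(p[-1]);
        int ends = is_token_break(p[flen]);

        if (starts && ends)
            return 1;
        p++;    /* keep searching after this partial match */
    }
    return 0;
}
```

Checking for both "nohpet" and "nopmtimer" this way only confirms what was
passed at boot; which timer the kernel then selected still has to be read from
the kernel log.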

Comment 12 Andrew Bond 2006-02-17 19:22:30 UTC
Created attachment 124829 [details]
Graph of RHEL4 U3 versus 2.6.15.4

When using PM timer the 2.6.15.4 kernel maps directly with RHEL4 U3.  Using
nopmtimer in RHEL4 U3 shows scaling unlike my initial test with RHEL4 U2. 
However, 2.6.15.4 with nopmtimer starts over 2x higher at 1 thread (6,736,148
vs 3,145,266).
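
For reference, the single-thread figures above work out to a ratio of about
2.14 (6,736,148 / 3,145,266). The sketch below shows that arithmetic together
with a scaling-efficiency helper for multi-thread runs; both function names are
hypothetical, and no multi-thread numbers from the graph are assumed.

```c
/* Ratio of two throughput figures, e.g. calls/sec on two kernels. */
double throughput_ratio(double rate_a, double rate_b)
{
    return rate_a / rate_b;
}

/*
 * Scaling efficiency of an n-thread run relative to the 1-thread run:
 * 1.0 means perfectly linear scaling, and 1.0 / n means the aggregate
 * rate did not increase at all (a flat line).
 */
double scaling_efficiency(double rate_n, double rate_1, int nthreads)
{
    return rate_n / ((double)nthreads * rate_1);
}
```

With the numbers quoted above, throughput_ratio(6736148.0, 3145266.0) is
roughly 2.14.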

Comment 13 Jason Baron 2006-02-17 19:32:11 UTC
OK, that looks much better :) I suspect that the TSC code might run faster with
this patch: http://marc.theaimsgroup.com/?l=git-commits-head&m=113705338714125&w=2.

The comment says it's Intel-specific, but it really isn't. If you think it's
important for us to scale better, we might try backporting it. I'm also curious
how you encountered this issue in the first place.

Comment 14 Andrew Bond 2006-02-17 19:46:00 UTC
I went back and ran the stock U2 kernel with the nopmtimer flag and it performs
just like U3 did in my latest comparison graph.  It scales linearly starting
from 3.1 million gettimeofday() calls/sec at 1 thread.

In my original test with U2 forcing PIT, I had recompiled the U2 kernel with
CONFIG_X86_PM_TIMER unset.  That is what produced a higher 1-thread number
in my first chart, but it exhibited the same flat line across multiple threads.

We had originally been doing some testing with IA-64 gettimeofday() calls to 
solve a similar non-scaling issue with that architecture.  A comparison test 
was run with an Opteron box running U3 and we were surprised to find out it 
didn't show any scaling.  We stepped back to U2 and U1 to see how they 
performed and discovered the U1/U2 delta.

Comment 16 Peter Martuccelli 2006-05-05 17:55:29 UTC
Created attachment 128670 [details]
patch to resolve "nopmtimer" not calling the virtual gettimeofday syscall

Comment 19 Jason Baron 2006-05-08 15:40:42 UTC
Committed in stream U4 build 35.3. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 21 John Shakshober 2006-05-11 14:39:47 UTC
Created attachment 128895 [details]
Partner help from Andy to plot the new RHEL4 U4 kernel against other data

Comment 26 Red Hat Bugzilla 2006-08-10 22:18:41 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html