Bug 180471

Summary: spurious calls to gettimeofday system call after upgrade to Update 2 from Update 1
Product: Red Hat Enterprise Linux 4 Reporter: Oliver Cook <oliver.cook>
Component: kernelAssignee: Brian Maly <bmaly>
Status: CLOSED CURRENTRELEASE QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.0CC: dduval, jakub, jbaron, k.georgiou
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 4.4 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-09-12 14:55:54 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Java test script used to reproduce problem
none
ltrace output from running the attached script on RHES4.2
none
demonstration of steps performed to downgrade libc and further test results none

Description Oliver Cook 2006-02-08 12:24:21 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
After upgrading to ES4 Update 1 to ES4 Update 2 we are seeing a large increase in system time compared to user time.

We have tracked this down to excessive calls to the gettimeofday system call.

I have attached a test application (in Java) which demonstraces the problem. The output from the following command:

ltrace -S /usr/java/j2sdk1.4.2_02_bd/bin/java BFThreadTest 2>&1|awk -F\( '{print $1}'|sort|uniq -c | sort -n|grep SYS | tail -5

under the two releases is shown below:

Under RHES4.1:

     46 SYS_stat
     55 SYS_futex
     58 SYS_lstat
     76 SYS_open
    105 SYS_mmap

Under RHES4.2:

     54 SYS_futex
     58 SYS_lstat
     76 SYS_open
    105 SYS_mmap
   1658 SYS_gettimeofday

As you can see, RHES4.2 introduces over 1600 calls to the gettimeofday system call which were not executed by the same script running RHES4.1.

The ltrace output suggests that it is not libc, but rather then kernel that is introducing these extra calls, as the gettimeofdays are not preceded by any particular library call.

However, my analysis may not be correct, so I am lodging this as a bug against the kernel and also libc.

This has been tested using both the official Sun and Blackdown JVMs (version 1.4.x).


Version-Release number of selected component (if applicable):
kernel-2.6.9-22.EL glibc-2.3.4-2.13

How reproducible:
Always

Steps to Reproduce:
1. /path/to/bin/javac BFSimplexTest.java
2. ltrace -S /usr/java/j2sdk1.4.2_02_bd/bin/java BFThreadTest
  

Actual Results:  1000s of extra calls to SYS_gettimeofday comparse do running under RHES4 Update 1.

Expected Results:  Similar behaviour to that exhibited under RHES4 Update 1.

Additional info:

Comment 1 Oliver Cook 2006-02-08 12:26:48 UTC
Created attachment 124374 [details]
Java test script used to reproduce problem

Compile with:

/path/to/bin/javac BFSimplexTest.java

Comment 2 Oliver Cook 2006-02-08 12:27:30 UTC
Created attachment 124375 [details]
ltrace output from running the attached script on RHES4.2

Comment 3 Dominic Duval 2006-02-08 17:22:04 UTC
I tested the same thing on RHEL4-x86 U2, with similar results on IBM's J2RE
1.4.2 IBM build cxia32142-20050609. 

Comment 4 Oliver Cook 2006-02-09 11:17:48 UTC
At dduval's request, I have downgraded an affected system from glibc-2.3.4-2.13
to glibc-2.3.4-2.9 and re-run the test.

The test shows the same results. This either suggests that the downgrade did not
work (I have attached the steps I used perform the downgrade, and am confident
that the glibc downgrade did work), or that glibc is not the source of the extra
calls to SYS_gettimeofday.

I would appreciate advice on the next step for debugging the source of these
calls. Of interest might be the instruction pointer at the time of the
SYS_gettimeofday calls. It appears to be an address that is not on the process's
stack:

[ffffffffff600012] gettimeofday({1139414511, 422461}, NULL) = 0
[ffffffffff600012] gettimeofday({1139414511, 422490}, NULL) = 0
[ffffffffff600012] gettimeofday({1139414511, 422548}, NULL) = 0
[      3bb16c26dc] mmap(0x2add360000, 16777216, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2add360000
[      3bb16c26dc] mmap(NULL, 135168,
PROT_READ|PROT_WRITE|PROT_EXEC,MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) =
0x2ae13b7000
[      3bb16c26dc] mmap(0x2ae13b7000, 36864, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2ae13b7000
[      3bb16c26dc] mmap(0x2ae1380000, 32768, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2ae1380000
[ffffffffff600012] gettimeofday({1139414511, 422745}, NULL) = 0
[ffffffffff600012] gettimeofday({1139414511, 422779}, NULL) = 0
[ffffffffff600012] gettimeofday({1139414511, 422841}, NULL) = 0


Comment 5 Oliver Cook 2006-02-09 11:18:52 UTC
Created attachment 124431 [details]
demonstration of steps performed to downgrade libc and further test results

Comment 6 Oliver Cook 2006-02-09 17:35:45 UTC
On the system in question, the above instruction pointer is in the
"vgettimeofday" symbol (from System.map-2.6.9-22.ELsmp):

ffffffff80508000 A _end
ffffffffff600000 t vgettimeofday
ffffffffff600180 D __xtime_lock

This is and of itself isn't very interesting. What I shall do is try to disable
vsyscalls so that the instruction pointer will be from the 'real' caller, rather
than being 'hidden' by the vsyscall.

In the interim, is there anywhere else I should be looking in order to trace
this bug more effectively?

Comment 7 Oliver Cook 2006-02-09 17:59:05 UTC
Using the 32bit Sun JVM to avoid running into vsyscalls I find the instruction
pointer to be in libpthread:

[        ffffe405] gettimeofday({19633435017014, 17862794186932070208},
NULL) = 0
[        ffffe405] gettimeofday({19981327367990, 17862794457515009856},
NULL) = 0
[        ffffe405] gettimeofday({20389349261110, 17862794186932070208},
NULL) = 0

f7ff3000-f7ff5000 rwxp 0000d000 08:08
2014                               /lib/tls/libpthread-2.3.4.so
f7ff5000-f7ff8000 rwxp f7ff5000 00:00 0 
ffe02000-ffe0c000 rwxp ffe02000 00:00 0 
ffe0c000-ffe0f000 ---p ffe0c000 00:00 0 
ffff6000-ffffe000 rwxp ffff6000 00:00 0 
ffffe000-fffff000 ---p 00000000 00:00 0 

As such I'm going to mark this bug as being against the glibc component now
rather than the kernel.

I'm also marking affected architectures as 'All' since dduval has reproduced
this on RHEL4-x86 32bit.

Ollie

Comment 8 Oliver Cook 2006-02-10 11:03:21 UTC
I misinterpreted the output of /proc/pid/maps: lines below a library demarkation
do not also refer to that library.

The call to gettimeofday is not originating from libpthread but an anonymous
page below the stack which _does not_ have execute permissions:

$ pmap -d 23547
23547:   /usr/java/j2sdk1.4.2_08/bin/java Sleep
Address           Kbytes Mode  Offset           Device    Mapping
00000000002fe000      84 r-x-- 0000000000000000 008:00008 ld-2.3.4.so
0000000000313000       4 r-x-- 0000000000015000 008:00008 ld-2.3.4.so
0000000000314000       4 rwx-- 0000000000016000 008:00008 ld-2.3.4.so
0000000000317000    1168 r-x-- 0000000000000000 008:00008 libc-2.3.4.so
000000000043b000       4 r-x-- 0000000000124000 008:00008 libc-2.3.4.so
000000000043c000      12 rwx-- 0000000000125000 008:00008 libc-2.3.4.so
000000000043f000       8 rwx-- 000000000043f000 000:00000   [ anon ]
0000000008048000      56 r-x-- 0000000000000000 008:00002 java
0000000008056000      12 rwx-- 000000000000d000 008:00002 java
0000000008059000     276 rwx-- 0000000008059000 000:00000   [ anon ]
00000000eb112000       4 ----- 00000000eb112000 000:00000   [ anon ]
00000000eb113000      52 rwx-- 00000000eb113000 000:00000   [ anon ]
00000000eb120000      12 ----- 00000000eb120000 000:00000   [ anon ]
00000000eb123000     448 rwx-- 00000000eb123000 000:00000   [ anon ]
00000000eb193000     112 r-xs- 0000000000000000 008:00002 sunjce_provider.jar
00000000eb1af000     752 r-xs- 0000000000000000 008:00002 localedata.jar
00000000eb26b000      12 r-xs- 0000000000000000 008:00002 dnsns.jar
00000000eb26e000      52 r-xs- 0000000000000000 008:00002 ldapsec.jar
00000000eb27b000       4 ----- 00000000eb27b000 000:00000   [ anon ]
00000000eb27c000     512 rwx-- 00000000eb27c000 000:00000   [ anon ]
00000000eb2fc000       4 ----- 00000000eb2fc000 000:00000   [ anon ]
00000000eb2fd000      52 rwx-- 00000000eb2fd000 000:00000   [ anon ]
00000000eb30a000      12 ----- 00000000eb30a000 000:00000   [ anon ]
00000000eb30d000     448 rwx-- 00000000eb30d000 000:00000   [ anon ]
00000000eb37d000       4 ----- 00000000eb37d000 000:00000   [ anon ]
00000000eb37e000      52 rwx-- 00000000eb37e000 000:00000   [ anon ]
00000000eb38b000      12 ----- 00000000eb38b000 000:00000   [ anon ]
00000000eb38e000     448 rwx-- 00000000eb38e000 000:00000   [ anon ]
00000000eb3fe000       4 ----- 00000000eb3fe000 000:00000   [ anon ]
00000000eb3ff000     512 rwx-- 00000000eb3ff000 000:00000   [ anon ]
00000000eb47f000    2048 r-x-- 0000000000000000 008:00002 locale-archive
00000000eb67f000       4 ----- 00000000eb67f000 000:00000   [ anon ]
00000000eb680000      52 rwx-- 00000000eb680000 000:00000   [ anon ]
00000000eb68d000      12 ----- 00000000eb68d000 000:00000   [ anon ]
00000000eb690000     580 rwx-- 00000000eb690000 000:00000   [ anon ]
00000000eb721000     892 ----- 00000000eb721000 000:00000   [ anon ]
00000000eb83e000       4 ----- 00000000eb83e000 000:00000   [ anon ]
00000000eb83f000      52 rwx-- 00000000eb83f000 000:00000   [ anon ]
00000000eb84c000      12 ----- 00000000eb84c000 000:00000   [ anon ]
00000000eb84f000     448 rwx-- 00000000eb84f000 000:00000   [ anon ]
00000000eb8bf000       4 ----- 00000000eb8bf000 000:00000   [ anon ]
00000000eb8c0000     524 rwx-- 00000000eb8c0000 000:00000   [ anon ]
00000000eb943000     120 rwx-- 00000000eb943000 000:00000   [ anon ]
00000000eb961000       4 rwx-- 00000000eb961000 000:00000   [ anon ]
00000000eb962000     116 rwx-- 00000000eb962000 000:00000   [ anon ]
00000000eb97f000       4 rwx-- 00000000eb97f000 000:00000   [ anon ]
00000000eb980000       4 rwx-- 00000000eb980000 000:00000   [ anon ]
00000000eb981000       8 rwx-- 00000000eb981000 000:00000   [ anon ]
00000000eb983000     112 rwx-- 00000000eb983000 000:00000   [ anon ]
00000000eb99f000       8 rwx-- 00000000eb99f000 000:00000   [ anon ]
00000000eb9a1000     120 rwx-- 00000000eb9a1000 000:00000   [ anon ]
00000000eb9bf000     644 rwx-- 00000000eb9bf000 000:00000   [ anon ]
00000000eba60000    4352 rwx-- 00000000eba60000 000:00000   [ anon ]
00000000ebea0000    1408 rwx-- 00000000ebea0000 000:00000   [ anon ]
00000000ec000000   59136 rwx-- 00000000ec000000 000:00000   [ anon ]
00000000ef9c0000    4096 rwx-- 00000000ef9c0000 000:00000   [ anon ]
00000000efdc0000   61440 rwx-- 00000000efdc0000 000:00000   [ anon ]
00000000f39c7000     168 rwx-- 00000000f39c7000 000:00000   [ anon ]
00000000f39f1000     504 rwx-- 00000000f39f1000 000:00000   [ anon ]
00000000f3a6f000     512 rwx-- 00000000f3a6f000 000:00000   [ anon ]
00000000f3aef000   32256 rwx-- 00000000f3aef000 000:00000   [ anon ]
00000000f5a6f000    5760 r-xs- 0000000000000000 008:00002 charsets.jar
00000000f600f000      68 r-xs- 0000000000000000 008:00002 jce.jar
00000000f6020000     884 r-xs- 0000000000000000 008:00002 jsse.jar
00000000f60fd000      88 r-xs- 0000000000000000 008:00002 sunrsasign.jar
00000000f6113000     296 rwx-- 00000000f6113000 000:00000   [ anon ]
00000000f615d000   26284 r-xs- 0000000000000000 008:00002 rt.jar
00000000f7b08000      80 r-x-- 0000000000000000 008:00002 libzip.so
00000000f7b1c000      12 rwx-- 0000000000013000 008:00002 libzip.so
00000000f7b1f000     128 r-x-- 0000000000000000 008:00002 libjava.so
00000000f7b3f000       8 rwx-- 000000000001f000 008:00002 libjava.so
00000000f7b41000      64 r-x-- 0000000000000000 008:00002 libverify.so
00000000f7b51000       8 rwx-- 000000000000f000 008:00002 libverify.so
00000000f7b53000      36 r-x-- 0000000000000000 008:00008 libnss_files-2.3.4.so
00000000f7b5c000       8 rwx-- 0000000000008000 008:00008 libnss_files-2.3.4.so
00000000f7b60000      16 rwxs- 0000000000000000 008:00006 23547
00000000f7b64000      32 r-x-- 0000000000000000 008:00002 libhpi.so
00000000f7b6c000       4 rwx-- 0000000000007000 008:00002 libhpi.so
00000000f7b6d000     132 r-x-- 0000000000000000 008:00008 libm-2.3.4.so
00000000f7b8e000       8 rwx-- 0000000000020000 008:00008 libm-2.3.4.so
00000000f7b90000      72 r-x-- 0000000000000000 008:00008 libnsl-2.3.4.so
00000000f7ba2000       8 rwx-- 0000000000011000 008:00008 libnsl-2.3.4.so
00000000f7ba4000       8 rwx-- 00000000f7ba4000 000:00000   [ anon ]
00000000f7bac000    4120 r-x-- 0000000000000000 008:00002 libjvm.so
00000000f7fb2000     108 rwx-- 0000000000405000 008:00002 libjvm.so
00000000f7fcd000      80 rwx-- 00000000f7fcd000 000:00000   [ anon ]
00000000f7fe1000       8 r-x-- 0000000000000000 008:00008 libdl-2.3.4.so
00000000f7fe3000       8 rwx-- 0000000000001000 008:00008 libdl-2.3.4.so
00000000f7fe5000      56 r-x-- 0000000000000000 008:00008 libpthread-2.3.4.so
00000000f7ff3000       8 rwx-- 000000000000d000 008:00008 libpthread-2.3.4.so
00000000f7ff5000      12 rwx-- 00000000f7ff5000 000:00000   [ anon ]
00000000ffe02000      40 rwx-- 00000000ffe02000 000:00000   [ anon ]
00000000ffe0c000      12 ----- 00000000ffe0c000 000:00000   [ anon ]
00000000ffff6000      32 rwx-- 00000000ffff6000 000:00000   [ stack ]
00000000ffffe000       4 ----- 0000000000000000 000:00000   [ anon ]
mapped: 213272K    writeable/private: 170152K    shared: 34028K

Unlike all the other anonymous pages that this process has mapped, though, the
offset for the final 4K page at ffffe000 is shown as 0, rather than being equal
to the start address.

Is this difference pertinent?

More importantly, how is a page with no execute permissions executing a system call?

Comment 9 Jakub Jelinek 2006-02-14 07:38:23 UTC
x86_64 glibc never invokes gettimeofday syscall directly, it always calls the
vsyscall:
/* For the calculation see asm/vsyscall.h.  */
#define VSYSCALL_ADDR_vgettimeofday     0xffffffffff600000


ENTRY (__gettimeofday)
        /* Align stack.  */
        sub     $0x8, %rsp
        cfi_adjust_cfa_offset(8)
        movq    $VSYSCALL_ADDR_vgettimeofday, %rax
        callq   *%rax
        /* Check error return.  */
        cmpl    $-4095, %eax
        jae     SYSCALL_ERROR_LABEL

L(pseudo_end):
        add     $0x8, %rsp
        cfi_adjust_cfa_offset(-8)
        ret
PSEUDO_END(__gettimeofday)

The vDSO is provided by the kernel:
static int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz)
{
        if (unlikely(!__sysctl_vsyscall))
        return gettimeofday(tv,tz);
        if (tv)
                do_vgettimeofday(tv);
        if (tz)
                do_get_tz(tz);
        return 0;
}



Comment 10 Jason Baron 2006-02-17 15:12:37 UTC
what architecture/kernel is running?

Comment 11 Oliver Cook 2006-02-17 15:48:16 UTC
This was seen on a stock RHES4.2 kernel (kernel-2.6.9-22.EL) running on 64bit
(Opteron).

dduval notes, also in this ticket, that he has reproduced it on a
different architecture:

"I tested the same thing on RHEL4-x86 U2, with similar results on IBM's J2RE
1.4.2 IBM build cxia32142-20050609. "

Ollie

Comment 12 Jason Baron 2006-02-17 21:09:55 UTC
ok, i think i've gotten to the bottom of this. In U2 we added the 'pmtimer' to
x86_64. This sets: sysctl_vsyscall = 0; in arch/x86_64/kernel/time.c. If you
look in arch/x86_64/kernel/vsyscall.c, it does:

static int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz)
{
        if (unlikely(!__sysctl_vsyscall))
        return gettimeofday(tv,tz);
        if (tv)
                do_vgettimeofday(tv);
        if (tz)
                do_get_tz(tz);
        return 0;
}

The first call is actually to the syscall.

We made this change b/c the tsc gets out of sync b/w cpus and can lead to
unstable time. The upstream kernel does the same. To get the old behavior you
can pass 'nopmtimer' and 'nohpet' and the commandline. That will return things
to using the tsc, and you shouldn't see the syscalls anymore. Is this a serious
performance degradation for your applicaton?

thanks.

Comment 13 Oliver Cook 2006-02-18 10:37:48 UTC
When running with this problem under U2 we saw a much higher user:sys CPU 
ratio. Our CPUs were noticably busier for the same workload, so yes I would 
classify this as a performance degradation.

I will try nopmtimer and nohpet as kernel boot options and see how that affects 
us.

What are the disadvantages of using the TSC timer instead?

Thanks.

Comment 14 bugzilla 2006-07-10 19:15:29 UTC
We are also exhibiting the same problem (abnormally high sys CPU time, abnormally high # of calls to 
gettimeofday) when moving from U1 to U2 on an IBM x366 server (x86_64, 4 dual core Xeons, kernel 
2.6.9-22.ELsmp).

We're using DB2 8.1 fixpack 11, and typical query time increases by 3-4x on U2.  The system:user time 
rises from approximately 10% under U1 to 80% under U2.  We were forced to revert to U1, which 
resolved the issue.  These results were repeatable across several machines running U1 and U2.

I discovered the high # (tens of thousands) of gettimeofday calls independently, using strace, and then 
today found this corroborating bug report.  I see that no followup occurred after the suggestion to use 
the TSC.  I can't find any similar bug report here nor any mention of a fix in the 2.6.9-34.ELsmp kernel 
changelog.

So let me ask: did using the 'nopmtimer' and 'nohpet' options resolve the issue on U2?  Is there another 
fix other than to use the apparently unreliable TSC?  In the meantime we are reverting our U2 systems 
to U1, as queries running at 1/4 of previous speed is not acceptable to users.

Thanks. -- Jim Ursetto


Comment 15 Oliver Cook 2006-07-12 13:44:28 UTC
In response to Jim Ursetto, we did not try using the TSC timer so I cannot
verify if that resolves the problem under U2. We also downgraded to U1 which is
where we remain for the time being.

Ollie Cook (oliver.cook)

Comment 16 Brian Maly 2006-08-16 19:32:34 UTC
The vsyscall for gettimeofday() was pointing to the wrong entry address in RHEL4
U3. This was fixed as of U4. If the RHEL4.3 kernel is used, vgettimeofday() will
only result in dissapoint. Can you upgrade to the RHEL4.4 kernel and re-test? I
believe the problem has been resolved.

Comment 17 Oliver Cook 2006-08-17 05:41:34 UTC
Brian, did the same bug exist in 4.2 or was it introduced in 4.3, then fixed in 
4.4?

I have asked our operational teams to upgrade a host to 4.4 see if can 
demonstrate the problem is resolved.

Comment 18 Brian Maly 2006-09-12 14:53:43 UTC
The bug was a regression introduced in a previous release (4.3), but the bug was
fixed in the 4.4 kernel. Upgrading the kernel (to 4.4) should solve the issue. 

Since this bug has been fixed, this bug should get closed. Please re-open the
bug if you discover the problem still exists (in that case its a different
problem and not the vsyscall).



Comment 19 Brian Maly 2006-09-12 14:55:54 UTC
Closing this bug since a patch that addressed this issue was included in 4.4