Bug 180471
Summary: | spurious calls to gettimeofday system call after upgrade to Update 2 from Update 1 | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Oliver Cook <oliver.cook> |
Component: | kernel | Assignee: | Brian Maly <bmaly> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4.0 | CC: | dduval, jakub, jbaron, k.georgiou |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | 4.4 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2006-09-12 14:55:54 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Attachments: |
Description
Oliver Cook
2006-02-08 12:24:21 UTC
Created attachment 124374 [details]
Java test script used to reproduce problem
Compile with:
/path/to/bin/javac BFSimplexTest.java
Created attachment 124375 [details]
ltrace output from running the attached script on RHES4.2
I tested the same thing on RHEL4-x86 U2, with similar results on IBM's J2RE 1.4.2 IBM build cxia32142-20050609. At dduval's request, I have downgraded an affected system from glibc-2.3.4-2.13 to glibc-2.3.4-2.9 and re-run the test. The test shows the same results. This either suggests that the downgrade did not work (I have attached the steps I used perform the downgrade, and am confident that the glibc downgrade did work), or that glibc is not the source of the extra calls to SYS_gettimeofday. I would appreciate advice on the next step for debugging the source of these calls. Of interest might be the instruction pointer at the time of the SYS_gettimeofday calls. It appears to be an address that is not on the process's stack: [ffffffffff600012] gettimeofday({1139414511, 422461}, NULL) = 0 [ffffffffff600012] gettimeofday({1139414511, 422490}, NULL) = 0 [ffffffffff600012] gettimeofday({1139414511, 422548}, NULL) = 0 [ 3bb16c26dc] mmap(0x2add360000, 16777216, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2add360000 [ 3bb16c26dc] mmap(NULL, 135168, PROT_READ|PROT_WRITE|PROT_EXEC,MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x2ae13b7000 [ 3bb16c26dc] mmap(0x2ae13b7000, 36864, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2ae13b7000 [ 3bb16c26dc] mmap(0x2ae1380000, 32768, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2ae1380000 [ffffffffff600012] gettimeofday({1139414511, 422745}, NULL) = 0 [ffffffffff600012] gettimeofday({1139414511, 422779}, NULL) = 0 [ffffffffff600012] gettimeofday({1139414511, 422841}, NULL) = 0 Created attachment 124431 [details]
demonstration of steps performed to downgrade libc and further test results
On the system in question, the above instruction pointer is in the "vgettimeofday" symbol (from System.map-2.6.9-22.ELsmp): ffffffff80508000 A _end ffffffffff600000 t vgettimeofday ffffffffff600180 D __xtime_lock This is and of itself isn't very interesting. What I shall do is try to disable vsyscalls so that the instruction pointer will be from the 'real' caller, rather than being 'hidden' by the vsyscall. In the interim, is there anywhere else I should be looking in order to trace this bug more effectively? Using the 32bit Sun JVM to avoid running into vsyscalls I find the instruction pointer to be in libpthread: [ ffffe405] gettimeofday({19633435017014, 17862794186932070208}, NULL) = 0 [ ffffe405] gettimeofday({19981327367990, 17862794457515009856}, NULL) = 0 [ ffffe405] gettimeofday({20389349261110, 17862794186932070208}, NULL) = 0 f7ff3000-f7ff5000 rwxp 0000d000 08:08 2014 /lib/tls/libpthread-2.3.4.so f7ff5000-f7ff8000 rwxp f7ff5000 00:00 0 ffe02000-ffe0c000 rwxp ffe02000 00:00 0 ffe0c000-ffe0f000 ---p ffe0c000 00:00 0 ffff6000-ffffe000 rwxp ffff6000 00:00 0 ffffe000-fffff000 ---p 00000000 00:00 0 As such I'm going to mark this bug as being against the glibc component now rather than the kernel. I'm also marking affected architectures as 'All' since dduval has reproduced this on RHEL4-x86 32bit. Ollie I misinterpreted the output of /proc/pid/maps: lines below a library demarkation do not also refer to that library. The call to gettimeofday is not originating from libpthread but an anonymous page below the stack which _does not_ have execute permissions: $ pmap -d 23547 23547: /usr/java/j2sdk1.4.2_08/bin/java Sleep Address Kbytes Mode Offset Device Mapping 00000000002fe000 84 r-x-- 0000000000000000 008:00008 ld-2.3.4.so 0000000000313000 4 r-x-- 0000000000015000 008:00008 ld-2.3.4.so 0000000000314000 4 rwx-- 0000000000016000 008:00008 ld-2.3.4.so 0000000000317000 1168 r-x-- 0000000000000000 008:00008 libc-2.3.4.so 000000000043b000 4 r-x-- 0000000000124000 008:00008 libc-2.3.4.so 000000000043c000 12 rwx-- 0000000000125000 008:00008 libc-2.3.4.so 000000000043f000 8 rwx-- 000000000043f000 000:00000 [ anon ] 0000000008048000 56 r-x-- 0000000000000000 008:00002 java 0000000008056000 12 rwx-- 000000000000d000 008:00002 java 0000000008059000 276 rwx-- 0000000008059000 000:00000 [ anon ] 00000000eb112000 4 ----- 00000000eb112000 000:00000 [ anon ] 00000000eb113000 52 rwx-- 00000000eb113000 000:00000 [ anon ] 00000000eb120000 12 ----- 00000000eb120000 000:00000 [ anon ] 00000000eb123000 448 rwx-- 00000000eb123000 000:00000 [ anon ] 00000000eb193000 112 r-xs- 0000000000000000 008:00002 sunjce_provider.jar 00000000eb1af000 752 r-xs- 0000000000000000 008:00002 localedata.jar 00000000eb26b000 12 r-xs- 0000000000000000 008:00002 dnsns.jar 00000000eb26e000 52 r-xs- 0000000000000000 008:00002 ldapsec.jar 00000000eb27b000 4 ----- 00000000eb27b000 000:00000 [ anon ] 00000000eb27c000 512 rwx-- 00000000eb27c000 000:00000 [ anon ] 00000000eb2fc000 4 ----- 00000000eb2fc000 000:00000 [ anon ] 00000000eb2fd000 52 rwx-- 00000000eb2fd000 000:00000 [ anon ] 00000000eb30a000 12 ----- 00000000eb30a000 000:00000 [ anon ] 00000000eb30d000 448 rwx-- 00000000eb30d000 000:00000 [ anon ] 00000000eb37d000 4 ----- 00000000eb37d000 000:00000 [ anon ] 00000000eb37e000 52 rwx-- 00000000eb37e000 000:00000 [ anon ] 00000000eb38b000 12 ----- 00000000eb38b000 000:00000 [ anon ] 00000000eb38e000 448 rwx-- 00000000eb38e000 000:00000 [ anon ] 00000000eb3fe000 4 ----- 00000000eb3fe000 000:00000 [ anon ] 00000000eb3ff000 512 rwx-- 00000000eb3ff000 000:00000 [ anon ] 00000000eb47f000 2048 r-x-- 0000000000000000 008:00002 locale-archive 00000000eb67f000 4 ----- 00000000eb67f000 000:00000 [ anon ] 00000000eb680000 52 rwx-- 00000000eb680000 000:00000 [ anon ] 00000000eb68d000 12 ----- 00000000eb68d000 000:00000 [ anon ] 00000000eb690000 580 rwx-- 00000000eb690000 000:00000 [ anon ] 00000000eb721000 892 ----- 00000000eb721000 000:00000 [ anon ] 00000000eb83e000 4 ----- 00000000eb83e000 000:00000 [ anon ] 00000000eb83f000 52 rwx-- 00000000eb83f000 000:00000 [ anon ] 00000000eb84c000 12 ----- 00000000eb84c000 000:00000 [ anon ] 00000000eb84f000 448 rwx-- 00000000eb84f000 000:00000 [ anon ] 00000000eb8bf000 4 ----- 00000000eb8bf000 000:00000 [ anon ] 00000000eb8c0000 524 rwx-- 00000000eb8c0000 000:00000 [ anon ] 00000000eb943000 120 rwx-- 00000000eb943000 000:00000 [ anon ] 00000000eb961000 4 rwx-- 00000000eb961000 000:00000 [ anon ] 00000000eb962000 116 rwx-- 00000000eb962000 000:00000 [ anon ] 00000000eb97f000 4 rwx-- 00000000eb97f000 000:00000 [ anon ] 00000000eb980000 4 rwx-- 00000000eb980000 000:00000 [ anon ] 00000000eb981000 8 rwx-- 00000000eb981000 000:00000 [ anon ] 00000000eb983000 112 rwx-- 00000000eb983000 000:00000 [ anon ] 00000000eb99f000 8 rwx-- 00000000eb99f000 000:00000 [ anon ] 00000000eb9a1000 120 rwx-- 00000000eb9a1000 000:00000 [ anon ] 00000000eb9bf000 644 rwx-- 00000000eb9bf000 000:00000 [ anon ] 00000000eba60000 4352 rwx-- 00000000eba60000 000:00000 [ anon ] 00000000ebea0000 1408 rwx-- 00000000ebea0000 000:00000 [ anon ] 00000000ec000000 59136 rwx-- 00000000ec000000 000:00000 [ anon ] 00000000ef9c0000 4096 rwx-- 00000000ef9c0000 000:00000 [ anon ] 00000000efdc0000 61440 rwx-- 00000000efdc0000 000:00000 [ anon ] 00000000f39c7000 168 rwx-- 00000000f39c7000 000:00000 [ anon ] 00000000f39f1000 504 rwx-- 00000000f39f1000 000:00000 [ anon ] 00000000f3a6f000 512 rwx-- 00000000f3a6f000 000:00000 [ anon ] 00000000f3aef000 32256 rwx-- 00000000f3aef000 000:00000 [ anon ] 00000000f5a6f000 5760 r-xs- 0000000000000000 008:00002 charsets.jar 00000000f600f000 68 r-xs- 0000000000000000 008:00002 jce.jar 00000000f6020000 884 r-xs- 0000000000000000 008:00002 jsse.jar 00000000f60fd000 88 r-xs- 0000000000000000 008:00002 sunrsasign.jar 00000000f6113000 296 rwx-- 00000000f6113000 000:00000 [ anon ] 00000000f615d000 26284 r-xs- 0000000000000000 008:00002 rt.jar 00000000f7b08000 80 r-x-- 0000000000000000 008:00002 libzip.so 00000000f7b1c000 12 rwx-- 0000000000013000 008:00002 libzip.so 00000000f7b1f000 128 r-x-- 0000000000000000 008:00002 libjava.so 00000000f7b3f000 8 rwx-- 000000000001f000 008:00002 libjava.so 00000000f7b41000 64 r-x-- 0000000000000000 008:00002 libverify.so 00000000f7b51000 8 rwx-- 000000000000f000 008:00002 libverify.so 00000000f7b53000 36 r-x-- 0000000000000000 008:00008 libnss_files-2.3.4.so 00000000f7b5c000 8 rwx-- 0000000000008000 008:00008 libnss_files-2.3.4.so 00000000f7b60000 16 rwxs- 0000000000000000 008:00006 23547 00000000f7b64000 32 r-x-- 0000000000000000 008:00002 libhpi.so 00000000f7b6c000 4 rwx-- 0000000000007000 008:00002 libhpi.so 00000000f7b6d000 132 r-x-- 0000000000000000 008:00008 libm-2.3.4.so 00000000f7b8e000 8 rwx-- 0000000000020000 008:00008 libm-2.3.4.so 00000000f7b90000 72 r-x-- 0000000000000000 008:00008 libnsl-2.3.4.so 00000000f7ba2000 8 rwx-- 0000000000011000 008:00008 libnsl-2.3.4.so 00000000f7ba4000 8 rwx-- 00000000f7ba4000 000:00000 [ anon ] 00000000f7bac000 4120 r-x-- 0000000000000000 008:00002 libjvm.so 00000000f7fb2000 108 rwx-- 0000000000405000 008:00002 libjvm.so 00000000f7fcd000 80 rwx-- 00000000f7fcd000 000:00000 [ anon ] 00000000f7fe1000 8 r-x-- 0000000000000000 008:00008 libdl-2.3.4.so 00000000f7fe3000 8 rwx-- 0000000000001000 008:00008 libdl-2.3.4.so 00000000f7fe5000 56 r-x-- 0000000000000000 008:00008 libpthread-2.3.4.so 00000000f7ff3000 8 rwx-- 000000000000d000 008:00008 libpthread-2.3.4.so 00000000f7ff5000 12 rwx-- 00000000f7ff5000 000:00000 [ anon ] 00000000ffe02000 40 rwx-- 00000000ffe02000 000:00000 [ anon ] 00000000ffe0c000 12 ----- 00000000ffe0c000 000:00000 [ anon ] 00000000ffff6000 32 rwx-- 00000000ffff6000 000:00000 [ stack ] 00000000ffffe000 4 ----- 0000000000000000 000:00000 [ anon ] mapped: 213272K writeable/private: 170152K shared: 34028K Unlike all the other anonymous pages that this process has mapped, though, the offset for the final 4K page at ffffe000 is shown as 0, rather than being equal to the start address. Is this difference pertinent? More importantly, how is a page with no execute permissions executing a system call? x86_64 glibc never invokes gettimeofday syscall directly, it always calls the vsyscall: /* For the calculation see asm/vsyscall.h. */ #define VSYSCALL_ADDR_vgettimeofday 0xffffffffff600000 ENTRY (__gettimeofday) /* Align stack. */ sub $0x8, %rsp cfi_adjust_cfa_offset(8) movq $VSYSCALL_ADDR_vgettimeofday, %rax callq *%rax /* Check error return. */ cmpl $-4095, %eax jae SYSCALL_ERROR_LABEL L(pseudo_end): add $0x8, %rsp cfi_adjust_cfa_offset(-8) ret PSEUDO_END(__gettimeofday) The vDSO is provided by the kernel: static int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz) { if (unlikely(!__sysctl_vsyscall)) return gettimeofday(tv,tz); if (tv) do_vgettimeofday(tv); if (tz) do_get_tz(tz); return 0; } what architecture/kernel is running? This was seen on a stock RHES4.2 kernel (kernel-2.6.9-22.EL) running on 64bit (Opteron). dduval notes, also in this ticket, that he has reproduced it on a different architecture: "I tested the same thing on RHEL4-x86 U2, with similar results on IBM's J2RE 1.4.2 IBM build cxia32142-20050609. " Ollie ok, i think i've gotten to the bottom of this. In U2 we added the 'pmtimer' to x86_64. This sets: sysctl_vsyscall = 0; in arch/x86_64/kernel/time.c. If you look in arch/x86_64/kernel/vsyscall.c, it does: static int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz) { if (unlikely(!__sysctl_vsyscall)) return gettimeofday(tv,tz); if (tv) do_vgettimeofday(tv); if (tz) do_get_tz(tz); return 0; } The first call is actually to the syscall. We made this change b/c the tsc gets out of sync b/w cpus and can lead to unstable time. The upstream kernel does the same. To get the old behavior you can pass 'nopmtimer' and 'nohpet' and the commandline. That will return things to using the tsc, and you shouldn't see the syscalls anymore. Is this a serious performance degradation for your applicaton? thanks. When running with this problem under U2 we saw a much higher user:sys CPU ratio. Our CPUs were noticably busier for the same workload, so yes I would classify this as a performance degradation. I will try nopmtimer and nohpet as kernel boot options and see how that affects us. What are the disadvantages of using the TSC timer instead? Thanks. We are also exhibiting the same problem (abnormally high sys CPU time, abnormally high # of calls to gettimeofday) when moving from U1 to U2 on an IBM x366 server (x86_64, 4 dual core Xeons, kernel 2.6.9-22.ELsmp). We're using DB2 8.1 fixpack 11, and typical query time increases by 3-4x on U2. The system:user time rises from approximately 10% under U1 to 80% under U2. We were forced to revert to U1, which resolved the issue. These results were repeatable across several machines running U1 and U2. I discovered the high # (tens of thousands) of gettimeofday calls independently, using strace, and then today found this corroborating bug report. I see that no followup occurred after the suggestion to use the TSC. I can't find any similar bug report here nor any mention of a fix in the 2.6.9-34.ELsmp kernel changelog. So let me ask: did using the 'nopmtimer' and 'nohpet' options resolve the issue on U2? Is there another fix other than to use the apparently unreliable TSC? In the meantime we are reverting our U2 systems to U1, as queries running at 1/4 of previous speed is not acceptable to users. Thanks. -- Jim Ursetto In response to Jim Ursetto, we did not try using the TSC timer so I cannot verify if that resolves the problem under U2. We also downgraded to U1 which is where we remain for the time being. Ollie Cook (oliver.cook) The vsyscall for gettimeofday() was pointing to the wrong entry address in RHEL4 U3. This was fixed as of U4. If the RHEL4.3 kernel is used, vgettimeofday() will only result in dissapoint. Can you upgrade to the RHEL4.4 kernel and re-test? I believe the problem has been resolved. Brian, did the same bug exist in 4.2 or was it introduced in 4.3, then fixed in 4.4? I have asked our operational teams to upgrade a host to 4.4 see if can demonstrate the problem is resolved. The bug was a regression introduced in a previous release (4.3), but the bug was fixed in the 4.4 kernel. Upgrading the kernel (to 4.4) should solve the issue. Since this bug has been fixed, this bug should get closed. Please re-open the bug if you discover the problem still exists (in that case its a different problem and not the vsyscall). Closing this bug since a patch that addressed this issue was included in 4.4 |