Bug 5866
Summary: | High load average under smp kernel when using gnome or E | |
---|---|---|---
Product: | [Retired] Red Hat Linux | Reporter: | Gordon Messmer <gordon.messmer>
Component: | kernel | Assignee: | Alan Cox <alan>
Status: | CLOSED NOTABUG | QA Contact: |
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 6.1 | CC: | alan, juanco
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | i386 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2000-02-05 23:51:25 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Gordon Messmer
1999-10-12 05:51:56 UTC
On further investigation, all of the software mentioned has exceedingly short timeouts specified in its calls to select() or poll(). The applications redraw often even under the uniprocessor kernel, but don't take nearly as much processor time there. The problem probably stems from a kernel lock taken when reading from or writing to the UNIX socket connected to the X server. It seems that the kernel locks introduced in 2.2.11 have really hurt the performance of UNIX sockets; kernels prior to that don't exhibit this behavior.

Alan Cox

2.2.11 doesn't change the AF_UNIX locking at all. In fact, we've been reducing locking. It looks like it's just applications ticking over. The figure may also be artificially high, since the sample rate for the load average is only 100 Hz.

Gordon Messmer

Based on Alan's comments, I've investigated further. I wrote a simple program to emulate the behavior I observed in GNOME's panel apps. The program forks; the parent creates a UNIX socket and reads 10000 packets of a user-specified size from that socket after the child connects. The child writes 10000 packets of data with a 100-nanosecond delay between writes, and on exit displays the total amount of time spent in writes and the average write time. The source is available from ftp://duke.eburg.com/pub/linux/test_socket.c if you like (a rough reconstruction is also sketched below).

Under the uniprocessor kernel, under all tested circumstances, this program never takes more than a fraction of a percent of the CPU time. This is what I expected to see, as it's how the X apps behave. Next, I tested it under the 2.2.12 smp kernel on the console and observed the same results: no high load, no unusual CPU utilization. That had me for a second. Then I went into X with GNOME running. GNOME was behaving as described, taking 35-45 percent of the available CPU time. I ran the test_socket program and watched both the parent and child processes take ~10% of the available CPU time, unlike what I'd observed on the console. The load may have been due to something else (incidental), as it fell to normal during tests, but the CPU utilization was still very high. This leads me to believe that when many AF_UNIX sockets are in use concurrently, performance suffers.

2.2.11 is the first 2.2.x series kernel I can find that includes the file linux/smp_lock.h, and the 2.2.11 patch did contain some smp locking changes to net/unix/af_unix.c (by davem?), as seen at http://www.kernelnotes.org/v22patch/patch-2.2.11/linux_net_unix_af_unix.c.html. I also tested the same program under a 2.2.5 smp kernel with no abnormal effects. If you would like me to test any kernel in between, please let me know.

Alan Cox

Can you run your test combined with something eating CPU? The uptime load data is a 100 Hz sample; it isn't really accurate information when you have timing synchronizations involved. Without data on the slowdown of a CPU-intensive task in both cases, it's iffy to assume the CPU load data is accurate.

Gordon Messmer

I wasn't quite sure what you wanted, so what I did was this: under the uniprocessor kernel, I first created a set of ten directories and spawned a terminal within each one. Then I ran the test app in each of these directories concurrently. There was no additional load, and no process used enough processor time to show up in 'top' or 'xosview'. Good enough; that's what I expect. Then I rebooted to the smp kernel, got into X, and again spawned ten xterms. In each of the terminals I spawned the test app, so there were ten test processes communicating in addition to the X applications already running.
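For reference, here is a minimal sketch of a harness like the test_socket.c described above. The real program is at ftp://duke.eburg.com/pub/linux/test_socket.c; the packet count and inter-write delay follow the description, but the rest is assumption (in particular, socketpair() stands in for the named socket the child is described as connecting to):

```c
/*
 * Sketch of a test_socket.c-style harness (details assumed).
 * Parent reads 10000 packets from a UNIX-domain socket; child
 * writes 10000 packets with a 100 ns delay between writes and
 * reports the total and average time spent in write().
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>

#define NPACKETS 10000

int main(int argc, char **argv)
{
    int sv[2];
    int size = (argc > 1) ? atoi(argv[1]) : 64;  /* user-specified packet size */
    char *buf = calloc(1, size);
    struct timespec delay = { 0, 100 };          /* 100 ns between writes */

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        perror("socketpair");
        return 1;
    }

    if (fork() == 0) {
        /* Child: write NPACKETS packets, timing each write(). */
        struct timeval t0, t1;
        long total_us = 0;

        close(sv[0]);
        for (int i = 0; i < NPACKETS; i++) {
            gettimeofday(&t0, NULL);
            if (write(sv[1], buf, size) != size) {
                perror("write");
                return 1;
            }
            gettimeofday(&t1, NULL);
            total_us += (t1.tv_sec - t0.tv_sec) * 1000000L
                      + (t1.tv_usec - t0.tv_usec);
            nanosleep(&delay, NULL);
        }
        printf("total write time: %ld us, average: %.2f us\n",
               total_us, (double)total_us / NPACKETS);
        return 0;
    }

    /* Parent: read until all NPACKETS * size bytes have arrived. */
    close(sv[1]);
    for (long remaining = (long)NPACKETS * size; remaining > 0; ) {
        ssize_t n = read(sv[0], buf, size);
        if (n <= 0)
            break;
        remaining -= n;
    }
    wait(NULL);
    free(buf);
    return 0;
}
```

Running ten copies of such a harness concurrently, as in the experiment here, approximates the many-concurrent-AF_UNIX-sockets load under discussion.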
'multiload_app' showed about 10% "user" processor use and about 90% "nice" processor use. The average transport time increased only from about 11 ns to 12 ns; hardly a bother. I noticed that each of the 'test_socket' processes was using about 4.1% processor time in 'top', very consistently, as were X and a couple of other applications. So, does this 100 Hz sample rate also apply to processor use as shown by 'top' and 'xosview'? What had me curious in the first place was that af_unix uses (un)lock_kernel, while the ipv4 sources do not. I appreciate your time (and I know how hard it is to find).
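To make the (un)lock_kernel observation concrete: in the 2.2-era source, the af_unix paths took the big kernel lock, a single lock shared by every CPU, while the ipv4 fast paths largely did not. The following user-space analogue is a sketch, not kernel code; the mutex here merely stands in for lock_kernel(), and the names and iteration counts are made up for illustration. It shows how one global lock serializes otherwise independent workers on an SMP machine:

```c
/*
 * User-space analogue of big-kernel-lock serialization (a sketch;
 * GLOBAL_LOCK, NTHREADS, and ITERS are illustrative assumptions).
 * With GLOBAL_LOCK=1, independent workers contend on one mutex the
 * way independent AF_UNIX sockets would contend on the BKL; with
 * GLOBAL_LOCK=0, they run in parallel.
 */
#include <pthread.h>
#include <stdio.h>

#define GLOBAL_LOCK 1      /* 1: one shared lock; 0: no shared lock */
#define NTHREADS    4
#define ITERS       5000000L

static pthread_mutex_t biglock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    volatile long sum = 0;
    (void)arg;
    for (long i = 0; i < ITERS; i++) {
#if GLOBAL_LOCK
        pthread_mutex_lock(&biglock);    /* stands in for lock_kernel() */
#endif
        sum += i;                        /* stands in for per-socket work */
#if GLOBAL_LOCK
        pthread_mutex_unlock(&biglock);  /* stands in for unlock_kernel() */
#endif
    }
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];

    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return 0;
}
```

Built with `gcc -O2 -pthread` and timed with `time` under each setting, the wall-clock gap on a multiprocessor machine illustrates why many concurrently active AF_UNIX sockets could suffer where ipv4 traffic would not.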