Bug 113851

Summary: Bad: Performance issue with VMware Workstation 4.0.5
Product: Red Hat Enterprise Linux 3 Reporter: Erik Bussink <erik>
Component: kernel Assignee: Todd Barr <tbarr>
Status: CLOSED WONTFIX QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0 CC: ed.greshko, emorisse, jbs, jch, jon, jvergis, k.georgiou, lwoodman, petrides, vandrove
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
URL: http://www.vmware.com/download/workstation_beta_status.html
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-10-10 19:49:09 EDT Type: ---
Attachments:
Description                          Flags
Top output                           none
vmstat output                        none
lsmod output                         none
VMWare screenshot under high load    none

Description Erik Bussink 2004-01-19 10:28:14 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007

Description of problem:
Upgrading from kernel-2.4.21-4.0.2.EL to the Quarterly Update 1
kernel-2.4.21-9.EL breaks the normal behaviour of VMware Workstation
4.0.5 (Build 6030). Once a virtual machine is started inside VMware
Workstation, the vmware-vmx process quickly eats up all available CPU
and becomes unresponsive to user input (mouse movements and
keystrokes).

This issue was also reproduced using the latest beta release of VMware
Workstation, 4.5-beta-rc1 (Build 6979), which is expected to ship in the
next two weeks.

Version-Release number of selected component (if applicable):
kernel-2.4.21-9.EL

How reproducible:
Always

Steps to Reproduce:
1. Install VMware Workstation 4.0.5 (Build 6030) on RHEL3
kernel-2.4.21-9.EL
2. Update the vmware-config.pl script to compile the new modules
3. Start a virtual machine
4. Check the output of top.
    

Actual Results:  Unresponsive graphical inputs using mouse or keyboard.

Additional info:
Comment 3 Larry Woodman 2004-01-29 14:31:09 EST
First of all, can someone start by attaching top and/or vmstat
output taken while the system is under load?  Next, we will want
AltSysrq-M output taken while the system is in the state described above.
Also, please provide an exact description of the system (hardware and
lsmod output) that is experiencing these problems.

Larry
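
For anyone gathering the data requested above, here is a minimal Python sketch that captures top, vmstat and lsmod output in one pass. The output file names, the sample counts and the collect() helper are illustrative assumptions, not something asked for in this report. The AltSysrq-M dump still has to be triggered from the console and is not covered.

```python
import subprocess

# Commands for the data requested in the comment above. The sample counts
# and output file names are arbitrary choices made for this sketch.
COMMANDS = {
    "top.txt": ["top", "-b", "-n", "1"],   # one snapshot in batch mode
    "vmstat.txt": ["vmstat", "1", "3"],    # three samples, one second apart
    "lsmod.txt": ["lsmod"],                # loaded modules, e.g. vmmon
}


def collect(run=subprocess.run):
    """Run each command and return {file name: captured output}."""
    outputs = {}
    for name, argv in COMMANDS.items():
        try:
            result = run(argv, stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT, universal_newlines=True)
            outputs[name] = result.stdout
        except OSError as exc:
            # The tool is not installed or cannot be executed on this host.
            outputs[name] = "could not run %s: %s\n" % (argv[0], exc)
    return outputs


if __name__ == "__main__":
    for name, text in collect().items():
        print("%s: %d bytes captured" % (name, len(text)))
```

Run it while a virtual machine is started and the load is visible, then save each captured text under its file name and attach it to the bug.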

Comment 4 Erich Morisse 2004-01-31 10:24:32 EST
Created attachment 97388 [details]
Top output
Comment 5 Erich Morisse 2004-01-31 10:25:19 EST
Created attachment 97389 [details]
vmstat output
Comment 6 Erich Morisse 2004-01-31 10:26:01 EST
Created attachment 97390 [details]
lsmod output
Comment 7 Erich Morisse 2004-01-31 10:26:33 EST
Created attachment 97391 [details]
VMWare screenshot under high load
Comment 8 Erich Morisse 2004-01-31 10:27:51 EST
adding emorisse@redhat.com to CC list.
Comment 9 Erich Morisse 2004-01-31 10:50:39 EST
I am finding that both the host and client? systems are still
responsive to mouse and keyboard events, but very slow to respond. 
Under particularly high load, it can be up to 10 seconds.
Comment 10 Petr Vandrovec 2004-01-31 13:22:42 EST
The problem is caused by wakeup_kswapd. Under some circumstances (reliably
triggered by running VMware) it fires up kswapd and sets the task state
of the currently running process to TASK_RUNNING. This usually does not
matter, but it matters for poll: after one loop that calls the poll
methods for all fds, the process does not go to sleep, and a second
round is performed immediately. The vmmon code unfortunately relies
on the old behavior, which guaranteed that poll would sleep, so it returns
POLLIN whenever it sees a second round (the only purpose of the vmmon poll
method is to wake up the application at the next timer tick; unfortunately
this is not available through the timeout argument, which always sleeps at
least two ticks). An updated vmmon should be available with WS4.5, but I
still think that this __alloc_pages() behavior should be revisited, as it
causes unnecessary looping in poll.
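
The mechanism described above can be sketched as a toy model in Python. This is not kernel code: the Task class, vmmon_poll() and this do_poll() are hypothetical stand-ins written only from the description in this comment, and the point at which the task state gets reset (during the fd scan) is an assumption of the sketch.

```python
TASK_RUNNING = "TASK_RUNNING"
TASK_INTERRUPTIBLE = "TASK_INTERRUPTIBLE"
POLLIN = 0x001


class Task:
    """Hypothetical stand-in for the process that called poll()."""

    def __init__(self):
        self.state = TASK_RUNNING
        self.sleeps = 0  # how many times the task really went to sleep


def vmmon_poll(round_number):
    """Toy vmmon poll method: report POLLIN on any round after the first."""
    return POLLIN if round_number > 1 else 0


def do_poll(task, allocation_wakes_kswapd, max_rounds=10):
    """Simulate one poll() call; return (events, rounds, sleeps)."""
    for round_number in range(1, max_rounds + 1):
        task.state = TASK_INTERRUPTIBLE      # prepare to sleep
        events = vmmon_poll(round_number)    # scan the fds
        if allocation_wakes_kswapd:
            # Assumed side effect: waking kswapd during the scan marks
            # the current task runnable again.
            task.state = TASK_RUNNING
        if events:
            return events, round_number, task.sleeps
        if task.state == TASK_INTERRUPTIBLE:
            task.sleeps += 1                 # sleeps until the next tick
            task.state = TASK_RUNNING
        # Otherwise the task is still runnable and the scan repeats at once.
    return 0, max_rounds, task.sleeps


if __name__ == "__main__":
    print("old behavior:", do_poll(Task(), allocation_wakes_kswapd=False))
    print("new behavior:", do_poll(Task(), allocation_wakes_kswapd=True))
```

In both cases the call returns POLLIN on the second round. In the old case the task slept once in between; in the new case it never slept, so the caller is woken far earlier than one timer tick.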

Now to the data I gathered.

If you take the 2.4.21-9.EL sources and apply all patches before
linux-2.4.21-rmap.patch, you get a kernel which never sets the task state to
TASK_RUNNING while processing poll(). Never during a ~10 min test.

If you then apply linux-2.4.21-rmap.patch to it, you get a kernel
which sets the task state to TASK_RUNNING while processing poll() about
200 times a minute, that is about 3 times a second. Too small to
notice in "top" or as a performance drop, but clearly something fishy is
going on.

But then apply linux-2.4.21-rmap-updates.patch, and you get more
than 4000 such events every second! (4000 is what gets through klogd,
but there are visible places where entries are missing.) That is,
almost any memory allocation which happens in the kernel fires up kswapd!
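
To put the two measurements above on the same scale, here is a short Python calculation using only the figures quoted in this comment. The 4000 per second figure is a lower bound, since klogd dropped entries.

```python
SECONDS_PER_MINUTE = 60

rmap_events_per_minute = 200       # with linux-2.4.21-rmap.patch applied
updates_events_per_second = 4000   # with linux-2.4.21-rmap-updates.patch (lower bound)

# Convert the first figure to events per second.
rmap_events_per_second = rmap_events_per_minute / SECONDS_PER_MINUTE

# Ratio of the two rates, computed per minute to keep the division exact.
increase = updates_events_per_second * SECONDS_PER_MINUTE / rmap_events_per_minute

print("rmap only: about %.1f events per second" % rmap_events_per_second)
print("rmap-updates: at least %d events per second" % updates_events_per_second)
print("increase: at least %.0f times" % increase)
```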

Due to this, the poll() issued by vmware-vmx returns immediately;
vmware-vmx issues gettimeofday(), finds that the kernel woke it up too
early, and goes back to poll() again. But this poll() again fires up
kswapd, and everything happens again and again, forcing both
vmware-vmx and kswapd to do nothing useful and just burn CPU cycles
needed somewhere else.
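
The resulting busy loop can be illustrated with a small Python simulation. It is a sketch under stated assumptions, not vmware-vmx code: FakeClock, sleeping_poll(), early_poll() and wait_for_tick() are hypothetical names, and the 10 ms tick and 10 microsecond cost per early return are arbitrary illustrative values. Times are integer microseconds so the count is deterministic.

```python
class FakeClock:
    """Deterministic clock (integer microseconds) standing in for gettimeofday()."""

    def __init__(self):
        self.now_us = 0

    def gettimeofday(self):
        return self.now_us


def sleeping_poll(clock, tick_us):
    """poll() as the caller expects it: sleeps until the next timer tick."""
    clock.now_us += tick_us


def early_poll(clock, tick_us, cost_us=10):
    """poll() that returns at once; only the CPU time of the call elapses."""
    clock.now_us += cost_us


def wait_for_tick(clock, poll, tick_us=10000):
    """Loop until one tick has passed; return the number of poll() calls made."""
    deadline = clock.gettimeofday() + tick_us
    calls = 0
    while clock.gettimeofday() < deadline:
        poll(clock, tick_us)   # woken too early? then poll again
        calls += 1
    return calls


if __name__ == "__main__":
    print("sleeping poll:", wait_for_tick(FakeClock(), sleeping_poll), "call per tick")
    print("early poll:", wait_for_tick(FakeClock(), early_poll), "calls per tick")
```

With these made-up numbers a single tick costs 1000 poll() calls instead of 1, and every one of those calls is CPU time spent without progress.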
Comment 11 Erik Bussink 2004-02-04 19:06:08 EST
VMware has a new release of VMware Workstation 4.5.0 Beta, Build
7174. This release seems to have fixed the performance hit on the
2.4.21-9.EL kernel. It even recognizes the stock 2.4.21-9.EL
kernel for its modules.
Comment 12 Erik Bussink 2004-02-04 19:07:57 EST
URL for the latest VMware Workstation 4.5.0 Release Candidate (Build 7174)
http://www.vmware.com/download/workstation_beta_status.html
Comment 13 James 2004-02-10 12:33:28 EST
I have the same problem with the 2.4.21-9.EL kernel. To get around the
problem I had to compile kernel 2.4.24 and run vmware using bash -c
"LD_ASSUME_KERNEL=2.4.0 vmware".  When I do this I have NO problems
with VMware but a lot of other problems in Red Hat, such as KDE
programs not running and segmentation faults.