Description of problem:
dom0 can get starved of resources, making domUs fail to function
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. in dom0, attach to iSCSI 'disks' provided by a domU
2. use these disks in another domU under heavy load
gory details here:
Actual results:
iSCSI dies, and all guests stop responding (whether or not they use iSCSI) until iSCSI is killed.
Expected results:
nothing falls over
this is apparently due to dom0 being starved of resources. Luke on the centos-virt list had seen this before and recommended this setting:
xm sched-credit -d 0 -w 60000
which solves the problem. 60000 may be overkill, but due to dom0's unique role, it should be able to preempt domUs to avoid these kinds of deadlocks. His wise aphorism: "if the dom0 is unhappy, everyone is unhappy."
There may be situations where one wants dom0 to be starved in the scheduler, but that's not a good default.
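For anyone scripting the workaround, here is a minimal sketch that builds the command line for the weight setting above. It assumes the `-w` (weight) option of `xm sched-credit` and the credit scheduler's documented weight range of 1 to 65535. The helper name `sched_credit_command` is hypothetical, and the function only builds the argument list; it does not run `xm`.

```python
# Hypothetical helper: builds (but does not execute) an xm command line.
# Assumption: the credit scheduler accepts weights from 1 to 65535.
XEN_CREDIT_MIN_WEIGHT = 1
XEN_CREDIT_MAX_WEIGHT = 65535


def sched_credit_command(domain, weight):
    """Return the argv list that sets a domain's credit-scheduler weight."""
    if not XEN_CREDIT_MIN_WEIGHT <= weight <= XEN_CREDIT_MAX_WEIGHT:
        raise ValueError(
            "weight must be between %d and %d, got %d"
            % (XEN_CREDIT_MIN_WEIGHT, XEN_CREDIT_MAX_WEIGHT, weight)
        )
    return ["xm", "sched-credit", "-d", str(domain), "-w", str(weight)]


if __name__ == "__main__":
    # Domain ID 0 is dom0; 60000 is the value suggested in this report.
    print(" ".join(sched_credit_command(0, 60000)))
```

The resulting list could be passed to `subprocess.call` in dom0. The setting is not persistent, so it would have to be reapplied after a reboot.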
In fact, there are situations where you want dom0 and the guests to have about equal priority.
One case is where dom0 simply forwards network packets into the guests. If the system gets a lot of network traffic, dom0 will stay busy and the guests do not get to take network packets off their queues, leading to excessive packet loss.
Giving dom0 and the guests about equal priority solves this issue.
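To make the trade-off concrete, here is a rough sketch of how credit-scheduler weights translate into CPU shares when every domain is competing for CPU. It assumes the default weight of 256 per domain and a hypothetical host with four guests. It ignores caps, VCPU counts and idle domains, so the numbers are only an illustration of the proportions involved.

```python
# Simplified model: under full contention, each domain's CPU share is
# proportional to its credit-scheduler weight. Caps and VCPU counts are ignored.
DEFAULT_WEIGHT = 256  # default credit-scheduler weight for a domain


def cpu_shares(weights):
    """Map each domain name to its fraction of contended CPU time."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical host: four guests, all left at the default weight.
guests = {"guest%d" % i: DEFAULT_WEIGHT for i in range(1, 5)}

boosted = cpu_shares(dict(guests, dom0=60000))        # setting from this report
equal = cpu_shares(dict(guests, dom0=DEFAULT_WEIGHT)) # equal priority

if __name__ == "__main__":
    print("dom0 share with weight 60000: %.1f%%" % (100 * boosted["dom0"]))
    print("dom0 share with equal weight: %.1f%%" % (100 * equal["dom0"]))
```

In this model a dom0 weight of 60000 entitles dom0 to roughly 98% of contended CPU time, against 20% with equal weights. That matches the packet-forwarding concern: a busy dom0 with a very high weight leaves the guests little time to drain their queues.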
I have to agree with you that the current default is not ideal for some situations, but changing it will cause regressions for other workloads.
I do not believe that changing the default at this time in the RHEL 5 release cycle would be a good idea.