| Summary: | IP packet loss on dom0 while xen domU running high load | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Kirby Zhou <kirbyzhou> |
| Component: | xen | Assignee: | Xen Maintainance List <xen-maint> |
| Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 5.5 | CC: | drjones, leiwang, mrezanin, xen-maint, yuzhang |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-03-17 08:25:17 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description (Kirby Zhou, 2011-03-16 11:56:29 UTC)
More info:
In the file /etc/xen/vm-test, change the following lines

```
vcpus = 14
cpus = "0-11"
```

to

```
vcpus = 14
cpus = "0-15"
```
The actual result would be:

```
[  3] Server Report:
[  3]  0.0-10.0 sec  58.9 MBytes  49.4 Mbits/sec  0.011 ms 7734/625000 (1.2%)
```

The packet drop rate is lower than before, but still high.
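For context, a server report in this format comes from a UDP iperf run. The following invocation is an assumption about the test setup (the dom0 address 192.168.1.10 and the 500M offered load are illustrative values, not taken from the report):

```sh
# On dom0 (receiver): listen for UDP traffic and report loss.
iperf -s -u

# On the sender: push a 10-second UDP stream toward dom0.
iperf -c 192.168.1.10 -u -b 500M -t 10
```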
I *guess* this is by design. All VCPUs, including the ones belonging to Domain0, are given the same scheduling priority by default, so under high CPU load all VCPUs belonging to DomainU and Domain0 have to share the limited computing resources. Thus the performance of Domain0 and DomainU is not isolated. Xen does provide mechanisms for performance isolation, such as the credit scheduler's WEIGHT and CAP parameters, but the right choice depends on the workload. For the scenario in the Description, you could either:

(1) give the DomainU VCPUs a lower weight via 'xm sched-credit', or
(2) set a proper CAP (also via 'xm sched-credit') on DomainU so that it cannot consume too much CPU time, or
(3) simply reduce the number of VCPUs in DomainU.

All three methods make sure the Domain0 VCPUs get more CPU time, and of course each has some impact on the application running within DomainU; it depends on how you trade off. Options (1) and (2) are sketched in the commands at the end of this comment. BTW, it is not recommended to run applications within Domain0.

I agree with Yufang that this isn't a bug. When the hypervisor is scheduling vcpus, it can't know that some of them are trying to receive UDP packets, which, if not received immediately, will get dropped. When you move the bzip workload into dom0, the dom0 kernel can schedule more intelligently and thus keep the UDP-receiving task active, avoiding packet loss. I would be a bit more concerned if you had packet loss while using TCP, but even then, without the hypervisor being informed of vcpu priorities (using sched-credit, as pointed out by Yufang), the dom0 vcpus may not get the cycles they need to keep a TCP connection alive.
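As an illustration of options (1) and (2), here is a minimal sketch of the 'xm sched-credit' invocations. The guest name vm-test is taken from the config file above; the weight and cap values are made-up examples, not recommendations, and assume the credit scheduler's default weight of 256:

```sh
# Show the current credit-scheduler weight and cap for every domain.
xm sched-credit

# Option (1): give the guest half of dom0's default weight (256), so
# dom0's vcpus win more CPU time when the host is fully loaded.
xm sched-credit -d vm-test -w 128

# Option (2): cap the guest at 1100% of one physical CPU (about 11 of
# the 16 cores), leaving headroom for dom0's network processing.
# The 1100 figure is illustrative only.
xm sched-credit -d vm-test -c 1100
```

Weights are relative (a domain with weight 128 gets half the CPU time of a weight-256 domain under contention), while the cap is an absolute limit in percent of one physical CPU, so the two can be combined.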