Bug 1095627
| Summary: | missing vhost schedule causing thread starvation | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Michael S. Tsirkin <mst> |
| Component: | kernel | Assignee: | Michael S. Tsirkin <mst> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.6 | CC: | chayang, gchakkar, jraju, jskeoch, juzhang, michen, mst, qiguo, qzhang, rhod, sluo, vyasevic |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | kernel-2.6.32-465.el6 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-10-14 06:08:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Michael S. Tsirkin
2014-05-08 08:52:50 UTC
Build with a fix: http://brewweb.devel.redhat.com/brew/taskinfo?taskID=7432666

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.

---

Hi Michael, I cannot reproduce this bug. Could you help check my test steps? Thanks very much.

Host builds:

```
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.425.el6.x86_64
# uname -r
2.6.32-458.el6.x86_64
```

Guest kernel:

```
# uname -r
2.6.32-431.19.1.el6.x86_64
```

Steps:

1. Boot 2 guests with vhost.

guest1:

```
# /usr/libexec/qemu-kvm -cpu Penryn -m 4G -smp 4,sockets=1,cores=4,threads=1 -M pc -enable-kvm \
    -device piix3-usb-uhci,id=usb -name rhel7 -nodefaults -nodefconfig \
    -device virtio-balloon-pci,id=balloon0 -vnc :20 -vga std \
    -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio \
    -drive file=/root/qzhang/rhel6.5-64-backup.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,aio=native,id=scsi-disk0 \
    -device virtio-scsi-pci,id=bus2 -device scsi-hd,bus=bus2.0,drive=scsi-disk0,id=disk0 \
    -netdev tap,id=netdev0,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=netdev0,id=vn1,mac=52:54:00:12:34:1a
```

guest2:

```
# /usr/libexec/qemu-kvm -cpu Penryn -m 4G -smp 4,sockets=1,cores=4,threads=1 -M pc -enable-kvm \
    -device piix3-usb-uhci,id=usb -name rhel7 -nodefaults -nodefconfig \
    -device virtio-balloon-pci,id=balloon0 -vnc :10 -vga std \
    -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio \
    -drive file=/root/qzhang/rhel6.5-64-backupcp1.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,aio=native,id=scsi-disk0 \
    -device virtio-scsi-pci,id=bus2 -device scsi-hd,bus=bus2.0,drive=scsi-disk0,id=disk0 \
    -netdev tap,id=netdev0,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=netdev0,id=vn1,mac=52:54:00:12:34:0a
```

2. Pin both vhost threads to the same CPU:

```
# pgrep vhost
10633
10699
# taskset -p 01 10633
pid 10633's current affinity mask: ff
pid 10633's new affinity mask: 1
# taskset -p 01 10699
pid 10699's current affinity mask: ff
pid 10699's new affinity mask: 1
# taskset -pc 10633
pid 10633's current affinity list: 0
# taskset -pc 10699
pid 10699's current affinity list: 0
```

3. Run netserver in both guests, then launch several UDP_STREAM netperf instances on the host:

```
# for i in $(seq 15) ; do netperf -H 10.66.10.169 -l 172800 -t UDP_STREAM -- -m 65507 & done
# for i in $(seq 15) ; do netperf -H 10.66.11.129 -l 172800 -t UDP_STREAM -- -m 65507 & done
```

4. Monitor the resources via top:

```
# top -p 10633,10699
top - 19:00:03 up 1 day, 6:23, 8 users, load average: 3.15, 2.85, 2.68
Tasks: 2 total, 0 running, 2 sleeping, 0 stopped, 0 zombie
Cpu(s): 13.4%us, 21.5%sy, 0.0%ni, 64.3%id, 0.3%wa, 0.0%hi, 0.5%si, 0.0%st
Mem: 8001996k total, 7819568k used, 182428k free, 54240k buffers
Swap: 8142840k total, 692k used, 8142148k free, 5798588k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
10633 root  20   0     0    0    0 S 43.8  0.0 17:28.75  0 vhost-10620
10699 root  20   0     0    0    0 S 43.8  0.0 17:08.43  0 vhost-10690
```

The two threads are consuming nearly the same resources on the host, so I did not reproduce this bug. Since the reporter says it is not 100% reproducible, I will wait for a day, check again tomorrow, and update the result here.

---

Hi Michael, is there something wrong with my steps, or do you have any suggestions that could help us reproduce it? Thanks.

---

Yes, as you can see, together they don't reach 100%; this is why it does not trigger. I think guest-to-host will reproduce this faster. Also try TCP, and maybe look at bandwidth with -D.

---

(In reply to Michael S. Tsirkin from comment #4)
> yes as you see together they don't reach 100% this is why
> it does not trigger.
> I think guest to host will reproduce this faster.
> also try tcp.
> also maybe look at bandwidth with -D.
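As an aside, the manual taskset pinning in step 2 above can be scripted so it covers however many vhost workers are running. A minimal sketch, assuming Linux taskset and pgrep; the helper name and the DRY_RUN switch are ours, not part of the original report:

```shell
#!/bin/sh
# Hypothetical helper (not from the report): pin every given vhost worker
# PID to CPU 0 (affinity mask 01), forcing the threads to share one core.
# With DRY_RUN=1 it only prints the taskset commands it would run.
pin_vhost_to_cpu0() {
    for pid in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "taskset -p 01 $pid"
        else
            taskset -p 01 "$pid"   # set the affinity mask to CPU 0 only
            taskset -pc "$pid"     # read back the affinity list to verify
        fi
    done
}

# On a real host the PIDs would come from pgrep, e.g.:
#   pin_vhost_to_cpu0 $(pgrep vhost)
```

Keeping both threads on one CPU is what makes the missing schedule visible: with only one core to share, a worker that never yields can starve its sibling.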
Thank you for your quick response~~ In fact, my first test was from guest to host, but I could not reproduce it either. Anyway, I will cancel the current instances, go back to testing from guest to host, add some TCP instances, and update here after a long run. Thanks.

---

Patch(es) available on kernel-2.6.32-465.el6

---

*** Bug 1090938 has been marked as a duplicate of this bug. ***

---

Appears to also solve customer-reported issues from Bug 1090938. Please consider for z-stream.

---

There are a number of interested customers who wish to follow the progress of this bug; this is amplified by the closed (as duplicate) public bug, which points to this restricted bug. Can I ask that you review the contents and, if appropriate, reconsider the groups applied? Thank you.

John

---

Go ahead and make it public as appropriate.

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-1392.html
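One more note on step 4 of the reproduction: instead of eyeballing the TIME+ column in top, the cumulative CPU time of the two vhost threads can be compared directly from /proc, which makes the starvation (one thread accumulating far more CPU time than its sibling) easy to spot over a long run. A minimal sketch, assuming Linux /proc; the helper name is ours, not part of the original report:

```shell
#!/bin/sh
# Hypothetical helper (not from the report): print a PID's cumulative
# CPU time in whole seconds, from fields 14 (utime) and 15 (stime) of
# /proc/<pid>/stat, which are counted in clock ticks.
# Caveat: the field numbering assumes the comm field (field 2) contains
# no spaces, which holds for vhost workers such as "vhost-10620".
cputime_secs() {
    awk -v hz="$(getconf CLK_TCK)" \
        '{ print int(($14 + $15) / hz) }' "/proc/$1/stat"
}

# Usage against the two pinned workers from step 2, e.g.:
#   cputime_secs 10633
#   cputime_secs 10699
# Roughly equal values (as in the top output above) mean the bug did not
# trigger; a large, growing gap suggests one thread is being starved.
```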