Bug 555649
| Summary: | no response for operation on rhel3 during dd | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Suqin Huang <shuang> |
| Component: | kvm | Assignee: | Virtualization Maintenance <virt-maint> |
| Status: | CLOSED WONTFIX | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 5.4.z | CC: | ebachalo, ehabkost, gcosta, Jes.Sorensen, llim, mfranc, mkenneth, plyons, quintela, qzhang, tburke, virt-maint, xwei, ykaul |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 596989 | Environment: | |
| Last Closed: | 2011-11-03 13:31:35 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 580948 | | |
| Attachments: | | | |
Description
Suqin Huang
2010-01-15 04:45:00 UTC
Created attachment 384517
strace
Is it always reproducible from 5.4.0 to 5.4.4? Is it always safe on 5.4.4 -> 5.4.4?

Retested 5 times with 3.9-x86_64 2.4.21-50.

It reproduces from 5.4.0 to 5.4.4 every time:
1. Operations get a response only occasionally (clicking Main Menu, the "clear" command) (1/5)
2. No response for any operation (4/5)

It also reproduces from 5.4.4 -> 5.4.4; the result is a little better than from 5.4.0 to 5.4.4:
1. Can click Main Menu, issue "ls", and open a new terminal, but no response when opening an application such as openoffice (3/5)
2. No response for any operation (1/5)
3. Operations get a response only occasionally (1/5)

I am using here:

Host A:

    root@deus ~]# rpm -qa *kvm*
    etherboot-zroms-kvm-5.4.4-10.el5
    kvm-qemu-img-83-105.el5
    kmod-kvm-83-105.el5
    kvm-83-105.el5
    [root@deus ~]# uname -a
    Linux deus.mitica 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
    [root@deus ~]#

Host B:

    root@gnomo ~]# rpm -qa *kvm*
    etherboot-zroms-kvm-5.4.4-10.el5
    kmod-kvm-83-105.el5_4.19
    kvm-83-105.el5_4.19
    kvm-qemu-img-83-105.el5_4.13
    [root@gnomo ~]# unam e-a
    -bash: unam: command not found
    [root@gnomo ~]# uname -a
    Linux gnomo.mitica 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

And I can do that test. I can repeat it several times. I also did several migrations on host B -> host B with all valid combinations of 5.4.0 <-> 5.4.4 as source and target.

My command line is:

    usr/libexec/qemu-kvm -m 1024 -smp 1 -name rhel3.9X-32 -uuid 0b17c0fd-db6a-2817-8dfb-a4371dbae29b -no-kvm-pit-reinjection -monitor stdio -boot c -drive file=/mnt/images/images/rhel3.9-32X.img,if=ide,index=0,boot=on,cache=none -net nic,macaddr=54:52:00:23:b1:a7,vlan=0 -net tap,script=/etc/kvm-ifup,vlan=0,ifname=vnet0 -serial none -parallel none -usb -vnc :0 -k es -M rhel5.4.4 -cpu qemu64,+sse2 -incoming tcp:0:4444 [-rtc-td-hack]

I have tried several combinations (normal CPU, no -rtc-td-hack, ...) and all of them work for me. Could you test whether the issue is fixed for you?
If it is not, could you test on a host that is not a Nehalem? The only remaining difference is that I am using Core Duos.

    vendor_id : GenuineIntel
    cpu family : 6
    model : 15
    model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
    stepping : 6

I read it again and found the -smp 2 part. UP works without any problem, but SMP fails from time to time. Investigating it.

*** Bug 524761 has been marked as a duplicate of this bug. ***

Updating the host to 5.4.4 and using 5.4.4 -> 5.4.4 fails equally. Same failures with 5.5.0 in 5.5.0 mode.

This problem is not related to migration. You can reproduce the problem without any migration involved. Without SMP, it works as expected. An easy way to reproduce is to do:
- dd if=/dev/zero of=test count=8000 bs=512k
- while true; do date -u && sleep 1; done
- ping in another window

You will see that at times the date and the ping stall for as long as 30 seconds. During those stalls nothing works: neither keyboard, mouse, nor ssh sessions.

I can reproduce the problem without migration. The UP kernel does not work well either: no response when opening oowriter, gedit, and other apps.

OK, I do have some updates on this, although not yet a fix. What is responsible for this behaviour is the cache=none flag. It works perfectly with all other cache methods. perf does not show any big hogs, so my current theory is that we're misreporting something at the block layer. It is also present on RHEL6/upstream. I will clone this bug to reflect this.

This is the same bug described in rawhide's #563103. It is a glibc bug. We should get the fix and backport it in glibc.

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time.
Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

This request was erroneously denied for the current release of Red Hat Enterprise Linux. The error has been fixed and this request has been re-proposed for the current release.

I have tried Comment 8; it sometimes reproduces. On my host the guest does respond, but very, very slowly: up to 30 min to respond to a command.

I can reproduce this with the following operations:

1) Start the guest in runlevel 3.

Command on host A:

    /usr/libexec/qemu-kvm -smp 1 -m 2G -drive file=/media/live-migragion.qcow2,media=disk,if=ide,cache=none,index=0,serial=fb-bde1-8bcf10f72b98 -net nic,vlan=0,macaddr=00:65:4a:01:00:37,model=e1000 -net tap,vlan=0,script=/media/qemu-ifup-switch -uuid `uuidgen` -no-hpet -rtc-td-hack -startdate now -cpu qemu64,+sse2 -monitor stdio -vnc :0 -name 3.9

Command on host B, on standby:

    /usr/libexec/qemu-kvm -smp 1 -m 2G -drive file=/media/live-migragion.qcow2,media=disk,if=ide,cache=none,index=0,serial=fb-bde1-8bcf10f72b98 -net nic,vlan=0,macaddr=00:65:4a:01:00:37,model=e1000 -net tap,vlan=0,script=/media/qemu-ifup-switch -uuid `uuidgen` -no-hpet -rtc-td-hack -startdate now -cpu qemu64,+sse2 -monitor stdio -vnc :0 -name 3.9 -incoming tcp:0:4444

2) Run in the guest:

    dd if=/dev/zero of=file.img count=100 bs=512

SSH to the guest with ssh root@guestIP. The guest works normally and finishes the operation.

    dd if=/dev/zero of=file.img count=8000 bs=512

While the dd is running, do the migration.

3) migrate -d tcp:hostB:4444

Send keys to the guest from host B's kvm monitor:

    sendkey alt-f2

The dd process is now frozen in the guest; ps aux|grep -i dd shows its status as D.

4) Run in the guest's 2nd console:

    while true; do date -u && sleep 1; done

It outputs normally, one line per second.
5) Get the guest's 3rd console by sending a key from host B:

    sendkey alt-f3

Logging in to the guest can now take as long as 5 min to respond (just typing root and pressing Enter), and startx takes 35 min to fully start a GNOME desktop. While GNOME is starting, the SSH session stays up but stalls for several minutes; typing 'free -m' or 'top' over SSH takes 5 min to produce output.

RHEL-5 glibc does not provide preadv/pwritev, so I don't see how this can have anything to do with glibc.
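
For anyone who wants to quantify the stalls rather than eyeball the `date -u` loop used earlier in this report, here is a minimal sketch of a stall detector. It is not part of the original report: the function names `find_stalls` and `collect_samples` and the 5-second threshold are hypothetical choices for illustration. It also assumes Python 3, which a RHEL 3 guest would not ship, so the analysis half may need to be run elsewhere on timestamps recorded in the guest.

```python
import time


def find_stalls(samples, threshold=5.0):
    """Return (start, gap) pairs for consecutive samples more than
    `threshold` seconds apart. `samples` is a list of timestamps in seconds."""
    stalls = []
    for prev, cur in zip(samples, samples[1:]):
        gap = cur - prev
        if gap > threshold:
            stalls.append((prev, gap))
    return stalls


def collect_samples(duration, interval=1.0):
    """Record a monotonic timestamp roughly every `interval` seconds for
    `duration` seconds, like the `date -u && sleep 1` loop in the report."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append(time.monotonic())
        time.sleep(interval)
    return samples


# Hypothetical usage while the dd workload runs in another terminal:
#   for start, gap in find_stalls(collect_samples(duration=120)):
#       print("stall of %.1fs after t=%.1f" % (gap, start))
```

A gap well above the one-second sampling interval means the sampling process itself was not scheduled, which is consistent with the "nothing works" symptom described above. It says nothing about the cause.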