Bug 589017
Summary: | [rhel5.5] [kvm] dead lock in qemu during off-line migration | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Haim <hateya> |
Component: | kvm | Assignee: | Juan Quintela <quintela> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | low | ||
Version: | 5.7 | CC: | bcao, danken, gleb, hateya, iheim, juzhang, llim, mgoldboi, quintela, virt-maint, yeylon, ykaul |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | kvm-83-213.el5 | Doc Type: | Bug Fix |
Last Closed: | 2011-01-13 23:35:28 UTC | Type: | --- |
Bug Depends On: | |||
Bug Blocks: | 580949 |
Description
Haim
2010-05-05 06:34:21 UTC
please note that same operation succeeds over other hosts running same version of kvm and kernel 2.6.18-194.

also - operation was performed from rhev-m --> vdsm --> kvm

backtrace of thread 1:

#0 0x0000003834e0d89b in write () from /lib64/libpthread.so.0
#1 0x0000000000473bdc in file_write (s=<value optimized out>, buf=0x1d2f9058, size=20480) at migration-exec.c:42
#2 0x000000000046afaf in migrate_fd_put_buffer (opaque=0x1d04dd40, data=0x1d2f9058, size=20480) at migration.c:211
#3 0x000000000049bd2d in buffered_put_buffer (opaque=0x1d2d8d40, buf=0x1d2f6058 "\275 \377\377\377\213\301\301\351\002\363\245\213È\341\003\363\244\203M\374\377\213\205 \377\377\377\211\205\020\377\377\377\213\205\024\377\377\377\203\300\003\203\340\374\001\205 \377\377\377\213\205 \377\377\377\211\205", pos=<value optimized out>, size=32768) at buffered_file.c:134
#4 0x0000000000471b38 in qemu_fflush (f=0x1d2f6010) at savevm.c:419
#5 0x0000000000472e95 in qemu_put_buffer (f=0x1d2f6010, buf=0x2b1d8c5fc33a "\003~\034\003^ \213E\374\353\337_^[\311\302\004", size=3270) at savevm.c:482
#6 0x0000000000408af8 in ram_save_block (f=0x1d2f6010) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:3358
#7 0x0000000000408b6c in ram_save_live (f=0x1d2f6010, stage=2, opaque=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:3427
#8 0x0000000000472cea in qemu_savevm_state_iterate (f=0x1d2f6010) at savevm.c:768
#9 0x000000000046b09c in migrate_fd_put_ready (opaque=<value optimized out>) at migration.c:256
#10 0x00000000004071bc in qemu_run_timers (ptimer_head=0xb38e00, current_time=158911986) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:1271
#11 0x0000000000409577 in main_loop_wait (timeout=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:4021
#12 0x00000000004ff1ea in kvm_main_loop () at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:596
#13 0x000000000040e425 in main_loop (argc=43, argv=0x7fff8cc0c588, envp=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:4040
#14 main (argc=43, argv=0x7fff8cc0c588, envp=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:6476

The process is not really stuck. It seems that qemu_run_timers() always calls migrate_fd_put_ready(), so nothing else has a chance to run.

Reproduced, and should be fixed in the next released KVM.

(In reply to comment #1)
> please note that same operation succeeds over other hosts running same version
> of kvm and kernel 2.6.18-194.
>
> also - operation was performed from rhev-m --> vdsm --> kvm

Hi, Haim

Do you mean that only some specific host can trigger it? Could you supply me the host info so that I can reproduce it?

thanks
Mike

(In reply to comment #9)
> (In reply to comment #1)
> > please note that same operation succeeds over other hosts running same version
> > of kvm and kernel 2.6.18-194.
> >
> > also - operation was performed from rhev-m --> vdsm --> kvm
>
> Hi, Haim
>
> Do you mean that only some specific host can trigger it? Could you supply me the host info
> so that I can reproduce it?
>
> thanks
> Mike

Mike,

the bug was opened a long time ago, which means that I no longer have the exact host or its information. Please see Juan's comment - he managed to reproduce it, so maybe you can ask him. Other than that, the bug was surely fixed, as we run lots of migration\suspend testing (regression) on rhel5.x, and no one in our group has come across it lately. This is up to you.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0028.html
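
Illustration of the analysis above ("qemu_run_timers() always calls migrate_fd_put_ready() so nothing else has chances to run"): the toy model below is not qemu code. It only sketches one way a timer loop can starve the rest of a main loop, under the assumption that the timer callback does blocking work (such as the write() in frame #0) that takes longer than the timer's re-arm interval. The function name `run_timers_once`, its parameters, and the numbers are all hypothetical; whether this matches the actual root cause fixed in kvm-83-213.el5 is not confirmed by this report.

```python
# Toy model of a timer-dispatch loop. NOT qemu code; every name here is
# hypothetical and exists only to illustrate the starvation pattern.

def run_timers_once(work_cost, interval, max_dispatch=1000):
    """Simulate one call to a "run all expired timers" routine.

    A single timer is armed to fire immediately. Its callback re-arms the
    timer `interval` ticks after the callback started, then performs
    blocking work that advances the clock by `work_cost` ticks.

    Returns how many times the callback ran before the routine could
    return to its caller. `max_dispatch` is a safety cap so that the
    simulation terminates even when the real loop would not.
    """
    now = 0          # simulated clock
    expire = 0       # timer is due right away
    dispatched = 0

    # Keep dispatching while the timer is already expired.
    while dispatched < max_dispatch and expire <= now:
        start = now
        expire = start + interval   # callback re-arms itself
        now += work_cost            # blocking work inside the callback
        dispatched += 1

    return dispatched


# Work is cheap compared with the interval: the loop returns after one
# callback, so the caller (the main loop) gets to service other events.
healthy = run_timers_once(work_cost=20, interval=100)

# Work is slower than the interval: the timer is already expired every
# time the callback finishes, so the loop only stops at the safety cap.
starved = run_timers_once(work_cost=150, interval=100)

print("healthy dispatches:", healthy)
print("starved dispatches:", starved)
```

In the second case the process keeps making progress inside the callback, which would be consistent with the observation that it "is not really stuck", yet control never returns to the surrounding loop.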