Bug 983415 - migration with non-shared storage with full disk copy will cause domain crash
Status: CLOSED DUPLICATE of bug 916067
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Hardware: x86_64
OS: All
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Virtualization Maintenance
QA Contact: Virtualization Bugs
Depends On:
Reported: 2013-07-11 03:37 EDT by yanbing du
Modified: 2013-07-22 17:34 EDT (History)
12 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2013-07-22 17:34:55 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments
libvirtd-debug.log (187.88 KB, application/x-rar)
2013-07-11 03:37 EDT, yanbing du

Description yanbing du 2013-07-11 03:37:10 EDT
Created attachment 772058: libvirtd-debug.log

Description of problem:
Migrating a domain with --copy-storage-all via virsh (i.e. migration with non-shared storage and a full disk copy) causes the domain to crash.

Version-Release number of selected component (if applicable):
# uname -a
Linux hp-dl585g7-01.qe.lab.eng.nay.redhat.com 2.6.32-395.el6.x86_64 #1 SMP Tue Jul 2 14:34:40 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux

# rpm -q libvirt

# rpm -qa|grep qemu-kvm

How reproducible:

Steps to Reproduce:
# virsh migrate --live test qemu+ssh://root@$dest_host/system --copy-storage-all --verbose
error: Unable to read from monitor: Connection reset by peer

gdb debug info:

# gdb attach $qemu-kvm-pid

(gdb) c
[New Thread 0x7fc24ebfd700 (LWP 33189)]

Program received signal SIGSEGV, Segmentation fault.
monitor_flush (mon=0x0) at /usr/src/debug/qemu-kvm-
279	    buf = qstring_get_str(mon->outbuf);
(gdb) t a a bt

Thread 3 (Thread 0x7fc24ebfd700 (LWP 33189)):
#0  0x00007fc29cd0601e in __lll_lock_wait_private () from /lib64/libpthread.so.0
#1  0x00007fc29ccffecb in _L_lock_4712 () from /lib64/libpthread.so.0
#2  0x00007fc29ccffb34 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc29ad8890d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7fc295357700 (LWP 33173)):
#0  0x00007fc29cd06054 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fc29cd01388 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007fc29cd01257 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fc29d3d3180 in post_kvm_run (kvm=<value optimized out>, env=0x7fc29fb09a60) at /usr/src/debug/qemu-kvm-
#4  0x00007fc29d3d4728 in kvm_run (env=0x7fc29fb09a60) at /usr/src/debug/qemu-kvm-
#5  0x00007fc29d3d4ba9 in kvm_cpu_exec (env=<value optimized out>) at /usr/src/debug/qemu-kvm-
#6  0x00007fc29d3d5a8d in kvm_main_loop_cpu (_env=0x7fc29fb09a60) at /usr/src/debug/qemu-kvm-
#7  ap_main_loop (_env=0x7fc29fb09a60) at /usr/src/debug/qemu-kvm-
#8  0x00007fc29ccff851 in start_thread () from /lib64/libpthread.so.0
#9  0x00007fc29ad8890d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fc29d31f980 (LWP 33171)):
#0  monitor_flush (mon=0x0) at /usr/src/debug/qemu-kvm-
#1  0x00007fc29d43eb1a in blk_mig_save_bulked_block (mon=0x0, f=0x7fc2a00a94e0, is_async=1) at /usr/src/debug/qemu-kvm-
#2  0x00007fc29d43ed8f in block_save_live (mon=0x0, f=0x7fc2a00a94e0, stage=1, opaque=<value optimized out>) at /usr/src/debug/qemu-kvm-
#3  0x00007fc29d43935b in qemu_savevm_state_begin (mon=0x0, f=0x7fc2a00a94e0, blk_enable=<value optimized out>, shared=<value optimized out>) at /usr/src/debug/qemu-kvm-
#4  0x00007fc29d430d7f in migrate_fd_connect (s=0x7fc29fc27b00) at /usr/src/debug/qemu-kvm-
#5  0x00007fc29d440920 in fd_start_outgoing_migration (mon=0x7fc29fb872d0, fdname=0x7fc29fc27a43 "migrate", bandwidth_limit=4294967295, detach=1, blk=1, inc=0)
    at /usr/src/debug/qemu-kvm-
#6  0x00007fc29d4313b5 in do_migrate (mon=0x7fc29fb872d0, qdict=<value optimized out>, ret_data=<value optimized out>) at /usr/src/debug/qemu-kvm-
#7  0x00007fc29d3b7e50 in monitor_call_handler (mon=0x7fc29fb872d0, cmd=0x7fc29d899de0, params=<value optimized out>) at /usr/src/debug/qemu-kvm-
#8  0x00007fc29d3b8ad4 in handle_qmp_command (parser=<value optimized out>, tokens=<value optimized out>) at /usr/src/debug/qemu-kvm-
#9  0x00007fc29d411624 in json_message_process_token (lexer=0x7fc29fb87380, token=0x7fc29fc271d0, type=JSON_OPERATOR, x=109, y=10) at /usr/src/debug/qemu-kvm-
#10 0x00007fc29d4112c0 in json_lexer_feed_char (lexer=0x7fc29fb87380, ch=125 '}', flush=false) at /usr/src/debug/qemu-kvm-
#11 0x00007fc29d411409 in json_lexer_feed (lexer=0x7fc29fb87380, buffer=0x7fff0c247ba0 "}\317\t", <incomplete sequence \375>, size=1) at /usr/src/debug/qemu-kvm-
#12 0x00007fc29d3b777b in monitor_control_read (opaque=<value optimized out>, buf=<value optimized out>, size=<value optimized out>) at /usr/src/debug/qemu-kvm-
#13 0x00007fc29d434d8a in qemu_chr_be_write (chan=<value optimized out>, cond=<value optimized out>, opaque=0x7fc29faefbd0) at /usr/src/debug/qemu-kvm-
#14 tcp_chr_read (chan=<value optimized out>, cond=<value optimized out>, opaque=0x7fc29faefbd0) at /usr/src/debug/qemu-kvm-
#15 0x00007fc29ca48f0e in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#16 0x00007fc29d3b03ca in glib_select_poll (timeout=1000) at /usr/src/debug/qemu-kvm-
#17 main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-
#18 0x00007fc29d3d2c6a in kvm_main_loop () at /usr/src/debug/qemu-kvm-
#19 0x00007fc29d3b3d48 in main_loop (argc=46, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-
#20 main (argc=46, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-
(gdb) c
[Thread 0x7fc24ebfd700 (LWP 33189) exited]
[Thread 0x7fc295357700 (LWP 33173) exited]

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
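
For context, the fault is a NULL-pointer dereference: the backtrace shows block_save_live() and monitor_flush() being called with mon=0x0, and the faulting line reads mon->outbuf. Below is a minimal C sketch of that crash pattern; it is not the actual qemu-kvm source, and the struct layouts are illustrative assumptions.

/* crash_sketch.c - illustrative reproduction of the NULL-monitor fault.
 * NOT the actual qemu-kvm code; QString/Monitor layouts are assumed. */
#include <stdio.h>

typedef struct QString {
    const char *str;
} QString;

typedef struct Monitor {
    QString *outbuf;    /* buffered monitor output */
} Monitor;

static const char *qstring_get_str(const QString *qs)
{
    return qs->str;
}

static void monitor_flush(Monitor *mon)
{
    /* Mirrors the faulting line from the backtrace: when mon is NULL,
     * mon->outbuf dereferences address 0 and raises SIGSEGV. */
    const char *buf = qstring_get_str(mon->outbuf);
    fputs(buf, stdout);
}

int main(void)
{
    /* The backtrace shows migrate_fd_connect() handing a NULL monitor
     * down through qemu_savevm_state_begin() -> block_save_live() ->
     * blk_mig_save_bulked_block() -> monitor_flush(). */
    monitor_flush(NULL);    /* segfaults, mirroring the reported crash */
    return 0;
}

The backtrace also shows detach=1 in fd_start_outgoing_migration(), which is consistent with a detached migration leaving the stored monitor pointer NULL. A guard such as "if (!mon) return;" at the top of monitor_flush() would mask the fault, but the underlying problem is the block-migration path being handed a NULL monitor in the first place.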

--- Additional comment from Jiri Denemark on 2013-07-10 06:47:05 EDT ---

That's an unrelated qemu-kvm bug.

--- Additional comment from yanbing du on 2013-07-11 03:23:43 EDT ---

(In reply to Jiri Denemark from comment #18)
> That's an unrelated qemu-kvm bug.

Okay. So for this bug, migration with local storage no longer triggers the error "Unsafe migration: Migration may lead to data corruption if disks use cache != none".

I will file a bug against qemu-kvm for the domain crash problem and move this bug to VERIFIED.
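
For reference, libvirt raises that "Unsafe migration" error when a disk uses any cache mode other than "none" (unless the migration is forced with --unsafe). A hypothetical disk definition that satisfies the check; the source file and target device below are illustrative:

<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none'/>
  <source file='/var/lib/libvirt/images/test.qcow2'/>
  <target dev='vda' bus='virtio'/>
</disk>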
Comment 2 Qunfang Zhang 2013-07-11 05:21:35 EDT
We don't support block migration, and there is a similar existing bug, quoted below. CC'ing Paolo to have a look.

Bug 916067 - when cancelling the migration with Ctrl+C during block migration (full disk copy or incremental disk copy), migrating again will cause the domain to be destroyed
Comment 4 Ademar Reis 2013-07-22 17:34:55 EDT
As Paolo mentioned, we don't support block migration in RHEL 6.5, but I'm closing this as a duplicate of bug 916067, which is open and being investigated by Paolo anyway.

*** This bug has been marked as a duplicate of bug 916067 ***
