Bug 1752320
Summary: | vm gets stuck when migrate vm back and forth with remote-viewer trying to connect | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | Han Han <hhan>
Component: | qemu-kvm | Assignee: | Dr. David Alan Gilbert <dgilbert>
qemu-kvm sub component: | General | QA Contact: | Li Xiaohui <xiaohli>
Status: | CLOSED ERRATA | Docs Contact: |
Severity: | unspecified | |
Priority: | medium | CC: | ddepaula, dgilbert, fjin, jinzhao, juzhang, rbalakri, virt-maint, xiaohli, yafu, zhguo
Version: | 8.1 | |
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | qemu-kvm-2.12.0-97.module+el8.2.0+5545+14c6799f | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2020-04-28 15:32:34 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
The last message on (I think) the destination side is:

2019-09-11T02:02:26.818516Z qemu-kvm: warning: Spice: Connection reset by peer
2019-09-11T02:02:26.818713Z qemu-kvm: warning: Spice: Connection reset by peer
2019-09-11T02:02:26.818741Z qemu-kvm: warning: Spice: Connection reset by peer
warning: chardev_can_read called on non open chardev!
2019-09-11T02:02:30.213079Z qemu-kvm: warning: chardev_can_read called on non open chardev!
2019-09-11 02:18:12.297+0000: shutting down, reason=destroyed

and that seems to be after the source thinks the migration has completed. I've not seen that 'chardev_can_read' warning before - that's odd! The message comes from usbredir, hw/usb/redirect.c:

static int usbredir_chardev_can_read(void *opaque)
{
    USBRedirDevice *dev = opaque;

    if (!dev->parser) {
        WARNING("chardev_can_read called on non open chardev!\n");
        return 0;
    }

so I guess there's some ordering problem when it's disconnected.

I have tried to reproduce via libvirt on the latest host but didn't hit this issue after 20 hours. I only tried once; I will try again later:
1. migrate.sh & run.sh still work well
2. "virsh domstats nfs" & "virsh qemu-monitor-command nfs --hmp info nfs" can get the guest status

Hi Han Han,
When you met this issue, how long did you run migrate.sh & run.sh?

(In reply to Li Xiaohui from comment #5)
> Hi Han han,
> When you met this issue, how long did you run migrate.sh & run.sh ?

I usually reproduce it with fewer than 100 executions of remote-viewer, within half an hour. Could you please provide your ENV and let me have a look?

Reproduced here using your scripts; took about 10 migrations.

warning: chardev_can_read called on non open chardev!
2019-11-08T10:55:01.144742Z qemu-kvm: warning: chardev_can_read called on non open chardev!

As well as the warning about the non-open chardev (which comes from usbredir), the hang comes on the write side of things, as we recursively try to do a write on the same socket:

#0 0x00007fc77b4a68dd in __lll_lock_wait () from target:/lib64/libpthread.so.0
#1 0x00007fc77b49faf9 in pthread_mutex_lock () from target:/lib64/libpthread.so.0
#2 0x000055738d7774fd in qemu_mutex_lock_impl (mutex=0x55738f0b18e8, file=0x55738d90b78d "chardev/char.c", line=111) at util/qemu-thread-posix.c:66
#3 0x000055738d70cc03 in qemu_chr_write_buffer (s=s@entry=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=80, offset=offset@entry=0x7ffef54237d0, write_all=false) at chardev/char.c:111

(gdb) p *s
$2 = {parent_obj = {class = 0x55738f053c60, free = 0x7fc77ff8a380 <g_free>, properties = 0x55738f0aec60, ref = 1, parent = 0x55738f0a98f0}, chr_write_lock = {lock = {__data = {__lock = 2, __count = 0, __owner = 25470, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000~c\000\000\001", '\000' <repeats 26 times>, __align = 2}, initialized = true}, be = 0x55739046dcf0, label = 0x55738f0a9cb0 "charredir0", filename = 0x55738f0abc60 "spicevmc", logfd = -1, be_open = 1, gsource = 0x0, gcontext = 0x0, features = {0}}

#4 0x000055738d70cf13 in qemu_chr_write (s=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80, write_all=write_all@entry=false) at chardev/char.c:149
#5 0x000055738d70efe3 in qemu_chr_fe_write (be=be@entry=0x55739046dcf0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80) at chardev/char-fe.c:42
#6 0x000055738d62836c in usbredir_write (priv=0x55739046c640, data=0x55738fe2db80 "", count=80) at hw/usb/redirect.c:289
#7 0x00007fc77fd3241b in usbredirparser_do_write () from target:/lib64/libusbredirparser.so.1
#8 0x000055738d56c371 in vmc_write (sin=<optimized out>, buf=<optimized out>, len=<optimized out>) at chardev/spice.c:34
#9 0x00007fc77d33380c in red_char_device_write_to_device (dev=0x55738f0ed960) at char-device.c:492
#10 red_char_device_write_to_device (dev=0x55738f0ed960) at char-device.c:458
#11 0x00007fc77d3341bd in red_char_device_wakeup (dev=0x55738f0ed960) at char-device.c:854
#12 0x00007fc77d36b645 in spice_server_char_device_wakeup (sin=sin@entry=0x55738f0b1950) at reds.c:3264
#13 0x000055738d56ca64 in spice_chr_write (chr=0x55738f0b18c0, buf=0x55738fe2db80 "", len=80) at chardev/spice.c:202
#14 0x000055738d70cc50 in qemu_chr_write_buffer (s=s@entry=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=80, offset=offset@entry=0x7ffef5423970, write_all=false) at chardev/char.c:114
#15 0x000055738d70cf13 in qemu_chr_write (s=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80, write_all=write_all@entry=false) at chardev/char.c:149
#16 0x000055738d70efe3 in qemu_chr_fe_write (be=be@entry=0x55739046dcf0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80) at chardev/char-fe.c:42
#17 0x000055738d62836c in usbredir_write (priv=0x55739046c640, data=0x55738fe2db80 "", count=80) at hw/usb/redirect.c:289
#18 0x00007fc77fd3241b in usbredirparser_do_write () from target:/lib64/libusbredirparser.so.1
#19 0x000055738d627cc8 in usbredir_create_parser (dev=0x55739046c640) at hw/usb/redirect.c:1235
#20 0x00007fc77d378452 in spicevmc_connect (channel=<optimized out>, client=0x55738f519920, stream=<optimized out>, migration=<optimized out>, caps=<optimized out>) at spicevmc.c:793
#21 0x00007fc77d359c85 in red_channel_connect (channel=channel@entry=0x55738f0ea1c0, client=client@entry=0x55738f519920, stream=stream@entry=0x55739049fb30, migration=0, caps=caps@entry=0x7ffef5423b30) at red-channel.c:523
#22 0x00007fc77d367633 in reds_channel_do_link (channel=channel@entry=0x55738f0ea1c0, client=client@entry=0x55738f519920, link_msg=link_msg@entry=0x55738f080750, stream=0x55739049fb30) at reds.c:2039
#23 0x00007fc77d36dc00 in reds_handle_other_links (link=0x55739049fa50, reds=0x55738f0c1920) at reds.c:2180
#24 reds_handle_link (link=link@entry=0x55739049fa50) at reds.c:2196
#25 0x00007fc77d36e24f in reds_handle_ticket (opaque=0x55739049fa50) at reds.c:2250
#26 0x000055738d774882 in aio_dispatch_handlers (ctx=ctx@entry=0x55738f01baf0) at util/aio-posix.c:429
#27 0x000055738d77522c in aio_dispatch (ctx=0x55738f01baf0) at util/aio-posix.c:460
#28 0x000055738d771cc2 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
#29 0x00007fc77ff8472d in g_main_context_dispatch () from target:/lib64/libglib-2.0.so.0
#30 0x000055738d7742d8 in glib_pollfds_poll () at util/main-loop.c:218
#31 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:241
#32 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:517
#33 0x000055738d55d3e9 in main_loop () at vl.c:1809

and we block on s->chr_write_lock in qemu_chr_write_buffer. This means we're going through qemu_chr_write_buffer twice on the same chardev, which shouldn't happen.

I'm fairly sure that aio-based call to reds_handle_other_links comes via:
  async reds_handle_ticket
  probably from reds_get_spice_ticket (that's got a couple of routes)
  probably from reds_handle_auth_mechanism?
  from reds_handle_read_link_done
  from reds_handle_read_header_done
  from reds_handle_read_magic_done
  from reds_handle_new_link
  from reds_handle_ssl_accept or reds_init_client_ssl_connection or spice_server_add_client or reds_handle_read_magic_done (websocket)

i.e. when the remote-viewer connects. It's not clear what the migration interaction is yet.

#23 0x00007fc77d36dc00 in reds_handle_other_links (link=0x55739049fa50, reds=0x55738f0c1920) at reds.c:2180
reds->dst_do_seamless_migrate = 0

(gdb) p *client
$8 = {parent = {g_type_instance = {g_class = 0x55738f53ab20}, ref_count = 1, qdata = 0x0}, reds = 0x55738f0c1920, channels = 0x55739047ab20, mcc = 0x5573903ecde0, lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, thread_id = 140494834314944, disconnecting = 0, during_target_migrate = 0, seamless_migrate = 0, num_migrated_channels = 0}

Also triggered on current upstream. (41[34]'s /opt/bz1752320/ for ref)

Also noticed: not all cases where the chardev_can_read warning happens cause the hang. There's also a bunch of debug of the form:

warning: Spice: display:0 (0x557996221940): unexpected
Spice: usbredir:0 (0x55799621f9e0): unexpected
2019-11-08T19:45:23.064895Z qemu-system-x86_64: warning: Spice: main:0 (0x557996221890): unexpected

I think the 'chardev_can_read called on non open chardev' could be due to usbredir_chardev_close_bh destroying the parser before it removes the watch; but I'm not sure yet.

The actual recursion seems similar to the one spice protects against with its commit:

commit 0c1f5b00e7907aefee13f86a234558f00cd6c7ef
Author: Uri Lublin <uril>
Date:   Mon Feb 2 12:35:59 2015 +0200

    char-device: spice_char_device_write_to_device: protect against recursion

which was added to handle a wakeup-based recursion - but we've only actually gone through spice's write path once so far; the other side coming from red_channel_connect->spicevmc_connect.

Some debug; it seems to suggest that the 'chardev open' is happening after the post_load?

usbredir_post_load
vm_change_state_handler: running=1 state=9
DAG: qemu_spice_display_start entry
spice_display_is_running: 0
warning: chardev_can_read called on non open chardev!
usbredir_vm_state_change running=1 state=9
2019-12-17T12:15:08.155275Z qemu-system-x86_64: warning: chardev_can_read called on non open chardev!
2019-12-17T12:15:08.155317Z qemu-system-x86_64: usb-redir: chardev open (top)
usbredir_chardev_close_bh: entry dev=0x55824617e150
usbredir_device_disconnect entry dev=0x55824617e150
2019-12-17T12:15:08.155339Z qemu-system-x86_64: usb-redir: removing 0 packet-ids from cancelled queue
2019-12-17T12:15:08.155348Z qemu-system-x86_64: usb-redir: removing 0 packet-ids from already-in-flight queue
usbredir_device_disconnect exit dev=0x55824617e150
usbredir_chardev_close_bh: exit dev=0x55824617e150
2019-12-17T12:15:08.155367Z qemu-system-x86_64: usb-redir: creating usbredirparser
usbredir_write: count=80 entry
usbredir_write: be open: 0x558244ee1c00 - charredir0
2019-12-17T12:15:08.155570Z qemu-system-x86_64: usbredirparser: Peer version: spice-gtk 0.37, using 64-bits ids
usbredir_hello
usbredir_write: count=80 entry
usbredir_write: be open: 0x558244ee1c00 - charredir0

(In reply to Dr. David Alan Gilbert from comment #11)
> I think the 'chardev_can_read called on non open chardev' could be due to
> usbredir_chardev_close_bh
> destroying the parser before it removes the watch; but I'm not sure yet.
Flipping that around doesn't help fix either bug.

I'm trying the following recursion guard; it looks like it's working but needs more testing.

diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index e0f5ca6f81..02e4ea594f 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -113,6 +113,7 @@ struct USBRedirDevice {
     /* Properties */
     CharBackend cs;
     bool enable_streams;
+    bool in_write;
     uint8_t debug;
     int32_t bootindex;
     char *filter_str;
@@ -290,6 +291,13 @@ static int usbredir_write(void *priv, uint8_t *data, int count)
         return 0;
     }
 
+    /* Recursion check */
+    if (dev->in_write) {
+        DPRINTF("usbredir_write recursion\n");
+        return 0;
+    }
+    dev->in_write=true;
+
     r = qemu_chr_fe_write(&dev->cs, data, count);
     if (r < count) {
         if (!dev->watch) {
@@ -300,6 +308,7 @@ static int usbredir_write(void *priv, uint8_t *data, int count)
             r = 0;
         }
     }
+    dev->in_write=false;
 
     return r;
 }

Please try:
http://brew-task-repos.usersys.redhat.com/repos/scratch/dgilbert/qemu-kvm/4.2.0/4.el8.bz1752320a/

It seems to fix it for me.

Patch sent upstream: [PATCH] usbredir: Prevent recursion in usbredir_write

(In reply to Dr. David Alan Gilbert from comment #16)
> Please try:
> http://brew-task-repos.usersys.redhat.com/repos/scratch/dgilbert/qemu-kvm/4.2.0/4.el8.bz1752320a/
>
> seems to fix it for me.

Unfortunately, I cannot reproduce the bug anymore on libvirt-5.6.0-4.module+el8.1.0+4160+b50057dc.x86_64 qemu-kvm-4.1.0-9.module+el8.1.0+4210+23b2046a.x86_64. I have run the scripts for 30 minutes and more than 600 migration cycles without reproducing it.

I can summarize the issues found by these scripts here; maybe you can find something in common:
- remote-viewer gets SIGABRT when connecting to a VM that is being migrated: https://bugzilla.redhat.com/show_bug.cgi?id=1746239#c4
- remote-viewer gets SIGSEGV when connecting to a VM that is being migrated: https://bugzilla.redhat.com/show_bug.cgi?id=1746239#c0
- qemu-kvm gets SIGABRT when setting spice usb redirection during migration: https://bugzilla.redhat.com/show_bug.cgi?id=1786413#c0
- qemu-kvm gets SIGSEGV when setting spice usb redirection during migration: https://bugzilla.redhat.com/show_bug.cgi?id=1786413#c2

More update: the bug is hard to reproduce, but it did appear sometimes after comment 18. I will keep looking for a reliable way to reproduce it.

(In reply to Han Han from comment #19)
> More update: the bug is hard to reproduce, but it did appear sometimes after comment 18.
> [...]

Yes, it's difficult - sometimes it repeats often; sometimes it hides. I'd also seen the SIGSEGV in remote-viewer (1746239) - but they're very random, often in lots of different places. I'd not seen either the abrt or the other qemu crashes.

Dave

(In reply to Dr. David Alan Gilbert from comment #16)
> Please try:
> http://brew-task-repos.usersys.redhat.com/repos/scratch/dgilbert/qemu-kvm/4.2.0/4.el8.bz1752320a/
>
> seems to fix it for me.

For your scratch build, I have run the bug reproducing scripts for over one day. The bug is not reproduced.
Version:
libvirt-5.10.0-1.module+el8.2.0+5135+ed3b2489.x86_64
qemu-kvm-4.2.0-4.el8.bz1752320a.x86_64

Created attachment 1649328 [details]
segment fault in scratch build: L1 XMLs and backtrace

Hello, another segmentation fault was found on the scratch build during loop migrations:

Version:
libvirt-5.10.0-1.module+el8.2.0+5135+ed3b2489.x86_64
qemu-kvm-4.2.0-4.el8.bz1752320a.x86_64

Setup:
Prepare 2 L1 hosts for migration, one with an interface bandwidth limit:
<bandwidth>
  <inbound average='20480'/>
  <outbound average='20480'/>
</bandwidth>

Steps: Follow steps 1~3 of comment 0.

Results:
Sometimes the following warnings pop up on the virsh migration command line:

Migration: [  0 %]error: internal error: qemu unexpectedly closed the monitor: 2020-01-03T03:07:12.112093Z qemu-kvm: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
2020-01-03T03:07:12.112109Z qemu-kvm: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
2020-01-03T03:07:12.114502Z qemu-kvm: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
2020-01-03T03:07:12.114513Z qemu-kvm: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]

And then qemu-kvm gets a segmentation fault:

# abrt-cli ls
id 0d37d84a6d9cdc4dc9a8ecee86350ec983faf720
reason:         main_channel_client_is_low_bandwidth(): qemu-kvm killed by SIGSEGV
time:           Fri 03 Jan 2020 11:07:12 AM CST
cmdline:        /usr/libexec/qemu-kvm -name guest=nfs-8.2,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1112-nfs-8.2/master-key.aes -machine pc-q35-rhel8.2.0,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Client-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,arch-capabilities=on,xsaves=on,pdpe1gb=on,skip-l1dfl-vmentry=on,spec-ctrl=off,mpx=off -m 1024 -overcommit mem-lock=off -smp 2,sockets=2,cores=1,threads=1 -uuid e5717b24-74cc-4aa3-b2c3-e55cd2589f98 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=37,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-pci-bridge,id=pci.7,bus=pci.4,addr=0x0 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 -device qemu-xhci,id=usb1,bus=pci.2,addr=0x0 -device virtio-scsi-pci,id=scsi0,bus=pci.1,addr=0x0 -device ahci,id=sata1,bus=pci.7,addr=0x1 -blockdev '{\"driver\":\"gluster\",\"volume\":\"gv\",\"path\":\"nfs-8.2.qcow2\",\"server\":[{\"type\":\"inet\",\"host\":\"gls1.usersys.redhat.com\",\"port\":\"24007\"}],\"debug\":0,\"node-name\":\"libvirt-1-storage\",\"auto-read-only\":true,\"discard\":\"unmap\"}' -blockdev '{\"node-name\":\"libvirt-1-format\",\"read-only\":false,\"driver\":\"qcow2\",\"file\":\"libvirt-1-storage\",\"backing\":null}' -device virtio-blk-pci,scsi=off,bus=pci.5,addr=0x0,drive=libvirt-1-format,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb1.0,port=2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.3,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
package:        15:qemu-kvm-core-4.2.0-4.el8.bz1752320a
uid:            107 (qemu)
count:          1
Directory:      /var/spool/abrt/ccpp-2020-01-03-11:07:12-1404
Run 'abrt-cli report /var/spool/abrt/ccpp-2020-01-03-11:07:12-1404' for creating a case in Red Hat Customer Portal

Backtrace:
#0 0x00007fc0cbc3d588 in main_channel_client_is_low_bandwidth (mcc=mcc@entry=0x556b771c02a0) at main-channel-client.c:656
#1 0x00007fc0cbc1f516 in dcc_config_socket (rcc=0x556b77648540) at dcc.c:1426
#2 0x00007fc0cbc440a3 in red_channel_client_config_socket (rcc=0x556b77648540) at red-channel-client.c:1046
#3 red_channel_client_initable_init (initable=<optimized out>, cancellable=<optimized out>, error=0x0) at red-channel-client.c:925
#4 0x00007fc0c6eed5df in g_initable_new_valist (object_type=<optimized out>, first_property_name=0x7fc0cbce2dc4 "channel", var_args=0x7fc0a4f90350, cancellable=0x0, error=0x0) at ginitable.c:248
#5 0x00007fc0c6eed68d in g_initable_new (object_type=<optimized out>, cancellable=cancellable@entry=0x0, error=error@entry=0x0, first_property_name=first_property_name@entry=0x7fc0cbce2dc4 "channel") at ginitable.c:162
#6 0x00007fc0cbc2015d in dcc_new (display=display@entry=0x556b7716d930, client=client@entry=0x556b77166dd0, stream=stream@entry=0x556b775d5570, mig_target=mig_target@entry=1, caps=caps@entry=0x556b785333a8, image_compression=SPICE_IMAGE_COMPRESSION_AUTO_GLZ, jpeg_state=SPICE_WAN_COMPRESSION_AUTO, zlib_glz_state=SPICE_WAN_COMPRESSION_AUTO) at dcc.c:507
#7 0x00007fc0cbc2b8c6 in display_channel_connect (channel=<optimized out>, client=0x556b77166dd0, stream=0x556b775d5570, migration=1, caps=0x556b785333a8) at display-channel.c:2616
#8 0x00007fc0cbc4200b in handle_dispatcher_connect (opaque=<optimized out>, payload=0x556b78533390) at red-channel.c:511
#9 0x00007fc0cbc2747f in dispatcher_handle_single_read (dispatcher=0x556b785328a0) at dispatcher.c:287
#10 dispatcher_handle_recv_read (dispatcher=0x556b785328a0) at dispatcher.c:307
#11 0x00007fc0cbc2d79f in watch_func (source=<optimized out>, condition=<optimized out>, data=0x556b785333e0) at event-loop.c:119
#12 0x00007fc0ce27867d in g_main_dispatch (context=0x556b785332a0) at gmain.c:3176
#13 g_main_context_dispatch (context=context@entry=0x556b785332a0) at gmain.c:3829
#14 0x00007fc0ce278a48 in g_main_context_iterate (context=0x556b785332a0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3902
#15 0x00007fc0ce278d72 in g_main_loop_run (loop=0x556b77a59c60) at gmain.c:4098
#16 0x00007fc0cbc5b47b in red_worker_main (arg=0x556b785326c0) at red-worker.c:1139
#17 0x00007fc0c9d832de in start_thread (arg=<optimized out>) at pthread_create.c:486
#18 0x00007fc0c9ab4e83 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(In reply to Han Han from comment #22)
> Created attachment 1649328 [details]
> segment fault in scratch build: L1 XMLs and backtrace
>
> Hello, another segmentation fault was found on the scratch build during loop migrations:
> [...]

Please file that one as a separate bug against spice; I don't think it's related to my change (which is in the usb redirect code). Please add a note here to say what the new bz is.

(In reply to Dr. David Alan Gilbert from comment #23)
> Please file that one as a separate bug against spice; I don't think it's
> related to my change (which is in the usb redirect code)
> Please add a note here to say what the new bz is.

Filed a new bug against spice: https://bugzilla.redhat.com/show_bug.cgi?id=1787536

Thanks; please don't change the state of this bug - I've set it back to assigned.

QE note: this also has the fix for 1786414's initial problem in it, but I'll keep 1786414 open until we solve the other problem we found on the corresponding 1786413. QA_ACK, please?

QEMU has been recently split into sub-components and, as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks.

OK, reproduced this bz on hosts (kernel-4.18.0-128.el8.x86_64 & qemu-img-4.1.0-9.module+el8.1.0+4210+23b2046a.x86_64) after testing as in Comment 0, doing ping-pong migration for about 30 minutes:
(1) errors in the qemu log on the src host:
warning: chardev_can_read called on non open chardev!
2020-03-10T12:46:08.191923Z qemu-kvm: warning: chardev_can_read called on non open chardev!
(2) couldn't connect to the guest via remote-viewer
(3) command hang:
# virsh domstats nfs
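Not part of the report, but for an unattended reproduction loop the hang in (3) can be noticed by bounding the stats query with a timeout; a minimal sketch, assuming a guest named nfs and that 30 seconds is long enough for a healthy monitor to answer:

# hypothetical hang check, not taken from the attached scripts
if ! timeout 30 virsh domstats nfs > /dev/null 2>&1; then
    echo "virsh domstats nfs did not return within 30s - monitor looks stuck"
fi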
Verified this bz on hosts (kernel-4.18.0-185.el8.x86_64 & qemu-img-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64 & libvirt-4.5.0-41.module+el8.2.0+5928+db9eea38.x86_64) as in Comment 32, trying two times.

Test steps:
(1) start the nfs guest via virsh command
Notes: change two places in nfs.xml:
a. on rhel8.2-non-av, use "pc-q35-rhel7.6.0" instead of "pc-q35-rhel8.0.0";
b. add "cache=none" to the system disk, since migration otherwise fails with: error: Unsafe migration: Migration may lead to data corruption if disks use cache != none or cache != directsync
(2) on another client host (not src or dst), run migrate.sh & run.sh

Test results: waited > 1 hour; the vm works well and migrates successfully.

Dave, I get some warnings from the qemu log during migration, could you help see whether it's an issue?
2020-03-11T02:51:04.775928Z qemu-kvm: warning: usb-redir connection broken during migration
(process:111953): Spice-WARNING **: 22:51:05.363: Connection reset by peer
(process:111953): Spice-WARNING **: 22:51:05.364: Connection reset by peer
2020-03-11 02:51:06.036+0000: initiating migration
2020-03-11 02:51:14.105+0000: shutting down, reason=migrated

As long as it's working, and a normal migration test shows usb/spice still working after migration, I don't worry about those warnings much.

Yeah, let's close it verified, as migration finishes successfully and the vm works well.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1587

Created attachment 1615431 [details]
The vm log, libvirtd log of src and dst host; scripts for reproducing

Description of problem:
As subject.

Version-Release number of selected component (if applicable):
Src and dst host:
libvirt-5.6.0-4.module+el8.1.0+4160+b50057dc.x86_64
qemu-kvm-4.1.0-9.module+el8.1.0+4210+23b2046a.x86_64
Desktop:
virt-viewer-7.0-8.el8.x86_64

How reproducible:
30%

Steps to Reproduce:
Preparation:
1. Make sure the dst and src hostnames can be resolved
2. Open ports in firewalld on the src and dst hosts: 5900-6000/tcp 49152-49252/tcp

Steps:
1. Prepare a running vm named nfs
2. Run migrate.sh to migrate the vm back and forth
3. Run run.sh to connect to the vm continuously with remote-viewer, until migrate.sh no longer reports any migration progress; that means migration cannot be started. (A sketch of what such loops can look like is given after the Expected results.)
4. Log in to the host that remote-viewer connected to and try to get the stats of the vm:
# virsh domstats nfs
The command will get stuck, and `virsh qemu-monitor-command nfs --hmp info nfs` will time out, too.

Actual results:
As above

Expected results:
No stuck
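The actual migrate.sh and run.sh are in the attachment and are not reproduced here. The following is only a minimal sketch of what such a ping-pong migration plus reconnect loop can look like; the host names src.example.com / dst.example.com and the spice port 5900 are assumptions, not values taken from the attachment:

#!/bin/bash
# migrate.sh (sketch): ping-pong migrate the guest between the two hosts forever.
# One-time preparation on both hosts (from the description above):
#   firewall-cmd --permanent --add-port=5900-6000/tcp
#   firewall-cmd --permanent --add-port=49152-49252/tcp
#   firewall-cmd --reload
SRC=src.example.com   # assumption: source host
DST=dst.example.com   # assumption: destination host
GUEST=nfs
while true; do
    virsh -c qemu+ssh://$SRC/system migrate --live --verbose $GUEST qemu+ssh://$DST/system
    virsh -c qemu+ssh://$DST/system migrate --live --verbose $GUEST qemu+ssh://$SRC/system
done

#!/bin/bash
# run.sh (sketch): keep trying to connect with remote-viewer while the migration loop runs.
SRC=src.example.com
DST=dst.example.com
while true; do
    # the guest is only ever on one side, so one of the two connections simply fails
    timeout 10 remote-viewer spice://$SRC:5900 &
    timeout 10 remote-viewer spice://$DST:5900 &
    sleep 10
done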