Bug 1752320 - VM gets stuck when migrating the VM back and forth with remote-viewer trying to connect
Summary: VM gets stuck when migrating the VM back and forth with remote-viewer trying to connect
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Dr. David Alan Gilbert
QA Contact: Li Xiaohui
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-09-16 03:02 UTC by Han Han
Modified: 2023-04-29 10:32 UTC
CC List: 10 users

Fixed In Version: qemu-kvm-2.12.0-97.module+el8.2.0+5545+14c6799f
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-28 15:32:34 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
The vm log, libvirtd log of src and dst host; scripts for reproducing (7.17 MB, application/x-xz), 2019-09-16 03:02 UTC, Han Han
segment fault in scratch build: L1 XMLs and backtrace (7.60 KB, application/gzip), 2020-01-03 03:45 UTC, Han Han


Links
Red Hat Issue Tracker RHELPLAN-27329 (last updated 2023-04-29 10:32:23 UTC)
Red Hat Product Errata RHEA-2020:1587 (last updated 2020-04-28 15:34:17 UTC)

Description Han Han 2019-09-16 03:02:44 UTC
Created attachment 1615431 [details]
The vm log, libvirtd log of src and dst host; Scripts for reproducing

Description of problem:
As subject

Version-Release number of selected component (if applicable):
Src and dst host:
libvirt-5.6.0-4.module+el8.1.0+4160+b50057dc.x86_64
qemu-kvm-4.1.0-9.module+el8.1.0+4210+23b2046a.x86_64

Desktop:
virt-viewer-7.0-8.el8.x86_64

How reproducible:
30%


Steps to Reproduce:
Preparation:
1. Make sure the dst and src hostnames can be resolved
2. Open these ports in firewalld on the src and dst hosts:
5900-6000/tcp 49152-49252/tcp

Steps:
1. Prepare a running VM named nfs
2. Run migrate.sh to migrate the VM back and forth
3. Run run.sh to connect to the VM continuously with remote-viewer, until migrate.sh stops showing any migration progress.
That means migration can no longer be started.
4. Log in to the host that remote-viewer connected to and try to get the stats of the VM:
# virsh domstats nfs

The command gets stuck, and `virsh qemu-monitor-command nfs --hmp info nfs` times out, too.

Actual results:
As above

Expected results:
No hang
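
For illustration only (the real migrate.sh and run.sh are in the attachment above; their contents are not reproduced here): a minimal libvirt C sketch of the ping-pong migration loop from the steps above, with placeholder host URIs, could look like this:

#include <stdio.h>
#include <libvirt/libvirt.h>

/* Hypothetical ping-pong migration loop; the "src"/"dst" URIs are
 * placeholders. Build with: gcc pingpong.c -lvirt */
int main(void)
{
    const char *uri[2] = { "qemu+ssh://src.example.com/system",
                           "qemu+ssh://dst.example.com/system" };
    int cur = 0;                        /* host the VM currently runs on */

    for (int i = 0; i < 100; i++) {
        virConnectPtr from = virConnectOpen(uri[cur]);
        virConnectPtr to   = virConnectOpen(uri[1 - cur]);
        virDomainPtr  dom  = from ? virDomainLookupByName(from, "nfs") : NULL;
        if (!from || !to || !dom)
            return 1;

        /* Live-migrate the running guest to the other host. */
        virDomainPtr moved = virDomainMigrate(dom, to, VIR_MIGRATE_LIVE,
                                              NULL, NULL, 0);
        if (!moved) {
            fprintf(stderr, "migration %d failed\n", i);
            return 1;
        }
        virDomainFree(moved);
        virDomainFree(dom);
        virConnectClose(from);
        virConnectClose(to);
        cur = 1 - cur;                  /* flip direction for the next pass */
    }
    return 0;
}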

Comment 2 Dr. David Alan Gilbert 2019-09-16 18:31:03 UTC
The last message, I think on the destination side, is:

2019-09-11T02:02:26.818516Z qemu-kvm: warning: Spice: Connection reset by peer
2019-09-11T02:02:26.818713Z qemu-kvm: warning: Spice: Connection reset by peer
2019-09-11T02:02:26.818741Z qemu-kvm: warning: Spice: Connection reset by peer
warning: chardev_can_read called on non open chardev!

2019-09-11T02:02:30.213079Z qemu-kvm: warning: chardev_can_read called on non open chardev!

2019-09-11 02:18:12.297+0000: shutting down, reason=destroyed

and that seems to be after the source thinks the migration has completed.

I've not seen that 'chardev_can_read' warning before - that's odd!

Comment 3 Dr. David Alan Gilbert 2019-09-16 18:38:13 UTC
The message comes from usbredir:

hw/usb/redirect.c:
static int usbredir_chardev_can_read(void *opaque)
{
    USBRedirDevice *dev = opaque;

    if (!dev->parser) {
        WARNING("chardev_can_read called on non open chardev!\n");
        return 0;
    }

so I guess there's some ordering problem when it's disconnected.

Comment 4 Li Xiaohui 2019-09-26 08:40:06 UTC
I tried to reproduce this via libvirt on the latest host and didn't hit the issue after 20 hours. I only tried once; I will try again later:
1. migrate.sh & run.sh still work well
2. "virsh domstats nfs" & "virsh qemu-monitor-command nfs --hmp info nfs" can get the guest status

Comment 5 Li Xiaohui 2019-09-26 08:43:00 UTC
Hi Han Han,
When you hit this issue, how long had migrate.sh & run.sh been running?

Comment 6 Han Han 2019-09-29 07:44:20 UTC
(In reply to Li Xiaohui from comment #5)
> Hi Han Han,
> When you hit this issue, how long had migrate.sh & run.sh been running?

I usually reproduce it within fewer than 100 runs of remote-viewer, i.e. within half an hour.
Could you please share your environment so I can take a look?

Comment 7 Dr. David Alan Gilbert 2019-11-08 10:57:20 UTC
Reproduced here using your scripts; took about 10 migrations.

warning: chardev_can_read called on non open chardev!

2019-11-08T10:55:01.144742Z qemu-kvm: warning: chardev_can_read called on non open chardev!

Comment 8 Dr. David Alan Gilbert 2019-11-08 14:34:37 UTC
As well as the warning about the non-open chardev (which comes from usbredir), the hang comes from the write side, as we recursively try to do a write on the same socket:
#0  0x00007fc77b4a68dd in __lll_lock_wait () from target:/lib64/libpthread.so.0
#1  0x00007fc77b49faf9 in pthread_mutex_lock () from target:/lib64/libpthread.so.0
#2  0x000055738d7774fd in qemu_mutex_lock_impl (mutex=0x55738f0b18e8, file=0x55738d90b78d "chardev/char.c", line=111) at util/qemu-thread-posix.c:66
#3  0x000055738d70cc03 in qemu_chr_write_buffer (s=s@entry=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=80, offset=offset@entry=0x7ffef54237d0,
    write_all=false) at chardev/char.c:111
(gdb) p *s
$2 = {parent_obj = {class = 0x55738f053c60, free = 0x7fc77ff8a380 <g_free>, properties = 0x55738f0aec60, ref = 1, parent = 0x55738f0a98f0},
  chr_write_lock = {lock = {__data = {__lock = 2, __count = 0, __owner = 25470, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {
          __prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000~c\000\000\001", '\000' <repeats 26 times>, __align = 2},
    initialized = true}, be = 0x55739046dcf0, label = 0x55738f0a9cb0 "charredir0", filename = 0x55738f0abc60 "spicevmc", logfd = -1, be_open = 1,
  gsource = 0x0, gcontext = 0x0, features = {0}}
 
#4  0x000055738d70cf13 in qemu_chr_write (s=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80, write_all=write_all@entry=false)
    at chardev/char.c:149
#5  0x000055738d70efe3 in qemu_chr_fe_write (be=be@entry=0x55739046dcf0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80) at chardev/char-fe.c:42
#6  0x000055738d62836c in usbredir_write (priv=0x55739046c640, data=0x55738fe2db80 "", count=80) at hw/usb/redirect.c:289
#7  0x00007fc77fd3241b in usbredirparser_do_write () from target:/lib64/libusbredirparser.so.1
#8  0x000055738d56c371 in vmc_write (sin=<optimized out>, buf=<optimized out>, len=<optimized out>) at chardev/spice.c:34
#9  0x00007fc77d33380c in red_char_device_write_to_device (dev=0x55738f0ed960) at char-device.c:492
#10 red_char_device_write_to_device (dev=0x55738f0ed960) at char-device.c:458
#11 0x00007fc77d3341bd in red_char_device_wakeup (dev=0x55738f0ed960) at char-device.c:854
#12 0x00007fc77d36b645 in spice_server_char_device_wakeup (sin=sin@entry=0x55738f0b1950) at reds.c:3264
#13 0x000055738d56ca64 in spice_chr_write (chr=0x55738f0b18c0, buf=0x55738fe2db80 "", len=80) at chardev/spice.c:202
#14 0x000055738d70cc50 in qemu_chr_write_buffer (s=s@entry=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=80, offset=offset@entry=0x7ffef5423970,
    write_all=false) at chardev/char.c:114
#15 0x000055738d70cf13 in qemu_chr_write (s=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80, write_all=write_all@entry=false)
    at chardev/char.c:149
#16 0x000055738d70efe3 in qemu_chr_fe_write (be=be@entry=0x55739046dcf0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80) at chardev/char-fe.c:42
#17 0x000055738d62836c in usbredir_write (priv=0x55739046c640, data=0x55738fe2db80 "", count=80) at hw/usb/redirect.c:289
#18 0x00007fc77fd3241b in usbredirparser_do_write () from target:/lib64/libusbredirparser.so.1
#19 0x000055738d627cc8 in usbredir_create_parser (dev=0x55739046c640) at hw/usb/redirect.c:1235
#20 0x00007fc77d378452 in spicevmc_connect (channel=<optimized out>, client=0x55738f519920, stream=<optimized out>, migration=<optimized out>,
    caps=<optimized out>) at spicevmc.c:793
#21 0x00007fc77d359c85 in red_channel_connect (channel=channel@entry=0x55738f0ea1c0, client=client@entry=0x55738f519920,
    stream=stream@entry=0x55739049fb30, migration=0, caps=caps@entry=0x7ffef5423b30) at red-channel.c:523
#22 0x00007fc77d367633 in reds_channel_do_link (channel=channel@entry=0x55738f0ea1c0, client=client@entry=0x55738f519920,
    link_msg=link_msg@entry=0x55738f080750, stream=0x55739049fb30) at reds.c:2039
#23 0x00007fc77d36dc00 in reds_handle_other_links (link=0x55739049fa50, reds=0x55738f0c1920) at reds.c:2180
#24 reds_handle_link (link=link@entry=0x55739049fa50) at reds.c:2196
#25 0x00007fc77d36e24f in reds_handle_ticket (opaque=0x55739049fa50) at reds.c:2250
#26 0x000055738d774882 in aio_dispatch_handlers (ctx=ctx@entry=0x55738f01baf0) at util/aio-posix.c:429
#27 0x000055738d77522c in aio_dispatch (ctx=0x55738f01baf0) at util/aio-posix.c:460
#28 0x000055738d771cc2 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
#29 0x00007fc77ff8472d in g_main_context_dispatch () from target:/lib64/libglib-2.0.so.0
#30 0x000055738d7742d8 in glib_pollfds_poll () at util/main-loop.c:218
#31 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:241
#32 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:517
#33 0x000055738d55d3e9 in main_loop () at vl.c:1809

and we block on s->chr_write_lock in qemu_chr_write_buffer.
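
Reduced to a standalone sketch (not QEMU code - just the same pattern the backtrace shows, a non-recursive mutex re-taken by the thread that already holds it):

#include <pthread.h>
#include <stdio.h>

/* Non-recursive mutex, standing in for chr_write_lock.
 * Build with: gcc deadlock.c -lpthread */
static pthread_mutex_t write_lock = PTHREAD_MUTEX_INITIALIZER;

static void chr_write(const char *buf);

/* Stand-in for the spice/usbredir backend: handling the first write
 * triggers another write on the same chardev (as usbredir_create_parser
 * does when a client connects mid-migration). */
static void backend_write(const char *buf)
{
    printf("backend got: %s\n", buf);
    chr_write("nested write");       /* re-enters chr_write on this thread */
}

static void chr_write(const char *buf)
{
    pthread_mutex_lock(&write_lock); /* second entry blocks here forever */
    backend_write(buf);
    pthread_mutex_unlock(&write_lock);
}

int main(void)
{
    chr_write("first write");        /* never returns: self-deadlock */
    return 0;
}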

Comment 9 Dr. David Alan Gilbert 2019-11-08 17:45:52 UTC
This means we're going through qemu_chr_write_buffer twice on the same chardev, which shouldn't happen.

I'm fairly sure that the aio-based call to reds_handle_other_links comes via:

async reds_handle_ticket probably from reds_get_spice_ticket
  that's got a couple of routes, probably from reds_handle_auth_mechanism ?
from reds_handle_read_link_done
from reds_handle_read_header_done
from reds_handle_read_magic_done
from reds_handle_new_link
from reds_handle_ssl_accept or reds_init_client_ssl_connection or spice_server_add_client or reds_handle_read_magic_done (websocket)

i.e. when the remote-viewer connects.
It's not clear yet what the migration interaction is.

#23 0x00007fc77d36dc00 in reds_handle_other_links (link=0x55739049fa50, reds=0x55738f0c1920) at reds.c:2180
    reds->dst_do_seamless_migrate= 0
(gdb) p *client
$8 = {parent = {g_type_instance = {g_class = 0x55738f53ab20}, ref_count = 1, qdata = 0x0}, reds = 0x55738f0c1920, channels = 0x55739047ab20,
  mcc = 0x5573903ecde0, lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {
        __prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, thread_id = 140494834314944, disconnecting = 0,
  during_target_migrate = 0, seamless_migrate = 0, num_migrated_channels = 0}

Comment 10 Dr. David Alan Gilbert 2019-11-08 19:50:17 UTC
Also triggered on current upstream. (41[34]'s /opt/bz1752320/  for ref)

Also noticed:
  Not all cases where the chardev_can_read warning happens cause the hang.
There's also a bunch of debug of the form:
warning: Spice: display:0 (0x557996221940): unexpected
Spice: usbredir:0 (0x55799621f9e0): unexpected
2019-11-08T19:45:23.064895Z qemu-system-x86_64: warning: Spice: main:0 (0x557996221890): unexpected

Comment 11 Dr. David Alan Gilbert 2019-12-17 15:28:29 UTC
I think the 'chardev_can_read called on non open chardev' could be due to usbredir_chardev_close_bh
destroying the parser before it removes the watch; but I'm not sure yet.

Comment 12 Dr. David Alan Gilbert 2019-12-17 15:32:57 UTC
The actual recursion seems similar to the one spice protects against with its commit:

commit 0c1f5b00e7907aefee13f86a234558f00cd6c7ef
Author: Uri Lublin <uril>
Date:   Mon Feb 2 12:35:59 2015 +0200

    char-device: spice_char_device_write_to_device: protect against recursion

which was added to handle a wakeup-based recursion - but we've only actually
gone through spice's write path once so far; the other side coming from red_channel_connect->spicevmc_connect
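
Judging from the commit subject (a sketch only - the actual spice source may differ, and the field name here is hypothetical), that protection is the usual re-entrancy latch:

/* Sketch of the guard pattern the spice commit describes. */
static void spice_char_device_write_to_device(SpiceCharDeviceState *dev)
{
    if (dev->during_write_to_device)    /* hypothetical field */
        return;                         /* already writing: refuse to recurse */
    dev->during_write_to_device = true;
    /* ... perform the actual write to the device ... */
    dev->during_write_to_device = false;
}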

Comment 13 Dr. David Alan Gilbert 2019-12-17 15:44:38 UTC
Some debug output; it seems to suggest that the 'chardev open' is happening after the post_load?
usbredir_post_load
vm_change_state_handler: running=1 state=9
DAG: qemu_spice_display_start entry spice_display_is_running: 0
warning: chardev_can_read called on non open chardev!

usbredir_vm_state_change running=1 state=9
2019-12-17T12:15:08.155275Z qemu-system-x86_64: warning: chardev_can_read called on non open chardev!

2019-12-17T12:15:08.155317Z qemu-system-x86_64: usb-redir: chardev open (top)

usbredir_chardev_close_bh: entry dev=0x55824617e150
usbredir_device_disconnect entry dev=0x55824617e150
2019-12-17T12:15:08.155339Z qemu-system-x86_64: usb-redir: removing 0 packet-ids from cancelled queue

2019-12-17T12:15:08.155348Z qemu-system-x86_64: usb-redir: removing 0 packet-ids from already-in-flight queue

usbredir_device_disconnect exit dev=0x55824617e150
usbredir_chardev_close_bh: exit dev=0x55824617e150
2019-12-17T12:15:08.155367Z qemu-system-x86_64: usb-redir: creating usbredirparser

usbredir_write: count=80 entry
usbredir_write: be open: 0x558244ee1c00 - charredir0
2019-12-17T12:15:08.155570Z qemu-system-x86_64: usbredirparser: Peer version: spice-gtk 0.37, using 64-bits ids
usbredir_hello
usbredir_write: count=80 entry
usbredir_write: be open: 0x558244ee1c00 - charredir0

Comment 14 Dr. David Alan Gilbert 2019-12-17 16:36:37 UTC
(In reply to Dr. David Alan Gilbert from comment #11)
> I think the 'chardev_can_read called on non open chardev' could be due to
> usbredir_chardev_close_bh
> destroying the parser before it removes the watch; but I'm not sure yet.

Flipping that around doesn't help fix either bug.

Comment 15 Dr. David Alan Gilbert 2019-12-17 20:18:11 UTC
I'm trying the following recursion guard; it looks like it's working but needs more testing.

diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index e0f5ca6f81..02e4ea594f 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -113,6 +113,7 @@ struct USBRedirDevice {
     /* Properties */
     CharBackend cs;
     bool enable_streams;
+    bool in_write;
     uint8_t debug;
     int32_t bootindex;
     char *filter_str;
@@ -290,6 +291,13 @@ static int usbredir_write(void *priv, uint8_t *data, int count)
         return 0;
     }
 
+    /* Recursion check */
+    if (dev->in_write) {
+        DPRINTF("usbredir_write recursion\n");
+        return 0;
+    }
+    dev->in_write=true;
+
     r = qemu_chr_fe_write(&dev->cs, data, count);
     if (r < count) {
         if (!dev->watch) {
@@ -300,6 +308,7 @@ static int usbredir_write(void *priv, uint8_t *data, int count)
             r = 0;
         }
     }
+    dev->in_write=false;
     return r;
 }
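
A note on the design (an inference, not stated above): these traces show usbredir_write being entered from the main-loop thread, so assuming all callers stay on that thread, a plain bool is enough as a guard and no atomics are needed. The nested call returns 0 ("nothing written"); presumably the parser keeps that data buffered and sends it on a later write, much as a short return from qemu_chr_fe_write is already handled via the watch.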

Comment 16 Dr. David Alan Gilbert 2019-12-18 11:22:11 UTC
Please try:
http://brew-task-repos.usersys.redhat.com/repos/scratch/dgilbert/qemu-kvm/4.2.0/4.el8.bz1752320a/

seems to fix it for me.

Comment 17 Dr. David Alan Gilbert 2019-12-18 11:30:42 UTC
Patch sent upstream:
  [PATCH] usbredir: Prevent recursion in usbredir_write

Comment 18 Han Han 2019-12-25 05:55:30 UTC
(In reply to Dr. David Alan Gilbert from comment #16)
> Please try:
> http://brew-task-repos.usersys.redhat.com/repos/scratch/dgilbert/qemu-kvm/4.
> 2.0/4.el8.bz1752320a/
> 
> seems to fix it for me.

Unfortunately, I can no longer reproduce the bug on libvirt-5.6.0-4.module+el8.1.0+4160+b50057dc.x86_64 qemu-kvm-4.1.0-9.module+el8.1.0+4210+23b2046a.x86_64. I ran the scripts for 30 minutes and more than 600 migration cycles without reproducing it.

Let me summarize the issues found by these scripts here. Maybe you can find something in common.

remote-viewer gets SIGABRT when connecting to a VM that is being migrated: https://bugzilla.redhat.com/show_bug.cgi?id=1746239#c4

remote-viewer gets SIGSEGV when connecting to a VM that is being migrated: https://bugzilla.redhat.com/show_bug.cgi?id=1746239#c0

qemu-kvm gets SIGABRT when setting up spice usb redirection during migration: https://bugzilla.redhat.com/show_bug.cgi?id=1786413#c0

qemu-kvm gets SIGSEGV when setting up spice usb redirection during migration: https://bugzilla.redhat.com/show_bug.cgi?id=1786413#c2

Comment 19 Han Han 2019-12-25 09:05:50 UTC
More updates: the bug is hard to reproduce, but it did appear sometimes after comment 18.
I will keep looking for a reliable way to reproduce it.

Comment 20 Dr. David Alan Gilbert 2020-01-02 15:43:59 UTC
(In reply to Han Han from comment #19)
> More updates: the bug is hard to reproduce, but it did appear sometimes
> after comment 18.
> I will keep looking for a reliable way to reproduce it.

Yes it's difficult - sometimes it repeats often; sometimes it hides.

I'd also seen the SIGSEGV in remote-viewer 1746239 - but they're very random; often in lots of different places.
I'd not seen either the abrt or the other qemu crashes.

Dave

Comment 21 Han Han 2020-01-03 03:00:19 UTC
(In reply to Dr. David Alan Gilbert from comment #16)
> Please try:
> http://brew-task-repos.usersys.redhat.com/repos/scratch/dgilbert/qemu-kvm/4.
> 2.0/4.el8.bz1752320a/
> 
> seems to fix it for me.

I have run the bug-reproducing scripts against your scratch build for over a day.
The bug did not reproduce.
Version:
libvirt-5.10.0-1.module+el8.2.0+5135+ed3b2489.x86_64
qemu-kvm-4.2.0-4.el8.bz1752320a.x86_64

Comment 22 Han Han 2020-01-03 03:45:22 UTC
Created attachment 1649328 [details]
segment fault in scratch build: L1 XMLs and backtrace

Hello, another segmentation fault was found on the scratch build during looped migrations:
Version: libvirt-5.10.0-1.module+el8.2.0+5135+ed3b2489.x86_64
qemu-kvm-4.2.0-4.el8.bz1752320a.x86_64

Setup:
Prepare two L1 hosts for migration, one with an interface bandwidth limit:
      <bandwidth>
        <inbound average='20480'/>
        <outbound average='20480'/>
      </bandwidth>

Steps: Follow steps 1-3 in comment 0

Results: Sometimes the following warnings pop up on the virsh migration command line:
Migration: [  0 %]error: internal error: qemu unexpectedly closed the monitor: 2020-01-03T03:07:12.112093Z qemu-kvm: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
2020-01-03T03:07:12.112109Z qemu-kvm: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]                                                                         
2020-01-03T03:07:12.114502Z qemu-kvm: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]                                                                          
2020-01-03T03:07:12.114513Z qemu-kvm: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]  

And then a segmentation fault occurs:
# abrt-cli ls
id 0d37d84a6d9cdc4dc9a8ecee86350ec983faf720
reason:         main_channel_client_is_low_bandwidth(): qemu-kvm killed by SIGSEGV
time:           Fri 03 Jan 2020 11:07:12 AM CST
cmdline:        /usr/libexec/qemu-kvm -name guest=nfs-8.2,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1112-nfs-8.2/master-key.aes -machine pc-q35-rhel8.2.0,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Client-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,arch-capabilities=on,xsaves=on,pdpe1gb=on,skip-l1dfl-vmentry=on,spec-ctrl=off,mpx=off -m 1024 -overcommit mem-lock=off -smp 2,sockets=2,cores=1,threads=1 -uuid e5717b24-74cc-4aa3-b2c3-e55cd2589f98 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=37,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-pci-bridge,id=pci.7,bus=pci.4,addr=0x0 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 -device qemu-xhci,id=usb1,bus=pci.2,addr=0x0 -device virtio-scsi-pci,id=scsi0,bus=pci.1,addr=0x0 -device ahci,id=sata1,bus=pci.7,addr=0x1 -blockdev '{\"driver\":\"gluster\",\"volume\":\"gv\",\"path\":\"nfs-8.2.qcow2\",\"server\":[{\"type\":\"inet\",\"host\":\"gls1.usersys.redhat.com\",\"port\":\"24007\"}],\"debug\":0,\"node-name\":\"libvirt-1-storage\",\"auto-read-only\":true,\"discard\":\"unmap\"}' -blockdev '{\"node-name\":\"libvirt-1-format\",\"read-only\":false,\"driver\":\"qcow2\",\"file\":\"libvirt-1-storage\",\"backing\":null}' -device virtio-blk-pci,scsi=off,bus=pci.5,addr=0x0,drive=libvirt-1-format,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb1.0,port=2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.3,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
package:        15:qemu-kvm-core-4.2.0-4.el8.bz1752320a
uid:            107 (qemu)
count:          1
Directory:      /var/spool/abrt/ccpp-2020-01-03-11:07:12-1404
Run 'abrt-cli report /var/spool/abrt/ccpp-2020-01-03-11:07:12-1404' for creating a case in Red Hat Customer Portal

Backtrace
#0  0x00007fc0cbc3d588 in main_channel_client_is_low_bandwidth (mcc=mcc@entry=0x556b771c02a0) at main-channel-client.c:656
#1  0x00007fc0cbc1f516 in dcc_config_socket (rcc=0x556b77648540) at dcc.c:1426
#2  0x00007fc0cbc440a3 in red_channel_client_config_socket (rcc=0x556b77648540) at red-channel-client.c:1046
#3  red_channel_client_initable_init (initable=<optimized out>, cancellable=<optimized out>, error=0x0) at red-channel-client.c:925
#4  0x00007fc0c6eed5df in g_initable_new_valist (object_type=<optimized out>, first_property_name=0x7fc0cbce2dc4 "channel", var_args=0x7fc0a4f90350, cancellable=0x0, error=0x0) at ginitable.c:248
#5  0x00007fc0c6eed68d in g_initable_new (object_type=<optimized out>, cancellable=cancellable@entry=0x0, error=error@entry=0x0, first_property_name=first_property_name@entry=0x7fc0cbce2dc4 "channel")
    at ginitable.c:162
#6  0x00007fc0cbc2015d in dcc_new (display=display@entry=0x556b7716d930, client=client@entry=0x556b77166dd0, stream=stream@entry=0x556b775d5570, mig_target=mig_target@entry=1, caps=caps@entry=0x556b785333a8, 
    image_compression=SPICE_IMAGE_COMPRESSION_AUTO_GLZ, jpeg_state=SPICE_WAN_COMPRESSION_AUTO, zlib_glz_state=SPICE_WAN_COMPRESSION_AUTO) at dcc.c:507
#7  0x00007fc0cbc2b8c6 in display_channel_connect (channel=<optimized out>, client=0x556b77166dd0, stream=0x556b775d5570, migration=1, caps=0x556b785333a8) at display-channel.c:2616
#8  0x00007fc0cbc4200b in handle_dispatcher_connect (opaque=<optimized out>, payload=0x556b78533390) at red-channel.c:511
#9  0x00007fc0cbc2747f in dispatcher_handle_single_read (dispatcher=0x556b785328a0) at dispatcher.c:287
#10 dispatcher_handle_recv_read (dispatcher=0x556b785328a0) at dispatcher.c:307
#11 0x00007fc0cbc2d79f in watch_func (source=<optimized out>, condition=<optimized out>, data=0x556b785333e0) at event-loop.c:119
#12 0x00007fc0ce27867d in g_main_dispatch (context=0x556b785332a0) at gmain.c:3176
#13 g_main_context_dispatch (context=context@entry=0x556b785332a0) at gmain.c:3829
#14 0x00007fc0ce278a48 in g_main_context_iterate (context=0x556b785332a0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3902
#15 0x00007fc0ce278d72 in g_main_loop_run (loop=0x556b77a59c60) at gmain.c:4098
#16 0x00007fc0cbc5b47b in red_worker_main (arg=0x556b785326c0) at red-worker.c:1139
#17 0x00007fc0c9d832de in start_thread (arg=<optimized out>) at pthread_create.c:486
#18 0x00007fc0c9ab4e83 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment 23 Dr. David Alan Gilbert 2020-01-03 09:12:03 UTC
(In reply to Han Han from comment #22)
> [... full quote of comment 22 snipped ...]

Please file that one as a separate bug against spice; I don't think it's related to my change (which is in the usb redirect code).
Please add a note here to say what the new bz is.

Comment 24 Han Han 2020-01-03 10:05:38 UTC
(In reply to Dr. David Alan Gilbert from comment #23)
> (In reply to Han Han from comment #22)
> > [... full quote of comment 22 snipped ...]
> 
> Please file that one as a separate bug against spice; I don't think it's
> related to my change (which is in the usb redirect code)
> Please add a note here to say what the new bz is.

Filed a new bug against spice: https://bugzilla.redhat.com/show_bug.cgi?id=1787536

Comment 25 Dr. David Alan Gilbert 2020-01-03 10:19:59 UTC
Thanks; please don't change the state of this bug - I've set it back to assigned.

Comment 27 Dr. David Alan Gilbert 2020-01-15 12:15:47 UTC
QE note: this also includes the fix for 1786414's initial problem, but I'll keep 1786414 open until we solve the other problem we found in the corresponding 1786413.

Comment 28 Danilo de Paula 2020-01-15 14:36:56 UTC
QA_ACK, please?

Comment 31 Ademar Reis 2020-02-05 23:06:14 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 32 Li Xiaohui 2020-03-10 13:08:07 UTC
OK, reproduced this bz on hosts (kernel-4.18.0-128.el8.x86_64 & qemu-img-4.1.0-9.module+el8.1.0+4210+23b2046a.x86_64) after testing as in comment 0,
doing ping-pong migration for about 30 min:
(1) got the following error in the qemu log on the src host:
************************************************************
warning: chardev_can_read called on non open chardev!

2020-03-10T12:46:08.191923Z qemu-kvm: warning: chardev_can_read called on non open chardev!
*******************************************************************************************

(2) couldn't connect to the guest via remote-viewer
(3) this command hangs: # virsh domstats nfs

Comment 33 Li Xiaohui 2020-03-11 03:24:57 UTC
Verified this bz on hosts (kernel-4.18.0-185.el8.x86_64 & qemu-img-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64 & libvirt-4.5.0-41.module+el8.2.0+5928+db9eea38.x86_64) as in comment 32, trying twice:

test steps:
(1) start the nfs guest via virsh command
Notes: change two places in nfs.xml:
a. on rhel8.2-non-av, use "pc-q35-rhel7.6.0" instead of "pc-q35-rhel8.0.0";
b. add "cache=none" to the system disk, since migration otherwise fails with:
error: Unsafe migration: Migration may lead to data corruption if disks use cache != none or cache != directsync
(2) on another client host (not src or dst), run migrate.sh & run.sh

Test results:
After waiting > 1 hour, the VM works well and migrates successfully.

Dave, I see some warnings in the qemu log during migration; could you help check whether this is an issue?
2020-03-11T02:51:04.775928Z qemu-kvm: warning: usb-redir connection broken during migration
(process:111953): Spice-WARNING **: 22:51:05.363: Connection reset by peer
(process:111953): Spice-WARNING **: 22:51:05.364: Connection reset by peer
2020-03-11 02:51:06.036+0000: initiating migration
2020-03-11 02:51:14.105+0000: shutting down, reason=migrated

Comment 34 Dr. David Alan Gilbert 2020-03-11 10:41:35 UTC
As long as it's working, and a normal migration test shows usb/spice still working after migration, I don't worry about those warnings much.

Comment 35 Li Xiaohui 2020-03-11 11:10:34 UTC
Yeah, let's close it as verified, since migration finishes successfully and the VM works well.

Comment 37 errata-xmlrpc 2020-04-28 15:32:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1587

