Bug 1752320
Summary: | vm gets stuck when migrate vm back and forth with remote-viewer trying to connect | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | Han Han <hhan>
Component: | qemu-kvm | Assignee: | Dr. David Alan Gilbert <dgilbert>
qemu-kvm sub component: | General | QA Contact: | Li Xiaohui <xiaohli>
Status: | CLOSED ERRATA | Docs Contact: |
Severity: | unspecified | |
Priority: | medium | CC: | ddepaula, dgilbert, fjin, jinzhao, juzhang, rbalakri, virt-maint, xiaohli, yafu, zhguo
Version: | 8.1 | |
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | qemu-kvm-2.12.0-97.module+el8.2.0+5545+14c6799f | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2020-04-28 15:32:34 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
The last message on (I think) the destination side is:

2019-09-11T02:02:26.818516Z qemu-kvm: warning: Spice: Connection reset by peer
2019-09-11T02:02:26.818713Z qemu-kvm: warning: Spice: Connection reset by peer
2019-09-11T02:02:26.818741Z qemu-kvm: warning: Spice: Connection reset by peer
warning: chardev_can_read called on non open chardev!
2019-09-11T02:02:30.213079Z qemu-kvm: warning: chardev_can_read called on non open chardev!
2019-09-11 02:18:12.297+0000: shutting down, reason=destroyed

and that seems to be after the source thinks the migration has completed. I've not seen that 'chardev_can_read' warning before - that's odd! The message comes from usbredir, hw/usb/redirect.c:

static int usbredir_chardev_can_read(void *opaque)
{
    USBRedirDevice *dev = opaque;

    if (!dev->parser) {
        WARNING("chardev_can_read called on non open chardev!\n");
        return 0;
    }

so I guess there's some ordering problem when it's disconnected.

I have tried to reproduce via libvirt on the latest host but didn't hit this issue after 20 hours. I only tried once; I will try again later:
1. migrate.sh & run.sh still work well
2. "virsh domstats nfs" & "virsh qemu-monitor-command nfs --hmp info nfs" can get the guest status

Hi Han Han,
When you met this issue, how long did you run migrate.sh & run.sh?

(In reply to Li Xiaohui from comment #5)
> Hi Han han,
> When you met this issue, how long did you run migrate.sh & run.sh ?

I usually reproduce it with fewer than 100 executions of remote-viewer, within half an hour. Could you please provide your ENV and let me have a look?

Reproduced here using your scripts; took about 10 migrations.

warning: chardev_can_read called on non open chardev!
2019-11-08T10:55:01.144742Z qemu-kvm: warning: chardev_can_read called on non open chardev!

As well as the warning about the non-open chardev (which comes from usbredir), the hang comes on the write side of things, as we recursively try to do a write on the same socket:

#0 0x00007fc77b4a68dd in __lll_lock_wait () from target:/lib64/libpthread.so.0
#1 0x00007fc77b49faf9 in pthread_mutex_lock () from target:/lib64/libpthread.so.0
#2 0x000055738d7774fd in qemu_mutex_lock_impl (mutex=0x55738f0b18e8, file=0x55738d90b78d "chardev/char.c", line=111) at util/qemu-thread-posix.c:66
#3 0x000055738d70cc03 in qemu_chr_write_buffer (s=s@entry=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=80, offset=offset@entry=0x7ffef54237d0, write_all=false) at chardev/char.c:111

(gdb) p *s
$2 = {parent_obj = {class = 0x55738f053c60, free = 0x7fc77ff8a380 <g_free>, properties = 0x55738f0aec60, ref = 1, parent = 0x55738f0a98f0}, chr_write_lock = {lock = {__data = {__lock = 2, __count = 0, __owner = 25470, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000~c\000\000\001", '\000' <repeats 26 times>, __align = 2}, initialized = true}, be = 0x55739046dcf0, label = 0x55738f0a9cb0 "charredir0", filename = 0x55738f0abc60 "spicevmc", logfd = -1, be_open = 1, gsource = 0x0, gcontext = 0x0, features = {0}}

#4 0x000055738d70cf13 in qemu_chr_write (s=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80, write_all=write_all@entry=false) at chardev/char.c:149
#5 0x000055738d70efe3 in qemu_chr_fe_write (be=be@entry=0x55739046dcf0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80) at chardev/char-fe.c:42
#6 0x000055738d62836c in usbredir_write (priv=0x55739046c640, data=0x55738fe2db80 "", count=80) at hw/usb/redirect.c:289
#7 0x00007fc77fd3241b in usbredirparser_do_write () from target:/lib64/libusbredirparser.so.1
#8 0x000055738d56c371 in vmc_write (sin=<optimized out>, buf=<optimized out>, len=<optimized out>) at chardev/spice.c:34
#9 0x00007fc77d33380c in red_char_device_write_to_device (dev=0x55738f0ed960) at char-device.c:492
#10 red_char_device_write_to_device (dev=0x55738f0ed960) at char-device.c:458
#11 0x00007fc77d3341bd in red_char_device_wakeup (dev=0x55738f0ed960) at char-device.c:854
#12 0x00007fc77d36b645 in spice_server_char_device_wakeup (sin=sin@entry=0x55738f0b1950) at reds.c:3264
#13 0x000055738d56ca64 in spice_chr_write (chr=0x55738f0b18c0, buf=0x55738fe2db80 "", len=80) at chardev/spice.c:202
#14 0x000055738d70cc50 in qemu_chr_write_buffer (s=s@entry=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=80, offset=offset@entry=0x7ffef5423970, write_all=false) at chardev/char.c:114
#15 0x000055738d70cf13 in qemu_chr_write (s=0x55738f0b18c0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80, write_all=write_all@entry=false) at chardev/char.c:149
#16 0x000055738d70efe3 in qemu_chr_fe_write (be=be@entry=0x55739046dcf0, buf=buf@entry=0x55738fe2db80 "", len=len@entry=80) at chardev/char-fe.c:42
#17 0x000055738d62836c in usbredir_write (priv=0x55739046c640, data=0x55738fe2db80 "", count=80) at hw/usb/redirect.c:289
#18 0x00007fc77fd3241b in usbredirparser_do_write () from target:/lib64/libusbredirparser.so.1
#19 0x000055738d627cc8 in usbredir_create_parser (dev=0x55739046c640) at hw/usb/redirect.c:1235
#20 0x00007fc77d378452 in spicevmc_connect (channel=<optimized out>, client=0x55738f519920, stream=<optimized out>, migration=<optimized out>, caps=<optimized out>) at spicevmc.c:793
#21 0x00007fc77d359c85 in red_channel_connect (channel=channel@entry=0x55738f0ea1c0, client=client@entry=0x55738f519920, stream=stream@entry=0x55739049fb30, migration=0, caps=caps@entry=0x7ffef5423b30) at red-channel.c:523
#22 0x00007fc77d367633 in reds_channel_do_link (channel=channel@entry=0x55738f0ea1c0, client=client@entry=0x55738f519920, link_msg=link_msg@entry=0x55738f080750, stream=0x55739049fb30) at reds.c:2039
#23 0x00007fc77d36dc00 in reds_handle_other_links (link=0x55739049fa50, reds=0x55738f0c1920) at reds.c:2180
#24 reds_handle_link (link=link@entry=0x55739049fa50) at reds.c:2196
#25 0x00007fc77d36e24f in reds_handle_ticket (opaque=0x55739049fa50) at reds.c:2250
#26 0x000055738d774882 in aio_dispatch_handlers (ctx=ctx@entry=0x55738f01baf0) at util/aio-posix.c:429
#27 0x000055738d77522c in aio_dispatch (ctx=0x55738f01baf0) at util/aio-posix.c:460
#28 0x000055738d771cc2 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
#29 0x00007fc77ff8472d in g_main_context_dispatch () from target:/lib64/libglib-2.0.so.0
#30 0x000055738d7742d8 in glib_pollfds_poll () at util/main-loop.c:218
#31 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:241
#32 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:517
#33 0x000055738d55d3e9 in main_loop () at vl.c:1809

and we block on s->chr_write_lock in qemu_chr_write_buffer. This means we're going through qemu_chr_write_buffer twice on the same chardev, which shouldn't happen.

I'm fairly sure that aio-based call to reds_handle_other_links comes via:
  async reds_handle_ticket
  probably from reds_get_spice_ticket (that's got a couple of routes)
  probably from reds_handle_auth_mechanism?
  from reds_handle_read_link_done
  from reds_handle_read_header_done
  from reds_handle_read_magic_done
  from reds_handle_new_link
  from reds_handle_ssl_accept or reds_init_client_ssl_connection or spice_server_add_client or reds_handle_read_magic_done (websocket)

i.e. when the remote-viewer connects. It's not clear what the migration interaction is yet.

#23 0x00007fc77d36dc00 in reds_handle_other_links (link=0x55739049fa50, reds=0x55738f0c1920) at reds.c:2180
reds->dst_do_seamless_migrate = 0

(gdb) p *client
$8 = {parent = {g_type_instance = {g_class = 0x55738f53ab20}, ref_count = 1, qdata = 0x0}, reds = 0x55738f0c1920, channels = 0x55739047ab20, mcc = 0x5573903ecde0, lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, thread_id = 140494834314944, disconnecting = 0, during_target_migrate = 0, seamless_migrate = 0, num_migrated_channels = 0}

Also triggered on current upstream. (41[34]'s /opt/bz1752320/ for ref)

Also noticed: not all cases where the chardev_can_read warning happens cause the hang. There's also a bunch of debug of the form:

warning: Spice: display:0 (0x557996221940): unexpected
Spice: usbredir:0 (0x55799621f9e0): unexpected
2019-11-08T19:45:23.064895Z qemu-system-x86_64: warning: Spice: main:0 (0x557996221890): unexpected

I think the 'chardev_can_read called on non open chardev' could be due to usbredir_chardev_close_bh destroying the parser before it removes the watch; but I'm not sure yet.

The actual recursion seems similar to the one spice protects against with its commit:

commit 0c1f5b00e7907aefee13f86a234558f00cd6c7ef
Author: Uri Lublin <uril>
Date:   Mon Feb 2 12:35:59 2015 +0200

    char-device: spice_char_device_write_to_device: protect against recursion

which was added to handle a wakeup-based recursion - but we've only actually gone through spice's write path once so far; the other side coming from red_channel_connect->spicevmc_connect.

Some debug; it seems to suggest that the 'chardev open' is happening after the post_load?

usbredir_post_load
vm_change_state_handler: running=1 state=9
DAG: qemu_spice_display_start entry
spice_display_is_running: 0
warning: chardev_can_read called on non open chardev!
usbredir_vm_state_change running=1 state=9
2019-12-17T12:15:08.155275Z qemu-system-x86_64: warning: chardev_can_read called on non open chardev!
2019-12-17T12:15:08.155317Z qemu-system-x86_64: usb-redir: chardev open (top)
usbredir_chardev_close_bh: entry dev=0x55824617e150
usbredir_device_disconnect entry dev=0x55824617e150
2019-12-17T12:15:08.155339Z qemu-system-x86_64: usb-redir: removing 0 packet-ids from cancelled queue
2019-12-17T12:15:08.155348Z qemu-system-x86_64: usb-redir: removing 0 packet-ids from already-in-flight queue
usbredir_device_disconnect exit dev=0x55824617e150
usbredir_chardev_close_bh: exit dev=0x55824617e150
2019-12-17T12:15:08.155367Z qemu-system-x86_64: usb-redir: creating usbredirparser
usbredir_write: count=80 entry
usbredir_write: be open: 0x558244ee1c00 - charredir0
2019-12-17T12:15:08.155570Z qemu-system-x86_64: usbredirparser: Peer version: spice-gtk 0.37, using 64-bits ids
usbredir_hello
usbredir_write: count=80 entry
usbredir_write: be open: 0x558244ee1c00 - charredir0

(In reply to Dr. David Alan Gilbert from comment #11)
> I think the 'chardev_can_read called on non open chardev' could be due to
> usbredir_chardev_close_bh
> destroying the parser before it removes the watch; but I'm not sure yet.
Flipping that around doesn't help fix either bug.

I'm trying the following recursion guard; it looks like it's working but needs more testing.

diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index e0f5ca6f81..02e4ea594f 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -113,6 +113,7 @@ struct USBRedirDevice {
     /* Properties */
     CharBackend cs;
     bool enable_streams;
+    bool in_write;
     uint8_t debug;
     int32_t bootindex;
     char *filter_str;
@@ -290,6 +291,13 @@ static int usbredir_write(void *priv, uint8_t *data, int count)
         return 0;
     }
 
+    /* Recursion check */
+    if (dev->in_write) {
+        DPRINTF("usbredir_write recursion\n");
+        return 0;
+    }
+    dev->in_write=true;
+
     r = qemu_chr_fe_write(&dev->cs, data, count);
     if (r < count) {
         if (!dev->watch) {
@@ -300,6 +308,7 @@ static int usbredir_write(void *priv, uint8_t *data, int count)
             r = 0;
         }
     }
+    dev->in_write=false;
 
     return r;
 }

Please try:
http://brew-task-repos.usersys.redhat.com/repos/scratch/dgilbert/qemu-kvm/4.2.0/4.el8.bz1752320a/

It seems to fix it for me.

Patch sent upstream: [PATCH] usbredir: Prevent recursion in usbredir_write

(In reply to Dr. David Alan Gilbert from comment #16)
> Please try:
> http://brew-task-repos.usersys.redhat.com/repos/scratch/dgilbert/qemu-kvm/4.2.0/4.el8.bz1752320a/
>
> seems to fix it for me.

Unfortunately, I cannot reproduce the bug anymore on libvirt-5.6.0-4.module+el8.1.0+4160+b50057dc.x86_64 qemu-kvm-4.1.0-9.module+el8.1.0+4210+23b2046a.x86_64. I have run the scripts for 30 minutes and more than 600 migration cycles without reproducing it.

I can summarize the issues found by these scripts here; maybe you can find something in common:
- remote-viewer gets SIGABRT when connecting to a VM that is being migrated: https://bugzilla.redhat.com/show_bug.cgi?id=1746239#c4
- remote-viewer gets SIGSEGV when connecting to a VM that is being migrated: https://bugzilla.redhat.com/show_bug.cgi?id=1746239#c0
- qemu-kvm gets SIGABRT when setting spice usb redirection during migration: https://bugzilla.redhat.com/show_bug.cgi?id=1786413#c0
- qemu-kvm gets SIGSEGV when setting spice usb redirection during migration: https://bugzilla.redhat.com/show_bug.cgi?id=1786413#c2

More update: the bug is hard to reproduce, but it did appear sometimes after comment 18. I will keep looking for a reliable way to reproduce it.

(In reply to Han Han from comment #19)
> More update: the bug is hard to reproduce, but it did appear sometimes after comment 18.
> [...]

Yes, it's difficult - sometimes it repeats often; sometimes it hides. I'd also seen the SIGSEGV in remote-viewer (1746239) - but they're very random, often in lots of different places. I'd not seen either the abrt or the other qemu crashes.

Dave

(In reply to Dr. David Alan Gilbert from comment #16)
> Please try:
> http://brew-task-repos.usersys.redhat.com/repos/scratch/dgilbert/qemu-kvm/4.2.0/4.el8.bz1752320a/
>
> seems to fix it for me.

For your scratch build, I have run the bug reproducing scripts for over one day. The bug is not reproduced.
Version:
libvirt-5.10.0-1.module+el8.2.0+5135+ed3b2489.x86_64
qemu-kvm-4.2.0-4.el8.bz1752320a.x86_64

Created attachment 1649328 [details]
segment fault in scratch build: L1 XMLs and backtrace

Hello, another segmentation fault was found on the scratch build during loop migrations:

Version:
libvirt-5.10.0-1.module+el8.2.0+5135+ed3b2489.x86_64
qemu-kvm-4.2.0-4.el8.bz1752320a.x86_64

Setup:
Prepare 2 L1 hosts for migration, one with an interface bandwidth limit:
<bandwidth>
  <inbound average='20480'/>
  <outbound average='20480'/>
</bandwidth>

Steps: Follow steps 1~3 of comment 0.

Results:
Sometimes the following warnings pop up on the virsh migration command line:

Migration: [  0 %]error: internal error: qemu unexpectedly closed the monitor: 2020-01-03T03:07:12.112093Z qemu-kvm: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
2020-01-03T03:07:12.112109Z qemu-kvm: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
2020-01-03T03:07:12.114502Z qemu-kvm: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
2020-01-03T03:07:12.114513Z qemu-kvm: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]

And then qemu-kvm gets a segmentation fault:

# abrt-cli ls
id 0d37d84a6d9cdc4dc9a8ecee86350ec983faf720
reason:         main_channel_client_is_low_bandwidth(): qemu-kvm killed by SIGSEGV
time:           Fri 03 Jan 2020 11:07:12 AM CST
cmdline:        /usr/libexec/qemu-kvm -name guest=nfs-8.2,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1112-nfs-8.2/master-key.aes -machine pc-q35-rhel8.2.0,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Client-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,arch-capabilities=on,xsaves=on,pdpe1gb=on,skip-l1dfl-vmentry=on,spec-ctrl=off,mpx=off -m 1024 -overcommit mem-lock=off -smp 2,sockets=2,cores=1,threads=1 -uuid e5717b24-74cc-4aa3-b2c3-e55cd2589f98 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=37,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-pci-bridge,id=pci.7,bus=pci.4,addr=0x0 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 -device qemu-xhci,id=usb1,bus=pci.2,addr=0x0 -device virtio-scsi-pci,id=scsi0,bus=pci.1,addr=0x0 -device ahci,id=sata1,bus=pci.7,addr=0x1 -blockdev '{\"driver\":\"gluster\",\"volume\":\"gv\",\"path\":\"nfs-8.2.qcow2\",\"server\":[{\"type\":\"inet\",\"host\":\"gls1.usersys.redhat.com\",\"port\":\"24007\"}],\"debug\":0,\"node-name\":\"libvirt-1-storage\",\"auto-read-only\":true,\"discard\":\"unmap\"}' -blockdev '{\"node-name\":\"libvirt-1-format\",\"read-only\":false,\"driver\":\"qcow2\",\"file\":\"libvirt-1-storage\",\"backing\":null}' -device virtio-blk-pci,scsi=off,bus=pci.5,addr=0x0,drive=libvirt-1-format,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb1.0,port=2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.3,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
package:        15:qemu-kvm-core-4.2.0-4.el8.bz1752320a
uid:            107 (qemu)
count:          1
Directory:      /var/spool/abrt/ccpp-2020-01-03-11:07:12-1404
Run 'abrt-cli report /var/spool/abrt/ccpp-2020-01-03-11:07:12-1404' for creating a case in Red Hat Customer Portal

Backtrace:
#0 0x00007fc0cbc3d588 in main_channel_client_is_low_bandwidth (mcc=mcc@entry=0x556b771c02a0) at main-channel-client.c:656
#1 0x00007fc0cbc1f516 in dcc_config_socket (rcc=0x556b77648540) at dcc.c:1426
#2 0x00007fc0cbc440a3 in red_channel_client_config_socket (rcc=0x556b77648540) at red-channel-client.c:1046
#3 red_channel_client_initable_init (initable=<optimized out>, cancellable=<optimized out>, error=0x0) at red-channel-client.c:925
#4 0x00007fc0c6eed5df in g_initable_new_valist (object_type=<optimized out>, first_property_name=0x7fc0cbce2dc4 "channel", var_args=0x7fc0a4f90350, cancellable=0x0, error=0x0) at ginitable.c:248
#5 0x00007fc0c6eed68d in g_initable_new (object_type=<optimized out>, cancellable=cancellable@entry=0x0, error=error@entry=0x0, first_property_name=first_property_name@entry=0x7fc0cbce2dc4 "channel") at ginitable.c:162
#6 0x00007fc0cbc2015d in dcc_new (display=display@entry=0x556b7716d930, client=client@entry=0x556b77166dd0, stream=stream@entry=0x556b775d5570, mig_target=mig_target@entry=1, caps=caps@entry=0x556b785333a8, image_compression=SPICE_IMAGE_COMPRESSION_AUTO_GLZ, jpeg_state=SPICE_WAN_COMPRESSION_AUTO, zlib_glz_state=SPICE_WAN_COMPRESSION_AUTO) at dcc.c:507
#7 0x00007fc0cbc2b8c6 in display_channel_connect (channel=<optimized out>, client=0x556b77166dd0, stream=0x556b775d5570, migration=1, caps=0x556b785333a8) at display-channel.c:2616
#8 0x00007fc0cbc4200b in handle_dispatcher_connect (opaque=<optimized out>, payload=0x556b78533390) at red-channel.c:511
#9 0x00007fc0cbc2747f in dispatcher_handle_single_read (dispatcher=0x556b785328a0) at dispatcher.c:287
#10 dispatcher_handle_recv_read (dispatcher=0x556b785328a0) at dispatcher.c:307
#11 0x00007fc0cbc2d79f in watch_func (source=<optimized out>, condition=<optimized out>, data=0x556b785333e0) at event-loop.c:119
#12 0x00007fc0ce27867d in g_main_dispatch (context=0x556b785332a0) at gmain.c:3176
#13 g_main_context_dispatch (context=context@entry=0x556b785332a0) at gmain.c:3829
#14 0x00007fc0ce278a48 in g_main_context_iterate (context=0x556b785332a0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3902
#15 0x00007fc0ce278d72 in g_main_loop_run (loop=0x556b77a59c60) at gmain.c:4098
#16 0x00007fc0cbc5b47b in red_worker_main (arg=0x556b785326c0) at red-worker.c:1139
#17 0x00007fc0c9d832de in start_thread (arg=<optimized out>) at pthread_create.c:486
#18 0x00007fc0c9ab4e83 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(In reply to Han Han from comment #22)
> Created attachment 1649328 [details]
> segment fault in scratch build: L1 XMLs and backtrace
>
> Hello, another segmentation fault was found on the scratch build during loop migrations:
> [...]

Please file that one as a separate bug against spice; I don't think it's related to my change (which is in the usb redirect code). Please add a note here to say what the new bz is.

(In reply to Dr. David Alan Gilbert from comment #23)
> Please file that one as a separate bug against spice; I don't think it's
> related to my change (which is in the usb redirect code)
> Please add a note here to say what the new bz is.

Filed a new bug against spice: https://bugzilla.redhat.com/show_bug.cgi?id=1787536

Thanks; please don't change the state of this bug - I've set it back to assigned.

QE note: this also has the fix for 1786414's initial problem in it, but I'll keep 1786414 open until we solve the other problem we found on the corresponding 1786413. QA_ACK, please?

QEMU has been recently split into sub-components and, as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks.

OK, reproduced this bz on hosts (kernel-4.18.0-128.el8.x86_64 & qemu-img-4.1.0-9.module+el8.1.0+4210+23b2046a.x86_64) after testing as in Comment 0, doing ping-pong migration for about 30 minutes:
(1) errors in the qemu log on the src host:
warning: chardev_can_read called on non open chardev!
2020-03-10T12:46:08.191923Z qemu-kvm: warning: chardev_can_read called on non open chardev!
(2) couldn't connect to the guest via remote-viewer
(3) command hang:
# virsh domstats nfs
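Not part of the report, but for an unattended reproduction loop the hang in (3) can be noticed by bounding the stats query with a timeout; a minimal sketch, assuming a guest named nfs and that 30 seconds is long enough for a healthy monitor to answer:

# hypothetical hang check, not taken from the attached scripts
if ! timeout 30 virsh domstats nfs > /dev/null 2>&1; then
    echo "virsh domstats nfs did not return within 30s - monitor looks stuck"
fi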
Verified this bz on hosts (kernel-4.18.0-185.el8.x86_64 & qemu-img-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64 & libvirt-4.5.0-41.module+el8.2.0+5928+db9eea38.x86_64) as in Comment 32, trying two times.

Test steps:
(1) start the nfs guest via virsh command
Notes: change two places in nfs.xml:
a. on rhel8.2-non-av, use "pc-q35-rhel7.6.0" instead of "pc-q35-rhel8.0.0";
b. add "cache=none" to the system disk, since migration otherwise fails with: error: Unsafe migration: Migration may lead to data corruption if disks use cache != none or cache != directsync
(2) on another client host (not src or dst), run migrate.sh & run.sh

Test results: waited > 1 hour; the vm works well and migrates successfully.

Dave, I get some warnings from the qemu log during migration, could you help see whether it's an issue?
2020-03-11T02:51:04.775928Z qemu-kvm: warning: usb-redir connection broken during migration
(process:111953): Spice-WARNING **: 22:51:05.363: Connection reset by peer
(process:111953): Spice-WARNING **: 22:51:05.364: Connection reset by peer
2020-03-11 02:51:06.036+0000: initiating migration
2020-03-11 02:51:14.105+0000: shutting down, reason=migrated

As long as it's working, and a normal migration test shows usb/spice still working after migration, I don't worry about those warnings much.

Yeah, let's close it verified, as migration finishes successfully and the vm works well.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1587

Created attachment 1615431 [details]
The vm log, libvirtd log of src and dst host; scripts for reproducing

Description of problem:
As subject.

Version-Release number of selected component (if applicable):
Src and dst host:
libvirt-5.6.0-4.module+el8.1.0+4160+b50057dc.x86_64
qemu-kvm-4.1.0-9.module+el8.1.0+4210+23b2046a.x86_64
Desktop:
virt-viewer-7.0-8.el8.x86_64

How reproducible:
30%

Steps to Reproduce:
Preparation:
1. Make sure the dst and src hostnames can be resolved
2. Open ports in firewalld on the src and dst hosts: 5900-6000/tcp 49152-49252/tcp

Steps:
1. Prepare a running vm named nfs
2. Run migrate.sh to migrate the vm back and forth
3. Run run.sh to connect to the vm continuously with remote-viewer, until migrate.sh no longer reports any migration progress; that means migration cannot be started. (A sketch of what such loops can look like is given after the Expected results.)
4. Log in to the host that remote-viewer connected to and try to get the stats of the vm:
# virsh domstats nfs
The command will get stuck, and `virsh qemu-monitor-command nfs --hmp info nfs` will time out, too.

Actual results:
As above

Expected results:
No stuck
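The actual migrate.sh and run.sh are in the attachment and are not reproduced here. The following is only a minimal sketch of what such a ping-pong migration plus reconnect loop can look like; the host names src.example.com / dst.example.com and the spice port 5900 are assumptions, not values taken from the attachment:

#!/bin/bash
# migrate.sh (sketch): ping-pong migrate the guest between the two hosts forever.
# One-time preparation on both hosts (from the description above):
#   firewall-cmd --permanent --add-port=5900-6000/tcp
#   firewall-cmd --permanent --add-port=49152-49252/tcp
#   firewall-cmd --reload
SRC=src.example.com   # assumption: source host
DST=dst.example.com   # assumption: destination host
GUEST=nfs
while true; do
    virsh -c qemu+ssh://$SRC/system migrate --live --verbose $GUEST qemu+ssh://$DST/system
    virsh -c qemu+ssh://$DST/system migrate --live --verbose $GUEST qemu+ssh://$SRC/system
done

#!/bin/bash
# run.sh (sketch): keep trying to connect with remote-viewer while the migration loop runs.
SRC=src.example.com
DST=dst.example.com
while true; do
    # the guest is only ever on one side, so one of the two connections simply fails
    timeout 10 remote-viewer spice://$SRC:5900 &
    timeout 10 remote-viewer spice://$DST:5900 &
    sleep 10
done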