Bug 1787536 - Segmentation fault during low-bandwidth migration with remote-viewer connected
Summary: Segmentation fault during low-bandwidth migration with remote-viewer connected
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: spice
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 8.0
Assignee: Frediano Ziglio
QA Contact: SPICE QE bug list
URL:
Whiteboard:
Depends On: 1840240
Blocks:
 
Reported: 2020-01-03 10:02 UTC by Han Han
Modified: 2020-11-04 04:07 UTC
CC List: 7 users

Fixed In Version: spice-0.14.3-2.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-04 04:07:31 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
Scripts, XMLs, all threads backtrace (8.32 KB, application/gzip), 2020-01-03 10:02 UTC, Han Han


Links
Red Hat Product Errata RHEA-2020:4818, last updated 2020-11-04 04:07:52 UTC

Description Han Han 2020-01-03 10:02:28 UTC
Created attachment 1649381 [details]
Scripts, XMLs, all threads backtrace

Description of problem:
As subject

Version-Release number of selected component (if applicable):
qemu-kvm-4.2.0-4.el8.bz1752320a.x86_64
libvirt-5.10.0-1.module+el8.2.0+5135+ed3b2489.x86_64
spice-server-0.14.2-1.el8.x86_64

How reproducible:
10%

Steps to Reproduce:
1. Set up two L1 nested hosts on a physical machine, one with a 20 MB/s bandwidth limit:
      <bandwidth>
        <inbound average='20480'/>
        <outbound average='20480'/>
      </bandwidth>
2. Start a VM on one L1 host, migrate it to the other L1 host, then migrate it back, in a loop.

3. Connect to the SPICE console via remote-viewer, disconnect after a random number of seconds, then connect again, in a loop (a sketch of both loops follows).
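
The attached migrate.sh and run.sh are not reproduced here; a hypothetical sketch of loops along these lines would exercise the same race (the domain name and SPICE port are taken from the qemu command line below, the host names are placeholders):

# migrate.sh (sketch): ping-pong live migration between the two L1 hosts
while true; do
    virsh migrate --live --verbose nfs-8.2 qemu+ssh://l1-host2/system
    ssh l1-host2 virsh migrate --live --verbose nfs-8.2 qemu+ssh://l1-host1/system
done

# run.sh (sketch): repeatedly connect and disconnect the SPICE console
while true; do
    timeout $((RANDOM % 10 + 1)) remote-viewer spice://l1-host1:5900
    sleep 1
done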

Sometimes you will get a warning on the migration command line and a segmentation fault in spice:
Migration: [  0 %]error: internal error: qemu unexpectedly closed the monitor: 2020-01-03T03:07:12.112093Z qemu-kvm: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
2020-01-03T03:07:12.112109Z qemu-kvm: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]                                                                         
2020-01-03T03:07:12.114502Z qemu-kvm: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]                                                                          
2020-01-03T03:07:12.114513Z qemu-kvm: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]  

And then you get a segmentation fault:
# abrt-cli ls
id 0d37d84a6d9cdc4dc9a8ecee86350ec983faf720
reason:         main_channel_client_is_low_bandwidth(): qemu-kvm killed by SIGSEGV
time:           Fri 03 Jan 2020 11:07:12 AM CST
cmdline:        /usr/libexec/qemu-kvm -name guest=nfs-8.2,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1112-nfs-8.2/master-key.aes -machine pc-q35-rhel8.2.0,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Client-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,arch-capabilities=on,xsaves=on,pdpe1gb=on,skip-l1dfl-vmentry=on,spec-ctrl=off,mpx=off -m 1024 -overcommit mem-lock=off -smp 2,sockets=2,cores=1,threads=1 -uuid e5717b24-74cc-4aa3-b2c3-e55cd2589f98 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=37,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-pci-bridge,id=pci.7,bus=pci.4,addr=0x0 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 -device qemu-xhci,id=usb1,bus=pci.2,addr=0x0 -device virtio-scsi-pci,id=scsi0,bus=pci.1,addr=0x0 -device ahci,id=sata1,bus=pci.7,addr=0x1 -blockdev '{\"driver\":\"gluster\",\"volume\":\"gv\",\"path\":\"nfs-8.2.qcow2\",\"server\":[{\"type\":\"inet\",\"host\":\"gls1.usersys.redhat.com\",\"port\":\"24007\"}],\"debug\":0,\"node-name\":\"libvirt-1-storage\",\"auto-read-only\":true,\"discard\":\"unmap\"}' -blockdev '{\"node-name\":\"libvirt-1-format\",\"read-only\":false,\"driver\":\"qcow2\",\"file\":\"libvirt-1-storage\",\"backing\":null}' -device virtio-blk-pci,scsi=off,bus=pci.5,addr=0x0,drive=libvirt-1-format,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb1.0,port=2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.3,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
package:        15:qemu-kvm-core-4.2.0-4.el8.bz1752320a
uid:            107 (qemu)
count:          1
Directory:      /var/spool/abrt/ccpp-2020-01-03-11:07:12-1404
Run 'abrt-cli report /var/spool/abrt/ccpp-2020-01-03-11:07:12-1404' for creating a case in Red Hat Customer Portal

Core backtrace:
#0  0x00007fc0cbc3d588 in main_channel_client_is_low_bandwidth (mcc=mcc@entry=0x556b771c02a0) at main-channel-client.c:656
#1  0x00007fc0cbc1f516 in dcc_config_socket (rcc=0x556b77648540) at dcc.c:1426
#2  0x00007fc0cbc440a3 in red_channel_client_config_socket (rcc=0x556b77648540) at red-channel-client.c:1046
#3  red_channel_client_initable_init (initable=<optimized out>, cancellable=<optimized out>, error=0x0) at red-channel-client.c:925
#4  0x00007fc0c6eed5df in g_initable_new_valist (object_type=<optimized out>, first_property_name=0x7fc0cbce2dc4 "channel", var_args=0x7fc0a4f90350, cancellable=0x0, error=0x0) at ginitable.c:248
#5  0x00007fc0c6eed68d in g_initable_new (object_type=<optimized out>, cancellable=cancellable@entry=0x0, error=error@entry=0x0, first_property_name=first_property_name@entry=0x7fc0cbce2dc4 "channel")
    at ginitable.c:162
#6  0x00007fc0cbc2015d in dcc_new (display=display@entry=0x556b7716d930, client=client@entry=0x556b77166dd0, stream=stream@entry=0x556b775d5570, mig_target=mig_target@entry=1, caps=caps@entry=0x556b785333a8, 
    image_compression=SPICE_IMAGE_COMPRESSION_AUTO_GLZ, jpeg_state=SPICE_WAN_COMPRESSION_AUTO, zlib_glz_state=SPICE_WAN_COMPRESSION_AUTO) at dcc.c:507
#7  0x00007fc0cbc2b8c6 in display_channel_connect (channel=<optimized out>, client=0x556b77166dd0, stream=0x556b775d5570, migration=1, caps=0x556b785333a8) at display-channel.c:2616
#8  0x00007fc0cbc4200b in handle_dispatcher_connect (opaque=<optimized out>, payload=0x556b78533390) at red-channel.c:511
#9  0x00007fc0cbc2747f in dispatcher_handle_single_read (dispatcher=0x556b785328a0) at dispatcher.c:287
#10 dispatcher_handle_recv_read (dispatcher=0x556b785328a0) at dispatcher.c:307
#11 0x00007fc0cbc2d79f in watch_func (source=<optimized out>, condition=<optimized out>, data=0x556b785333e0) at event-loop.c:119
#12 0x00007fc0ce27867d in g_main_dispatch (context=0x556b785332a0) at gmain.c:3176
#13 g_main_context_dispatch (context=context@entry=0x556b785332a0) at gmain.c:3829
#14 0x00007fc0ce278a48 in g_main_context_iterate (context=0x556b785332a0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3902
#15 0x00007fc0ce278d72 in g_main_loop_run (loop=0x556b77a59c60) at gmain.c:4098
#16 0x00007fc0cbc5b47b in red_worker_main (arg=0x556b785326c0) at red-worker.c:1139
#17 0x00007fc0c9d832de in start_thread (arg=<optimized out>) at pthread_create.c:486
#18 0x00007fc0c9ab4e83 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Actual results:
As above

Expected results:
No segmentation fault

Additional info:
Attachments:
L1-host1.xml, L1-host2.xml: libvirt domain XML for the L1 hosts
migrate.sh: script for step 2
run.sh: script for step 3
gdb.txt: all-threads backtrace of the coredump

Comment 1 Dr. David Alan Gilbert 2020-01-03 10:24:05 UTC
The package Han is using here includes a fix for bz 1752320:
  https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03792.html

so it shouldn't be related to this one.
Han's scripts are a nice race-condition detector, repeatedly migrating and repeatedly reconnecting the viewer.

Comment 2 Frediano Ziglio 2020-01-21 15:55:25 UTC
Is it possible to have a core dump?

Comment 3 Han Han 2020-01-22 03:35:07 UTC
(In reply to Frediano Ziglio from comment #2)
> Is it possible to have a core dump?

I'm sorry, the coredump has been cleaned up. I need some time to reproduce it again.
You can refer to the attached all-threads backtrace first.

Comment 4 Frediano Ziglio 2020-01-22 08:37:29 UTC
Source code is:

int main_channel_client_is_low_bandwidth(MainChannelClient *mcc)
{
    // TODO: configurable?
    return mcc->priv->bitrate_per_sec < 10 * 1024 * 1024;
}

Which corresponds to:

   35580:       f3 0f 1e fa             endbr64 
   35584:       48 8b 47 20             mov    0x20(%rdi),%rax
   35588:       48 81 78 18 ff ff 9f    cmpq   $0x9fffff,0x18(%rax)    <==== fault
   3558f:       00 
   35590:       0f 96 c0                setbe  %al
   35593:       0f b6 c0                movzbl %al,%eax
   35596:       c3                      retq   

The fault happened while dereferencing the mcc->priv pointer: the mov loads mcc->priv into %rax, and the faulting cmpq reads priv->bitrate_per_sec at offset 0x18, comparing it against 10 * 1024 * 1024 - 1 = 0x9fffff. From the stack trace the mcc pointer looks valid, but it may be pointing to dangling data.
This could happen if the MainChannelClient is freed before the rest of the session, which was fixed upstream later; specifically, the patch is

commit 59be4f19c46cbeab0b8f405816b7bc4afe253187
Author: Frediano Ziglio <fziglio>
Date:   Thu Aug 24 21:43:18 2017 +0100

    red-client: Make sure MainChannelClient is freed as last
    
    MainChannelClient is used by other clients to store some data
    so should not disappear if other clients are still present.
    Keep a owning reference to it and release after RedClient is
    released.
    
    Signed-off-by: Frediano Ziglio <fziglio>
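
A minimal sketch of the ownership pattern that commit describes (plain C; the type and function names are hypothetical stand-ins, not the actual spice-server API): the session keeps its own reference to the main channel client and drops it only after everything else is torn down, so borrowed pointers like the one read in main_channel_client_is_low_bandwidth() stay valid.

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for MainChannelClient and RedClient. */
typedef struct {
    int refcount;
    long bitrate_per_sec;
} MainClient;

typedef struct {
    MainClient *main_client;  /* owning reference, released last */
} Session;

static MainClient *main_client_ref(MainClient *mc) { mc->refcount++; return mc; }

static void main_client_unref(MainClient *mc)
{
    if (--mc->refcount == 0)
        free(mc);
}

static void session_destroy(Session *s)
{
    /* ... tear down the other channel clients first ... */
    main_client_unref(s->main_client);  /* MCC goes away last, as in the commit */
    free(s);
}

int main(void)
{
    MainClient *mc = calloc(1, sizeof(*mc));
    mc->refcount = 1;
    mc->bitrate_per_sec = 5 * 1024 * 1024;

    Session *s = calloc(1, sizeof(*s));
    s->main_client = main_client_ref(mc);  /* session takes an owning ref */

    main_client_unref(mc);  /* creator drops its ref; mc stays valid */

    /* Without the session's owning reference, this read would be the
       use-after-free seen in main_channel_client_is_low_bandwidth(). */
    printf("low bandwidth: %d\n", mc->bitrate_per_sec < 10 * 1024 * 1024);

    session_destroy(s);
    return 0;
}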

Comment 15 errata-xmlrpc 2020-11-04 04:07:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (spice bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4818

