Bug 1792576

Summary: virt-manager crashes: python3 xcb_io.c:263: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.
Product: [Community] Virtualization Tools
Reporter: Brad G <hai7neeb9aik3aiy>
Component: virt-manager
Assignee: Cole Robinson <crobinso>
Status: CLOSED DUPLICATE
Severity: medium
Priority: unspecified
Version: unspecified
CC: berrange, crobinso, fziglio, gscrivan, tburke
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2020-01-29 09:11:16 UTC
Type: Bug
Attachments: virt-manager crashlog 1 (flags: none)

Description Brad G 2020-01-18 03:35:14 UTC
For the last year or two I have been seeing chronic intermittent crashes in virt-manager. It happens anywhere from 0-5 times a day during my regular daily usage, which is about four hours M-F. I don't know exactly what causes it, but interactivity such as clicking on an object that opens a window, UI box, or just about anything else will trigger the crash. It crashes on the down-click event, not the up-click event, as evidenced by the state of the VM when I restart virt-manager. It's difficult to reliably reproduce but if I keep using a VM long enough it will happen.

My environment is Debian unstable, KDE/plasma. I am currently on virt-manager version 2.2.1, Debian package version 1:2.2.1-3. This problem happens with both Windows and Linux VMs.

If I remember right, this problem first started with version 2.0 and that may have been as far back as two years ago. Prior to 2.0, this crash did not occur.

There is no error message. virt-manager and all of its windows simply close. When I run virt-manager with --debug, I get this interesting error:

[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
python3: ../../src/xcb_io.c:263: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.
Aborted

I previously reported this bug to Debian here, but Debian is dead and nobody cares:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=943708

I found this similar bug, but it has very limited info and there was no follow-up:
https://bugs.launchpad.net/ubuntu/+source/virt-manager/+bug/867575

Comment 1 Cole Robinson 2020-01-20 23:02:09 UTC
Thanks for the report and collecting that info. We've had trickling crash reports in Fedora for the same time period. I've duped a bunch of them to this one bug: https://bugzilla.redhat.com/show_bug.cgi?id=1756065

Quoting from that bug:

(In reply to Cole Robinson from comment #18)
> Sorry, I haven't been able to reproduce so it's tough to fix. The bugs
> getting closed is from Fedoras end-of-life script, it's not really a human
> doing it.
> 
> For those that can reliably reproduce, a few things that will help:
> 
> * Are any of your VMs using spice GL?
> * This reproduces on fully up to date fedora31?
> * Is there any reliable steps to reproduce?
> * Is python3-libguestfs installed? Is 'libguest VM introspection' enabled in
> Edit->Preferences ?
> * Are you using the system tray icon?
> 
> Also if anyone can reliably reproduce, please attach
> ~/.cache/virt-manager/virt-manager.log from after you hit the crash, and
> make a note of the time when the app crashes

If you can provide that info, it will be helpful, minus the fedora31 piece. Some other bits:

* When this crashes, is there always a VM actively running with a VNC/SPICE console open? Or previously a VM with a console open during the app runtime?
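
For the log request quoted above, a small helper along these lines could capture the log together with the crash time. This is only a sketch: the function name save_vm_log and the output file name are made up for illustration, and the default path is the one mentioned in the quoted comment.

```shell
# save_vm_log [LOGFILE]: copy the virt-manager log to a file named after the
# current UTC time, so the crash time is recorded along with the log.
# Run it right after the crash. The function name and output name are illustrative.
save_vm_log() {
    log="${1:-$HOME/.cache/virt-manager/virt-manager.log}"
    stamp="$(date -u +%Y%m%dT%H%M%SZ)"
    cp "$log" "virt-manager-crash-$stamp.log" && echo "virt-manager-crash-$stamp.log"
}
```

Calling save_vm_log with no argument right after a crash leaves a timestamped copy in the current directory, ready to attach.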

Comment 2 Brad G 2020-01-21 03:04:26 UTC
> * Are any of your VMs using spice GL?

No.

> * Is there any reliable steps to reproduce?

Depends on how reliable you want reliable to be. I just spent about five minutes trying to make virt-manager crash and was three seconds away from giving up when I was finally able to make it happen after right-clicking on the desktop, causing the context menu to open. I had been using one of my VMs for about two hours earlier today and didn't have any problems at that time.

What causes the crash to happen is always some kind of user interaction such as a right-click which opens a context menu, or left-clicking on a link in a browser (causing the text to highlight), double-clicking on a file to open it, or other similar input activity with corresponding visual feedback. The crash occurs at the very moment between the user input and the visual feedback. It's very common that it happens on context menus after I've down-clicked, and when I restore virt-manager the context menu will still be there. I've never found virt-manager crashed in the morning on its own. This is something that I've only ever seen happen on user input, and specifically mouse input, though I won't claim it exclusively happens on mouse input.

I use my VMs 4-6 hours each day and will see anywhere from 0-5 crashes per day. So it happens but reliable reproduction is reliably unreliable. It's like there is a 1/200 chance that a mouse click will cause a crash.

One VM is Windows 7/10 (recently upgraded), and the other is Linux Mint MATE. Both have the qemu/spice guest agents installed.

There is no other pattern I can discern.


> * Is python3-libguestfs installed? Is 'libguest VM introspection' enabled in
> Edit->Preferences ?

That appears to be the Debian package python3-guestfs, which is not installed. The UI checkbox is checked but greyed out.


> * Are you using the system tray icon?

No and never have. I'm a KDE/Plasma user.


> Also if anyone can reliably reproduce, please attach
> ~/.cache/virt-manager/virt-manager.log from after you hit the crash, and
> make a note of the time when the app crashes

I just made it crash. Unfortunately the last log line was from about three hours ago, so I don't see any help there. It's 900KB. Still want it?



I am motivated to get this bug fixed. What info do you need from me? If you want a trace give me the exact commands you want me to run and I'll see what I can do to crash it.

Comment 3 Brad G 2020-01-21 03:07:05 UTC
> * When this crashes, is there always a VM actively running with a VNC/SPICE console open? Or previously a VM with a console open during the app runtime?

My usage is entirely local on the local machine from within virt-manager. All Spice, no VNC. I usually have two VMs open on different local desktops. The virt-manager console is always open.

Comment 4 Cole Robinson 2020-01-21 20:37:21 UTC
Thanks for the info.

> 
> I am motivated to get this bug fixed. What info do you need from me? If you
> want a trace give me the exact commands you want me to run and I'll see what
> I can do to crash it.

Great. Since you reproduce so frequently I think we can get to the bottom of it.

First, grab virt-manager from git:

  git clone https://github.com/virt-manager/virt-manager
  cd virt-manager
  ./virt-manager --debug

Verify that the app works as expected like that.
Use the app like you normally do, and verify that it crashes during normal operation. This will confirm
that upstream git is still affected (I expect it is, but this just eliminates a variable)

Now you should run it under gdb and wait for it to crash again. I'm not positive about the steps to get
good backtraces on mint/ubuntu.

I'm guessing you'll want to do: sudo apt-get install gdb python3.7-dbg
But replacing '3.7' with whatever the system version of python3 is for your distro.
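
One way to work out the package name without guessing, as a sketch. It assumes the pythonX.Y-dbg naming used in the example above, which may not hold on every release, so check that the package exists before installing:

```shell
# Derive the debug package name from the interpreter that python3 resolves to.
# Assumption: the distro names the package pythonX.Y-dbg, as in the 3.7 example.
pydbg_pkg="$(python3 -c 'import sys; print("python%d.%d-dbg" % sys.version_info[:2])')"
echo "$pydbg_pkg"
# Then, once the package name is confirmed:
#   sudo apt-get install gdb "$pydbg_pkg"
```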
Then run:

  gdb --eval-command=run --args python3 ./virt-manager --debug

virt-manager should start up as normal. After that, in the gdb terminal, press ctrl-c.
Verify that the command 'thread apply all py-bt' shows some python backtraces. Hopefully it just works.
Type 'quit' in gdb. We verified gdb is working fine for our needs.

Now, run that gdb command again, and use virt-manager like normal. When you hit the crash, virt-manager
won't disappear, instead it will just freeze, because gdb essentially paused the app. Go to the gdb
terminal and run these two commands

  thread apply all bt
  thread apply all py-bt

Copy all the output, up to and including about 100 lines of 'virt-manager --debug' output before you ran
those commands, and attach it here.
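
A non-interactive variant of the same steps, as a sketch: put the commands in a gdb command file and let gdb run them in batch mode, so both backtraces are printed as soon as the abort stops the program. The file names vm-backtrace.gdb and vm-crash.log are arbitrary, and this assumes the interactive check above already showed that py-bt works.

```shell
# Write the gdb commands to a command file (file name is arbitrary).
# 'run' returns when the program stops on the abort, after which gdb
# executes the two backtrace commands.
cat > vm-backtrace.gdb <<'EOF'
run
thread apply all bt
thread apply all py-bt
EOF

# Then, from the virt-manager git checkout (not run here):
#   gdb -batch -x vm-backtrace.gdb --args python3 ./virt-manager --debug 2>&1 | tee vm-crash.log
```

The tee keeps the 'virt-manager --debug' output and both backtraces together in one file that can be attached.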

Comment 5 Brad G 2020-01-22 09:33:13 UTC
Created attachment 1654498 [details]
virt-manager crashlog 1

Comment 6 Brad G 2020-01-22 09:34:06 UTC
I have attached a log as requested.

Debian has a gdb-minimal package and I wasted a ton of time before I figured out why gdb would not load python support. Big facepalm there.

Comment 7 Cole Robinson 2020-01-23 16:57:18 UTC
Thanks for the info. There are still some missing symbols in the debug output. Can you figure out how to install debug symbols for packages like spice-glib and spice-gtk? It might help future debugging but I don't think it's going to solve the issue.

FWIW in googling I also found this other user's bug, who could crash virt-viewer/remote-viewer: https://bugzilla.redhat.com/show_bug.cgi?id=1758384

It's making me think this is spice-gtk and not virt-manager. I've asked for help on the spice-devel list for further debugging.

Comment 8 Cole Robinson 2020-01-23 18:04:12 UTC
Some more ideas:

* Run with spice debugging: SPICE_DEBUG=1 gdb ...
    * you should see a bunch of obvious spice debug output when you open a VM spice console in virt-manager

* Reduce the spice usage surface to see if you can identify a culprit. Make these changes on an offline VM, then start it.
    * does the crash reproduce after removing all sound devices from the VM? (this will close the spice audio channel)
    * does the crash reproduce after removing USB Redirector devices from the VM? (this will close the usb redirection channel)
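
To confirm the devices are really gone before starting the VM, something like the following can help. It is a sketch: the function name list_spice_extras and the VM name "myvm" are placeholders, and it assumes the devices appear as <sound> and <redirdev> elements in the libvirt domain XML.

```shell
# list_spice_extras FILE: print the sound and USB redirection device elements
# found in a saved libvirt domain XML file. Empty output means none remain.
# The function name is illustrative.
list_spice_extras() {
    grep -E '<(sound|redirdev)[ >/]' "$1"
}

# Example (VM name "myvm" is a placeholder, not run here):
#   virsh dumpxml myvm > myvm.xml && list_spice_extras myvm.xml
```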

Comment 9 Frediano Ziglio 2020-01-26 09:10:37 UTC
Interesting: in https://bugs.launchpad.net/ubuntu/+source/virt-manager/+bug/867575 (https://launchpadlibrarian.net/81931994/ThreadStacktrace.txt) two threads appear to be using X11.
That is notable because crash traces for this kind of problem are usually not very helpful: the second thread's access probably goes unnoticed, and the crash is reported at the next access from the main thread.
Unfortunately the PyEval_EvalFrameEx frames are not much help, since you cannot see the Python code or what it was trying to do.

Comment 10 Brad G 2020-01-26 09:45:33 UTC
It might be a week or two before I update again but I'll see if I can get more crash logs or info that might be helpful.

Comment 11 Frediano Ziglio 2020-01-29 09:11:16 UTC

*** This bug has been marked as a duplicate of bug 1758384 ***