Bug 1069351

Summary: Endless stream of error dialogs when libvirtd stops/crashes on remote host
Product: [Fedora] Fedora Reporter: Dr. David Alan Gilbert <dgilbert>
Component: virt-managerAssignee: Cole Robinson <crobinso>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 20CC: berrange, crobinso, info, mkelly, rhack, vg.aetera, virt-maint, vyasevic
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: virt-manager-1.0.1-2.fc20 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-04-25 04:29:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
virt-manager --debug log none

Description Dr. David Alan Gilbert 2014-02-24 19:46:48 UTC
Description of problem:
I kicked off a network install (F20 running virt-manager, RHEL7-post-beta1 on remote machine, RHEL7-beta1 as guest being installed) and returned from dinner to find ~250 error dialogs with the same message.
(I'm not sure whether to split this into two bugs - one for the failure that I don't know what caused it, and another for the fact it produced 250 dialogs)

Version-Release number of selected component (if applicable):

F20 machine running virt-manager
virt-manager-common-1.0.0-2.fc20.noarch
virt-manager-1.0.0-2.fc20.noarch

RHEL7 remote box

libvirt-daemon-1.1.1-23.el7.x86_64
libvirt-daemon-driver-interface-1.1.1-23.el7.x86_64
libvirt-glib-0.1.7-3.el7.x86_64
libvirt-client-1.1.1-23.el7.x86_64
libvirt-daemon-driver-network-1.1.1-23.el7.x86_64
libvirt-daemon-driver-nwfilter-1.1.1-23.el7.x86_64
libvirt-daemon-driver-storage-1.1.1-23.el7.x86_64
libvirt-daemon-driver-secret-1.1.1-23.el7.x86_64
libvirt-python-1.1.1-23.el7.x86_64
libvirt-daemon-kvm-1.1.1-23.el7.x86_64
libvirt-daemon-config-network-1.1.1-23.el7.x86_64
libvirt-daemon-driver-qemu-1.1.1-23.el7.x86_64
libvirt-daemon-driver-nodedev-1.1.1-23.el7.x86_64

How reproducible:
It's done it both times I've tried to do this install on this host, not tried it on any other hosts yet.

Steps to Reproduce:
1.Create a disk to store the VM in a storage pool
2.Use network install, give it the URL - I also gave it more RAM and more cores but otherwise left it as the default
3. Leave it to cook

Actual results:
~250 error dialogs saying 'Error polling connection 'qemu+ssh://root@vl402/system': internal error: client socket is closed'
(I can still login to the remote machine and restart the install and it failed again - so I don't know the cause of the error).
Another dialog appears every few seconds.

Expected results:
A nice fresh clean installed guest

Additional info:

There are no logs on the target machine in /var/log/libvirtd
Jorunacltl just has:
Feb 24 14:01:46 virtlab402.virt.bos.redhat.com libvirtd[2543]: No response from client 0x7ffc73985450 after 5 keepalive messages in 30 seconds

(Twice - I'm assuming once from each install attempt?)


On the virt-manager end the log looks like:

[Mon, 24 Feb 2014 19:23:34 virt-manager 23671] DEBUG (create:1779) Starting background install process
[Mon, 24 Feb 2014 19:23:34 virt-manager 23671] DEBUG (distroinstaller:412) Using scratchdir=/home/dgilbert/.cache/virt-manager/boot
[Mon, 24 Feb 2014 19:23:35 virt-manager 23671] DEBUG (urlfetcher:293) Finding distro store for location=http://download.devel.redhat.com/released/RHEL-7/7.0-Beta-1/Server/x86_64/os/
[Mon, 24 Feb 2014 19:23:35 virt-manager 23671] DEBUG (urlfetcher:99) Fetching URI: http://download.devel.redhat.com/released/RHEL-7/7.0-Beta-1/Server/x86_64/os/.treeinfo
[Mon, 24 Feb 2014 19:23:35 virt-manager 23671] DEBUG (urlfetcher:110) Saved file to /home/dgilbert/.cache/virt-manager/boot/virtinst-.treeinfo.UYcFIe
[Mon, 24 Feb 2014 19:23:35 virt-manager 23671] DEBUG (urlfetcher:99) Fetching URI: http://download.devel.redhat.com/released/RHEL-7/7.0-Beta-1/Server/x86_64/os/images/pxeboot/vmlinuz
[Mon, 24 Feb 2014 19:23:38 virt-manager 23671] DEBUG (urlfetcher:110) Saved file to /home/dgilbert/.cache/virt-manager/boot/virtinst-vmlinuz.WZqj38
[Mon, 24 Feb 2014 19:23:38 virt-manager 23671] DEBUG (urlfetcher:99) Fetching URI: http://download.devel.redhat.com/released/RHEL-7/7.0-Beta-1/Server/x86_64/os/images/pxeboot/initrd.img
[Mon, 24 Feb 2014 19:23:49 virt-manager 23671] DEBUG (urlfetcher:110) Saved file to /home/dgilbert/.cache/virt-manager/boot/virtinst-initrd.img.cfpRN4
[Mon, 24 Feb 2014 19:23:49 virt-manager 23671] DEBUG (distroinstaller:286) Uploading kernel/initrd media
[Mon, 24 Feb 2014 19:23:50 virt-manager 23671] DEBUG (distroinstaller:72) Existing pool 'boot-scratch' found for /var/lib/libvirt/boot
[Mon, 24 Feb 2014 19:23:50 virt-manager 23671] DEBUG (diskbackend:171) Path '' is target for pool 'boot-scratch'. Creating volume 'virtinst-vmlinuz.WZqj38'.
[Mon, 24 Feb 2014 19:23:57 virt-manager 23671] DEBUG (storage:701) Creating storage volume 'virtinst-vmlinuz.WZqj38' with xml:
<volume>
  <name>virtinst-vmlinuz.WZqj38</name>
  <capacity>4813520</capacity>
  <allocation>0</allocation>
  <target>
    <format type="raw"/>
  </target>
</volume>

[Mon, 24 Feb 2014 19:23:57 virt-manager 23671] DEBUG (storage:730) Using vol create flags=0
[Mon, 24 Feb 2014 19:23:57 virt-manager 23671] DEBUG (storage:770) Couldn't lookup storage volume in prog thread.
[Mon, 24 Feb 2014 19:23:57 virt-manager 23671] DEBUG (storage:737) Storage volume 'virtinst-vmlinuz.WZqj38' install complete.
[Mon, 24 Feb 2014 19:25:13 virt-manager 23671] DEBUG (engine:320) Tick is slow, not running at requested rate.
[Mon, 24 Feb 2014 19:29:28 virt-manager 23671] ERROR (engine:362) Could not refresh connection qemu+ssh://root@vl402/system
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/engine.py", line 348, in _tick_single_conn
    conn.tick(**kwargs)
  File "/usr/share/virt-manager/virtManager/connection.py", line 1143, in tick
    self.hostinfo = self._backend.getInfo()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3652, in getInfo
    if ret is None: raise libvirtError ('virNodeGetInfo() failed', conn=self)
libvirtError: Cannot write data: Broken pipe
[Mon, 24 Feb 2014 19:29:28 virt-manager 23671] DEBUG (installer:187) Removing /home/dgilbert/.cache/virt-manager/boot/virtinst-vmlinuz.WZqj38
[Mon, 24 Feb 2014 19:29:28 virt-manager 23671] DEBUG (engine:363) Closing connection since libvirtd appears to have stopped
[Mon, 24 Feb 2014 19:29:28 virt-manager 23671] DEBUG (installer:187) Removing /home/dgilbert/.cache/virt-manager/boot/virtinst-initrd.img.cfpRN4
[Mon, 24 Feb 2014 19:29:28 virt-manager 23671] DEBUG (error:84) error dialog message:
summary=Unable to complete install: 'internal error: client socket is closed'
details=Unable to complete install: 'internal error: client socket is closed'

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 91, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/create.py", line 1780, in do_install
    guest.start_install(meter=meter)
  File "/usr/share/virt-manager/virtinst/guest.py", line 381, in start_install
    self._prepare_install(meter, dry)
  File "/usr/share/virt-manager/virtinst/guest.py", line 274, in _prepare_install
    util.make_scratchdir(self.conn, self.type))
  File "/usr/share/virt-manager/virtinst/installer.py", line 201, in prepare
    self._prepare(guest, meter, scratchdir)
  File "/usr/share/virt-manager/virtinst/distroinstaller.py", line 438, in _prepare
    self._prepare_kernel_url(guest, fetcher)
  File "/usr/share/virt-manager/virtinst/distroinstaller.py", line 362, in _prepare_kernel_url
    fetcher.meter, kernel, initrd)
  File "/usr/share/virt-manager/virtinst/distroinstaller.py", line 289, in _upload_media
    kvol = _upload_file(conn, meter, pool, kernel)
  File "/usr/share/virt-manager/virtinst/distroinstaller.py", line 158, in _upload_file
    vol.delete(0)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2703, in delete
    if ret == -1: raise libvirtError ('virStorageVolDelete() failed', vol=self)
libvirtError: internal error: client socket is closed

[Mon, 24 Feb 2014 19:29:28 virt-manager 23671] DEBUG (error:84) error dialog message:
summary=Error polling connection 'qemu+ssh://root@vl402/system': internal error: client socket is closed
details=Error polling connection 'qemu+ssh://root@vl402/system': internal error: client socket is closed

and then it just keeps going with those.

Comment 1 Cole Robinson 2014-02-25 20:19:22 UTC
Thanks for the report. That error usually means that libvirtd crashed, we should be closing the connection here but the recent release regressed. Fixed upstream now:

commit ce64d037bff56db994fedd065a9a34b8e827dda2
Author: Cole Robinson <crobinso>
Date:   Tue Feb 25 15:17:34 2014 -0500

    engine: Fix closing connection when tick() fails (bz 1069351)

Comment 2 Dr. David Alan Gilbert 2014-02-26 11:57:21 UTC
Thanks. I don't think the remote libvirtd actually died, I suspect this was just the remote being slow to respond while stuffing the initramfs down to it over a congested connection.

Comment 3 Cole Robinson 2014-02-28 18:23:30 UTC
Fixed in virt-manager-1.0.0-3.fc21

Comment 4 Fedora Update System 2014-02-28 18:24:06 UTC
virt-manager-1.0.0-3.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/virt-manager-1.0.0-3.fc20

Comment 5 Cole Robinson 2014-02-28 19:14:32 UTC
Ignore the 'rawhide' comment

Comment 6 Fedora Update System 2014-03-01 14:06:59 UTC
Package virt-manager-1.0.0-3.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing virt-manager-1.0.0-3.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-3272/virt-manager-1.0.0-3.fc20
then log in and leave karma (feedback).

Comment 7 Fedora Update System 2014-03-04 06:45:35 UTC
virt-manager-1.0.0-3.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 8 Dr. David Alan Gilbert 2014-03-10 12:12:47 UTC
Just hit it on -3; I was doing a yum update on one of the remote hosts, so I guess
that restarted libvirtd.

[Mon, 10 Mar 2014 11:58:32 virt-manager 3099] DEBUG (console:271) Creating SSH tunnel: ssh -l root vl403 sh -c 'nc -q 2>&1 | grep "requires an argument" >/dev/null;if [ $? -eq 0 ] ; then   CMD="nc -q 0 127.0.0.1 5900";else   CMD="nc 127.0.0.1 5900";fi;eval "$CMD";'
[Mon, 10 Mar 2014 11:58:32 virt-manager 3099] DEBUG (console:291) Open  tunnel PID=11763 OUTFD=89 ERRFD=91
[Mon, 10 Mar 2014 12:00:21 virt-manager 3099] ERROR (engine:371) Could not refresh connection qemu+ssh://root@vl403/system
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/engine.py", line 349, in _tick_single_conn
    conn.tick(**kwargs)
  File "/usr/share/virt-manager/virtManager/connection.py", line 1150, in tick
    self.hostinfo = self._backend.getInfo()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3652, in getInfo
    if ret is None: raise libvirtError ('virNodeGetInfo() failed', conn=self)
libvirtError: End of file while reading data: Ncat: Broken pipe.: Input/output error
[Mon, 10 Mar 2014 12:00:21 virt-manager 3099] DEBUG (engine:372) Closing connection since libvirtd appears to have stopped
[Mon, 10 Mar 2014 12:00:24 virt-manager 3099] DEBUG (error:84) error dialog message:
summary=Error polling connection 'qemu+ssh://root@vl403/system': internal error: client socket is closed
details=Error polling connection 'qemu+ssh://root@vl403/system': internal error: client socket is closed
[Mon, 10 Mar 2014 12:00:27 virt-manager 3099] DEBUG (error:84) error dialog message:
summary=Error polling connection 'qemu+ssh://root@vl403/system': internal error: client socket is closed
details=Error polling connection 'qemu+ssh://root@vl403/system': internal error: client socket is closed

and then it was spitting another dialog about once every 3 seconds.

Comment 9 Cole Robinson 2014-03-10 13:36:05 UTC
Yeah, sorry, I messed up the first patch. The test scenario I used was pretty contrived, I should have just done the expected test of shutting down libvirtd while virt-manager is connected.

commit 081e34715ffa5a210e1e0c8670fe3a1a3ec5180b
Author: Cole Robinson <crobinso>
Date:   Mon Mar 10 09:33:04 2014 -0400

    connection: Handle errors when deregistering events on close (bz 1069351)
    
    Otherwise this interrupts the close/cleanup routine, and the connection
    never appears to disconnect in the UI. This causes error dialog spamming
    when libvirtd goes down.

Comment 10 Cole Robinson 2014-03-11 11:55:16 UTC
*** Bug 1074856 has been marked as a duplicate of this bug. ***

Comment 11 Fedora Update System 2014-03-12 12:30:24 UTC
Package virt-manager-1.0.0-6.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing virt-manager-1.0.0-6.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-3775/virt-manager-1.0.0-6.fc20
then log in and leave karma (feedback).

Comment 12 Cole Robinson 2014-03-13 14:00:03 UTC
*** Bug 1075815 has been marked as a duplicate of this bug. ***

Comment 13 Cole Robinson 2014-03-13 16:59:14 UTC
*** Bug 1076155 has been marked as a duplicate of this bug. ***

Comment 14 Vlad Yasevich 2014-03-13 19:01:00 UTC
Tried the package from testing.  Seems to work for me.

Thanks

Comment 15 Madison Kelly 2014-03-13 19:14:45 UTC
virt-manager-1.0.0-6.fc20 from comment #11 fixed the problem for me, too.

Comment 16 Fedora Update System 2014-03-19 08:40:52 UTC
virt-manager-1.0.0-6.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 17 Netbulae 2014-04-10 14:37:25 UTC
I'm running this version of virt-manager but still have many dialogs spawning when the connection closes:

When my VPN connection closes while having a remote console open in virt-manager, I get 10-20 or sometimes even more dialog pop-ups with the same error message "Error polling connection 'qemu+ssh://xxx@yyyy/system': 'NoneType' object has no attribute 'getinfo'"

It's annoying but a killall -9 virt-manager saves me from closing all the dialog windows.

Version-Release number of selected component (if applicable):

virt-manager-1.0.0-6.fc20.noarc

How reproducible:

Everytime

Steps to Reproduce:
1. Connect to remote location with vpn
2. open virt-manager and connect to remote server
3. open console of a VM
4. disconnect VPN

Comment 18 Cole Robinson 2014-04-10 14:42:39 UTC
There is a newer version of virt-manager in updates, 1.0.1, that had additional fixes. Please update to that version, fully restart virt-manager, and see if the issue persists.

Comment 19 Vlad Yasevich 2014-04-15 19:36:48 UTC
Issue persists with  virt-manager-1.0.1-1.fc20.noarch.

The symptom here is a little different.  The connection is reset and when
virt-manager detects this, we get many spawned dialogs of 2 different kinds:

First:
Error polling connection 'qemu:///system': internal error: client socket is closed

Second:
Error polling connection 'qemu+ssh://root.eng.bos.redhat.com/system': 'NoneType' object has no attribute 'getInfo'


Sometimes the second dialog doesn't show up.

-vlad

Comment 20 Cole Robinson 2014-04-15 19:43:26 UTC
FWIW I can't reproduce the issue with latest virt-manager.

Please ensure virt-manager has been fully restarted since updating.

If the issue still persists, please provide virt-manager --debug output when reproducing

Comment 21 Vlad Yasevich 2014-04-16 01:22:51 UTC
Created attachment 886707 [details]
virt-manager --debug log

I managed to isolate the reproduction scenario:

1) Connect virt-manager to the remote system via ssh.
2) Suspect the system running virt-manager
3) Shutdown the remote system.
4) Wake up system running virt-manager
   - at this point virt-manager things it's still connected.
5) Boot up the remote system again
    - When the system comes up the ssh socket is reset
6) Close a bunch of dialogs.

I've attached the output of virt-manager --debug.

-vlad

Comment 22 Cole Robinson 2014-04-16 18:13:02 UTC
thanks vlad, I managed to reproduce and fix upstream. Reopening this

Comment 23 Fedora Update System 2014-04-16 21:41:53 UTC
virt-manager-1.0.1-2.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/virt-manager-1.0.1-2.fc20

Comment 24 Fedora Update System 2014-04-18 15:41:46 UTC
Package virt-manager-1.0.1-2.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing virt-manager-1.0.1-2.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-5335/virt-manager-1.0.1-2.fc20
then log in and leave karma (feedback).

Comment 25 Fedora Update System 2014-04-25 04:29:47 UTC
virt-manager-1.0.1-2.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.