Bug 1315606 - qemu:///session disconnects after 30 seconds
Summary: qemu:///session disconnects after 30 seconds
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: TRACKER-bugs-affecting-libguestfs
Reported: 2016-03-08 08:31 UTC by Richard W.M. Jones
Modified: 2016-04-13 05:16 UTC (History)

Fixed In Version: libvirt-1.3.2-2.fc24
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-25 16:41:23 UTC


Attachments
libvirt (client side) log (12.77 KB, text/plain)
2016-03-08 10:21 UTC, Richard W.M. Jones
libvirtd (server side) log (5.38 MB, application/x-xz)
2016-03-08 10:24 UTC, Richard W.M. Jones

Description Richard W.M. Jones 2016-03-08 08:31:34 UTC
Description of problem:

libvirt fails to launch with error: Cannot write data: Broken pipe [code=38 int1=32]
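
The bracketed fields in that message can be decoded by hand. The sketch below is illustrative only: it assumes that code 38 is VIR_ERR_SYSTEM_ERROR from libvirt's virterror.h and that int1 carries the errno of the failed system call. The helper name describe_libvirt_error is hypothetical and not part of libvirt or libguestfs.

```python
import errno
import os

# Assumption: 38 is VIR_ERR_SYSTEM_ERROR in libvirt's virterror.h, and for
# system errors int1 holds the errno of the failed call.
VIR_ERR_SYSTEM_ERROR = 38


def describe_libvirt_error(code, int1):
    """Return a readable description of a libvirt (code, int1) pair."""
    if code != VIR_ERR_SYSTEM_ERROR:
        return "libvirt error code %d (not a system error)" % code
    # Map the numeric errno back to its symbolic name, e.g. 32 -> EPIPE.
    name = errno.errorcode.get(int1, "unknown errno")
    return "system error: %s (%d): %s" % (name, int1, os.strerror(int1))


print(describe_libvirt_error(38, 32))
```

Under those assumptions, int1=32 corresponds to EPIPE, which matches the "Broken pipe" text: the client wrote to a socket whose other end had already gone away.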

It'd be really good if libvirt didn't do this.  Why isn't there a
regression test to stop this happening?

Version-Release number of selected component (if applicable):

libvirt-1.3.2-1.fc25.x86_64

How reproducible:

About 1 in 20 launches.

Steps to Reproduce:

  while guestfish -a /dev/null run -v -x >/tmp/log 2>&1; do echo -n .; done

Comment 1 Richard W.M. Jones 2016-03-08 10:21:11 UTC
Created attachment 1134086 [details]
libvirt (client side) log

Comment 2 Richard W.M. Jones 2016-03-08 10:24:11 UTC
Created attachment 1134087 [details]
libvirtd (server side) log

Note this contains the server-side logs of many runs.  Only the last one will correspond to the client side failure log posted in the previous comment.

There is no qemu log file; one was not generated as far as I can tell.

Comment 3 Richard W.M. Jones 2016-03-08 14:42:54 UTC
Weirdly I can only reproduce this when I use the hardened build
linking command line (-specs=/usr/lib/rpm/redhat/redhat-hardened-ld).

Comment 4 Richard W.M. Jones 2016-03-09 12:07:46 UTC
(In reply to Richard W.M. Jones from comment #3)
> Weirdly I can only reproduce this when I use the hardened build
> linking command line (-specs=/usr/lib/rpm/redhat/redhat-hardened-ld).

Actually I don't think this is true.  It's just the bug is
impossibly difficult to reproduce on demand.  Sometimes it
happens all the time, sometimes it never seems to happen.
Still looking ...

Comment 5 Richard W.M. Jones 2016-03-09 14:45:48 UTC
Oh I see.  This is actually a recurrence of the problem we had
where libvirtd would drop the connection if you hold it open for
more than 30 seconds, i.e.:

  https://bugzilla.redhat.com/show_bug.cgi?id=1240283#c8

See also the simple reproducer:

  https://bugzilla.redhat.com/show_bug.cgi?id=1240283#c9

When I run the reproducer on Rawhide, I get:

$ ./bz1240283.py 
libvirt: XML-RPC error : Cannot write data: Broken pipe
Traceback (most recent call last):
  File "./bz1240283.py", line 25, in <module>
    caps = conn.getCapabilities()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3735, in getCapabilities
    if ret is None: raise libvirtError ('virConnectGetCapabilities() failed', conn=self)
libvirt.libvirtError: Cannot write data: Broken pipe
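
For readers without access to the linked reproducer, the idea can be sketched roughly as follows. This is not the bz1240283.py script itself, only a minimal approximation: open a connection, hold it idle for longer than 30 seconds, then make one API call. The helper name connection_survives_idle, the 35 second default and the "reproduced" message are assumptions made for illustration; libvirt.open() and getCapabilities() are the libvirt Python binding calls seen in the traceback above.

```python
import time


def connection_survives_idle(open_conn, idle_seconds=35, sleep=time.sleep):
    """Open a connection, idle, then issue one call on it.

    open_conn is any zero-argument callable returning an object with
    getCapabilities() and close() methods. Returns (True, None) if the
    call succeeds after the idle period, else (False, error message).
    """
    conn = open_conn()
    try:
        sleep(idle_seconds)        # hold the connection open, doing nothing
        conn.getCapabilities()     # fails if the daemon went away meanwhile
        return True, None
    except Exception as exc:
        return False, str(exc)
    finally:
        try:
            conn.close()
        except Exception:
            pass                   # the connection may already be dead


def main():
    # Imported here so the helper above can be exercised without libvirt.
    import libvirt

    ok, message = connection_survives_idle(
        lambda: libvirt.open("qemu:///session"))
    if ok:
        print("NOT reproduced RHBZ#1240283")
    else:
        print("reproduced: %s" % message)
```

Calling main() on an affected system should take a little over 30 seconds before reporting a result. Passing the opener and the sleep function as parameters keeps the check usable with a fake connection, so the logic can be tested without waiting or starting a daemon.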

Comment 6 Richard W.M. Jones 2016-03-09 14:50:10 UTC
It appears to be fixed by 
https://github.com/nertpinx/libvirt/commit/78b0ccc71e99f769068974ff56638c99b1c3b4de

I applied this patch to the libvirt package, and the
reproducer no longer fails:

$ ./bz1240283.py 
NOT reproduced RHBZ#1240283

Please add this reproducer to the libvirt test suite.  It's a good
regression test that requires no special dependencies.

Also libguestfs hasn't crashed yet, although it's hard to say
whether that is proof because the libguestfs crash was difficult
to reproduce reliably.

Comment 7 Cole Robinson 2016-03-09 15:11:22 UTC
(In reply to Richard W.M. Jones from comment #6)
> It appears to be fixed by 
> https://github.com/nertpinx/libvirt/commit/78b0ccc71e99f769068974ff56638c99b1c3b4de
> 

'daemon: properly check for clients'

Martin, has that patch been posted upstream yet? I don't see it on the list.

> 
> Please add this reproducer to the libvirt test suite.  It's a good
> regression test that requires no special dependencies.
> 

The libvirt.git test suite doesn't have any functional tests that launch a daemon, for example, so this is not a trivial addition, especially since it requires waiting for a 30 second timeout. If you're interested in this, I'd suggest bringing it up on the list for further discussion.

Comment 8 Fedora Update System 2016-03-09 16:00:43 UTC
libvirt-1.3.2-2.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-a8f520b3f6

Comment 9 Martin Kletzander 2016-03-09 16:17:01 UTC
(In reply to Cole Robinson from comment #7)
I'm hoping to post the series today; it is not posted yet, though.  The test could be marked as expensive (only run on ci.centos.org and when we want to), plus the --timeout can be changed, so it's quite easy to check that.  I think we can include something like that.
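
A rough sketch of how such a test could be gated and shortened is shown below. It is not taken from the libvirt test suite. It assumes that expensive tests are opted into with a VIR_TEST_EXPENSIVE=1 environment variable and that the daemon is started with a short --timeout; the helper names plan_idle_test and daemon_command and the 5 second margin are hypothetical.

```python
import os


def plan_idle_test(daemon_timeout, environ=None, margin=5):
    """Return how many seconds the test should idle, or None to skip it."""
    environ = os.environ if environ is None else environ
    # Assumption: expensive tests only run when VIR_TEST_EXPENSIVE=1 is set.
    if environ.get("VIR_TEST_EXPENSIVE", "0") != "1":
        return None
    # Idle slightly longer than the daemon's own idle timeout.
    return daemon_timeout + margin


def daemon_command(daemon_timeout):
    """Build a libvirtd command line with a shortened idle timeout."""
    return ["libvirtd", "--timeout", str(daemon_timeout)]


print(plan_idle_test(3, environ={"VIR_TEST_EXPENSIVE": "1"}))
print(daemon_command(3))
```

With a 3 second daemon timeout the client only has to idle for about 8 seconds rather than more than 30, which is the point of making --timeout adjustable in the test.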

Comment 10 Fedora Update System 2016-03-10 01:55:25 UTC
libvirt-1.3.2-2.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-a8f520b3f6

Comment 11 Martin Kletzander 2016-03-10 05:41:47 UTC
Posted upstream as a part of bigger series:

https://www.redhat.com/archives/libvir-list/2016-March/msg00339.html

Comment 12 Cole Robinson 2016-03-25 16:41:23 UTC
Patches are in rawhide, and in updates-testing for f24

Comment 13 Fedora Update System 2016-03-26 18:15:32 UTC
libvirt-1.3.2-2.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

