Bug 1247746 - virDomainCreateXML fails with Cannot write data: Broken pipe
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
Blocks: TRACKER-bugs-affecting-libguestfs
Reported: 2015-07-28 18:13 UTC by Richard W.M. Jones
Modified: 2015-07-29 11:14 UTC (History)

Fixed In Version: libvirt-1.2.17-2.fc24
Doc Type: Bug Fix
Doc Text:
Clone Of: 1240283
Environment:
Last Closed: 2015-07-29 11:14:52 UTC
Type: Bug
Embargoed:

Description Richard W.M. Jones 2015-07-28 18:13:40 UTC
This bug, initially found on RHEL, has now started to affect
Fedora too.

+++ This bug was initially created as a clone of Bug #1240283 +++

Description of problem:

Libvirt fails to start the libguestfs appliance, reporting this
error:

Original error from libvirt: Cannot write data: Broken pipe [code=38 domain=7]

Version-Release number of selected component (if applicable):

libvirt-client.x86_64 0:1.2.17-1.fc23

How reproducible:

100%

Steps to Reproduce:
1. See reproducer script here:

https://bugzilla.redhat.com/show_bug.cgi?id=1240283#c9

--- Additional comment from Richard W.M. Jones on 2015-07-08 11:05:22 EDT ---

(In reply to Richard W.M. Jones from comment #6)
> Created attachment 1049893 [details]
> libvirtd.log
> 
> It turns out this is reproducible on baremetal, although it's
> much rarer.  To try reproducing it, do the following as NON-root:
> 
> killall libvirtd
> libguestfs-test-tool

Actually I see why this is intermittent/difficult to reproduce.

When libguestfs launch() is called, it basically does:

 - conn = virConnectOpenAuth
 - call virConnectGetCapabilities (conn)
 - rebuild the libguestfs appliance, if it needs to be rebuilt
 - call virDomainCreateXML (conn)

The rebuild step is either instantaneous (if it doesn't need to
be rebuilt) or on RHEL takes about 2 minutes.

I think this is only failing when the rebuild step happens.  Could
it be that libvirtd --timeout is broken, so that it's timing out
the daemon even though there is a client connection?

I can reliably reproduce this bug by triggering an appliance
rebuild with this command: sudo touch /usr/lib64/guestfs/supermin.d/
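
The sequence above can be approximated without libguestfs. What follows is
only a rough sketch, not the script attached to the original bug: the
function name run_sequence, the open_conn callback and the 60 second
default delay are all assumptions made for illustration. The delay stands
in for the slow appliance rebuild and is assumed to be longer than the
idle timeout of the auto-started libvirtd.

```python
import time


def run_sequence(open_conn, delay_seconds=60, sleep=time.sleep):
    """Open a connection, use it once, stay idle, then use it again.

    open_conn is a hypothetical callback returning a libvirt connection
    object (anything with a getCapabilities() method will do here).
    """
    conn = open_conn()

    # First RPC on the fresh connection; this one is expected to succeed.
    conn.getCapabilities()

    # Stand-in for the appliance rebuild: the client stays connected but
    # sends nothing for delay_seconds (an assumed value, see above).
    sleep(delay_seconds)

    # Second RPC on the same connection. If the daemon wrongly exited
    # while this client was still connected, the call would be expected
    # to raise libvirtError ("Cannot write data: Broken pipe").
    return conn.getCapabilities()
```

With the libvirt Python bindings installed, this could be run as a
non-root user along the lines of
run_sequence(lambda: libvirt.open("qemu:///session")), after killing any
running session libvirtd so that a new one is auto-started.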

--- Additional comment from Richard W.M. Jones on 2015-07-08 11:14:15 EDT ---

Based on the comment above, this is a simple reproducer
which doesn't involve anything except python + libvirt.

$ ./bz1240283.py 
libvirt: XML-RPC error : Cannot write data: Broken pipe
Traceback (most recent call last):
  File "./bz1240283.py", line 25, in <module>
    caps = conn.getCapabilities()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3505, in getCapabilities
    if ret is None: raise libvirtError ('virConnectGetCapabilities() failed', conn=self)
libvirt.libvirtError: Cannot write data: Broken pipe

--- Additional comment from Martin Kletzander on 2015-07-10 05:15:45 EDT ---

Fixed upstream by commit v1.2.17-77-gb7ea58c26219:

commit b7ea58c262194037042284a14fb1608c9cf31884
Author: Martin Kletzander <mkletzan>
Date:   Fri Jul 10 10:35:31 2015 +0200

    rpc: Rework timerActive logic in daemon

Comment 1 Richard W.M. Jones 2015-07-29 11:14:52 UTC
http://koji.fedoraproject.org/koji/taskinfo?taskID=10524211
