Bug 1240283 - virDomainCreateXML fails with Cannot write data: Broken pipe
Summary: virDomainCreateXML fails with Cannot write data: Broken pipe
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.2
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Martin Kletzander
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: TRACKER-bugs-affecting-libguestfs
Reported: 2015-07-06 12:40 UTC by Richard W.M. Jones
Modified: 2015-11-19 06:48 UTC (History)

Fixed In Version: libvirt-1.2.17-3.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1247746 (view as bug list)
Environment:
Last Closed: 2015-11-19 06:48:16 UTC
Target Upstream Version:


Attachments
build.log (2.15 MB, text/plain)
2015-07-06 12:40 UTC, Richard W.M. Jones
libvirtd.log (586.78 KB, text/plain)
2015-07-08 14:51 UTC, Richard W.M. Jones
libvirt-client.log (53.12 KB, text/plain)
2015-07-08 14:52 UTC, Richard W.M. Jones
bz1240283.py (550 bytes, text/plain)
2015-07-08 15:14 UTC, Richard W.M. Jones


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:2202 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2015-11-19 08:17:58 UTC

Description Richard W.M. Jones 2015-07-06 12:40:01 UTC
Created attachment 1048816 [details]
build.log

Description of problem:

Libvirt is broken in RHEL 7.2.  See the build.log attached.

Version-Release number of selected component (if applicable):

libvirt-devel-1.2.17-1.el7.x86_64

How reproducible:

Unknown, at least once.

Steps to Reproduce:
1. Run libguestfs-test-tool.

Comment 2 Richard W.M. Jones 2015-07-06 13:26:37 UTC
This is not 100% reproducible, since I started another build and
that works.

This suggests it's another race in the start-up of the session
libvirtd.

Comment 3 Richard W.M. Jones 2015-07-07 10:30:13 UTC
Another build, same error:
http://download.devel.redhat.com/brewroot/work/tasks/6393/9476393/build.log

Adding Regression keyword, since this is a regression over
previous behaviour.

Comment 5 Michal Privoznik 2015-07-08 14:22:46 UTC
Rich, this looks like a daemon crasher to me. Let me see if my investigation will lead to anything. BTW do you know if it is possible to get the daemon logs from the build? Because the logs you've linked show client logs (I might be able to reproduce even so - the domain XML is there).

Comment 6 Richard W.M. Jones 2015-07-08 14:51:48 UTC
Created attachment 1049893 [details]
libvirtd.log

It turns out this is reproducible on baremetal, although it's
much rarer.  To try reproducing it, do the following as NON-root:

killall libvirtd
libguestfs-test-tool

Attached is libvirtd.log from a failing run, and in the next
comment I will attach libvirt-client.log which is the LIBVIRT_DEBUG=1
output from the same run.

Comment 7 Richard W.M. Jones 2015-07-08 14:52:17 UTC
Created attachment 1049894 [details]
libvirt-client.log

LIBVIRT_DEBUG=1 output from the same run as the comment above.

Comment 8 Richard W.M. Jones 2015-07-08 15:05:22 UTC
(In reply to Richard W.M. Jones from comment #6)
> Created attachment 1049893 [details]
> libvirtd.log
> 
> It turns out this is reproducible on baremetal, although it's
> much rarer.  To try reproducing it, do the following as NON-root:
> 
> killall libvirtd
> libguestfs-test-tool

Actually I see why this is intermittent/difficult to reproduce.

When libguestfs launch() is called, it basically does:

 - conn = virConnectOpenAuth
 - call virConnectGetCapabilities (conn)
 - rebuild the libguestfs appliance, if it needs to be rebuilt
 - call virDomainCreateXML (conn)

The rebuild step is either instantaneous (if it doesn't need to
be rebuilt) or on RHEL takes about 2 minutes.

I think this is only failing when the rebuild step happens.  Could
it be that libvirtd --timeout is broken, so that it's timing out
the daemon even though there is a client connection?

I can reliably reproduce this bug when I trigger an appliance
rebuild (do this command: sudo touch /usr/lib64/guestfs/supermin.d/ )
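
The launch sequence above can be sketched in Python as follows. This is
only an illustration, not the code libguestfs runs: the helper name
reproduces() and its parameters are hypothetical, a plain sleep stands in
for the appliance rebuild, and a second getCapabilities() call stands in
for virDomainCreateXML. It assumes the idle period is longer than
whatever timeout the session libvirtd was started with.

```python
import sys
import time


def reproduces(open_conn, idle_seconds, sleep=time.sleep):
    """Return True if a call made after an idle period fails.

    open_conn:    zero-argument callable returning a libvirt-like
                  connection (hypothetical parameter, for illustration).
    idle_seconds: how long to hold the connection open without using it.
    sleep:        injectable so the helper can be exercised without waiting.
    """
    conn = open_conn()
    conn.getCapabilities()        # first call, right after connecting
    sleep(idle_seconds)           # stands in for the ~2 minute rebuild
    try:
        conn.getCapabilities()    # stands in for virDomainCreateXML
    except Exception as exc:      # libvirt.libvirtError with the real bindings
        sys.stderr.write("call after idle period failed: %s\n" % exc)
        return True
    return False


# Example use against a real session daemon, run as non-root
# (assumes the libvirt Python bindings are installed):
#
#   import libvirt
#   hit = reproduces(lambda: libvirt.open("qemu:///session"), 120)
#   print("reproduced" if hit else "NOT reproduced")
```

The sketch uses libvirt.open() rather than the virConnectOpenAuth call
listed above, purely to keep the example short; the 120 second idle
period is taken from the "about 2 minutes" rebuild time.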

Comment 9 Richard W.M. Jones 2015-07-08 15:14:15 UTC
Created attachment 1049908 [details]
bz1240283.py

Based on the comment above, this is a simple reproducer
which doesn't involve anything except python + libvirt.

$ ./bz1240283.py 
libvirt: XML-RPC error : Cannot write data: Broken pipe
Traceback (most recent call last):
  File "./bz1240283.py", line 25, in <module>
    caps = conn.getCapabilities()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3505, in getCapabilities
    if ret is None: raise libvirtError ('virConnectGetCapabilities() failed', conn=self)
libvirt.libvirtError: Cannot write data: Broken pipe

Comment 10 Martin Kletzander 2015-07-10 09:15:45 UTC
Fixed upstream by commit v1.2.17-77-gb7ea58c26219:

commit b7ea58c262194037042284a14fb1608c9cf31884
Author: Martin Kletzander <mkletzan@redhat.com>
Date:   Fri Jul 10 10:35:31 2015 +0200

    rpc: Rework timerActive logic in daemon
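
The commit body is not quoted in this report. As a toy model only, and
not libvirt's actual code, the behaviour expected from a daemon started
with an idle timeout (the suspicion raised in comment 8) can be written
down like this: the shutdown timer may be armed only while no client is
connected. The class and method names are hypothetical.

```python
class IdleShutdown:
    """Toy model of a daemon that exits after `timeout` idle seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.clients = 0
        self.deadline = None      # None means the timer is not armed
        self.running = True

    def _update_timer(self, now):
        # Arm the timer only when there are no clients; disarm otherwise.
        if self.clients == 0:
            if self.deadline is None:
                self.deadline = now + self.timeout
        else:
            self.deadline = None

    def client_connected(self, now):
        self.clients += 1
        self._update_timer(now)

    def client_disconnected(self, now):
        self.clients -= 1
        self._update_timer(now)

    def tick(self, now):
        """Event-loop step: stop the daemon once the idle deadline passes."""
        self._update_timer(now)
        if self.deadline is not None and now >= self.deadline:
            self.running = False
        return self.running
```

In this model a client that connects and then sits idle for two minutes
never causes a shutdown, because connecting disarms the timer and it is
re-armed only after the last client disconnects.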

Comment 13 zhenfeng wang 2015-09-23 03:40:36 UTC
Could reproduce this bug with libvirt-1.2.17-1.el7.x86_64 using Richard's reproducer:

$ ./reproduce.py 
libvirt: XML-RPC error : Cannot write data: Broken pipe
Traceback (most recent call last):
  File "./reproduce.py", line 25, in <module>
    caps = conn.getCapabilities()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3714, in getCapabilities
    if ret is None: raise libvirtError ('virConnectGetCapabilities() failed', conn=self)
libvirt.libvirtError: Cannot write data: Broken pipe

Verified the bug with libvirt-1.2.17-9.el7; got the expected output.
1. Run the reproducer; it gives the expected output:
$ ./reproduce.py 
NOT reproduced RHBZ#1240283

2. Get the host capabilities as a non-root user in read-only mode; the info is listed correctly:
$ virsh -r -c qemu:///system capabilities
<capabilities>

  <host>
    <uuid>0028bd0f-d97b-e111-0000-e839354bfeea</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>SandyBridge</model>
      <vendor>Intel</vendor>
      <topology sockets='1' cores='4' threads='2'/>
---

Based on the steps above, marking this bug verified.

Comment 15 errata-xmlrpc 2015-11-19 06:48:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html

