Bug 1240283

Summary: virDomainCreateXML fails with Cannot write data: Broken pipe
Product: Red Hat Enterprise Linux 7
Reporter: Richard W.M. Jones <rjones>
Component: libvirt
Assignee: Martin Kletzander <mkletzan>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.2
CC: dyuan, mkletzan, mprivozn, mzhan, rbalakri, rjones, tzheng, zhwang
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-1.2.17-3.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1247746
Environment:
Last Closed: 2015-11-19 06:48:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 910269    
Attachments:
Description          Flags
build.log            none
libvirtd.log         none
libvirt-client.log   none
bz1240283.py         none

Description Richard W.M. Jones 2015-07-06 12:40:01 UTC
Created attachment 1048816 [details]
build.log

Description of problem:

Libvirt is broken in RHEL 7.2: virDomainCreateXML fails with
"Cannot write data: Broken pipe".  See the attached build.log.

Version-Release number of selected component (if applicable):

libvirt-devel-1.2.17-1.el7.x86_64

How reproducible:

Unknown, at least once.

Steps to Reproduce:
1. Run libguestfs-test-tool.

Comment 2 Richard W.M. Jones 2015-07-06 13:26:37 UTC
This is not 100% reproducible: I started another build and
that one worked.

This suggests it's another race in the start-up of the session
libvirtd.

Comment 3 Richard W.M. Jones 2015-07-07 10:30:13 UTC
Another build, same error:
http://download.devel.redhat.com/brewroot/work/tasks/6393/9476393/build.log

Adding Regression keyword, since this is a regression over
previous behaviour.

Comment 5 Michal Privoznik 2015-07-08 14:22:46 UTC
Rich, this looks like a daemon crasher to me. Let me see if my investigation will lead to anything. BTW do you know if it is possible to get the daemon logs from the build? Because the logs you've linked show client logs (I might be able to reproduce even so - the domain XML is there).

Comment 6 Richard W.M. Jones 2015-07-08 14:51:48 UTC
Created attachment 1049893 [details]
libvirtd.log

It turns out this is reproducible on baremetal, although it's
much rarer.  To try reproducing it, do the following as NON-root:

killall libvirtd
libguestfs-test-tool

Attached is libvirtd.log from a failing run, and in the next
comment I will attach libvirt-client.log which is the LIBVIRT_DEBUG=1
output from the same run.

Comment 7 Richard W.M. Jones 2015-07-08 14:52:17 UTC
Created attachment 1049894 [details]
libvirt-client.log

LIBVIRT_DEBUG=1 output from the same run as the comment above.

Comment 8 Richard W.M. Jones 2015-07-08 15:05:22 UTC
(In reply to Richard W.M. Jones from comment #6)
> Created attachment 1049893 [details]
> libvirtd.log
> 
> It turns out this is reproducible on baremetal, although it's
> much rarer.  To try reproducing it, do the following as NON-root:
> 
> killall libvirtd
> libguestfs-test-tool

Actually I see why this is intermittent/difficult to reproduce.

When libguestfs launch() is called, it basically does:

 - conn = virConnectOpenAuth
 - call virConnectGetCapabilities (conn)
 - rebuild the libguestfs appliance, if it needs to be rebuilt
 - call virDomainCreateXML (conn)

The rebuild step is either instantaneous (if it doesn't need to
be rebuilt) or on RHEL takes about 2 minutes.

I think this is only failing when the rebuild step happens.  Could
it be that libvirtd --timeout is broken, so that it's timing out
the daemon even though there is a client connection?

I can reliably reproduce this bug by first triggering an appliance
rebuild with this command:

sudo touch /usr/lib64/guestfs/supermin.d/
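
To make the hypothesis above concrete, here is a toy model of it. This is
not libvirt code: the function name, the flag and the 30 second timeout are
all made up for illustration, and only the "about 2 minutes" rebuild time
comes from this report. It only shows why an idle timer that ignores open
client connections would fail exactly when the rebuild step is slow.

```python
def daemon_alive_at_next_call(timeout, gap, timer_ignores_clients):
    """Toy model (hypothetical, not libvirt code).

    A client connects at t=0, stays connected, and makes its next call
    after `gap` idle seconds.  Return True if the daemon is still
    running at that point.
    """
    clients = 1  # the one client never disconnects
    # A correct idle timer only counts down while no client is connected.
    timer_armed = timer_ignores_clients or clients == 0
    return not (timer_armed and gap >= timeout)


# Assumed numbers: 30 s timeout (hypothetical), ~120 s appliance rebuild.
print(daemon_alive_at_next_call(30, 120, timer_ignores_clients=False))  # True
print(daemon_alive_at_next_call(30, 120, timer_ignores_clients=True))   # False
print(daemon_alive_at_next_call(30, 0, timer_ignores_clients=True))     # True
```

The last two lines correspond to what is observed here: the failure shows
up when the rebuild takes a long time and not when it is instantaneous.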

Comment 9 Richard W.M. Jones 2015-07-08 15:14:15 UTC
Created attachment 1049908 [details]
bz1240283.py

Based on the comment above, this is a simple reproducer
which doesn't involve anything except python + libvirt.

$ ./bz1240283.py 
libvirt: XML-RPC error : Cannot write data: Broken pipe
Traceback (most recent call last):
  File "./bz1240283.py", line 25, in <module>
    caps = conn.getCapabilities()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3505, in getCapabilities
    if ret is None: raise libvirtError ('virConnectGetCapabilities() failed', conn=self)
libvirt.libvirtError: Cannot write data: Broken pipe
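
The attachment is not reproduced in this report. For readers without
access to it, the following is only a guess at the general shape of such a
reproducer, based on the launch() sequence in comment 8: open a connection,
stay idle, then make one call. The function name, its parameters and the
idle time are assumptions, and the connection is passed in as a callable so
the logic can be tried without a running daemon.

```python
import time


def reproduce(open_conn, idle_seconds, error_type=Exception, sleep=time.sleep):
    """Hypothetical sketch, not the attached bz1240283.py.

    Open a connection, stay idle for `idle_seconds` (standing in for
    the slow appliance rebuild), then make one call on the connection.
    Return True if that call fails, i.e. the bug was reproduced.
    """
    conn = open_conn()
    sleep(idle_seconds)
    try:
        conn.getCapabilities()
    except error_type as exc:
        print("reproduced: %s" % exc)
        return True
    print("NOT reproduced RHBZ#1240283")
    return False
```

With the libvirt Python bindings installed, and run as non-root, this could
be called as reproduce(lambda: libvirt.open("qemu:///session"), 60,
error_type=libvirt.libvirtError). Both the session URI and the 60 second
idle time are assumptions, not values taken from the attachment.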

Comment 10 Martin Kletzander 2015-07-10 09:15:45 UTC
Fixed upstream by commit v1.2.17-77-gb7ea58c26219:

commit b7ea58c262194037042284a14fb1608c9cf31884
Author: Martin Kletzander <mkletzan>
Date:   Fri Jul 10 10:35:31 2015 +0200

    rpc: Rework timerActive logic in daemon

Comment 13 zhenfeng wang 2015-09-23 03:40:36 UTC
I could reproduce this bug with libvirt-1.2.17-1.el7.x86_64 using Richard's reproducer:

$ ./reproduce.py 
libvirt: XML-RPC error : Cannot write data: Broken pipe
Traceback (most recent call last):
  File "./reproduce.py", line 25, in <module>
    caps = conn.getCapabilities()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3714, in getCapabilities
    if ret is None: raise libvirtError ('virConnectGetCapabilities() failed', conn=self)
libvirt.libvirtError: Cannot write data: Broken pipe

Verified the bug with libvirt-1.2.17-9.el7; the results were as expected.
1. Ran the reproducer and got the expected output:
$ ./reproduce.py 
NOT reproduced RHBZ#1240283

2. Fetched the host capabilities as a non-root user in read-only mode; they were listed correctly:
$ virsh -r -c qemu:///system capabilities
<capabilities>

  <host>
    <uuid>0028bd0f-d97b-e111-0000-e839354bfeea</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>SandyBridge</model>
      <vendor>Intel</vendor>
      <topology sockets='1' cores='4' threads='2'/>
---

Based on the steps above, marking this bug as verified.

Comment 15 errata-xmlrpc 2015-11-19 06:48:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html