Description of problem:
libvirt fails to launch with error: Cannot write data: Broken pipe [code=38 int1=32]
It'd be really good if libvirt didn't do this. Why isn't there a
regression test to stop this happening?
Version-Release number of selected component (if applicable):
How reproducible:
About 1 in 20 launches.
Steps to Reproduce:
while guestfish -a /dev/null run -v -x >/tmp/log 2>&1; do echo -n .; done
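For anyone who wants to put a number on the failure rate rather than watch dots, the same loop can be sketched in Python. This is only an illustration: the helper names `run_logged` and `launches_until_failure` and the `max_runs` cap are made up here, and the log path is simply the one from the shell command above.

```python
import subprocess


def run_logged(cmd, log_path):
    """Run cmd once, sending stdout and stderr to log_path.

    The log is truncated on every run, like ">/tmp/log 2>&1" in the shell
    loop, so only the most recent run's output is kept.
    """
    with open(log_path, "w") as log:
        return subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT)


def launches_until_failure(cmd, log_path, max_runs=1000, run=run_logged):
    """Return the 1-based number of the first failing run, or None.

    max_runs is a hypothetical safety cap; the shell loop has none.
    """
    for attempt in range(1, max_runs + 1):
        if run(cmd, log_path) != 0:
            return attempt
    return None


def main():
    # Same command line as the shell reproducer.
    cmd = ["guestfish", "-a", "/dev/null", "run", "-v", "-x"]
    print(launches_until_failure(cmd, "/tmp/log"))
```

Calling `main()` prints how many launches it took to hit the first failure, and /tmp/log then holds the output of that failing run.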
Created attachment 1134086
libvirt (client side) log
Created attachment 1134087
libvirtd (server side) log
Note that this contains the server-side logs of many runs. Only the last one corresponds to the client-side failure log posted in the previous attachment.
There is no qemu log file; one was not generated as far as I can tell.
Weirdly I can only reproduce this when I use the hardened build
linking command line (-specs=/usr/lib/rpm/redhat/redhat-hardened-ld).
(In reply to Richard W.M. Jones from comment #3)
> Weirdly I can only reproduce this when I use the hardened build
> linking command line (-specs=/usr/lib/rpm/redhat/redhat-hardened-ld).
Actually I don't think this is true. It's just the bug is
impossibly difficult to reproduce on demand. Sometimes it
happens all the time, sometimes it never seems to happen.
Still looking ...
Oh I see. This is actually a reoccurrence of the problem we had
where libvirtd would drop the connection if you hold it open for
more than 30 seconds, i.e.:
See also the simple reproducer:
When I run the reproducer on Rawhide, I get:
libvirt: XML-RPC error : Cannot write data: Broken pipe
Traceback (most recent call last):
  File "./bz1240283.py", line 25, in <module>
    caps = conn.getCapabilities()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3735, in getCapabilities
    if ret is None: raise libvirtError ('virConnectGetCapabilities() failed', conn=self)
libvirt.libvirtError: Cannot write data: Broken pipe
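For readers who don't have the script to hand, the idea behind it can be sketched roughly as follows. This is not the bz1240283.py script itself, only a guess at its shape based on the traceback: open a connection, leave it idle for longer than 30 seconds, then make one call. The helper name `hold_then_query`, the 35-second hold, and the `qemu:///session` URI are all assumptions.

```python
import time


def hold_then_query(open_conn, hold_seconds=35, sleep=time.sleep):
    """Open a connection, keep it idle, then issue one call on it.

    Returns True if the call after the idle period succeeds, and False if
    it raises (for example with "Cannot write data: Broken pipe").
    """
    conn = open_conn()
    try:
        # Stay idle for longer than the 30-second window mentioned above.
        sleep(hold_seconds)
        try:
            conn.getCapabilities()
        except Exception as exc:  # libvirt.libvirtError in the real case
            print("reproduced: %s" % exc)
            return False
        print("NOT reproduced")
        return True
    finally:
        try:
            conn.close()
        except Exception:
            # Closing an already-dropped connection may fail; ignore that.
            pass


def main():
    # Imported here so the helper above can be exercised without libvirt.
    import libvirt

    # Assumption: the session URI; the original script may use another one.
    ok = hold_then_query(lambda: libvirt.open("qemu:///session"))
    raise SystemExit(0 if ok else 1)
```

Calling `main()` on an affected system should take the "reproduced" branch. Passing the opener and the sleep function in as arguments is just a convenience so the logic can be checked with a fake connection and no real 35-second wait.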
It appears to be fixed by
I applied this patch to the libvirt package, and the
reproducer no longer fails:
NOT reproduced RHBZ#1240283
Please add this reproducer to the libvirt test suite. It's a good
regression test that requires no special dependencies.
Also libguestfs hasn't crashed yet, although it's hard to say
whether that is proof because the libguestfs crash was difficult
to reproduce reliably.
(In reply to Richard W.M. Jones from comment #6)
> It appears to be fixed by
'daemon: properly check for clients'
Martin, has that patch been posted upstream yet? I don't see it on the list.
> Please add this reproducer to the libvirt test suite. It's a good
> regression test that requires no special dependencies.
The libvirt.git test suite doesn't have any functional tests that launch a daemon, for example, so at least it's not a trivial addition, especially since this requires waiting for a 30-second timeout. If you're interested in this, I'd suggest bringing it up on the list for further discussion.
libvirt-1.3.2-2.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-a8f520b3f6
(In reply to Cole Robinson from comment #7)
I'm hoping to post the series today; it is not posted yet, though. The test could be marked as expensive (only run on ci.centos.org and when we want to), plus the --timeout can be changed, so it's quite easy to check that. I think we can include something like that.
libvirt-1.3.2-2.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-a8f520b3f6
Posted upstream as part of a bigger series:
Patches are in rawhide, and in updates-testing for f24
libvirt-1.3.2-2.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.