Bug 1464083 - libvirtd doesn't give a decent error if inotify limits are too low
Summary: libvirtd doesn't give a decent error if inotify limits are too low
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 27
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: TRACKER-bugs-affecting-libguestfs
 
Reported: 2017-06-22 12:00 UTC by Richard W.M. Jones
Modified: 2018-11-30 17:44 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-30 17:44:57 UTC
Type: Bug


Attachments
libvirtd log when it fails (1.04 MB, text/plain)
2017-06-22 12:02 UTC, Richard W.M. Jones

Description Richard W.M. Jones 2017-06-22 12:00:40 UTC
Description of problem:

Basically I hit the bug exactly as diagnosed and worked around here:
https://github.com/Connexions/devops/wiki/libvirtd-won't-start

$ virsh list --all
error: failed to connect to the hypervisor
error: Cannot recv data: Connection reset by peer
$ cat /proc/sys/fs/inotify/max_user_watches 
8192
$ cat /proc/sys/fs/inotify/max_user_instances 
128
$ sudo sysctl -n -w fs.inotify.max_user_watches=16384
16384
$ sudo sysctl -n -w fs.inotify.max_user_instances=256
256
$ virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     tmp-bz1431579                  shut off
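The sysctl settings in the workaround above last only until the next reboot. A minimal sketch of making them persistent, assuming the standard /etc/sysctl.d drop-in mechanism (the file name 90-inotify.conf is an arbitrary choice, and the values simply mirror the temporary workaround):

```shell
# /etc/sysctl.d/90-inotify.conf -- read by systemd-sysctl at boot.
# Values match the temporary workaround above.
fs.inotify.max_user_watches = 16384
fs.inotify.max_user_instances = 256
```

The file can be applied immediately, without rebooting, with `sudo sysctl -p /etc/sysctl.d/90-inotify.conf`.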


Version-Release number of selected component (if applicable):

libvirt-daemon-3.2.1-3.fc26.x86_64

How reproducible:

As above.

Steps to Reproduce:

Unclear how exactly to reproduce it.

Comment 1 Richard W.M. Jones 2017-06-22 12:02:10 UTC
Created attachment 1290663 [details]
libvirtd log when it fails

Comment 2 Richard W.M. Jones 2017-06-22 15:52:40 UTC
Actually the problem is a bit stranger than I thought.  It appears
that something leaks the inotify watches, so that even increasing
the limits does not help - eventually it will run out again.

This happens after using 'make check-release' which is a very
long test of libguestfs which runs many hundreds, possibly thousands
of VM instances using libvirt.

Comment 3 Richard W.M. Jones 2017-06-22 15:54:43 UTC
It turns out (thanks lsof) this is actually caused by leaking
gpg-agent instances.  I'll file another bug about that.

However libvirt could still give a decent error message.
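For anyone else debugging this kind of leak: inotify file descriptors show up as anon_inode:inotify symlinks under /proc/<pid>/fd, so the biggest consumers can be counted without lsof. A rough sketch of that general technique (an assumption on my part, not the exact command used above):

```shell
# Count inotify instances held by each process, biggest consumers first.
# Reading other users' /proc/<pid>/fd generally requires root.
for pid in /proc/[0-9]*; do
    n=$(find "$pid/fd" -lname 'anon_inode:inotify' 2>/dev/null | wc -l)
    if [ "$n" -gt 0 ]; then
        printf '%4d  %s (%s)\n' "$n" "${pid#/proc/}" "$(cat "$pid/comm" 2>/dev/null)"
    fi
done | sort -rn | head
```

A pile of identically named processes at the top of the list (gpg-agent, in this case) points at the leaker.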

Comment 4 Daniel Berrangé 2017-06-22 16:21:23 UTC
The only bits of libvirt using inotify are the UML and Xen drivers, and each registers only a single watch, at initial startup. So if there is a failure, it should only hit at libvirtd startup time. I guess if you are using libvirt session mode, though, and leave enough time for libvirtd to shut down, you'd be starting it multiple times, and so might not see the failure immediately in your test suite.

The problem we're seeing with error reporting here is related to the auto-spawn of libvirtd. We successfully spawn libvirtd, and at least start to connect to it, because the listener socket is ready; then UML fails to set up inotify, causing it to shut down again, at which point virsh gets the error. We have no way to get the errors reported by libvirtd back to virsh, hence the somewhat unhelpful error message we see.

Comment 5 Richard W.M. Jones 2017-06-22 17:01:47 UTC
(In reply to Daniel Berrange from comment #4)
> The only bits of libvirt using inotify are UML and Xen drivers, each only
> register a single watch, at initial startup. So if there is a failure, it
> should only hit at libvirtd startup time - i guess if you are using libvirt
> session mode though, and have enough time for libvirtd to shutdown you'd be
> starting it multiple time, and so might not see the failure immediately in
> your test suite.

Just to clarify: libvirtd (session instance) cannot be started
at all. There is no session daemon, running trivial virsh commands
fails, and there is no session daemon running afterwards either.

Comment 6 Jan Kurik 2017-08-15 07:50:35 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 27 development cycle.
Changing version to '27'.

Comment 7 Ben Cotton 2018-11-27 18:34:43 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30 Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 27 reached end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 8 Ben Cotton 2018-11-30 17:44:57 UTC
Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

