The *dbus-daemon* service no longer becomes unresponsive due to leaking file descriptors
Previously, the *dbus-daemon* service incorrectly handled multiple messages containing file descriptors if they were received in a short time period. As a consequence, *dbus-daemon* leaked file descriptors and became unresponsive. A patch has been applied to correctly handle multiple file descriptors from different messages inside *dbus-daemon*. As a result, *dbus-daemon* closes and passes file descriptors correctly and no longer becomes unresponsive in the described situation.
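For readers unfamiliar with file descriptor passing, the following is a minimal sketch of the general pattern the text above refers to: several messages, each carrying a file descriptor, are queued in quick succession, and the receiver must close every descriptor it is handed. This is not the dbus-daemon patch and does not use the D-Bus API. It assumes Linux or another Unix with Python 3.9 or later (for socket.send_fds and socket.recv_fds), and the function name pass_and_close is hypothetical.

```python
import os
import socket


def pass_and_close(n_messages=3):
    """Send n messages, each carrying one fd, then receive and close every fd.

    Illustrative only: a datagram socket pair stands in for a bus connection.
    """
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    received = []
    try:
        # Queue several fd-carrying messages back to back, before any is read.
        for i in range(n_messages):
            r, w = os.pipe()
            try:
                socket.send_fds(sender, [b"msg%d" % i], [r])
            finally:
                # The sender's copies can be closed once the message is queued;
                # the kernel keeps its own reference to the fd in flight.
                os.close(r)
                os.close(w)

        for _ in range(n_messages):
            data, fds, _flags, _addr = socket.recv_fds(receiver, 1024, 4)
            try:
                received.append((data, len(fds)))
            finally:
                # Close every fd attached to this message, and only those.
                for fd in fds:
                    os.close(fd)
    finally:
        sender.close()
        receiver.close()
    return received
```

Each received descriptor is closed in a finally block tied to the message it arrived with, so a descriptor from one message is never confused with one from another.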
Description of problem:
dbus-daemon fails to close the correct file descriptors.
Version-Release number of selected component (if applicable):
Actual results:
'dbus-daemon: Failed to close file descriptor: Could not close fd' messages appear in the journal logs.
Expected results:
The correct file descriptors should be closed.
This is a known bug that was fixed upstream in https://cgit.freedesktop.org/dbus/dbus/commit/?id=07f4c12efe3b9bd45d109bc5fbaf6d9dbf69d78e
We're hitting this bug often enough that dbus-daemon hits the open fd limit (2144 by default). Once this happens, dbus-daemon starts chewing through 100% CPU time of a core until dbus-daemon is restarted.
In a cluster of 8 machines with the same workload and same uptime we've observed a wide variance in the number of "leaked" fds per hour, ranging from ~0.42 fds/hour to 1.38 fds/hour. However, all machines are leaking fds over time, just at different rates.
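To put the rates above in perspective, here is a rough sketch of how the leak could be measured and how long a machine would take to reach the fd limit. It assumes Linux, where /proc/<pid>/fd lists a process's open descriptors and reading it for another user's process needs sufficient privileges. The function names are hypothetical, the 2144 limit is the figure quoted in this report, and starting from zero open fds is a simplification.

```python
import os


def count_open_fds(pid):
    """Count entries in /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def leak_rate_per_hour(samples):
    """Estimate fds leaked per hour from the first and last sample.

    samples: list of (unix_timestamp_seconds, fd_count), oldest first.
    """
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    elapsed_hours = (t1 - t0) / 3600.0
    if elapsed_hours <= 0:
        raise ValueError("need samples spanning a positive time interval")
    return (n1 - n0) / elapsed_hours


def hours_until_limit(current_fds, rate_per_hour, limit=2144):
    """Hours until the fd count reaches the limit at a constant leak rate."""
    if rate_per_hour <= 0:
        return float("inf")
    return max(limit - current_fds, 0) / rate_per_hour
```

Under these assumptions, hours_until_limit(0, 1.38) gives roughly 1550 hours (about 65 days), and hours_until_limit(0, 0.42) gives roughly 5100 hours (about 213 days), so even the slowest-leaking machine eventually needs a dbus-daemon restart.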
Observed with dbus-1.6.12-13.el7.x86_64.
It would be ideal if this could be fixed in time for RHEL 7.3.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.