Bug 1370381

Summary: dbus-daemon keeps locking and using all CPU
Product: Red Hat Enterprise Linux 7 Reporter: Rupesh Patel <rupatel>
Component: dbus Assignee: David King <dking>
Status: CLOSED ERRATA QA Contact: Desktop QE <desktop-qa-list>
Severity: high Docs Contact:
Priority: high    
Version: 7.3 CC: cww, dking, jkoten, mcepl, mclasen, rupatel, tpelka
Target Milestone: rc Keywords: OtherQA
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: dbus-1.10.24-1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 12:52:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1298243, 1393395, 1420851, 1477211, 1479818    
Attachments:
Description Flags
strace file none

Description Rupesh Patel 2016-08-26 05:51:14 UTC
dbus-libs-1.6.12-14.el7_2.x86_64
dbus-1.6.12-14.el7_2.x86_64

strace output:

5608  23:14:47.667691 accept4(3, 0x7ffe2f1468c0, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000007>
5608  23:14:47.667710 fcntl(-1, F_GETFD) = -1 EBADF (Bad file descriptor) <0.000006>
5608  23:14:47.667729 epoll_wait(4, {{EPOLLIN, {u32=3, u64=14883886646603808771}}}, 64, -1) = 1 <0.000006>
5608  23:14:47.667748 accept4(3, 0x7ffe2f1468c0, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000006>
5608  23:14:47.667767 fcntl(-1, F_GETFD) = -1 EBADF (Bad file descriptor) <0.000007>
5608  23:14:47.667788 epoll_wait(4, {{EPOLLIN, {u32=3, u64=14883886646603808771}}}, 64, -1) = 1 <0.000006>
5608  23:14:47.667808 accept4(3, 0x7ffe2f1468c0, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000007>
5608  23:14:47.667827 fcntl(-1, F_GETFD) = -1 EBADF (Bad file descriptor) <0.000006>
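
The trace above repeats one cycle: epoll_wait reports the listening socket (fd 3) as readable, accept4 fails with EMFILE, and control returns to epoll_wait, which returns within microseconds, presumably because the pending connection is never dequeued. A minimal Python sketch, assuming strace lines in the format shown here, that tallies the failing accept4 calls per pid; the name `count_emfile_accepts` is hypothetical:

```python
import re
from collections import Counter

# Matches lines such as:
# 5608  23:14:47.667691 accept4(3, ..., SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000007>
LINE_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<syscall>\w+)\(.*\)\s+=\s+(?P<ret>-?\d+)(?:\s+(?P<errno>[A-Z]+))?"
)


def count_emfile_accepts(lines):
    """Return a Counter mapping pid -> number of accept4 calls that failed with EMFILE."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        # Unfinished or truncated lines do not match and are skipped.
        if m and m.group("syscall") == "accept4" and m.group("errno") == "EMFILE":
            counts[m.group("pid")] += 1
    return counts
```

A count that keeps growing while the process sits at full CPU is consistent with the descriptor limit being exhausted rather than with real connection load.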

lsof output:

dbus-daem  5973  5975           dbus 2132r      REG                0,3          0 4287922432 /proc/15792/cmdline
dbus-daem  5973  5975           dbus 2133r      REG                0,3          0 4287908724 /proc/16255/cmdline
dbus-daem  5973  5975           dbus 2134r      REG                0,3          0 4287917369 /proc/16086/cmdline
dbus-daem  5973  5975           dbus 2135r      REG                0,3          0 4287908726 /proc/16256/cmdline
dbus-daem  5973  5975           dbus 2136r      REG                0,3          0 4287917373 /proc/16087/cmdline
dbus-daem  5973  5975           dbus 2137r      REG                0,3          0 4287917373 /proc/16087/cmdline

[root@dhcp2-62 ]# 
cat 30-lsof.txt | grep dbus-daem | wc -l
4320
[root@dhcp2-62 ]# 
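
The lsof excerpt shows dbus-daemon holding read-only descriptors on /proc/<pid>/cmdline files, and the grep above counts 4320 matching lines. A hedged Python sketch, assuming lsof output with the column layout shown (file name as the last whitespace-separated field), that groups one command's entries by path pattern to show which kind of file dominates; `summarize_lsof` and the rule for collapsing numeric path components are illustrative choices, not part of any tool:

```python
import re
from collections import Counter


def summarize_lsof(lines, command_prefix="dbus-daem"):
    """Count lsof entries for one command, collapsing numeric path parts to <N>."""
    kinds = Counter()
    for line in lines:
        fields = line.split()
        if not fields or not fields[0].startswith(command_prefix):
            continue
        name = fields[-1]  # assumption: NAME is the last column and contains no spaces
        # /proc/15792/cmdline and /proc/16255/cmdline both become /proc/<N>/cmdline
        kinds[re.sub(r"/\d+(?=/|$)", "/<N>", name)] += 1
    return kinds
```

Calling `summarize_lsof(open("30-lsof.txt")).most_common(5)` on the file named above would list the most frequent path patterns first.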

Attached: sosreport, strace, and lsof output.

Comment 1 Rupesh Patel 2016-08-26 05:56:23 UTC
Created attachment 1194212
strace file

Comment 4 David King 2016-09-08 13:11:02 UTC
I have finally been able to get access to the Customer Portal to examine the case, and it seems there is already a patch in upstream dbus git that should fix the problem.

https://cgit.freedesktop.org/dbus/dbus/commit/?id=a548141b172a078dd0073d718da3fb655821860a

Comment 7 David King 2016-09-08 13:39:48 UTC
This also turns out to have already been fixed in RHEL 6.7 in bug 1118456 (ignore the later status changes in that bug, and only pay attention to the changes from 2015).

Comment 13 Rupesh Patel 2016-12-01 10:59:55 UTC
Customer mentioned he is still facing the issue with the test rpm too. I collected strace output and could still see those messages.

18676 21:37:40.778455 accept4(3, 0x7ffccec79290, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000007>
18676 21:37:40.778474 fcntl(-1, F_GETFD) = -1 EBADF (Bad file descriptor) <0.000006>
18676 21:37:40.778505 epoll_wait(4, {{EPOLLIN, {u32=3, u64=1262679204877565955}}}, 64, -1) = 1 <0.000006>
18676 21:37:40.778536 accept4(3, 0x7ffccec79290, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000006>
18676 21:37:40.778556 fcntl(-1, F_GETFD) = -1 EBADF (Bad file descriptor) <0.000006>
18676 21:37:40.778575 epoll_wait(4, {{EPOLLIN, {u32=3, u64=1262679204877565955}}}, 64, -1) = 1 <0.000006>
18676 21:37:40.778594 accept4(3, 0x7ffccec79290, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000006>
18676 21:37:40.778614 fcntl(-1, F_GETFD) = -1 EBADF (Bad file descriptor) <0.000006>
18676 21:37:40.778633 epoll_wait(4, {{EPOLLIN, {u32=3, u64=1262679204877565955}}}, 64, -1) = 1 <0.000007>
18676 21:37:40.778652 accept4(3, 0x7ffccec79290, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000007>
18676 21:37:40.778672 fcntl(-1, F_GETFD) = -1 EBADF (Bad file descriptor) <0.000006>
18676 21:37:40.778691 epoll_wait(4, {{EPOLLIN, {u32=3, u64=1262679204877565955}}}, 64, -1) = 1 <0.000007>
18676 21:37:40.778710 accept4(3, 0x7ffccec79290, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000007>
18676 21:37:40.778732 fcntl(-1, F_GETFD) = -1 EBADF (Bad file descriptor) <0.000006>
18676 21:37:40.778751 epoll_wait(4, {{EPOLLIN, {u32=3, u64=1262679204877565955}}}, 64, -1) = 1 <0.000006>
18676 21:37:40.778771 accept4(3, 0x7ffccec79290, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000007>
18676 21:37:40.778801 fcntl(-1, F_GETFD) = -1 EBADF (Bad file descriptor) <0.000006>
18676 21:37:40.778819 epoll_wait(4, {{EPOLLIN, {u32=3, u64=1262679204877565955}}}, 64, -1) = 1 <0.000006>
18676 21:37:40.778838 accept4(3, 0x7ffccec79290, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files) <0.000006>
18676 21:37:40.778858 fcntl(-1, F_GETFD) = -1 EBADF (Bad file descriptor) <0.000006>
18676 21:37:40.778876 epoll_wait(4,  <detached ...>
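
The timestamps in this trace give a rough measure of how fast the daemon is spinning. A minimal Python sketch, assuming the line format shown (pid, HH:MM:SS.ffffff timestamp, then the call) and a trace that does not cross midnight; `syscalls_per_second` is a hypothetical helper:

```python
import re

# Leading pid, then a wall-clock timestamp with fractional seconds.
TS_RE = re.compile(r"^\d+\s+(\d{2}):(\d{2}):(\d{2})\.(\d+)\s")


def syscalls_per_second(lines):
    """Estimate the syscall rate from the first and last timestamps, or None if unknown."""
    stamps = []
    for line in lines:
        m = TS_RE.match(line)
        if m:
            h, mi, s, frac = m.groups()
            stamps.append(int(h) * 3600 + int(mi) * 60 + int(s) + float("0." + frac))
    if len(stamps) < 2 or stamps[-1] <= stamps[0]:
        return None
    # N timestamps span N - 1 intervals.
    return (len(stamps) - 1) / (stamps[-1] - stamps[0])
```

Applied to the excerpt above, where consecutive calls are about 20 microseconds apart, this yields a rate in the tens of thousands of calls per second, which matches a busy loop rather than a blocked daemon.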

Comment 38 errata-xmlrpc 2018-04-10 12:52:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0765