RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 667835 - Dropped connections are not handled correctly in some cases
Summary: Dropped connections are not handled correctly in some cases
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: dbus
Version: 6.0
Hardware: Unspecified
OS: Unspecified
low
low
Target Milestone: rc
: ---
Assignee: David King
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-01-06 22:52 UTC by Lon Hohberger
Modified: 2016-01-05 18:58 UTC (History)
0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-01-05 18:58:47 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Fix (665 bytes, patch)
2011-01-06 22:53 UTC, Lon Hohberger
no flags Details | Diff
Test case (3.20 KB, text/plain)
2011-01-07 19:12 UTC, Lon Hohberger
no flags Details

Description Lon Hohberger 2011-01-06 22:52:04 UTC
Description of problem:

If the dbus-daemon dies or is killed, it's possible for certain clients to never notice.

Typically, you can figure out that the remote has closed its half of the connection by checking POLLHUP, but the dbus client library does not check for this, it only checks for POLLERR.


Version-Release number of selected component (if applicable): 1.2.24-3.el6

How reproducible: 100%


Steps to Reproduce:
1.  Connect an application which only emits signals.
2.  Kill dbus-daemon
3.  Emit a couple of signals in the client
4.  Check for dbus_connection_get_is_connected()
  
Actual results:

* Every subsequent signal emitted allocates memory, which could eventually run the system out of RAM.
* dbus_connection_get_is_connected() _never_ returns false.

Expected results:

* No memory leak
* Client low-level transport connection structures set to disconnected.  That is, "dbus_connection_get_is_connected()" should return FALSE after the dbus-daemon has died.

Additional info:

This can be worked around in clients by each one polling the dbus connection file descriptor and checking for POLLUP, but should not be necessary, and the dbus documentation expressly disdains doing this sort of thing.

The attached patch resolves the issue.

Comment 2 Lon Hohberger 2011-01-06 22:53:45 UTC
Created attachment 472155 [details]
Fix

Comment 3 RHEL Program Management 2011-01-07 16:01:36 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 4 Lon Hohberger 2011-01-07 19:12:15 UTC
Created attachment 472280 [details]
Test case

Compile:

  gcc -o dbus-bug dbus-bug.c `pkg-config dbus-1 --cflags` `pkg-config dbus-1 --libs` -D_GNU_SOURCE

usage:

  ./dbus-bug [w]

  (if argv[1] is present, it enables a poll() workaround which is similar
   to the patch provided)

Expected behavior:

   Program should exit after dbus-daemon is killed.

Behavior with unpatched dbus-1.2.24-3.el6:

   Program never exits.

Behavior with patched dbus-1.2.24-3.el6:

   Program exits after the dbus-daemon has been killed.

Behavior in both cases when workaround is enabled:

   Program exits after the dbus-daemon has been killed.

We can use the workaround as provided in this test case if required, however, since we'll have to duplicate it several times (once in each daemon), it makes more sense to fix the dbus library.

Comment 5 Lon Hohberger 2011-01-07 19:15:37 UTC
BTW, if you comment out the following line in dbus-bug.c:

    dbus_connection_set_exit_on_disconnect(dbc, FALSE);

... the behavior of the program does not change in any test case.

Comment 6 Lon Hohberger 2011-01-07 20:00:00 UTC
dbus_connection_flush() detects dbus-daemon's death in a timely manner as well.

Comment 7 Lon Hohberger 2011-01-07 20:22:51 UTC
http://dbus.freedesktop.org/doc/api/html/group__DBusConnection.html#ga580d8766c23fe5f49418bc7d87b67dc6

^^ Also can be used in a thread to detect connection death.

Comment 8 Colin Walters 2011-01-07 21:02:50 UTC
(In reply to comment #6)
> dbus_connection_flush() detects dbus-daemon's death in a timely manner as well.

If you're not using a mainloop to talk to dbus (as "dbus-bug" isn't), then I think the right thing to do is call dbus_connection_flush() after you're done with any dbus_connection_send()s.  This would be line 148 of dbus-bug.c.

The docs for dbus_connection_send() do say that a flush() is unnecessary *if* you're using a mainloop, but they don't make it clear that it really *is* necessary if you're not, otherwise the state is sort of undefined.

I'm not entirely sure about your patch; it smells right but I'd need to truly understand what it could possibly change.

Can you confirm if you add the dbus_connection_flush() that the test case works as expected?

Comment 9 Lon Hohberger 2011-01-10 23:06:05 UTC
dbus_connection_flush() works, as noted in comment #6.

We went the other route for now, adding a thread which calls dbus_connection_read_write() in a loop, which also picks up if the dbus-daemon dies.

I assume both at some level do some sort of non-POLLHUP error checking if the dbus-daemon died (ex: "select() said there was data ready to read, but read returned 0 -> oops, connection died").  I didn't investigate that much.

So, this might be a case of WONTFIX - but in talking with others, not checking POLLHUP seemed to be a bug, so I left this open.

dbus_connection_send() sends a message immediately if it can, and only queues it if it can't send it immediately (by calling poll()).  When it goes to try to send, it calls poll().

If we check POLLHUP, we can immediately detect the connection is dead instead of queueing, waiting for the other thread to wake up do its error detection (which I still need to look at) and handling it later.


Note You need to log in before you can comment on or make changes to this bug.