Bug 2161915 - NetworkManager stops when dbus.service is restarted
Summary: NetworkManager stops when dbus.service is restarted
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: NetworkManager
Version: 9.3
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Beniamino Galvani
QA Contact: Matej Berezny
Docs Contact: Jaroslav Klech
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-01-18 09:33 UTC by Renaud Métrich
Modified: 2023-11-07 10:12 UTC (History)

Fixed In Version: NetworkManager-1.43.6-1.el9
Doc Type: Enhancement
Doc Text:
.The `NetworkManager` service restarts immediately after the `dbus` service is restarted
Previously, after restarting `dbus` for some reason, `NetworkManager` stopped. This behavior was not optimal and caused a loss of connectivity. Therefore, this enhancement updates `NetworkManager` to become more robust and to make it restart automatically upon a `dbus` restart.
Clone Of:
Environment:
Last Closed: 2023-11-07 08:37:57 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | NMT-117 | 0 | None | None | None | 2023-01-22 14:16:43 UTC
Red Hat Issue Tracker | RHELPLAN-145503 | 0 | None | None | None | 2023-01-18 09:33:50 UTC
Red Hat Product Errata | RHBA-2023:6585 | 0 | None | None | None | 2023-11-07 08:38:29 UTC
freedesktop.org Gitlab | NetworkManager NetworkManager-ci merge_requests 1414 | 0 | None | closed | general: added nm_bind_to_dbus test | 2023-08-03 07:56:43 UTC
freedesktop.org Gitlab | NetworkManager NetworkManager merge_requests 1605 | 0 | None | merged | systemd: add "BindsTo=dbus.service" to NetworkManager.service | 2023-04-19 07:49:22 UTC

Description Renaud Métrich 2023-01-18 09:33:27 UTC
Description of problem:

When restarting dbus.service unit, (due to failure, patching, etc.), we can see that the NetworkManager.service unit just stops, causing network connectivity to be lost when the DHCP lease expires.

Since NetworkManager has a hard dependency on dbus, it should restart or, better, reconnect.

Version-Release number of selected component (if applicable):

NetworkManager-1.40.0-5.el8_7.x86_64

How reproducible:

Always

Steps to Reproduce:
1. Restart dbus.service unit

Actual results:

NetworkManager.service stops

Expected results:

NetworkManager.service still active
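
The reproduction step above can be sketched as a small shell function. This is only an illustration, not part of the original report: the function name and the `SETTLE_SECONDS` variable are hypothetical, it needs root on a systemd host, and restarting dbus.service may disturb other services.

```shell
# Sketch: restart dbus.service, then report whether NetworkManager survived.
# check_nm_after_dbus_restart and SETTLE_SECONDS are hypothetical names.
check_nm_after_dbus_restart() {
    systemctl restart dbus.service || return 2
    # Give systemd a moment to process state changes of dependent units.
    sleep "${SETTLE_SECONDS:-2}"
    if systemctl is-active --quiet NetworkManager.service; then
        echo "NetworkManager.service is active"
    else
        echo "NetworkManager.service is not active"
        return 1
    fi
}
```

On an affected system, calling `check_nm_after_dbus_restart` would be expected to take the "not active" branch.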

Comment 3 Thomas Haller 2023-01-25 07:17:54 UTC
Much software that uses D-Bus does not handle a restart of the D-Bus daemon, because doing so adds a lot of complexity (it's really convenient to just rely on D-Bus working *all the time* and not handle the case that it could fail at any time). NetworkManager doesn't handle that either, so it needs to be restarted when the D-Bus daemon gets restarted. Theoretically, reconnecting could be implemented, but it's high effort for little use.

(upstream patches welcome though).

> When restarting dbus.service unit, (due to failure, patching, etc.)

dbus.service is a central part of the system. If you restart it, you must restart most services that use D-Bus. That basically amounts to a full reboot, because so many services are affected.

That's no different from the kernel. If the kernel fails or requires an update, you also need to reboot (kernel live patching aside, which is not available to user-space processes and is also limited for the kernel).


Splitting a part of D-Bus out into a user-space process has advantages and disadvantages (see kdbus). One advantage could be that the D-Bus part is independent and you could restart it on failure, a bit like a microkernel approach, where the kernel is split into individual services that may fail and restart independently. So this seems like something that would be nice, but it's then the dbus-daemon's responsibility to restart/recover in a transparent manner. It cannot be that every client is affected by the restart and needs to implement recovery (at significant complexity). IMO, if the restart case is important, then dbus-broker needs to handle it transparently, without requiring every client to handle the complexity of implementing it: https://github.com/bus1/dbus-broker/issues/93 .



Maybe if you list exactly why you restart dbus.service, we could suggest how to do it or what to do otherwise. The likely solution will probably be to reboot. If you restart because of a package update, then the dbus-daemon package should be mostly boring and not update frequently (when it does, reboot). If you restart because of bugs in dbus.service, the right solution is to fix the bugs. If you restart because dbus.service is running out of resources/memory, you probably want to allocate enough resources to dbus.service for that not to happen (and/or rate limit how much clients -- based on the user ID -- can use).


This is a WONTFIX.

Comment 6 Thomas Haller 2023-01-26 08:31:09 UTC
> this is not important for the customer, I just found that when "playing" with dbus.
> ...
> "Fix the bugs" is easy to say, but having to reboot a production system several times in order to troubleshoot a potential bug is not.

Is this just hypothetically speaking? Or what was the real-world issue (where a restart of dbus.service was necessary and a reboot was not a good option)?

Btw, the (current) workaround is `systemctl restart dbus.service NetworkManager.service`. Also, NetworkManager is probably not the only offender and other services might also require a restart. By restarting dbus.service, you might introduce other issues that interfere with "troubleshoot a potential bug". I wouldn't trust the system after `systemctl restart dbus.service` to show the bug I am hunting. I would just reboot. Of course, in production that may be problematic either way.

> If you cannot fix NetworkManager properly (from high look it's just a matter of reconnecting)

"Cannot" is too strong a word. It seems nontrivial, while also not really necessary.

"BindsTo=dbus.service" is probably a good idea in the meantime!


My point is that restarting dbus.service is not something that you would usually do (I don't recall ever needing that myself). So the simple answer to this issue is "just don't do it -- reboot if you really need to". It would be interesting to learn where this is used in practice. Thank you!
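
Until a package carrying "BindsTo=dbus.service" is installed, the same setting could be tried locally through a systemd drop-in. The following is a minimal sketch under assumptions, not the shipped fix: the function name and the file name `10-bind-to-dbus.conf` are hypothetical, and the `After=dbus.service` line is an assumption added here because `man systemd.unit` describes BindsTo= as stricter when combined with After= on the same unit.

```shell
# Sketch: write a drop-in that binds NetworkManager.service to dbus.service.
# write_bindsto_dropin and 10-bind-to-dbus.conf are hypothetical names.
write_bindsto_dropin() {
    dropin_dir="${1:-/etc/systemd/system/NetworkManager.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # The heredoc body and its terminator must start at column 0.
    cat > "$dropin_dir/10-bind-to-dbus.conf" <<'EOF'
[Unit]
BindsTo=dbus.service
After=dbus.service
EOF
}
```

On a real system, running `write_bindsto_dropin` as root and then `systemctl daemon-reload` would make systemd pick up the drop-in.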

Comment 7 Till Maas 2023-03-01 18:57:32 UTC
Since Thomas agrees to add BindsTo, I am also fine with adding this. Any additional work does not seem to have the impact that justifies the effort. So proposed acceptance criteria:

Given a RHEL system
When running grep BindsTo=dbus.service /usr/lib/systemd/system/NetworkManager.service
Then BindsTo=dbus.service is shown
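
The grep-based criterion above could be wrapped as a small check. This is a sketch only: the function name is hypothetical, and the pattern is slightly stricter than a plain grep so that commented-out lines and similarly named units do not match.

```shell
# Sketch: succeed if the unit file has an active BindsTo= line naming dbus.service.
# has_bindsto_dbus is a hypothetical helper name.
has_bindsto_dbus() {
    unit_file="${1:-/usr/lib/systemd/system/NetworkManager.service}"
    # Anchored at line start, so "#BindsTo=..." is ignored; dbus.service may
    # appear alone or as one entry in a space-separated list.
    grep -Eq '^BindsTo=(.*[[:space:]])?dbus\.service([[:space:]]|$)' "$unit_file"
}
```

Usage: `has_bindsto_dbus && echo "BindsTo=dbus.service is set"`. This only inspects the packaged unit file; settings that come from drop-ins would not be seen by it.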

Comment 9 Thomas Haller 2023-04-14 07:51:25 UTC
(In reply to Till Maas from comment #7)
> Since Thomas agrees to add BindsTo, I am also fine with adding this. Any
> additional work does not seem to have the impact that justifies the effort.
> So proposed acceptance criteria:
> 
> Given a RHEL system
> When running grep BindsTo=dbus.service
> /usr/lib/systemd/system/NetworkManager.service
> Then BindsTo=dbus.service is shown

I would rather say:

1)
Given NetworkManager running on a systemd system
Given NetworkManager cannot handle reconnecting to the dbus daemon
When dbus daemon is stopped
Then NetworkManager is also (automatically) stopped


2)
Given NetworkManager running on a systemd system
Given NetworkManager cannot handle reconnecting to the dbus daemon
When dbus daemon is restarted
Then NetworkManager is also (automatically) restarted

(*) "dbus daemon" can refer to both dbus-daemon and dbus-broker implementations

(I am not sure whether BindsTo= can handle a restart as described in 2). I don't read that from `man systemd.unit`.)
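
Criterion 2) above could be checked by comparing the NetworkManager main PID before and after restarting the dbus daemon. This is a hedged sketch, not the test that was added to NetworkManager-ci: the function name and `SETTLE_SECONDS` are hypothetical, and the fixed sleep is a crude stand-in for waiting until the restart job has finished.

```shell
# Sketch: succeed only if NetworkManager runs with a new main PID after
# dbus.service was restarted. Names here are hypothetical.
nm_restarted_with_dbus() {
    before=$(systemctl show -p MainPID --value NetworkManager.service)
    # MainPID is 0 when the unit has no running main process.
    [ "$before" != "0" ] || return 2
    systemctl restart dbus.service || return 2
    sleep "${SETTLE_SECONDS:-5}"
    after=$(systemctl show -p MainPID --value NetworkManager.service)
    [ "$after" != "0" ] && [ "$after" != "$before" ]
}
```

A return status of 1 would mean NetworkManager either stopped (PID 0) or kept its old process; status 2 means the precondition or the dbus restart itself failed.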

Comment 19 errata-xmlrpc 2023-11-07 08:37:57 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6585

