Bug 1934291

Summary: wpa_supplicant.service stops with systemctl isolate multi-user.target runlevel change
Product: Red Hat Enterprise Linux 8 Reporter: Curtis Taylor <cutaylor>
Component: NetworkManager Assignee: NetworkManager Development Team <nm-team>
Status: CLOSED ERRATA QA Contact: Desktop QE <desktop-qa-list>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 8.3 CC: acardace, atragler, bgalvani, dtardon, fge, fpokryvk, jmaxwell, lrintel, rkhan, rvr, sukulkar, systemd-maint-list, till, tpelka, vbenes
Target Milestone: rc Keywords: Triaged
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-11-09 19:29:43 UTC Type: Bug
Bug Blocks: 1935910    

Description Curtis Taylor 2021-03-02 20:59:46 UTC
Description of problem:
Going from graphical.target to multi-user.target results in wpa_supplicant.service being stopped.

Version-Release number of selected component (if applicable):
wpa_supplicant-2.9-2.el8.x86_64

How reproducible:
Reproduces in VM easily with fresh RHEL8.3 installation.

Steps to Reproduce:
1. From graphical.target with wpa_supplicant running
# systemctl status wpa_supplicant.service 
wpa_supplicant.service - WPA supplicant
   Loaded: loaded (/usr/lib/systemd/system/wpa_supplicant.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-02-09 13:22:59 CET; 59s ago
 Main PID: 1578 (wpa_supplicant)
    Tasks: 1 (limit: 49320)
   Memory: 5.8M
   CGroup: /system.slice/wpa_supplicant.service
           └─1578 /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -u -s -O /var/run/wpa_supplicant

2. systemctl isolate multi-user.target

3. Observe that wpa_supplicant has been stopped

Actual results:
# systemctl status wpa_supplicant.service 
wpa_supplicant.service - WPA supplicant
   Loaded: loaded (/usr/lib/systemd/system/wpa_supplicant.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

Expected results:
# systemctl status wpa_supplicant.service 
wpa_supplicant.service - WPA supplicant
   Loaded: loaded (/usr/lib/systemd/system/wpa_supplicant.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-02-09 13:22:59 CET; 59s ago
 Main PID: 1578 (wpa_supplicant)
    Tasks: 1 (limit: 49320)
   Memory: 5.8M
   CGroup: /system.slice/wpa_supplicant.service
           └─1578 /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -u -s -O /var/run/wpa_supplicant

Additional info:
Enabling IgnoreOnIsolate, with a wpa_supplicant.service override, solves the issue and seems like it should be the default in wpa_supplicant.service.

  # systemctl edit wpa_supplicant.service  
  [Unit]
  IgnoreOnIsolate=true

  # systemctl show wpa_supplicant.service | grep OnIsolate
  IgnoreOnIsolate=yes
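
The same override can also be written without the interactive editor. The following is a minimal sketch, not a recommended fix: the helper name write_isolate_dropin and the file name ignore-on-isolate.conf are made up for this example, and the drop-in directory is assumed to be the usual /etc/systemd/system/<unit>.d location.

```sh
#!/bin/sh
# Sketch: write an IgnoreOnIsolate drop-in for UNIT under DROPIN_ROOT.
# DROPIN_ROOT defaults to /etc/systemd/system (assumed standard location);
# the file name "ignore-on-isolate.conf" is an arbitrary choice.
write_isolate_dropin() {
    unit=$1
    root=${2:-/etc/systemd/system}
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    printf '[Unit]\nIgnoreOnIsolate=true\n' > "$dir/ignore-on-isolate.conf"
    # Print the path of the file that was written.
    echo "$dir/ignore-on-isolate.conf"
}
```

As root, a call such as `write_isolate_dropin wpa_supplicant.service && systemctl daemon-reload` would apply it, and `systemctl show -p IgnoreOnIsolate wpa_supplicant.service` can be used to confirm the setting.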

Comment 1 Beniamino Galvani 2021-03-03 09:04:15 UTC
The current behavior seems correct because wpa_supplicant is disabled:

 $ systemctl status wpa_supplicant
 ● wpa_supplicant.service - WPA supplicant
    Loaded: loaded (/usr/lib/systemd/system/wpa_supplicant.service; disabled; vendor preset: disabled)

wpa_supplicant is started in the graphical target only because another
program activates it via D-Bus:

 dbus-daemon[947]: [system] Activating via systemd: service name='fi.w1.wpa_supplicant1' unit='wpa_supplicant.service' requested by ':1.115' (uid=997 pid=2330 comm="/usr/libexec/geoclue " label="system_u:system_r:geoclue_t:s0")

GeoClue is started when GNOME is running to provide geographical
awareness and for some reason it needs wpa_supplicant. If you are
relying on this implicit dependency to have wpa_supplicant started
automatically, this seems wrong.
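
To find out which program triggered the activation on a given system, the dbus-daemon log lines can be filtered. This is a rough sketch that assumes the log format matches the line quoted above; the function name activation_requesters is made up for this example.

```sh
#!/bin/sh
# Sketch: read log lines on stdin and print
# "<unit> <- <requesting command>" for each systemd activation.
# Assumes the dbus-daemon message format shown in this comment.
activation_requesters() {
    sed -n "s/.*Activating via systemd:.*unit='\([^']*\)'.*comm=\"\([^\" ]*\).*/\1 <- \2/p"
}
```

For example, `journalctl -b | activation_requesters` lists the activations logged since the current boot.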

Another program that might activate wpa_supplicant is NetworkManager,
but only when a Wi-Fi device is found. So, if you isolate the
multi-user target and there is a Wi-Fi device managed by NM,
wpa_supplicant will be active.

To have wpa_supplicant always running even if not requested by another
program (like NM), you need to enable the service explicitly:

 # systemctl enable wpa_supplicant

Comment 2 Beniamino Galvani 2021-03-03 09:32:58 UTC
Okay, I think the root issue is that an ethernet 802.1X connection active in graphical.target breaks when switching to multi-user.target because wpa_supplicant gets stopped. I'm not sure if this is a systemd bug (which should not stop wpa_supplicant because it was started by a unit that is still active in the new target) or NM's (as it should restart wpa_supplicant when it gets killed externally).

Also enabling IgnoreOnIsolate could be a solution, but I'm worried that perhaps this would lead to situations where wpa_supplicant is running when it's not expected to.

Comment 3 Curtis Taylor 2021-03-04 01:30:10 UTC
(In reply to Beniamino Galvani from comment #2)
> Also enabling IgnoreOnIsolate could be a solution, but I'm worried that
> perhaps this would lead to situations where wpa_supplicant is running when
> it's not expected to.

Would not IgnoreOnIsolate just rely upon the application which started wpa_supplicant to handle stopping it when necessary and not actually result in wpa_supplicant running when not expected?

My thinking is:

service A1 starts wpa_supplicant
systemd target is changed but service A1 remains in the destination target
  wpa_supplicant is still required but stops if IgnoreOnIsolate is not set.
systemd target is changed to a target where A1 is stopped
  wpa_supplicant is stopped by A1 when A1 stops or else it's a bug in A1 for not stopping what it started

Therefore I believe the fix is for wpa_supplicant to enable IgnoreOnIsolate in the wpa_supplicant.service it provides.

Comment 4 Curtis Taylor 2021-03-04 01:34:34 UTC
... or else if we are thinking about "what if something else starts wpa_supplicant", then in that case A1 is not NetworkManager and the same argument, that the bug is A1 not stopping what it started, still applies.

The way to keep wpa_supplicant running if it is supplying a service that is not enabled by default is to have it provide a service with IgnoreOnIsolate.

Comment 5 Beniamino Galvani 2021-03-09 07:46:41 UTC
> Would not IgnoreOnIsolate just rely upon the application which
> started wpa_supplicant to handle stopping it when necessary and not
> actually result in wpa_supplicant running when not expected?

wpa_supplicant is D-Bus activatable, which means that an application
doesn't explicitly have to start it. The calling application (e.g. NM)
just needs to send a D-Bus request (ignoring whether wpa_supplicant is
actually running or not), and systemd will take care of starting the
service if needed.

Therefore, the calling application doesn't need to care about stopping
wpa_supplicant (and it wouldn't have a way to do it even if it
wanted).
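
For illustration, activation can also be requested by hand through the bus driver's StartServiceByName method, which takes a bus name and a flags value. This is only a sketch of the mechanism, not how NM does it; the helper name activate_by_name is made up for this example.

```sh
#!/bin/sh
# Sketch: ask the D-Bus daemon on the system bus to start the service
# that owns BUS_NAME.
# StartServiceByName takes a string (name) and a uint32 (flags, 0 here).
activate_by_name() {
    busctl call org.freedesktop.DBus /org/freedesktop/DBus \
        org.freedesktop.DBus StartServiceByName su "$1" 0
}
```

Running `activate_by_name fi.w1.wpa_supplicant1` as root should start wpa_supplicant.service if it is not already running. Note that nothing in this exchange records who asked for the service.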

> My thinking is:
>
> service A1 starts wpa_supplicant
> systemd target is changed but service A1 remains in the destination target
>   wpa_supplicant is still required but stops if IgnoreOnIsolate is not set.

Normally, if there is an 802.1X connection active and wpa_supplicant
stops, NM would restart it. This didn't happen for connections with
the property 802-1x.optional set to 'yes' (as in the attached case)
due to a bug. The fix for this bug is:

 https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/776

With this commit, wpa_supplicant would still be stopped on isolate
(when it's not explicitly enabled), the connection profile would
reconnect and NM would start wpa_supplicant again.
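
To check whether a given profile falls into the affected case, the 802-1x.optional property can be read with nmcli. A minimal sketch follows; the helper name is_8021x_optional is made up for this example, and the profile name passed to it is whatever the connection is called locally.

```sh
#!/bin/sh
# Sketch: succeed if the connection profile PROFILE has
# 802-1x.optional set to "yes".
# Returns 2 if nmcli itself fails (e.g. unknown profile).
is_8021x_optional() {
    value=$(nmcli -g 802-1x.optional connection show "$1") || return 2
    [ "$value" = "yes" ]
}
```

Usage would look like `is_8021x_optional "my-wired-profile" && echo "optional 802.1X"`, where "my-wired-profile" is a placeholder.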

> systemd target is changed to a target where A1 is stopped
>   wpa_supplicant is stopped by A1 when A1 stops or else it's a bug in A1 for not stopping what it started

> Therefore I believe the fix is for wpa_supplicant to enable
> IgnoreOnIsolate in the wpa_supplicant.service it provides.

I am not totally sure about this. If IgnoreOnIsolate is the right
solution, then every D-Bus activatable service should have it, so that
the service survives a 'systemctl isolate'.

Comment 6 Beniamino Galvani 2021-03-09 07:48:14 UTC
I'm reassigning this bz to the systemd team to get their
opinion on this topic. The question is the following:

We have service A enabled and service B disabled. Service A activates
service B via D-Bus in graphical.target. Then the user does a
'systemctl isolate multi-user.target'. The result is that service B
gets stopped (because it is disabled), while service A is kept
running.

Shouldn't systemd somehow know that B was started by A and keep it
running in the new target? If not, how do you suggest to fix this
situation?

Should service B have IgnoreOnIsolate=yes? Or instead service A should
notice that B is gone and it should restart it?

Comment 7 David Tardon 2021-03-11 10:05:23 UTC
(In reply to Beniamino Galvani from comment #6)
> We have service A enabled and service B disabled. Service A activates
> service B via D-Bus in graphical.target. Then the user does a
> 'systemctl isolate multi-user.target'. The result is that service B
> gets stopped (because it is disabled), while service A is kept
> running.

Well, it works as expected. The best way to avoid this problem is to not use isolate. It's a dangerous command.

> 
> Shouldn't systemd somehow know that B was started by A and keep it
> running in the new target? If not, how do you suggest to fix this
> situation?

No, it shouldn't. systemd doesn't track any runtime relations between services, like a service starting another service via D-Bus or using "systemctl start". All it cares about are specified dependencies between units (which may be added at runtime, btw).
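
Since isolate keeps only the target and its dependencies (plus units with IgnoreOnIsolate=yes), one way to predict whether a unit will survive is to look for it in the target's dependency tree. A rough sketch, assuming the helper name unit_in_target_tree, which is made up for this example:

```sh
#!/bin/sh
# Sketch: succeed if UNIT appears in the dependency tree of TARGET.
# --all expands dependencies of non-target units as well.
unit_in_target_tree() {
    target=$1
    unit=$2
    systemctl list-dependencies --plain --all "$target" |
        awk -v u="$unit" '
            { for (i = 1; i <= NF; i++) if ($i == u) found = 1 }
            END { exit !found }'
}
```

For example, `unit_in_target_tree multi-user.target wpa_supplicant.service || echo "would be stopped on isolate"`.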

> Should service B have IgnoreOnIsolate=yes?

I'm not sure that's appropriate here. IgnoreOnIsolate=yes should be used for units that provide basic system functionality (to ensure that the system doesn't break after isolate).

> Or instead service A should
> notice that B is gone and it should restart it?

This looks like a reasonable solution to me.
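
The "notice that B is gone and restart it" idea can be illustrated with a simple check-then-start helper. This is only a sketch of the pattern, not the change made in NetworkManager; the helper name restart_if_gone is made up for this example, and both arguments are command strings supplied by the caller.

```sh
#!/bin/sh
# Sketch: run START_CMD only when CHECK_CMD reports failure.
# Prints "running" or "restarting" so the outcome is visible.
restart_if_gone() {
    check_cmd=$1
    start_cmd=$2
    if sh -c "$check_cmd" >/dev/null 2>&1; then
        echo "running"
    else
        echo "restarting"
        sh -c "$start_cmd"
    fi
}
```

With systemd this could be called as `restart_if_gone 'systemctl is-active --quiet wpa_supplicant.service' 'systemctl start wpa_supplicant.service'`.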

Comment 8 Beniamino Galvani 2021-03-15 14:13:01 UTC
(In reply to David Tardon from comment #7)
> > Should service B have IgnoreOnIsolate=yes?
> 
> I'm not sure that's appropriate here. IgnoreOnIsolate=yes should be used for
> units that provide basic system functionality (to ensure that the system
> doesn't break after isolate).
> 
> > Or instead service A should
> > notice that B is gone and it should restart it?
> 
> This looks like a reasonable solution to me.

Okay, thanks. This is implemented by [1].

[1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/776

Comment 14 errata-xmlrpc 2021-11-09 19:29:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: NetworkManager security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4361