Bug 1428379 - start lightdm after dbus
Summary: start lightdm after dbus
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora EPEL
Classification: Fedora
Component: lightdm
Version: epel7
Hardware: All
OS: All
unspecified
medium
Target Milestone: ---
Assignee: Rex Dieter
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-03-02 13:02 UTC by Andrey Arapov
Modified: 2018-02-02 16:56 UTC (History)
4 users (show)

Fixed In Version: lightdm-1.25.1-5.fc27 lightdm-1.25.1-5.fc26
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-01-28 21:32:33 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
Patch to make sure lightdm waits for the dbus.service (1.26 KB, patch)
2017-03-02 13:02 UTC, Andrey Arapov
no flags Details | Diff
HUP the dbus-daemon to claim the lightdm service name (941 bytes, patch)
2017-04-26 12:35 UTC, Andrey Arapov
no flags Details | Diff

Description Andrey Arapov 2017-03-02 13:02:00 UTC
Created attachment 1259155 [details]
Patch to make sure lightdm waits for the dbus.service

Description of problem:
In some conditions, lightdm is failing to start exiting with this error:
"Failed to use bus name org.freedesktop.DisplayManager, do you have appropriate permissions?"

Version-Release number of selected component (if applicable):
lightdm-1.10.6-4.el7

How reproducible:
Start lightdm before dbus.service has fully initialized.


Actual results:
"Failed to use bus name org.freedesktop.DisplayManager, do you have appropriate permissions?"

Expected results:
Lightdm is up and running.


Potential patch:

diff -Nur lightdm-1.10.6-orig/lightdm.service lightdm-1.10.6/lightdm.service
--- lightdm-1.10.6-orig/lightdm.service	2016-05-02 20:07:33.000000000 +0200
+++ lightdm-1.10.6/lightdm.service	2017-03-02 13:54:46.902140868 +0100
@@ -2,7 +2,7 @@
 Description=Light Display Manager
 Documentation=man:lightdm(1)
 Conflicts=getty
-After=systemd-user-sessions.service getty plymouth-quit.service livesys-late.service
+After=systemd-user-sessions.service getty dbus.service plymouth-quit.service livesys-late.service
 
 [Service]
 ExecStart=/usr/sbin/lightdm
diff -Nur lightdm-1.10.6-orig/lightdm.spec lightdm-1.10.6/lightdm.spec
--- lightdm-1.10.6-orig/lightdm.spec	2016-05-02 20:07:40.000000000 +0200
+++ lightdm-1.10.6/lightdm.spec	2017-03-02 13:58:08.790177915 +0100
@@ -10,7 +10,7 @@
 Name:    lightdm
 Summary: A cross-desktop Display Manager
 Version: 1.10.6
-Release: 4%{?dist}
+Release: 5%{?dist}
 
 # library/bindings are LGPLv2 or LGPLv3, the rest GPLv3+
 License: (LGPLv2 or LGPLv3) and GPLv3+
@@ -290,6 +290,9 @@
 
 
 %changelog
+* Thu Mar 02 2017 Andrey Arapov <andrey.arapov> 1.10.6-5
+- start lightdm after dbus
+
 * Mon May 02 2016 Rex Dieter <rdieter> 1.10.6-4
 - Error connecting to XServer via ipv6 (1322775)

Comment 1 Rex Dieter 2017-03-02 14:01:20 UTC
As far as I can tell, no other DM includes 
After=dbus.service
either.  I suspect
* if there is a problem with ordering, it lies elsewhere
or
* this problem is more widespread than just lightdm

Comment 2 Andrey Arapov 2017-03-02 14:23:25 UTC
Rex,

I was presuming that since [Service] part of the lightdm.service file contains:
BusName=org.freedesktop.DisplayManager

which is why it needs the dbus.service to be initialized. If fails, gives "Failed to use bus name org.freedesktop.DisplayManager".

In our lab, it turns out that change does not fully fixes the problem.
Either dbus does not have enough time to fully initialize (that even systemd doesn't help) or it really lies elsewhere...

Would you have some clues, that we could try?

Thanks,
Andrey Arapov

Comment 3 Andrey Arapov 2017-03-02 14:29:36 UTC
Upd: we've also got:

```
WARNING **: Failed to create IPv6 VNC socket: Error binding to address: Invalid argument
```

looks like IPv6 is hardcoded and there is no switch not to use it .. :/

lightdm-1.10.6$ less ./src/vnc-server.c

```
    server->priv->socket6 = open_tcp_socket (G_SOCKET_FAMILY_IPV6, server->priv->port, server->priv->listen_address, &error);
    if (error)
        g_warning ("Failed to create IPv6 VNC socket: %s", error->message);
    g_clear_error (&error);
```

Our lightdm.conf config is pretty simple as :

```
[LightDM]
start-default-seat=false

[SeatDefaults]
session-wrapper=/etc/X11/xinit/Xsession
greeter-show-manual-login=true
greeter-hide-users=true
allow-guest=false

[VNCServer]
enabled=true
command=Xvnc -SecurityTypes=none
port=5901
listen-address=127.0.0.1
width=1024
height=768
depth=24
```

Comment 4 Andrey Arapov 2017-03-02 14:39:32 UTC
Upd2: likely ``WARNING: Failed to create IPv6 VNC socket: Error binding to address: Invalid argument`` is safe one, since lightdm is able to run even while this message occurs. Important, is when "Failed to use bus name org.freedesktop.DisplayManager" occurs -> then lightdm fails.

Comment 5 Rex Dieter 2017-03-02 15:33:30 UTC
Good questions for lightdm upstream folks (I don't know, sorry),
https://lists.freedesktop.org/mailman/listinfo/lightdm

Comment 6 Andrey Arapov 2017-03-03 09:19:03 UTC
Hello,

Here is the confirmation of the observed behavior (mentioned in the top of the thread, that lightdm doesn't properly wait for the dbus and fails fast):

1. Systemd throws non-zero exit code causing trouble for puppet playbook to continue the execution chain. (which relies on 0 RC)

Of course, there must be a way to either always return 0 RC, or maybe adjust systemd's Type= so that systemd is really waiting for 0 return code after few restarts.

```
Error: Could not start Service[lightdm]: Execution of '/bin/systemctl start lightdm' returned 1: Job for lightdm.service failed because the control process exited with error code. See "systemctl status lightdm.service" and "journalctl -xe" for details.
Wrapped exception:
Execution of '/bin/systemctl start lightdm' returned 1: Job for lightdm.service failed because the control process exited with error code. See "systemctl status lightdm.service" and "journalctl -xe" for details.
Error: /Service[lightdm]/ensure: change from stopped to running failed: Could not start Service[lightdm]: Execution of '/bin/systemctl start lightdm' returned 1: Job for lightdm.service failed because the control process exited with error code. See "systemctl status lightdm.service" and "journalctl -xe" for details.
```



2. Here is the proof that lightdm starts only at the 4th attempt by systemd.

We have slightly customized lightdm.service, by adding Restart=on-failure, RestartSec=5.

```
*** journalctl -xe --no-pager ***
-- Logs begin at Thu 2017-03-02 17:10:40 CET, end at Thu 2017-03-02 17:13:28 CET. --
Mar 02 17:12:56 rhel7host systemd[1]: Starting Light Display Manager...
-- Subject: Unit lightdm.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit lightdm.service has begun starting up.
Mar 02 17:12:57 rhel7host lightdm[12665]: ** (lightdm:12665): WARNING **: Failed to create IPv6 VNC socket: Error binding to address: Invalid argument
Mar 02 17:12:57 rhel7host lightdm[12665]: Failed to use bus name org.freedesktop.DisplayManager, do you have appropriate permissions?
Mar 02 17:12:57 rhel7host systemd[1]: lightdm.service: main process exited, code=exited, status=1/FAILURE
Mar 02 17:12:57 rhel7host systemd[1]: Failed to start Light Display Manager.
-- Subject: Unit lightdm.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit lightdm.service has failed.
-- 
-- The result is failed.
Mar 02 17:12:57 rhel7host systemd[1]: Unit lightdm.service entered failed state.
Mar 02 17:12:57 rhel7host systemd[1]: lightdm.service failed.
Mar 02 17:13:02 rhel7host systemd[1]: lightdm.service holdoff time over, scheduling restart.
Mar 02 17:13:02 rhel7host systemd[1]: Starting Light Display Manager...
-- Subject: Unit lightdm.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit lightdm.service has begun starting up.
Mar 02 17:13:02 rhel7host lightdm[12782]: ** (lightdm:12782): WARNING **: Failed to create IPv6 VNC socket: Error binding to address: Invalid argument
Mar 02 17:13:02 rhel7host lightdm[12782]: Failed to use bus name org.freedesktop.DisplayManager, do you have appropriate permissions?
Mar 02 17:13:02 rhel7host systemd[1]: lightdm.service: main process exited, code=exited, status=1/FAILURE
Mar 02 17:13:02 rhel7host systemd[1]: Failed to start Light Display Manager.
-- Subject: Unit lightdm.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit lightdm.service has failed.
-- 
-- The result is failed.
Mar 02 17:13:02 rhel7host systemd[1]: Unit lightdm.service entered failed state.
Mar 02 17:13:02 rhel7host systemd[1]: lightdm.service failed.
Mar 02 17:13:07 rhel7host systemd[1]: lightdm.service holdoff time over, scheduling restart.
Mar 02 17:13:07 rhel7host systemd[1]: Starting Light Display Manager...
-- Subject: Unit lightdm.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit lightdm.service has begun starting up.
Mar 02 17:13:07 rhel7host lightdm[12881]: ** (lightdm:12881): WARNING **: Failed to create IPv6 VNC socket: Error binding to address: Invalid argument
Mar 02 17:13:07 rhel7host lightdm[12881]: Failed to use bus name org.freedesktop.DisplayManager, do you have appropriate permissions?
Mar 02 17:13:07 rhel7host systemd[1]: lightdm.service: main process exited, code=exited, status=1/FAILURE
Mar 02 17:13:07 rhel7host systemd[1]: Failed to start Light Display Manager.
-- Subject: Unit lightdm.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit lightdm.service has failed.
-- 
-- The result is failed.
Mar 02 17:13:07 rhel7host systemd[1]: Unit lightdm.service entered failed state.
Mar 02 17:13:07 rhel7host systemd[1]: lightdm.service failed.
Mar 02 17:13:19 rhel7host systemd[1]: lightdm.service holdoff time over, scheduling restart.
Mar 02 17:13:19 rhel7host systemd[1]: Starting Light Display Manager...
-- Subject: Unit lightdm.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit lightdm.service has begun starting up.
Mar 02 17:13:19 rhel7host lightdm[13118]: ** (lightdm:13118): WARNING **: Failed to create IPv6 VNC socket: Error binding to address: Invalid argument
Mar 02 17:13:19 rhel7host systemd[1]: Started Light Display Manager.
-- Subject: Unit lightdm.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit lightdm.service has finished starting up.
-- 
-- The start-up result is done.
Finished converging (2m39.58s).
```



The systemd's ``/lib/systemd/system/systemd-logind.service`` has the following expectation towards dbus:

```
# Ask for the dbus socket. If running over kdbus, the socket will
# not be actually used.
Wants=dbus.socket
After=dbus.socket
```


I am trying now to start lightdm, using the same Wants & After as just mentioned.

Will reply with update soon.

Comment 7 Andrey Arapov 2017-03-03 10:10:54 UTC
Humm,

according to the https://www.freedesktop.org/software/systemd/man/systemd.service.html


> Services with Type=dbus set automatically acquire dependencies of type Requires= and After= on dbus.socket.

In lightdm.service -> "Type=dbus" is NOT explicitly set, though it is automatically acquired since "BusName=org.freedesktop.DisplayManager" is explicitly set in the lightdm.service. 

> # systemctl show lightdm |grep Type=
> Type=dbus


Though, even that seems not enough, causing the race.

Now, I am not sure be there some systemd issue when "Type=dbus" is not explicitly set in the .service file and acquired automatically when "BusName=" is set or something else...

Comment 8 Andrey Arapov 2017-03-03 12:26:11 UTC
In case if someone is looking for a workaround, here it is:

/etc/systemd/system/lightdm.service.d/fixes.conf:
[Unit]
Wants=dbus.socket user.slice
After=dbus.socket user.slice

[Service]
# Trying to explicitly declare Type=dbus as original lightdm.service has
# BusName=org.freedesktop.DisplayManager which sets Type=dbus anyway acc. to
# https://www.freedesktop.org/software/systemd/man/systemd.service.html
Type=dbus
RestartSec=10s
ExecStartPre=/bin/sleep 5

Comment 9 Andrey Arapov 2017-04-26 12:35:15 UTC
Created attachment 1274198 [details]
HUP the dbus-daemon to claim the lightdm service name

The previous workaround did not complete its goal as the lightdm permission issue occurred again.

We came to a conclusion that the lightdm issue occurs due to D-Bus confusion over the conf files dropped into /etc/dbus-1/system.d/ and the dbus-daemon refresh.

In particular, whenever lightdm extracts the dbus configuration, the dbus may fail to read it due to some race:
---> dbus[9552]: Encountered error 'Failed to open "/etc/dbus-1/system.d/org.freedesktop.DisplayManager.conf": Permission denied' while parsing '/etc/dbus-1/system.d/org.freedesktop.DisplayManager.conf' <---

causing lightdm service fail to start:
---> "Failed to use bus name org.freedesktop.DisplayManager, do you have appropriate permissions?" <---


Similar behavior has been observed with the following configuration files. Luckily each related RPM package had ``killall -HUP dbus-daemon`` (reload dbus config) in its
postinstall scriptlet to make sure the dbus conf rules will be applied. And lightdm postinstall scriptlet does not reload the dbus-daemon.

```
dbus[9552]: Encountered error 'Failed to open "/etc/dbus-1/system.d/org.freedesktop.login1.conf": Permission denied' while parsing '/etc/dbus-1/system.d/org.freedesktop.login1.conf'
dbus[9552]: [system] Reloaded configuration

dbus[9552]: Encountered error 'Failed to open "/etc/dbus-1/system.d/org.freedesktop.locale1.conf": Permission denied' while parsing '/etc/dbus-1/system.d/org.freedesktop.locale1.conf'
dbus[9552]: [system] Reloaded configuration

dbus[9552]: Encountered error 'Failed to open "/etc/dbus-1/system.d/org.freedesktop.Accounts.conf": Permission denied' while parsing '/etc/dbus-1/system.d/org.freedesktop.Accounts.conf'
dbus[9552]: [system] Reloaded configuration

dbus[9552]: Encountered error 'Failed to open "/etc/dbus-1/system.d/certmonger.conf": Permission denied' while parsing '/etc/dbus-1/system.d/certmonger.conf'
dbus[9552]: [system] Reloaded configuration

dbus[9552]: Encountered error 'Failed to open "/etc/dbus-1/system.d/oddjob.conf": Permission denied' while parsing '/etc/dbus-1/system.d/oddjob.conf'
dbus[9552]: [system] Reloaded configuration
```

In our puppet recipes, we ensured the dbus is reloaded (``killall -HUP dbus-daemon``) before starting the lightdm service.


The relevant/similar issues:
- [1] https://bugzilla.redhat.com/show_bug.cgi?id=636876#c1
- [2] https://bugzilla.redhat.com/show_bug.cgi?id=1308549#c2
- [3] https://github.com/cockpit-project/cockpit/issues/410#issue-30837663


I have attached the patch, please review and take it if it is good from your point of view.

Thank you,

Andrey Arapov

Comment 10 Fedora Update System 2018-01-19 16:12:26 UTC
lightdm-1.25.1-5.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-2bbda7d20d

Comment 11 Fedora Update System 2018-01-19 16:12:40 UTC
lightdm-1.25.1-5.el7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2018-ecc5aba2bf

Comment 12 Fedora Update System 2018-01-19 16:12:48 UTC
lightdm-1.25.1-5.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2018-c561169635

Comment 13 Fedora Update System 2018-01-21 09:43:08 UTC
lightdm-1.25.1-5.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-c561169635

Comment 14 Fedora Update System 2018-01-21 10:40:39 UTC
lightdm-1.25.1-5.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-2bbda7d20d

Comment 15 Fedora Update System 2018-01-21 12:58:01 UTC
lightdm-1.25.1-5.el7 has been pushed to the Fedora EPEL 7 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2018-ecc5aba2bf

Comment 16 Andrey Arapov 2018-01-21 17:51:26 UTC
I think this could be improved so that it won't fail in case there is no dbus-daemon around, which is unlikely.


# diff -Nur lightdm-1.25.1-5.fc27.src.orig lightdm-1.25.1-5.fc27.src.new/
diff -Nur lightdm-1.25.1-5.fc27.src.orig/lightdm.spec lightdm-1.25.1-5.fc27.src.new/lightdm.spec
--- lightdm-1.25.1-5.fc27.src.orig/lightdm.spec	2018-01-19 15:57:29.000000000 +0000
+++ lightdm-1.25.1-5.fc27.src.new/lightdm.spec	2018-01-21 17:49:04.145247262 +0000
@@ -236,8 +236,8 @@
 
 
 %post
-if [ $1 = 1 ] ; then
-	%{_bindir}/killall -HUP dbus-daemon 2>&1 > /dev/null
+if [ $1 -eq 1 ] ; then 
+	%{_bindir}/killall -HUP dbus-daemon >/dev/null 2>&1 || :
 fi
 %{?systemd_post:%systemd_post %{name}.service}

Comment 17 Fedora Update System 2018-01-28 21:32:33 UTC
lightdm-1.25.1-5.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.

Comment 18 Fedora Update System 2018-02-02 16:56:39 UTC
lightdm-1.25.1-5.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.