Bug 1816885 - dmraid.service has a dependency on systemd-udev-settle.service, systemd-udev-settle.service takes a long time
Summary: dmraid.service has a dependency on systemd-udev-settle.service, systemd-udev-...
Keywords:
Status: CLOSED DUPLICATE of bug 1795014
Alias: None
Product: Fedora
Classification: Fedora
Component: dmraid
Version: 32
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Peter Rajnoha
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1843925
Depends On:
Blocks:
Reported: 2020-03-25 01:18 UTC by Lucious
Modified: 2020-06-29 19:10 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-29 19:10:39 UTC
Type: Bug
Embargoed:


Attachments
systemd-analyze plot graphic (166.75 KB, image/svg+xml)
2020-03-25 01:18 UTC, Lucious

Description Lucious 2020-03-25 01:18:52 UTC
Created attachment 1673229
systemd-analyze plot graphic

Description of problem: Fedora 32 Workstation beta is booting much slower than Fedora 31.


Version-Release number of selected component (if applicable): 
systemd-245.2-1.fc32.x86_64


How reproducible: easily


Steps to Reproduce:
1. Install Fedora 32 Workstation beta.
2. Observe the boot time from grub to gdm.


Actual results: System boots in about 23 secs from grub to gdm on a samsung 840 pro ssd.


Expected results: This system used to boot fedora 31 from grub to gdm in about 10 secs.


Additional info:

System Specs:
i7-6700K,16GB ram, Samsung 840PRO SSD

I tried re-creating the installation media using fedora media writer. I re-installed the OS with and without LVM. The boot is still around 20 secs from grub to gdm.

systemd-analyze critical-chain
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

graphical.target @17.142s
└─multi-user.target @17.142s
  └─libvirtd.service @13.821s +143ms
    └─remote-fs.target @13.816s
      └─remote-fs-pre.target @13.816s
        └─nfs-client.target @8.242s
          └─gssproxy.service @8.223s +18ms
            └─network.target @8.221s
              └─wpa_supplicant.service @16.141s +46ms
                └─dbus-broker.service @7.744s +79ms
                  └─dbus.socket @7.711s
                    └─sysinit.target @7.709s
                      └─systemd-userdbd.service @13.924s +203ms
                        └─systemd-userdbd.socket @708ms
                          └─-.mount
                            └─system.slice
                              └─-.slice
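
A minimal sketch for pulling the per-unit start durations out of output like the chain above. The function name `slowest_units` is hypothetical, and the sketch assumes durations are printed only as `+<n>ms` or `+<n>s`; longer forms such as `+1min 2s` are skipped. Note that it only sees units on the critical chain, so delays that are visible only in the `systemd-analyze plot` graphic will not show up here.

```shell
# slowest_units: read `systemd-analyze critical-chain` output on stdin and
# print "<milliseconds> <unit>" for every unit that reports a "+" duration,
# slowest first. Assumes durations of the form +<n>ms or +<n>s only.
slowest_units() {
    awk '
        NF >= 3 && $NF ~ /^\+[0-9.]+(ms|s)$/ {
            dur = substr($NF, 2)                    # drop the leading "+"
            if (dur ~ /ms$/) { sub(/ms$/, "", dur); ms = dur + 0 }
            else             { sub(/s$/, "", dur);  ms = dur * 1000 }
            unit = $(NF - 2)                        # field before "@<time>"
            sub(/^[^A-Za-z0-9_-]*/, "", unit)       # strip tree-drawing prefix
            printf "%.0f %s\n", ms, unit
        }
    ' | sort -rn
}
```

Usage: `systemd-analyze critical-chain | slowest_units`.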

Comment 1 Zbigniew Jędrzejewski-Szmek 2020-03-25 10:08:35 UTC
You have both systemd-udev-settle.service and network-online.target blocking boot, and they both
cause significant delays of about 7s each. Both of those services are best avoided during boot,
because they cause significant delays by serializing unrelated things. Generally both
are only required by misdesigned programs that cannot handle appropriate events and
state changes on their own. They should not be used by units that are dependencies of the default
target. Using them for stuff that runs asynchronously is OK, so for example ordering
dnf-makecache.service after network-online.target is fine.

My guess is that one or both of those dependencies are new. Please check if that is the case.
If yes, then we can talk with the maintainers of that service if those dependencies are desired.

I don't think this slowdown is caused by changes in systemd itself, but my guess above may be
wrong. In particular systemd-userdbd.service is new and it might increase boot time too.
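
A minimal sketch of one way to check which installed unit files refer to a given unit, for example to compare a Fedora 31 and a Fedora 32 installation. The function name `units_referencing` is hypothetical. It only scans `After=`, `Wants=` and `Requires=` lines in unit files, so dependencies expressed through `.wants/` symlinks or generated at runtime are not covered.

```shell
# units_referencing UNIT DIR...: list unit files under the given directories
# whose After=, Wants= or Requires= lines mention UNIT.
units_referencing() {
    # Escape dots so "foo.service" is matched literally by the regex.
    unit_pattern=$(printf '%s' "$1" | sed 's/[.]/\\./g')
    shift
    grep -rlE "^(After|Wants|Requires)=.*${unit_pattern}" "$@" 2>/dev/null | sort
}
```

Usage: `units_referencing systemd-udev-settle.service /usr/lib/systemd/system /etc/systemd/system`. On a running system, `systemctl list-dependencies --reverse systemd-udev-settle.service` gives a complementary view of the units that pull it in.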

Comment 2 Lucious 2020-03-26 01:44:25 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #1)
> You have both systemd-udev-settle.service and network-online.target blocking
> boot, and they both
> cause significant delays of about 7s each. Both of those services are best
> avoided during boot,
> because they cause significant delays by serializing unrelated things.
> Generally both
> are only required by misdesigned programs that cannot handle appropriate
> events and
> state changes on their own. They should not be used by units that are
> dependencies of the default
> target. Using them for stuff that runs asynchronously is OK, so for example
> ordering
> dnf-makecache.service after network-online.target is fine.
> 
> My guess is that one or both of those dependencies are new. Please check if
> that is the case.
> If yes, then we can talk with the maintainers of that service if those
> dependencies are desired.
> 
> I don't think this slowdown is caused by changes in systemd itself, but my
> guess above may be
> wrong. In particular systemd-userdbd.service is new and it might increase
> boot time too.

Thank you. So do you recommend I mask these units?

Comment 3 Zbigniew Jędrzejewski-Szmek 2020-03-26 08:45:35 UTC
> So do you recommend I mask these units?

No. They were added for some reason: maybe a good one, maybe not. Just blindly
masking them could cause issues.

Comment 4 Lucious 2020-04-01 02:08:31 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #3)
> > So do you recommend I mask these units?
> 
> No. They were added for some reason: maybe a good one, maybe not. Just
> blindly
> masking them could cause issues.

Okay.

I decided to do an upgrade on my ThinkPad T420 to test its boot speed. It slowed down as well, though not as much: it went from about 18 secs on Fedora 31 to 23 secs on Fedora 32. The systemd-analyze plot is in the link below.

https://drive.google.com/file/d/1z7Z6YEnGNffHplBl2BDeMZLcZMoONGJt/view?usp=sharing

Comment 5 Lucious 2020-05-03 19:31:39 UTC
Now that Fedora 32 is released, the problem still persists. I just upgraded my main desktop, which runs a Samsung 970 NVMe SSD, and the boot time has tripled. I used to be at the graphical target within 5-6 seconds. Now it takes about 18 secs to see the login screen. I have done a fresh install on my T420 and it also boots slowly.

I have shortened the boot by removing NetworkManager-wait-online.service on the T420 and my Gen 5 X1 Carbon. It saves about 5 seconds.

Comment 6 Zbigniew Jędrzejewski-Szmek 2020-05-04 09:26:39 UTC
Let's reassign this to dmraid for comments.

dmraid maintainers: /usr/lib/systemd/system/dmraid-activation.service has After=systemd-udev-settle.service,
which causes significant delay during boot. See the 'systemd-analyze plot' graphic attached to the original
report. Services which are started at boot on every machine should not depend on systemd-udev-settle.service
because that causes a very noticeable slowdown for every user. Please figure out a way to either not start
dmraid.service by default, or to avoid this dependency.

Comment 7 Peter Rajnoha 2020-05-05 21:09:03 UTC
The dmraid-activation service is an old one, created a long time ago as a best-effort service. dmraid doesn't support any kind of event-based activation, and there have been no changes, fixes or new features in dmraid for several years. dmraid should be mostly superseded by MD these days anyway. I wonder who is still bringing in dmraid-activation.service. I think we had a few issues with udisks (or libblockdev) in the past, but those should be resolved already. I need to find the old BZ where this was tracked... Normally, dmraid shouldn't even be installed if it is not directly needed.

Comment 8 Peter Rajnoha 2020-05-05 21:23:26 UTC
The chain of deps:


# dnf repoquery --whatrequires dmraid
Last metadata expiration check: 3:33:52 ago on Tue 05 May 2020 07:43:34 PM CEST.
dmraid-devel-0:1.0.0.rc16-44.fc32.x86_64
dmraid-events-0:1.0.0.rc16-44.fc32.x86_64
libblockdev-dm-0:2.23-2.fc32.i686
libblockdev-dm-0:2.23-2.fc32.x86_64

# dnf repoquery --whatrequires libblockdev-dm
Last metadata expiration check: 3:34:35 ago on Tue 05 May 2020 07:43:34 PM CEST.
libblockdev-dm-devel-0:2.23-2.fc32.i686
libblockdev-dm-devel-0:2.23-2.fc32.x86_64
libblockdev-plugins-all-0:2.23-2.fc32.x86_64

# dnf repoquery --whatrequires libblockdev-plugins-all
Last metadata expiration check: 3:34:53 ago on Tue 05 May 2020 07:43:34 PM CEST.
anaconda-install-env-deps-0:32.24.7-1.fc32.x86_64

# dnf repoquery --whatrequires anaconda-install-env-deps-0:32.24.7-1.fc32.x86_64
Last metadata expiration check: 3:36:05 ago on Tue 05 May 2020 07:43:34 PM CEST.
anaconda-0:32.24.7-1.fc32.x86_64
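
A minimal sketch for reducing `dnf repoquery --whatrequires` output like the listings above to unique package names, which makes it easier to walk the chain one step at a time. The function name `requirer_names` is hypothetical, and the sketch assumes each package line has the `name-epoch:version-release.arch` shape shown above.

```shell
# requirer_names: read `dnf repoquery --whatrequires ...` output on stdin and
# print the unique package names, dropping the metadata line and the
# "-epoch:version-release.arch" suffix.
requirer_names() {
    grep -v '^Last metadata expiration check' \
        | sed -E 's/-[0-9]+:[^-]+-[^-]+$//' \
        | sort -u
}
```

Usage: `dnf repoquery --whatrequires dmraid | requirer_names`, then repeat the query for each name it prints.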

Comment 9 Zbigniew Jędrzejewski-Szmek 2020-05-06 06:49:50 UTC
OP said:
> I tried re-creating the installation media using fedora media writer.

IIRC, installation using the live image results in a copy of the installer stack
in the final image, including anaconda. Normally this is not a problem, but in this
case it brings in some extra dependencies. We can't really expect users to uninstall
anaconda by hand after installation.

(Note: There's a bug open about udevd being slower in F32, and there are some patches
upstream. It's likely that once udevd becomes faster, dependency on
systemd-udev-settle.service will become less noticeable again.)

Comment 10 Peter Rajnoha 2020-05-06 07:05:47 UTC
Vojto, haven't we been disabling dmraid in libblockdev already? Or was that just RHEL?

Comment 11 Vojtech Trefny 2020-05-06 19:11:55 UTC
(In reply to Peter Rajnoha from comment #10)
> Vojto, haven't we been disabling dmraid in libblockdev already? Or was that
> just RHEL?

It was just on RHEL. libdmraid is still available in Fedora so we are still using it for firmware raids that are not supported by mdadm.

Comment 12 Lucious 2020-05-15 00:28:24 UTC
I have another device experiencing slower boot. A dell latitude e6440 I use from time to time at work. I have systemd-analyze information from fedora 31 and fedora 32 if you need it. Maybe it's something I'm doing wrong, no one else seems to be complaining of boot issues. This is on 4 different devices now, with the most noticeable being on my fastest system with the nvme drive.

Comment 13 stainless_life 2020-05-28 05:13:44 UTC
Since installing fedora 32, I have also noticed a much longer boot time. Fedora 31, from power on button push to login page, used to take a snappy 17 seconds total for both of my machines. It now takes 37 seconds for each. I have 2 similar machines both based on Intel Core i7's running fedora 32 (Workstation Edition) and I updated both from Fedora 31 about a week after 32 was released. I didn't file a bug report, as I had been waiting for an update to fix this, but I now am getting worried, as one of the recent updates actually increased my boot time from 34 seconds (after the 31 to 32 update) to the current 37 seconds. I've been running fedora workstation since version 19, and this is sufficiently worrisome that I thought I should chime in, and confirm that there is indeed an issue with a much longer boot time in fedora 32. PS Thanks for all the work you guys do.

Comment 14 Zbigniew Jędrzejewski-Szmek 2020-06-05 11:24:22 UTC
I think we have two issues here: there seems to be a significant slowdown of udev
with recent systemd and recent kernels, stemming from the kernel change to ratelimit
access to EFI variables, see https://github.com/systemd/systemd/issues/14828.
Various patches have been created, but it's not clear at this point if they are enough
to resolve the issue. The kernel change will hopefully be reverted or changed. I'll
try to push those patches to F32. Once that happens, this issue will not be as noticeable.

That said, depending on systemd-udev-settle.service is also a problem.

Comment 15 Zbigniew Jędrzejewski-Szmek 2020-06-05 11:25:06 UTC
*** Bug 1843925 has been marked as a duplicate of this bug. ***

Comment 16 Hans de Goede 2020-06-29 19:10:39 UTC
I already filed a bug for this issue, with a suggested workaround, a while ago.

I will mark this bug as a duplicate of the earlier dmraid bug; let's continue any discussion there.

Note that I plan to implement the mentioned workaround for Fedora 33 (and later). I am in the process of creating a change page for this, even though it is a small change, so that it is properly documented in case the workaround ends up causing issues for anyone (it shouldn't, but you never know).

*** This bug has been marked as a duplicate of bug 1795014 ***

