Bug 1720876 - systemd-udev-settle.service takes almost a minute to load with nvidia Turing cards
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: udev
Version: 30
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Orphan Owner
QA Contact: Fedora Extras Quality Assurance
Reported: 2019-06-16 06:11 UTC by Gustavo Rubio
Modified: 2020-05-26 18:11 UTC
CC List: 6 users

Last Closed: 2020-05-26 18:11:10 UTC
Type: Bug


Attachments
lspci verbose output (28.16 KB, text/plain)
2019-06-16 06:11 UTC, Gustavo Rubio
systemd-analyze blame output (3.78 KB, text/plain)
2019-06-16 06:12 UTC, Gustavo Rubio
systemd-analyze critical-chain output (1.30 KB, text/plain)
2019-06-16 06:13 UTC, Gustavo Rubio
journalctl --this-boot -all output (1.79 MB, text/plain)
2019-06-16 06:13 UTC, Gustavo Rubio
nouveau boot log from live-cd (22.54 KB, text/plain)
2019-06-16 06:14 UTC, Gustavo Rubio
journalctl --this-boot -all (470.67 KB, text/plain)
2019-08-30 08:06 UTC, aannoaanno

Description Gustavo Rubio 2019-06-16 06:11:37 UTC
Created attachment 1581062
lspci verbose output

Description of problem:

On boot, the following message appears while systemd waits for a udev service that takes almost a minute to start and then finally gives up:

"A start job is running for udev Wait for complete device initialization"

This also happens when booting the vanilla live CD. The systemd journal of my installation shows some hardware errors:


Jun 15 13:59:57 hadron-air kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20190215/psargs-330)
Jun 15 13:59:57 hadron-air kernel: ACPI Error: Aborting method \_SB.PCI0.XHC.RHUB.HS01._PLD due to previous error (AE_NOT_FOUND) (20190215/psparse-529)


Jun 15 13:59:57 hadron-air kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20190215/psargs-330)
Jun 15 13:59:57 hadron-air kernel: ACPI Error: Aborting method \_SB.PCI0.XHC.RHUB.SS01._PLD due to previous error (AE_NOT_FOUND) (20190215/psparse-529)
Jun 15 13:59:57 hadron-air kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.UBTC.RUCC], AE_NOT_FOUND (20190215/psargs-330)
Jun 15 13:59:57 hadron-air kernel: ACPI Error: Aborting method \_SB.PCI0.XHC.RHUB.SS02._PLD due to previous error (AE_NOT_FOUND) (20190215/psparse-529)

Jun 15 14:00:00 hadron-air systemd-udevd[743]: GOTO 'libsane_rules_end' has no matching label in: '/etc/udev/rules.d/S99-2000S1.rules

Jun 15 14:00:00 hadron-air kernel: iTCO_wdt iTCO_wdt: can't request region for resource [mem 0x00c5fffc-0x00c5ffff]

Jun 15 14:00:02 hadron-air kernel: iwlwifi 0000:00:14.3: BIOS contains WGDS but no WRDS

Jun 15 14:01:41 hadron-air kernel: ucsi_ccg 0-0008: failed to reset PPM!
Jun 15 14:01:41 hadron-air kernel: ucsi_ccg 0-0008: PPM init failed (-110)
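As a rough check on the excerpt above, the time between the last ordinary driver message (14:00:02) and the ucsi_ccg failure (14:01:41) can be computed with a small shell sketch. The helper names to_seconds and log_gap are hypothetical, and the sketch assumes both timestamps fall on the same day. On Linux, error -110 is ETIMEDOUT, so a gap of this size would be consistent with the boot waiting on that device, although this is only an inference from the log.

```shell
#!/bin/sh
# Hypothetical helpers: convert an HH:MM:SS journal timestamp to seconds
# since midnight, then subtract two timestamps from the same day.
to_seconds() {
    # awk treats "02" as decimal 2, which avoids octal surprises in sh arithmetic.
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

log_gap() {
    gap_start=$(to_seconds "$1")
    gap_end=$(to_seconds "$2")
    echo $(( gap_end - gap_start ))
}

# Example with the timestamps from the excerpt:
#   log_gap 14:00:02 14:01:41    # prints 99
```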


However, after ruling out the disks and Wi-Fi by disabling them in the BIOS, I finally removed my NVIDIA GTX 1660 Ti card and the system booted quickly.

I booted the live CD again, since the problem occurs there too, and found this:

Jun 16 05:40:49 localhost kernel: nouveau 0000:01:00.0: unknown chipset (168000a1)
Jun 16 05:40:49 localhost kernel: nouveau: probe of 0000:01:00.0 failed with error -12

My guess is that the TU116 chip in my GPU is not yet supported by the bundled nouveau driver version. On the live CD the resolution is very low (around 1024) and cannot be changed, so it apparently falls back to the VESA driver.

I'm not sure, but maybe the udev-settle service starts before the proprietary NVIDIA driver is loaded?

Version-Release number of selected component (if applicable):

Name         : systemd-udev
Version      : 241
Release      : 8.git9ef65cb.fc30
Architecture : x86_64


How reproducible:

Every time.


Steps to Reproduce:
1. Install any Turing-based card
2. Turn on the computer

Actual results:
Long wait for systemd-udev-settle.service to finish


Expected results:
systemd-udev-settle.service finishes quickly, as it does with the card removed


Additional info:

My current hardware:

 - EVGA H370 motherboard
 - Intel i7-8700 CPU
 - Corsair 32 GB RAM, 2666 MHz
 - Kingston SATA III SSD, 240 GB (Fedora install)
 - Samsung 850 EVO, 500 GB, M.2 SATA III (Windows 10 install)
 - EVGA GeForce GTX 1660 Ti, 6 GB GDDR6
 - And the usual USB keyboard, mouse, etc.

I'm attaching the output of systemd-analyze blame and systemd-analyze critical-chain, plus the graphical plot, all captured with udev_log=debug set in /etc/udev/udev.conf.
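For anyone who wants to capture the same debug output, here is a minimal sketch of setting that option. The helper name set_udev_log is hypothetical; it replaces any existing uncommented udev_log line and leaves the rest of the file alone. Writing to the real /etc/udev/udev.conf needs root, and the setting is typically picked up on the next boot.

```shell
#!/bin/sh
# Hypothetical helper: set udev_log=<level> in a udev.conf-style file.
# Usage: set_udev_log debug [/path/to/udev.conf]
set_udev_log() {
    udev_level="$1"
    udev_conf="${2:-/etc/udev/udev.conf}"
    udev_scratch="$(mktemp)"
    # Keep every existing line except a previous uncommented udev_log setting.
    grep -v '^udev_log=' "$udev_conf" > "$udev_scratch" 2>/dev/null
    printf 'udev_log=%s\n' "$udev_level" >> "$udev_scratch"
    # Write back through cat so the target file keeps its owner and mode.
    cat "$udev_scratch" > "$udev_conf"
    rm -f "$udev_scratch"
}
```

Running set_udev_log info afterwards switches the verbosity back down once the logs have been collected.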

I have also attached the output of lspci -vv.

This issue seems to be present on other distributions as well. I searched before creating this ticket, but the only Fedora-related report seems to have been lost in the Ask Fedora migration. Here is a list of similar reports:

https://www.cipheronic.com/systemd-udev-settle-service-hangs-on-fedora-28/
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=786942
https://bbs.archlinux.org/viewtopic.php?id=189106


To double-check, I masked systemd-udev-settle.service and the boot time went back to "normal". I understand that this is not a proper fix, especially with an LVM disk layout (which I'm using), so I unmasked it again. I did not notice any problems while the service was masked, but I don't want to run into issues down the road.

I also tried the latest Ubuntu 19.04 live CD and did not experience the issue there.

Let me know if you need further information.

Comment 1 Gustavo Rubio 2019-06-16 06:12:25 UTC
Created attachment 1581063
systemd-analyze blame output

Comment 2 Gustavo Rubio 2019-06-16 06:13:12 UTC
Created attachment 1581064
systemd-analyze critical-chain output

Comment 3 Gustavo Rubio 2019-06-16 06:13:43 UTC
Created attachment 1581065
journalctl --this-boot -all output

Comment 4 Gustavo Rubio 2019-06-16 06:14:13 UTC
Created attachment 1581066
nouveau boot log from live-cd

Comment 5 aannoaanno 2019-08-30 08:02:01 UTC
Same problem here (i.e. using an NVIDIA Turing card and having problems with systemd-udev-settle.service, found with `systemd-analyze blame`). However, I could not find the indicated lines in the kernel logs.

As a workaround, I resorted to setting `TimeoutSec=15` in /usr/lib/systemd/system/systemd-udev-settle.service.
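Files under /usr/lib/systemd/system belong to the package and are replaced on updates, so the same timeout could instead go into a drop-in under /etc/systemd/system. The sketch below assumes that approach; the helper name write_settle_timeout and the file name timeout.conf are hypothetical (any *.conf name in the drop-in directory works), and writing to the real location needs root.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in that overrides TimeoutSec
# for systemd-udev-settle.service.
# Usage: write_settle_timeout [seconds] [unit directory]
write_settle_timeout() {
    settle_seconds="${1:-15}"
    settle_root="${2:-/etc/systemd/system}"
    settle_dropin="$settle_root/systemd-udev-settle.service.d"
    mkdir -p "$settle_dropin"
    printf '[Service]\nTimeoutSec=%s\n' "$settle_seconds" > "$settle_dropin/timeout.conf"
}
```

After writing the drop-in, `systemctl daemon-reload` makes systemd re-read the unit. Like the original workaround, this only shortens the wait and does not address why the device fails to settle.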

Comment 6 aannoaanno 2019-08-30 08:06:51 UTC
Created attachment 1609793
journalctl --this-boot -all

This is a log AFTER setting the timeout to 15s.

Comment 7 Gustavo Rubio 2019-09-03 01:21:02 UTC
This seems to be an issue with the NVIDIA i2c USB driver; blacklisting it makes the problem go away:

sudo vim /etc/modprobe.d/blacklist_i2c-nvidia-gpu.conf

Add:

blacklist i2c_nvidia_gpu


This disables the USB port on the GPU card. I don't use it, but I've heard it is needed for some VR functionality.
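The same file can be created without an editor. This is a minimal sketch using the file name and module name from this comment; the helper name write_blacklist is hypothetical, and its directory argument defaults to /etc/modprobe.d, where root privileges are required.

```shell
#!/bin/sh
# Hypothetical helper: write the blacklist entry for i2c_nvidia_gpu into a
# modprobe.d-style directory.
# Usage: write_blacklist [directory]
write_blacklist() {
    blacklist_dir="${1:-/etc/modprobe.d}"
    printf 'blacklist i2c_nvidia_gpu\n' > "$blacklist_dir/blacklist_i2c-nvidia-gpu.conf"
}
```

After a reboot, `lsmod | grep i2c_nvidia_gpu` should print nothing if the blacklist took effect.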

Comment 8 Matt Kinni 2020-01-21 12:06:10 UTC
Just wanted to chime in and say this is still an issue in F31 (kernel 5.4.10-200): after installing a GTX 1660 Ti, systemd-udev-settle.service times out after 3 minutes.

@Gustavo Rubio's blacklist file in Comment 7 solves the problem for me.

Comment 9 Ben Cotton 2020-04-30 20:32:19 UTC
This message is a reminder that Fedora 30 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 30 on 2020-05-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '30'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 30 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 10 Ben Cotton 2020-05-26 18:11:10 UTC
Fedora 30 changed to end-of-life (EOL) status on 2020-05-26. Fedora 30 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

