Bug 1655194

Summary: systemd stalls boot for 1 min. 30 sec. when swap UUID refers to nonexistent device
Product: [Fedora] Fedora Reporter: Steve <y9t7sypezp>
Component: systemdAssignee: systemd-maint
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 29CC: jsynacek, lnykryn, msekleta, ssahani, s, systemd-maint, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-12-03 22:58:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Screenshot showing systemd message with countdown during boot stall
none
Corresponding log messages none

Description Steve 2018-11-30 23:10:51 UTC
Description of problem:
If a user reformats a swap device, the UUID will change. If the corresponding entry in /etc/fstab is not changed, systemd stalls the boot for 1 minute 30 seconds with a timer countdown.

The problems are:

1. The message displayed is uninformative -- basically, it shows a UUID and a countdown.

2. The timeout interval of 1 minute 30 seconds is excessively long for a swap device on the root drive.

3. There doesn't seem to be a way to prematurely terminate the timer countdown.

Version-Release number of selected component (if applicable):
systemd-239-6.git9f3aed1.fc29.x86_64
kernel-4.19.4-300.fc29.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. Add a swap entry to /etc/fstab with an nonexistent UUID:
UUID=aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa swap swap    defaults        0 0
(An invalid UUID works too: "abcd".)

2. Reboot.

Actual results:
1. The boot stalls for 1 minute 30 seconds with a countdown displayed.
2. The countdown message does not explain what process is stalled or why it is stalled.
3. There is no obvious way to terminate the countdown.

Expected results:
1. A message is displayed saying what systemd is actually trying to do.
2. The timeout is shorter in cases where there is no expectation of a delay, such as when the swap device is on the same drive as the root device.
3. There is a way to terminate the countdown.

Additional info:

Tested in an F29 VM on an F28 host.

Comment 1 Steve 2018-11-30 23:22:22 UTC
Created attachment 1510303 [details]
Screenshot showing systemd message with countdown during boot stall

Screenshot showing systemd message with countdown during boot stall. The message reads:

"A start job is running for dev-disk-by\x2duuid-aaaa ... aaaa.device (42s / 1min 30s)"

(I replaced part of the UUID with an ellipsis.)

Comment 2 Steve 2018-11-30 23:38:20 UTC
Created attachment 1510304 [details]
Corresponding log messages

The corresponding log messages are more informative.

Comment 3 Steve 2018-12-01 00:20:06 UTC
(In reply to Steve from comment #1)
...
> Screenshot showing systemd message with countdown during boot stall. The
> message reads:
> 
> "A start job is running for dev-disk-by\x2duuid-aaaa ... aaaa.device (42s /
> 1min 30s)"

Issue 4:

The message contains hex codings (\x2d) for some hyphens, but not for others. Hyphens are printable characters, so they should be displayed as such.

The attached screenshot shows another example of an unneeded hex coding:

"Created slice system-systemd\x2dfsck.slice."

I will open a separate bug on this issue, if requested.

Comment 4 Steve 2018-12-01 03:00:38 UTC
(In reply to Steve from comment #0)
> Description of problem:
> If a user reformats a swap device, the UUID will change. If the
> corresponding entry in /etc/fstab is not changed, systemd stalls the boot
> for 1 minute 30 seconds with a timer countdown.
...

More precisely, I discovered this problem after I installed a second F29 system on the same physical drive.

As part of the installation process, I chose to *reuse* an existing swap partition. IIRC, the installer would not let me reuse the swap partition unless I formatted it, so I had the installer to format it. Consequently, the UUID was changed.

The end result was that the first F29 system was broken, because its fstab referred to a now non-existent UUID for its swap partition.

So the use-case is to use the same swap device with two different system installations.

NB: The disk drive is GPT.

The second install was done with this ISO image on a USB flash drive:

Fedora-Workstation-Live-x86_64-29-1.2.iso

Comment 5 Jan Synacek 2018-12-03 08:06:03 UTC
(In reply to Steve from comment #0)
> 1. The message displayed is uninformative -- basically, it shows a UUID and
> a countdown.

That is by design, I guess. You can turn on debug mode, which will give you more information, but in this particular case, I don't think it matters, because the swap that is specified in fstab is not there.

> 2. The timeout interval of 1 minute 30 seconds is excessively long for a
> swap device on the root drive.

It's the default for mount entries. Systemd doesn't really distinguish regular disks vs. swap. You can tune that by adding one of the x-systemd.*timeout= options to the options in the fstab entry. See systemd.mount(5).

> 3. There doesn't seem to be a way to prematurely terminate the timer
> countdown.

No, there isn't. Use the timeout options.

Comment 6 Steve 2018-12-03 12:57:30 UTC
(In reply to Jan Synacek from comment #5)
> (In reply to Steve from comment #0)
> > 1. The message displayed is uninformative -- basically, it shows a UUID and
> > a countdown.
> 
> That is by design, I guess. You can turn on debug mode, which will give you
> more information, but in this particular case, I don't think it matters,
> because the swap that is specified in fstab is not there.

The boot message should explain the problem, since the whole point of boot messages is to communicate with users. As the screenshot shows, there is no mention of "systemd" or "swap".

I would suggest logging a message *before* a target is started and showing *both* the unit description and the unit name:

"systemd: Starting Swap (swap.target)."

The screenshot seems to show unit descriptions in some cases, so maybe the problem is with inconsistent logging conventions.

> > 2. The timeout interval of 1 minute 30 seconds is excessively long for a
> > swap device on the root drive.
> 
> It's the default for mount entries. Systemd doesn't really distinguish
> regular disks vs. swap. You can tune that by adding one of the
> x-systemd.*timeout= options to the options in the fstab entry. See
> systemd.mount(5).

Thanks for pointing that out. The default 90 second timeout sounds like a number pulled out of the air. The default timeout should be reduced for the majority of cases. If users find that a particular mount is timing out, they can increase the timeout.

> > 3. There doesn't seem to be a way to prematurely terminate the timer
> > countdown.
> 
> No, there isn't. Use the timeout options.

In this case, there is a configuration error, so the solution is to fix the configuration, not change the timeout.

Comment 7 Steve 2018-12-03 17:49:36 UTC
After a test boot with a non-existent UUID for the swap device:

$ systemd-analyze blame
Bootup is not yet finished (org.freedesktop.systemd1.Manager.FinishTimestampMonotonic=0).
Please try again later.
Hint: Use 'systemctl list-jobs' to see active jobs

$ systemctl list-jobs
JOB UNIT                                                                        TYPE  STATE  
558 dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.swap   start waiting
559 dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.device start running

2 jobs listed.

Eventually:

$ systemd-analyze blame | head -5
         27.611s dnf-makecache.service
          7.516s plymouth-quit-wait.service
          2.069s NetworkManager-wait-online.service
          1.155s firewalld.service
           923ms sssd.service

For the record:

$ lsblk -fn
vda                                                      
├─vda1 ext4   boot  7122eda4-f3a6-445c-9b4f-76d0337eb13f /boot
├─vda2 ext4   root  e11dd2df-ccc5-49f6-90ca-898d1d4df9a7 /
└─vda3 swap   swap  de1443f2-7d0c-467e-ae9e-8b2902a4d2e8 

$ cat /etc/fstab
...
UUID=e11dd2df-ccc5-49f6-90ca-898d1d4df9a7 /       ext4    defaults        1 1
UUID=7122eda4-f3a6-445c-9b4f-76d0337eb13f /boot   ext4    defaults        1 2
#UUID=de1443f2-7d0c-467e-ae9e-8b2902a4d2e8 swap    swap    defaults        0 0
UUID=aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa swap    swap    defaults        0 0

Tested in an F29 VM on an F28 host.

Comment 8 Zbigniew Jędrzejewski-Szmek 2018-12-03 22:49:28 UTC
I don't think there's anything to fix here. The message is pretty clear: "waiting for device long-device-num". 90s is pretty long, but this is not something that happens routinely, but only on major system misconfigurations. The owner of the machine is supposed to take notice and fix the configuration.

90s is high, but systemd doesn't know if the device in question is used as swap or as a mount point or something else, and it doesn't know if the bus in question is fast (sata or pciexpress), or slow (usb, something else), until the device shows up. So the limit has to be pretty large to avoid spurious timeouts on slower hardware. In particular, daisy-chained usb controllers with many devices connected can be pretty slow. See https://superuser.com/questions/1162682/closed-puzzlingly-long-systemd-boot-times-dont-know-where-to-start for example where sda1 takes 16 s to be detected. With multiple devices, detection might be even slower, so we easily reach 30 or 40 or 50 s. The timeout is supposed to have a safety margin, so it is even higher than that.

Comment 9 Zbigniew Jędrzejewski-Szmek 2018-12-03 22:58:24 UTC
Yeah, I looked at possible implementation options, and making the timeout for swap devices would be more confusing that it is worth it. The user can always set  
DefaultTimeoutStartSec= or device-timeout= on specific devices if they want.

Comment 10 Steve 2018-12-04 06:02:50 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #8)
> I don't think there's anything to fix here. The message is pretty clear:
> "waiting for device long-device-num". 

That is my main complaint -- that "long-device-num" is worthless unless it is accompanied by some context. There is no mention of "systemd" or "swap".

The boot message in the attached screenshot reads:

"A start job is running for dev-disk-by\x2duuid-aaaa ... aaaa.device (42s / 1min 30s)"

That is ALL the information provided while the timer is counting down.

I suggest logging a message *before* each target unit is started and showing *both* the unit description and the unit name:

"systemd: Starting Swap (swap.target)."

Why doesn't systemd do that?

Comment 11 Zbigniew Jędrzejewski-Szmek 2018-12-04 08:21:44 UTC
There is no "before". Systemd starts waiting all jobs at once, and each job (in this case a device appearing), can then be a dependency for other jobs or none. So there is easy sequence of "now I need a device", oh, "a device has appeared, let's start swapon". Instead, whenever a device appears, it is handled.

(In theory, systemd could print something like "waiting for device <whatever> to be used for <something>", but in practice this <something> is not interesting to the user, because it might be something like "fsck <whatever>", followed by "resize <whatever>", followed by "mount <whatever>". Only the last step is meaningful of the user, but there's no direct dependency between the device unit and that last step. Also, sometimes devices are used for multiple mount points, etc.)

You are taking an overly simplistic view that applies to the specific case of a simple system with a single disk. If you try to consider more complicated cases with a graph of dependencies, you'll quickly realize that giving a useful summary in <80 characters is very hard. You're welcome to try to design and implement something: please open a pull request upstream.

Comment 12 Steve 2018-12-04 10:08:38 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #11)
...
> You are taking an overly simplistic view that applies to the specific case
> of a simple system with a single disk. If you try to consider more
> complicated cases with a graph of dependencies, you'll quickly realize that
> giving a useful summary in <80 characters is very hard. You're welcome to
> try to design and implement something: please open a pull request upstream.

"... a graph of dependencies ..."

If there are no cycles, then it should be possible for a low-level unit to report a unique ancestor unit in an error message.

Why can't the "dev-disk-by\x2duuid-aaaa ... aaaa.device" unit ask for its ancestor target unit and log that in an error message?

systemd-analyze can do something like that. Indeed, the graph appears to be bidirectional, so it should be possible to traverse this path:

".device" -> ".swap" -> "swap.target"

$ systemd-analyze dot | grep device | grep swap
...
	"dev-disk-by\x2duuid-de1443f2\x2d7d0c\x2d467e\x2dae9e\x2d8b2902a4d2e8.swap"->"dev-disk-by\x2duuid-de1443f2\x2d7d0c\x2d467e\x2dae9e\x2d8b2902a4d2e8.device" [color="green"];
	"dev-disk-by\x2duuid-de1443f2\x2d7d0c\x2d467e\x2dae9e\x2d8b2902a4d2e8.device"->"dev-disk-by\x2duuid-de1443f2\x2d7d0c\x2d467e\x2dae9e\x2d8b2902a4d2e8.swap" [color="grey66"];

$ systemd-analyze dot | grep swap.target | fgrep '.swap'
...
	"swap.target"->"dev-disk-by\x2duuid-de1443f2\x2d7d0c\x2d467e\x2dae9e\x2d8b2902a4d2e8.swap" [color="green"];
	"swap.target"->"dev-vda3.swap" [color="green"];
	"swap.target"->"dev-disk-by\x2duuid-de1443f2\x2d7d0c\x2d467e\x2dae9e\x2d8b2902a4d2e8.swap" [color="black"];

Comment 13 Steve 2018-12-04 17:50:06 UTC
(In reply to Steve from comment #12)
...
> If there are no cycles, then it should be possible for a low-level unit to
> report a unique ancestor unit in an error message.
> 
> Why can't the "dev-disk-by\x2duuid-aaaa ... aaaa.device" unit ask for its
> ancestor target unit and log that in an error message?
...

"systemctl list-dependencies" doesn't seem to be able to do that. I started by assuming that all a user could remember from the boot message is this pattern: "\*uuid\*.device".

Start with:

$ systemctl list-units \*uuid\*.device 
0 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

Follow the first suggestion (the "grep" isn't really needed, it's for brevity here):

$ systemctl list-units --all \*uuid\*.device | grep dead
dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.device loaded inactive dead    dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.device

Now that we've found the full name of the device, let's try to figure out what went wrong:

$ systemctl status dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.device
● dev-disk-byx2duuid-aaaaaaaax2daaaax2daaaax2daaaax2daaaaaaaaaaaa.device
   Loaded: loaded
   Active: inactive (dead)

Read the "systemctl" man page and discover "list-dependencies":

$ systemctl list-dependencies dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.device
dev-disk-byx2duuid-aaaaaaaax2daaaax2daaaax2daaaax2daaaaaaaaaaaa.device

Maybe it should be reversed:

$ systemctl list-dependencies --reverse dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.device
dev-disk-byx2duuid-aaaaaaaax2daaaax2daaaax2daaaax2daaaaaaaaaaaa.device

How can a device be its own dependency?

Comment 14 Steve 2018-12-04 21:13:12 UTC
(In reply to Steve from comment #13)
...
> "systemctl list-dependencies" doesn't seem to be able to do that.
...

The problem is that the shell interprets the hex coded dashes, so the unit name needs to be quoted:

$ systemctl list-dependencies --reverse --all 'dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.device' | head -6
dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.device
● └─dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.swap
    │ └─...
●   └─swap.target
●     └─sysinit.target
●       ├─abrt-ccpp.service

So, in this case, the ".swap" unit or the "swap.target" unit could be reported in the boot log.

This, however, doesn't return any results mentioning "swap":

$ systemctl status 'dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.device' | grep -i swap

Pipe through "cat" to see everything:

$ systemctl status 'dev-disk-by\x2duuid-aaaaaaaa\x2daaaa\x2daaaa\x2daaaa\x2daaaaaaaaaaaa.device' | cat