Bug 1508794

Summary: systemd/dracut doesn't wait for mediacheck to finish
Product: [Fedora] Fedora
Reporter: Kamil Páral <kparal>
Component: dracut
Assignee: dracut-maint-list
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 27
CC: awilliam, dracut-maint-list, jonathan, mattdm, robatino, zbyszek
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: AcceptedBlocker
Fixed In Version: dracut-046-5.fc27
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-08 22:10:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1396704
Attachments:
Description                      Flags
mediacheck timeout in progress   none
mediacheck timeout finished      none

Description Kamil Páral 2017-11-02 09:04:35 UTC
Description of problem:
This is probably another manifestation of bug 1495635. When you choose to "Test Fedora & boot" on a LiveCD, dracut times out before the check is completed, and the user ends up in a rescue shell. See screenshots.

Version-Release number of selected component (if applicable):
Fedora-Workstation-Live-x86_64-27-1.2.iso

How reproducible:
always

Steps to Reproduce:
1. burn the DVD
2. choose to test media consistency when booting

Comment 1 Kamil Páral 2017-11-02 09:05:17 UTC
Created attachment 1346906 [details]
mediacheck timeout in progress

Comment 2 Kamil Páral 2017-11-02 09:05:34 UTC
Created attachment 1346907 [details]
mediacheck timeout finished

Comment 3 Kamil Páral 2017-11-02 09:07:11 UTC
I believe this is a blocker under the very basic criterion:
"All release-blocking images must boot in their supported configurations."
https://fedoraproject.org/wiki/Basic_Release_Criteria#Release-blocking_images_must_boot

Comment 4 Matthew Miller 2017-11-02 12:50:39 UTC
This is definitely a blocker.

Comment 5 Kamil Páral 2017-11-02 13:15:23 UTC
In case it's not absolutely clear, the problem is that dracut+systemd apply the default 90(?)-second timeout even when they should not, combined with DVDs being slow in general. So this can't be reproduced in VMs, because VMs are fast. The speed of the check is the deciding factor here.

Btw, it might be a good idea to also check pxeboot with slow download speeds (unfortunately it's a bit difficult to set up). That might be affected as well.

Comment 6 Matthew Miller 2017-11-02 13:37:02 UTC
(In reply to Matthew Miller from comment #4)
> This is definitely a blocker.

Although I'm willing to be convinced that, since there is a workaround and few people these days have CDs, we can document it in Common Bugs and move on.

Comment 7 Josh Boyer 2017-11-02 13:38:18 UTC
(In reply to Matthew Miller from comment #6)
> (In reply to Matthew Miller from comment #4)
> > This is definitely a blocker.
> 
> Although I'm willing to be convinced that, since there is a workaround and
> few people these days have CDs, we can document it in Common Bugs and move on.

I think that's the case. Yes, this is a bad experience, but I think we need to consider the number of people potentially impacted. I would suggest that our user base has largely moved on from physical DVD media. For those who haven't, we have workarounds: skipping the media check, retrying, etc.

If this had been caught sooner I would support fixing it, but given all considerations I would suggest documenting it as a known issue.

Comment 8 Kamil Páral 2017-11-02 14:06:23 UTC
I just reproduced the same problem with a slow USB stick plugged into a USB 2.0 slot. So this is not exclusive to DVDs, and it's not that hard to hit even with USB sticks. Depending on your media size, your drive needs to be quite fast. For Workstation Live, if your USB stick reads slower than 18 MB/s, you'll hit this (1.8 GB in 90 seconds). For Server DVDs, the minimum is 28 MB/s.
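
As a rough check of the arithmetic above, here is a minimal Python sketch. It assumes the timeout really is 90 seconds (comment 5 marks that value as uncertain), that the whole image must be read within it, and that sizes are in decimal megabytes. The function names are hypothetical and are not part of dracut or systemd.

```python
def min_read_speed_mb_s(image_size_mb: float, timeout_s: float) -> float:
    """Slowest sustained read speed (MB/s) that still finishes before the timeout."""
    if timeout_s <= 0:
        raise ValueError("timeout_s must be positive")
    return image_size_mb / timeout_s


def max_image_size_mb(read_speed_mb_s: float, timeout_s: float) -> float:
    """Largest image (MB) that can be fully read at the given speed before the timeout."""
    return read_speed_mb_s * timeout_s


ASSUMED_TIMEOUT_S = 90  # assumption: the "90(?)" seconds quoted in comment 5

# 1.8 GB image, as quoted for Workstation Live
print(min_read_speed_mb_s(1800, ASSUMED_TIMEOUT_S))  # 20.0
# Image sizes implied by the quoted speed thresholds
print(max_image_size_mb(18, ASSUMED_TIMEOUT_S))      # 1620
print(max_image_size_mb(28, ASSUMED_TIMEOUT_S))      # 2520
```

Under these assumptions, 1.8 GB in 90 seconds works out to 20 MB/s, while 18 MB/s corresponds to an image of about 1.62 GB, so the quoted thresholds should be read as approximate.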

Comment 9 Lukáš Nykrýn 2017-11-02 14:26:10 UTC
What is the version of dracut there?

Comment 10 Adam Williamson 2017-11-02 15:40:29 UTC
Remember we're not shipping Server as a blocking part of F27 Final, so Server considerations don't apply. 18MB/s is a pretty fast minimum, though.

Comment 11 Adam Williamson 2017-11-02 15:41:06 UTC
Lukas: dracut-046-4.fc27

Comment 12 Zbigniew Jędrzejewski-Szmek 2017-11-02 16:09:34 UTC
https://github.com/dracutdevs/dracut/pull/302

Comment 13 Kamil Páral 2017-11-02 17:34:58 UTC
Discussed during blocker review [1]:

AcceptedBlocker (Final) - This is a conditional violation of the requirement that images boot by default. We believe it will be encountered sufficiently often in the real world, and leave a bad enough impression when it does happen, to accept it as a blocker.

[1] https://meetbot-raw.fedoraproject.org/fedora-meeting-1/2017-11-02/

Comment 14 Fedora Update System 2017-11-02 20:48:03 UTC
dracut-046-5.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2017-d4fe020d2e

Comment 15 Adam Williamson 2017-11-03 05:36:52 UTC
Tested with RC-1.3, fix is looking good: I did the media check and boot on Workstation live written to a DVD, and it didn't hit the timeout.

Comment 16 Kamil Páral 2017-11-03 12:15:07 UTC
I can confirm the timeout is fixed in RC-1.3, with both a DVD and a slow USB 2.0 stick.

Comment 17 Fedora Update System 2017-11-04 18:03:18 UTC
dracut-046-5.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-d4fe020d2e

Comment 18 Fedora Update System 2017-11-08 22:10:58 UTC
dracut-046-5.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.