Bug 874486
Summary: | progress indicator for mediacheck isn't displayed, so users may think the installer is hung | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Andre Robatino <robatino> |
Component: | anaconda | Assignee: | Anaconda Maintenance Team <anaconda-maint-list> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | | |
Version: | 18 | CC: | anaconda-maint-list, awilliam, bcl, dracut-maint, ed.greshko, g.kaviyarasu, gsgatlin, harald, jonathan, jreiser, kparal, lnykryn, mosterhouse2000, public.oss, rbergero, satellitgo, sbueno, vanmeeuwen+fedora |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | AcceptedBlocker | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2013-01-02 21:47:57 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 752661 | | |
Attachments: | | | |
Description
Andre Robatino
2012-11-08 10:05:03 UTC
The progress indicator in the old mediacheck looked exactly like the output of checkisomd5 -v Fedora-18-Beta-TC9-i386-DVD.iso for example, so probably using the same code. I don't know if the design of the new mediacheck allows reusing that.

It should be there, but something (systemd?) seems to be eating the output. If you look at journalctl you'll see it logs a mess of binary blob info which may be the progress output.

*** Bug 882828 has been marked as a duplicate of this bug. ***

Nominating as F18 NTH.

*** Bug 882397 has been marked as a duplicate of this bug. ***

It also looks like ESC isn't making it through and aborting the check.

I kinda forgot this as I don't test physical media so much any more, but from test@ data, it seems media check on a physical DVD can take 20 minutes or more. That's a long time to be apparently frozen, especially since people don't really notice that the default boot menu option involves a media check (they may not even see it, if they boot the disc then go to make a coffee). I think it's worth considering whether this ought to be a blocker.

20 minutes? Seems like we ought to reconsider making it the default as well (which doesn't mean the above shouldn't be fixed). Adding harald to the cc to see if he has any ideas as to what's eating the i/o from it.

It's a little hard for me to believe that 20 minutes is normal. I have a machine from 1999 with a 250 MHz CPU which I install Fedora on nowadays mostly as a challenge, and even on that, it doesn't take 20 minutes to do a mediacheck. In any case, once the progress indicator and ESC key are working, people can estimate how long it will take and opt out if they want.

Also, the old interactive mediacheck defaulted to checking ("OK" was highlighted by default, rather than "Skip") and there was no apparent way to stop the check once started.
ESC did nothing (just checked on the F16 DVD), so apparently the best you could do was reboot, which of course is still an option even if one doesn't notice the message "Press [Esc] to abort check." which is displayed by command-line checkisomd5 and presumably will be visible when this bug is fixed. In addition, if reading the media is slow, then the install should be correspondingly slow, meaning that much more wasted time if it fails due to bad media.

andre: the limiting factor is not CPU speed but rotating disc speed. Were you using an actual silver DVD as your medium? Which image were you testing? How long did it take for you?

Adam: When you say "can take 20 minutes or more" does that mean "under normal circumstances where nothing is wrong" or "can take (up to) 20 minutes, when it is finding errors"? I would give it some serious thought as a blocker, especially considering how much outreach ambassadors do through DVD handouts. It's often folks' first experience, and that is kind of a scary way to start off. :( If it seems like "it's not working" - and 20 minutes is an ETERNITY when you're wondering what is going on - and people are likely to hit the power button (and never get it installed as a result) then it really starts to border on "does it install"-type criteria... Even if 20m is not the norm and it's significantly faster, it would be useful to have the indicator for situations where things aren't going properly...

I have done a simple computation. In a 4x-speed DVD drive it takes roughly 14 minutes to read the whole medium, if you read at full speed all the time. With an 8x drive the time gets halved, 7 minutes, of course. Usually you have higher-speed drives, but slightly worn media (a lot of people use RW media for these purposes, myself included), so the estimate is more or less accurate; the required time is usually between 5 and 15 minutes. If the progress bar is shown and it's easy to skip the check, I see no problem at all.
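
The rough estimate above can be reproduced with a few lines of shell. This is only a sketch under stated assumptions: a constant read speed over the whole disc, 1X taken as 1.35 MB/s and an image size of 4572 MB (both figures are quoted later in this report), and a helper name, read_minutes, that is made up for illustration.

```shell
#!/bin/sh
# Rough estimate of the time needed to read a whole DVD image.
# Assumptions (illustrative only): 1X = 1.35 MB/s, constant speed,
# image size given in MB. read_minutes is a hypothetical helper name.

# read_minutes SIZE_MB SPEED_X -> minutes with one decimal place
read_minutes() {
    LC_ALL=C awk -v size="$1" -v x="$2" \
        'BEGIN { printf "%.1f\n", size / (x * 1.35) / 60 }'
}

# Print the estimate for a few nominal drive speeds.
for speed in 4 8 16; do
    echo "${speed}X: $(read_minutes 4572 "$speed") minutes"
done
```

For 4X this gives about 14 minutes, in line with the figure in the comment. Real drives rarely hold a constant speed, so the numbers are best read as a lower bound for a given nominal speed.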
If we are not able to fix it in time, I'd rather see mediacheck as the second boot option, having the default boot option without it.

robyn: the data is not rock solid or anything. There are two data points in the thread:

"1. Stupid 20-minute pause (waiting for a timeout?) before the installation got underway." (Peter Gueckel)

"I did not choose the media select (or at least I don't think I did) and also saw 20+ minute delays with no progress information while the DVD drive rapidly read data. Manually removing "rd.live.check" solved the issue." (Samuel Greenfeld, who is a reliable tester, but may have been running on something very slow, as he's a Sugar guy).

I always install using optical media. The last time I installed on that machine using a DVD rather than live was F14 (due to memory limitations, it only has 512 MiB RAM) but it didn't take more than a few minutes to do the check. Since a full install takes a few hours on that machine, I'd never consider starting one without making sure the media was good.

P.S. The DVD drive in that machine is not original equipment, it's from 2004. Still pretty old. I suppose you could ask people whose check took longer what kind of drive/media they're using.

The live image is about 1/6th the size of the DVD, so obviously the check will complete much faster.

(In reply to comment #16)
> The live image is about 1/6th the size of the DVD, so obviously the check
> will complete much faster.

I was referring to the speed of the DVD check, not the live (which is why I mentioned F14). The DVD is only moderately larger now.

Discussed at 2012-12-05 blocker review meeting - http://meetbot.fedoraproject.org/fedora-bugzappers/2012-12-05/f18final-blocker-review-2.2012-12-05-17.01.log.txt . Accepted as a blocker per criterion "If there is an embedded checksum in the image, it must match.
If there is a related UI element displayed after booting the image, it must work and display the correct result" on the basis that displaying no kind of progress indicator is bad enough to consider 'not working'. We would consider this bug not serious enough to be a blocker if either some kind of progress indicator - or at least an indication that media check is in progress - is shown, or if media check were no longer the default boot option.

I've tried a variety of things by editing /usr/sbin/dmsquash-live-root while setting rd.break=cmdline and nothing I do makes the output go to the console. The problem is that the systemd service for dracut-initqueue redirects stdin/out/err so the output only shows up in journalctl. Even my attempts to add some diagnostic output have failed (things like adding >&2 to checkisomd5 or adding echos).

Have you tried StandardOutput=journal+console or just StandardOutput=tty? (see systemd.exec)

Created attachment 663457 [details]
Proposed patch
Here is my proposed patch, which will be submitted as an update soon.
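
For readers unfamiliar with the StandardOutput= suggestion made earlier in the thread, the idea can be sketched as a unit that attaches the check to the console instead of the journal. The unit below is hypothetical and is not the attached patch: the unit name mediacheck-demo@.service, the use of TTYPath=/dev/console, and the exact directives chosen are assumptions for illustration. The script only writes the file into a scratch directory.

```shell
#!/bin/sh
# Writes a HYPOTHETICAL unit file into a scratch directory.
# It illustrates the StandardOutput=tty idea only; it is not the
# patch attached to this bug, and the unit name is made up.
unit_dir=$(mktemp -d)
unit_file="$unit_dir/mediacheck-demo@.service"

cat > "$unit_file" <<'EOF'
[Unit]
Description=Media check on %f (illustrative sketch)
DefaultDependencies=no

[Service]
Type=oneshot
ExecStart=/usr/bin/checkisomd5 --verbose %f
# Connect the process to the console so the progress output and
# the ESC key are not swallowed by the journal.
StandardInput=tty
StandardOutput=tty
StandardError=tty
TTYPath=/dev/console
EOF

echo "wrote $unit_file"
```

Whether tty, journal+console, or some other setting is the right choice depends on how the initramfs console is set up, so this should be treated as a starting point for experiments rather than a known-good configuration.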
Information: DVD "1X" is 1.35 megabytes per second. Real-world times for checkisomd5 of Fedora 18 vary from 15 minutes to 5 minutes.

A 4X DVD+RW (re-writable) is spun at CLV (Constant Linear Velocity) so you get very close to 5.4MB/s. Thus a 4.572GB .iso takes 847 seconds, or 14.1 minutes. Even a 1GHz CPU can handle this.

An 8X DVD+R (write once) is spun at zCAV (zoned Constant Angular Velocity) and starts out at about 5.5MB/s on the inner tracks, reaching 8X or more on the outer tracks. The average is around 6X for a full DVD. In addition, the opto-electronics and reader firmware can push some brands of media and styles of recording even faster than rated. In good cases reading at 10X or even 12X is possible on the outer portion of a high-quality 8X platter written with a good writer.

16X DVD+R are also zCAV. By actual measurement, checkisomd5 of Fedora-18-Beta-RC1-x86_64-DVD on my 16X DVD+R platter with 22X drive and 2.5GHz CPU takes 290 seconds (4 minutes 50 seconds). This is an average of 15.8 MB/s or "11.7X".

dracut-024-15.git20121218.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/dracut-024-15.git20121218.fc18

Should add that the old machine I was talking about in comment 9 has an 8X DVD drive (my first and slowest DVD drive) purchased in 2004. My media is DVD+R labeled "1-16X speed".

Package dracut-024-15.git20121218.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing dracut-024-15.git20121218.fc18'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-20580/dracut-024-15.git20121218.fc18
then log in and leave karma (feedback).

No change in smoke8. Behaves exactly as before, no visible progress indicator and ESC does nothing. Tested both DVD and netinst.

(In reply to comment #26)
> No change in smoke8.
> Behaves exactly as before, no visible progress
> indicator and ESC does nothing. Tested both DVD and netinst.

Huh? Do you have a test image? Worked for me in qemu, when I tested with a handcrafted image.

smoke8 is here: http://dl.fedoraproject.org/pub/alt/qa/20121218_f18-smoke8/
But it might not contain the latest dracut, just latest anaconda. I don't know how to find that out.

tflink's announcement said - smoke8 now available (anaconda-18.27-4, dracut-024-15.git20121218.fc18)
So it should have the right one.

Search for the checkisomd5@.service file, and check its status with systemctl.

The service is missing. A freshly-composed DVD (20 minutes ago) does not run the media check (neither in BIOS mode nor in UEFI mode), and "systemctl -a | grep checkiso" from VT2 of the booted installer (anaconda-18.37.3) shows nothing.

The pungi compose was run on a system with:
# rpm -q anaconda lorax
anaconda-18.37.3-1.fc18.x86_64
lorax-18.24-1.fc18.x86_64
and the pungi log says:
pylorax.ltmpl.DEBUG: removed .../work/x86_64/yumroot//usr/lib/dracut/modules.d/90dmsquash-live/checkisomd5@.service
pylorax.ltmpl.DEBUG: template line 21: removefrom isomd5sum --allbut /usr/bin/checkisomd5
pylorax.ltmpl.DEBUG: isomd5sum --allbut /usr/bin/checkisomd5: removed 4/5 files, 34kb/53kb
pylorax.ltmpl.DEBUG: removed /sdd15/ext4-data/Fedora18/work/x86_64/yumroot//usr/share/man/man1/checkisomd5.1.gz

These packages [among others] were specially included in the compose (from TC3 and later):
dracut-024-15.git20121218.fc18.x86_64.rpm
anaconda-18.37.3-1.fc18.x86_64.rpm
lorax-18.24-1.fc18.x86_64.rpm

So to me it looks like a problem with the .ltmpl file from lorax.

John, those are from inside the install.img not the initrd so those removals are fine.

To check if it exists pass rd.break and look from the dracut shell.

Harald, I don't see anything that installs the service, shouldn't it be getting added to the systemd directory in modules-setup.sh? I can't find it anywhere on the filesystem from the shell.
Also, there is a typo:

if [ -n "DRACUT_SYSTEMD" ]; then

should be

if [ -n "$DRACUT_SYSTEMD" ]; then

Adding "rd.break" to the kernel boot command line, and looking around using the dracut emergency shell: "systemctl -a | grep checkiso" still shows nothing. "find / -name '*isomd5*'" shows only /usr/bin/checkisomd5 and /sysroot/usr/bin/checkisomd5. "systemctl list-unit-files" also has no "checkiso" anywhere.

I haven't poked around in detail, but I can confirm that I don't see any progress with smoke8.

dracut-024-15.git20121218.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.

ah, anaconda has its own check. cloning the bug

ah, no... reopening the bug, because of all the flags

Created attachment 666643 [details]
Proposed patch for anaconda
(In reply to comment #31)
> John, those are from inside the install.img not the initrd so those removals
> are fine.
>
> To check if it exists pass rd.break and look from the dracut shell.
>
> Harald, I don't see anything that installs the service, shouldn't it be
> getting added to the systemd directory in modules-setup.sh? I can't find it
> anywhere on the filesystem from the shell.
>
> Also, there is a typo:
>
> if [ -n "DRACUT_SYSTEMD" ]; then
>
> should be
>
> if [ -n "$DRACUT_SYSTEMD" ]; then

yeah... damnit :) you are right!

Created attachment 666645 [details]
Proposed patch for anaconda
updated the patch with "$DRACUT_SYSTEMD"
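
To see why the missing "$" matters: without it, the test checks whether the literal string DRACUT_SYSTEMD is non-empty, which is always true, so the branch runs whether or not the variable is set. A minimal sketch (the function names buggy_check and fixed_check are made up for illustration; only the two test expressions come from this report):

```shell
#!/bin/sh
# Demonstrates the effect of the missing "$" in the -n test.
# buggy_check and fixed_check are hypothetical helper names.

# Buggy form: tests the literal string "DRACUT_SYSTEMD", never empty.
buggy_check() {
    if [ -n "DRACUT_SYSTEMD" ]; then echo "systemd"; else echo "no-systemd"; fi
}

# Fixed form: tests the value of the variable DRACUT_SYSTEMD.
fixed_check() {
    if [ -n "$DRACUT_SYSTEMD" ]; then echo "systemd"; else echo "no-systemd"; fi
}

unset DRACUT_SYSTEMD
echo "variable unset: buggy=$(buggy_check) fixed=$(fixed_check)"

DRACUT_SYSTEMD=1
echo "variable set:   buggy=$(buggy_check) fixed=$(fixed_check)"
```

With the variable unset, the buggy form still reports "systemd" while the fixed form reports "no-systemd"; with it set, both agree.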
dracut-024-16.git20121220.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/dracut-024-16.git20121220.fc18

(In reply to comment #40)
> dracut-024-16.git20121220.fc18 has been submitted as an update for Fedora 18.
> https://admin.fedoraproject.org/updates/dracut-024-16.git20121220.fc18

This will fix the mediacheck for the LiveCD. The DVD still needs the anaconda patch from comment 39.

Wow. I had totally forgotten about that check in Anaconda. Thanks!

Package dracut-024-16.git20121220.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing dracut-024-16.git20121220.fc18'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-20716/dracut-024-16.git20121220.fc18
then log in and leave karma (feedback).

anaconda-18.37.6-1.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/anaconda-18.37.6-1.fc18

With the images in https://dl.fedoraproject.org/pub/alt/qa/20121220_f18-smoke10/ , the progress indicator is visible now, but ESC causes a drop to the emergency prompt instead of allowing the installer to continue as it should.

Created attachment 667096 [details]
screenshot after hitting ESC
The result of the mediacheck after hitting ESC is UNKNOWN, this should be treated the same as PASS rather than FAIL and allow the installer to continue.
Unfortunately the checkisomd5 man page says

EXIT STATUS
    Program returns exit status 0 if the checksum is correct, or 1 if the checksum is incorrect, non-existent, or check was aborted.

so I don't know if there's any way of doing what I described short of modifying checkisomd5 to have a separate exit status for an aborted check. I suppose if there isn't time, the existing behavior is good enough, since the old mediacheck also couldn't be bypassed after it started except by rebooting (AFAIK).

(In reply to comment #46)
> Created attachment 667096 [details]
> screenshot after hitting ESC
>
> The result of the mediacheck after hitting ESC is UNKNOWN, this should be
> treated the same as PASS rather than FAIL and allow the installer to
> continue.

This actually reveals another problem. The user should not end up in the dracut shell if the media is broken. He should be told "The media is corrupted, please create it again. Hit Enter to reboot".

Dropping people to the dracut shell is far from a friendly approach. Are we able to do that in the current limited timeframe?

I see an xx% progress counter with Fedora-18-smoke10-x86_64-DVD.iso when doing a VirtualBox install.

(In reply to comment #48)
> This actually reveals another problem. The user should not end up in dracut
> shell, if the media is broken. He should be told "The media is corrupted,
> please create it again. Hit Enter to reboot".
>
> Dropping people to dracut shell is far from a friendly approach.

This was already known. It's certainly not user-friendly, but hopefully mediacheck failure will be rare, so having to reboot manually in that case shouldn't be too big a hassle, though it would be good to have it behave as you say.

Created attachment 667316 [details]
screenshot when mediacheck fails
mediacheck failure generated by a deliberately corrupted image:
cp -p Fedora-18-smoke10-x86_64-netinst.iso Fedora-18-smoke10-x86_64-netinst.iso.orig
truncate -s 305184192 Fedora-18-smoke10-x86_64-netinst.iso
truncate -s 306184192 Fedora-18-smoke10-x86_64-netinst.iso
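
The two truncate calls above shrink the image and then grow it again, so the bytes cut off at the end come back as zeros. Presumably the second size restores the original length, leaving a file of the same size whose tail no longer matches the embedded checksum. The same trick can be tried safely on a small scratch file. This is a sketch with made-up file names and sizes, assuming GNU coreutils truncate is available:

```shell
#!/bin/sh
# Small-scale illustration of shrink-then-extend corruption.
# File names and sizes are made up; nothing here touches a real ISO.
work=$(mktemp -d)

# 4096 bytes of non-zero data stand in for the image.
yes data | head -c 4096 > "$work/image.iso"
cp -p "$work/image.iso" "$work/image.iso.orig"

truncate -s 3072 "$work/image.iso"   # cut off the last 1024 bytes
truncate -s 4096 "$work/image.iso"   # grow back; the new tail is zero-filled

# Same length as before, different content.
orig_size=$(( $(wc -c < "$work/image.iso.orig") ))
new_size=$(( $(wc -c < "$work/image.iso") ))
if cmp -s "$work/image.iso.orig" "$work/image.iso"; then
    same=yes
else
    same=no
fi
echo "sizes: $orig_size -> $new_size, identical: $same"
```

Keeping the length unchanged means only the content differs, which is the kind of damage a checksum comparison is meant to catch.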
Brian, what is your preferred approach now? Do you think some adjustments are still to be done as part of this report, or should we close this and report new bugs about the issues in comment 45 and further?

dracut-024-18.git20130102.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/dracut-024-18.git20130102.fc18

I think we can close this. It isn't pretty, but it works. The main goal being that they don't proceed with an install from corrupt media.

dracut-024-17.git20121220.fc18, anaconda-18.37.8-1.fc18 have been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.

I have reported the current mediacheck deficiencies as bug 891548 and bug 891551.

dracut-024-18.git20130102.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.

Dumb question, but where does one go to get the updated 64 bit DVD release? I've read through this and am not sure where to go.

There aren't updated images post-release, but this issue was resolved before the F18 release and should be fixed in the release images. We know the way it is in F18 isn't entirely perfect - as Brian said, "It isn't pretty, but it works. The main goal being that they don't proceed with an install from corrupt media." Any further polish is to be treated as a separate bug. I'm not sure if anyone's filed a bug to make the process a bit more polished yet.

In addition to Kparal's bugs from comment 56, I filed bug 907600 as an RFE to get 3 distinct return values from checkisomd5 for PASS, FAIL, and UNKNOWN (currently there are only two so it can't tell the difference between FAIL and UNKNOWN).
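
To illustrate what three distinct return values would allow, here is a sketch of how a caller could map them to actions. The exit codes used here (0 = PASS, 1 = FAIL, 2 = UNKNOWN/aborted) are an assumption made for this example; the report only says that three distinct values were requested, not which ones. The function name is hypothetical as well.

```shell
#!/bin/sh
# Sketch of a three-way result handler. The codes 0/1/2 are ASSUMED
# for illustration and handle_mediacheck_status is a made-up name.

# handle_mediacheck_status CODE -> prints the action to take
handle_mediacheck_status() {
    case "$1" in
        0) echo "continue" ;;  # PASS: checksum matched
        2) echo "continue" ;;  # UNKNOWN (assumed code): check aborted, treat like PASS
        *) echo "halt" ;;      # FAIL or anything unexpected: do not install
    esac
}

for code in 0 1 2; do
    echo "exit status $code -> $(handle_mediacheck_status "$code")"
done
```

With only two exit values, as described in the man page excerpt quoted in this report, the second branch cannot exist, which is why an aborted check currently ends up on the failure path.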