Bug 656076 - BFO installation PITA
Summary: BFO installation PITA
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: anaconda
Version: 14
Hardware: i686
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Will Woods
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-11-23 01:16 UTC by Chris Murphy
Modified: 2011-03-29 14:20 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-03-29 14:20:07 UTC


Attachments:
anaconda logs (65.66 KB, application/x-gzip)
2010-12-07 23:27 UTC, Chris Murphy

Description Chris Murphy 2010-11-23 01:16:52 UTC
Description of problem: Numerous errors, in the form of "REBOOT/RETRY" dialogs, appear during both the "Installation starting: Starting installation process" phase and the primary installation phase, making an unattended installation impossible.


Version-Release number of selected component (if applicable):

Current bfo.usb at http://boot.fedoraproject.org/.
Fedora 14 i686


How reproducible:

Always


Steps to Reproduce:
1. Create a bfo.usb stick and boot from it.
2. Choose Fedora 14 i686 [added btrfs flag as a boot option, but have had these same problems when installing Fedora 13 with BFO and ext3 and ext4 so I don't think btrfs is the issue]
  
Actual results:

After "retrieving /install.img" and after configuring partitions and package options, the installation has two phases. The first is a small status bar window titled "installation starting: Starting installation process", during which I received numerous (perhaps a dozen) error messages of the form:

The file stunnel-4.33-1.fc14.i686.rpm cannot be opened. This is due to a missing file, a corrupt package or corrupt media. Please verify your installation source. If you exit, your system will be left in an inconsistent state that will likely require reinstallation. REBOOT/RETRY

If I choose retry, the installation proceeds. And then there'd be another error:

The file dnsmasq-2.52-1.fc13.i686 cannot be opened. This is due to a missing file, a corrupt package or corrupt media. Please verify your installation source. If you exit, your system will be left in an inconsistent state that will likely require reinstallation. REBOOT/RETRY

Retry again. A few minutes later another error:


The file hwdata-0.232-1.fc14.noarch.rpm cannot be opened. This is due to a missing file, a corrupt package or corrupt media. Please verify your installation source. If you exit, your system will be left in an inconsistent state that will likely require reinstallation. REBOOT/RETRY

Retry again....

Eventually I get to the point where the primary installation is taking place, with a large status bar, which includes text "Packages completed: xxx of 1192" where I proceed to get more such errors in the same form, but most of them are completely different RPMs than before.

A 2.5 hour attended installation, that does appear to work after all is said and done.



Expected results:

Same behavior as DVD iso and Live CD installation methods. The Live CD on the same hardware, same broadband connection took maybe 20 minutes to download the iso, and less than 20 minutes to do the installation. With zero errors. (But no btrfs option.)

Or if there are errors, an option to automatically retry x times before putting up a modal dialog requiring my attention when I'm just being a dumb monkey hitting retry (which appears to work).
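
A minimal sketch of the requested behavior, not anaconda's actual code: retry silently a fixed number of times and only then show the REBOOT/RETRY dialog. The names `fetch_with_auto_retry`, `fetch`, `prompt_user` and `max_auto_retries` are hypothetical and exist only for this illustration.

```python
from typing import Callable


def fetch_with_auto_retry(
    fetch: Callable[[], bytes],
    max_auto_retries: int,
    prompt_user: Callable[[Exception], bool],
) -> bytes:
    """Call fetch(); retry silently up to max_auto_retries times before asking the user.

    fetch        -- hypothetical callable that downloads one package, raising OSError on failure
    prompt_user  -- hypothetical callable standing in for the REBOOT/RETRY dialog;
                    returns True for "Retry", False for "Reboot"
    """
    failures = 0
    while True:
        try:
            return fetch()
        except OSError as err:
            failures += 1
            if failures <= max_auto_retries:
                continue  # silent retry, no dialog
            if not prompt_user(err):
                raise  # user gave up; propagate the last error
            failures = 0  # user chose Retry: grant a fresh budget of silent retries
```

With a budget of, say, 5 silent retries, a package that fails a few times and then downloads would never raise the dialog at all.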


Additional info:

Painful. In my opinion the behavior makes it unusable. If I were to try it again, I'm bailing out at the very first REBOOT/RETRY modal dialog and doing DVD iso.

Comment 1 Matt Domsch 2010-11-23 04:21:24 UTC
Reassigning to anaconda, as you've clearly gotten past gpxe downloading the first stage installer.

Comment 2 Chris Lumens 2010-11-23 14:51:34 UTC
It'd be nice to have the log files attached.

Comment 3 Chris Murphy 2010-11-23 21:35:17 UTC
Which logs would you like attached?

Comment 4 Chris Lumens 2010-11-24 16:52:00 UTC
/tmp/anaconda.log, /tmp/syslog, /tmp/yum.log during installation.

Comment 5 Chris Murphy 2010-11-24 20:14:27 UTC
Does /tmp exist only in RAM during a BFO installation? I'm not seeing these files on the USB stick. And I suppose if /tmp is on the target drive for installation, it would be cleaned up on reboot. And they're not presently on the target drive in any event.

Is ftp or ssh running during a BFO installation, or can either of them be made to run so that I can remotely grab these files?

I had this problem with Fedora 13 BFO installation as well. So it's worth doing another BFO installation just to get these logs if it helps to fix the problem down the road. But I need some explicit information on how to get them before I'd be willing to blow away the installation now on this machine.

Comment 6 Chris Lumens 2010-12-07 20:45:59 UTC
Yes, /tmp is a ramfs.  If you complete installation, they will be copied to /var/log/anaconda* afterwards.  sshd is running if you pass the sshd command line parameter to anaconda.

Comment 7 Chris Murphy 2010-12-07 20:49:40 UTC
Is there any chance these are copied to the target volume? I ask because I do have a number of anaconda items in /var/log presently on that volume with a date and time stamp of the original BFO installation.


-rw-------. 1 root    root     47175 Nov 22 18:10 anaconda.log
-rw-------. 1 root    root     19745 Nov 22 18:10 anaconda.program.log
-rw-------. 1 root    root    130545 Nov 22 18:10 anaconda.storage.log
-rw-------. 1 root    root     77829 Nov 22 18:10 anaconda.syslog
-rw-------. 1 root    root     35294 Nov 22 18:10 anaconda.xlog
-rw-------. 1 root    root    126557 Nov 22 18:10 anaconda.yum.log

Comment 8 Chris Lumens 2010-12-07 21:04:16 UTC
Yes if the installation succeeds, anaconda copies all its log files to the installed system.  If those timestamps match up with the install you were performing when you hit this problem, they'd be helpful.

Comment 9 Chris Murphy 2010-12-07 23:27:26 UTC
Created attachment 467327
anaconda logs

All anaconda logs from /var/log on the system where F14 was (eventually) successfully installed using BFO.
-rw-------. 1 root    root     47175 Nov 22 18:10 anaconda.log
-rw-------. 1 root    root     19745 Nov 22 18:10 anaconda.program.log
-rw-------. 1 root    root    130545 Nov 22 18:10 anaconda.storage.log
-rw-------. 1 root    root     77829 Nov 22 18:10 anaconda.syslog
-rw-------. 1 root    root     35294 Nov 22 18:10 anaconda.xlog
-rw-------. 1 root    root    126557 Nov 22 18:10 anaconda.yum.log

Comment 10 Will Woods 2011-01-18 22:44:09 UTC
The logs say things like:


Try 1/10 for http://download.fedoraproject.org/.../dnsmasq-XXX.rpm failed: [Errno 14] HTTP Error 403 : http://mirror.cc.vt.edu/.../dnsmasq-XXX.rpm 
Failed to get http://download.fedoraproject.org/.../dnsmasq-XXX.rpm from mirror 1/1, or downloaded file is corrupt


You'd think it would just move on to the next mirror, but instead it says "Failed to get ... from mirror 1/1". Huh? There should be a lot more mirrors than that, right?

I'm guessing the BFO configuration is using the round-robin redirector URL (http://download.fedoraproject.org/pub/fedora/...) as the only configured URL, so if there's any error, pycurl doesn't know it can just retry and get a new mirror. Except..


Try 1/10 for http://mirrors.xmission.com/.../libsmbclient-XXX.rpm failed: [Errno 14] HTTP Error 404 : http://mirrors.xmission.com/.../libsmbclient-XXX.rpm 
Failed to get http://mirrors.xmission.com/.../libsmbclient-XXX.rpm from mirror 1/47, or downloaded file is corrupt


Now we have 47 mirrors, but it gave up after 1 failure? This doesn't make a lot of sense.

Should probably look closer at how BFO sets up the installation, and how pycurl is configured in anaconda.

Comment 11 Matt Domsch 2011-01-18 23:13:39 UTC
download.fp.o is a one-shot HTTP 30x redirect; it doesn't send a mirror list that would let BFO try the next entry in the list. If BFO simply repeated the original URL request (rather than retrying the resultant HTTP 30x Location, as it does now), it would get another redirect, likely to a different mirror. If the client were part of a specific netblock, though, it might keep getting the same answer from download.fp.o and so would always fail. Teaching BFO to get a mirrorlist would be nice.
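
A rough sketch of the "ask the redirector again" idea, assuming nothing about how anaconda or yum is actually structured. `resolve` and `download` are hypothetical stand-ins: `resolve` returns the Location the redirector hands out for one request, and `download` fetches from that mirror or raises OSError.

```python
from typing import Callable, Optional


def fetch_via_redirector(
    redirector_url: str,
    resolve: Callable[[str], str],
    download: Callable[[str], bytes],
    max_tries: int = 10,
) -> bytes:
    """Retry against the original redirector URL, not the mirror it resolved to."""
    last_error: Optional[Exception] = None
    for _ in range(max_tries):
        # Ask the redirector again on every attempt, so a bad mirror can be
        # replaced by whatever mirror the next 30x redirect points at.
        mirror_url = resolve(redirector_url)
        try:
            return download(mirror_url)
        except OSError as err:
            last_error = err
    raise OSError(f"all {max_tries} attempts failed") from last_error
```

As noted above, this only helps if the redirector actually answers with a different mirror; a client that always gets the same Location would fail on every try.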

Comment 12 Will Woods 2011-01-19 16:19:29 UTC
So the menu that gets presented to the user is stored here:

http://serverbeach1.fedoraproject.org/pub/alt/bfo/pxelinux.cfg/fedora_install.conf

which starts the install like so:

label Fedora-14-i386
	MENU LABEL Fedora-14-i386
	kernel http://download.fedoraproject.org/pub/fedora/linux/releases/14/Fedora/i386/os/images/pxeboot/vmlinuz
	initrd http://download.fedoraproject.org/pub/fedora/linux/releases/14/Fedora/i386/os/images/pxeboot/initrd.img
	append method=http://download.fedoraproject.org/pub/fedora/linux/releases/14/Fedora/i386/os/

method=XXX will use the given URL *only*, for everything.

At least for Fedora 14 and earlier, I think we can fix this bug by changing that line to:

append stage2=http://download.fedoraproject.org/pub/fedora/linux/releases/14/Fedora/i386/os/images/install.img

which will allow the booted anaconda to use its repo config / mirrorlists to fetch packages.
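
For illustration only, the rewrite of the append line can be expressed as a small helper. `method_to_stage2` is a hypothetical name, and the sketch assumes the `method=` value points at the `.../os/` directory of the tree, as in the config above.

```python
def method_to_stage2(append_line: str) -> str:
    """Rewrite 'append method=<tree>/os/' as 'append stage2=<tree>/os/images/install.img'.

    Other arguments on the line are kept as they are; leading indentation is preserved.
    """
    stripped = append_line.lstrip()
    indent = append_line[: len(append_line) - len(stripped)]
    rewritten = []
    for part in stripped.split():
        if part.startswith("method="):
            tree = part[len("method="):].rstrip("/")
            part = f"stage2={tree}/images/install.img"
        rewritten.append(part)
    return indent + " ".join(rewritten)
```

Applied to the Fedora-14-i386 append line quoted above, this yields the stage2= line shown in this comment.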

I'm not sure how we should handle this for F15 and later, where we'll have One Big Image. Maybe no 'append' line is necessary at all?

Comment 13 Will Woods 2011-01-19 16:24:58 UTC
Chris, could you try a BFO installation like this to see if it fixes the problem?

That is: in the BFO Install menu, choose the item you want and hit Tab to change options - then change 'method=.../os/' to 'stage2=.../os/images/install.img'.

If you do that, can you get through the install without having to hit 'Retry' over and over?

Comment 14 fred2 2011-03-25 20:38:15 UTC
have had same 'dumb monkey' problems as in description, for bfo install of f14 (gave up) and now f15 alpha (in _incredibly_tedious_ progress, 'packages completed: 658 of 1098' after 20 somewhat attended hours)

if this fails, i would try the fix of comment 13, except:
for alpha is it expected that there be no images/install.img to be used in stage2?

searching thru the directory (and thus random(?) mirror) as given in comment 13 stage2 for f15, i find:

/fedora/linux/releases/test/15-Alpha/Fedora/i386/os/images/
(boot.iso, but no image.iso)

same for /fedora/linux/development/15/i386/os/images/

if this is the case, will have to wait for f15-beta to try this fix.

Comment 15 Will Woods 2011-03-28 16:39:25 UTC
To be clear: this is not a bug in anaconda, it's a problem with the configuration used by boot.fedoraproject.org. This was fixed for Fedora 14 and earlier on February 28. 

As suggested in comment #12, for Fedora 15 no "method=xxx" line is necessary. Remove it and the install will proceed without the constant "Retry?" message.

(Apparently, whoever added the Fedora 15 configuration sections on boot.fedoraproject.org didn't see the email telling them to leave out the "append method=..." part. Oh well.)

If anyone can confirm that removing "method=xxx" from the boot commandline makes the install proceed as expected, we'll change the configuration on boot.fedoraproject.org and close this bug.

Comment 16 Chris Murphy 2011-03-28 17:23:40 UTC
I have not had a chance to test because I'm unwilling to blow away the installation on the original test machine.

However, I now have a new test machine I can try this with, but it is an Intel EFI 1.1 Macbook Pro and the bfo.usb file used to create the BFO USB stick will not boot this machine at all. It also will not boot a different Macbook Pro which is a UEFI 2.x machine.

So there appear to be other problems with the state of BFO when it comes to (U)EFI machines. If someone has a suggestion on how to proceed, let me know.

Comment 17 Will Woods 2011-03-28 18:04:00 UTC
Ugh. EFI is basically black magic, and Apple's EFI implementation is so different from others as to basically be its own completely different type of black magic.

I'll just set up a virt guest to test this with.

Comment 18 Chris Murphy 2011-03-28 18:35:04 UTC
(In reply to comment #17)
> Ugh. EFI is basically black magic, and Apple's EFI implementation is so
> different from others as to basically be its own completely different type of
> black magic.

http://fedoraproject.org/wiki/Anaconda/Features/UEFI
"Other install methods (USB, PXE) are working; only CD/DVD has this problem"

Supposedly it is working or workable. It's just that there isn't a (U)EFI specific BFO image available. UEFI booting is totally different so it stands to reason there'd need to be a different BFO image. Heck maybe one isn't even needed (see below).

I know early Apple hardware is neither Intel EFI nor UEFI. It's something in the middle. The current hardware (last couple of years) should all be UEFI 2.x compliant. I have GRUB Legacy EFI and GRUB2 EFI working on both kinds of hardware. In theory I should be able to plug in the URLs for Fedora 15 found in the BFO GRUB menu from a conventional machine, into either GRUB Legacy EFI or GRUB2 EFI by command line, on either hardware variety, and it would work. But then is this really a BFO test?

There are ample explanations for why Apple did not implement Intel EFI 1.1: all other hardware vendors puked at the early EFI concept, and thus UEFI was born, abstracted away from Intel.

But if either UEFI or Apple's present implementation of it is inadequate or not conforming to the standard, then this needs to be demonstrated clearly so I, and others, can post a bug with Apple. It may go nowhere, but it certainly will go nowhere if it's not reported/complained about. 

For true black magic, see the incoherent abomination that is GRUB2 and associated (non)documentation. That's much worse than anything in the UEFI spec, or Apple's implementation of it. That is a project that demonstrates utter lack of discipline and has fractured developers on bootloader support and methods. It's not due to laziness that Red Hat continues to hack GRUB Legacy.

Comment 19 Will Woods 2011-03-28 18:58:51 UTC
...interesting, but wildly off-topic. You might consider filing a separate bug about making BFO handle (U)EFI hardware.

In the meantime, I've successfully completed a F15 install after removing the "method=..." argument from the boot commandline. So that's the workaround, and once the infrastructure folks fix the existing config file I'll close this bug.

Comment 20 Kevin Fenzi 2011-03-28 19:22:22 UTC
should be fixed now. Please confirm. ;)

Comment 21 Will Woods 2011-03-29 14:20:07 UTC
Works fine on my test system. Thanks very much!

