Red Hat Bugzilla – Bug 675685
netboot fails on ppc64 - ramdisk is >32MB
Last modified: 2013-09-12 04:16:47 EDT
Created attachment 477392 [details]
console.log from RHEL6.1-20110206.n.0
Description of problem:
ramdisk.image.gz exceeded the 32MB size and netboot fails on ppc64.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Start beaker job on ppc64 with latest nightly build
of_net_open - result: 0
<-- of_net_open - FILE_ERR_OK
yaboot_text_ui - file->buffer 2100000 -> 4100000 (2000000)
yaboot_text_ui - vmlinux 4100000 -> 5700000 (1600000)
yaboot_text_ui - initrd 0 -> 0 (0)
yaboot_text_ui - Looks like we do not need to move the kernel
of_close - <@01b36000>
of_close - of_close called
<-- of_close - 0
ramdisk load failed !
ENTER called ok
Expected results:
ramdisk is loaded without problems.
The sizes of latest ramdisks are:
RHEL6.1-20110206.n.0 - 34566900
RHEL6.1-20110204.n.0 - 34512863
RHEL6.1-20110203.n.0 - 34533430
Latest working was:
RHEL6.1-20110202.n.0 - 33088046
The limit appears to be about 32MB, which is 33554432 bytes.
A similar bug is filed against Fedora:
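As a quick cross-check of the numbers above, here is a minimal sketch that compares each listed ramdisk size with a 32MB limit. The sizes are copied from this comment; the helper name `over_limit` is only illustrative. It also assumes the buffer size `2000000` printed by yaboot in the console log is hexadecimal, which the log itself does not state.

```python
# Compressed ramdisk sizes in bytes, copied from the list in this comment.
RAMDISK_SIZES = {
    "RHEL6.1-20110206.n.0": 34566900,
    "RHEL6.1-20110204.n.0": 34512863,
    "RHEL6.1-20110203.n.0": 34533430,
    "RHEL6.1-20110202.n.0": 33088046,  # last working build
}

LIMIT = 32 * 1024 * 1024  # 32MB expressed in bytes

# Assumption: the "(2000000)" buffer size in the yaboot log is hexadecimal.
LOG_BUFFER_SIZE = int("2000000", 16)


def over_limit(sizes, limit=LIMIT):
    """Return {build: bytes over the limit} for every build that exceeds it."""
    return {build: size - limit for build, size in sizes.items() if size > limit}


if __name__ == "__main__":
    for build, size in RAMDISK_SIZES.items():
        status = "too big" if size > LIMIT else "fits"
        print(f"{build}: {size} bytes, {size - LIMIT:+d} vs limit ({status})")
```

Under these assumptions the three failing builds are the only ones above the limit, each by roughly 1MB, and the last working build is the only one below it.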
*** Bug 675288 has been marked as a duplicate of this bug. ***
I went through the contents of our initrd and found about 300KB of files that might be removable (this is not confirmed yet). That saving won't be enough, as we are currently over the limit by about 1MB.
For reference, the locales in stage1 are already stripped down to the bare minimum (en_US.utf8).
And unfortunately, since everything continues to grow over time, just removing something isn't really a sustainable fix. There's going to come a day in RHEL6 where we simply cannot remove anything else and keep the same level of functionality in anaconda. On that day, what's going to be the real answer?
Based on comment #3, the last working ppc64 tree was 20110202.n.0. The
ramdisk.image.gz file was exactly 32MB in that tree. In the current nightly
tree, the ramdisk.image.gz file has increased to 33MB.
Unpacking the trees, here's what I see:
20110202.n.0 tree - 73548k unpacked ramdisk.image.gz
20110208.n.0 tree - 75664k unpacked ramdisk.image.gz
Gathering more details.
It appears that the most recent growth in the image comes from driver updates, new firmware files, and the iSCSI userland tools.
Attaching a diff of the 20110202.n.0 image with the 20110208.n.0 image.
Created attachment 477640 [details]
------- Comment From firstname.lastname@example.org 2011-02-08 13:33 EDT-------
For the short term, can you use lzma compression instead of gzip? The ramdisk will be smaller and the kernel can still deal with it.
(In reply to comment #8)
> ------- Comment From email@example.com 2011-02-08 13:33 EDT-------
> for the short term can you use lzma compression instead of gzip. The ramdisk
> will be smaller and the kernel can still deal with it.
We are working on this on the master branch, but changing the compression format used for the ramdisk image affects a number of other components. Ideally it's fine, but we need to check all of those to ensure we don't introduce another problem.
For now, the fix we have removes some kernel modules and firmware files from ramdisk.image.gz so that it's below 32MB. We are removing the following subdirectories from /lib/modules:
And from /lib/firmware, we are removing the following subdirectories:
The test compose we just did brings us to a 31MB ramdisk.image.gz.
------- Comment From firstname.lastname@example.org 2011-02-08 17:56 EDT-------
OK, I verified that I can transfer more than 32MB. Please follow the steps below
and let me know what happens:
boot and enter the SMS menus
type '0' then 'y'; you should now be at the Open Firmware prompt.
setenv real-base c00000
dev /packages/gui obe //be ready right away to type '1' and go back into
the SMS menus
select "Setup Remote IPL (Initial Program Load)"
select appropriate network device
select "IPv4 - Address Format 22.214.171.124"
select "Advanced Setup: BOOTP"
change the Bootp Blocksize from 512 to 1024
select "M" and go to main menu and try the network install again
Again, I would also recommend using lzma compression. For the rhel6 ramdisk
it changed the file size significantly:
-r--r--r--. 1 root root 24041544 Feb 8 15:33 ramdisk.image.lzma
-r--r--r-- 1 root root 31319913 Oct 27 10:01 ramdisk.image.gz
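To put a number on the listing above, here is a minimal sketch that computes the relative saving from the two file sizes shown. It also includes a small helper for comparing gzip and lzma output sizes on arbitrary data using Python's standard library. The function names are illustrative only, and the helper is not the compose tooling used to build the tree.

```python
import gzip
import lzma

# File sizes in bytes, copied from the listing above.
GZIP_SIZE = 31319913
LZMA_SIZE = 24041544


def saving(gz_size, lzma_size):
    """Fraction by which the lzma file is smaller than the gzip file."""
    return 1 - lzma_size / gz_size


def compressed_sizes(data):
    """Compress the same bytes with gzip and legacy .lzma; return both sizes."""
    return {
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "lzma": len(lzma.compress(data, format=lzma.FORMAT_ALONE)),
    }


if __name__ == "__main__":
    print(f"lzma saves {saving(GZIP_SIZE, LZMA_SIZE):.1%} over gzip")
```

For the two files listed, the saving works out to a little under a quarter of the gzip size. Results for other payloads will vary with their content.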
We're not opposed to lzma for ramdisk.image; we're just not going to throw that in right now. All of the tree composition and booting tools that rely on the initrd being a gzip file named ramdisk.image.gz need testing to ensure that everything still works if it's a .xz file. We just don't have the time for that in this release.
------- Comment From email@example.com 2011-02-09 00:34 EDT-------
There are 2 problems causing this failure.
1. Firmware will only TFTP 64k packets of the block size. The default
block size is 512 bytes, which limits the transfer to 32MB. Changing the TFTP
block size is documented in RFC 2348. I don't know how many TFTP servers in
production support this. Certainly some do.
Mike has shown how to do it for us.
2. (Red Hat's) yaboot has a default transfer buffer #defined to 32MB. A TFTP
transfer that exceeds this size should fail. 32MB was chosen as a default
maximum size that will work on systems with a 128MB RMA.
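The 32MB ceiling in problem 1 follows from simple arithmetic: the number of packets times the block size. The sketch below assumes "64k packets" means 65536 blocks, as the comment implies; the function names are illustrative only.

```python
# Assumption: "64k packets" in the comment above means 64 * 1024 blocks.
BLOCK_COUNT = 64 * 1024


def max_transfer_bytes(block_size, block_count=BLOCK_COUNT):
    """Largest file that fits in block_count blocks of block_size bytes."""
    return block_size * block_count


def min_block_size(file_size, block_count=BLOCK_COUNT):
    """Smallest block size (in bytes) that lets file_size fit, rounded up."""
    return -(-file_size // block_count)


if __name__ == "__main__":
    for block_size in (512, 1024):
        mb = max_transfer_bytes(block_size) // (1024 * 1024)
        print(f"block size {block_size}: up to {mb}MB")
    # Size of the failing RHEL6.1-20110206.n.0 ramdisk from earlier in this bug.
    print("minimum block size:", min_block_size(34566900))
```

Under that assumption, moving the block size from 512 to 1024 doubles the TFTP ceiling, which matches the change suggested in the SMS menu steps. It does not affect the separate 32MB buffer in yaboot described in problem 2.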
Going forward there are several options:
1. Remove unneeded data from the ramdisk to keep its size under 32MB.
Provided this is done only to the ramdisk that's used for netbooting, it
should create minimal breakage (if any).
2. Switch to LZMA compression, again to keep the size under 32MB.
Both of these options require changes to the Red Hat build/test process but
should allow 6.1 to get out the door while we discuss a more long-term strategy.
3. Change yaboot to use the initrd-size option in yaboot.conf to modify the
size of the buffer allocated for the TFTP load. This will also need to
change the block size for the TFTP request, and it will need several changes
backported from upstream to RHEL's yaboot. In fact, it may be easier to
rebase yaboot completely on upstream.
4. Change yaboot to request a minimum RMA for the LPAR of 256MB, and increase
the default TFTP buffer size. I believe that there is a chance, should the
kernel specify a different value, to get into an infinite boot loop. This
option will need to be very well tested, and we'd need to talk to the FW
One drawback to approaches 3 and 4 is that almost certainly some portion of the
change to yaboot will not be acceptable upstream, so Red Hat would need to accept
the risk of shipping a version of yaboot with code that isn't upstream.
We need to think carefully about the pros and cons and, of course, the timeline for RHEL 6.1.
Retested on build RHEL6.1-20110224.2, anaconda-13.21.100-1.
-rw-r--r--. 2 root root 32961917 2011-02-25 07:02 ramdisk.image.gz
Moving to VERIFIED.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.