Created attachment 477392 [details]
console.log from RHEL6.1-20110206.n.0

Description of problem:
ramdisk.image.gz exceeds 32MB in size and netboot fails on ppc64.

Version-Release number of selected component (if applicable):
RHEL6.1-20110206.n.0
RHEL6.1-20110204.n.0
RHEL6.1-20110203.n.0

How reproducible:
always

Steps to Reproduce:
1. Start a beaker job on ppc64 with the latest nightly build

Actual results:
!EA017021 !
of_net_open - result: 0
<-- of_net_open - FILE_ERR_OK
yaboot_text_ui - file->buffer 2100000 -> 4100000 (2000000)
yaboot_text_ui - vmlinux 4100000 -> 5700000 (1600000)
yaboot_text_ui - initrd 0 -> 0 (0)
yaboot_text_ui - Looks like we do not need to move the kernel
--> of_close
of_close - <@01b36000>
of_close - of_close called
<-- of_close - 0
ramdisk load failed !
ENTER called ok
0 >

Expected results:
ramdisk is loaded without problems.

Additional info:
The sizes of the latest ramdisks are:
RHEL6.1-20110206.n.0 - 34566900
RHEL6.1-20110204.n.0 - 34512863
RHEL6.1-20110203.n.0 - 34533430

The latest working one was:
RHEL6.1-20110202.n.0 - 33088046

The limit is somewhere around 32MB, which is 33554432 bytes.

A similar bug is filed against Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=653986
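As a rough check of the numbers in this report, the minimal sketch below compares each listed ramdisk size with 32MB taken as 32 * 1024 * 1024 bytes. The names `LIMIT_BYTES`, `RAMDISK_SIZES`, and `over_limit` are illustrative only, and the exact cutoff is an assumption, since the report only places it somewhere around this value.

```python
# Assumption: the limit is exactly 32 MiB; the report only says "about 32MB".
LIMIT_BYTES = 32 * 1024 * 1024  # 33554432

# Ramdisk sizes in bytes, as listed in this report.
RAMDISK_SIZES = {
    "RHEL6.1-20110202.n.0": 33088046,  # last working tree
    "RHEL6.1-20110203.n.0": 34533430,
    "RHEL6.1-20110204.n.0": 34512863,
    "RHEL6.1-20110206.n.0": 34566900,
}


def over_limit(size_bytes, limit=LIMIT_BYTES):
    """Return how many bytes a ramdisk exceeds the limit by (negative if it fits)."""
    return size_bytes - limit


for tree, size in sorted(RAMDISK_SIZES.items()):
    delta = over_limit(size)
    status = "too large" if delta > 0 else "fits"
    print(f"{tree}: {size} bytes, {delta:+d} vs limit ({status})")
```

Under that assumption the three failing trees are each roughly 1MB over the limit, while the last working tree is a few hundred KB under it.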
*** Bug 675288 has been marked as a duplicate of this bug. ***
I went through the contents of our initrd and found about 300KB of files that might be removable (this is not confirmed yet). The space saved this way won't be enough, as we are currently over the limit by about 1MB. For reference, the locales in stage1 are already stripped down to the bare minimum (en_US.utf8).
And unfortunately, since everything continues to grow over time, just removing something isn't really a sustainable fix. There's going to come a day in RHEL6 where we simply cannot remove anything else and keep the same level of functionality in anaconda. On that day, what's going to be the real answer?
Based on comment #3, the last working ppc64 tree was 20110202.n.0. The ramdisk.image.gz file was exactly 32MB in that tree. In the current nightly tree, the ramdisk.image.gz file has increased to 33MB. Unpacking the trees, here's what I see:

20110202.n.0 tree - 73548k unpacked ramdisk.image.gz
20110208.n.0 tree - 75664k unpacked ramdisk.image.gz

Gathering more details.
Appears the most recent growth in the image has been for driver updates, new firmware files, and the iscsi userland tools. Attaching a diff of the 20110202.n.0 image with the 20110208.n.0 image.
Created attachment 477640 [details] image.diff
------- Comment From mjwolf.com 2011-02-08 13:33 EDT-------
For the short term, can you use lzma compression instead of gzip? The ramdisk will be smaller and the kernel can still deal with it.
(In reply to comment #8)
> ------- Comment From mjwolf.com 2011-02-08 13:33 EDT-------
> for the short term can you use lzma compression instead of gzip. The ramdisk
> will be smaller and the kernel can still deal with it.

We are working on this on the master branch, but changing the compression format used for the ramdisk image affects a number of other components. Ideally it's fine, but we need to check all of those to ensure we don't introduce another problem.

For now, the fix we have removes some kernel modules and firmware files from ramdisk.image.gz so that it's below 32MB. We are removing the following subdirectories from /lib/modules:

firewire
pcmcia
sound
wireless

And from /lib/firmware, we are removing the following subdirectories:

matrox
r128
radeon
zd1211

The test compose we just did brings us to a 31MB ramdisk.image.gz.
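One way to estimate how much such a removal saves before running a test compose is to sum the file sizes under each candidate subdirectory of an unpacked ramdisk tree. The following is a minimal sketch, assuming the image has already been unpacked to a local directory. The function names `tree_size` and `removal_savings` are hypothetical and not part of any compose tool, and uncompressed sizes only approximate the savings in the gzip-compressed image.

```python
import os


def tree_size(path):
    """Total size in bytes of regular files under `path` (0 if it does not exist)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so a link target is not counted twice.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def removal_savings(base, subdirs):
    """Map each subdirectory name under `base` to its uncompressed size in bytes."""
    return {name: tree_size(os.path.join(base, name)) for name in subdirs}
```

Calling `removal_savings` with the directory that holds the driver subdirectories and the list `["firewire", "pcmcia", "sound", "wireless"]` would give a per-directory breakdown. The exact base path depends on how the image was unpacked.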
------- Comment From mjwolf.com 2011-02-08 17:56 EDT-------
OK, I verified that I can transfer more than 32MB. Please do the following steps and let me know what happens:

boot and enter the SMS menus
type '0' then 'y'
you should now be at the open firmware prompt.
setenv real-base c00000
dev /packages/gui
obe
//be ready right away to type '1' and go back into the SMS menus
select "Setup Remote IPL (Initial Program Load)"
select appropriate network device
select "IPv4 - Address Format 123.231.111.222"
select "BOOTP"
select "Advanced Setup: BOOTP"
change the Bootp Blocksize from 512 to 1024
select "M" and go to main menu and try the network install again

Again, I would also recommend using the lzma compression. For the rhel6 ramdisk it changed the file size significantly:

-r--r--r--. 1 root root 24041544 Feb 8 15:33 ramdisk.image.lzma
-r--r--r-- 1 root root 31319913 Oct 27 10:01 ramdisk.image.gz
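To put the two file sizes in that listing into proportion, here is a small sketch that computes the relative saving and, separately, compresses an arbitrary byte string with both formats using only the Python standard library. The helper names are illustrative, and the in-memory comparison is an assumption about how one might experiment; it does not reproduce how the compose tools build the ramdisk.

```python
import gzip
import lzma

# Sizes in bytes from the directory listing in the comment above.
GZIP_BYTES = 31319913
LZMA_BYTES = 24041544


def savings_percent(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


def compare_compression(data):
    """Compress `data` in memory with gzip and legacy LZMA and return both sizes."""
    gz = gzip.compress(data, compresslevel=9)
    # FORMAT_ALONE selects the legacy .lzma container rather than .xz.
    lz = lzma.compress(data, format=lzma.FORMAT_ALONE)
    return {"gzip": len(gz), "lzma": len(lz)}


print(f"lzma saves {savings_percent(GZIP_BYTES, LZMA_BYTES):.1f}% over gzip")
```

For the sizes quoted, the lzma image is about 23% smaller than the gzip one. Results for other inputs depend on the data, so `compare_compression` is only useful as a quick experiment.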
We're not opposed to lzma for ramdisk.image, we're just not going to throw that in right now because all of the tree composition and booting tools that rely on the initrd being named ramdisk.image.gz and a gzip file all need testing to ensure that everything will still work if it's a .xz file. We just don't have the time for that for this release.
------- Comment From tonyb.com 2011-02-09 00:34 EDT-------
There are 2 problems causing this failure.

1. Firmware will only TFTP 64k packets of one block each. The default block size is 512 bytes, which limits the transfer to 32MB. Changing the TFTP block size is documented in RFC 2348. I don't know how many TFTP servers in production support this. Certainly some do. Mike has shown how to do it for us.
2. (Red Hat's) Yaboot has a default transfer buffer #defined to 32MB. A TFTP transfer that exceeds this size should fail. 32MB was chosen as a default maximum size that will work on systems with 128MB RMA.

Going forward there are several options:

1. Remove unneeded data from the ramdisk to keep its size under 32MB. Provided this is done only to the ramdisk that's used for netbooting, that should create minimal breakage (if any).
2. Switch to LZMA compression, again to keep the size under 32MB.

Both of these options require changes to the Red Hat build/test process but should allow 6.1 to get out the door while we discuss a more long-term strategy.

3. Change yaboot to use the initrd-size option in yaboot.conf to modify the size of the buffer allocated for the TFTP load. This will also need to change the block size for the TFTP request. It will also need several changes backported from upstream to RHEL's yaboot. In fact it may be easier to rebase yaboot completely on upstream.
4. Change yaboot to request a minimum RMA for the LPAR of 256MB, and increase the default TFTP buffer size. I believe that there is a chance, should the kernel specify a different value, to get into an infinite boot loop. This option will need to be very well tested, and we'd need to talk to the FW team.
One drawback to approaches 3 and 4 is that almost certainly some portion of the change to yaboot will not be acceptable upstream, so Red Hat would need to accept the risk of shipping a version of yaboot with code that isn't upstream. We need to think carefully about the pros and cons and, of course, the timeline for RHEL 6.1.
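The TFTP block-size limit described in this comment can be worked through numerically. The sketch below assumes the firmware accepts at most 65535 data blocks (a 16-bit block counter with no wraparound) and that the last block of a transfer must be shorter than the block size. Both are assumptions for illustration, as are the function names; the comment itself only says "64k packets".

```python
# Assumption: at most 65535 data blocks per transfer (16-bit counter, no wraparound).
MAX_BLOCKS = 65535


def max_transfer_bytes(block_size, max_blocks=MAX_BLOCKS):
    """Upper bound on the bytes carried by `max_blocks` blocks of `block_size` bytes."""
    return block_size * max_blocks


def fits(file_bytes, block_size, max_blocks=MAX_BLOCKS):
    """True if the file fits; the final block must be short, hence the strict '<'."""
    return file_bytes < max_transfer_bytes(block_size, max_blocks)


def min_block_size(file_bytes, max_blocks=MAX_BLOCKS):
    """Smallest block size in bytes for which `file_bytes` still fits."""
    return file_bytes // max_blocks + 1


for blksize in (512, 1024):
    print(blksize, max_transfer_bytes(blksize), fits(34566900, blksize))
```

With these assumptions, a 512-byte block size caps the transfer just under 32MB, so the 34566900-byte ramdisk from the original report does not fit, while doubling the block size to 1024 leaves ample room.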
Retested on build RHEL6.1-20110224.2, anaconda-13.21.100-1.

-rw-r--r--. 2 root root 32961917 2011-02-25 07:02 ramdisk.image.gz

Moving to VERIFIED.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0530.html