| Summary: | netboot fails on ppc64 - ramdisk is >32MB | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Martin Banas <mbanas> |
| Component: | anaconda | Assignee: | David Cantrell <dcantrell> |
| Status: | CLOSED ERRATA | QA Contact: | Martin Banas <mbanas> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.1 | CC: | borgan, hannsj_uhl, jburke, jjarvis, jstodola, mbanas, msivak, rwilliam, sbest, syeghiay |
| Target Milestone: | beta | Keywords: | TestBlocker |
| Target Release: | 6.1 | | |
| Hardware: | ppc64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | anaconda-13.21.96-1 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-05-19 12:37:34 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | 670159 | | |
| Bug Blocks: | 1006043 | | |
| Attachments: | | | |

Description
Martin Banas
2011-02-07 10:36:38 UTC
*** Bug 675288 has been marked as a duplicate of this bug. ***

I went through the contents of our initrd and found about 300KB of files that might be removable (this is not certain yet). The space saved won't be enough, as we are over the limit by about 1MB at the moment. For reference, the locales in stage1 are already stripped down to the bare minimum (en_US.utf8).

And unfortunately, since everything continues to grow over time, just removing something isn't really a sustainable fix. There's going to come a day in RHEL6 where we simply cannot remove anything else and keep the same level of functionality in anaconda. On that day, what's going to be the real answer?

Based on comment #3, the last working ppc64 tree was 20110202.n.0. The ramdisk.image.gz file was exactly 32MB in that tree. In the current nightly tree, the ramdisk.image.gz file has increased to 33MB. Unpacking the trees, here's what I see:

20110202.n.0 tree - 73548k unpacked ramdisk.image.gz
20110208.n.0 tree - 75664k unpacked ramdisk.image.gz

Gathering more details.

It appears the most recent growth in the image has been for driver updates, new firmware files, and the iscsi userland tools. Attaching a diff of the 20110202.n.0 image with the 20110208.n.0 image.

Created attachment 477640
image.diff
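
For anyone repeating the size comparison above, here is a minimal Python sketch of the measurement. It assumes the image is a plain gzip file that fits in memory and that the limit is exactly 32 * 1024 * 1024 bytes. The helper names `ramdisk_sizes` and `bytes_over_limit` are illustrative only and are not part of anaconda or the compose tools.

```python
import gzip
from typing import Tuple

# Assumption: the 32MB netboot limit is exactly 32 MiB.
LIMIT_BYTES = 32 * 1024 * 1024


def ramdisk_sizes(compressed: bytes) -> Tuple[int, int]:
    """Return (compressed_size, unpacked_size) in bytes for a gzip image."""
    unpacked = gzip.decompress(compressed)
    return len(compressed), len(unpacked)


def bytes_over_limit(compressed_size: int, limit: int = LIMIT_BYTES) -> int:
    """Return how many bytes the image exceeds the limit by (0 if it fits)."""
    return max(0, compressed_size - limit)
```

To use it, read the image with `open("ramdisk.image.gz", "rb").read()`, pass the bytes to `ramdisk_sizes`, and pass the compressed size to `bytes_over_limit`.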
------- Comment From mjwolf.com 2011-02-08 13:33 EDT-------
For the short term, can you use lzma compression instead of gzip? The ramdisk will be smaller and the kernel can still deal with it.

(In reply to comment #8)
> ------- Comment From mjwolf.com 2011-02-08 13:33 EDT-------
> for the short term can you use lzma compression instead of gzip. The ramdisk
> will be smaller and the kernel can still deal with it.

We are working on this on the master branch, but changing the compression format used for the ramdisk image affects a number of other components. In principle it's fine, but we need to check all of those to ensure we don't introduce another problem.

For now, the fix we have removes some kernel modules and firmware files from ramdisk.image.gz so that it's below 32MB. We are removing the following subdirectories from /lib/modules:

  firewire
  pcmcia
  sound
  wireless

And from /lib/firmware, we are removing the following subdirectories:

  matrox
  r128
  radeon
  zd1211

The test compose we just did brings us to a 31MB ramdisk.image.gz.

------- Comment From mjwolf.com 2011-02-08 17:56 EDT-------
OK, verified that I can transfer more than 32MB. Please do the following steps and let me know what happens:

boot and enter the SMS menus
type '0' then 'y'
you should now be at the open firmware prompt.
setenv real-base c00000
dev /packages/gui
obe
//be ready right away to type '1' and go back into the SMS menus
select "Setup Remote IPL (Initial Program Load)"
select appropriate network device
select "IPv4 - Address Format 123.231.111.222"
select "BOOTP"
select "Advanced Setup: BOOTP"
change the Bootp Blocksize from 512 to 1024
select "M" and go to main menu and try the network install again

Again, I would also recommend using the lzma compression. For the rhel6 ramdisk it changed the file size significantly:

-r--r--r--. 1 root root 24041544 Feb 8 15:33 ramdisk.image.lzma
-r--r--r-- 1 root root 31319913 Oct 27 10:01 ramdisk.image.gz

We're not opposed to lzma for ramdisk.image. We're just not going to throw that in right now, because all of the tree composition and booting tools that rely on the initrd being named ramdisk.image.gz and being a gzip file need testing to ensure that everything will still work if it's a .xz file. We just don't have the time for that for this release.

------- Comment From tonyb.com 2011-02-09 00:34 EDT-------
There are 2 problems causing this failure.

1. Firmware will only transfer 64k TFTP packets, each of block-size bytes. The default block size is 512 bytes, which limits the transfer to 32MB. Changing the TFTP block size is documented in RFC 2348. I don't know how many TFTP servers in production support this. Certainly some do. Mike has shown how to do it for us.

2. (Red Hat's) yaboot has a default transfer buffer #defined to 32MB. Exceeding this size in the TFTP transfer should fail. 32MB was chosen as a default maximum size that will work on systems with 128MB RMA.

Going forward there are several options:

1. Remove unneeded data from the ramdisk to keep its size under 32MB. Provided this is done only to the ramdisk that's used for netbooting, it should create minimal breakage (if any).

2. Switch to LZMA compression, again to keep the size under 32MB.

Both of these options require changes to the Red Hat build/test process but should allow 6.1 to get out the door while we discuss a more long-term strategy.

3. Change yaboot to use the initrd-size option in yaboot.conf to modify the size of the buffer allocated for the TFTP load. This will also need to change the block size for the TFTP request. It will also need several changes backported from upstream to RHEL's yaboot. In fact, it may be easier to rebase yaboot completely on upstream.

4. Change yaboot to request a minimum RMA for the LPAR of 256MB, and increase the default TFTP buffer size. I believe that there is a chance, should the kernel specify a different value, of getting into an infinite boot loop. This option will need to be very well tested, and we'd need to talk to the FW team.

One drawback to approaches 3 and 4 is that almost certainly some portion of the change to yaboot will not be acceptable upstream, so Red Hat would need to accept the risk of shipping a version of yaboot with code that isn't upstream.

We need to think carefully about the pros and cons and of course the timeline for RHEL 6.1.

Retested on build RHEL6.1-20110224.2, anaconda-13.21.100-1.

-rw-r--r--. 2 root root 32961917 2011-02-25 07:02 ramdisk.image.gz

Moving to VERIFIED.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0530.html
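
For reference, the 32MB TFTP ceiling described in the comment from tonyb above is just the block count multiplied by the block size. Below is a minimal Python sketch of that arithmetic. It assumes the firmware stops after 64 * 1024 blocks, as that comment describes, and it ignores yaboot's separate 32MB buffer. The function names are illustrative and do not come from yaboot or the firmware.

```python
# Assumption: "64k packets" in the comment means 64 * 1024 blocks per transfer.
BLOCKS_PER_TRANSFER = 64 * 1024


def max_tftp_bytes(block_size: int, blocks: int = BLOCKS_PER_TRANSFER) -> int:
    """Largest transfer, in bytes, for a given TFTP block size."""
    return block_size * blocks


def fits_in_transfer(image_bytes: int, block_size: int) -> bool:
    """True if an image of image_bytes fits within one TFTP transfer."""
    return image_bytes <= max_tftp_bytes(block_size)


if __name__ == "__main__":
    for size in (512, 1024):
        print(size, max_tftp_bytes(size))
```

Under this assumption the default 512-byte block size gives 33554432 bytes (32MB), and the 1024-byte Bootp Blocksize from the SMS steps doubles that to 64MB. The retested 32961917-byte ramdisk.image.gz fits within the 512-byte limit.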