Bug 250412

Summary: RHEL5 failed to PXE boot a Xen kernel (x86_64)
Product: Red Hat Enterprise Linux 5 Reporter: George Beshers <gbeshers>
Component: syslinux Assignee: Peter Jones <pjones>
Status: CLOSED WONTFIX QA Contact: Brock Organ <borgan>
Severity: high Docs Contact:
Priority: medium    
Version: 5.1 CC: clalance, edwardsg, holt, jfenal, jh, jlan, jlim, martinez, pasteur, tao, xen-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-08-27 08:58:07 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 244727, 252963, 253736    

Description George Beshers 2007-08-01 13:52:25 UTC
PV968135

Description of problem:

PXE boot of a RHEL5 Xen kernel failed on our x86_64 systems.
The pxelinux boot loader is from syslinux-3.11-4. When it failed,
the console displayed: "Incorrect or corrupt kernel image".
Both xen.gz from 2.6.18-8.el5xen and 2.6.18-28.el5xen failed the
same way.

Both the server and the diskless client are x86_64. The server is running
2.6.18-8.el5xen.

I noticed BZ 220132, but that BZ was on IA64 and concerned a Xen boot
failure locally. The Xen kernel _DID_ boot fine locally on RHEL5
on my XE systems. This PV is about PXE boot.

**NOTE: I have asked for this to be retested with 5.1 beta.



Comment 1 George Beshers 2007-08-09 14:35:30 UTC
This is what was observed in the lab at SGI.  Note that the failure is
at an early stage.

The diskless client would send a request for netbooting, and the DHCP server
acknowledged it. The tftp server then tried to send the initial kernel, which is
xen.gz in the case of a Xen kernel, but the client console complained that
xen.gz is "incorrect or corrupted".

The vmlinuz, initrd, and nfsroot have not come into play yet.
On the tftp server, under /tftpboot, I have:
[root@dopple tftpboot]# ll
total 8684
-rw-r--r-- 1 root root 1912503 Jul 23 16:13 initrd-nfsroot-28.img
-rw-r--r-- 1 root root 2584655 Jul 23 16:13 initrd-nfsroot-28-xen.img
-rw-r--r-- 1 root root   13148 Jul 23 16:10 pxelinux.0
drwxr-xr-x 2 root root    4096 Jul 27 11:05 pxelinux.cfg
-rwxr-xr-x 1 root root 1840860 Jul 23 16:13 vmlinuz-2.6.18-28.el5
-rwxr-xr-x 1 root root 1914419 Jul 23 16:13 vmlinuz-2.6.18-28.el5xen
-rw-r--r-- 1 root root  276909 Jul 23 16:13 xen.gz-2.6.18-28.el
-rw-r--r-- 1 root root  276332 Jul 24 13:13 xen.gz-2.6.18-8.el5
[root@dopple tftpboot]# 

The vmlinuz and xen.gz are from Red Hat. The 'pxelinux.0' is from Red Hat as well.
I used this command to create the initrd:
mkinitrd -v -f --preload=e1000 --preload=nfs --net-dev="eth0" \
    --rootdev=150.166.37.66:/var/lib/nfsroot/flipper-xen --rootfs=nfs \
    /boot/initrd-nfsroot-28-xen.img 2.6.18-28.el5xen

from the local disk of the supposedly diskless machine while the local
disk is still mounted.


Under the /tftpboot/pxelinux.cfg/ directory, I have:
[root@dopple tftpboot]# ls -l pxelinux.cfg
total 4
-rw-r--r-- 1 root root 438 Jul 24 13:14 96A62539
[root@dopple tftpboot]# 
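
For readers unfamiliar with the file name above: PXELINUX looks up a per-client
config file named after the client's IPv4 address written as uppercase hex.
A minimal sketch that decodes such a name back into a dotted quad follows. The
function name pxe_name_to_ip is made up for illustration and is not part of
syslinux.

```shell
# Hypothetical helper (not part of syslinux): decode an 8-digit hex
# pxelinux.cfg file name into the dotted-quad IPv4 address it stands for.
pxe_name_to_ip() {
    name=$1
    # Each pair of hex digits is one octet of the address.
    o1=$(printf '%s' "$name" | cut -c1-2)
    o2=$(printf '%s' "$name" | cut -c3-4)
    o3=$(printf '%s' "$name" | cut -c5-6)
    o4=$(printf '%s' "$name" | cut -c7-8)
    # printf accepts 0x-prefixed arguments for %d and prints them in decimal.
    printf '%d.%d.%d.%d\n' "0x$o1" "0x$o2" "0x$o3" "0x$o4"
}

pxe_name_to_ip 96A62539
```

If the decoded address does not match the address the DHCP server hands to the
client, the client would not pick up this file.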


The content of 96A62539 is as below:
[root@dopple tftpboot]# more *.cfg/96A62539
prompt 10
timeout 15
#serial --unit=1 --speed=38400 --word=8 --parity=no --stop=1
#terminal --timeout=5 --dumb serial console
#rfsroot=150.166.37.66:/

default el5nfs-xen

label el5nfs
  kernel vmlinuz-2.6.18-28.el5
  APPEND initrd=initrd-nfsroot-28.img console=ttyS1,38400n8 console=tty0

label el5nfs-xen
  kernel xen.gz-2.6.18-8.el5
  APPEND vmlinuz-2.6.18-28.el5xen initrd=initrd-nfsroot-28-xen.img
console=ttyS1,38400n8 console=tty0

I change the default line to boot either the nfsroot of regular rhel5 (-28
kernel) or the nfsroot of xen kernel.
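
A quick sanity check when switching entries like this is to confirm that every
file the config names is actually present in the tftp root. The sketch below is
only an illustration: missing_pxe_files is a hypothetical helper, it looks only
at KERNEL and APPEND lines, and it treats any APPEND token other than initrd=
that contains "=" as a kernel argument rather than a file.

```shell
# Hypothetical helper: print files named on KERNEL/APPEND lines of a
# pxelinux config that do not exist under the given tftp root.
missing_pxe_files() {
    cfg=$1
    root=$2
    # Keep KERNEL and APPEND lines (keywords are case-insensitive),
    # strip the keyword, then emit one token per line.
    grep -i -E '^[[:space:]]*(kernel|append)[[:space:]]' "$cfg" |
    sed -E 's/^[[:space:]]*[A-Za-z]+[[:space:]]+//' |
    tr -s ' \t' '\n\n' |
    while read -r tok; do
        case $tok in
            ''|'---') continue ;;             # empty token or mboot separator
            initrd=*) tok=${tok#initrd=} ;;   # the initrd is a file to fetch
            *=*) continue ;;                  # other key=value kernel arguments
        esac
        [ -e "$root/$tok" ] || printf '%s\n' "$tok"
    done
}
```

For example, missing_pxe_files /tftpboot/pxelinux.cfg/96A62539 /tftpboot would
print nothing if all named files are in place. A passing check says nothing
about whether the boot loader can actually load the image, which is the failure
reported here.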


On the NFS server, I exported the rhel5 nfsroot at
/var/lib/nfsroot/flipper   
and rhel5 xen nfsroot at
/var/lib/nfsroot/flipper-xen





Comment 2 Jonathan Lim 2007-08-10 19:21:12 UTC
I upgraded both test server and client machines to RHEL5.1-Beta
(2.6.18-36.el5) yesterday and repeated the setup procedure above.
The problem remains unresolved.

RHEL5.1-Beta comes with syslinux-3.11-4.x86_64.rpm, but I also
tried installing syslinux-3.51-1.x86_64.rpm that I got from

  http://www.kernel.org/pub/linux/utils/boot/syslinux/RPMS/x86_64/

The new /usr/share/syslinux/pxelinux.0 copied to /tftpboot on the
server does not solve the problem either.


Comment 3 Jonathan Lim 2007-08-10 21:34:52 UTC
I've got RHEL5.1-Beta PXE boot to Xen working now:

  1. Install syslinux-3.51-1.x86_64.rpm on the server from the
     www.kernel.org link above (syslinux-3.11-4 does not work).

  2. Copy /usr/share/syslinux/{pxelinux.0,mboot.c32} to /tftpboot.

  3. Replace the contents of /tftpboot/pxelinux.cfg/96A62539 with

       DEFAULT mboot.c32 xen.gz-2.6.18-36.el5 --- vmlinuz-2.6.18-36.el5xen \
         console=ttyS1,38400n8 console=tty0 --- initrd-nfsroot-36-xen.img
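
The DEFAULT line in step 3 has a fixed shape: mboot.c32, then the hypervisor,
the dom0 kernel with its arguments, and the initrd, separated by "---". A
minimal sketch that assembles such a line from its parts is shown below. The
function name mboot_default_line is hypothetical, and it emits the entry as one
physical line.

```shell
# Hypothetical helper: build a pxelinux DEFAULT line that chains through
# mboot.c32, with the modules separated by "---".
mboot_default_line() {
    hypervisor=$1    # e.g. xen.gz-2.6.18-36.el5
    kernel=$2        # e.g. vmlinuz-2.6.18-36.el5xen
    kernel_args=$3   # arguments passed to the dom0 kernel
    initrd=$4        # e.g. initrd-nfsroot-36-xen.img
    printf 'DEFAULT mboot.c32 %s --- %s %s --- %s\n' \
        "$hypervisor" "$kernel" "$kernel_args" "$initrd"
}

mboot_default_line xen.gz-2.6.18-36.el5 vmlinuz-2.6.18-36.el5xen \
    'console=ttyS1,38400n8 console=tty0' initrd-nfsroot-36-xen.img
```

Redirecting the output into the client's file under /tftpboot/pxelinux.cfg/
would replace its contents as described in step 3.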


Comment 4 Jonathan Lim 2007-08-16 19:05:57 UTC
syslinux-3.20-1 also works; from /usr/share/doc/syslinux-3.20/NEWS:

Changes in 3.20:
        * EXTLINUX: New options --install (-i) and --update (-U), to
          make it clear if a boot loader should be installed or
          updated.  For now, defaults to --install for compatibility;
          a future version will require one of these options.
        * New library functions to load and place files in memory.
        * mboot.c32 bug fixes.
        * Remove 8 MB kernel size restriction.
        * Add "klibc" target for building unix/syslinux and
          extlinux/extlinux with klcc (klibc-1.4.27 or later.)
        * PXELINUX: Fail (and eventually reboot) if no configuration
          file was found.
        * COM32 module by Erwan Velu to make decisions based on DMI
          info.
        * Fix issue where going back and forth between menus a lot
          would cause a hang.
        * ISOLINUX: Fix bug which made "cd boot sectors" not work.


Comment 5 Jonathan Lim 2007-08-18 00:53:10 UTC
More details on this: as long as I have the following line in
/tftpboot/pxelinux.cfg/...

  DEFAULT mboot.c32 xen.gz-2.6.18-36.el5 ...

then it doesn't matter if pxelinux.0 is from syslinux-3.11 or syslinux-3.20.
The Xen kernel will start to boot without the "Incorrect or corrupt kernel
image" message.

However, if I use the mboot.c32 from syslinux-3.11, the Xen boot will
subsequently fail with the following message:

  (XEN) Not enough memory to stash the DOM0 kernel image.

With the mboot.c32 from syslinux-3.20, the Xen kernel boots without errors.


Comment 6 Jonathan Lim 2007-12-13 21:13:06 UTC
I obtained com32/modules/mboot.c from

  http://www.kernel.org/pub/linux/utils/boot/syslinux/Old/syslinux-3.20.tar.gz

and replaced the one that came with syslinux-3.11-4.src.rpm on RHEL5.1-GA.

I then built mboot.c32 with the other existing syslinux-3.11-4 source and
was able to PXE boot a client running 2.6.18-53.el5xen (RHEL5.1-GA).

The original mboot.c from syslinux-3.11-4.src.rpm is "COM32 Multiboot
loader v0.1" whereas the one from syslinux-3.20 is "... v0.2".
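
To tell which mboot.c32 is sitting in /tftpboot, one option is to look for that
version banner in the module itself. This is a sketch under an assumption: that
the "Multiboot loader v..." banner from mboot.c ends up as a plain string in the
compiled mboot.c32, which has not been verified here. The function name
mboot_banner is made up for illustration.

```shell
# Hypothetical helper: print the "Multiboot loader vX.Y" banner found in a
# COM32 module, assuming the banner is stored as plain text in the binary.
mboot_banner() {
    # -a treats the binary file as text; -o prints only the matching part.
    grep -a -o 'Multiboot loader v[0-9.]*' "$1" | head -n 1
}
```

For example, mboot_banner /tftpboot/mboot.c32 would be expected to print
"Multiboot loader v0.2" for the working module if the assumption holds, and
nothing at all if the banner is not stored as text.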


Comment 7 Chris Lalancette 2008-02-27 03:17:19 UTC
OK.  So it looks like there isn't really a bug here, just some wacky
configuration options needed.  Does someone who got this working want to
volunteer to write a KBase about this, and then we can close out this bug?

Thanks,
Chris Lalancette

Comment 8 Jonathan Lim 2008-03-07 20:05:12 UTC
All that's needed is syslinux-3.20, mboot.c specifically.


Comment 9 Don Domingo 2008-04-02 02:16:50 UTC
Hi,
the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at
which point no further additions or revisions will be entertained.

a mockup of the RHEL5.2 release notes can be viewed at the following link:
http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html

please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
bug number.

Cheers,
Don

Comment 11 Chris Lalancette 2009-08-27 08:58:07 UTC
Given that the corresponding IT lapsed long ago, I'm closing this out as WONTFIX.

Chris Lalancette