Bug 113357

Summary: Kernel recompiled with PAGE_SIZE != 16KB fails to boot on IA64 machine
Product: Red Hat Enterprise Linux 3 Reporter: Zhang Yanmin <yanmin.zhang>
Component: kernel Assignee: Arjan van de Ven <arjanv>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium    
Version: 3.0 CC: petrides, riel, tao, yanmin.zhang
Target Milestone: ---   
Target Release: ---   
Hardware: ia64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-01-13 13:30:39 UTC
Attachments:
Description: Fix bug 113357 (Flags: none)

Description Zhang Yanmin 2004-01-13 02:08:26 UTC
Description of problem:
After recompiling the EL3 kernel with PAGE_SIZE != 16KB, the kernel
fails to boot on an IA64 machine. The boot messages show an error at
line 112 of page_alloc.c.

Version-Release number of selected component (if applicable):
linux-2.4.21-4.EL

How reproducible:
Recompile the EL3 kernel with PAGE_SIZE=4KB or 8KB, then boot the
machine with the new image.

Steps to Reproduce:
1. Recompile the kernel:
# make menuconfig
Choose General setup -> Kernel page size = 4KB, then save the config and
exit menuconfig.
# make dep
# make vmlinux
# make modules
# make modules_install
Then copy the kernel image to /boot/efi/efi/redhat and update elilo.conf.
# reboot

Actual results:
The boot messages show an error at line 112 of page_alloc.c, and
booting stops.

Expected results:
Kernel booting succeeds.

Additional info:
I will attach the root cause analysis and a patch to this bug report.

Comment 1 Zhang Yanmin 2004-01-13 02:45:20 UTC
Created attachment 96921 [details]
Fix bug 113357

When PAGE_SIZE != 16KB, the EL3 kernel can't boot on an IA64 machine, and the
boot messages show an error at page_alloc.c line 112. That line, in function
destroy_compound_page, is a BUG check that verifies the order parameter is
really the order used by __alloc_pages. When the kernel allocates a
task_struct (16KB), it chooses the order IA64_TASK_STRUCT_LOG_NUM_PAGES based
on PAGE_SIZE. However, when it releases the task_struct, it passes a hardcoded
order of 1. The two orders differ whenever PAGE_SIZE != 16KB, so the BUG check
in destroy_compound_page is triggered.
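
The mismatch can be illustrated with a small userspace sketch. This is not
the EL3 source and not the attached patch. All names here (fake_page,
task_alloc_order, alloc_task, free_task_hardcoded, free_task_fixed) are
hypothetical, and the order values in the table are assumptions chosen only
so that 16KB pages give order 1, which is the one case the hardcoded free
handles correctly.

```c
/* Userspace illustration only; no kernel code is used here. */

/* Hypothetical stand-in for a page-size-dependent allocation order.
 * The values are assumed for illustration: 16KB pages map to order 1. */
int task_alloc_order(unsigned long page_size)
{
    switch (page_size) {
    case 4096UL:  return 3;
    case 8192UL:  return 2;
    case 16384UL: return 1;
    default:      return 0;
    }
}

/* Records the order a block was allocated with. */
struct fake_page {
    int alloc_order;
};

/* Allocation side: the order depends on the configured page size. */
struct fake_page alloc_task(unsigned long page_size)
{
    struct fake_page p;
    p.alloc_order = task_alloc_order(page_size);
    return p;
}

/* Models the order check on the free path:
 * returns 1 if the check passes, 0 if a BUG would fire. */
int order_check_passes(const struct fake_page *p, int free_order)
{
    return p->alloc_order == free_order;
}

/* Buggy free path: always passes order 1. */
int free_task_hardcoded(const struct fake_page *p)
{
    return order_check_passes(p, 1);
}

/* Corrected free path: uses the same order as the allocation side. */
int free_task_fixed(const struct fake_page *p, unsigned long page_size)
{
    return order_check_passes(p, task_alloc_order(page_size));
}
```

In this sketch, a block from alloc_task(4096) fails free_task_hardcoded but
passes free_task_fixed, while a block from alloc_task(16384) passes both.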

I tested the patch on a Tiger-2 (IA64) machine many times. All runs but one
succeeded. In the failing run, SCSI kept resetting again and again. I used
my own config file, with all needed modules compiled directly into the
kernel.

One question: I checked the base 2.4 and 2.6 kernels, and neither has this
bug. Why did Red Hat put a hardcoded 1 there? Was it to force EL3 to
support only 16KB pages on purpose?

Comment 2 Zhang Yanmin 2004-01-13 02:55:21 UTC
One more note: this error is the only problem that prevents
booting with PAGE_SIZE != 16KB.

Comment 3 Arjan van de Ven 2004-01-13 13:30:39 UTC
We explicitly don't support recompiled kernels, and especially not kernels
with a different userspace ABI.