Description of problem:
RHEL3 U2 beta (2.4.21-11.EL kernel) only supports IO port space for one IO chassis. Cards in a second IO chassis that use IO port space do not work correctly.

Version-Release number of selected component (if applicable):
linux kernel 2.4.21-11.EL

How reproducible:
Configure a partition with two cells and two IO chassis. Observed on an Oly with cells 0 & 1 in a partition (cells 2 & 3 would not have any IO attached, so would be unusable). Should also occur on an Eiger with both cells in one partition, or an Oly with an IOX. The requirement is to have two cells, each of which has attached IO.

Load cards that use IO port space in both IO chassis. We saw the problem with a smartarray card in the IO chassis attached to the root cell, and a quad Tulip in the other IO chassis.

The smartarray card at PCI 30:01.0 was allocated IO ports 0x8000-0x80ff. The quad Tulip card (bridge at PCI 50:01.0, NICs below the bridge at PCI 51:04.0, 51:05.0, 51:06.0, and 51:07.0) was allocated IO ports 0x8000-0x807f, 0x8080-0x80ff, 0x8100-0x817f, and 0x8180-0x81ff. All the Tulip IO port ranges are wrong because the entire 64KB legacy IO port space (0x0-0xffff) gets routed to the IO chassis on the root cell, but the Tulip is in the other IO chassis.

The code to support IO port space for multiple IO chassis appeared in the linux-2.4.21-ia64-030702 patch and in 2.5 on 2003-05-06 (http://linux.bkbits.net:8080/linux-2.5/cset%403eb808e5esoL8Pj4ppGCpghKStZb4w). For some reason, part of this support is already in RHEL3 U2 (the changes in include/asm-ia64/io.h to __ia64_mk_io_addr() and the setup changes in arch/ia64/kernel/setup.c), but the code to discover IO and mem windows from ACPI is missing.

The attached patch is derived from the current 2.6 code and fills in the missing bits. With the patch, IO port space for the root cell's IO chassis is 0x0-0xffff as before. IO port space for the other IO chassis becomes 0x1000000-0x100ffff.
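To make the port-space split concrete, here is a minimal Python sketch; it is not the kernel code. It assumes that bits 24 and up of a port number select the IO space and the low 24 bits are the offset within that space. That assumption is consistent with the 0x1000000 base quoted above, but the constant IO_SPACE_BITS and the helper split_port() are illustrative names, and the "after" ports are simply the pre-patch Tulip offsets rebased for illustration.

```python
# Assumption: bits 24 and up select the IO space, which matches the
# 0x1000000 base reported for the second IO chassis.
IO_SPACE_BITS = 24


def split_port(port):
    """Return (space number, offset within that space) for a port number."""
    return port >> IO_SPACE_BITS, port & ((1 << IO_SPACE_BITS) - 1)


# Start of each Tulip port range as allocated before the patch (from this report).
tulip_before = [0x8000, 0x8080, 0x8100, 0x8180]
# The same offsets rebased into the second chassis' window (illustrative only).
tulip_after = [0x1000000 + p for p in tulip_before]

for port in tulip_before + tulip_after:
    space, offset = split_port(port)
    print(f"port 0x{port:07x} -> space {space}, offset 0x{offset:04x}")
```

Under that assumption, every pre-patch Tulip port decodes to space 0, the root cell's chassis, which is why the NICs in the other chassis never saw their accesses.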
In addition, the specific ranges routed to each PCI bus are now shown in /proc/ioports (and /proc/iomem). For example:

# cat /proc/ioports
00000000-00000fff : PCI Bus 0000:00
  00000500-000005ff : sym53c8xx
  00000600-000006ff : sym53c8xx
  00000700-000007ff : sym53c8xx
  00000800-000008ff : sym53c8xx
00001000-00001fff : PCI Bus 0000:02
00002000-00003fff : PCI Bus 0000:04
00004000-00005fff : PCI Bus 0000:08
00006000-00007fff : PCI Bus 0000:0c
00008000-00009fff : PCI Bus 0000:10
0000a000-0000bfff : PCI Bus 0000:14
0000c000-0000dfff : PCI Bus 0000:18
0000e000-0000ffff : PCI Bus 0000:1c
01000000-01000fff : PCI Bus 0000:20
  01000500-010005ff : sym53c8xx
  01000600-010006ff : sym53c8xx
  01000700-010007ff : sym53c8xx
  01000800-010008ff : sym53c8xx
01001000-01001fff : PCI Bus 0000:22
01002000-01003fff : PCI Bus 0000:24
01004000-01005fff : PCI Bus 0000:28
01006000-01007fff : PCI Bus 0000:2c
01008000-01009fff : PCI Bus 0000:30
  01008000-0100807f : tulip
  01008080-010080ff : tulip
  01008100-0100817f : tulip
  01008180-010081ff : tulip
0100a000-0100bfff : PCI Bus 0000:34
  0100a000-0100a07f : tulip
  0100a080-0100a0ff : tulip
  0100a100-0100a17f : tulip
  0100a180-0100a1ff : tulip
0100c000-0100dfff : PCI Bus 0000:38
0100e000-0100ffff : PCI Bus 0000:3c
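As a rough illustration of how the bus windows in that listing relate to the device ranges, the following Python sketch parses a few lines in the /proc/ioports format and reports which "PCI Bus" window encloses each device range. The excerpt is copied from the example above; the function names parse_ioports() and owning_bus() are made up for this sketch, and matching by range containment is an assumption about how to read the listing, not something the kernel exports.

```python
import re

# Short excerpt of the example output above.
SAMPLE = """\
00000000-00000fff : PCI Bus 0000:00
00000500-000005ff : sym53c8xx
01008000-01009fff : PCI Bus 0000:30
01008000-0100807f : tulip
01008080-010080ff : tulip
"""

# "start-end : name", with optional leading indentation.
LINE_RE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+) : (.+)$")


def parse_ioports(text):
    """Turn /proc/ioports-style text into (start, end, name) tuples."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            entries.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3).strip()))
    return entries


def owning_bus(entries):
    """For each non-bus entry, find the bus window that fully contains it."""
    buses = [e for e in entries if e[2].startswith("PCI Bus")]
    result = []
    for start, end, name in entries:
        if name.startswith("PCI Bus"):
            continue
        bus = next((b[2] for b in buses if b[0] <= start and end <= b[1]), None)
        result.append((name, start, end, bus))
    return result


for name, start, end, bus in owning_bus(parse_ioports(SAMPLE)):
    print(f"{name:10s} {start:08x}-{end:08x} routed via {bus}")
```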
Created attachment 99565 [details]
add the missing IO port space support

I used spec changes like this:

--- SPECS/kernel-2.4.spec.orig 2004-03-08 17:36:53.000000000 -0700
+++ SPECS/kernel-2.4.spec 2004-04-19 16:40:54.000000000 -0600
@@ -19,7 +19,7 @@
 # that the kernel isn't the stock RHL kernel, for example by
 # adding some text to the end of the version number.
 #
-%define release 11.EL
+%define release 11.EL.BH1
 %define sublevel 21
 %define kversion 2.4.%{sublevel}
 # /usr/src/%{kslnk} -> /usr/src/linux-%{KVERREL}
@@ -232,6 +232,7 @@
 Patch274: linux-2.4.20-ia64-mcadump.patch
 Patch276: linux-2.4.21-ia64-fancyiommu.patch
 Patch278: linux-2.4.21-ia64-cyclone.patch
+Patch280: linux-2.4.21-ia64-ioport.patch
 # PPC is 340 - 419
 Patch340: linux-2.4.21-ppc64-core.patch
@@ -759,6 +760,8 @@
 %patch276 -p1
 # cyclone timer support
 %patch278 -p1
+# multiple I/O port space support
+%patch280 -p1
 # s390 s390x
I have tested this patch and verified that it works and doesn't break the kernel ABI. However, we need to change it so that the output of /proc/ioports is not altered, because that might break utilities that read that file. Bjorn agreed to provide a new patch that does this. Once that is done we will include it in RHEL3 U3; it is too late for U2.

Larry Woodman
Created attachment 99964 [details]
Updated patch (no root bridge regions in /proc/io{mem,ports})

Here's the updated patch that doesn't put root bridge regions in /proc/iomem and /proc/ioports.
A fix for this problem has been committed to the RHEL3 U3 patch pool this evening (in kernel version 2.4.21-15.6.EL).
HP, can you confirm that the latest U3-candidate kernel resolves the issues you were seeing?
From the weekly RHEL meeting, jneedle: "this is absolutely not a stop-ship in my mind"
I configured a Superdome to try to reproduce this bug using RHEL3u3 public beta (our beta 2) with the 2.4.21-18.EL kernel:

-------------------------------+
Cabinet      |    0   |    1   |
-------------+--------+--------+
Slot         |01234567|01234567|
-------------+--------+--------+
Partition 0  |X.....X.|........|
Partition 1  |....X...|........|
Partition 2  |.XX.....|........|
Partition 3  |.....X..|........|
Partition 4  |...X...X|........|

--------------------------+
Cabinet |    0   |    1   |
--------+--------+--------+
Slot    |01234567|01234567|
--------+--------+--------+
Cell    |XXXXXXXX|........|
IO Cab  |008800..|........|
IO Bay  |100010..|........|
IO Chas |311313..|........|

I installed the OS on Partition 2, which you can see has 2 IO bays. I put a fibre channel card in cabinet 1, chassis 1 and a 4-port tulip in cabinet 8 (our IOX), chassis 1. Sharon Smith ran some tests. It looks like the bug is fixed. Her results will follow.
After checking the configuration, I ran cat /proc/ioports again to check the fix. Here's the info:

cat /proc/ioports
00000d00-00000d7f : tulip
00001000-000010ff : sym53c8xx
00003000-000030ff : qla2300
00003100-000031ff : qla2300
01000d00-01000d7f : tulip
01001000-010010ff : sym53c8xx
01002000-0100207f : tulip
01002080-010020ff : tulip
01002100-0100217f : tulip
01002180-010021ff : tulip
01008000-010080ff : sym53c8xx
0100b000-0100b0ff : sym53c8xx

~Sharon
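One way to sanity-check a listing like this is to confirm that no two device ranges overlap, since overlapping allocations were the symptom in the original report. The Python sketch below does that for the ranges quoted above; find_overlaps() is a hypothetical helper written for this illustration, and the "smartarray" label in the second call is just a tag for the pre-patch ranges from the description, not a driver name taken from the output.

```python
# Ranges copied from the /proc/ioports output in this comment.
RANGES = [
    (0x00000d00, 0x00000d7f, "tulip"),
    (0x00001000, 0x000010ff, "sym53c8xx"),
    (0x00003000, 0x000030ff, "qla2300"),
    (0x00003100, 0x000031ff, "qla2300"),
    (0x01000d00, 0x01000d7f, "tulip"),
    (0x01001000, 0x010010ff, "sym53c8xx"),
    (0x01002000, 0x0100207f, "tulip"),
    (0x01002080, 0x010020ff, "tulip"),
    (0x01002100, 0x0100217f, "tulip"),
    (0x01002180, 0x010021ff, "tulip"),
    (0x01008000, 0x010080ff, "sym53c8xx"),
    (0x0100b000, 0x0100b0ff, "sym53c8xx"),
]


def find_overlaps(ranges):
    """Return neighbouring pairs (after sorting by start) that overlap.

    An empty result means all ranges are pairwise disjoint.
    """
    ordered = sorted(ranges)
    clashes = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur[0] <= prev[1]:
            clashes.append((prev, cur))
    return clashes


print("overlaps after the fix:", find_overlaps(RANGES))

# Pre-patch allocations from the original description, for comparison.
before = [(0x8000, 0x80ff, "smartarray"), (0x8000, 0x807f, "tulip")]
print("overlaps before the fix:", find_overlaps(before))
```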
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-433.html