Bug 104567 - system hang while running some file creation tests on our IA64
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: ia64 Linux
Priority: medium
Severity: high
Assigned To: Larry Woodman
QA Contact: Brian Brock
Reported: 2003-09-17 06:53 EDT by Marc Varel
Modified: 2007-11-30 17:06 EST
Doc Type: Bug Fix
Last Closed: 2004-12-20 15:54:43 EST
Attachments
undo fancyswiommupatch (16.73 KB, text/plain)
2003-12-04 03:53 EST, Marc Varel

Description Marc Varel 2003-09-17 06:53:46 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.78 [en] (X11; U; AIX 4.3)

Description of problem:
Working with Red Hat AS 2.1 or Red Hat AS 3.0 beta 2, we are hitting a
system hang while running some file creation tests on our IA64
platform with large physical memory (> 16 GB).

The platform currently supports 16 Itanium 2 processors and 64 GB of physical memory.
The physical memory is implemented in a sparse space of 256 GB as
follows:

Size    Physical start address
 2 GB     0
14 GB     6 GB (after the 4 GB PCI gap)
16 GB    64 GB
16 GB   128 GB
16 GB   192 GB

The Red Hat kernels currently divide memory into:
Low  memory:  2 GB (at addresses < 4 GB)
High memory: 62 GB

The test shows that the system finally freezes with low memory exhausted
while a lot of high memory is still available. Kernel messages show that
ENOMEM is reported.
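
The 2 GB / 62 GB split follows directly from the layout in the report. A minimal sketch that recomputes it, assuming the only rule is "RAM below the 4 GB boundary is low memory" (the function and constant names are illustrative, not kernel identifiers):

```python
GB = 1 << 30

# (start address, size) of each populated region, taken from the layout in the report.
REGIONS = [
    (0 * GB, 2 * GB),
    (6 * GB, 14 * GB),
    (64 * GB, 16 * GB),
    (128 * GB, 16 * GB),
    (192 * GB, 16 * GB),
]

LOW_LIMIT = 4 * GB  # boundary quoted in the report: low memory is RAM below 4 GB


def split_low_high(regions, limit=LOW_LIMIT):
    """Return (low_bytes, high_bytes): RAM below `limit` and RAM at or above it."""
    low = high = 0
    for start, size in regions:
        end = start + size
        below = max(0, min(end, limit) - start)  # part of this region under the limit
        low += below
        high += size - below
    return low, high


low, high = split_low_high(REGIONS)
print(f"low = {low // GB} GB, high = {high // GB} GB, total = {(low + high) // GB} GB")
```

Only the first 2 GB region sits below the boundary, so every allocation restricted to low memory competes for about 3% of the installed RAM.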


This problem is not seen when we run the same test on ia64 2.4.20 or 2.5 kernels
from kernel.org. We did not see it either when using the Red Hat kernel with
less than 16 GB of memory (in this case low memory stays stable while high memory
drops to low levels).

Compared with the 2.4 and 2.5 Linux kernel lines for IA64, it seems that
the use of high memory is specific to Red Hat kernels.

We would like to know:

. What is the reason for using the high memory feature, given that the IA64
architecture supports direct access to the whole physical address
space from the processor and from I/O devices? (Until now we thought this
was only a 32-bit processor feature.)

. Is there a way to disable this feature, either with a boot parameter or by
building a new kernel with CONFIG_HIGHMEM disabled?

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Run some file creation tests on an IA64 platform with large physical
memory (> 16 GB).


Additional info:
Comment 1 Arjan van de Ven 2003-09-17 07:06:05 EDT
>
>. what is the reason for using the High memory feature, knowing that IA64
>architecture supports direct access to the whole physical address
>space from processor and from IO devices ? (Until now we thought this
>was only a 32 bit processor feature).

We only use the high memory feature on systems with a crippled, light chipset
(e.g. one without an IOMMU); it is required for system stability there.

What chipset do you use?
Comment 2 Marc Varel 2003-09-17 07:25:00 EDT
Chipset Intel 82870
Comment 3 Arjan van de Ven 2003-09-17 07:26:30 EDT
ah yes that's one without an iommu...
Comment 4 Ingo Molnar 2003-09-17 08:32:21 EDT
could you please attach the /proc/slabinfo, /proc/meminfo and full bootup-log
(kernel messages) files to this report?
Comment 5 Marc Varel 2003-09-19 09:14:02 EDT
meminfo

        total:    used:    free:  shared: buffers:  cached:
Mem:  68213407744 945766400 67267641344        0 55377920 347095040
Swap: 2098135040        0 2098135040
MemTotal:     66614656 kB
MemFree:      65691056 kB
MemShared:           0 kB
Buffers:         54080 kB
Cached:         338960 kB
SwapCached:          0 kB
Active:         402112 kB
ActiveAnon:     136192 kB
ActiveCache:    265920 kB
Inact_dirty:    127120 kB
Inact_laundry:       0 kB
Inact_clean:         0 kB
Inact_target:   105840 kB
HighTotal:    65010976 kB
HighFree:     64436528 kB
LowTotal:      1603680 kB
LowFree:       1254528 kB
SwapTotal:     2048960 kB
SwapFree:      2048960 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB
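
The zone figures in this snapshot can be cross-checked mechanically. A minimal sketch, assuming the `Name: value kB` layout shown above (the parser is illustrative, not a standard tool), that confirms the low and high zones add up to the totals and reports how small the low zone is:

```python
# Subset of the /proc/meminfo snapshot above; values are in kB.
MEMINFO = """\
MemTotal:     66614656 kB
MemFree:      65691056 kB
HighTotal:    65010976 kB
HighFree:     64436528 kB
LowTotal:      1603680 kB
LowFree:       1254528 kB
"""


def parse_meminfo(text):
    """Map each 'Name: <number> kB' line to an integer number of kB."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


info = parse_meminfo(MEMINFO)

# The two zones should account for all memory, both total and free.
assert info["LowTotal"] + info["HighTotal"] == info["MemTotal"]
assert info["LowFree"] + info["HighFree"] == info["MemFree"]

low_share = info["LowTotal"] / info["MemTotal"]
low_free = info["LowFree"] / info["LowTotal"]
print(f"low zone is {low_share:.1%} of RAM; {low_free:.1%} of it is free")
```

Both checks pass on this snapshot. Most of the low zone is still free here, so the snapshot does not show the exhausted state described in the report; the same script run against a snapshot taken during the test would show how quickly LowFree drops.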



slabinfo
slabinfo - version: 1.1 (SMP)
kmem_cache           111    111    432    3    3    1 : 2108  527
ip_fib_hash         1356   1356     32    3    3    1 : 4284 1071
urb_priv             492    492    128    4    4    1 : 4284 1071
ext3_xattr             0      0     80    0    0    1 : 4284 1071
journal_head        6885   9027     88   44   51    1 : 4284 1071
revoke_table        1632   1632     16    2    2    1 : 4284 1071
revoke_record       2400   2400     64   10   10    1 : 4284 1071
ip_mrt_cache           0      0    128    0    0    1 : 4284 1071
tcp_tw_bucket        186    186    256    3    3    1 : 4284 1071
tcp_bind_bucket     3120   3120     64   13   13    1 : 4284 1071
tcp_open_request     738    738    128    6    6    1 : 4284 1071
inet_peer_cache      246    246    128    2    2    1 : 4284 1071
secpath_cache          0      0    256    0    0    1 : 4284 1071
xfrm_dst_cache         0      0    384    0    0    1 : 2108  527
ip_dst_cache         252    252    384    6    6    1 : 2108  527
arp_cache            186    186    256    3    3    1 : 4284 1071
flow_cache             0      0    128    0    0    1 : 4284 1071
blkdev_requests     7254   7254    256  117  117    1 : 4284 1071
kioctx                 0      0    256    0    0    1 : 4284 1071
kiocb                  0      0    256    0    0    1 : 4284 1071
dnotify_cache        742    742     40    2    2    1 : 4284 1071
file_lock_cache     1316   1316    168   14   14    1 : 4284 1071
async poll table       0      0    272    0    0    1 : 2108  527
fasync_cache        1162   1162     24    2    2    1 : 4284 1071
uid_cache           1200   1200     64    5    5    1 : 4284 1071
skbuff_head_cache  20877  21948    256  351  354    1 : 4284 1071
sock                 424    424   1920   53   53    1 : 1020  255
sigqueue           19024  19024    136  164  164    1 : 4284 1071
kiobuf                 0      0    128    0    0    1 : 4284 1071
cdev_cache           984    984    128    8    8    1 : 4284 1071
bdev_cache          1845   1845    128   15   15    1 : 4284 1071
mnt_cache            369    369    128    3    3    1 : 4284 1071
inode_cache        11682  11682    896  649  649    1 : 2108  527
dentry_cache       15624  15624    256  252  252    1 : 4284 1071
dquot                  0      0    256    0    0    1 : 4284 1071
filp                1922   1922    256   31   31    1 : 4284 1071
names_cache           72     72   4096   18   18    1 : 1020  255
buffer_head       108880 108880    200 1361 1361    1 : 4284 1071
mm_struct            527    527    512   17   17    1 : 2108  527
vm_area_struct      8990   8990    256  145  145    1 : 4284 1071
fs_cache            1968   1968    128   16   16    1 : 4284 1071
files_cache          360    360    896   20   20    1 : 2108  527
signal_cache        1968   1968    128   16   16    1 : 4284 1071
sighand_cache        243    243   1664   27   27    1 : 1020  255
task_struct            0      0   4480    0    0    2 : 1020  255
pte_chain          12915  12915    128  105  105    1 : 4284 1071
size-131072(DMA)       0      0 131072    0    0    8 :    0    0
size-131072            0      0 131072    0    0    8 :    0    0
size-65536(DMA)        0      0  65536    0    0    4 :    0    0
size-65536           659    659  65536  659  659    4 :    0    0
size-32768(DMA)        0      0  32768    0    0    2 :    0    0
size-32768           194    194  32768  194  194    2 :    0    0
size-16384(DMA)        0      0  16384    0    0    1 : 1020  255
size-16384          1248   1248  16384 1248 1248    1 : 1020  255
size-8192(DMA)         0      0   8192    0    0    1 : 1020  255
size-8192            216    216   8192  108  108    1 : 1020  255
size-4096(DMA)         0      0   4096    0    0    1 : 1020  255
size-4096           2752   2752   4096  688  688    1 : 1020  255
size-2048(DMA)         0      0   2048    0    0    1 : 1020  255
size-2048           1000   1000   2048  125  125    1 : 1020  255
size-1024(DMA)         0      0   1024    0    0    1 : 2108  527
size-1024           4575   4575   1024  305  305    1 : 2108  527
size-512(DMA)          0      0    512    0    0    1 : 2108  527
size-512           10540  10540    512  340  340    1 : 2108  527
size-256(DMA)          0      0    256    0    0    1 : 4284 1071
size-256            9052   9052    256  146  146    1 : 4284 1071
size-128(DMA)        123    123    128    1    1    1 : 4284 1071
size-128            8856   8856    128   72   72    1 : 4284 1071
size-64(DMA)         123    123    128    1    1    1 : 4284 1071
size-64             5904   5904    128   48   48    1 : 4284 1071
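
To see which caches hold the most memory, the slab counts can be converted to kB. A minimal sketch, assuming the slabinfo 1.1 column order (name, active objects, total objects, object size, active slabs, total slabs, pages per slab) and a 16 KB page size; the page size is an inference from the table, where size-16384 uses 1 page per slab and size-32768 uses 2:

```python
PAGE_KB = 16  # assumption: 16 KB pages, inferred from the size-16384 and size-32768 rows

# A few rows copied from the slabinfo output above.
SAMPLE = """\
inode_cache        11682  11682    896  649  649    1 : 2108  527
dentry_cache       15624  15624    256  252  252    1 : 4284 1071
buffer_head       108880 108880    200 1361 1361    1 : 4284 1071
size-65536           659    659  65536  659  659    4 :    0    0
size-16384          1248   1248  16384 1248 1248    1 : 1020  255
"""


def slab_usage_kb(text, page_kb=PAGE_KB):
    """Return {cache name: kB held in slabs} for slabinfo 1.1 formatted rows."""
    usage = {}
    for line in text.splitlines():
        stats, _, _tunables = line.partition(":")  # drop the SMP limit/batchcount part
        cols = stats.split()
        if len(cols) < 7:
            continue  # not a data row
        # The last six columns are numeric; the name may contain spaces.
        name = " ".join(cols[:-6])
        total_slabs, pages_per_slab = int(cols[-2]), int(cols[-1])
        usage[name] = total_slabs * pages_per_slab * page_kb
    return usage


usage = slab_usage_kb(SAMPLE)
for name, kb in sorted(usage.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<16} {kb:>8} kB")
```

Running the same function over the full table, and over a second capture taken near the hang, would show which caches grow as files are created.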



boot.log

Sep 18 16:51:13 clusquad15 syslog: syslogd startup succeeded
Sep 18 16:51:13 clusquad15 syslog: klogd startup succeeded
Sep 18 16:51:13 clusquad15 portmap: portmap startup succeeded
Sep 18 16:51:13 clusquad15 nfslock: rpc.statd startup succeeded
Sep 18 16:51:13 clusquad15 keytable: Loading keymap:
Sep 18 16:51:13 clusquad15 keytable:
Sep 18 16:51:13 clusquad15 keytable:
Sep 18 16:51:13 clusquad15 keytable: Loading system font:
Sep 18 16:51:13 clusquad15 keytable:
Sep 18 16:51:13 clusquad15 keytable:
Sep 18 16:51:13 clusquad15 rc: Starting keytable:  succeeded
Sep 18 16:51:13 clusquad15 random: Initializing random number generator:  succeeded
Sep 18 16:51:13 clusquad15 netfs: Mounting other filesystems:  succeeded
Sep 18 16:51:13 clusquad15 autofs: automount startup succeeded
Sep 18 16:51:14 clusquad15 acpid: acpid startup succeeded
Sep 18 16:51:09 clusquad15 network: Setting network parameters:  succeeded
Sep 18 16:51:09 clusquad15 network: Bringing up loopback interface:  succeeded
Sep 18 16:51:17 clusquad15 cups: cupsd startup succeeded
Sep 18 16:51:17 clusquad15 sshd: RSA1 key generation succeeded
Sep 18 16:51:17 clusquad15 sshd: RSA key generation succeeded
Sep 18 16:51:19 clusquad15 sshd: DSA key generation succeeded
Sep 18 16:51:19 clusquad15 sshd:  succeeded
Sep 18 16:51:19 clusquad15 xinetd: xinetd startup succeeded
Sep 18 16:51:20 clusquad15 sendmail: sendmail startup succeeded
Sep 18 16:51:20 clusquad15 sendmail: sm-client startup succeeded
Sep 18 16:51:20 clusquad15 gpm: gpm startup succeeded
Sep 18 16:51:22 clusquad15 canna:  succeeded
Sep 18 16:51:23 clusquad15 crond: crond startup succeeded
Sep 18 16:51:32 clusquad15 xfs: xfs startup succeeded
Sep 18 16:51:32 clusquad15 atd: atd startup succeeded
Sep 18 16:51:32 clusquad15 firstboot: .
Sep 18 16:51:32 clusquad15 firstboot:
Sep 18 16:51:32 clusquad15 firstboot: XFree86 Version 4.3.0
Sep 18 16:51:32 clusquad15 firstboot:  (Red Hat Linux release: 4.3.0-16.EL)
Sep 18 16:51:32 clusquad15 firstboot: Release Date: 9 May 2003
Sep 18 16:51:32 clusquad15 firstboot: X Protocol Version 11, Revision 0, Release 6.6
Sep 18 16:51:32 clusquad15 firstboot: Build Operating System: Linux 2.4.21-1.1931.2.307.ent ia64 [ELF]
Sep 18 16:51:32 clusquad15 firstboot: Build Date: 10 July 2003
Sep 18 16:51:32 clusquad15 firstboot: Build Host: boris.devel.redhat.com
Sep 18 16:51:32 clusquad15 firstboot:
Sep 18 16:51:32 clusquad15 firstboot: Before reporting any problems, please make sure you are using the most
Sep 18 16:51:32 clusquad15 firstboot: recent XFree86 packages available from Red Hat by checking for updates
Sep 18 16:51:32 clusquad15 firstboot: at http://rhn.redhat.com/errata or by using the Red Hat Network up2date
Sep 18 16:51:32 clusquad15 firstboot: tool.  If you still encounter problems, please file bug reports in the
Sep 18 16:51:32 clusquad15 firstboot: XFree86.org bugzilla at http://bugs.xfree86.org and/or Red Hat
Sep 18 16:51:32 clusquad15 firstboot: bugzilla at http://bugzilla.redhat.com
Sep 18 16:51:32 clusquad15 firstboot:
Sep 18 16:51:32 clusquad15 firstboot: Module Loader present
Sep 18 16:51:32 clusquad15 firstboot: OS Kernel: Linux version 2.4.21-1.1931.2.399.ent (bhcompile@boris.devel.redhat.com) (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-16)) #1 SMP Wed Aug 20 15:23:44 EDT 2003
Sep 18 16:51:32 clusquad15 firstboot: Markers: (--) probed, (**) from config file, (==) default setting,
Sep 18 16:51:32 clusquad15 firstboot:          (++) from command line, (!!) notice, (II) informational,
Sep 18 16:51:32 clusquad15 firstboot:          (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
Sep 18 16:51:32 clusquad15 firstboot: (==) Log file: "/var/log/XFree86.1.log", Time: Thu Sep 18 16:51:32 2003
Sep 18 16:51:32 clusquad15 firstboot: (==) Using config file: "/etc/X11/XF86Config"
Sep 18 16:51:33 clusquad15 firstboot: .
Sep 18 16:51:37 clusquad15 firstboot: Window manager warning: Failed to a open connection to a session manager, so window positions will not be saved: SESSION_MANAGER environment variable not defined
Sep 18 16:52:17 clusquad15 rc: Starting firstboot:  succeeded
Sep 19 11:46:42 clusquad15 atd: atd shutdown succeeded
Sep 19 11:46:42 clusquad15 rc: Stopping keytable:  succeeded
Sep 19 11:46:42 clusquad15 cups: cupsd shutdown succeeded
Sep 19 11:46:43 clusquad15 xfs: xfs shutdown succeeded
Sep 19 11:46:43 clusquad15 canna: Stopping Canna server: succeeded
Sep 19 11:46:43 clusquad15 FreeWnn: jserver shutdown succeeded
Sep 19 11:46:43 clusquad15 gpm: gpm shutdown succeeded
Sep 19 11:46:43 clusquad15 sshd: sshd -TERM succeeded
Sep 19 11:46:43 clusquad15 sendmail: sendmail shutdown succeeded
Sep 19 11:46:43 clusquad15 sendmail: sm-client shutdown succeeded
Sep 19 11:46:43 clusquad15 xinetd: xinetd shutdown succeeded
Sep 19 11:46:43 clusquad15 acpid: acpid shutdown failed
Sep 19 11:46:43 clusquad15 crond: crond shutdown succeeded
Sep 19 11:46:43 clusquad15 dd: 1+0 records in
Sep 19 11:46:43 clusquad15 dd: 1+0 records out
Sep 19 11:46:43 clusquad15 random: Saving random seed:  succeeded
Sep 19 11:46:44 clusquad15 nfslock: rpc.statd shutdown succeeded
Sep 19 11:46:44 clusquad15 portmap: portmap shutdown succeeded
Sep 19 11:46:45 clusquad15 syslog: klogd shutdown succeeded
Sep 19 12:07:32 clusquad15 syslog: syslogd startup succeeded
Sep 19 12:07:32 clusquad15 syslog: klogd startup succeeded
Sep 19 12:07:32 clusquad15 portmap: portmap startup succeeded
Sep 19 12:07:32 clusquad15 nfslock: rpc.statd startup succeeded
Sep 19 12:07:32 clusquad15 keytable: Loading keymap:
Sep 19 12:07:33 clusquad15 keytable:
Sep 19 12:07:33 clusquad15 keytable:
Sep 19 12:07:33 clusquad15 keytable: Loading system font:
Sep 19 12:07:33 clusquad15 keytable: [
Sep 19 12:07:33 clusquad15 keytable:
Sep 19 12:07:33 clusquad15 rc: Starting keytable:  succeeded
Sep 19 12:07:33 clusquad15 random: Initializing random number generator:  succeeded
Sep 19 12:07:33 clusquad15 netfs: Mounting other filesystems:  succeeded
Sep 19 12:07:33 clusquad15 autofs: automount startup succeeded
Sep 19 12:07:33 clusquad15 acpid: acpid startup succeeded
Sep 19 12:07:32 clusquad15 network: Bringing up interface eth0:  succeeded
Sep 19 12:07:36 clusquad15 cups: cupsd startup succeeded
Sep 19 12:07:36 clusquad15 sshd:  succeeded
Sep 19 12:07:36 clusquad15 xinetd: xinetd startup succeeded
Sep 19 12:07:37 clusquad15 sendmail: sendmail startup succeeded
Sep 19 12:07:37 clusquad15 sendmail: sm-client startup succeeded
Sep 19 12:07:37 clusquad15 gpm: gpm startup succeeded
Sep 19 12:07:39 clusquad15 canna:  succeeded
Sep 19 12:07:40 clusquad15 crond: crond startup succeeded
Sep 19 12:07:47 clusquad15 xfs: xfs startup succeeded
Sep 19 12:07:47 clusquad15 atd: atd startup succeeded


Comment 6 Marc Varel 2003-09-24 04:40:24 EDT
dmesg:

CPU 5: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1624 cycles)
CPU 5: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 5: checking for saved MCA error records
Calibrating delay loop... 1943.56 BogoMIPS
CPU5: CPU has booted.
CPU 6: mapping PAL code [0x7fe00000-0x7fe40000) into [0xe00000007f000000-0xe000000080000000)
CPU 6: 61 virtual and 50 physical address bits
CPU 6: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1624 cycles)
CPU 6: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 6: checking for saved MCA error records
Calibrating delay loop... 1938.44 BogoMIPS
CPU6: CPU has booted.
CPU 7: mapping PAL code [0x7fe00000-0x7fe40000) into [0xe00000007f000000-0xe000000080000000)
CPU 7: 61 virtual and 50 physical address bits
CPU 7: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1624 cycles)
CPU 7: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 7: checking for saved MCA error records
Calibrating delay loop... 1938.44 BogoMIPS
CPU7: CPU has booted.
CPU 8: mapping PAL code [0x7fe00000-0x7fe40000) into [0xe00000007f000000-0xe000000080000000)
CPU 8: 61 virtual and 50 physical address bits
CPU 8: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1635 cycles)
CPU 8: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 8: checking for saved MCA error records
Calibrating delay loop... 1938.44 BogoMIPS
CPU8: CPU has booted.
CPU 9: mapping PAL code [0x7fe00000-0x7fe40000) into [0xe00000007f000000-0xe000000080000000)
CPU 9: 61 virtual and 50 physical address bits
CPU 9: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1635 cycles)
CPU 9: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 9: checking for saved MCA error records
Calibrating delay loop... 1934.32 BogoMIPS
CPU9: CPU has booted.
CPU 10: mapping PAL code [0x7fe00000-0x7fe40000) into [0xe00000007f000000-0xe000000080000000)
CPU 10: 61 virtual and 50 physical address bits
CPU 10: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1635 cycles)
CPU 10: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 10: checking for saved MCA error records
Calibrating delay loop... 1934.32 BogoMIPS
CPU10: CPU has booted.
CPU 11: mapping PAL code [0x7fe00000-0x7fe40000) into [0xe00000007f000000-0xe000000080000000)
CPU 11: 61 virtual and 50 physical address bits
CPU 11: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1635 cycles)
CPU 11: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 11: checking for saved MCA error records
Calibrating delay loop... 1934.32 BogoMIPS
CPU11: CPU has booted.
CPU 12: mapping PAL code [0x7fe00000-0x7fe40000) into [0xe00000007f000000-0xe000000080000000)
CPU 12: 61 virtual and 50 physical address bits
CPU 12: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1635 cycles)
CPU 12: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 12: checking for saved MCA error records
Calibrating delay loop... 1934.32 BogoMIPS
CPU12: CPU has booted.
CPU 13: mapping PAL code [0x7fe00000-0x7fe40000) into [0xe00000007f000000-0xe000000080000000)
CPU 13: 61 virtual and 50 physical address bits
CPU 13: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1635 cycles)
CPU 13: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 13: checking for saved MCA error records
Calibrating delay loop... 1930.20 BogoMIPS
CPU13: CPU has booted.
CPU 14: mapping PAL code [0x7fe00000-0x7fe40000) into [0xe00000007f000000-0xe000000080000000)
CPU 14: 61 virtual and 50 physical address bits
CPU 14: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1635 cycles)
CPU 14: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 14: checking for saved MCA error records
Calibrating delay loop... 1930.20 BogoMIPS
CPU14: CPU has booted.
CPU 15: mapping PAL code [0x7fe00000-0x7fe40000) into [0xe00000007f000000-0xe000000080000000)
CPU 15: 61 virtual and 50 physical address bits
CPU 15: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 1635 cycles)
CPU 15: base freq=200.162MHz, ITC ratio=13/2, ITC freq=1301.056MHz
CPU 15: checking for saved MCA error records
Calibrating delay loop... 1930.20 BogoMIPS
CPU15: CPU has booted.
Before bogomips.
Total of 16 processors activated (31013.80 BogoMIPS).
Starting migration thread for cpu 0
Starting migration thread for cpu 1
Starting migration thread for cpu 2
Starting migration thread for cpu 3
Starting migration thread for cpu 4
Starting migration thread for cpu 5
Starting migration thread for cpu 6
Starting migration thread for cpu 7
Starting migration thread for cpu 8
Starting migration thread for cpu 9
Starting migration thread for cpu 10
Starting migration thread for cpu 11
Starting migration thread for cpu 12
Starting migration thread for cpu 13
Starting migration thread for cpu 14
Starting migration thread for cpu 15
ACPI: Subsystem revision 20030619
PCI: Using SAL to access configuration space
ACPI: Interpreter enabled
ACPI: Using IOSAPIC for interrupt routing
ACPI: System [ACPI] (supports S0 S5)
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Ignoring BAR0-3 of IDE controller 00:1f.1
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.H2PB._PRT]
ACPI: PCI Root Bridge [PC01] (00:02)
ACPI: PCI Interrupt Routing Table [\_SB_.PC01.P2PA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PC01.P2PB._PRT]
ACPI: PCI Root Bridge [PC02] (00:05)
ACPI: PCI Interrupt Routing Table [\_SB_.PC02.P2PA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PC02.P2PB._PRT]
ACPI: PCI Root Bridge [PC03] (00:0a)
ACPI: PCI Interrupt Routing Table [\_SB_.PC03.P2PA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PC03.P2PB._PRT]
ACPI: PCI Root Bridge [PC04] (00:10)
ACPI: PCI Interrupt Routing Table [\_SB_.PC04.P2PA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PC04.P2PB._PRT]
ACPI: PCI Root Bridge [CSFF] (00:ff)
CPU 0: checking for saved MCA error records
PCI->APIC IRQ transform: (00:1d.0 INTA) -> CPU 0xc018 vector 48
PCI->APIC IRQ transform: (00:1d.1 INTB) -> CPU 0xc218 vector 49
PCI: no interrupt route for 00:00:1f pin A
PCI: no interrupt route for 00:00:1f pin B
PCI->APIC IRQ transform: (01:00.0 INTA) -> CPU 0xc418 vector 48
PCI->APIC IRQ transform: (01:01.0 INTA) -> CPU 0xc618 vector 50
PCI->APIC IRQ transform: (01:02.0 INTA) -> CPU 0xc819 vector 50
PCI->APIC IRQ transform: (03:01.0 INTA) -> CPU 0xca19 vector 51
PCI->APIC IRQ transform: (03:1f.0 INTA) -> CPU 0xcc19 vector 52
PCI->APIC IRQ transform: (04:01.0 INTA) -> CPU 0xce19 vector 53
PCI->APIC IRQ transform: (04:01.1 INTB) -> CPU 0xd01a vector 54
PCI->APIC IRQ transform: (04:1f.0 INTA) -> CPU 0xd21a vector 55
PCI->APIC IRQ transform: (06:1f.0 INTA) -> CPU 0xd41a vector 56
PCI->APIC IRQ transform: (09:01.0 INTA) -> CPU 0xd61a vector 57
PCI->APIC IRQ transform: (09:1f.0 INTA) -> CPU 0xd81b vector 58
PCI->APIC IRQ transform: (0b:1f.0 INTA) -> CPU 0xda1b vector 59
PCI->APIC IRQ transform: (0e:1f.0 INTA) -> CPU 0xdc1b vector 60
PCI->APIC IRQ transform: (11:1f.0 INTA) -> CPU 0xde1b vector 61
PCI->APIC IRQ transform: (14:01.0 INTA) -> CPU 0xc018 vector 62
PCI->APIC IRQ transform: (14:1f.0 INTA) -> CPU 0xc218 vector 63
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
perfmon: version 1.3 IRQ 238
perfmon: 16 PMCs, 18 PMDs, 4 counters (47 bits)
PAL Information Facility v0.5
EFI Variables Facility v0.05 2002-Mar-26
Total HugeTLB memory allocated, 0
Starting kswapd
allocated 32 pages and 32 bhs reserved for the highmem bounces
VFS: Disk quotas vdquot_6.5.1
aio_setup: num_physpages = 1048222
aio_setup: sizeof(struct page) = 112
Hugetlbfs mounted.
initialize_kbd: Keyboard reset failed, no ACK
pty: 2048 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with MANY_PORTS MULTIPORT SHARE_IRQ DETECT_IRQ SERIAL_PCI SERIAL_ACPI enabled
ttyS0 at 0x03f8 (irq = 44) is a 16550A
ttyS1 at 0x02f8 (irq = 45) is a 16550A
register_serial(): autoconfig failed
register_serial(): autoconfig failed
EFI Time Services Driver v0.4
NET4: Frame Diverter 0.46
RAMDISK driver initialized: 256 RAM disks of 8192K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00beta4-2.4
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH4: IDE controller at PCI slot 00:1f.1
PCI: Device 00:1f.1 not available because of resource collisions
PCI: Found IRQ 0 for device 00:1f.1
ICH4: Not fully BIOS configured!
ICH4: chipset revision 1
ICH4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x1000-0x1007, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0x1008-0x100f, BIOS settings: hdc:DMA, hdd:pio
hda: LS-120/240 00 UHD Floppy, ATAPI FLOPPY drive
hdc: MATSHITADVD-ROM SR-8177, ATAPI CD/DVD-ROM drive
global_restore_flags: 10084a6010 (e0000000049776a0)
ide0 at 0x1f0-0x1f7,0x3f6 on irq 34
global_restore_flags: 10084a2010 (e0000000049776a0)
ide1 at 0x170-0x177,0x376 on irq 33
ide-floppy driver 0.99.newide
hda: attached ide-floppy driver.
hda: No disk in drive
hda: 234752kB, 262/32/56 CHS, 2995 kBps, 512 sector size, 1500 rpm
ide-floppy driver 0.99.newide
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
Initializing Cryptographic API
NET4: Linux TCP/IP 1.0 for NET4.0
IP: routing cache hash table of 1048576 buckets, 16384Kbytes
TCP: Hash tables configured (established 1048576 bind 65536)
Linux IP multicast router 0.06 plus PIM-SM
Initializing IPsec netlink socket
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 1376kB freed
VFS: Mounted root (ext2 filesystem).
SCSI subsystem driver Revision: 1.00
PCI: Found IRQ 53 for device 04:01.0
PCI: Found IRQ 54 for device 04:01.1
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
        <Adaptec 3960D Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
        <Adaptec 3960D Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
        aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs

Starting timer : 0 0
blk: queue e0000000786868b0, I/O limit 524287Mb (mask 0x7fffffffff)
(scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 127, 16bit)
  Vendor: CNSi      Model: JSS122            Rev: L425
  Type:   Direct-Access                      ANSI SCSI revision: 03
Starting timer : 0 0
blk: queue e0000000786864b0, I/O limit 524287Mb (mask 0x7fffffffff)
  Vendor: CNSi      Model: JSS122            Rev: L425
  Type:   Direct-Access                      ANSI SCSI revision: 03
Starting timer : 0 0
blk: queue e00000007869afb0, I/O limit 524287Mb (mask 0x7fffffffff)
  Vendor: CNSi      Model: JSS122            Rev: L425
  Type:   Direct-Access                      ANSI SCSI revision: 03
Starting timer : 0 0
blk: queue e0000000786860b0, I/O limit 524287Mb (mask 0x7fffffffff)
  Vendor: CNSi      Model: JSS122            Rev: L425
  Type:   Direct-Access                      ANSI SCSI revision: 03
Starting timer : 0 0
blk: queue e00000007869b3b0, I/O limit 524287Mb (mask 0x7fffffffff)
scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
scsi0:A:0:1: Tagged Queuing enabled.  Depth 32
scsi0:A:0:2: Tagged Queuing enabled.  Depth 32
scsi0:A:0:3: Tagged Queuing enabled.  Depth 32
Starting timer : 0 0
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi0, channel 0, id 0, lun 1
Attached scsi disk sdc at scsi0, channel 0, id 0, lun 2
Attached scsi disk sdd at scsi0, channel 0, id 0, lun 3
SCSI device sda: 70970368 512-byte hdwr sectors (36337 MB)
Partition check:
 sda: sda1 sda2 sda3 sda4
SCSI device sdb: 70970368 512-byte hdwr sectors (36337 MB)
 sdb: sdb1 sdb2 sdb3 sdb4
SCSI device sdc: 70970368 512-byte hdwr sectors (36337 MB)
 sdc: sdc1 sdc2 sdc3 sdc4
SCSI device sdd: 70970368 512-byte hdwr sectors (36337 MB)
 sdd: sdd1 sdd2 sdd3 sdd4
Journalled Block Device driver loaded
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
Freeing unused kernel memory: 368kB freed
usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
usb-uhci.c: $Revision: 1.275 $ time 15:34:36 Aug 20 2003
usb-uhci.c: High bandwidth mode enabled
PCI: Found IRQ 48 for device 00:1d.0
usb-uhci.c: USB UHCI at I/O 0xef40, IRQ 48
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 1
hub.c: USB hub found
hub.c: 2 ports detected
PCI: Found IRQ 49 for device 00:1d.1
usb-uhci.c: USB UHCI at I/O 0xef80, IRQ 49
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 2
hub.c: USB hub found
hub.c: 2 ports detected
usb-uhci.c: v1.275:USB Universal Host Controller Interface driver
usb.c: registered new driver hiddev
usb.c: registered new driver hid
hid-core.c: v1.8.1 Andreas Gal, Vojtech Pavlik <vojtech@suse.cz>
hid-core.c: USB HID support drivers
mice: PS/2 mouse device common for all mice
EXT3 FS 2.4-0.9.19, 19 August 2002 on sd(8,50), internal journal
Adding Swap: 2048960k swap-space (priority -1)
hub.c: new USB device 00:1d.1-1, assigned address 2
input0: USB HID v1.00 Keyboard [Tangtop Generic USBPS2] on usb2:2.0
input1: USB HID v1.00 Mouse [Tangtop Generic USBPS2] on usb2:2.1
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on sd(8,52), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
usb-uhci.c: ENXIO 84000280, flags 0, urb e00000007759fd80, burb e000000021a4ae80
usbdevfs: USBDEVFS_CONTROL failed dev 2 rqt 128 rq 6 len 18 ret -6
usb-uhci.c: ENXIO 84000280, flags 0, urb e000000077bbfa80, burb e000000076eb3d80
usbdevfs: USBDEVFS_CONTROL failed dev 2 rqt 128 rq 6 len 18 ret -6
Intel(R) PRO/100 Network Driver - version 2.3.13-k1-1
Copyright (c) 2003 Intel Corporation

PCI: Found IRQ 50 for device 01:02.0
divert: allocating divert_blk for eth0
e100: selftest OK.
e100: eth0: Intel(R) PRO/100 Network Connection
  Hardware receive checksums enabled
  cpu cycle saver enabled

Intel(R) PRO/1000 Network Driver - version 5.1.11-k1
Copyright (c) 1999-2003 Intel Corporation.
PCI: Found IRQ 51 for device 03:01.0
divert: allocating divert_blk for eth1
eth1: Intel(R) PRO/1000 Network Connection
PCI: Found IRQ 57 for device 09:01.0
divert: allocating divert_blk for eth2
eth2: Intel(R) PRO/1000 Network Connection
divert: freeing divert_blk for eth0
divert: freeing divert_blk for eth1
divert: freeing divert_blk for eth2
ip_tables: (C) 2000-2002 Netfilter core team
Intel(R) PRO/100 Network Driver - version 2.3.13-k1-1
Copyright (c) 2003 Intel Corporation

PCI: Found IRQ 50 for device 01:02.0
divert: allocating divert_blk for eth0
e100: selftest OK.
e100: eth0: Intel(R) PRO/100 Network Connection
  Hardware receive checksums enabled
  cpu cycle saver enabled

ip_tables: (C) 2000-2002 Netfilter core team
e100: eth0 NIC Link is Up 100 Mbps Full duplex
arping(5677): unaligned access to 0x60000fffffffbe91, ip=0xe0000000046f1e10
Intel(R) PRO/1000 Network Driver - version 5.1.11-k1
Copyright (c) 1999-2003 Intel Corporation.
PCI: Found IRQ 51 for device 03:01.0
divert: allocating divert_blk for eth1
eth1: Intel(R) PRO/1000 Network Connection
PCI: Found IRQ 57 for device 09:01.0
divert: allocating divert_blk for eth2
eth2: Intel(R) PRO/1000 Network Connection
ACPI: Power Button (FF) [PWRF]
arping(5858): unaligned access to 0x60000fffffffbe91, ip=0xe0000000046f1e10

Comment 7 Marc Varel 2003-09-29 04:00:39 EDT
We have reproduced the problem on a NovaScale 4040 with only 4 CPUs but 32 GB of
memory.
Comment 8 Marc Varel 2003-12-04 03:53:28 EST
Created attachment 96331
undo fancyswiommupatch
Comment 9 Larry Woodman 2004-02-11 11:51:35 EST
Can you get us several Alt-SysRq-M outputs when the system hangs
due to lowmem exhaustion?

Thanks, Larry Woodman
Comment 10 Larry Woodman 2004-11-29 15:42:50 EST
This problem has been fixed in RHEL3-U4.  We could not revert the
fancyiommu patch because it broke the kernel ABI.  Instead, we changed
the slab allocator so that it allocates the kernel data structures
from highmem.  This prevents exhausting lowmem on the IA64 systems
with holes in memory and large amounts of RAM.
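
The effect of that change can be illustrated with a toy model. This is only a sketch with made-up zone sizes and request counts, not the RHEL3 slab allocator; `Zone`, `allocate` and `run` are hypothetical names. It compares allocations confined to a small low zone with allocations that may also be served from the high zone:

```python
class Zone:
    """A pool of free pages; purely illustrative."""

    def __init__(self, name, pages):
        self.name = name
        self.free = pages

    def take(self, pages):
        if pages > self.free:
            return False
        self.free -= pages
        return True


def allocate(zonelist, pages):
    """Try each zone in order; return the zone name used, or None (think ENOMEM)."""
    for zone in zonelist:
        if zone.take(pages):
            return zone.name
    return None


def run(allow_highmem, requests=200, pages_per_request=1024):
    # Illustrative sizes only: a small low zone next to a much larger high zone.
    low, high = Zone("low", 100_000), Zone("high", 4_000_000)
    zonelist = [high, low] if allow_highmem else [low]
    failures = sum(
        allocate(zonelist, pages_per_request) is None for _ in range(requests)
    )
    return failures, low.free, high.free


print("lowmem only:     ", run(allow_highmem=False))
print("highmem allowed: ", run(allow_highmem=True))
```

With the low zone as the only candidate, requests start failing once it is used up even though the high zone is untouched, which mirrors the symptom in the original report. Allowing the high zone removes the failures and leaves the low zone free for allocations that genuinely need it.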

Please grab the latest RHEL3-U4 kernel and verify that this problem is
fixed.


Larry Woodman
Comment 12 Ernie Petrides 2004-11-29 17:51:00 EST
Larry's fixes (referred to in comment #10) were committed to the RHEL3 U4
patch pool on 18-Oct-2004 (in kernel version 2.4.21-22.EL).  I'm changing
the state of this to MODIFIED now.
Comment 13 John Flanagan 2004-12-20 15:54:43 EST
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-550.html
