Bug 86467 - e100/e1000 stack corruption problems
Status: CLOSED NOTABUG
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.2
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
URL: http://planetmirror.com/pub/pmstuff/e...
Reported: 2003-03-23 02:34 EST by jason andrade
Modified: 2007-04-18 12:52 EDT (History)
Last Closed: 2003-03-31 17:51:56 EST

Description jason andrade 2003-03-23 02:34:35 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.0rc2)
Gecko/20020510

Description of problem:
I have been going through today upgrading all our core servers to the
latest redhat kernel errata for 7.2 and 7.3 due to the recent security
issues and also to catch up on any other bug fixes.

I have been running a mixture of 2.4.18 kernels previously, all from
redhat's errata, e.g. 2.4.18-17.7.x, and using intel's downloadable
drivers from their web site for the e100 and e1000 NICs in the servers.

Testing of the latest 2.4.18-27.7.x on our test server showed up no
problems so i've upgraded our front and back end servers.. which has
proven to be a mistake :-(

I am now seeing interesting stack traces in syslog messages and weird
application behaviour on our front end boxes resulting in huge load
averages and system lockups over time. This has never happened with
previous kernels or with previous network drivers.  I have tried both
the built-in redhat drivers and the intel drivers with the same
results (both versions match anyway..)

This only seems to happen on our two Red Hat 7.2 systems at present and not on
the Red Hat 7.3 boxes..



Version-Release number of selected component (if applicable):
kernel-2.4.18-27.7.xcustom

How reproducible:
Always

Steps to Reproduce:
1.install 2.4.18-27.7.x on heavily loaded web/ftp server
2.wait for it to get busy
3.stack traces.. crashes and applications fall over
    

Actual Results:  nothing for about 20 minutes as network load builds up.  then i
start getting these stack traces and applications start falling over.


Expected Results:  none of the above! :-)

Additional info:

Mar 23 12:51:22 chaos kernel: do_IRQ: stack overflow: 836
Mar 23 12:51:22 chaos kernel: c023dc7d 00000344 00000001 c36b5bf0 00000202
c36b5bf0 00000246 c010d058
Mar 23 12:51:22 chaos kernel:        c36b5bf0 000006bc c3696b80 00000202
c36b5bf0 00000246 cfa14000 c36c0018
Mar 23 12:51:22 chaos kernel:        c3500018 ffffff16 c0134011 00000010
00000246 f2d74000 c36c1580 00000010
Mar 23 12:51:22 chaos kernel: Call Trace: [call_do_IRQ+5/13]  (0xf2d749f8))
Mar 23 12:51:22 chaos kernel: Call Trace: [<c010d058>]  (0xf2d749f8))
Mar 23 12:51:22 chaos kernel: [kmalloc+289/352]  (0xf2d74a24))
Mar 23 12:51:22 chaos kernel: [<c0134011>]  (0xf2d74a24))
Mar 23 12:51:22 chaos kernel: [alloc_skb+239/448]  (0xf2d74a54))
Mar 23 12:51:22 chaos kernel: [<c01e29ff>]  (0xf2d74a54))
Mar 23 12:51:22 chaos kernel:
[e1000:__insmod_e1000_O/lib/modules/2.4.18-27.7.xcustom/kernel/dri+-160365/96] 
(0xf2d74a6c))
Mar 23 12:51:22 chaos kernel: [<f899cd93>]  (0xf2d74a6c))
Mar 23 12:51:22 chaos kernel:
[e1000:__insmod_e1000_O/lib/modules/2.4.18-27.7.xcustom/kernel/dri+-161054/96] 
(0xf2d74ad8))
Mar 23 12:51:22 chaos kernel: [<f899cae2>]  (0xf2d74ad8))
Mar 23 12:51:23 chaos kernel:
[e1000:__insmod_e1000_O/lib/modules/2.4.18-27.7.xcustom/kernel/dri+-161054/96] 
(0xf2d74af0))
Mar 23 12:51:23 chaos kernel: [<f899cae2>]  (0xf2d74af0))
Mar 23 12:51:23 chaos kernel: [add_interrupt_randomness+34/48]  (0xf2d74af8))
Mar 23 12:51:23 chaos kernel: [<c01a3912>]  (0xf2d74af8))
Mar 23 12:51:23 chaos kernel: [handle_IRQ_event+118/144]  (0xf2d74b04))
Mar 23 12:51:23 chaos kernel: [<c010a646>]  (0xf2d74b04))
Mar 23 12:51:23 chaos kernel: [do_IRQ+254/272]  (0xf2d74b28))
Mar 23 12:51:23 chaos kernel: [<c010a89e>]  (0xf2d74b28))
Mar 23 12:51:24 chaos kernel: [call_do_IRQ+5/13]  (0xf2d74b40))
Mar 23 12:51:24 chaos kernel: [<c010d058>]  (0xf2d74b40))
Mar 23 12:51:24 chaos kernel: [handle_IRQ_event+94/144]  (0xf2d74b5c))
Mar 23 12:51:24 chaos kernel: [<c010a62e>]  (0xf2d74b5c))
Mar 23 12:51:24 chaos kernel:
[e1000:__insmod_e1000_O/lib/modules/2.4.18-27.7.xcustom/kernel/dri+-161082/96] 
(0xf2d74b9c))
Mar 23 12:51:24 chaos kernel: [<f899cac6>]  (0xf2d74b9c))
Mar 23 12:51:24 chaos kernel:
[e1000:__insmod_e1000_O/lib/modules/2.4.18-27.7.xcustom/kernel/dri+-161082/96] 
(0xf2d74bb0))
Mar 23 12:51:24 chaos kernel: [<f899cac6>]  (0xf2d74bb0))
Mar 23 12:51:24 chaos kernel:
[e1000:__insmod_e1000_O/lib/modules/2.4.18-27.7.xcustom/kernel/dri+-148567/96] 
(0xf2d74bc4))
Mar 23 12:51:24 chaos kernel: [<f899fba9>]  (0xf2d74bc4))
Mar 23 12:51:25 chaos kernel:
[e1000:__insmod_e1000_O/lib/modules/2.4.18-27.7.xcustom/kernel/dri+-160841/96] 
(0xf2d74be8))
Mar 23 12:51:25 chaos kernel: [<f899cbb7>]  (0xf2d74be8))
Mar 23 12:51:25 chaos kernel:
[e1000:__insmod_e1000_O/lib/modules/2.4.18-27.7.xcustom/kernel/dri+-164759/96] 
(0xf2d74c14))

The stack trace goes on for much longer.. i can provide the complete one if
needed in email..
Comment 1 jason andrade 2003-03-23 03:57:53 EST
Just so you know the customizations in the kernel are purely cosmetic to
reduce the number of modules built and suchlike.

There are no additional patches added to this kernel - it's a stock standard
one built from the redhat kernel-source.

At this stage this bug might be more informational than fixable as i either have
to back down to the previous kernel or - more likely - do a quick reinstall to
redhat 7.3 which doesn't seem to have this issue.


regards,

-jason
Comment 2 jason andrade 2003-03-24 00:17:04 EST
I am also seeing kernel panics on redhat 7.3 now :-(

I am attaching my .config file from the kernel (so you know what customizations
are in place)

I am also attaching the start of the new error


Mar 24 10:26:40 karl kernel: do_IRQ: stack overflow: 920
Mar 24 10:26:40 karl kernel: c023dc5d 00000398 00000001 c1ee8a00 c1f4499c
c2310480 c1f44980 c010d058
Mar 24 10:26:40 karl kernel:        c1ee8a00 d6ddd680 00000010 c1f4499c c2310480
c1f44980 ef9e3812 c1f40018
Mar 24 10:26:40 karl kernel:        d6dd0018 ffffff16 f894a99f 00000010 00000286
f7586000 c1f449a4 0000003e
Mar 24 10:26:40 karl kernel: Call Trace: [<c010d058>]  (0xf7586a4c))
Mar 24 10:26:40 karl kernel: [<f894a99f>]  (0xf7586a78))
Mar 24 10:26:40 karl kernel: [<c01e6a6e>]  (0xf7586aa4))
Mar 24 10:26:40 karl kernel: [<c01ee3c4>]  (0xf7586ab0))
Mar 24 10:26:40 karl kernel: [<f894a79f>]  (0xf7586ac8))
Mar 24 10:26:40 karl kernel: [<f895cb69>]  (0xf7586ad4))
Mar 24 10:26:40 karl kernel: [<c01ed80e>]  (0xf7586adc))
Mar 24 10:26:40 karl kernel: [<c01fd170>]  (0xf7586b00))
Mar 24 10:26:40 karl kernel: [<c01edb3f>]  (0xf7586b04))
Mar 24 10:26:40 karl kernel: [<c01edb76>]  (0xf7586b1c))
Mar 24 10:26:40 karl kernel: [<c01a3891>]  (0xf7586b20))
Mar 24 10:26:40 karl kernel: [<c010a62e>]  (0xf7586b44))
Mar 24 10:26:40 karl kernel: [<f894d6b7>]  (0xf7586b5c))
Mar 24 10:26:40 karl kernel: [<f894a79f>]  (0xf7586b84))



kernel config file:

#
# Automatically generated by make menuconfig: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_LOLAT=y
CONFIG_LOLAT_SYSCTL=y
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
CONFIG_MPENTIUMIII=y
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MELAN is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y 
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_MCE=y
# CONFIG_CPU_FREQ is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
CONFIG_MICROCODE=m
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
# CONFIG_E820_PROC is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
CONFIG_HIGHIO=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_SMP=y
# CONFIG_MULTIQUAD is not set
CONFIG_HAVE_DEC_LOCK=y

#
# General setup
#
CONFIG_HZ=100
CONFIG_NET=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_HOTPLUG is not set
# CONFIG_PCMCIA is not set
# CONFIG_HOTPLUG_PCI is not set
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
# CONFIG_KCORE_AOUT is not set
CONFIG_BINFMT_AOUT=m
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=m
# CONFIG_IKCONFIG is not set
# CONFIG_PM is not set

#
# Additional device driver support
#
# CONFIG_CIPE is not set
# CONFIG_CRYPTO_AEP is not set
# CONFIG_MEGARAC is not set
# CONFIG_FC_QLA2200 is not set
# CONFIG_FC_QLA2300 is not set
# CONFIG_SCSI_ISCSI is not set
# CONFIG_ACPI is not set
# CONFIG_APM is not set

#
# Binary emulation of other systems
#
# CONFIG_ABI is not set
# CONFIG_ABI_SVR4 is not set
# CONFIG_BINFMT_COFF is not set
# CONFIG_BINFMT_XOUT is not set
# CONFIG_BINFMT_XOUT_X286 is not set

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
# CONFIG_PARPORT is not set

#
# Plug and Play configuration
#
# CONFIG_PNP is not set
# CONFIG_ISAPNP is not set
# CONFIG_PNPBIOS is not set

#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_CISS_SCSI_TAPE is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_BLK_DEV_INITRD=y

#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set
# CONFIG_BLK_DEV_MD is not set
# CONFIG_MD_LINEAR is not set
# CONFIG_MD_RAID0 is not set
# CONFIG_MD_RAID1 is not set
# CONFIG_MD_RAID5 is not set
# CONFIG_MD_MULTIPATH is not set
# CONFIG_BLK_DEV_LVM is not set

#
# Cryptography support (CryptoAPI)
#
# CONFIG_CRYPTO is not set
# CONFIG_CIPHERS is not set
# CONFIG_CRYPTODEV is not set
# CONFIG_CRYPTOLOOP is not set

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_NETLINK_DEV=y
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_FILTER=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_TUX=m
# CONFIG_TUX_EXTCGI is not set
# CONFIG_TUX_EXTENDED_LOG is not set
# CONFIG_TUX_DEBUG is not set
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
# CONFIG_NET_IPGRE is not set
# CONFIG_ARPD is not set
# CONFIG_INET_ECN is not set
# CONFIG_SYN_COOKIES is not set

#
#   IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_FTP=m
CONFIG_IP_NF_IRC=m
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
# CONFIG_IP_NF_MATCH_AH_ESP is not set
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_UNCLEAN=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_MIRROR=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
# CONFIG_IP_NF_NAT_LOCAL is not set
CONFIG_IP_NF_NAT_SNMP_BASIC=m
CONFIG_IP_NF_NAT_IRC=m
CONFIG_IP_NF_NAT_FTP=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_LOG=m
# CONFIG_IP_NF_TARGET_ULOG is not set
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_COMPAT_IPCHAINS=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_COMPAT_IPFWADM=m
CONFIG_IP_NF_NAT_NEEDED=y

#
#   IP: Virtual Server Configuration
#
CONFIG_IP_VS=m
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=16
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
CONFIG_IP_VS_FTP=m
CONFIG_IPV6=m

#
#   IPv6: Netfilter Configuration
#
# CONFIG_IP6_NF_QUEUE is not set
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_LIMIT=m
CONFIG_IP6_NF_MATCH_MAC=m
CONFIG_IP6_NF_MATCH_MULTIPORT=m
CONFIG_IP6_NF_MATCH_OWNER=m
CONFIG_IP6_NF_MATCH_MARK=m
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_LOG=m
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_TARGET_MARK=m
# CONFIG_KHTTPD is not set
# CONFIG_ATM is not set
CONFIG_VLAN_8021Q=m
# CONFIG_IPX is not set
# CONFIG_ATALK is not set

#
# Appletalk devices
#
# CONFIG_DEV_APPLETALK is not set
# CONFIG_DECNET is not set
# CONFIG_BRIDGE is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_LLC is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_FASTROUTE is not set
# CONFIG_NET_HW_FLOWCONTROL is not set

#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CBQ=m
# CONFIG_NET_SCH_HTB is not set
CONFIG_NET_SCH_CSZ=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_POLICE=y

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set

#
# Telephony Support
#
# CONFIG_PHONE is not set
# CONFIG_PHONE_IXJ is not set
# CONFIG_PHONE_IXJ_PCMCIA is not set

#
# ATA/IDE/MFM/RLL support
#
CONFIG_IDE=y

#
# IDE, ATA and ATAPI Block devices
#
CONFIG_BLK_DEV_IDE=y
# CONFIG_BLK_DEV_HD_IDE is not set
# CONFIG_BLK_DEV_HD is not set
# CONFIG_BLK_DEV_IDEDISK is not set
# CONFIG_IDEDISK_MULTI_MODE is not set
# CONFIG_IDEDISK_STROKE is not set
# CONFIG_BLK_DEV_IDEDISK_VENDOR is not set
# CONFIG_BLK_DEV_IDEDISK_FUJITSU is not set
# CONFIG_BLK_DEV_IDEDISK_IBM is not set
# CONFIG_BLK_DEV_IDEDISK_MAXTOR is not set
# CONFIG_BLK_DEV_IDEDISK_QUANTUM is not set
# CONFIG_BLK_DEV_IDEDISK_SEAGATE is not set
# CONFIG_BLK_DEV_IDEDISK_WD is not set
# CONFIG_BLK_DEV_COMMERIAL is not set
# CONFIG_BLK_DEV_TIVO is not set
# CONFIG_BLK_DEV_IDECS is not set
CONFIG_BLK_DEV_IDECD=m
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
# CONFIG_BLK_DEV_ISAPNP is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_OFFBOARD is not set
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
# CONFIG_IDEDMA_PCI_AUTO is not set
# CONFIG_IDEDMA_ONLYDISK is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_PCI_WIP is not set
# CONFIG_BLK_DEV_IDEDMA_TIMEOUT is not set
# CONFIG_IDEDMA_NEW_DRIVE_LISTINGS is not set
CONFIG_BLK_DEV_ADMA=y
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_AEC62XX_TUNING is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_WDC_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_AMD74XX_OVERRIDE is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_CMD680 is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_HPT34X_AUTODMA is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_PIIX is not set
# CONFIG_PIIX_TUNING is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_ADMA100 is not set
# CONFIG_BLK_DEV_PDC202XX is not set
# CONFIG_PDC202XX_BURST is not set
# CONFIG_PDC202XX_FORCE is not set
CONFIG_BLK_DEV_SVWKS=y
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_BLK_DEV_CENATEK is not set
# CONFIG_IDE_CHIPSETS is not set
# CONFIG_BLK_DEV_ELEVATOR_NOOP is not set
# CONFIG_IDEDMA_AUTO is not set
# CONFIG_IDEDMA_IVB is not set
# CONFIG_DMA_NONPCI is not set
CONFIG_BLK_DEV_IDE_MODES=y
# CONFIG_BLK_DEV_ATARAID is not set
# CONFIG_BLK_DEV_ATARAID_PDC is not set
# CONFIG_BLK_DEV_ATARAID_HPT is not set

#
# SCSI support
#
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_SD_EXTRA_DEVS=40
CONFIG_CHR_DEV_ST=m
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=m
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_SR_EXTRA_DEVS=2
# CONFIG_CHR_DEV_SG is not set
# CONFIG_SCSI_DEBUG_QUEUES is not set
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y

#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AHA1740 is not set
CONFIG_SCSI_AACRAID=m
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=253
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
# CONFIG_AIC7XXX_PROBE_EISA_VL is not set
# CONFIG_AIC7XXX_BUILD_FIRMWARE is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_MEGARAID is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_CPQFCTS is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_DMA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_NCR53C7xx is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_NCR53C8XX is not set
# CONFIG_SCSI_SYM53C8XX is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PCI2000 is not set
# CONFIG_SCSI_PCI2220I is not set
# CONFIG_SCSI_PSI240I is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_NEWISP is not set
# CONFIG_SCSI_SEAGATE is not set
# CONFIG_SCSI_SIM710 is not set
# CONFIG_SCSI_SYM53C416 is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set
# CONFIG_SCSI_DEBUG is not set

#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
# CONFIG_FUSION_BOOT is not set
# CONFIG_FUSION_ISENSE is not set
# CONFIG_FUSION_CTL is not set
# CONFIG_FUSION_LAN is not set

#
# IEEE 1394 (FireWire) support (EXPERIMENTAL)
#
# CONFIG_IEEE1394 is not set

#
# I2O device support
#
# CONFIG_I2O is not set
# CONFIG_I2O_PCI is not set
# CONFIG_I2O_BLOCK is not set
# CONFIG_I2O_LAN is not set
# CONFIG_I2O_SCSI is not set
# CONFIG_I2O_PROC is not set

#
# Network device support
#
CONFIG_NETDEVICES=y

#
# ARCnet devices
#
# CONFIG_ARCNET is not set
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_ETHERTAP is not set

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
# CONFIG_SUNLANCE is not set
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNBMAC is not set
# CONFIG_SUNQE is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_LANCE is not set
# CONFIG_NET_VENDOR_SMC is not set
# CONFIG_NET_VENDOR_RACAL is not set
# CONFIG_AT1700 is not set
# CONFIG_DEPCA is not set
# CONFIG_HP100 is not set
# CONFIG_NET_ISA is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_AC3200 is not set
# CONFIG_APRICOT is not set
# CONFIG_CS89x0 is not set
# CONFIG_TULIP is not set
# CONFIG_TC35815 is not set
# CONFIG_DE4X5 is not set
# CONFIG_DGRS is not set
# CONFIG_DM9102 is not set
CONFIG_EEPRO100=m
CONFIG_NET_E100=m
# CONFIG_LNE390 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_NE3210 is not set
# CONFIG_ES3210 is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_8139TOO_PIO is not set
# CONFIG_8139TOO_TUNE_TWISTER is not set
# CONFIG_8139TOO_8129 is not set
# CONFIG_8139_NEW_RX_RESET is not set
# CONFIG_SIS900 is not set
# CONFIG_SIS900_OLD is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_VIA_RHINE_MMIO is not set
# CONFIG_WINBOND_840 is not set
# CONFIG_NET_POCKET is not set

#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_MYRI_SBUS is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_SK98LIN is not set
# CONFIG_TIGON3 is not set
CONFIG_NET_E1000=m
# CONFIG_FDDI is not set
CONFIG_NETCONSOLE=m
# CONFIG_HIPPI is not set
# CONFIG_PLIP is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set

#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set

#
# Token Ring devices
#
# CONFIG_TR is not set
# CONFIG_NET_FC is not set
# CONFIG_RCPCI is not set
CONFIG_SHAPER=m

#
# Wan interfaces
#
# CONFIG_WAN is not set

#
# Amateur Radio support
#
# CONFIG_HAMRADIO is not set

#
# IrDA (infrared) support
#
# CONFIG_IRDA is not set

#
# ISDN subsystem
#
# CONFIG_ISDN is not set
# CONFIG_KALLSYMS is not set

#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set

#
# Input core support
#
# CONFIG_INPUT is not set
# CONFIG_INPUT_KEYBDEV is not set
# CONFIG_INPUT_MOUSEDEV is not set
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_EVDEV is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_ECC=m
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=y
CONFIG_SERIAL_CONSOLE=y
# CONFIG_SERIAL_EXTENDED is not set
# CONFIG_SERIAL_NONSTANDARD is not set
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256

#
# I2C support
#
# CONFIG_I2C is not set

#
# Mice
#
# CONFIG_BUSMOUSE is not set
# CONFIG_MOUSE is not set

#
# Joysticks
#
# CONFIG_INPUT_GAMEPORT is not set
# CONFIG_QIC02_TAPE is not set

#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_AMD_RNG is not set
# CONFIG_INTEL_RNG is not set
# CONFIG_AMD_PM768 is not set
CONFIG_NVRAM=m
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set

#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
# CONFIG_AGP is not set
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
# CONFIG_BATTERY_GERICOM is not set

#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set

#
# Crypto Hardware support
#
# CONFIG_CRYPTO is not set

#
# File systems
#
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set
# CONFIG_REISERFS_FS is not set
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
# CONFIG_ADFS_FS is not set
# CONFIG_AFS_FS is not set
# CONFIG_ADFS_FS_RW is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BEFS_DEBUG is not set
# CONFIG_BFS_FS is not set
CONFIG_EXT3_FS=y
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_UMSDOS_FS=m
CONFIG_VFAT_FS=m
# CONFIG_EFS_FS is not set
# CONFIG_JFFS_FS is not set
# CONFIG_JFFS2_FS is not set
# CONFIG_CRAMFS is not set
CONFIG_TMPFS=y
CONFIG_RAMFS=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
# CONFIG_JFS_FS is not set
# CONFIG_JFS_DEBUG is not set
# CONFIG_JFS_STATISTICS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_NTFS_FS is not set
# CONFIG_NTFS_RW is not set
# CONFIG_HPFS_FS is not set
CONFIG_PROC_FS=y
# CONFIG_DEVFS_FS is not set
# CONFIG_DEVFS_MOUNT is not set
# CONFIG_DEVFS_DEBUG is not set
CONFIG_DEVPTS_FS=y
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX4FS_RW is not set
# CONFIG_ROMFS_FS is not set
CONFIG_EXT2_FS=y
# CONFIG_SYSV_FS is not set
# CONFIG_UDF_FS is not set
# CONFIG_UDF_RW is not set
# CONFIG_UFS_FS is not set
# CONFIG_UFS_FS_WRITE is not set

#
# Network File Systems
#
# CONFIG_CODA_FS is not set
# CONFIG_INTERMEZZO_FS is not set
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_ROOT_NFS is not set
# CONFIG_NFSD is not set
# CONFIG_NFSD_V3 is not set
# CONFIG_NFSD_TCP is not set
CONFIG_SUNRPC=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
# CONFIG_SMB_FS is not set
# CONFIG_NCP_FS is not set
# CONFIG_NCPFS_PACKET_SIGNING is not set
# CONFIG_NCPFS_IOCTL_LOCKING is not set
# CONFIG_NCPFS_STRONG is not set
# CONFIG_NCPFS_NFS_NS is not set
# CONFIG_NCPFS_OS2_NS is not set
# CONFIG_NCPFS_SMALLDOS is not set
# CONFIG_NCPFS_NLS is not set
# CONFIG_NCPFS_EXTRAS is not set
CONFIG_ZISOFS_FS=y

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
# CONFIG_SMB_NLS is not set
CONFIG_NLS=y

#
# Native Language Support
#
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=m
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
CONFIG_NLS_ISO8859_1=m
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set

#
# Console drivers
#
CONFIG_VGA_CONSOLE=y
CONFIG_VIDEO_SELECT=y
# CONFIG_VIDEO_IGNORE_BAD_MODE is not set
# CONFIG_MDA_CONSOLE is not set

#
# Frame-buffer support
#
# CONFIG_FB is not set
# CONFIG_SPEAKUP is not set

#
# Sound
#
# CONFIG_SOUND is not set

#
# USB support
#
# CONFIG_USB is not set

#
# Bluetooth support
#
# CONFIG_BLUEZ is not set

#
# Kernel hacking
#
# CONFIG_DEBUG_KERNEL is not set

#

#
# Library routines
#
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
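
Since the kernel in question is a custom build, it can help to compare this .config against the stock one option by option. The following is only a rough sketch of how that comparison might be done; the function names (parse_config, diff_configs) and the tiny sample strings are made up for illustration and are not part of any Red Hat tooling. It also reports options that appear more than once, as CONFIG_IP_NF_NAT_NEEDED does in the listing above.

```python
import re
from collections import Counter

# Lines look like "CONFIG_FOO=y" or "# CONFIG_FOO is not set".
SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return (options, duplicates) for the text of a kernel .config.

    options maps each option name to its value ("n" for "is not set");
    duplicates lists option names that occur more than once.
    """
    options = {}
    seen = Counter()
    for raw in text.splitlines():
        line = raw.strip()
        match = SET_RE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            seen[match.group(1)] += 1
            continue
        match = UNSET_RE.match(line)
        if match:
            options[match.group(1)] = "n"
            seen[match.group(1)] += 1
    duplicates = sorted(name for name, count in seen.items() if count > 1)
    return options, duplicates


def diff_configs(custom, stock):
    """Return {name: (custom_value, stock_value)} for options that differ."""
    names = set(custom) | set(stock)
    return {
        name: (custom.get(name, "n"), stock.get(name, "n"))
        for name in sorted(names)
        if custom.get(name, "n") != stock.get(name, "n")
    }


# Hypothetical miniature configs, just to show the shape of the output.
custom_text = """\
CONFIG_SMP=y
CONFIG_NET_E1000=m
# CONFIG_HOTPLUG is not set
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_NAT_NEEDED=y
"""
stock_text = """\
CONFIG_SMP=y
CONFIG_NET_E1000=m
CONFIG_HOTPLUG=y
CONFIG_IP_NF_NAT_NEEDED=y
"""

custom_opts, custom_dups = parse_config(custom_text)
stock_opts, _ = parse_config(stock_text)
print("duplicates:", custom_dups)
print("differences:", diff_configs(custom_opts, stock_opts))
```

In practice the two strings would be read from the custom and the stock .config files; the sample data here is not taken from a real stock configuration.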
Comment 3 jason andrade 2003-03-24 03:26:39 EST
some additional information:

[root@chaos /]# cat /proc/interrupts
           CPU0       CPU1
  0:    1263581    1264396    IO-APIC-edge  timer
  1:          0          2    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  8:          1          0    IO-APIC-edge  rtc
 14:          0          2    IO-APIC-edge  ide0
 16:   37812315   37823907   IO-APIC-level  eth0
 22:   76062153   76054510   IO-APIC-level  eth1
 30:          7          9   IO-APIC-level  aic7xxx
 31:     114746     115501   IO-APIC-level  aacraid
NMI:          0          0
LOC:    2527867    2527865
ERR:          0
MIS:          0

[root@chaos /]# uptime
  6:25pm  up  7:06,  2 users,  load average: 2.73, 2.75, 4.16
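
To put the counters above in perspective, the per-CPU columns can be summed and divided by the uptime to get an average interrupt rate per device. This is a minimal sketch, assuming the two-CPU /proc/interrupts layout shown above and reading "up 7:06" as 7 hours 6 minutes; parse_interrupts is a name made up for this example.

```python
SAMPLE = """\
           CPU0       CPU1
  0:    1263581    1264396    IO-APIC-edge  timer
 16:   37812315   37823907   IO-APIC-level  eth0
 22:   76062153   76054510   IO-APIC-level  eth1
 31:     114746     115501   IO-APIC-level  aacraid
NMI:          0          0
"""


def parse_interrupts(text):
    """Map each device (or bare label such as NMI) to its total count."""
    totals = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # header row with the CPU names
        label, rest = line.split(":", 1)
        tokens = rest.split()
        counts = []
        for token in tokens:
            if not token.isdigit():
                break
            counts.append(int(token))
        trailing = tokens[len(counts):]
        # The last non-numeric token is the device name, if there is one.
        name = trailing[-1] if trailing else label.strip()
        totals[name] = sum(counts)
    return totals


totals = parse_interrupts(SAMPLE)
uptime_seconds = 7 * 3600 + 6 * 60  # "up 7:06" read as hours:minutes

for device in ("eth0", "eth1"):
    rate = totals[device] / uptime_seconds
    print(f"{device}: {totals[device]} interrupts, about {rate:.0f} per second")
```

An average over the whole uptime hides bursts, so the peak rate under load is likely higher than what this prints.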
Comment 4 Arjan van de Ven 2003-03-24 04:21:29 EST
just to make sure, the later stuff where you see the stack overflows is with the
e100/e1000 WE ship, not intels drivers?
Comment 5 jason andrade 2003-03-24 04:34:04 EST
yes, i am using redhat's drivers.

Mar 24 10:44:04 chaos kernel: Intel(R) PRO/100 Network Driver - version 2.1.29-k3
Mar 24 10:44:04 chaos kernel: Copyright (c) 2002 Intel Corporation
Mar 24 10:44:04 chaos kernel: 
Mar 24 10:44:04 chaos kernel: e100: eth0: Intel(R) 8255x Based Network Connection
Mar 24 10:44:04 chaos kernel:   Hardware receive checksums enabled
Mar 24 10:44:04 chaos kernel:   cpu cycle saver enabled
Mar 24 10:44:04 chaos kernel: 
Mar 24 10:44:04 chaos kernel: e100: eth0 NIC Link is Up 100 Mbps Full duplex
Mar 24 10:44:04 chaos kernel: Intel(R) PRO/1000 Network Driver - version 4.4.19-k2
Mar 24 10:44:05 chaos kernel: Copyright (c) 1999-2002 Intel Corporation.
Mar 24 10:44:05 chaos kernel: eth1: Intel(R) PRO/1000 Network Connection
Mar 24 10:44:05 chaos kernel: e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex
Comment 6 Arjan van de Ven 2003-03-24 04:38:20 EST
Jeff: your ballgame
Comment 7 jason andrade 2003-03-24 05:45:06 EST
please let me know if you need any more information.  i can also provide remote
ssh to the box if that would help speed up any debug process.

the boxes in question are the main ftp/http servers for downloads and i'd
really like to get this sorted out before the next redhat release happens 
(hint hint :-)

some additional information - we have applied this kernel change (we tested
it on a dev box) across all our servers and to date we are only seeing this
behaviour on our two main ftp/http public servers.  i don't see this on our
main NFS server or some of our other download (boa/web) servers..  i'm not
complaining they are still behaving though..

at this point my choices (if i can't get a solution/fix) appear to be either
upgrading to a redhat rawhide (2.4.20 kernel) to see if the problem continues
there or to downgrade back to the last working kernel version for us in this
config which was 2.4.18-18.7.x.

a quick look in messages shows this (is this related to a process
or is the number something else?)

Mar 24 17:04:48 chaos kernel: do_IRQ: stack overflow: 780
Mar 24 17:05:20 chaos kernel: do_IRQ: stack overflow: 860
Mar 24 17:05:30 chaos kernel: do_IRQ: stack overflow: 968
Mar 24 17:05:39 chaos kernel: do_IRQ: stack overflow: 968
Mar 24 17:05:50 chaos kernel: do_IRQ: stack overflow: 708
Mar 24 17:06:00 chaos kernel: do_IRQ: stack overflow: 708
Mar 24 17:13:32 chaos kernel: do_IRQ: stack overflow: 904
Mar 24 17:23:18 chaos kernel: do_IRQ: stack overflow: 844
Mar 24 17:23:24 chaos kernel: do_IRQ: stack overflow: 844
Mar 24 17:23:32 chaos kernel: do_IRQ: stack overflow: 736
Mar 24 17:23:40 chaos kernel: do_IRQ: stack overflow: 844
Mar 24 17:23:49 chaos kernel: do_IRQ: stack overflow: 844
Mar 24 17:23:58 chaos kernel: do_IRQ: stack overflow: 736
Mar 24 17:24:05 chaos kernel: do_IRQ: stack overflow: 996
Mar 24 17:44:29 chaos kernel: do_IRQ: stack overflow: 940
Mar 24 17:44:39 chaos kernel: do_IRQ: stack overflow: 1020
Mar 24 17:44:41 chaos kernel: do_IRQ: stack overflow: 1020
Mar 24 17:44:44 chaos kernel: do_IRQ: stack overflow: 736
Mar 24 17:44:53 chaos kernel: do_IRQ: stack overflow: 736
Mar 24 17:45:03 chaos kernel: do_IRQ: stack overflow: 736
Mar 24 17:45:12 chaos kernel: do_IRQ: stack overflow: 736
Mar 24 17:45:21 chaos kernel: do_IRQ: stack overflow: 736
Mar 24 17:45:29 chaos kernel: do_IRQ: stack overflow: 944
Mar 24 17:45:36 chaos kernel: do_IRQ: stack overflow: 944
Mar 24 17:45:44 chaos kernel: do_IRQ: stack overflow: 944
Mar 24 17:45:51 chaos kernel: do_IRQ: stack overflow: 944
Mar 24 17:45:59 chaos kernel: do_IRQ: stack overflow: 944
Mar 24 18:55:38 chaos kernel: do_IRQ: stack overflow: 1000
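
On the question of what the number means: as far as I can tell from the 2.4 i386 interrupt code, it is not a process ID. The stack overflow debugging check in do_IRQ prints it when fewer than 1024 bytes of kernel stack remain, and the value is the number of bytes still free, which would fit every value above being below 1024. Treat that reading as an assumption to verify against the kernel source for this exact build. Under that assumption, a small sketch like the following (summarize_overflows is a made-up name) can pull the values out of a messages file and report how low the free stack got:

```python
import re

# Matches the kernel message and captures the reported number.
OVERFLOW_RE = re.compile(r"do_IRQ: stack overflow: (\d+)")


def summarize_overflows(lines):
    """Return count, min and max of the reported values, or None if absent."""
    values = []
    for line in lines:
        match = OVERFLOW_RE.search(line)
        if match:
            values.append(int(match.group(1)))
    if not values:
        return None
    return {"count": len(values), "min": min(values), "max": max(values)}


# A few of the lines from the log above.
sample_lines = [
    "Mar 24 17:04:48 chaos kernel: do_IRQ: stack overflow: 780",
    "Mar 24 17:05:50 chaos kernel: do_IRQ: stack overflow: 708",
    "Mar 24 17:44:39 chaos kernel: do_IRQ: stack overflow: 1020",
    "Mar 24 18:55:38 chaos kernel: do_IRQ: stack overflow: 1000",
]

summary = summarize_overflows(sample_lines)
print(summary)
```

To run it over a real log, pass an open file object instead of the sample list, for example the system messages file.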
Comment 8 jason andrade 2003-03-25 00:38:01 EST
i have had to revert back to 2.4.18-19.7.x on both our servers at this point 
as they were crashing every 2-3 hours.  (they have both been rebuilt as
redhat 7.3 boxes now but the problem still persisted)

i am still running 2.4.18-27.7.x on all our other servers.

regards,

-jason
Comment 10 jason andrade 2003-03-25 02:22:55 EST
i am still seeing kernel stack traces being dropped into syslog/messages using
a 2.4.18-19.7.x kernel.  i am going to try to go back to 2.4.18-18.7.x to see if
that fixes it (it was what it was running before all the upgrades...)

regards,

-jason
Comment 11 jason andrade 2003-03-26 16:17:52 EST
i have tried a number of things with no luck so far.

o i have tried compiling the e100/e1000 modules into the kernel.
   - still get stack traces

o i have tried using the eepro100 module instead of e100
  - still get stack traces

i've just had both boxes fall over.  one has rebooted and the other is hung.
this problem is getting quite bad now :-(

regards,

-jason
 

Comment 12 jason andrade 2003-03-27 00:13:19 EST
I have added a 3c905B ethernet card and am using the 3c59x driver and
i am still seeing these problems.

so i suspect both the e100 and e1000 drivers are buggy.  

i'd appreciate some further guidance on anything that might help to stop
these stack traces and/or these machines crashing..  would going back
to redhat 7.2 be an option now ?

regards,

-jason
Comment 13 jason andrade 2003-03-27 23:01:12 EST
i added a second 3c905 card and disabled using the e100 and e1000 
and am still getting these kernel traces and machine lockups.

any other ideas on what might be causing this and/or can i provide any 
further debug information - does the stack trace say much about where
this problem might lie ?  i can run the messages file through ksymoops
if that will help..

regards,

-jason
Comment 14 jason andrade 2003-03-31 17:51:56 EST
this is not a bug with the ethernet driver stack at all.  i believe i have
narrowed this down to it being a bug with the compiler or some other
part of the os (glibc?) which causes custom kernels to induce stack
traces.

i will submit a separate bug report  if anyone wants to look at that.

it appears this bug does not appear unless a system is at very high 
network load.. so it probably will not affect most people.
