From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003

Description of problem:
While running IBM's WebSphere Portal Server 4.1.4 and WebSphere Studio Application Developer 5.0 for Linux on a PIII 1.13 GHz IBM laptop w/ 1 GB RAM, one of the java processes will die and cause the ps command to hang. This only occurs on a heavily loaded system with lots of CPU and RAM usage.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Start WPS 4.1.4
2. Start WebSphere Studio 5
3. Let run.
4. Wait for crash.
5. Run 'ps'.. watch it hang, hard reboot.

Actual Results:
'ps' hangs.. the only way to restart the system is with a hard reboot.

Expected Results:
Java processes should die if they need to, but it shouldn't cause the ps command to hang.

Additional info:
I'm running IBM's JDK for Linux version 1.3.1

java version "1.3.1" - for WS Studio 5
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1)
Classic VM (build 1.3.1, J2RE 1.3.1 IBM build cxia32131-20020622 (JIT enabled: jitc))

java version "1.3.1" - for WPS 4.1.4
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1)
Classic VM (build 1.3.1, J2RE 1.3.1 IBM build cxia32131w-20020710 ORB130 (JIT enabled: jitc))
What kernel version are you using, EXACTLY? Also, can you get SysRq-T output during the hang?
Created attachment 89069
dmesg from kernel on hung ps

dmesg from kernel after ps command hangs...
Created attachment 89070
partial output from 'strace ps' after ps command hangs
uname -a gives:
Linux philpott.houston.ibm.com 2.4.18-18.7.xcustom #2 Mon Dec 9 14:08:47 CST 2002 i686 unknown

How do I get SysRq-T output, exactly?
The xcustom is the stock 2.4.18-18.7.x kernel with NTFS support compiled as a module.
Never mind the SysRq-T output; you got an oops, not a deadlock. Can you paste the output of lsmod?
[root@philpott ~]# lsmod
Module                  Size  Used by    Tainted: GF
i810_audio             23232   1  (autoclean)
ac97_codec             12256   0  (autoclean) [i810_audio]
soundcore               6212   2  (autoclean) [i810_audio]
parport_pc             17476   1  (autoclean)
lp                      8608   0  (autoclean)
parport                33536   1  (autoclean) [parport_pc lp]
ipsec                 252096   0  (unused)
autofs                 11140   0  (autoclean) (unused)
ds                      8416   2
yenta_socket           12000   2
pcmcia_core            49888   0  [ds yenta_socket]
eepro100               20240   1
ipchains               39272  10
ide-scsi                9344   0
scsi_mod              104400   1  [ide-scsi]
ide-cd                 30112   0
cdrom                  31936   0  [ide-cd]
ntfs                   54912   1  (autoclean)
nls_iso8859-1           3488   1  (autoclean)
nls_cp437               5120   1  (autoclean)
vfat                   11836   1  (autoclean)
fat                    36216   0  (autoclean) [vfat]
mousedev                5024   1
hid                    20608   0  (unused)
input                   5728   0  [mousedev hid]
usb-uhci               24324   0  (unused)
usbcore                71072   1  [hid usb-uhci]
ext3                   65312   1
jbd                    47796   1  [ext3]
I was only running with 128 MB of Swap space. I have 1 GB of RAM. I added a 1 GB swapfile and everything seems OK. Please close this bug.
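For anyone wanting to try the same workaround, here is a minimal sketch of adding a swap file. It is not the reporter's exact procedure, which the comment does not give. The helper names `run` and `add_swapfile`, the `DRY_RUN` switch, and the `/swapfile` path are all assumptions made for illustration; the underlying commands (`dd`, `chmod`, `mkswap`, `swapon`) are the standard ones and need root.

```shell
#!/bin/sh
# Sketch: add a swap file of a given size (must be run as root).
# Set DRY_RUN=1 to print the commands instead of executing them.

# run CMD ARGS...  (illustrative helper: execute, or just print in dry-run mode)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# add_swapfile PATH SIZE_MB  (illustrative helper name)
# Steps: create a zero-filled file, restrict its permissions,
# write a swap signature, then enable it.
add_swapfile() {
    path=$1
    size_mb=$2
    run dd if=/dev/zero of="$path" bs=1024k count="$size_mb" &&
    run chmod 600 "$path" &&
    run mkswap "$path" &&
    run swapon "$path"
}

# Example (hypothetical path, 1 GB): add_swapfile /swapfile 1024
```

Swap enabled this way does not survive a reboot unless the file is also listed in /etc/fstab. Running with DRY_RUN=1 first shows what would be executed without touching the system.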
I'm currently seeing this bug on a system running 2.4.20-20.7bigmem. I don't think it's related to the size of the swap:

penguinC:~$ free
             total       used       free     shared    buffers     cached
Mem:       2062708    1860252     202456          0       7548    1661956
-/+ buffers/cache:     190748    1871960
Swap:      2080312      30252    2050060
penguinC:~$ uname -a
Linux penguinC.corp.fiveprime.net 2.4.20-20.7bigmem #1 SMP Mon Aug 18 14:34:37 EDT 2003 i686 unknown

My strace output looks the same; it dies in the middle of a read, leaving "read(7," as the last line.
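To put a number on the swap argument in this comment, a rough sketch that computes the percentage of swap in use. It assumes /proc/meminfo exposes `SwapTotal:` and `SwapFree:` lines in kB; the function name `swap_used_pct` is made up for illustration and is not part of any tool mentioned in this report.

```shell
#!/bin/sh
# Sketch: percentage of swap in use, reading /proc/meminfo-style
# "SwapTotal:" / "SwapFree:" lines (values in kB) from stdin.
# Prints an integer, truncated toward zero; prints 0 if no swap is found.
swap_used_pct() {
    awk '
        /^SwapTotal:/ { total = $2 }
        /^SwapFree:/  { unused = $2 }
        END {
            if (total > 0) {
                printf "%d\n", (total - unused) * 100 / total
            } else {
                print 0
            }
        }'
}

# Usage on a live system: swap_used_pct < /proc/meminfo
```

Fed the figures from the free output in this comment (2080312 kB total, 2050060 kB free), it prints 1, since 30252 kB used is roughly 1.5% of the swap. That is consistent with the observation that swap exhaustion is unlikely to be the trigger on this machine.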
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/