From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.22; Mac_PowerPC)

Description of problem:
Installed a Cenetek Rocket Drive, and enabled ext3 on it. kjournald is now using between 50-80% cpu as reported by top.

System Info:
------------
Intel SE7501BR2 motherboard
Dual Xeon 2.6Ghz
2GB Memory
2 x 3ware 7500 Escalade IDE RAID Controllers
Cenetek Rocket Drive 4GB

Before the installation of the Cenetek, kjournald looked like this:

root        21  0.0  0.0     0     0 ?        SW   Jun24   0:08 [kjournald]
root       148  0.0  0.0     0     0 ?        SW   Jun24   0:00 [kjournald]
root       149  0.0  0.0     0     0 ?        SW   Jun24   0:04 [kjournald]
root       150  0.0  0.0     0     0 ?        SW   Jun24   0:08 [kjournald]
root       151  0.0  0.0     0     0 ?        SW   Jun24   0:23 [kjournald]
root       152  0.5  0.0     0     0 ?        DW   Jun24   2:36 [kjournald]

After the installation of the Cenetek Rocket Drive, kjournald now looks like this:

root        21  0.1  0.0     0     0 ?        SW   17:51   0:02 [kjournald]
root       148  0.0  0.0     0     0 ?        SW   17:51   0:00 [kjournald]
root       149  0.0  0.0     0     0 ?        SW   17:51   0:01 [kjournald]
root       150  0.4  0.0     0     0 ?        SW   17:51   0:09 [kjournald]
root       151  1.1  0.0     0     0 ?        SW   17:51   0:22 [kjournald]
root       152 60.1  0.0     0     0 ?        DW   17:51  20:04 [kjournald]

This machine is a mailserver.
Here is some df and iostat info:

[root@mercury tmp]# df
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/sda2              5036316   3909992    870492  82% /
/dev/sda1                46636     14371     29857  33% /boot
none                   1032412         0   1032412   0% /dev/shm
/dev/sda5              5044156   1447792   3340132  31% /var/log/qmail
/dev/sdb1             76928448   4283476  68737164   6% /var/spool/spam
/dev/sdc1            114247496  42658632  65785388  40% /home/cust
/dev/hde1              4062768    124352   3728708   4% /var/qmail/queue
sol:/home/ftp/pub/mirrors/updates.redhat.com/7.3
                       5044156   4186104    601820  88% /usr/src/redhat/updates.redhat.com

[root@mercury tmp]# iostat -x
Linux 2.4.20-18.7smp (mercury.shreve.net)       06/25/2003

avg-cpu:  %user   %nice    %sys   %idle
          13.46    0.00   64.63   21.91

Device:     rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz    await   svctm   %util
/dev/sda     23.03   21.89    3.17    3.91   209.16   206.51   104.58   103.26    58.66     1.38   194.25   43.36    3.07
/dev/sda1     0.02    0.00    0.01    0.00     0.06     0.01     0.03     0.01     5.03     0.01   497.30  494.59    0.07
/dev/sda2    22.66   18.91    3.12    3.29   206.25   177.72   103.13    88.86    59.87     1.24   192.79   47.07    3.02
/dev/sda3     0.00    0.00    0.00    0.00     0.01     0.00     0.00     0.00     8.00     0.00   100.00  100.00    0.00
/dev/sda5     0.35    2.98    0.04    0.62     2.83    28.78     1.41    14.39    48.20     0.13   202.02  152.54    1.00
/dev/sdb      0.17   13.05    4.98    6.32    40.83   155.02    20.41    77.51    17.33    15.64  1384.31   55.21    6.24
/dev/sdb1     0.17   13.05    4.98    6.32    40.83   155.02    20.41    77.51    17.34    15.64  1384.36   55.21    6.24
/dev/sdc     10.60   65.25   28.84   58.14   315.07   987.38   157.54   493.69    14.97    20.95   240.87   35.40   30.79
/dev/sdc1    10.59   65.25   28.84   58.14   315.04   987.38   157.52   493.69    14.97    20.95   240.88   35.39   30.79
/dev/sdc2     0.01    0.00    0.00    0.00     0.03     0.00     0.02     0.00     8.00     0.00    80.00   80.00    0.00
/dev/hde      1.34  551.53    6.20  235.76    60.18  6299.56    30.09  3149.78    26.28   167.91    24.34   23.19   56.12
/dev/hde1     1.34  551.53    6.20  235.76    60.18  6299.56    30.09  3149.78    26.28     5.88    24.34   20.39   49.34

I am willing to do whatever you all need from me to help diagnose. If you need me to run profile=2, I can do that.
System load is impacted. Load is:

 6:34pm  up 42 min,  3 users,  load average: 15.28, 14.46, 13.11

It will go as high as 20-25. There is plenty of memory (2GB) available, and really we are overkill in the memory and CPU department. The mail queue (qmail system) is very thrashy, so we moved it off the IDE RAID onto a Rocket Drive (RAM drive) to lower the transactions per second on that drive array.

Perhaps this is a bug with Cenetek, or I need to use a different ext3 journaling mode? Or perhaps ext3 is a bad idea on a thrashy mail queue?

My last attempt at putting this RAM drive in resulted in an NMI error that left the box hosed. e2fsck could not recover; what it did recover was a bunch of trashed data and unlinked inodes in lost+found. I attributed it to bad karma, since a lot of other things "changed" that day during the scheduled downtime. I am now wondering if maybe kjournald caused the NMI, and was just buried so hard it corrupted the journal. Maybe that's far-fetched. I appreciate any help.

Version-Release number of selected component (if applicable):
kernel-2.4.20-18.7

How reproducible:
Always

Steps to Reproduce:
1. Install Cenetek Rocket Drive
2. Enable ext3 on Rocket Drive
3. Thrash the drive with lots of reads/writes (bonnie maybe), and watch kjournald go through the roof.

Actual Results:
kjournald is in a state of high CPU utilization.

Expected Results:
kjournald should get a little busier with an additional journal, but not to the extent of being 100x more CPU intensive.

Additional info:
The machine is running qmail, and is a mailserver. It uses courier-imap for imapd and pop3d. It also runs maildrop as an LDA, apache for webmail via squirrelmail, and a few other mail-related things such as mailman.

The box has 14 drives laid out as follows:

/dev/hda  RAID1   1 set of 2 drives
/dev/hdb  RAID10  2 sets of 2 drives
/dev/hdc  RAID10  3 sets of 2 drives

plus 2 hot spares. This is all on the 3ware Escalade 7500 series controllers.
All drives are 7200 RPM matched drives.
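
As a rough check on the "100x" figure under Expected Results, the two ps rows for PID 152 can be compared directly. This is only a minimal sketch: it assumes the listings are in `ps aux` column order (so %CPU is the third field), and the function names are made up for illustration. ps reports %CPU averaged over the lifetime of the process, so the result is indicative rather than an instantaneous measurement.

```python
# Rows for PID 152 copied from the report: before and after the Rocket Drive.
BEFORE = "root 152 0.5 0.0 0 0 ? DW Jun24 2:36 [kjournald]"
AFTER = "root 152 60.1 0.0 0 0 ? DW 17:51 20:04 [kjournald]"


def cpu_percent(ps_row: str) -> float:
    """Return the %CPU field, assuming `ps aux` column order (third column)."""
    return float(ps_row.split()[2])


def cpu_ratio(before_row: str, after_row: str) -> float:
    """How many times more CPU the 'after' row shows than the 'before' row."""
    return cpu_percent(after_row) / cpu_percent(before_row)


if __name__ == "__main__":
    print(f"{cpu_ratio(BEFORE, AFTER):.0f}x")
```

Run as a script, this prints 120x for the rows above, which is in line with the reporter's estimate.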
An NMI comes from hardware, so that may well indicate the card wasn't inserted properly, or that you inserted it while the machine was soft off rather than with power removed, etc. Probably irrelevant in itself. As to the drive itself, Cenetek AFAIK doesn't have Linux DMA driver support, so the transfers would be PIO and thus very slow. Does hdparm think the device is in DMA mode?
Cenatek supports DMA, but it might not be enabled by default; hdparm -d1 will fix that.
Ahah, OK. In that case, can you attach a dmesg of the boot-up, and I'll tweak the generic IDE driver for Cenatek to turn on DMA (if the hdparm -d1 works).
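
The DMA check asked for in this thread could be scripted roughly as follows. This is a sketch under stated assumptions, not something taken from the report: the device name /dev/hde is inferred from the df output (the 4GB /var/qmail/queue filesystem is on /dev/hde1), the helper names are hypothetical, and the parser assumes hdparm prints a line of the form `using_dma = N` for the `-d` query. hdparm normally needs root.

```python
import re
import subprocess


def parse_using_dma(hdparm_output: str):
    """Return True/False from a 'using_dma = N' line, or None if no such line."""
    match = re.search(r"using_dma\s*=\s*(\d)", hdparm_output)
    return None if match is None else match.group(1) == "1"


def dma_enabled(device: str):
    """Run `hdparm -d DEVICE` and report the DMA flag (None if unknown)."""
    try:
        result = subprocess.run(
            ["hdparm", "-d", device],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        # hdparm is not installed or could not be executed.
        return None
    return parse_using_dma(result.stdout)


if __name__ == "__main__":
    # /dev/hde is an assumption inferred from the df listing in the report.
    print(dma_enabled("/dev/hde"))
```

If this reports False, `hdparm -d1 /dev/hde` is the command suggested above for switching DMA on.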
EOL'd, and needinfo > 6 months.