Bug 149635
| Summary: | SCSI errors and kernel OOM killer under scsi load under RHEL4 | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Dean Jansa <djansa> |
| Component: | kernel | Assignee: | Larry Woodman <lwoodman> |
| Status: | CLOSED NOTABUG | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.0 | CC: | bmarzins, davej, dff, kanderso, mspqa-list, riel, rkenna, tburke |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2005-03-11 21:08:18 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 144795 | | |
Description

Dean Jansa, 2005-02-24 18:43:21 UTC
Test program:

```c
/*
 * writer.c --
 *
 *  writer <numblks per transfer> <#transfers (0 = until end of disk)> <dev>
 */
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define BLK 512

int
main(int argc, char **argv)
{
    int     fd;
    char    *bufh;
    int     numblks;
    int     numtrans;
    ssize_t nbytes;
    int     i;

    if (argc != 4) {
        fprintf(stderr,
            "%s <numblks per transfer> <#transfers (0 = until end of disk)> <dev>\n",
            argv[0]);
        exit(2);
    }

    if ((fd = open(argv[3], O_RDWR)) < 0) {
        perror(argv[3]);
        exit(1);
    }

    numblks = atoi(argv[1]);
    numtrans = atoi(argv[2]);

    if ((bufh = malloc(numblks * BLK)) == NULL) {
        fprintf(stderr, "Cannot allocate %d bytes\n", numblks * BLK);
        exit(1);
    }
    memset(bufh, '#', numblks * BLK);

    printf("Starting %d transfers of %d bytes\n", numtrans, numblks * BLK);

    for (i = 1; (numtrans == 0) || i <= numtrans; i++) {
        if ((nbytes = write(fd, bufh, numblks * BLK)) < 0) {
            fprintf(stderr, "Error on iteration %d: %s\n",
                i, strerror(errno));
            exit(1);
        }
        if (nbytes < (numblks * BLK)) {
            fprintf(stderr, "Short write (%zd of %d) on iteration %d\n",
                nbytes, numblks * BLK, i);
        } else {
            printf("\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b%d", i);
            fflush(stdout);
        }
    }

    free(bufh);
    close(fd);
    printf("\n");
    exit(0);
}
```

Running on 5 nodes (tank-02 - tank-06) to shared storage with the 2.6.10 kernel.org kernel, qla2300 drivers.

Command line on all nodes: `writer 40 0 /dev/sda`

6107662+ iterations, no issues at all.

Running the same command line on 2.6.9-5.EL, around iteration 78:

2.6.9 RHEL4 up:

tank-02:

```
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
Free pages: 1152216kB (1151296kB HighMem)
Active:4604 inactive:210748 dirty:3 writeback:208411 unstable:0 free:288054 slab:13100 mapped:4008 pagetables:186
DMA free:16kB min:16kB low:32kB high:48kB active:0kB inactive:11876kB present:16384kB
protections[]: 0 0 0
Normal free:904kB min:936kB low:1872kB high:2808kB active:48kB inactive:824840kB present:901120kB
protections[]: 0 0 0
HighMem free:1151296kB min:512kB low:1024kB high:1536kB active:18368kB inactive:6276kB present:1179136kB
protections[]: 0 0 0
DMA: 0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 16kB
Normal: 0*4kB 1*8kB 0*16kB 0*32kB 0*64kB 3*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 904kB
HighMem: 402*4kB 209*8kB 93*16kB 47*32kB 53*64kB 23*128kB 14*256kB 3*512kB 1*1024kB 1*2048kB 276*4096kB = 1151296kB
Swap cache: add 0, delete 0, find 0/0, race 0+0
Out of Memory: Killed process 1806 (sendmail).
```

tank-04:

```
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
Free pages: 1152080kB (1151360kB HighMem)
Active:4598 inactive:210908 dirty:1 writeback:208058 unstable:0 free:288020 slab:12981 mapped:4004 pagetables:182
DMA free:16kB min:16kB low:32kB high:48kB active:0kB inactive:9072kB present:16384kB
protections[]: 0 0 0
Normal free:704kB min:936kB low:1872kB high:2808kB active:28kB inactive:828284kB present:901120kB
protections[]: 0 0 0
HighMem free:1151360kB min:512kB low:1024kB high:1536kB active:18364kB inactive:6276kB present:1179136kB
protections[]: 0 0 0
DMA: 0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 16kB
Normal: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 704kB
HighMem: 368*4kB 170*8kB 85*16kB 61*32kB 60*64kB 25*128kB 12*256kB 5*512kB 0*1024kB 1*2048kB 276*4096kB = 1151360kB
Swap cache: add 0, delete 0, find 0/0, race 0+0
Out of Memory: Killed process 1808 (sendmail).
```
```
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
Free pages: 1153996kB (1152192kB HighMem)
Active:4390 inactive:210676 dirty:2 writeback:208832 unstable:0 free:288499 slab:12969 mapped:3800 pagetables:175
DMA free:12kB min:16kB low:32kB high:48kB active:4kB inactive:9272kB present:16384kB
```

Running on 5 nodes (tank-02 - tank-06) to shared storage with the 2.6.9-5.ELsmp kernel: one node hung, no messages on console, nothing in /var/log/messages. Just a dead node.

Another trial with `./writer 40000 0 /dev/sda`, around iteration 50 --

tank-02:

```
Starting 0 transfers of 20480000 bytes
34SCSI error : <0 0 0 0> return code = 0x20008
end_request: I/O error, dev sda, sector 64400
Buffer I/O error on device sda, logical block 8050
lost page write due to I/O error on sda
SCSI error : <0 0 0 0> return code = 0x20008
end_request: I/O error, dev sda, sector 67472
Buffer I/O error on device sda, logical block 8434
lost page write due to I/O error on sda
SCSI error : <0 0 0 0> return code = 0x20008
end_request: I/O error, dev sda, sector 69520
Buffer I/O error on device sda, logical block 8690
lost page write due to I/O error on sda
SCSI error : <0 0 0 0> return code = 0x20008
end_request: I/O error, dev sda, sector 70544
Buffer I/O error on device sda, logical block 8818
lost page write due to I/O error on sda
SCSI error : <0 0 0 0> return code = 0x20008
end_request: I/O error, dev sda, sector 64408
Buffer I/O error on device sda, logical block 8051
lost page write due to I/O error on sda
```

tank-03:

```
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
Free pages: 1133072kB (1130112kB HighMem)
Active:9787 inactive:209869 dirty:4 writeback:207304 unstable:0 free:283268 slab:13214 mapped:9126 pagetables:249
DMA free:32kB min:16kB low:32kB high:48kB active:0kB inactive:11392kB present:16384kB
protections[]: 0 0 0
Normal free:2928kB min:936kB low:1872kB high:2808kB active:56kB inactive:821796kB present:901120kB
protections[]: 0 0 0
HighMem free:1130112kB min:512kB low:1024kB high:1536kB active:39092kB inactive:6288kB present:1179136kB
protections[]: 0 0 0
DMA: 4*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 32kB
Normal: 492*4kB 10*8kB 27*16kB 12*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2928kB
HighMem: 2*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 5*512kB 3*1024kB 3*2048kB 273*4096kB = 1130112kB
Swap cache: add 0, delete 0, find 0/0, race 0+0
Out of Memory: Killed process 1952 (sendmail).
```
tank-04:

```
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
Free pages: 1132880kB (1129984kB HighMem)
Active:9768 inactive:209882 dirty:11 writeback:208195 unstable:0 free:283220 slab:13195 mapped:9147 pagetables:264
DMA free:16kB min:16kB low:32kB high:48kB active:0kB inactive:11444kB present:16384kB
protections[]: 0 0 0
Normal free:2880kB min:936kB low:1872kB high:2808kB active:12kB inactive:821924kB present:901120kB
protections[]: 0 0 0
HighMem free:1129984kB min:512kB low:1024kB high:1536kB active:39068kB inactive:6200kB present:1179136kB
protections[]: 0 0 0
DMA: 0*4kB 2*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 16kB
Normal: 424*4kB 36*8kB 42*16kB 7*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2880kB
HighMem: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 5*512kB 5*1024kB 2*2048kB 273*4096kB = 1129984kB
Swap cache: add 0, delete 0, find 0/0, race 0+0
Out of Memory: Killed process 1952 (sendmail).
```

```
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
Free pages: 1133720kB (1130816kB HighMem)
Active:9564 inactive:209894 dirty:0 writeback:207721 unstable:0 free:283430 slab:13198 mapped:8943 pagetables:253
DMA free:16kB min:16kB low:32kB high:48kB active:0kB inactive:11432kB present:16384kB
protections[]: 0 0 0
Normal free:2824kB min:936kB low:1872kB high:2808kB active:12kB inactive:821992kB present:901120kB
protections[]: 0 0 0
HighMem free:1130816kB min:512kB low:1024kB high:1536kB active:38256kB inactive:6200kB present:1179136kB
protections[]: 0 0 0
DMA: 0*4kB 2*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 16kB
Normal: 434*4kB 24*8kB 42*16kB 7*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2824kB
HighMem: 62*4kB 27*8kB 7*16kB 4*32kB 2*64kB 0*128kB 0*256kB 5*512kB 5*1024kB 2*2048kB 273*4096kB = 1130816kB
Swap cache: add 0, delete 0, find 0/0, race 0+0
Out of Memory: Killed process 2082 (xfs).
```

Note -- this is blocking GFS testing, as we cannot put any load on the system without hitting this.

The writer scenario ran all last night without any problems on the upgraded 2.6.9-6.4.ELsmp kernel. This may be fixed. I'll continue to attempt to reproduce.

This issue has been fixed in 2.6.9-6.4.

Not so fast: we no longer see it with the simple 'writer' load shown.
However, I have hit it twice in a row running a regression test load on a 6 node cluster:

```
oom-killer: gfp_mask=0xd2
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty
Free pages: 1660kB (0kB HighMem)
Active:63638 inactive:54145 dirty:0 writeback:54 unstable:0 free:415 slab:7032 mapped:116864 pagetables:695
DMA free:188kB min:20kB low:40kB high:60kB active:6260kB inactive:6188kB present:16384kB
protections[]: 0 0 0
Normal free:1472kB min:692kB low:1384kB high:2076kB active:248292kB inactive:210392kB present:499648kB
protections[]: 0 0 0
HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB
protections[]: 0 0 0
DMA: 35*4kB 4*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 188kB
Normal: 6*4kB 51*8kB 19*16kB 3*32kB 0*64kB 5*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1472kB
HighMem: empty
Swap cache: add 945657, delete 945520, find 335077/390289, race 0+0
Free swap: 40kB
129008 pages of RAM
0 pages of HIGHMEM
2010 reserved pages
209 pages shared
137 pages swap cached
Out of Memory: Killed process 12859 (doio).
```

Still seems to be here with kernel 2.6.9-6.4.EL on an i686.

Just an FYI -- I see this with 2.6.9-6.16.ELsmp as well. This is a blocker for GFS release; let me know what info you need to help with this one.

Donald, this one needs to be a blocker for U1. We can't run our GFS verification suite tests to completion, or at least we can't get to the GFS bugs; we hit this one first. Kevin

I am seeing this too. This log message is from 2.6.9-6.24.ELsmp, but I have been able to see the same thing on a mainline 2.6.11 kernel. I keep hitting this while trying to recreate a GFS bug.
```
oom-killer: gfp_mask=0x1d2
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty
Free pages: 776kB (0kB HighMem)
Active:57341 inactive:63913 dirty:0 writeback:2 unstable:0 free:194 slab:2924 mapped:119584 pagetables:1112
DMA free:32kB min:20kB low:40kB high:60kB active:6232kB inactive:5624kB present:16384kB
protections[]: 0 0 0
Normal free:744kB min:692kB low:1384kB high:2076kB active:223132kB inactive:250028kB present:499648kB
protections[]: 0 0 0
HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB
protections[]: 0 0 0
DMA: 4*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 32kB
Normal: 2*4kB 4*8kB 2*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 744kB
HighMem: empty
Swap cache: add 625195, delete 625184, find 147874/174487, race 0+0
Free swap: 0kB
129008 pages of RAM
0 pages of HIGHMEM
2387 reserved pages
1753 pages shared
11 pages swap cached
Out of Memory: Killed process 7747 (mDNSResponder).
```

The recent vmscan.c patch off rhkernel-list doesn't seem to change my ability to reproduce this bug. I can still consistently see it within 30 minutes.

Does this happen without GFS? The reason I ask is that someone is allocating Normal zone rather than HighMem pages for the pagecache. This is likely the cause of the OOM kills in this case. Larry Woodman

Actually, after looking at this more: in addition to the pagecache pages getting allocated out of lowmem (Normal and/or DMA zones), the entire set of inactive pages is in writeback state. This means that they have been written to disk in an attempt to reclaim them, but the I/O completion has not occurred!
```
Active:9787 inactive:209869 dirty:4 writeback:207304 unstable:0
```

Since the device "lost page write due to I/O error on sda", the completion never occurred, the page never exited writeback state, and therefore cannot be reclaimed. If this happens to all of the lowmem pages, the system will choke and eventually start OOM killing. Why are these device errors happening here? Larry

Please attach the OOM kill show_mem() from the messages file when you are running with that vmscan patch. Also, are you still getting the SCSI I/O errors? Larry

Ben, in your last posting the problem is that swapfree is zero and all of the pages on the system are mapped! I don't know what you are doing here, but if the system needs to reclaim anonymous memory and there is no swap space, it has no choice but to OOM kill. Larry

Yes, I am hitting this bug while using GFS. I have not tried to hit this bug without GFS. I think Dean has tried, and I don't think he was able to.
Here is what a log message looks like with the vmscan patch:

```
oom-killer: gfp_mask=0x1d2
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty
Free pages: 660kB (0kB HighMem)
Active:87697 inactive:34046 dirty:0 writeback:0 unstable:0 free:165 slab:2495 mapped:121429 pagetables:1087
DMA free:20kB min:20kB low:40kB high:60kB active:6860kB inactive:5164kB present:16384kB
protections[]: 0 0 0
Normal free:640kB min:692kB low:1384kB high:2076kB active:343928kB inactive:131020kB present:499648kB
protections[]: 0 0 0
HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB
protections[]: 0 0 0
DMA: 1*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20kB
Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 3*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 640kB
HighMem: empty
Swap cache: add 532138, delete 532138, find 117967/137145, race 0+1
Free swap: 0kB
129008 pages of RAM
0 pages of HIGHMEM
2387 reserved pages
86 pages shared
0 pages swap cached
Out of Memory: Killed process 9195 (doio).
```

I do not see any SCSI I/O errors while running these tests, and in fact I can reproduce this while running over GNBD, which completely avoids the SCSI layer.

FWIW -- I haven't tried to hit it without GFS with the new kernel. The attached 'writer.c' test doesn't seem to hit it; however, that is a light load compared to the load we see with a GFS regression test. I'd guess if we up the load on the RAID we'd see it even without GFS. That is what triggered it with 'writer': we needed to run writer from multiple nodes to the shared RAID.

Once again, the latest memory stats show a potential problem. There is no swap space, and 121429 out of 129008 pages are mapped into some address space.
I don't know what this application is doing, but the system cannot reclaim anonymous pages without locations on a swap device to write them. Why is there no swap space? Larry

Um... you can probably stop looking so hard at this. Things weren't quite adding up, so I stuck some memory debugging code into the test program that Dean and I have been using to get these errors... and I'm pretty certain that this is a bug in the test program, not in the kernel. I don't exactly know everything that's going on with the program, so I can't say that for certain, but I'm pretty sure. I'm convinced that every time I saw this happen, it was the fault of the testing program. The writer testing program that was hitting it earlier doesn't hit it anymore. So I'm perfectly happy with this bug being closed.

Yep -- I had writer run overnight, no issues. We can close it.

Dean, while I will happily close this bug, can you tell me what you were doing that caused the system to OOM kill? The reason I ask is that we are still running into OOM kills in which we see the systems in a variety of strange states that I still cannot fully explain. Knowing what you did might help me think about these problems from a different angle. Thanks for your help, Larry Woodman.

The test case was doing large writes -- 204800000 bytes each request. Multiple processes doing that was the cause of the memory exhaustion.