1. We have an LPAR on a Firebird (system type 7895) allocated 2GB of RAM.
2. Installed Fedora 16 on it; the installation completes successfully.
3. On the reboot at the end of installation, boot hangs with the message:

Kernel Panic - not syncing: Out of memory and no killable processes

Once we allocate 4GB of RAM to the LPAR, the reboot succeeds. On Juno iocl (8246) LPARs, rebooting succeeds with 2GB of memory as well.

Setup details:
fbfirebird02 - 2GB - boot unsuccessful
fbfirebird07 - 4GB - boot successful

We have attached two documents:
1. The /var/log/messages of the 4GB LPAR, where Fedora boots successfully.
2. The terminal console output that appears while booting Fedora on the 2GB LPAR.

We tried patching the kernel with the oom-killer patch from RH Bug 741207, but the issue was not resolved.
Created attachment 531028 [details] 2gb boot output
Created attachment 531029 [details] 4Gb messages
------- Comment From anton.com 2011-11-02 05:40 EDT-------
I had a look at this issue by booting one of my POWER6 Fedora 16 boxes with mem=2G. A few observations:

1. The initramfs isn't removed after boot:

# du -s /run/initramfs
127040 /run/initramfs/

That might be by design, but it does use up over 100MB of memory.

2. There is over 1GB of memory in slab:

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
118745  13237  11%    2.30K   2159       55    276352K kmalloc-2048
292000  17944   6%    0.80K   3650       80    233600K kmalloc-512
402714   4723   1%    0.55K   3442      117    220288K kmalloc-256
349948   6545   1%    0.36K   1966      178    125824K kmalloc-64
  3596    550  15%   32.00K    230       16    117760K thread_info
  4650   4171  89%   16.30K    150       31     76800K kmalloc-16384

Notice in particular how low the utilisation is for the top 5 caches: 15% or below.

This box has 4 NUMA nodes and 64 HW threads. If I boot with smt=0 nr_cpus=1 such that we only start 1 HW thread, we see a much nicer picture:

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
  4095   4093  99%   16.30K    273       15     69888K kmalloc-16384
 10719  10697  99%    2.30K    397       27     25408K kmalloc-2048
 17150  17105  99%    1.30K    350       49     22400K kmalloc-1024
 25705  25608  99%    0.66K    265       97     16960K blkdev_requests
 24684  24561  99%    0.62K    242      102     15488K skbuff_head_cache
  6420   6413  99%    2.07K    214       30     13696K blkdev_queue

Only 200MB of slab usage, much better. I would have expected some wastage due to slub's SMP optimisations (per-CPU and per-node pools), but utilisations of 6% and even 1% are pretty awful. It looks like there is an issue with slub and SMP.
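The waste in the first table can be quantified directly from the OBJS and ACTIVE columns. A rough sketch (the awk below re-parses the six rows pasted above; the column layout is assumed to match this paste, and per-cache waste is estimated as cache size scaled by the fraction of inactive objects):

```shell
# Rough estimate of wasted slab memory from the slabtop rows above.
# Assumed columns: OBJS ACTIVE USE OBJ_SIZE SLABS OBJ_PER_SLAB CACHE_SIZE NAME
awk 'NR > 1 {
    total = $7 + 0                   # cache size in KB ("276352K" -> 276352)
    waste += total * (1 - $2 / $1)   # fraction of objects not in use
} END {
    printf "approx wasted: %d MB\n", waste / 1024
}' <<'EOF'
OBJS ACTIVE USE OBJ_SIZE SLABS OBJ_PER_SLAB CACHE_SIZE NAME
118745 13237 11% 2.30K 2159 55 276352K kmalloc-2048
292000 17944 6% 0.80K 3650 80 233600K kmalloc-512
402714 4723 1% 0.55K 3442 117 220288K kmalloc-256
349948 6545 1% 0.36K 1966 178 125824K kmalloc-64
3596 550 15% 32.00K 230 16 117760K thread_info
4650 4171 89% 16.30K 150 31 76800K kmalloc-16384
EOF
# -> approx wasted: 892 MB
```

By this estimate roughly 900MB of the ~1GB in slab is sitting in unused objects, which is consistent with a 2GB LPAR hitting the OOM killer at boot.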
Filed bug 751189 for the initramfs thing. I trust you'll bring up the slub problem with upstream? Thanks.
(In reply to comment #3)
> ------- Comment From anton.com 2011-11-02 05:40 EDT-------
> I had a look at this issue by booting one of my POWER6 Fedora16 boxes with
> mem=2G. A few observations:
>
> 1. The initramfs isn't removed after boot:
>
> # du -s /run/initramfs
> 127040 /run/initramfs/
>
> That might be by design, but it does use up over 100MB of memory.

To turn it off:

# echo "unset prefix" >> /etc/dracut.conf.d/99-my.conf
# dracut -f
------- Comment From anton.com 2011-11-07 05:43 EDT-------
I found a memory leak in the SCSI layer that was responsible for about 15MB on my POWER7 box: commit f7c9c6bb14f3 ("[SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev"), marked for stable@ so it should make its way back to 3.1.x.

Dave: the slub issue is next on my list.
------- Comment From anton.com 2011-11-07 05:54 EDT-------
The ehea driver (IBM 1G/10G Ethernet) always fills the jumbo ring, which costs about 60-70MB of memory per interface. Since jumbo frame usage should be very rare, we should only fill the jumbo ring when the MTU is raised. I've brought this issue up with the ehea maintainer.
------- Comment From anton.com 2011-12-02 19:14 EDT-------
I worked out why my low-memory tests caused slub to consume so much more memory. By clamping memory to 2GB, I ended up with one NUMA node with memory and four NUMA nodes of CPUs. For the three nodes without memory, the slub code always goes through the remote-node alloc and free paths.

The long-term fix is possibly HAVE_MEMORYLESS_NODES, but there is no chance of that for FC16. It would be nice to fix it more generally, though; I would think other architectures see this with unbalanced CPU/memory layouts.
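The unbalanced layout described above can be spotted from sysfs on any Linux box. A minimal sketch, assuming the standard /sys/devices/system/node layout (it just prints each node's CPU list and memory, and flags nodes that have CPUs but report zero memory):

```shell
# List each NUMA node's CPUs and memory; flag memoryless nodes that
# still have CPUs, since those hit slub's remote-node alloc/free path.
for node in /sys/devices/system/node/node*; do
    if [ ! -d "$node" ]; then
        echo "no NUMA sysfs on this system"
        break
    fi
    cpus=$(cat "$node/cpulist")
    # meminfo lines look like: "Node 0 MemTotal:   2097152 kB"
    mem=$(awk '/MemTotal/ { print $4 }' "$node/meminfo")
    printf '%s: cpus=%s mem=%s kB\n' "${node##*/}" "$cpus" "$mem"
    if [ "$mem" = "0" ] && [ -n "$cpus" ]; then
        echo "  -> memoryless node with CPUs attached"
    fi
done
```

On the 2GB configuration above this would show node0 with all the memory and three further nodes flagged as memoryless.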
Comments #4 and #5 address the initramfs thing. The scsi fix mentioned in comment #6 is in the 3.2 kernel. The ehea fix in comment #7 seems to be aa9084a01a7893a9f4bed98aa29081f15d403a88, which is also in 3.2. Comment #8 basically suggests this is a NUMA balance problem that isn't going to be fixed anytime soon. Now that F16 is on the 3.2 kernel, I think this bug can be closed out.