Bug 761262

Summary: Huge TLB Broken
Product: [Fedora] Fedora
Component: kernel
Version: 16
Hardware: x86_64
OS: Linux
Status: CLOSED UPSTREAM
Severity: unspecified
Priority: unspecified
Reporter: Andrig Miller <andrig.t.miller>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Doc Type: Bug Fix
Last Closed: 2012-07-16 18:32:27 UTC
Attachments: test log (flags: none)

Description Andrig Miller 2011-12-07 22:25:10 UTC
Description of problem:

When I preallocate large page memory (HugePages or HugeTLB) and try to use it with Java and MySQL (both support it), I cannot request more than 7GB or so. Anything larger fails with errno 28 (ENOSPC, "No space left on device").

Version-Release number of selected component (if applicable):

3.1.4-1.fc16.x86_64

How reproducible:

Every time

Steps to Reproduce:
1. Allocate large pages through sysctl.conf, with the following:
# Enable large page memory
kernel.shmmax=25769803776
vm.nr_hugepages=10752
vm.hugetlb_shm_group=1001

There is 24GB of memory on the server, and I'm allocating 21GB. (I have done this on Fedora 14 and 15 with no issues.)

2. Set /etc/security/limits.conf to allow for memlock to be unlimited for the user.
3. Create the hugetlb group, and put the users in that group.
4. Turn off transparent huge pages with the boot parameter transparent_hugepage=never
5. Run the following java command:

java -XX:+UseLargePages -Xms8g -Xmx8g -version
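
The sysctl values in step 1 can be sanity-checked with a little arithmetic. This is a minimal sketch, assuming the huge page size is 2 MiB (2048 KiB), which the report does not state; the function names are hypothetical and only illustrate the calculation:

```shell
#!/bin/sh
# Size of a huge page pool in bytes.
#   $1 = number of huge pages (vm.nr_hugepages)
#   $2 = huge page size in KiB (ASSUMED to be 2048 here; the real value
#        is the Hugepagesize line in /proc/meminfo)
hugepage_pool_bytes() {
    echo $(( $1 * $2 * 1024 ))
}

# Convert a byte count to whole GiB (integer division).
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

pool=$(hugepage_pool_bytes 10752 2048)
echo "huge page pool: $pool bytes = $(bytes_to_gib "$pool") GiB"
echo "kernel.shmmax:  25769803776 bytes = $(bytes_to_gib 25769803776) GiB"
```

Under that assumption the pool comes to 21 GiB and shmmax to 24 GiB, which matches the figures quoted in step 1.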
  
Actual results:

java -XX:+UseLargePages -Xms8g -Xmx8g -version
OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory (errno = 28).
java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.4) (fedora-60.1.10.4.fc16-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)

Expected results:

I would expect no error, as is the case with 7GB:

java -XX:+UseLargePages -Xms7g -Xmx7g -version
java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.4) (fedora-60.1.10.4.fc16-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)


Additional info:

I have also looked at how zone_reclaim_mode was set, since a value of 0 should allow the memory allocation to span NUMA nodes. This system has two NUMA nodes: it is a two-socket Intel Nehalem-based server with 4 cores per socket and HT.

This exact configuration worked without issue on previous Fedora releases, definitely including 14 and 15.

Comment 1 Josh Boyer 2011-12-07 22:56:28 UTC
Fedora 15 and Fedora 16 are on the same kernel version now (3.1.4).  You will probably make more progress by reporting this upstream to the MM developers.  Things to note are the kernel version it last worked with, and the first one it stopped working on.

E.g. 2.6.38 works, 3.0 works, 3.1 fails (or whatever the case may be).

Comment 2 Chuck Ebbert 2011-12-09 02:12:27 UTC
What version was the last F15 kernel that worked?

Comment 3 Josh Boyer 2011-12-09 12:28:48 UTC
I emailed upstream.  No replies as of yet.

http://marc.info/?l=linux-mm&m=132336457726519&w=2

Comment 4 Andrig Miller 2011-12-10 03:55:18 UTC
(In reply to comment #1)
I really don't know. On the server where I needed to use more than 7GB, I added the Fedora update repo during installation, so I never had anything older than the 3.1.4 kernel.

Comment 5 Josh Boyer 2011-12-11 15:09:27 UTC
(In reply to comment #4)
You said F14 and F15 worked.  Which kernel versions from F15 work?

Comment 6 Andrig Miller 2011-12-11 15:22:50 UTC
(In reply to comment #5)
Whichever kernel version it had before its last kernel upgrade. It was probably before 3.x, but I don't remember.

Comment 7 Josh Boyer 2011-12-11 15:28:50 UTC
(In reply to comment #6) 
Yum keeps a log of upgrade transactions.  You might be able to find it in /var/log/yum.log.  Just grep for the kernel installs like:

[jwboyer@zod kernel]$ sudo grep ".* Installed: kernel-[23].*" /var/log/yum.log 
Jul 27 09:28:21 Installed: kernel-2.6.38.8-35.fc15.x86_64
Aug 02 08:49:57 Installed: kernel-2.6.40-4.fc15.x86_64
Aug 18 10:31:49 Installed: kernel-2.6.40.3-0.fc15.x86_64
Sep 15 08:50:25 Installed: kernel-2.6.40.4-5.fc15.x86_64
Oct 04 16:56:44 Installed: kernel-2.6.40.6-0.fc15.x86_64
Nov 25 23:13:20 Installed: kernel-3.1.1-2.fc16.x86_64
Dec 02 11:22:28 Installed: kernel-3.1.2-1.fc16.x86_64
Dec 08 12:27:19 Installed: kernel-3.1.4-1.fc16.x86_64

Comment 8 Andrig Miller 2011-12-11 20:28:41 UTC
I don't have the yum logs anymore because I needed to have a working system, so I installed RHEL 6.2.

Comment 9 Josh Boyer 2011-12-12 14:45:10 UTC
Created attachment 545762: test log

I've tested every major kernel release from 3.2-rc4-git6 (rawhide) back to the 2.6.35 kernel on f16 userspace.  None of them work with the -Xms8g -Xmx8g command line options to java, all failing with errno = 28.  They all work with 7g.

The test machine has 12GB of RAM, and I've set both shmmax and nr_hugepages to the equivalent of 9GB:

[root@vader ~]# cat /proc/sys/kernel/shmmax 
9663676416
[root@vader ~]# cat /proc/sys/vm/nr_hugepages
4608
[root@vader ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         11999       9924       2074          0         19        207
-/+ buffers/cache:       9697       2302
Swap:        14047          0      14047
[root@vader ~]# 


So either something is missing from your reproduction steps, or this isn't really a kernel problem. Am I missing something?
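
The numbers in this comment can be compared against the requested heap size. This is a minimal sketch, assuming a 2 MiB huge page size (not stated in the comment); `fits` is a hypothetical helper, and the check covers only the heap size given on the java command line, not anything else the JVM may reserve:

```shell
#!/bin/sh
# Succeeds if a request fits within a limit; both values are in bytes.
fits() {
    [ "$1" -le "$2" ]
}

request=$(( 8 * 1024 * 1024 * 1024 ))   # heap size from -Xms8g -Xmx8g
shmmax=9663676416                        # value shown in /proc/sys/kernel/shmmax
pool=$(( 4608 * 2048 * 1024 ))           # nr_hugepages x ASSUMED 2048 KiB pages

if fits "$request" "$shmmax"; then
    echo "request <= shmmax"
else
    echo "request > shmmax"
fi

if fits "$request" "$pool"; then
    echo "request <= huge page pool"
else
    echo "request > huge page pool"
fi
```

On these figures an 8 GiB heap is below both the 9 GiB shmmax and the 9 GiB pool, so on paper the heap size alone does not account for the errno 28.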

Comment 10 Andrig Miller 2011-12-13 20:39:37 UTC
I believe I have figured out the issue. I started to have the problem on RHEL 6.2, which I installed over Fedora 16, although initially it worked just fine.

I have no idea why I suddenly started to get the issue, but it's probably because I rebooted.

It turns out the issue started when I changed the shmmax setting to 24GB, which is the amount of physical RAM on the system. The original default after install was 64GB. When I put it back to 64GB, I was able to allocate as many large memory pages as I needed.

If someone could test with shmmax set to a much higher value, say three times the physical RAM as I have done, I bet the issue goes away.

I'm curious as to what the math is though, so I can capture and document what is going on.
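
The test suggested in this comment could be tried along these lines. This is a minimal sketch: `shmmax_for_multiple` is a hypothetical helper that turns a RAM figure in KiB (the unit /proc/meminfo uses for MemTotal) into a byte value for kernel.shmmax. The factor of three is the one proposed in the comment, not a documented rule:

```shell
#!/bin/sh
# Compute a kernel.shmmax value in bytes.
#   $1 = physical RAM in KiB (e.g. MemTotal from /proc/meminfo)
#   $2 = multiplier to apply
shmmax_for_multiple() {
    echo $(( $1 * 1024 * $2 ))
}

# 24 GiB of RAM, as on the reporter's server, times three:
shmmax_for_multiple $(( 24 * 1024 * 1024 )) 3
```

On a live system the first argument could come from `awk '/^MemTotal:/ {print $2}' /proc/meminfo`, and the result could be applied as root with `sysctl -w kernel.shmmax=<value>` before rerunning the java command.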

Comment 11 Dave Jones 2012-03-22 17:12:09 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 14 Dave Jones 2012-07-16 18:32:27 UTC
I suspect your best bet here is to take questions to linux-mm where the VM developers hang out.