The following has been reported by IBM LTC:
Question concerning the algorithm used for choosing where the shared libraries load
This is more a question than a bug.
On RHEL 3 IA32, we noticed that the shared libraries no longer load at
0x40000000 and they load in the space between 0x0 and the executable in the
My question is what happens when this space is filled. Do you have a specific
place where you put these after the exec, or is it random? My concern is that
this could cause a potential problem for DB2. We are still investigating the
effects of this change.
Glen/Greg - a good question for Red Hat on RHEL3. Thanks.
Yvonne - does this impact DB2 at all?
Unless you're using prelink(8), it is the kernel which decides these addresses.
if this space is filled, we go to other free regions (in practice,
while this is suboptimal (eg it breaks the big free area between the binary and
the top of stack/mmaps) it's a safe thing to do for a rare situation
------ Additional Comments From firstname.lastname@example.org 2003-23-09 10:05 -------
Is the TASK_UNMAPPED_BASE the same as previous kernels and is this moveable if
TASK_UNMAPPED_BASE is at 1/3 of virtual space still; it is not movable.
but with prelink YOU decide where libs get loaded; kernel policy only matters if
there's no preference...
------ Additional Comments From email@example.com 2003-23-09 10:59 -------
There are a couple of things I'd like to clarify first.
From what I can tell of prelink, it would be run on the system after we install
DB2? If this is the case, then we can *NOT* do this. It could cause DB2 to
stop running should the customer update their system libraries for any reason.
This would become a maintenance/support nightmare for us.
Do the binaries need to be built with gcc 3.2+ to use this? If so, we can not
do this on x86, or IA64 since we use the Intel compiler.
you can tell prelink to run once before you ship stuff if you manually pick
where libs go.
If it works with Intel's compiler I don't know, I have no information on how
compatible that is with gcc.
------ Additional Comments From firstname.lastname@example.org 2003-23-09 13:09 -------
ok. This is not an option then. We build on RH 7.2 and ship only 1 set of
binaries per architecture. We do *NOT* re-build or plan to change any of this
until v9, due to customer commitments. We can not afford the effort to ship
multiple binaries, due to both resource and support issues.
Since we have 1 binary (per arch) we can not use this option since we support
quite a few distros with that one binary. We used to make use of the
mapped_base file that existed in the /proc/<pid>/ directory. This has
obviously been removed. Is there a reason for this?
------ Additional Comments From email@example.com 2003-23-09 13:10 -------
One more thing to note: the mapped_base file was available in Red Hat Advanced
mapped_base has not been added to RHEL3 because it is basically not needed; we
expect(ed) that all libs would just not be in that area.
Question: is this actually observed in practice or is this just a theoretical
"what if" thing ?
For a next generation of your product it may even make sense to move the
executable up a bit so that all libs fit below it for sure; that way a maximum
space between binary and stack is available (eg big mmap segment possible)
------ Additional Comments From firstname.lastname@example.org 2003-23-09 16:13 -------
We are looking at creating a scenario where this would happen. It's not
unlikely in our opinion that this could cause us problems.
If the shared library comes along and decides that it attaches at
TASK_UNMAPPED_BASE which is 1/3 of vm is 0x40000000(?) because the section
above the exe is full. And we come along and detect that the shared libs are
higher than 0x30000000, we attach our shared memory starting here. (We make an
assumption that if it's higher than 0x3, then the shared libs are at
0x20000000 -- I know, not the greatest, but we can fix this.) However, the
problem is when we attach at 0x30000000 and the shared lib starts attaching at
0x40000000 and we bump into it. (I assume in this scenario, we sigsegv!).
The other possibility is that we have our shared memory attach for say 1.5 G
starting from 0x30000000, and we need more space for the libraries. What
happens? Is the kernel smart enough to move the libs to below our shared memory
segment? or do we get even weirder behaviour?
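The collision described above comes down to a range-overlap check. A minimal sketch, assuming the 1.5 G attach at 0x30000000 from the comment and a purely hypothetical 16 MiB library region at 0x40000000 (the library size is not given in this thread):

```python
GIB = 1 << 30

def overlaps(start_a, size_a, start_b, size_b):
    """Return True if the half-open ranges [start, start + size) intersect."""
    return start_a < start_b + size_b and start_b < start_a + size_a

# Values taken from the comment above.
shm_start = 0x30000000
shm_size = GIB + GIB // 2      # the 1.5 G shared memory attach

# Hypothetical library region, for illustration only.
lib_start = 0x40000000
lib_size = 16 * (1 << 20)

print("shm segment ends at", hex(shm_start + shm_size))
print("overlaps library region:", overlaps(shm_start, shm_size, lib_start, lib_size))
```

Under these assumptions the attach would span 0x30000000 to 0x90000000, which covers 0x40000000, so the two requests cannot both be satisfied at those addresses.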
> TASK_UNMAPPED_BASE which is 1/3 of vm is 0x40000000(?)
it is that value for 3Gb userspace; for 4Gb userspace it's more (1/3rd more)
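The arithmetic behind these two values can be checked with a short sketch. It assumes TASK_UNMAPPED_BASE is simply one third of the user address space, as stated in this thread, and treats the 4Gb case as a full 4 GB of user space, which may be slightly more than a real kernel provides:

```python
def one_third(task_size):
    """One third of the user address space, rounded down."""
    return task_size // 3

# Assumed user-space sizes: 3 GB split and (approximately) 4 GB split.
for label, task_size in (("3Gb userspace", 0xC0000000),
                         ("4Gb userspace", 0x100000000)):
    print(label, hex(one_third(task_size)))
```

The second value is one third larger than the first, which matches the "1/3rd more" remark.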
The kernel will never map shared libraries in a place where something else
already exists. (and if you use dlopen() glibc also has a big say in this btw)
My recommendation would be that if you map a large area, to either not provide a
hint address (eg let the kernel find a hole this big), or to try to work from
the stack downwards. In no case should MAP_FIXED be used for things like this,
since that eradicates any existing mappings in the target range.
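As a rough illustration of mapping without a hint address, the sketch below creates an anonymous mapping and reports where the kernel placed it. It uses Python's mmap module, which offers no way to pass a hint, so the kernel always picks the hole. The helper name map_without_hint is hypothetical, and this is not the shmget/shmat path a database would use.

```python
import ctypes
import mmap

def map_without_hint(length):
    """Create an anonymous mapping; return (mapping, address chosen by the kernel)."""
    m = mmap.mmap(-1, length)            # fileno -1: anonymous memory, no hint
    view = ctypes.c_char.from_buffer(m)  # temporary view, used only to read the address
    address = ctypes.addressof(view)
    del view                             # release the buffer so close() works later
    return m, address

region, addr = map_without_hint(16 * mmap.PAGESIZE)
print("kernel placed the mapping at", hex(addr))
region.close()
```

The printed address varies from run to run and from kernel to kernel, which is the point: the caller states a size and accepts whatever free region the kernel finds.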
------ Additional Comments From email@example.com 2003-24-09 11:36 -------
We don't use MAP_FIXED as far as I know since we use shmget/shmat with our
shared memory mapping. The reason we use the hint address is so that all
our executables can have the shared memory attached at the same place.
However, perhaps this won't be a problem. We have just gotten a test program
that will either dlopen, and/or just dynamically link the shared libs in. The
shared libs load in the space before the executable, and then right after it.
This is good news, however it conflicts with your comment that the shared libs
would continue to use TASK_UNMAPPED_BASE as the position to restart from.
------ Additional Comments From firstname.lastname@example.org 2003-26-09 11:27 -------
*SIGH* we have just hit what I was afraid of.
This is coming to us from the LDAP team and this is the kernel they are using:
Linux ldapdut009 2.4.21-3.ELsmp #1 SMP Fri Sep 19 14:06:12 EDT 2003 i686 i686
This is a process mapping of the db2sysc process:
08048000-08051000 r-xp 00000000 08:05
08051000-08056000 rw-p 00008000 08:05
08056000-08077000 rw-p 00000000 00:00 0
10000000-10ef4000 rw-s 00000000 00:04 4161540 /SYSV9ff53761 (deleted)
11000000-125fc000 rw-s 00000000 00:04 4194309 /SYSV00000000 (deleted)
50000000-54ecc000 rw-s 00000000 00:04 4227078 /SYSV00000000 (deleted)
b3ccb000-b3ecb000 r--p 00000000 08:08 2966002 /usr/lib/locale/locale-archive
b3ecb000-b3f4c000 rw-p 00001000 00:00 0
b3f4c000-b416e000 rw-s 00000000 00:04 4128771 /SYSV9ff53774 (deleted)
b416e000-b41af000 rw-p 00000000 00:00 0
b41af000-b41ba000 r-xp 00000000 08:08 507940 /lib/libnss_files-2.3.2.so
b41ba000-b41bb000 rw-p 0000a000 08:08 507940 /lib/libnss_files-2.3.2.so
b41bb000-b41bc000 rw-p 00000000 00:00 0
b41bc000-b41c3000 r-xp 00000000 08:02
b41c3000-b41c7000 rw-p 00006000 08:02
b41c7000-b41c8000 rw-p 00000000 00:00 0
b41c8000-b41ed000 r-xp 00000000 08:02
b41ed000-b420b000 rw-p 00024000 08:02
b420b000-b4210000 rw-p 00000000 00:00 0
b4210000-b4221000 r-xp 00000000 08:02
b4221000-b422e000 rw-p 00010000 08:02
b422e000-b4230000 rw-p 00000000 00:00 0
b4230000-b4232000 r-xp 00000000 08:02
b4232000-b4233000 rw-p 00001000 08:02
b4233000-b423b000 r-xp 00000000 08:08 4096014 /lib/tls/librtkaio-2.3.2.so
b423b000-b423c000 rw-p 00007000 08:08 4096014 /lib/tls/librtkaio-2.3.2.so
b423c000-b4247000 rw-p 00000000 00:00 0
b4247000-b4249000 r-xp 00000000 08:08 507920 /lib/libdl-2.3.2.so
b4249000-b424a000 rw-p 00001000 08:08 507920 /lib/libdl-2.3.2.so
b424a000-b424f000 r-xp 00000000 08:08 507918 /lib/libcrypt-2.3.2.so
b424f000-b4250000 rw-p 00004000 08:08 507918 /lib/libcrypt-2.3.2.so
b4250000-b4277000 rw-p 00000000 00:00 0
b4277000-b43a8000 r-xp 00000000 08:08 4096006 /lib/tls/libc-2.3.2.so
b43a8000-b43ab000 rw-p 00130000 08:08 4096006 /lib/tls/libc-2.3.2.so
b43ab000-b43af000 rw-p 00000000 00:00 0
b43af000-b43fe000 r-xp 00000000 08:02
b43fe000-b441d000 rw-p 0004e000 08:02
b441d000-b443e000 r-xp 00000000 08:08 4096009 /lib/tls/libm-2.3.2.so
b443e000-b443f000 rw-p 00020000 08:08 4096009 /lib/tls/libm-2.3.2.so
b443f000-b4475000 r-xp 00000000 08:02
b4475000-b44b8000 rw-p 00035000 08:02
b44b8000-b44bc000 rw-p 00000000 00:00 0
b44bc000-b60d2000 r-xp 00000000 08:02
b60d2000-b7564000 rw-p 01c15000 08:02
b7564000-b75cd000 rw-p 00000000 00:00 0
b75cd000-b75da000 r-xp 00000000 08:08 4096011 /lib/tls/libpthread-0.59.so
b75da000-b75db000 rw-p 0000c000 08:08 4096011 /lib/tls/libpthread-0.59.so
b75db000-b75dd000 rw-p 00000000 00:00 0
b75ea000-b75eb000 rw-p 00001000 00:00 0
b75eb000-b7600000 r-xp 00000000 08:08 507907 /lib/ld-2.3.2.so
b7600000-b7601000 rw-p 00015000 08:08 507907 /lib/ld-2.3.2.so
bfff7000-c0000000 rwxp ffffa000 00:00 0
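A listing like the one above is easier to reason about once the free gaps between mappings are computed. The sketch below parses text in /proc/<pid>/maps format and lists the unmapped ranges. The function names are illustrative, and SAMPLE holds only a few complete lines copied from the listing, not the whole map.

```python
def parse_maps(text):
    """Parse /proc/<pid>/maps style text into sorted (start, end, perms) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        regions.append((start, end, fields[1]))
    return sorted(regions)

def free_gaps(regions):
    """Return (gap_start, gap_end) for each hole between consecutive mappings."""
    gaps = []
    for (_, prev_end, _), (next_start, _, _) in zip(regions, regions[1:]):
        if next_start > prev_end:
            gaps.append((prev_end, next_start))
    return gaps

# A few lines copied from the process map above.
SAMPLE = """\
08056000-08077000 rw-p 00000000 00:00 0
10000000-10ef4000 rw-s 00000000 00:04 4161540 /SYSV9ff53761 (deleted)
11000000-125fc000 rw-s 00000000 00:04 4194309 /SYSV00000000 (deleted)
50000000-54ecc000 rw-s 00000000 00:04 4227078 /SYSV00000000 (deleted)
b3ccb000-b3ecb000 r--p 00000000 08:08 2966002 /usr/lib/locale/locale-archive
"""

for start, end in free_gaps(parse_maps(SAMPLE)):
    print(f"{start:08x}-{end:08x}  {(end - start) >> 20} MiB free")
```

On this sample, the segment at 0x50000000 splits what would otherwise be one large hole between 0x125fc000 and 0xb3ccb000 into two smaller ones.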
1/3 of 4G address space is 0x50000000 which is right in the middle of our
shared memory attachment points! As we can see, the shared libs then jump down
to 0xb3... which is ok, but this breaks things up horribly for us. Is there a
reason that the shared libs in this case don't start their attachment point at
an address lower than the exe's?
are there any special LD_ASSUME_KERNEL settings used here ?
btw all non-PROT_EXEC mmaps will grow down from the stack (eg on a 3Gb kernel,
down from 0xbffff... ), PROT_EXEC mmaps (eg libraries, assuming ld.so mmaps them
with PROT_EXEC, which is why LD_ASSUME_KERNEL influences this) should use the
different 'below the binary' allocator.
(oh and setarch usage will impact this too)
------ Additional Comments From email@example.com 2003-26-09 11:51 -------
hmm. no LD_ASSUME_KERNEL isn't set in the environment. I don't think we have
setarch usage.. but the PROT_EXEC sounds familiar. I will look into that. I
suppose it's too late to ask for the mapped_base patch to be put back in?
basically yes ;(
but for this specific case I don't think it would have helped either btw; this
appears to use the non-PROT_EXEC allocator which doesn't use TASK_UNMAPPED_BASE
------ Additional Comments From firstname.lastname@example.org 2003-26-09 13:17 -------
I'm trying to confirm this, but I believe we do use the PROT_EXEC. I know the
changes we make with respect to mapped_base do work on RH AS 2.1.
------ Additional Comments From email@example.com 2003-03-10 13:28 -------
It looks like that's correct. We don't use PROT_EXEC.
Can I ask how the mapped_base patch in RHAS 2.1 worked if, from what you've said
earlier, it shouldn't have?
I don't fully understand what you mean exactly.
In AS2.1 all mmap allocations were allocated from TASK_UNMAPPED_BASE onwards,
which by default is set at 1Gb (1/3rd of VA). We made this a per process tunable
because losing all the VA below 1Gb (the brk space) was undesirable for databases.
In RHEL3 we don't break the VA space in the middle by default but make
non-PROT_EXEC mmaps grow downwards from the stack and put PROT_EXEC's below the
executable. This leaves the brk space vs mmap space undivided; TASK_UNMAPPED_BASE
isn't really relevant normally therefore.
------ Additional Comments From firstname.lastname@example.org 2003-03-10 16:23 -------
*** Bug 4741 has been marked as a duplicate of this bug. ***
------ Additional Comments From email@example.com 2003-06-10 12:53 -------
I think I should have looked at the process map that I sent you more closely,
and it looks like some stuff has changed since I originally opened this
bugzilla with respect to the shared library attachments.
So effectively we have 0x10000000 to 0xa0000000 available to use for shared
memory attaches... (everything just above the exec to just below the stack +
shared libs since we are not PROT_EXEC)
I apologize for some of these questions, but I need to make sure I understand
Oh, and one last thing, does the stack still start at 0xbfffffff or is it now
>(everything just above the exec to just below the stack +
>shared libs since we are not PROT_EXEC)
that is the general idea yes; this should be more than you had before
> Oh, and one last thing, does the stack still start at 0xbfffffff or is it now
this depends on which kernel you use. The kernel-smp kernel will have it start
at 0xbfff ... while the kernel-hugemem kernel will start at 4Gb minus a tiny bit.
------ Additional Comments From firstname.lastname@example.org 2003-15-10 10:17 -------
Thanks everyone. This issue can be closed. We are comfortable with the
information here, and it looks like this change on x86 is to our advantage.
> Thanks everyone. This issue can be closed. We are comfortable with the
> information here, and it looks like this change on x86 is to our advantage.
that was the goal of the change ;)
anyway closing on the Red Hat side
------ Additional Comments From email@example.com 2003-15-10 21:02 -------
Closing this bug!