The following has been reported by IBM LTC:
Question concerning the algorithm used for choosing where the shared libraries load
This is more a question than a bug. On RHEL 3 IA32, we noticed that the shared libraries no longer load at 0x40000000; instead they load in the space between 0x0 and the executable in the address space. My question is what happens when this space is filled. Do you have a specific place where you put these after the exec, or is it random? My concern is that this could cause a potential problem for DB2. We are still investigating the effects of this change.
Glen/Greg - a good question for Red Hat on RHEL3. Thanks.
Yvonne - does this impact DB2 at all?
Unless you're using prelink(8), it is the kernel that decides these addresses.
If this space is filled, we go to other free regions (in practice, TASK_UNMAPPED_BASE onwards). While this is suboptimal (e.g. it breaks up the big free area between the binary and the top of the stack/mmaps), it's a safe thing to do in a rare situation.
------ Additional Comments From yvchan.com 2003-23-09 10:05 ------- Is TASK_UNMAPPED_BASE the same as in previous kernels, and is it movable if need be?
TASK_UNMAPPED_BASE is still at 1/3 of virtual space; it is not movable. But with prelink YOU decide where libs get loaded; kernel policy only matters if there's no preference...
------ Additional Comments From yvchan.com 2003-23-09 10:59 ------- There are a couple of things I'd like to clarify first. From what I can tell of prelink, it would be run on the system after we install DB2? If this is the case, then we can *NOT* do this. It could cause DB2 to stop running should the customer update their system libraries for any reason. This would become a maintenance/support nightmare for us. Do the binaries need to be built with gcc 3.2+ to use this? If so, we cannot do this on x86 or IA64, since we use the Intel compiler.
You can run prelink once before you ship, provided you manually pick where the libs go. Whether it works with Intel's compiler I don't know; I have no information on how compatible that is with gcc.
------ Additional Comments From yvchan.com 2003-23-09 13:09 ------- OK. This is not an option then. We build on RH 7.2 and ship only 1 set of binaries per architecture. We do *NOT* re-build or plan to change any of this until v9, due to customer commitments. We cannot afford the effort to ship multiple binaries, due to both resource and support issues. Since we have 1 binary (per arch) and support quite a few distros with that one binary, we cannot use this option. We used to make use of the mapped_base file that existed in the /proc/<pid>/ directory. This has obviously been removed. Is there a reason for this?
------ Additional Comments From yvchan.com 2003-23-09 13:10 ------- One more thing to note: the mapped_base file was available in Red Hat Advanced Server 2.1.
mapped_base has not been added to RHEL3 because it is basically not needed; we expect(ed) that all libs would just not be in that area. Question: is this actually observed in practice, or is this just a theoretical "what if" thing?
For a next generation of your product it may even make sense to move the executable up a bit so that all libs fit below it for sure; that way the maximum space between binary and stack is available (e.g. a big mmap segment becomes possible).
------ Additional Comments From yvchan.com 2003-23-09 16:13 ------- We are looking at creating a scenario where this would happen. In our opinion it's not unlikely that this could cause us problems. Suppose a shared library comes along and attaches at TASK_UNMAPPED_BASE (1/3 of VM, i.e. 0x40000000?) because the section above the exe is full. We then come along, detect that the shared libs are higher than 0x30000000, and attach our shared memory starting there. (We make an assumption that if it's higher than 0x3, then the shared libs are at 0x20000000 -- I know, not the greatest, but we can fix this.) The problem is that we attach at 0x30000000, the shared lib attaches at 0x40000000, and we bump into it. (I assume in this scenario we SIGSEGV!) The other possibility is that we have our shared memory attached for, say, 1.5 G starting from 0x30000000, and we need more space for the libraries. What happens? Is the kernel smart enough to move the libs to below our shared memory segment, or do we get even weirder behaviour?
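The collision described in this comment is plain interval arithmetic. A minimal sketch in Python, assuming the 0x30000000 hint and the 1.5 G size from the comment; the one-page library mapping at 0x40000000 is a hypothetical stand-in, not something taken from a real process:

```python
GIB = 1 << 30

def overlaps(start_a, size_a, start_b, size_b):
    """True if [start_a, start_a + size_a) intersects [start_b, start_b + size_b)."""
    return start_a < start_b + size_b and start_b < start_a + size_a

shm_start = 0x30000000        # hint address mentioned in the comment
shm_size = GIB + GIB // 2     # the "1.5 G" shared memory example
lib_start = 0x40000000        # TASK_UNMAPPED_BASE on a 3 GB user space
lib_size = 0x1000             # hypothetical: a single 4 KiB library page

shm_end = shm_start + shm_size
print(hex(shm_end))                                         # 0x90000000
print(overlaps(shm_start, shm_size, lib_start, lib_size))   # True
```

A segment of that size attached at 0x30000000 would extend to 0x90000000, so anything placed at 0x40000000 falls inside it. Which side loses depends on who maps first.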
> TASK_UNMAPPED_BASE which is 1/3 of vm is 0x40000000(?)
It is that value for 3GB userspace; for 4GB userspace it's more (1/3rd more). The kernel will never map shared libraries in a place where something else already exists (and if you use dlopen(), glibc also has a big say in this, btw). My recommendation would be that if you map a large area, you either don't provide a hint address (i.e. let the kernel find a hole this big) or try to work from the stack downwards. In no case should MAP_FIXED be used for things like this, since it eradicates any existing mappings in that range.
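The "1/3 of user space" figures can be checked with a few lines of Python. This is only an arithmetic sketch: the 4 KiB page size and the rounding down to a page boundary are assumptions, and a real 4GB-userspace kernel has slightly less than a full 4 GB of user space, so its actual value may differ.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages on IA32

def unmapped_base(task_size):
    """One third of the user address space, rounded down to a page boundary."""
    return (task_size // 3) & ~(PAGE_SIZE - 1)

three_gb = unmapped_base(0xC0000000)    # 3 GB user space
four_gb = unmapped_base(0x100000000)    # idealised full 4 GB user space

print(hex(three_gb))   # 0x40000000
print(hex(four_gb))    # 0x55555000
```

The second value is about one third larger than the first, which matches the "1/3rd more" remark.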
------ Additional Comments From yvchan.com 2003-24-09 11:36 ------- We don't use MAP_FIXED as far as I know, since we use shmget/shmat for our shared memory mapping. The reason we use the hint address is so that all of our executables can have the shared memory attached at the same place. However, perhaps this won't be a problem. We have just gotten a test program that will either dlopen and/or just dynamically link the shared libs in. The shared libs load in the space before the executable, and then right after it. This is good news; however, it conflicts with your comment that the shared libs would continue to use TASK_UNMAPPED_BASE as the position to restart from. Comments?
------ Additional Comments From yvchan.com 2003-26-09 11:27 ------- *SIGH* we have just hit what I was afraid of. This is coming to us from the LDAP team and this is the kernel they are using:
Linux ldapdut009 2.4.21-3.ELsmp #1 SMP Fri Sep 19 14:06:12 EDT 2003 i686 i686 i386 GNU/Linux
This is a process mapping of the db2sysc process:
-----------------------------------------------------------------BEGIN
08048000-08051000 r-xp 00000000 08:05 359776 /home/ldapdb2/sqllib/adm/db2sysc
08051000-08056000 rw-p 00008000 08:05 359776 /home/ldapdb2/sqllib/adm/db2sysc
08056000-08077000 rw-p 00000000 00:00 0
10000000-10ef4000 rw-s 00000000 00:04 4161540 /SYSV9ff53761 (deleted)
11000000-125fc000 rw-s 00000000 00:04 4194309 /SYSV00000000 (deleted)
50000000-54ecc000 rw-s 00000000 00:04 4227078 /SYSV00000000 (deleted)
b3ccb000-b3ecb000 r--p 00000000 08:08 2966002 /usr/lib/locale/locale-archive
b3ecb000-b3f4c000 rw-p 00001000 00:00 0
b3f4c000-b416e000 rw-s 00000000 00:04 4128771 /SYSV9ff53774 (deleted)
b416e000-b41af000 rw-p 00000000 00:00 0
b41af000-b41ba000 r-xp 00000000 08:08 507940 /lib/libnss_files-2.3.2.so
b41ba000-b41bb000 rw-p 0000a000 08:08 507940 /lib/libnss_files-2.3.2.so
b41bb000-b41bc000 rw-p 00000000 00:00 0
b41bc000-b41c3000 r-xp 00000000 08:02 392488 /opt/IBM/db2/V8.1/lib/libdb2trcapi.so.1
b41c3000-b41c7000 rw-p 00006000 08:02 392488 /opt/IBM/db2/V8.1/lib/libdb2trcapi.so.1
b41c7000-b41c8000 rw-p 00000000 00:00 0
b41c8000-b41ed000 r-xp 00000000 08:02 392471 /opt/IBM/db2/V8.1/lib/libdb2genreg.so.1
b41ed000-b420b000 rw-p 00024000 08:02 392471 /opt/IBM/db2/V8.1/lib/libdb2genreg.so.1
b420b000-b4210000 rw-p 00000000 00:00 0
b4210000-b4221000 r-xp 00000000 08:02 392480 /opt/IBM/db2/V8.1/lib/libdb2locale.so.1
b4221000-b422e000 rw-p 00010000 08:02 392480 /opt/IBM/db2/V8.1/lib/libdb2locale.so.1
b422e000-b4230000 rw-p 00000000 00:00 0
b4230000-b4232000 r-xp 00000000 08:02 392477 /opt/IBM/db2/V8.1/lib/libdb2install.so.1
b4232000-b4233000 rw-p 00001000 08:02 392477 /opt/IBM/db2/V8.1/lib/libdb2install.so.1
b4233000-b423b000 r-xp 00000000 08:08 4096014 /lib/tls/librtkaio-2.3.2.so
b423b000-b423c000 rw-p 00007000 08:08 4096014 /lib/tls/librtkaio-2.3.2.so
b423c000-b4247000 rw-p 00000000 00:00 0
b4247000-b4249000 r-xp 00000000 08:08 507920 /lib/libdl-2.3.2.so
b4249000-b424a000 rw-p 00001000 08:08 507920 /lib/libdl-2.3.2.so
b424a000-b424f000 r-xp 00000000 08:08 507918 /lib/libcrypt-2.3.2.so
b424f000-b4250000 rw-p 00004000 08:08 507918 /lib/libcrypt-2.3.2.so
b4250000-b4277000 rw-p 00000000 00:00 0
b4277000-b43a8000 r-xp 00000000 08:08 4096006 /lib/tls/libc-2.3.2.so
b43a8000-b43ab000 rw-p 00130000 08:08 4096006 /lib/tls/libc-2.3.2.so
b43ab000-b43af000 rw-p 00000000 00:00 0
b43af000-b43fe000 r-xp 00000000 08:02 392454 /opt/IBM/db2/V8.1/lib/libcxa.so.1
b43fe000-b441d000 rw-p 0004e000 08:02 392454 /opt/IBM/db2/V8.1/lib/libcxa.so.1
b441d000-b443e000 r-xp 00000000 08:08 4096009 /lib/tls/libm-2.3.2.so
b443e000-b443f000 rw-p 00020000 08:08 4096009 /lib/tls/libm-2.3.2.so
b443f000-b4475000 r-xp 00000000 08:02 392482 /opt/IBM/db2/V8.1/lib/libdb2osse.so.1
b4475000-b44b8000 rw-p 00035000 08:02 392482 /opt/IBM/db2/V8.1/lib/libdb2osse.so.1
b44b8000-b44bc000 rw-p 00000000 00:00 0
b44bc000-b60d2000 r-xp 00000000 08:02 392531 /opt/IBM/db2/V8.1/lib/libdb2e.so.1
b60d2000-b7564000 rw-p 01c15000 08:02 392531 /opt/IBM/db2/V8.1/lib/libdb2e.so.1
b7564000-b75cd000 rw-p 00000000 00:00 0
b75cd000-b75da000 r-xp 00000000 08:08 4096011 /lib/tls/libpthread-0.59.so
b75da000-b75db000 rw-p 0000c000 08:08 4096011 /lib/tls/libpthread-0.59.so
b75db000-b75dd000 rw-p 00000000 00:00 0
b75ea000-b75eb000 rw-p 00001000 00:00 0
b75eb000-b7600000 r-xp 00000000 08:08 507907 /lib/ld-2.3.2.so
b7600000-b7601000 rw-p 00015000 08:08 507907 /lib/ld-2.3.2.so
bfff7000-c0000000 rwxp ffffa000 00:00 0
-----------------------------------------------------------------END
1/3 of 4G address space is 0x50000000 which is right in the middle of our shared memory attachment points!
As we can see, the shared libs then jump down to 0xb3... which is ok, but this breaks things up horribly for us. Is there a reason that the shared libs in this case don't start their attachment point at an address lower than the exe's?
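To see how such a map fragments the address space, the unmapped holes between regions can be listed with a short script. The sketch below is illustrative only: it parses a handful of lines copied from the db2sysc map in this report (not the whole map), and the helper names `parse_maps` and `free_gaps` are made up for this example.

```python
# A few lines copied from the db2sysc map in this report (not the full map).
SAMPLE_MAPS = """\
08048000-08051000 r-xp 00000000 08:05 359776 /home/ldapdb2/sqllib/adm/db2sysc
08051000-08056000 rw-p 00008000 08:05 359776 /home/ldapdb2/sqllib/adm/db2sysc
08056000-08077000 rw-p 00000000 00:00 0
10000000-10ef4000 rw-s 00000000 00:04 4161540 /SYSV9ff53761 (deleted)
11000000-125fc000 rw-s 00000000 00:04 4194309 /SYSV00000000 (deleted)
50000000-54ecc000 rw-s 00000000 00:04 4227078 /SYSV00000000 (deleted)
b3ccb000-b3ecb000 r--p 00000000 08:08 2966002 /usr/lib/locale/locale-archive
bfff7000-c0000000 rwxp ffffa000 00:00 0
"""

def parse_maps(text):
    """Return sorted (start, end, perms) tuples from /proc/<pid>/maps style text."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start_hex, end_hex = fields[0].split("-")
        regions.append((int(start_hex, 16), int(end_hex, 16), fields[1]))
    return sorted(regions)

def free_gaps(regions):
    """Return (gap_start, gap_end) for every hole between consecutive regions."""
    gaps = []
    for (_, prev_end, _), (next_start, _, _) in zip(regions, regions[1:]):
        if next_start > prev_end:
            gaps.append((prev_end, next_start))
    return gaps

regions = parse_maps(SAMPLE_MAPS)
gaps = free_gaps(regions)
largest = max(gaps, key=lambda gap: gap[1] - gap[0])

for gap_start, gap_end in gaps:
    print(f"{gap_start:08x}-{gap_end:08x}  {(gap_end - gap_start) >> 20} MiB free")
print("largest hole:", hex(largest[0]), "-", hex(largest[1]))
```

On this sample the largest hole runs from 0x54ecc000 to 0xb3ccb000, i.e. between the segment attached at 0x50000000 and the first mapping in the 0xb3... range. Pointing the same functions at the text of a live /proc/<pid>/maps file would give the equivalent picture for a running process.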
Are there any special LD_ASSUME_KERNEL settings used here?
Btw, all non-PROT_EXEC mmaps will grow down from the stack (e.g. on a 3GB kernel, down from 0xbffff...), while PROT_EXEC mmaps (e.g. libraries, assuming ld.so mmaps them with PROT_EXEC, which is why LD_ASSUME_KERNEL influences this) should use the separate 'below the binary' allocator.
(oh and setarch usage will impact this too)
------ Additional Comments From yvchan.com 2003-26-09 11:51 ------- Hmm. No, LD_ASSUME_KERNEL isn't set in the environment. I don't think we use setarch either, but the PROT_EXEC sounds familiar. I will look into that. I suppose it's too late to ask for the mapped_base patch to be put back in?
basically yes ;( but for this specific case I don't think it would have helped either btw; this appears to use the non-PROT_EXEC allocator which doesn't use TASK_UNMAPPED_BASE at all.
------ Additional Comments From yvchan.com 2003-26-09 13:17 ------- I'm trying to confirm this, but I believe we do use PROT_EXEC. I know the changes we make with respect to mapped_base do work on RH AS 2.1.
------ Additional Comments From yvchan.com 2003-03-10 13:28 ------- It looks like that's correct. We don't use PROT_EXEC. Can I ask how the mapped_base patch in RHAS 2.1 worked if, from what you've said earlier, it shouldn't have?
I don't fully understand what you mean. In AS2.1 all mmap allocations were made from TASK_UNMAPPED_BASE onwards, which by default is set at 1GB (1/3rd of VA). We made this a per-process tunable because losing all the VA below 1GB (the brk space) was undesirable for databases. In RHEL3 we don't break the VA space in the middle by default; instead we make non-PROT_EXEC mmaps grow downwards from the stack and put PROT_EXEC mappings below the executable. This leaves the brk space vs. mmap space undivided, so TASK_UNMAPPED_BASE normally isn't really relevant.
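The RHEL3 policy described here can be pictured with a toy model. The Python sketch below is not kernel code and does not claim to match the real allocator; it only mimics the three rules stated in this thread: PROT_EXEC mappings go bottom-up below the binary, they fall back to TASK_UNMAPPED_BASE onwards when that area is full, and other mappings go top-down from below the stack. The binary and stack addresses are taken from the db2sysc map in this report; `LOW_BASE` and the class itself are hypothetical.

```python
BINARY_START = 0x08048000        # db2sysc text start, from the map in this report
BINARY_END = 0x08077000          # end of the last db2sysc mapping in that map
STACK_LOW = 0xBFFF7000           # lowest stack address in that map
STACK_TOP = 0xC0000000
TASK_UNMAPPED_BASE = 0x40000000  # 1/3 of a 3 GB user space
LOW_BASE = 0x00100000            # hypothetical lowest address handed out

class ToyAddressSpace:
    """Toy model of the placement rules described in this thread."""

    def __init__(self):
        self.regions = [(BINARY_START, BINARY_END), (STACK_LOW, STACK_TOP)]

    def _first_fit_up(self, size, low, high):
        """Lowest address in [low, high) where `size` bytes fit, else None."""
        addr = low
        for start, end in sorted(self.regions):
            if end <= addr:
                continue              # region lies entirely below the cursor
            if start - addr >= size:
                break                 # the hole before this region is big enough
            addr = max(addr, end)     # skip past this region
        return addr if addr + size <= high else None

    def _first_fit_down(self, size, low, high):
        """Highest address in [low, high) where `size` bytes fit, else None."""
        top = high
        for start, end in sorted(self.regions, reverse=True):
            if start >= top:
                continue              # region lies entirely above the cursor
            if top - end >= size:
                break                 # the hole above this region is big enough
            top = min(top, start)     # drop below this region
        return top - size if top - size >= low else None

    def mmap(self, size, prot_exec):
        if prot_exec:
            addr = self._first_fit_up(size, LOW_BASE, BINARY_START)
            if addr is None:          # area below the binary is full
                addr = self._first_fit_up(size, TASK_UNMAPPED_BASE, STACK_LOW)
        else:
            addr = self._first_fit_down(size, BINARY_END, STACK_LOW)
        if addr is None:
            raise MemoryError("no hole large enough in the toy address space")
        self.regions.append((addr, addr + size))
        return addr

space = ToyAddressSpace()
lib_addr = space.mmap(0x200000, prot_exec=True)     # lands below the binary
data_addr = space.mmap(0x200000, prot_exec=False)   # lands just below the stack
big_addr = space.mmap(0x08000000, prot_exec=True)   # too big for the low area
print(hex(lib_addr), hex(data_addr), hex(big_addr))
```

In this model the large area between the end of the binary and the downward-growing mappings stays in one piece until an executable mapping no longer fits below the binary, which is the rare fallback case discussed earlier in the thread.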
------ Additional Comments From khoa.com 2003-03-10 16:23 ------- *** Bug 4741 has been marked as a duplicate of this bug. ***
------ Additional Comments From yvchan.com 2003-06-10 12:53 ------- I think I should have looked more closely at the process map that I sent you, and it looks like some stuff has changed with respect to the shared library attachments since I originally opened this bugzilla. So effectively we have 0x10000000 to 0xa0000000 available to use for shared memory attaches (everything from just above the exec to just below the stack + shared libs, since we are not PROT_EXEC). I apologize for some of these questions, but I need to make sure I understand this properly. Oh, and one last thing: does the stack still start at 0xbfffffff or is it now 0xffffffff?
> (everything just above the exec to just below the stack +
> shared libs since we are not PROT_EXEC)
That is the general idea, yes; this should be more than you had before.
> Oh, and one last thing, does the stack still start at 0xbfffffff or is it now
> 0xffffffff?
This depends on which kernel you use. The kernel-smp kernel will have it start at 0xbfff ... while the kernel-hugemem kernel will start at 4GB minus a tiny bit.
------ Additional Comments From yvchan.com 2003-15-10 10:17 ------- Thanks everyone. This issue can be closed. We are comfortable with the information here, and it looks like this change on x86 is to our advantage.
> Thanks everyone. This issue can be closed. We are comfortable with the
> information here, and it looks like this change on x86 is to our advantage.
That was the goal of the change ;) Anyway, closing on the Red Hat side.
------ Additional Comments From khoa.com 2003-15-10 21:02 ------- Closing this bug!