From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020607

Description of problem:
I have four machines; two are running the Pensacola beta and two are running 7.2. I have a problem when upgrading glibc on any of them. Each one currently has a different glibc version (2.2.4-13, 2.2.5-32, 2.2.90-8, 2.2.90-15), but they all share one problem: iconvconfig segfaults every time I try to run it. I searched Bugzilla and found nothing, so it has to be something specific to my setup. The machines are Intel or AMD, and the only thing I can imagine causing this is my kernel. All of them use 2.4.18 kernels with the grsecurity patch applied; I can attach a .config or anything else if needed.

The problem is bigger than just iconvconfig not working. iconvconfig is used by the post-install script of the glibc rpm. When it crashes, the script aborts, but that's not all: rpm aborts the installation at the point where all the files have been copied and the new package is installed, but the old one has not been removed from the database. As a result I end up with 5 glibc packages installed at a time, which cannot always be removed (when I had three versions of 2.2.4, I couldn't do an rpm -e of, e.g., glibc-2.2.4-1 or glibc-2.2.4-2, even though the files were actually claimed by glibc-2.2.4-3). This seems to be a bug in RPM (it should exit gracefully, leaving only one package installed). But RPM is not the cause of this behaviour; iconvconfig is. Please advise what I can do to confirm the behaviour or find its cause.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Be myself, I guess, since nobody has reported this before.

Actual Results:
<22> [root@bukszpryt ~]# iconvconfig (0:30) (pts/29 screen) (0)
Segmentation fault (core dumped)
<23> [root@bukszpryt ~]# gdb `which iconvconfig` core
[...]
(gdb) bt
#0 0x420c5dbb in twalk () from /lib/i686/libc.so.6
#1 0x08049b56 in mkstemp ()
#2 0x08048d04 in mkstemp ()
#3 0x42016644 in __libc_start_main () from /lib/i686/libc.so.6
<24> [root@bukszpryt ~]# strace iconvconfig (0:46) (pts/29 screen) (0)
execve("/usr/sbin/iconvconfig", ["iconvconfig"], [/* 20 vars */]) = 0
[...]
gettimeofday({1027637183, 628257}, NULL) = 0
getpid() = 27015
open("/usr/lib/gconv/gconv-modules.cache.03p25P", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
brk(0x8082000) = 0x8082000
brk(0x8083000) = 0x8083000
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

"[...]" marks output that I omitted because I didn't think it was useful.

Expected Results:
1. iconvconfig working
2. detecting "broken" hosts (if it is my kernel's fault) and handling them better (e.g. exiting only after removing the previous glibc installation)

Additional info:
see Actual Results
Can you
a) save the /usr/lib/gconv/gconv-modules and /usr/lib/gconv/gconv-modules.cache files somewhere,
b) check whether the bug goes away if you remove /usr/lib/gconv/gconv-modules.cache,
c) tell me whether gconv-modules has been modified by you,
d) mail me both files, so that I can try to reproduce it?
I have never seen this problem.
a, b) My gconv-modules.cache is 0 bytes. Removing it doesn't help. Removing gconv-modules gives me:
iconvconfig: cannot open `/usr/lib/gconv/gconv-modules': No such file or directory
iconvconfig: no output file produced because warning were issued
(as you can see, the program works up to a point). Touching gconv-modules (so that it is 0 bytes) results in a segmentation fault. It doesn't seem to parse the data from gconv-modules after opening it and before crashing.
c) No, never. I know nothing about iconv.
d) As I said before, it seems to depend on the kernel or some other component of my systems not provided by Red Hat (but I have no idea what that could be). I can take _any_ gconv-modules and reproduce the crash at any time (on my system).
I'll try to compile a debug version of iconvconfig today and see what exactly crashes it.
P.S. What is bug 67218?
I did it; here are the results:
(gdb) bt
#0 0x420c5dbb in twalk () from /lib/i686/libc.so.6
#1 0x0804a082 in write_output () at iconvconfig.c:1055
#2 0x08048e31 in main (argc=1, argv=0xbffffc24) at iconvconfig.c:323
#3 0x42016644 in __libc_start_main () from /lib/i686/libc.so.6
(gdb) f 1
#1 0x0804a082 in write_output () at iconvconfig.c:1055
1055   twalk (names, name_insert);
(gdb) l
1050      larger than the number of strings. */
1051   hash_size = next_prime (nnames * 1.4);
1052   hash_table = (struct hash_entry *) xcalloc (hash_size,
1053                                               sizeof (struct hash_entry));
1054   /* Fill the hash table. */
1055   twalk (names, name_insert);
1056
1057   /* Create the section for the module list. */
1058   module_table = (struct module_entry *) xcalloc (sizeof (struct module_entry),
1059                                                   nname_info);
(gdb) print names
$1 = (void *) 0x804ed30

Not much information there, so I did
$ setenv LD_LIBRARY_PATH /usr/lib/debug
Debugging then gave me this:
Program received signal SIGSEGV, Segmentation fault.
0x001e8dbb in __twalk (vroot=0x806ff50, action=0xbffffb20) at tsearch.c:604
604       (*action) (root, preorder, level);
(gdb) print action
$1 = (void (*)(const void *, enum {...}, int)) 0xbffffb20
(gdb) print root
$2 = (struct hdr *) 0x0
(gdb) print preorder
$3 = preorder
(gdb) print level
No symbol "level" in current context.
(gdb) l
599
600   if (root->left == NULL && root->right == NULL)
601     (*action) (root, leaf, level);
602   else
603     {
604       (*action) (root, preorder, level);
605       if (root->left != NULL)
606         trecurse (root->left, action, level + 1);
607       (*action) (root, postorder, level);
608       if (root->right != NULL)

I ran rpm -bb on the spec from glibc-2.2.90-15.src.rpm. It failed with a msgfmt error because of my iconv problems and the deletion of files, but the source should be identical to that of my (RedHat) debug glibc (after %prep with patches applied). The machine I'm debugging on is an AMD Athlon, but the others with the same problem are Intel Celerons. This is all the information I could get.
Before debugging I copied /usr/lib/gconv from glibc-2.2.90-15.i686.rpm, so that no file there came from anywhere but RedHat. That's why I think my kernel is at fault somewhere along the way. I'm willing to provide any information you want; I can even give you an account on one of my systems. One last thought that came to mind: how can I check which gcc compiled the glibc installed on a particular machine?
I knew it! Solar Designer's non-executable stack patch (included in grsecurity, which I'm using on all my systems) is the cause of my problem. I downloaded the original OpenWall Linux patch, compiled chstk and:
[root@bukszpryt #chstk]$ ./chstk -d /usr/sbin/iconvconfig
[root@bukszpryt #chstk]$ ./chstk -v /usr/sbin/iconvconfig
/usr/sbin/iconvconfig: Non-executable stack area
[root@bukszpryt #chstk]$ iconvconfig
Segmentation fault (core dumped)
[root@bukszpryt #chstk]$ ./chstk -e /usr/sbin/iconvconfig
[root@bukszpryt #chstk]$ ./chstk -v /usr/sbin/iconvconfig
/usr/sbin/iconvconfig: Executable stack area
[root@bukszpryt #chstk]$ iconvconfig
[root@bukszpryt #chstk]$ ./chstk -d /usr/sbin/iconvconfig
[root@bukszpryt #chstk]$ ./chstk -v /usr/sbin/iconvconfig
/usr/sbin/iconvconfig: Non-executable stack area
[root@bukszpryt #chstk]$ iconvconfig
Segmentation fault (core dumped)

It works with an executable stack. I don't know if you consider this a glibc bug. It isn't a gcc trampoline, so I don't think there is any workaround other than disabling the non-executable stack patch. One solution would be to set F_STACKEXEC in the distribution's /usr/sbin/iconvconfig. I don't know whether any other (system) programs use twalk() this way, or otherwise call code on the stack. This iconvconfig problem prevents glibc from being upgraded on any system with a non-executable stack area, and I know people using RedHat systems with kernels patched this way.
Why do you think it is not a gcc trampoline? The only twalk which might cause the problem is in

static int
write_output (void)
{
  ...
  /* Function to insert the names. */
  static void name_insert (const void *nodep, VISIT value, int level)
    {
      ...
    }
  ...
  twalk (names, name_insert);
  ...
}

so it clearly is a gcc trampoline. As glibc uses nested functions quite often, I'd guess that if gcc trampolines don't work with Solar's patches, then you cannot use glibc at all...
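For illustration only, here is a minimal sketch of the same twalk pattern written with an ordinary file-scope callback, which needs no trampoline because the callback never touches an enclosing stack frame. This is not the iconvconfig code and makes no claim about how glibc itself was later changed; the names compare_names, count_name, free_tree and count_distinct_names are hypothetical, and only tsearch, twalk and tdelete from <search.h> are real library calls.

```c
#include <assert.h>
#include <search.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical walk state. It lives at file scope, so the callback
   below does not need access to the caller's stack frame.  */
static size_t visited_count;
static size_t visited_length;

/* Comparison function for tsearch/tdelete: keys are C strings.  */
static int
compare_names (const void *a, const void *b)
{
  return strcmp ((const char *) a, (const char *) b);
}

/* Ordinary top-level callback. Taking its address does not make GCC
   build a trampoline on the stack.  */
static void
count_name (const void *nodep, VISIT which, int depth)
{
  (void) depth;
  /* Count each node once: leaves, and inner nodes on their second
     (in-order) visit, which <search.h> calls postorder.  */
  if (which == leaf || which == postorder)
    {
      const char *name = *(const char *const *) nodep;
      visited_count++;
      visited_length += strlen (name);
    }
}

/* Release all tree nodes by repeatedly deleting the root's key.
   A node's first member is the key pointer.  */
static void
free_tree (void **rootp)
{
  while (*rootp != NULL)
    tdelete (*(const void *const *) *rootp, rootp, compare_names);
}

/* Insert N names into a tree, walk it, and return the number of
   distinct names; their summed length goes to *TOTAL_LENGTH.
   Returns (size_t) -1 if tsearch cannot allocate a node.  */
size_t
count_distinct_names (const char *const *names, size_t n,
                      size_t *total_length)
{
  void *root = NULL;

  assert (total_length != NULL);
  for (size_t i = 0; i < n; i++)
    if (tsearch (names[i], &root, compare_names) == NULL)
      {
        free_tree (&root);
        return (size_t) -1;
      }

  visited_count = 0;
  visited_length = 0;
  twalk (root, count_name);
  free_tree (&root);

  *total_length = visited_length;
  return visited_count;
}
```

The trade-off in this sketch is that the file-scope counters make the walk non-reentrant, which is the price of avoiding a nested function with the plain twalk interface.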
There is an option for emulating and detecting gcc trampolines, but the FAQ and Configure.help say that only glibc 2.0 is affected. Everything except iconvconfig works well without trampolines on my glibc versions from 2.2.4 to 2.2.90. I will try to enable gcc trampoline detection later today and see if that helps. If that turns out to be the cause, I'm going to write to the openwall and grsecurity mailing lists and request a documentation update :) Still, I think the post-install script should be more error-aware and not mess up my rpmdb in that kind of situation.
Any update on this? Does enabling the build option work for you?
I'm terribly sorry. I haven't had time to investigate the problem more deeply, but in between work I did compile a 2.4.19 kernel with grsecurity and gcc trampoline detection turned on. I used it on one machine, running the RedHat Limbo beta, and iconvconfig works there. It needs more testing on other boxes, but it seems to me that iconvconfig is the first program using gcc trampolines that I have ever seen. I will keep that in mind for future kernel upgrades on my other machines. If it helps, someone should tell Solar Designer and the grsecurity team to update Configure.help for the patch.

Again, this is not only the kernel's problem. I can still make iconvconfig fail during installation of glibc (e.g. by giving it a broken gconv-modules; rpm will most likely write gconv-modules.rpmnew in that case, and iconvconfig should fail), and I'll end up with many glibc versions installed.

I've asked before, but what is this dependent bug? "Sorry; you do not have the permissions necessary to see bug 67218.". Sigh.
Well, I still think "NOTABUG" is the wrong resolution, especially since the glibc-2.3.2-* packages now include an iconvconfig without trampolines. If it wasn't a problem, why would Red Hat change it? Thanks anyhow.