Description of problem:
My locale:
LANG=zh_CN
LC_CTYPE=zh_CN
LC_NUMERIC="zh_CN"
LC_TIME=en_US
LC_COLLATE=en_US
LC_MONETARY="zh_CN"
LC_MESSAGES=en_US
LC_PAPER="zh_CN"
LC_NAME="zh_CN"
LC_ADDRESS="zh_CN"
LC_TELEPHONE="zh_CN"
LC_MEASUREMENT="zh_CN"
LC_IDENTIFICATION="zh_CN"
LC_ALL=

When I run "grep -i ANYWORD /etc/* -r", it always outputs:
grep: /etc/aep/aeptarg.bin: Permission denied
grep: /etc/alchemist/namespace/printconf/local.adl: Permission denied
grep: /etc/alchemist/namespace/system-config-httpd/local.adl: Permission denied
grep: /etc/alchemist/namespace/system-config-httpd/rpm.adl: Permission denied
grep: /etc/alchemist/switchboard/system-config-httpd.switchboard.adl: Permission denied
grep: /etc/aliases.db: Permission denied
Segmentation fault

I have installed grep-debuginfo-2.5.1-31.i386.rpm, and when I run "gdb --args grep -i ANYWORD /etc/* -r; run; bt", its output is:
Program received signal SIGSEGV, Segmentation fault.
0x00beb0cc in memcpy () from /lib/tls/libc.so.6
(gdb) bt
#0 0x00beb0cc in memcpy () from /lib/tls/libc.so.6
#1 0x00c16738 in build_wcs_upper_buffer () from /lib/tls/libc.so.6
#2 0xbff42600 in ?? ()
#3 0xffffffff in ?? ()
#4 0xbff42940 in ?? ()
#5 0x00ca6ff4 in ?? () from /lib/tls/libc.so.6
#6 0x09c420a8 in ?? ()
#7 0x09c42090 in ?? ()
#8 0xbff42678 in ?? ()
#9 0x00bf013b in mbrtowc () from /lib/tls/libc.so.6
Previous frame inner to this frame (corrupt stack?)

Version-Release number of selected component (if applicable):
grep-2.5.1-31.4

How reproducible:

Steps to Reproduce:
1. Set the locale as:
LANG=zh_CN
LC_CTYPE=zh_CN
LC_NUMERIC="zh_CN"
LC_TIME=en_US
LC_COLLATE=en_US
LC_MONETARY="zh_CN"
LC_MESSAGES=en_US
LC_PAPER="zh_CN"
LC_NAME="zh_CN"
LC_ADDRESS="zh_CN"
LC_TELEPHONE="zh_CN"
LC_MEASUREMENT="zh_CN"
LC_IDENTIFICATION="zh_CN"
LC_ALL=
2. Run "grep -i anyword /etc/* -r"
3.

Actual results:
Segmentation fault

Expected results:

Additional info:
Setting the locale to something else, such as "zh_CN.utf-8", does not seem to trigger the problem.
I think you must mean Fedora Core 3, not test3.
grep-debuginfo-2.5.1-31.i386.rpm is the wrong debuginfo package, since you have 2.5.1-31.4 (not 2.5.1-31) installed. As a result the backtrace is not as useful as it could be with the correct debuginfo package installed.

http://download.fedora.redhat.com/pub/fedora/linux/core/updates/3/i386/debug/grep-debuginfo-2.5.1-31.4.i386.rpm is the correct package.

I cannot reproduce the problem here, most likely because my /etc directory differs from yours. A particular file in the /etc directory structure will be causing the problem. Please try to narrow it down to that file so that we can both see the problem. One way to do it is to run 'strace -eopen grep ...' and see which file was opened last.
Yes, it's FC3, not test3.

I have installed grep-debuginfo-2.5.1-34.4.i386.rpm. When I run gdb, it prints:

Program received signal SIGSEGV, Segmentation fault.
0x00beb0cc in memcpy () from /lib/tls/libc.so.6
(gdb) bt
#0 0x00beb0cc in memcpy () from /lib/tls/libc.so.6
#1 0x00c16738 in build_wcs_upper_buffer () from /lib/tls/libc.so.6
#2 0xbfe78590 in ?? ()
#3 0xffffffff in ?? ()
#4 0xbfe788d0 in ?? ()
#5 0x00ca6ff4 in ?? () from /lib/tls/libc.so.6
#6 0x0968a0a8 in ?? ()
#7 0x0968a090 in ?? ()
#8 0xbfe78608 in ?? ()
#9 0x00bf013b in mbrtowc () from /lib/tls/libc.so.6
Previous frame inner to this frame (corrupt stack?)
(gdb)

When I run "strace -eopen grep -i scim /etc/* -r", it prints:

.....
open("/etc/alsa/sndo-mixer.alisp", O_RDONLY|O_LARGEFILE) = 3
open("/etc/alternatives", O_RDONLY|O_LARGEFILE) = 3
open("/etc/alternatives", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
open("/etc/alternatives/mta", O_RDONLY|O_LARGEFILE) = 3
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

When I run "strace -eopen grep -i aaa /etc/* -r", it prints:

....
Binary file /etc/alternatives/mta-mailq matches
open("/etc/alternatives/mta-newaliases", O_RDONLY|O_LARGEFILE) = 3
Binary file /etc/alternatives/mta-newaliases matches
open("/etc/alternatives/mta-rmail", O_RDONLY|O_LARGEFILE) = 3
open("/etc/alternatives/mta-mailqman", O_RDONLY|O_LARGEFILE) = 3
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Created attachment 112068
The file that causes the problem
Created attachment 112069
The file that causes the problem
I can repeat this by saving the attachment from comment #4 ("mta") and running:

LANG=zh_CN grep -i scim mta

As far as I can tell, though, this is a glibc problem. Here is the backtrace, FWIW:

#0 0x0066916c in ?? () from /lib/tls/libc.so.6
#1 0x00697e2e in build_wcs_upper_buffer (pstr=0x8406fff) at regex_internal.c:422
422 memcpy (pstr->mbs + byte_idx, buf, mbcdlen);
#2 0x006981bb in re_string_reconstruct (pstr=0xbff00660, idx=102, eflags=Variable "eflags" is not available.
) at regex_internal.c:724
724 int ret = build_wcs_upper_buffer (pstr);
#3 0x006a6252 in re_search_internal (preg=0x83e70e0, string=0x83eeb70 "", length=Variable "length" is not available.
) at regexec.c:789
789 err = re_string_reconstruct (&mctx.input, match_first, eflags);
#4 0x006a7586 in re_search_stub (bufp=0x83e70e0, string=0x83eeb70 "", length=1652, start=0, range=1652, stop=-1074788768, regs=0x83e7100, ret_len=0) at regexec.c:444
444 result = re_search_internal (bufp, string, length, start, range, stop,
#5 0x006a78e3 in __re_search (bufp=0xbff00660, string=0xffffffff <Address 0xffffffff out of bounds>, length=1073709573, start=0, range=0, regs=0x0) at regexec.c:314
314 return re_search_stub (bufp, string, length, start, range, length, regs, 0);
#6 0x08057604 in EGexecute (buf=0x83e8fa8 "", size=32823, match_size=0xbff0087c, exact=0) at search.c:495
495 if (0 <= (start = re_search (&(patterns[i].regexbuf), beg,
#7 0x0804ac6d in grepbuf (beg=0x83e8fa8 "", lim=0x83f0fdf "") at grep.c:691
691 while ((match_offset = (*execute) (p, lim - p, &match_size, 0)) != (size_t) -1)
#8 0x0804afab in grep (fd=6, file=0xbff80582 "/tmp/tim/mta", stats=0x805d300) at grep.c:810
#9 0x0804b34d in grepfile (file=0xbff80582 "/tmp/tim/mta", stats=0x805d300) at grep.c:928
#10 0x0804caf0 in main (argc=4, argv=0xbff00aa4) at grep.c:1742
Note: to actually get this backtrace I had to force gdb to set the LANG environment variable correctly, like this:

gdb --args ...
(gdb) break main
...
(gdb) print setenv("LANG","zh_CN",1)
0
(gdb) continue
This is either a bug in regex or in the zh_CN locale. The problem is that GB2312 contains this character:

<U011B> /xa8/xa7 LATIN SMALL LETTER E WITH CARON

towupper((wchar_t) 0x11B) is 0x11A, i.e. LATIN CAPITAL LETTER E WITH CARON, but that character is not represented in the GB2312 charset.

Uli/Roland, shouldn't towupper etc. only return characters valid in the particular locale's charset, or is it OK if, for a char from the charset, towupper returns a wide character that has no multi-byte representation?

If towupper is right, then the bug is in build_wcs_upper_buffer, which doesn't take that possibility into account and assumes that if mbrtowc succeeded, wcrtomb of the towupper'ed char will succeed too. In that case I'll create a testcase and fix it.
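The assumption questioned above can be written out as code. This is a minimal sketch under stated assumptions, not the glibc patch: upcase_one_mb is a hypothetical helper that upper-cases one multibyte character and only uses the converted bytes when wcrtomb() succeeds and yields the same length as the input. Otherwise it copies the original bytes unchanged. That fallback policy is chosen purely for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: upper-case the first multibyte character of SRC
   (at most N bytes long) into DST, which must have room for N bytes.
   Returns the number of bytes consumed, or 0 on an invalid sequence,
   an incomplete sequence, or a NUL character. */
size_t upcase_one_mb(char *dst, const char *src, size_t n)
{
    mbstate_t in_state, out_state;
    wchar_t wc;
    char buf[MB_LEN_MAX];
    size_t mblen, outlen;

    assert(dst != NULL && src != NULL);
    memset(&in_state, 0, sizeof in_state);
    memset(&out_state, 0, sizeof out_state);

    mblen = mbrtowc(&wc, src, n, &in_state);
    if (mblen == (size_t)-1 || mblen == (size_t)-2 || mblen == 0)
        return 0;

    /* wcrtomb() returns (size_t)-1 when the wide character has no
       multibyte form in the current locale, e.g. U+011A in GB2312.
       Passing that value on as a memcpy length is presumably what
       the crash in memcpy corresponds to. */
    outlen = wcrtomb(buf, (wchar_t)towupper((wint_t)wc), &out_state);
    if (outlen == mblen)
        memcpy(dst, buf, outlen);   /* round trip succeeded */
    else
        memcpy(dst, src, mblen);    /* assumed fallback: keep original bytes */

    return mblen;
}
```

The locale is whatever the caller has selected with setlocale(). Under zh_CN, the two bytes 0xa8 0xa7 would be expected to take the fallback branch, since their upper-case form cannot be encoded.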
http://sources.redhat.com/ml/libc-hacker/2005-03/msg00046.html
Should be fixed in glibc-2.3.4-16.