Bug 151215 - grep Segmentation fault
Summary: grep Segmentation fault
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: 3
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-03-16 01:53 UTC by han pingtian
Modified: 2007-11-30 22:11 UTC (History)

Fixed In Version: 2.3.4-16
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-03-20 15:16:50 UTC
Type: ---
Embargoed:


Attachments
The file cause the problem (715.19 KB, application/octet-stream)
2005-03-17 01:55 UTC, han pingtian
The file cause the problem (1.33 KB, application/octet-stream)
2005-03-17 01:57 UTC, han pingtian

Description han pingtian 2005-03-16 01:53:58 UTC
Description of problem:
My locale:
LANG=zh_CN
LC_CTYPE=zh_CN
LC_NUMERIC="zh_CN"
LC_TIME=en_US
LC_COLLATE=en_US
LC_MONETARY="zh_CN"
LC_MESSAGES=en_US
LC_PAPER="zh_CN"
LC_NAME="zh_CN"
LC_ADDRESS="zh_CN"
LC_TELEPHONE="zh_CN"
LC_MEASUREMENT="zh_CN"
LC_IDENTIFICATION="zh_CN"
LC_ALL=
When I run "grep -i ANYWORD /etc/* -r", it always outputs:
grep: /etc/aep/aeptarg.bin: Permission denied
grep: /etc/alchemist/namespace/printconf/local.adl: Permission denied
grep: /etc/alchemist/namespace/system-config-httpd/local.adl: Permission denied
grep: /etc/alchemist/namespace/system-config-httpd/rpm.adl: Permission denied
grep: /etc/alchemist/switchboard/system-config-httpd.switchboard.adl: Permission denied
grep: /etc/aliases.db: Permission denied
Segmentation fault

I have installed grep-debuginfo-2.5.1-31.i386.rpm. Running
"gdb --args grep -i ANYWORD /etc/* -r", then "run" and "bt" at the gdb prompt, prints:
"Program received signal SIGSEGV, Segmentation fault.
0x00beb0cc in memcpy () from /lib/tls/libc.so.6
(gdb) bt
#0  0x00beb0cc in memcpy () from /lib/tls/libc.so.6
#1  0x00c16738 in build_wcs_upper_buffer () from /lib/tls/libc.so.6
#2  0xbff42600 in ?? ()
#3  0xffffffff in ?? ()
#4  0xbff42940 in ?? ()
#5  0x00ca6ff4 in ?? () from /lib/tls/libc.so.6
#6  0x09c420a8 in ?? ()
#7  0x09c42090 in ?? ()
#8  0xbff42678 in ?? ()
#9  0x00bf013b in mbrtowc () from /lib/tls/libc.so.6
Previous frame inner to this frame (corrupt stack?)"

Version-Release number of selected component (if applicable):
grep-2.5.1-31.4

How reproducible:


Steps to Reproduce:
1. Set the locale as:
LANG=zh_CN
LC_CTYPE=zh_CN
LC_NUMERIC="zh_CN"
LC_TIME=en_US
LC_COLLATE=en_US
LC_MONETARY="zh_CN"
LC_MESSAGES=en_US
LC_PAPER="zh_CN"
LC_NAME="zh_CN"
LC_ADDRESS="zh_CN"
LC_TELEPHONE="zh_CN"
LC_MEASUREMENT="zh_CN"
LC_IDENTIFICATION="zh_CN"
LC_ALL=

2. Run "grep -i anyword /etc/* -r"
  
Actual results:
Segmentation fault

Expected results:


Additional info:
Setting the locale to something else, such as "zh_CN.utf-8", does not seem
to trigger the problem.

Comment 1 Tim Waugh 2005-03-16 09:38:05 UTC
I think you must mean Fedora Core 3, not test3.

Comment 2 Tim Waugh 2005-03-16 09:42:40 UTC
grep-debuginfo-2.5.1-31.i386.rpm is the wrong debuginfo package, since you have
2.5.1-31.4 (not 2.5.1-31) installed.  As a result the backtrace is not as useful
as it could be with the correct debuginfo package installed.

http://download.fedora.redhat.com/pub/fedora/linux/core/updates/3/i386/debug/grep-debuginfo-2.5.1-31.4.i386.rpm
is the correct package.

I cannot reproduce the problem here, most likely as a result of my /etc
directory being different to yours.  It will be a particular file in the /etc
directory structure that is causing the problem.  Please try to narrow it down
to that particular file so that we can both see the problem.

One way to do it is to run 'strace -eopen grep ...' and see the last file opened.
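
As an alternative to reading strace output, the narrowing step can be sketched in C: run grep on one file at a time and check whether the child process was killed by SIGSEGV. This is only an illustrative sketch. The helper names run_and_get_signal and grep_crashes_on are hypothetical, and the child inherits the caller's environment, so LANG=zh_CN is assumed to be set already.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments and wait for it to finish.
 * Returns the terminating signal number if the child was killed by a
 * signal, 0 if it exited normally (with any exit status), and -1 if
 * fork() or waitpid() failed. */
int run_and_get_signal(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Hypothetical helper: grep a single file case-insensitively (-i),
 * without output (-q) or error messages (-s), and report whether grep
 * died from SIGSEGV.  Returns 1 for a crash, 0 otherwise. */
int grep_crashes_on(const char *pattern, const char *path)
{
    char *const argv[] = {
        "grep", "-i", "-q", "-s", "--",
        (char *)pattern, (char *)path, NULL
    };
    return run_and_get_signal(argv) == SIGSEGV;
}
```

Calling grep_crashes_on("scim", path) for each candidate path under /etc would single out the file (or files) that trigger the crash.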

Comment 3 han pingtian 2005-03-17 01:39:18 UTC
Yes, it's FC3, not test3

I have installed grep-debuginfo-2.5.1-34.4.i386.rpm. Running gdb now prints:
Program received signal SIGSEGV, Segmentation fault.
0x00beb0cc in memcpy () from /lib/tls/libc.so.6
(gdb) bt
#0  0x00beb0cc in memcpy () from /lib/tls/libc.so.6
#1  0x00c16738 in build_wcs_upper_buffer () from /lib/tls/libc.so.6
#2  0xbfe78590 in ?? ()
#3  0xffffffff in ?? ()
#4  0xbfe788d0 in ?? ()
#5  0x00ca6ff4 in ?? () from /lib/tls/libc.so.6
#6  0x0968a0a8 in ?? ()
#7  0x0968a090 in ?? ()
#8  0xbfe78608 in ?? ()
#9  0x00bf013b in mbrtowc () from /lib/tls/libc.so.6
Previous frame inner to this frame (corrupt stack?)
(gdb)

Running "strace -eopen grep -i scim /etc/* -r" prints:
.....
open("/etc/alsa/sndo-mixer.alisp", O_RDONLY|O_LARGEFILE) = 3
open("/etc/alternatives", O_RDONLY|O_LARGEFILE) = 3
open("/etc/alternatives", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
open("/etc/alternatives/mta", O_RDONLY|O_LARGEFILE) = 3
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

Running "strace -eopen grep -i aaa /etc/* -r" prints:
....
Binary file /etc/alternatives/mta-mailq matches
open("/etc/alternatives/mta-newaliases", O_RDONLY|O_LARGEFILE) = 3
Binary file /etc/alternatives/mta-newaliases matches
open("/etc/alternatives/mta-rmail", O_RDONLY|O_LARGEFILE) = 3
open("/etc/alternatives/mta-mailqman", O_RDONLY|O_LARGEFILE) = 3
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

Comment 4 han pingtian 2005-03-17 01:55:48 UTC
Created attachment 112068 [details]
The file cause the problem

Comment 5 han pingtian 2005-03-17 01:57:43 UTC
Created attachment 112069 [details]
The file cause the problem

Comment 6 Tim Waugh 2005-03-17 16:28:04 UTC
I can repeat this by saving the attachment from comment #4 ("mta") and running:

LANG=zh_CN grep -i scim mta

As far as I can tell, though, this is a glibc problem.  Here is the backtrace, FWIW:

#0  0x0066916c in ?? () from /lib/tls/libc.so.6
#1  0x00697e2e in build_wcs_upper_buffer (pstr=0x8406fff)
    at regex_internal.c:422
422                         memcpy (pstr->mbs + byte_idx, buf, mbcdlen);

#2  0x006981bb in re_string_reconstruct (pstr=0xbff00660, idx=102,
eflags=Variable "eflags" is not available.
)
    at regex_internal.c:724
724               int ret = build_wcs_upper_buffer (pstr);

#3  0x006a6252 in re_search_internal (preg=0x83e70e0, string=0x83eeb70 "",
    length=Variable "length" is not available.
) at regexec.c:789
789           err = re_string_reconstruct (&mctx.input, match_first, eflags);

#4  0x006a7586 in re_search_stub (bufp=0x83e70e0, string=0x83eeb70 "",
    length=1652, start=0, range=1652, stop=-1074788768, regs=0x83e7100,
    ret_len=0) at regexec.c:444
444       result = re_search_internal (bufp, string, length, start, range, stop,

#5  0x006a78e3 in __re_search (bufp=0xbff00660,
    string=0xffffffff <Address 0xffffffff out of bounds>, length=1073709573,
    start=0, range=0, regs=0x0) at regexec.c:314
314       return re_search_stub (bufp, string, length, start, range, length,
regs, 0);

#6  0x08057604 in EGexecute (buf=0x83e8fa8 "", size=32823,
    match_size=0xbff0087c, exact=0) at search.c:495
495               if (0 <= (start = re_search (&(patterns[i].regexbuf), beg,

#7  0x0804ac6d in grepbuf (beg=0x83e8fa8 "", lim=0x83f0fdf "") at grep.c:691
691       while ((match_offset = (*execute) (p, lim - p, &match_size, 0)) !=
(size_t) -1)

#8  0x0804afab in grep (fd=6, file=0xbff80582 "/tmp/tim/mta", stats=0x805d300)
    at grep.c:810
#9  0x0804b34d in grepfile (file=0xbff80582 "/tmp/tim/mta", stats=0x805d300)
    at grep.c:928
#10 0x0804caf0 in main (argc=4, argv=0xbff00aa4) at grep.c:1742


Comment 7 Tim Waugh 2005-03-17 16:48:35 UTC
Note: to actually get this backtrace I had to force gdb to set the LANG
environment variable correctly, like this:

gdb --args ...
(gdb) break main
...
(gdb) print setenv("LANG","zh_CN",1)
0
(gdb) continue
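
The setenv call above presumably takes effect because the breakpoint at main stops the program before it has selected its locale, and a later setlocale call with an empty locale name reads the environment at that moment. The sketch below illustrates that ordering. The function name lang_selects_ctype is hypothetical, and the assumption that grep picks its locale this way is not confirmed in this report.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Set LANG, clear the variables that would override it for LC_CTYPE,
 * then let the C library choose the locale from the environment.
 * Returns 1 if the resulting LC_CTYPE locale name equals `expected`,
 * and 0 otherwise (including when the locale is not installed).
 * Note: this changes the process environment and the global locale. */
int lang_selects_ctype(const char *lang, const char *expected)
{
    unsetenv("LC_ALL");
    unsetenv("LC_CTYPE");
    if (setenv("LANG", lang, 1) != 0)
        return 0;

    /* An empty locale name means "take the locale from the environment". */
    const char *name = setlocale(LC_CTYPE, "");
    return name != NULL && strcmp(name, expected) == 0;
}
```

On a system with the zh_CN locale installed, lang_selects_ctype("zh_CN", "zh_CN") would be expected to return 1, which mirrors what the gdb session arranges before continuing.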


Comment 8 Jakub Jelinek 2005-03-17 18:13:24 UTC
This is either a bug in regex, or in zh_CN locale.
The problem is that in GB2312 there is a:
<U011B>     /xa8/xa7     LATIN SMALL LETTER E WITH CARON
character.  towupper((wchar_t) 0x11B) is 0x11A, i.e.
LATIN CAPITAL LETTER E WITH CARON
But this character is not represented in the GB2312 charset.

Uli/Roland, shouldn't towupper etc. only return characters valid in the
particular locale's charset, or is it ok if for a char from the charset
towupper returns a wide character that has no multi-byte representation?

If towupper is right, then the bug is in build_wcs_upper_buffer,
where it doesn't take that possibility into account and assumes if mbrtowc
succeeded, that wcrtomb of towupper'ed char will succeed too.
In that case I'll create a testcase and fix it.
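
The failure mode described here can be checked from user code with a small sketch: decode one multibyte character, uppercase it with towupper, and see whether wcrtomb can encode the result in the same locale. The enum and the function name check_upper_roundtrip are hypothetical. If wcrtomb's (size_t)-1 error value is then used as a memcpy length, as the backtrace at regex_internal.c:422 suggests, a crash like the one reported would be the likely outcome.

```c
#define _POSIX_C_SOURCE 200809L
#include <limits.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

enum upper_status {
    UPPER_OK = 0,           /* uppercased character converts back to bytes */
    UPPER_UNENCODABLE = 1,  /* towupper result has no multibyte form here  */
    UPPER_NO_LOCALE = -1,   /* requested locale is not available           */
    UPPER_BAD_INPUT = -2    /* input is not one complete, non-NUL character */
};

/* Switch LC_CTYPE to `locale`, decode the first character of `bytes`,
 * uppercase it, and try to encode the uppercase form again.
 * Note: this changes the global locale of the process. */
enum upper_status check_upper_roundtrip(const char *locale,
                                        const char *bytes, size_t len)
{
    if (setlocale(LC_CTYPE, locale) == NULL)
        return UPPER_NO_LOCALE;

    mbstate_t st;
    wchar_t wc;
    memset(&st, 0, sizeof st);
    size_t n = mbrtowc(&wc, bytes, len, &st);
    if (n == (size_t)-1 || n == (size_t)-2 || n == 0)
        return UPPER_BAD_INPUT;

    char buf[MB_LEN_MAX];
    memset(&st, 0, sizeof st);
    size_t m = wcrtomb(buf, (wchar_t)towupper((wint_t)wc), &st);
    return m == (size_t)-1 ? UPPER_UNENCODABLE : UPPER_OK;
}
```

Going by the analysis above, check_upper_roundtrip("zh_CN", "\xa8\xa7", 2) would be expected to return UPPER_UNENCODABLE on an affected system with the zh_CN (GB2312) locale installed. The result depends on the installed locale data.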

Comment 10 Jakub Jelinek 2005-03-20 15:16:50 UTC
Should be fixed in glibc-2.3.4-16.
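
The patch itself is not shown in this report. Purely as an illustration, and not as the actual glibc change, one defensive pattern for this kind of code is to check the return value of wcrtomb before copying and to keep the original bytes when the uppercase form cannot be encoded. The function name upper_mbchar_or_keep is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <limits.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Uppercase the first multibyte character of `src` (at most `len` bytes)
 * into `dst`, using the current LC_CTYPE locale.  `dst` must have room
 * for at least MB_LEN_MAX bytes.
 * Returns the number of bytes written to `dst`, or 0 if `src` does not
 * start with a complete, valid, non-NUL character. */
size_t upper_mbchar_or_keep(char *dst, const char *src, size_t len)
{
    mbstate_t st;
    wchar_t wc;
    memset(&st, 0, sizeof st);
    size_t n = mbrtowc(&wc, src, len, &st);
    if (n == (size_t)-1 || n == (size_t)-2 || n == 0)
        return 0;

    char buf[MB_LEN_MAX];
    memset(&st, 0, sizeof st);
    size_t m = wcrtomb(buf, (wchar_t)towupper((wint_t)wc), &st);
    if (m == (size_t)-1) {
        /* No multibyte form for the uppercase character in this
         * charset: fall back to the original bytes instead of using
         * the error value as a copy length. */
        memcpy(dst, src, n);
        return n;
    }

    memcpy(dst, buf, m);
    return m;
}
```

The point of the sketch is the explicit check on the wcrtomb result: the error value is never passed on as a length.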

