Bug 734335

Summary: [abrt] coreutils-8.10-2.fc15: different_multi: Process /usr/bin/uniq was killed by signal 11 (SIGSEGV)
Product: [Fedora] Fedora
Reporter: seoaqua <seoaqua>
Component: coreutils
Assignee: Ondrej Vasik <ovasik>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 15
CC: kdudka, maxamillion, ovasik, rrakus, twaugh, vvitek
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Unspecified
Whiteboard: abrt_hash:9ba38693068905d711c4e5638b8ac2630c7ab785
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-07-16 11:38:07 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description seoaqua 2011-08-30 06:13:46 UTC
abrt version: 2.0.3
architecture:   x86_64
backtrace_rating: 4
cmdline:        uniq mapsource/dict/source/en/abc/a.raw
comment:        uniq a_big_text_file > target file
component:      coreutils
crash_function: different_multi
executable:     /usr/bin/uniq
kernel:         2.6.40.3-0.fc15.x86_64
os_release:     Fedora release 15 (Lovelock)
package:        coreutils-8.10-2.fc15
reason:         Process /usr/bin/uniq was killed by signal 11 (SIGSEGV)
time:           Tue Aug 30 14:09:08 2011
uid:            500
username:       aqua

backtrace:
:[New LWP 31433]
:Core was generated by `uniq mapsource/dict/source/en/abc/a.raw'.
:Program terminated with signal 11, Segmentation fault.
:#0  0x0000000000402735 in different_multi (old=<optimized out>, new=<optimized out>, oldlen=<optimized out>, newlen=<optimized out>, oldstate=..., newstate=...) at uniq.c:397
:397	uniq.c: No such file or directory.
:	in uniq.c
:
:Thread 1 (LWP 31433):
:#0  0x0000000000402735 in different_multi (old=<optimized out>, new=<optimized out>, oldlen=<optimized out>, newlen=<optimized out>, oldstate=..., newstate=...) at uniq.c:397
:        i = <optimized out>
:        j = <optimized out>
:        chars = <optimized out>
:        str = {0x7fd06cd4f010 "ICNYAHOOCOMIGNICULUS DESIDERIIN WORDTUPOLEV PSCQUOTCQUOT IS FOR CORPSEESAW\\ CRIED SO HARD THEN LAUGHED\\ SMOKED A CIGARETTE IMPROVE THE LINE CONTROL RADIX ACONITI BRACHYPODI V FALLEN ASLEEP AT WORK\\ SC"..., 0x24d8030 "\351\273\222\345\237\267\344\272\213\n\213\274\351\237\263\n\231\205\346\234\272\345\234\272\n\246\231\346\270\257\351\203\250\351\232\212\n\223\241\346\234\203\n\274\232\n\214\345\217\221\345\261\225\347\232\204\345\216\237\345\210\231\n"}
:        copy = {0x7fffa2dfde30 <Address 0x7fffa2dfde30 out of bounds>, 0x38369939c0 "\230$\255\373"}
:        len = {1025406798, 9}
:        state = {{__count = 0, __value = {__wch = 0, __wchb = "\000\000\000"}}, {__count = 0, __value = {__wch = 0, __wchb = "\000\000\000"}}}
:        mblength = <optimized out>
:        wc = 32720 L'\x7fd0'
:        uwc = <optimized out>
:        state_bak = {__count = 0, __value = {__wch = 0, __wchb = "\000\000\000"}}
:#1  0x000000000040201b in check_file (delimiter=10 '\n', outfile=0x40666a "-", infile=0x7fffdffe756b "mapsource/dict/source/en/abc/a.raw") at uniq.c:515
:        thisfield = 0x7fd06cd4f010 "ICNYAHOOCOMIGNICULUS DESIDERIIN WORDTUPOLEV PSCQUOTCQUOT IS FOR CORPSEESAW\\ CRIED SO HARD THEN LAUGHED\\ SMOKED A CIGARETTE IMPROVE THE LINE CONTROL RADIX ACONITI BRACHYPODI V FALLEN ASLEEP AT WORK\\ SC"...
:        thislen = 1025406798
:        thisstate = {__count = 0, __value = {__wch = 0, __wchb = "\000\000\000"}}
:        prevfield = 0x24d8030 "\351\273\222\345\237\267\344\272\213\n\213\274\351\237\263\n\231\205\346\234\272\345\234\272\n\246\231\346\270\257\351\203\250\351\232\212\n\223\241\346\234\203\n\274\232\n\214\345\217\221\345\261\225\347\232\204\345\216\237\345\210\231\n"
:        prevlen = 9
:        prevstate = {__count = 0, __value = {__wch = 0, __wchb = "\000\000\000"}}
:        lb1 = {size = 64, length = 10, buffer = 0x24d8030 "\351\273\222\345\237\267\344\272\213\n\213\274\351\237\263\n\231\205\346\234\272\345\234\272\n\246\231\346\270\257\351\203\250\351\232\212\n\223\241\346\234\203\n\274\232\n\214\345\217\221\345\261\225\347\232\204\345\216\237\345\210\231\n", state = {__count = 0, __value = {__wch = 0, __wchb = "\000\000\000"}}}
:        thisline = 0x7fffdffe5700
:        lb2 = {size = 1062361421, length = 1025406799, buffer = 0x7fd06cd4f010 "ICNYAHOOCOMIGNICULUS DESIDERIIN WORDTUPOLEV PSCQUOTCQUOT IS FOR CORPSEESAW\\ CRIED SO HARD THEN LAUGHED\\ SMOKED A CIGARETTE IMPROVE THE LINE CONTROL RADIX ACONITI BRACHYPODI V FALLEN ASLEEP AT WORK\\ SC"..., state = {__count = 0, __value = {__wch = 0, __wchb = "\000\000\000"}}}
:        prevline = 0x7fffdffe56e0
:#2  main (argc=<optimized out>, argv=<optimized out>) at uniq.c:809
:        optc = <optimized out>
:        posixly_correct = <optimized out>
:        skip_field_option_type = <optimized out>
:        nfiles = <optimized out>
:        file = {0x7fffdffe756b "mapsource/dict/source/en/abc/a.raw", 0x40666a "-"}
:        delimiter = 10 '\n'
:From                To                  Syms Read   Shared Object Library
:0x000000383661ece0  0x000000383674338c  Yes         /lib64/libc.so.6
:0x0000003836200b20  0x0000003836218cca  Yes         /lib64/ld-linux-x86-64.so.2
:$1 = 0x0
:No symbol "__glib_assert_msg" in current context.
:rax            0x7fffa2dfde30	140735925968432
:rbx            0x3d1e774e	1025406798
:rcx            0x7fffdffe55f0	140736951375344
:rdx            0x3d1e774e	1025406798
:rsi            0x7fd06cd4f010	140533155819536
:rdi            0x7fffdffe560c	140736951375372
:rbp            0x7fffdffe5650	0x7fffdffe5650
:rsp            0x7fffa2dfde30	0x7fffa2dfde30
:r8             0x0	0
:r9             0x0	0
:r10            0x1	1
:r11            0x246	582
:r12            0x0	0
:r13            0x0	0
:r14            0x0	0
:r15            0x7fd06cd4f010	140533155819536
:rip            0x402735	0x402735 <different_multi+229>
:eflags         0x10202	[ IF RF ]
:cs             0x33	51
:ss             0x2b	43
:ds             0x0	0
:es             0x0	0
:fs             0x0	0
:gs             0x0	0
:Dump of assembler code for function different_multi:
:   0x0000000000402650 <+0>:	push   %rbp
:   0x0000000000402651 <+1>:	mov    %rsp,%rbp
:   0x0000000000402654 <+4>:	push   %r15
:   0x0000000000402656 <+6>:	push   %r14
:   0x0000000000402658 <+8>:	xor    %r14d,%r14d
:   0x000000000040265b <+11>:	push   %r13
:   0x000000000040265d <+13>:	push   %r12
:   0x000000000040265f <+15>:	push   %rbx
:   0x0000000000402660 <+16>:	mov    %rdx,%rbx
:   0x0000000000402663 <+19>:	sub    $0x98,%rsp
:   0x000000000040266a <+26>:	mov    %rdi,-0x90(%rbp)
:   0x0000000000402671 <+33>:	mov    %rsi,-0x88(%rbp)
:   0x0000000000402678 <+40>:	mov    %fs:0x28,%rax
:   0x0000000000402681 <+49>:	mov    %rax,-0x38(%rbp)
:   0x0000000000402685 <+53>:	xor    %eax,%eax
:   0x0000000000402687 <+55>:	mov    %rdx,-0x70(%rbp)
:   0x000000000040268b <+59>:	mov    %rcx,-0x68(%rbp)
:   0x000000000040268f <+63>:	mov    %r8,-0x60(%rbp)
:   0x0000000000402693 <+67>:	mov    %r9,-0x58(%rbp)
:   0x0000000000402697 <+71>:	lea    0x1f(%rbx),%rax
:   0x000000000040269b <+75>:	xor    %edx,%edx
:   0x000000000040269d <+77>:	mov    $0x10,%ecx
:   0x00000000004026a2 <+82>:	xor    %r12d,%r12d
:   0x00000000004026a5 <+85>:	div    %rcx
:   0x00000000004026a8 <+88>:	shl    $0x4,%rax
:   0x00000000004026ac <+92>:	sub    %rax,%rsp
:   0x00000000004026af <+95>:	lea    0xf(%rsp),%rax
:   0x00000000004026b4 <+100>:	and    $0xfffffffffffffff0,%rax
:   0x00000000004026b8 <+104>:	test   %rbx,%rbx
:   0x00000000004026bb <+107>:	mov    %rax,-0xa8(%rbp)
:   0x00000000004026c2 <+114>:	mov    %rax,-0x80(%rbp,%r14,8)
:   0x00000000004026c7 <+119>:	je     0x402768 <different_multi+280>
:   0x00000000004026cd <+125>:	cmpq   $0x0,0x205f9b(%rip)        # 0x608670 <check_chars>
:   0x00000000004026d5 <+133>:	je     0x402768 <different_multi+280>
:   0x00000000004026db <+139>:	lea    -0x60(%rbp,%r14,8),%rdx
:   0x00000000004026e0 <+144>:	xor    %r13d,%r13d
:   0x00000000004026e3 <+147>:	mov    %rdx,-0xa0(%rbp)
:   0x00000000004026ea <+154>:	jmp    0x40270a <different_multi+186>
:   0x00000000004026ec <+156>:	nopl   0x0(%rax)
:   0x00000000004026f0 <+160>:	mov    $0x1,%ecx
:   0x00000000004026f5 <+165>:	add    %rcx,%r12
:   0x00000000004026f8 <+168>:	cmp    %rbx,%r12
:   0x00000000004026fb <+171>:	jae    0x402768 <different_multi+280>
:   0x00000000004026fd <+173>:	add    $0x1,%r13
:   0x0000000000402701 <+177>:	cmp    %r13,0x205f68(%rip)        # 0x608670 <check_chars>
:   0x0000000000402708 <+184>:	jbe    0x402768 <different_multi+280>
:   0x000000000040270a <+186>:	mov    -0x60(%rbp,%r14,8),%rcx
:   0x000000000040270f <+191>:	mov    -0x90(%rbp,%r14,8),%r15
:   0x0000000000402717 <+199>:	mov    %rbx,%rdx
:   0x000000000040271a <+202>:	lea    -0x44(%rbp),%rdi
:   0x000000000040271e <+206>:	sub    %r12,%rdx
:   0x0000000000402721 <+209>:	mov    %rcx,-0x98(%rbp)
:   0x0000000000402728 <+216>:	mov    -0xa0(%rbp),%rcx
:   0x000000000040272f <+223>:	add    %r12,%r15
:   0x0000000000402732 <+226>:	mov    %r15,%rsi
:=> 0x0000000000402735 <+229>:	callq  0x401378 <mbrtowc@plt>
:   0x000000000040273a <+234>:	test   %rax,%rax
:   0x000000000040273d <+237>:	mov    %rax,%rcx
:   0x0000000000402740 <+240>:	je     0x4026f0 <different_multi+160>
:   0x0000000000402742 <+242>:	cmp    $0xfffffffffffffffe,%rax
:   0x0000000000402746 <+246>:	jb     0x402798 <different_multi+328>
:   0x0000000000402748 <+248>:	mov    -0x98(%rbp),%rax
:   0x000000000040274f <+255>:	mov    $0x1,%ecx
:   0x0000000000402754 <+260>:	add    %rcx,%r12
:   0x0000000000402757 <+263>:	cmp    %rbx,%r12
:   0x000000000040275a <+266>:	mov    %rax,-0x60(%rbp,%r14,8)
:   0x000000000040275f <+271>:	jb     0x4026fd <different_multi+173>
:   0x0000000000402761 <+273>:	nopl   0x0(%rax)
:   0x0000000000402768 <+280>:	mov    -0xa8(%rbp),%rdx
:   0x000000000040276f <+287>:	cmp    $0x1,%r14
:   0x0000000000402773 <+291>:	mov    %r12,-0x70(%rbp,%r14,8)
:   0x0000000000402778 <+296>:	movb   $0x0,(%rdx,%r12,1)
:   0x000000000040277d <+301>:	je     0x402820 <different_multi+464>
:   0x0000000000402783 <+307>:	mov    -0x68(%rbp),%rbx
:   0x0000000000402787 <+311>:	mov    $0x1,%r14d
:   0x000000000040278d <+317>:	jmpq   0x402697 <different_multi+71>
:   0x0000000000402792 <+322>:	nopw   0x0(%rax,%rax,1)
:   0x0000000000402798 <+328>:	cmpb   $0x0,0x205ee5(%rip)        # 0x608684 <ignore_case>
:   0x000000000040279f <+335>:	je     0x4027f8 <different_multi+424>
:   0x00000000004027a1 <+337>:	mov    -0x44(%rbp),%edx
:   0x00000000004027a4 <+340>:	mov    %rax,-0xb8(%rbp)
:   0x00000000004027ab <+347>:	mov    %edx,%edi
:   0x00000000004027ad <+349>:	mov    %edx,-0xb0(%rbp)
:   0x00000000004027b3 <+355>:	callq  0x4014d8 <towupper@plt>
:   0x00000000004027b8 <+360>:	mov    -0xb0(%rbp),%edx
:   0x00000000004027be <+366>:	mov    -0xb8(%rbp),%rcx
:   0x00000000004027c5 <+373>:	cmp    %eax,%edx
:   0x00000000004027c7 <+375>:	je     0x4027f8 <different_multi+424>
:   0x00000000004027c9 <+377>:	mov    -0xa8(%rbp),%rdi
:   0x00000000004027d0 <+384>:	lea    -0x40(%rbp),%rdx
:   0x00000000004027d4 <+388>:	mov    %eax,%esi
:   0x00000000004027d6 <+390>:	movq   $0x0,-0x40(%rbp)
:   0x00000000004027de <+398>:	add    %r12,%rdi
:   0x00000000004027e1 <+401>:	callq  0x401538 <wcrtomb@plt>
:   0x00000000004027e6 <+406>:	mov    -0xb8(%rbp),%rcx
:   0x00000000004027ed <+413>:	jmpq   0x4026f5 <different_multi+165>
:   0x00000000004027f2 <+418>:	nopw   0x0(%rax,%rax,1)
:   0x00000000004027f8 <+424>:	mov    -0xa8(%rbp),%rdi
:   0x00000000004027ff <+431>:	mov    %rcx,%rdx
:   0x0000000000402802 <+434>:	mov    %r15,%rsi
:   0x0000000000402805 <+437>:	mov    %rcx,-0xb8(%rbp)
:   0x000000000040280c <+444>:	add    %r12,%rdi
:   0x000000000040280f <+447>:	callq  0x401548 <memcpy@plt>
:   0x0000000000402814 <+452>:	mov    -0xb8(%rbp),%rcx
:   0x000000000040281b <+459>:	jmpq   0x4026f5 <different_multi+165>
:   0x0000000000402820 <+464>:	mov    -0x68(%rbp),%rcx
:   0x0000000000402824 <+468>:	mov    -0x78(%rbp),%rdx
:   0x0000000000402828 <+472>:	mov    -0x70(%rbp),%rsi
:   0x000000000040282c <+476>:	mov    -0x80(%rbp),%rdi
:   0x0000000000402830 <+480>:	callq  0x403500 <xmemcoll>
:   0x0000000000402835 <+485>:	mov    -0x38(%rbp),%rcx
:   0x0000000000402839 <+489>:	xor    %fs:0x28,%rcx
:   0x0000000000402842 <+498>:	jne    0x402853 <different_multi+515>
:   0x0000000000402844 <+500>:	lea    -0x28(%rbp),%rsp
:   0x0000000000402848 <+504>:	pop    %rbx
:   0x0000000000402849 <+505>:	pop    %r12
:   0x000000000040284b <+507>:	pop    %r13
:   0x000000000040284d <+509>:	pop    %r14
:   0x000000000040284f <+511>:	pop    %r15
:   0x0000000000402851 <+513>:	pop    %rbp
:   0x0000000000402852 <+514>:	retq   
:   0x0000000000402853 <+515>:	callq  0x4015b8 <__stack_chk_fail@plt>
:End of assembler dump.

build_ids:
:fd629e5d73463f985c297358d5eed7c226714051
:90fe13fac734263f663981ce4d1691cf926ced25
:1d87720659528ad80e8870da2b2bb4af4470cf66

dso_list:
:/usr/lib64/gconv/gconv-modules.cache glibc-2.14-5.x86_64 (Fedora Project) 1313135968
:/usr/bin/uniq coreutils-8.10-2.fc15.x86_64 (Fedora Project) 1305315740
:/lib64/libc-2.14.so glibc-2.14-5.x86_64 (Fedora Project) 1313135968
:/usr/lib/locale/locale-archive glibc-common-2.14-5.x86_64 (Fedora Project) 1313135984
:/lib64/ld-2.14.so glibc-2.14-5.x86_64 (Fedora Project) 1313135968

environ:
:ORBIT_SOCKETDIR=/tmp/orbit-aqua
:XDG_SESSION_ID=1
:HOSTNAME=aqua
:IMSETTINGS_INTEGRATE_DESKTOP=yes
:GPG_AGENT_INFO=/tmp/keyring-TrgTTX/gpg:0:1
:SHELL=/bin/bash
:TERM=xterm
:XDG_SESSION_COOKIE=5623623e765d4de407a81a860000000a-1314581145.914383-650960096
:HISTSIZE=1000
:GJS_DEBUG_OUTPUT=stderr
:WINDOWID=54525958
:QTDIR=/usr/lib/qt-3.3
:GNOME_KEYRING_CONTROL=/tmp/keyring-TrgTTX
:'GJS_DEBUG_TOPICS=JS ERROR;JS LOG'
:QTINC=/usr/lib/qt-3.3/include
:IMSETTINGS_MODULE=IBus
:USER=aqua
:LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
:SSH_AUTH_SOCK=/tmp/keyring-TrgTTX/ssh
:SESSION_MANAGER=local/unix:@/tmp/.ICE-unix/1596,unix/unix:/tmp/.ICE-unix/1596
:USERNAME=aqua
:DESKTOP_SESSION=gnome
:MAIL=/var/spool/mail/aqua
:PATH=/usr/lib/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/aqua/bin
:_=/usr/bin/uniq
:QT_IM_MODULE=xim
:PWD=/home/aqua
:XMODIFIERS=@im=ibus
:KDE_IS_PRELINKED=1
:LANG=en_US.utf8
:GNOME_KEYRING_PID=1588
:GDM_LANG=en_US.utf8
:KDEDIRS=/usr
:GDMSESSION=gnome
:HISTCONTROL=ignoredups
:SHLVL=3
:HOME=/home/aqua
:GNOME_DESKTOP_SESSION_ID=this-is-deprecated
:LOGNAME=aqua
:QTLIB=/usr/lib/qt-3.3/lib
:DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-6F3C2hti33,guid=27a8b06c21d4a194d7cb393b000000d8
:'LESSOPEN=||/usr/bin/lesspipe.sh %s'
:WINDOWPATH=1
:DISPLAY=:0.0
:XDG_RUNTIME_DIR=/run/user/aqua
:G_BROKEN_FILENAMES=1
:XAUTHORITY=/var/run/gdm/auth-for-aqua-jha6t2/database
:COLORTERM=gnome-terminal

maps:
:00400000-00408000 r-xp 00000000 fd:01 161990                             /usr/bin/uniq
:00608000-00609000 rw-p 00008000 fd:01 161990                             /usr/bin/uniq
:024d7000-024f8000 rw-p 00000000 00:00 0                                  [heap]
:3836200000-383621f000 r-xp 00000000 fd:01 136190                         /lib64/ld-2.14.so
:383641e000-383641f000 r--p 0001e000 fd:01 136190                         /lib64/ld-2.14.so
:383641f000-3836420000 rw-p 0001f000 fd:01 136190                         /lib64/ld-2.14.so
:3836420000-3836421000 rw-p 00000000 00:00 0 
:3836600000-383678f000 r-xp 00000000 fd:01 161970                         /lib64/libc-2.14.so
:383678f000-383698f000 ---p 0018f000 fd:01 161970                         /lib64/libc-2.14.so
:383698f000-3836993000 r--p 0018f000 fd:01 161970                         /lib64/libc-2.14.so
:3836993000-3836994000 rw-p 00193000 fd:01 161970                         /lib64/libc-2.14.so
:3836994000-383699a000 rw-p 00000000 00:00 0 
:7fd06cd4f000-7fd0ac275000 rw-p 00000000 00:00 0 
:7fd0b8d17000-7fd0bf13a000 r--p 00000000 fd:01 161531                     /usr/lib/locale/locale-archive
:7fd0bf13a000-7fd0bf13d000 rw-p 00000000 00:00 0 
:7fd0bf14f000-7fd0bf156000 r--s 00000000 fd:01 134840                     /usr/lib64/gconv/gconv-modules.cache
:7fd0bf156000-7fd0bf159000 rw-p 00000000 00:00 0 
:7fffdffc7000-7fffdffe8000 rw-p 00000000 00:00 0                          [stack]
:7fffdffff000-7fffe0000000 r-xp 00000000 00:00 0                          [vdso]
:ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

var_log_messages:
:Aug 29 16:06:24 aqua kernel: [24255.347681] uniq[2664]: segfault at 7fff1121bb78 ip 0000000000402735 sp 00007fff1121bb80 error 6 in uniq[400000+8000]
:Aug 29 16:06:24 aqua abrt[2665]: saved core dump of pid 2664 (/usr/bin/uniq) to /var/spool/abrt/ccpp-2011-08-29-16:06:24-2664.new/coredump (12664832 bytes)
:Aug 29 20:13:08 aqua kernel: [39058.896717] uniq[27472]: segfault at 7fffacb44f08 ip 0000000000402735 sp 00007fffacb44f10 error 6 in uniq[400000+8000]
:Aug 29 20:13:14 aqua abrt[27473]: saved core dump of pid 27472 (/usr/bin/uniq) to /var/spool/abrt/ccpp-2011-08-29-20:13:08-27472.new/coredump (472543232 bytes)
:Aug 30 11:02:27 aqua kernel: [92418.367795] uniq[22419]: segfault at 7fff70ce05c8 ip 0000000000402735 sp 00007fff70ce05d0 error 6 in uniq[400000+8000]
:Aug 30 11:02:33 aqua abrt[22420]: saved core dump of pid 22419 (/usr/bin/uniq) to /var/spool/abrt/ccpp-2011-08-30-11:02:27-22419.new/coredump (472543232 bytes)
:Aug 30 11:05:02 aqua kernel: [92573.746774] uniq[22583]: segfault at 7fff9de50388 ip 0000000000402735 sp 00007fff9de50390 error 6 in uniq[400000+8000]
:Aug 30 11:05:08 aqua abrt[22586]: saved core dump of pid 22583 (/usr/bin/uniq) to /var/spool/abrt/ccpp-2011-08-30-11:05:02-22583.new/coredump (472543232 bytes)
:Aug 30 12:02:13 aqua kernel: [96004.817170] uniq[25107]: segfault at 7fff93aba018 ip 0000000000402735 sp 00007fff93aba020 error 6 in uniq[400000+8000]
:Aug 30 12:02:30 aqua abrt[25134]: saved core dump of pid 25107 (/usr/bin/uniq) to /var/spool/abrt/ccpp-2011-08-30-12:02:14-25107.new/coredump (708624384 bytes)
:Aug 30 14:09:08 aqua kernel: [103619.378063] uniq[31433]: segfault at 7fffa2dfde28 ip 0000000000402735 sp 00007fffa2dfde30 error 6 in uniq[400000+8000]
:Aug 30 14:09:37 aqua abrt[31454]: saved core dump of pid 31433 (/usr/bin/uniq) to /var/spool/abrt/ccpp-2011-08-30-14:09:08-31433.new/coredump (1062744064 bytes)

Comment 1 seoaqua 2011-08-30 06:29:17 UTC
I think it's caused by using one super long string as a single line.
I forgot to put "\n" at the end of each line in my program.
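
As a rough cross-check of the long-line theory (a sketch built only from values in the dump above, not a confirmed diagnosis): the reported rsp lies about 1 GB below rbp, which is roughly the value of thislen, and it also lies below the start of the [stack] mapping. That would be consistent with the `sub %rax,%rsp` in different_multi reserving stack space proportional to the line length. The arithmetic, with every constant copied from this report:

```shell
# All constants are copied from the report above (registers, frame #1, maps).
rbp=0x7fffdffe5650        # frame pointer in different_multi
rsp=0x7fffa2dfde30        # stack pointer at the time of the crash
stack_lo=0x7fffdffc7000   # lowest mapped address of [stack]
thislen=1025406798        # length of the current line (frame #1)

gap=$(( $rbp - $rsp ))          # bytes between frame pointer and stack pointer
slack=$(( $gap - $thislen ))    # what is left after subtracting the line length
below=$(( $rsp < $stack_lo ))   # 1 if rsp points below the mapped stack

echo "rbp - rsp       = $gap bytes"
echo "minus thislen   = $slack bytes"
echo "rsp below stack = $below"
```

The faulting address logged by the kernel for this crash, 7fffa2dfde28, equals rsp - 8, which is where the callq at different_multi+229 would push its return address.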

Comment 2 Ondrej Vasik 2011-08-30 09:00:03 UTC
Thanks for the report. Two questions:
1) Is it possible to get this "a_big_text_file"? Even directly via email, if you don't want to attach it to Bugzilla. I tried a 1G4 file and it didn't crash.
2) Could you please try running the same command in the C locale? (I mean LC_ALL=C uniq ...)

Comment 3 seoaqua 2011-08-30 09:47:11 UTC
Sorry, it was a temp file in my program, and I think you are right that the long string didn't cause the crash.

I'm trying to reproduce the issue the old way, and I don't know what you mean in Q2, sorry >.<

Comment 4 Ondrej Vasik 2011-08-30 11:46:25 UTC
Thanks for the update. Please let me know if you manage to reproduce the crash scenario.

About Q2: based on the LANG environment variable (en_US.utf8) automatically sent by ABRT, you are using the multibyte support in uniq. I just wanted to check whether the crash also occurs with single-byte (e.g. C) locales. To run the uniq utility with a different locale, just put LC_ALL=C in front of the command, so the command from your example will look like:

LC_ALL=C uniq a_big_text_file > target file

In that case, uniq should work just fine (and faster, as the multibyte functions are very slow).
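
A small sketch of what this looks like in practice, assuming a throwaway sample file (its name and contents are made up for illustration). In the C locale uniq compares lines as raw bytes, so byte-identical UTF-8 lines should still be collapsed:

```shell
# Hypothetical sample data: adjacent duplicate lines, some of them UTF-8.
tmp=$(mktemp)
printf '非常感谢\n非常感谢\nhello\nhello\n你好\n' > "$tmp"

# LC_ALL=C applies to this one command only; the shell's own locale is untouched.
result=$(LC_ALL=C uniq "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

This only covers exact duplicates; options that depend on character boundaries or case folding may behave differently on multibyte text in the C locale.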

Comment 5 seoaqua 2011-08-31 02:49:59 UTC
Sorry, I need to process Chinese under UTF-8, but I'll try to reproduce it in the old stupid way ^_^

Comment 6 seoaqua 2011-08-31 06:43:11 UTC
I can't reproduce it. The code wasn't under version control.
Just imagine 16,000,000 lines of words, with 5000 thousand in one line, maybe 15000 in another line, and 99% single words in the rest of the lines.

Maybe you should close this now, sorry to bother you :P

Comment 7 seoaqua 2011-08-31 07:46:34 UTC
If possible, could you tell me how to represent a Unicode range in the shell? For example:

sed 's/\u000-\u001/xxx/' filename 

produces a range error.

Many thanks!

Comment 8 Ondrej Vasik 2011-08-31 11:06:39 UTC
No, closing is not necessary at the moment: the backtrace is almost complete, so I could analyze the issue even without a reproducer, but it would be better to have one for verification.
As for your question about Unicode range representation: I don't know, so I'm adding the sed maintainer to CC. We'll see :)

Comment 9 Vojtech Vitek 2011-08-31 20:10:46 UTC
(In reply to comment #7)
> sed 's/\u000-\u001/xxx/' filename 

AFAIK, you can't specify a Unicode character with a '\uXXXX' escape sequence in sed yet, so you can't make a range like that.

But you can specify the ranges with the Unicode characters themselves:
$ echo -e "øùúûüý" | sed 's/[û-ü]/_/g'
øùú__ý

Or you can use their hexadecimal representation:
$ echo -e "øùúûüý" | sed 's/[\xC3\xBB-\xC3\xBC]/_/g'
øùú__ý

Please beware of strange behaviour when not using plain ASCII ordering (LC_ALL=C). See http://lists.gnu.org/archive/html/bug-gnu-utils/2011-04/msg00016.html for more details.
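
To obtain the hexadecimal form of a character for use in such a range, one option is to dump its UTF-8 bytes. A minimal sketch, assuming the script or terminal is UTF-8 encoded; utf8_escapes is a hypothetical helper name, not part of sed:

```shell
# utf8_escapes is a hypothetical helper: it prints each byte of its argument
# as a \xHH escape, the same form used in the sed command above.
utf8_escapes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n' | tr 'a-f' 'A-F' | sed 's/../\\x&/g'
}

utf8_escapes 'û'; echo
utf8_escapes 'ü'; echo
```

For the two endpoints of the range above this yields \xC3\xBB and \xC3\xBC.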

Comment 10 seoaqua 2011-09-01 03:46:55 UTC
I have been searching for this answer for three days.
Thank you so much!
In Chinese: 非常感谢^_^

(In reply to comment #9)

Comment 11 seoaqua 2011-09-13 07:14:12 UTC
Sorry to bother you again.
Actually, I want to replace all non-Chinese characters with nothing (NULL).

I think there are several Chinese characters missing from the sed range.

It almost works with
=============command===============
sed 's/[一-龥]//g' filename
=============command===============
or
====================command====================================
sed 's/[\xE4\xB8\x80-\xE9\xBE\xA5]//g' filename 
====================command====================================

But there are some other Chinese characters, and extending the range:

=====================command==========================
sed 's/[\xE4\xB8\x80-\xE9\xBE\xA6]//g' filename 
=====================command==========================
causes an error:
sed: -e expression #1, char 14: Invalid collation character
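
When the goal is to keep only the multibyte text, one locale-independent workaround is to delete ASCII bytes instead of naming the Chinese range. This is a different technique from the sed ranges above and only an approximation: it keeps every non-ASCII character (including non-Chinese ones such as full-width punctuation), so it fits only input that mixes ASCII with Chinese. A sketch with made-up sample text; strip_ascii is a hypothetical helper name:

```shell
# Delete every ASCII byte except newline (octal 012). Bytes >= 0x80, which
# make up all multibyte UTF-8 characters, pass through untouched.
strip_ascii() {
    LC_ALL=C tr -d '\000-\011\013-\177'
}

result=$(printf 'abc 中文 123 漢字!\n' | strip_ascii)
printf '%s\n' "$result"
```

Because tr works on single bytes here, no collation table is consulted, so an "Invalid collation character" error cannot arise from this command.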

(In reply to comment #9)

Comment 12 Ondrej Vasik 2012-07-16 11:38:07 UTC
Cleanup: sorry that this bug report got no solution. Closing as INSUFFICIENT_DATA: I'm not able to reproduce it, the reporter is not able to reproduce it, and I don't see any clear flaw in the code in question (some additional safety checks could be added there, but they would slow everything down for uncertain gain). Anyone can reopen this bug if they are able to reproduce the issue.