Bug 222599

Summary: Memory Corruption when parsing a text file with gawk on FC5.
Product: [Fedora] Fedora
Reporter: Scott Chandler <chanman72002>
Component: gawk
Assignee: Karel Zak <kzak>
Status: CLOSED ERRATA
QA Contact: Brock Organ <borgan>
Severity: high
Priority: medium
Version: 5
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2007-01-15 11:14:53 UTC

Description Scott Chandler 2007-01-15 02:49:32 UTC
I use Dan Bernstein's cdbmake package to compile a list of email addresses
from a text file into a .cdb database. A shell script called cdbmake-12
converts the text file into the proper format. The script is listed below:

#!/bin/sh
# WARNING: This file was auto-generated. Do not edit!
awk '
  /^[^#]/ {
    print "+" length($1) "," length($2) ":" $1 "->" $2
  }
  END {
    print ""
  }
'
I removed the pipe to the actual cdbmake program from the end of the script
because the problem occurs before cdbmake gets called.

I created a text file, fakemail.txt, with these contents:
testing1
testing2
testing3


Then I ran this from the command line:
./cdbmake-12 validrcpt.cdb validrcpt.cdb.tmp < fakemail.txt

It works fine. But if I change fakemail.txt to:

testing1.us
testing2.us
testing3.us


and run the script again, I get:

[root@intranet ~]# ./cdbmake-12 validrcptto.cdb fakemail.txt.tmp < fakemail.txt > debugdat.txt
*** glibc detected *** awk: double free or corruption (out): 0x095bd9e8 ***
======= Backtrace: =========
/lib/libc.so.6[0x174a68]
/lib/libc.so.6(__libc_free+0x78)[0x177f6f]
awk(unref+0x8e)[0x806ffde]
awk(r_tree_eval+0xe26)[0x8082e26]
awk(do_print+0x10c)[0x805a84c]
awk(interpret+0x4fa)[0x8080a3a]
awk(interpret+0x1d6)[0x8080716]
awk(do_input+0x38)[0x806d1f8]
awk(main+0x1044)[0x806f844]
/lib/libc.so.6(__libc_start_main+0xdc)[0x1264e4]
awk[0x804cab1]
======= Memory map: ========
00111000-0023e000 r-xp 00000000 09:00 3566752    /lib/libc-2.4.so
0023e000-00240000 r-xp 0012d000 09:00 3566752    /lib/libc-2.4.so
00240000-00241000 rwxp 0012f000 09:00 3566752    /lib/libc-2.4.so
00241000-00244000 rwxp 00241000 00:00 0
0037f000-00381000 r-xp 00000000 09:00 3564761    /lib/libdl-2.4.so
00381000-00382000 r-xp 00001000 09:00 3564761    /lib/libdl-2.4.so
00382000-00383000 rwxp 00002000 09:00 3564761    /lib/libdl-2.4.so
0091f000-00938000 r-xp 00000000 09:00 3566751    /lib/ld-2.4.so
00938000-00939000 r-xp 00018000 09:00 3566751    /lib/ld-2.4.so
00939000-0093a000 rwxp 00019000 09:00 3566751    /lib/ld-2.4.so
009af000-009b0000 r-xp 009af000 00:00 0          [vdso]
00a77000-00a9a000 r-xp 00000000 09:00 3566753    /lib/libm-2.4.so
00a9a000-00a9b000 r-xp 00022000 09:00 3566753    /lib/libm-2.4.so
00a9b000-00a9c000 rwxp 00023000 09:00 3566753    /lib/libm-2.4.so
00c76000-00c81000 r-xp 00000000 09:00 3566754    /lib/libgcc_s-4.1.1-20060525.so.1
00c81000-00c82000 rwxp 0000a000 09:00 3566754    /lib/libgcc_s-4.1.1-20060525.so.1
08047000-08095000 r-xp 00000000 09:00 5527021    /bin/gawk
08095000-08096000 rw-p 0004d000 09:00 5527021    /bin/gawk
08096000-0809b000 rw-p 08096000 00:00 0
095bd000-095de000 rw-p 095bd000 00:00 0          [heap]
b7c00000-b7c21000 rw-p b7c00000 00:00 0
b7c21000-b7d00000 ---p b7c21000 00:00 0
b7dd5000-b7fd5000 r--p 00000000 09:00 3600782    /usr/lib/locale/locale-archive
b7fd5000-b7fd8000 rw-p b7fd5000 00:00 0
b7fd8000-b7fdf000 r--s 00000000 09:00 3663109    /usr/lib/gconv/gconv-modules.cache
bfac9000-bfadf000 rw-p bfac9000 00:00 0          [stack]
./cdbmake-12: line 10: 17111 Aborted                 awk '
  /^[^#]/ {
    print "+" length($1) "," length($2) ":" $1 "->" $2
  }
  END {
    print ""
  }
'

As the command line shows, I redirected the output to debugdat.txt, which
contains:

+29,0:testing1.us->
+29,0:testing2.us->


Of course, this is not my real list of email addresses, but I get the same
results with my real list.
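
The steps above can be condensed into a single self-contained reproducer. This is a minimal sketch, not part of the original report: the function name format_records is hypothetical, and the input is piped in instead of being read from fakemail.txt. Since testing1.us is 11 characters long, an unaffected awk should report a length of 11 for the first field, whereas debugdat.txt above shows 29.

```shell
#!/bin/sh
# Reproducer sketch. format_records is a hypothetical name, not part of cdbmake.

# Same awk program as in cdbmake-12, without the pipe to cdbmake.
format_records() {
  awk '
    /^[^#]/ {
      print "+" length($1) "," length($2) ":" $1 "->" $2
    }
    END {
      print ""
    }
  '
}

# Feed the three failing addresses on stdin instead of reading fakemail.txt.
printf 'testing1.us\ntesting2.us\ntesting3.us\n' | format_records
```

On an affected system this would presumably abort in the same way as the cdbmake-12 run shown above.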

I am running Fedora Core 5 with the following RPM packages installed, in case
that helps:

gawk-3.1.5-6.3
glibc-2.4-11
glibc-common-2.4-11
glibc-devel-2.4-11
glibc-headers-2.4-11
kernel-2.6.15-1.2054_FC5
I also have kernel-2.6.18-1.2257.fc5 installed, but it acts up.

Any help would be appreciated. I don't know whether this is a gawk, glibc, or
kernel problem. By the way, I also have a development machine running Fedora
Core 4, and the problem does not occur there.
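
One possible way to narrow down the component is sketched below. The memory map above shows /usr/lib/locale/locale-archive and gconv-modules.cache mapped into the awk process, which suggests a non-C locale was active. Whether the locale has anything to do with the crash is an assumption; the report does not establish it. The sketch runs the same awk program once under the current locale and once with LC_ALL=C so the two results can be compared. The function name format_records is hypothetical.

```shell
#!/bin/sh
# Diagnostic sketch, not from the original report.
# Assumption: the active locale may affect the crash. This is unconfirmed.

# format_records is a hypothetical name for the awk program from cdbmake-12.
format_records() {
  awk '
    /^[^#]/ {
      print "+" length($1) "," length($2) ":" $1 "->" $2
    }
    END {
      print ""
    }
  '
}

input='testing1.us
testing2.us
testing3.us'

echo "== current locale =="
printf '%s\n' "$input" | format_records

echo "== LC_ALL=C =="
# The subshell keeps the LC_ALL override from leaking into the rest of the script.
printf '%s\n' "$input" | ( LC_ALL=C; export LC_ALL; format_records )
```

If the LC_ALL=C run completes while the default-locale run aborts, that would hint at locale-dependent code in gawk or glibc; if both abort, the locale is probably not the trigger.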

Thanks,
Scott Chandler